Almide Dojo — MSR Dashboard

Modification Survival Rate (MSR) across LLMs

Pass Rate Over Time

Cumulative Pass Rate by Retry Count (Latest Run)

Does the diagnostic loop converge with more retries? A flat tail means additional retries no longer help.
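
As a rough illustration of how a curve like this could be computed, the sketch below assumes each task is recorded as the number of retries it needed before passing, or None if it never passed. The function names and the input shape are hypothetical, not the dashboard's actual data model.

```python
from collections import Counter


def cumulative_pass_rate(retries_to_pass, max_retries):
    """Return the fraction of tasks passed within k retries, for k = 0..max_retries.

    retries_to_pass: one entry per task (hypothetical format): the number of
    retries used before the task passed, or None if it never passed.
    """
    total = len(retries_to_pass)
    if total == 0:
        return [0.0] * (max_retries + 1)
    passed_at = Counter(r for r in retries_to_pass if r is not None)
    rates = []
    passed = 0
    for k in range(max_retries + 1):
        passed += passed_at.get(k, 0)  # tasks that first passed at exactly k retries
        rates.append(passed / total)
    return rates


def has_flat_tail(rates, tail=2, eps=1e-9):
    """True if the last `tail` retries added no measurable gain."""
    if len(rates) <= tail:
        return False
    return rates[-1] - rates[-1 - tail] <= eps


# Illustrative data only: 8 tasks, up to 3 retries.
rates = cumulative_pass_rate([0, 0, 1, None, 3, None, 0, 1], max_retries=3)
print(rates)
print(has_flat_tail(rates, tail=1))
```

Because the curve is cumulative it never decreases, so a flat tail is simply a run of equal values at the end.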

Failure Breakdown by Category (Latest Run)

Unrecoverable Diagnostic Codes — backlog for almide compiler (latest day, all models)

Codes that LLMs could not recover from within 3 retries. Each one is a candidate for clearer hint text or a compiler fix.

Recoverable Diagnostic Codes — diagnostic worked (latest day, all models)

Codes the model fixed after seeing the diagnostic. These hints already do their job.
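
The split between unrecoverable and recoverable codes can be sketched as a simple tally. This is a minimal sketch under stated assumptions: each event is a hypothetical (code, fixed_at_retry) pair, where fixed_at_retry is the retry at which the error disappeared or None if it never did, and the codes E001 to E003 are invented for illustration. Only the limit of 3 retries comes from the dashboard text.

```python
from collections import Counter

MAX_RETRIES = 3  # retry budget stated on the dashboard


def split_codes(events):
    """Tally diagnostic codes into recoverable and unrecoverable occurrences.

    events: iterable of (code, fixed_at_retry) pairs (hypothetical format).
    fixed_at_retry is the retry index at which the model fixed the error,
    or None if it was still present after MAX_RETRIES.
    """
    recoverable = Counter()
    unrecoverable = Counter()
    for code, fixed_at in events:
        if fixed_at is not None and fixed_at <= MAX_RETRIES:
            recoverable[code] += 1
        else:
            unrecoverable[code] += 1
    return recoverable, unrecoverable


# Illustrative events only; the codes are made up.
events = [("E001", 1), ("E002", None), ("E001", None), ("E003", 2), ("E002", None)]
recoverable, unrecoverable = split_codes(events)
print(unrecoverable.most_common())  # most frequent unrecovered codes first
print(recoverable.most_common())
```

In this sketch a code can appear in both tallies when it was fixed in some runs and not in others. Whether the dashboard counts such codes once or twice is not stated in the source.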

Pass Rate by Almide Feature × Model (Latest Run)

Per-Task Results (Latest Run)