Almide Dojo — MSR Dashboard
Modification Survival Rate across LLMs
Pass Rate Over Time
Cumulative Pass Rate by Retry Count (Latest Run)
Does the diagnostic loop converge with more retries? Flat tail = retries no longer help.
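A minimal sketch of how a cumulative pass rate per retry count could be computed. The input shape and the function name `cumulative_pass_rate` are assumptions made for illustration, not the dashboard's actual code: each task is represented by the retry index at which it first passed (0 = first attempt), or `None` if it never passed.

```python
from collections import Counter


def cumulative_pass_rate(first_pass_retry, max_retries=3):
    """Return the fraction of tasks passed by each retry count.

    first_pass_retry: one entry per task; the retry index at which the
    task first passed (0 = first attempt), or None if it never passed.
    (Hypothetical input shape, assumed for this sketch.)
    """
    total = len(first_pass_retry)
    if total == 0:
        return []
    passed_at = Counter(r for r in first_pass_retry if r is not None)
    rates = []
    running = 0
    for retry in range(max_retries + 1):
        running += passed_at.get(retry, 0)  # tasks that first passed here
        rates.append(running / total)
    return rates


# Eight illustrative tasks; two never pass.
rates = cumulative_pass_rate([0, 0, 1, None, 2, 0, None, 1])
print(rates)
```

In this made-up sample the last two values are equal, which is the flat tail described above: the final retry did not rescue any additional task.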
Failure Breakdown by Category (Latest Run)
Unrecoverable Diagnostic Codes — backlog for the Almide compiler (latest day, all models)
Codes that LLMs could not recover from within 3 retries. Each one is a candidate for clearer hint text or a compiler fix.
Recoverable Diagnostic Codes — diagnostic worked (latest day, all models)
Codes the model fixed after seeing the diagnostic. These hints already do their job.
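The two tables above can be thought of as one tally split by outcome. The sketch below assumes a hypothetical event format, `(diagnostic_code, was_fixed)` pairs, and uses placeholder codes such as `E001`. Neither is taken from the Almide compiler or the dashboard's real data.

```python
from collections import Counter


def split_by_recovery(events):
    """Tally diagnostic codes by whether the model recovered from them.

    events: iterable of (code, was_fixed) pairs (assumed format).
    Returns (unrecovered, recovered), each a list of (code, count)
    sorted by count, highest first.
    """
    unrecovered = Counter()
    recovered = Counter()
    for code, was_fixed in events:
        bucket = recovered if was_fixed else unrecovered
        bucket[code] += 1
    return unrecovered.most_common(), recovered.most_common()


# Placeholder codes, for illustration only.
events = [
    ("E001", False),
    ("E002", True),
    ("E001", False),
    ("E003", True),
    ("E002", True),
    ("E003", False),
]
backlog, working = split_by_recovery(events)
print(backlog)
print(working)
```

A code can land in both tallies, as `E003` does here, when it was fixed in one run and not in another. How the dashboard treats such mixed cases is not stated, so the sketch simply reports both counts.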
Pass Rate by Almide Feature × Model (Latest Run)
Per-Task Results (Latest Run)