| Category | Description | Cases | Last run | bash median |
|---|---|---|---|---|
| startup | Small commands where interpreter startup dominates runtime. | 4 | 0.053 ms | 1.662 ms |
| strings | String expansion, pattern handling, and text manipulation. | 8 | 0.057 ms | 1.791 ms |
| variables | Variable assignment, lookup, expansion, and environment handling. | 8 | 0.058 ms | 1.688 ms |
| arrays | Indexed array reads, writes, expansion, and iteration. | 6 | 0.059 ms | 1.713 ms |
| subshell | Command substitution and nested shell execution paths. | 6 | 0.061 ms | 3.143 ms |
| arithmetic | Integer math, substitutions, and expression-heavy shell snippets. | 6 | 0.062 ms | 1.703 ms |
| pipes | Pipeline construction, streaming, and command chaining. | 6 | 0.065 ms | 3.131 ms |
| control | Conditionals, loops, case statements, and branching scripts. | 9 | 0.076 ms | 1.711 ms |
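
The timings above can be turned into a per-category ratio against bash. The sketch below assumes that "Last run" is the measured time for the implementation under test and that "bash median" is the reference time for the same cases; the source does not state this explicitly. The names `speedup`, `speedups`, and `ROWS` are hypothetical and exist only for this illustration.

```python
# Hypothetical helper: ratio of the bash median to the "Last run" timing.
# Assumes both values are in milliseconds and cover the same benchmark cases.
def speedup(last_run_ms: float, bash_median_ms: float) -> float:
    """Return bash_median_ms / last_run_ms (values above 1 mean faster than bash)."""
    if last_run_ms <= 0:
        raise ValueError("last_run_ms must be positive")
    return bash_median_ms / last_run_ms


# (category, last run in ms, bash median in ms), copied from the table above.
ROWS = [
    ("startup", 0.053, 1.662),
    ("strings", 0.057, 1.791),
    ("variables", 0.058, 1.688),
    ("arrays", 0.059, 1.713),
    ("subshell", 0.061, 3.143),
    ("arithmetic", 0.062, 1.703),
    ("pipes", 0.065, 3.131),
    ("control", 0.076, 1.711),
]


def speedups(rows):
    """Map each category name to its ratio against the bash median."""
    return {name: speedup(last, bash) for name, last, bash in rows}


if __name__ == "__main__":
    for name, ratio in speedups(ROWS).items():
        print(f"{name:<12} {ratio:6.1f}x")
```

Under that reading, the ratio is largest for the categories where bash itself is slowest (subshell and pipes), since the "Last run" column varies much less than the bash column.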
Benches
Latest benchmark snapshot
Static aggregate generated from repository result artifacts. Use the linked files for raw measurements and full eval traces.
| Category | Passed | Pass rate |
|---|---|---|
| system_info | 1/2 | 50% |
| file_operations | 3/4 | 66.7% |
| scripting | 5/7 | 68.6% |
| json_processing | 8/8 | 100% |
| data_transformation | 6/6 | 100% |
| complex_tasks | 6/6 | 100% |
| text_processing | 6/6 | 100% |
| pipelines | 5/5 | 100% |
| Run | Date | Score | Tools |
|---|---|---|---|
| gpt-5.3-codex | 2026-05-26 | 93% | 86.8% |
| gpt-5.5 | 2026-05-26 | 92.7% | 91.5% |
| claude-opus-4-7 | 2026-05-26 | 97.8% | 90.3% |
| claude-sonnet-4-6 | 2026-05-26 | 94% | 91% |
| claude-haiku-4-5-20251001 | 2026-05-26 | 98.4% | 92.3% |
| claude-sonnet-4-6 | 2026-02-28 | 92.5% | 85.1% |
Indexes