Benchmarks
All benchmarks were run on an AMD Ryzen 9 9900X 12-Core Processor (24 logical cores, 121 GB RAM). Each scenario was measured over 100 runs after 10 warmup iterations; the tables report the median and standard deviation. Times are in milliseconds (lower is better) unless noted otherwise.
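
The measurement procedure described above can be sketched as follows. This is a minimal illustration in Python, not the project's benchmark harness: the function names `time_command` and `summarize` are hypothetical, and the use of the sample standard deviation is an assumption.

```python
import statistics
import subprocess
import time


def time_command(argv, runs=100, warmup=10):
    """Return wall-clock times in milliseconds for `runs` measured executions."""
    samples = []
    for i in range(warmup + runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        if i >= warmup:  # warmup iterations are executed but not recorded
            samples.append(elapsed_ms)
    return samples


def summarize(samples):
    """Return (median, sample standard deviation) of the recorded times."""
    return statistics.median(samples), statistics.stdev(samples)


# Example (spawns 110 processes, so it is left commented out):
# median_ms, stddev_ms = summarize(time_command(["sh", "-c", "true"]))
```

Reporting the median rather than the mean keeps a single slow outlier run from shifting the headline number.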
Startup Latency
How fast each shell can execute a command and exit.
shell -c 'true' round-trip
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| dash | 0.24 | 0.24 | 0.05 | — |
| sh | 0.42 | 0.43 | 0.04 | — |
| bash | 0.43 | 0.43 | 0.04 | — |
| zsh | 0.54 | 0.54 | 0.05 | — |
| lash | 0.57 | 0.57 | 0.05 | — |
| fish | 5.75 | 5.77 | 1.26 | — |
shell -c 'echo x' round-trip
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| dash | 0.22 | 0.22 | 0.06 | — |
| bash | 0.41 | 0.40 | 0.08 | — |
| sh | 0.42 | 0.41 | 0.10 | — |
| zsh | 0.52 | 0.52 | 0.08 | — |
| lash | 0.60 | 0.57 | 0.15 | — |
| fish | 6.41 | 6.35 | 1.63 | — |
shell -c 'echo x | cat' round-trip
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| dash | 0.50 | 0.21 | 0.03 | — |
| lash | 0.73 | 0.60 | 0.04 | — |
| sh | 0.75 | 0.39 | 0.05 | — |
| bash | 0.76 | 0.40 | 0.06 | — |
| zsh | 0.86 | 0.52 | 0.10 | — |
| fish | 6.07 | 5.78 | 0.57 | — |
Pipe Throughput
Raw data throughput through pipes. In the throughput (MB/s) column, higher is better.
64MB through single pipe, minimal output
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.44 | 0.60 | 0.06 | 145125 |
| bash | 6.87 | 0.47 | 0.30 | 9315 |
| dash | 7.23 | 0.35 | 0.61 | 8847 |
| sh | 7.32 | 0.53 | 0.55 | 8740 |
| lash | 7.42 | 0.67 | 0.54 | 8623 |
| zsh | 7.64 | 0.62 | 0.60 | 8376 |
| fish | 12.16 | 5.63 | 0.63 | 5264 |
64MB through 3 cat stages
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.45 | 0.61 | 0.02 | 142222 |
| sh | 7.73 | 0.48 | 0.62 | 8277 |
| dash | 7.90 | 0.33 | 0.47 | 8101 |
| bash | 7.93 | 0.50 | 0.53 | 8072 |
| lash | 8.16 | 0.69 | 0.43 | 7842 |
| zsh | 8.54 | 0.72 | 2.69 | 7498 |
| fish | 14.02 | 6.88 | 0.66 | 4566 |
16MB output streamed to sink
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 2.28 | 0.63 | 0.32 | 7027 |
| dash | 2.34 | 0.26 | 0.34 | 6845 |
| lash | 2.55 | 0.63 | 0.25 | 6270 |
| bash | 2.63 | 0.47 | 0.14 | 6091 |
| zsh | 2.74 | 0.58 | 0.29 | 5835 |
| sh | 2.81 | 0.49 | 0.30 | 5689 |
| fish | 8.44 | 6.28 | 0.48 | 1896 |
echo|cat round-trip latency — turbo 1.1x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| dash | 0.45 | 0.21 | 0.06 | — |
| lash-turbo | 0.63 | 0.55 | 0.14 | — |
| bash | 0.69 | 0.39 | 0.07 | — |
| sh | 0.69 | 0.39 | 0.08 | — |
| lash | 0.71 | 0.60 | 0.10 | — |
| zsh | 0.84 | 0.51 | 0.07 | — |
| fish | 5.94 | 5.71 | 0.41 | — |
1GB through single pipe, minimal output
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.45 | 0.61 | 0.04 | 2255507 |
| dash | 121.57 | 0.52 | 9.62 | 8423 |
| lash | 126.99 | 1.17 | 6.18 | 8064 |
| fish | 130.03 | 8.01 | 11.16 | 7875 |
| sh | 155.11 | 0.96 | 10.01 | 6602 |
| bash | 157.26 | 1.12 | 9.26 | 6511 |
| zsh | 167.53 | 1.38 | 6.34 | 6112 |
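
The throughput column in the table above is consistent with payload size divided by median time, taking 1 GB as 1024 MB. The sketch below recomputes it under that assumption; the function name is hypothetical, and values recomputed from the rounded medians shown here can differ slightly from the listed ones.

```python
def throughput_mb_s(payload_mb, median_ms):
    """Throughput in MB/s for a payload in MB and a median time in ms."""
    return payload_mb / (median_ms / 1000.0)


# dash row of the 1GB table: 1024 MB in 121.57 ms
print(round(throughput_mb_s(1024, 121.57)))  # 8423
```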
seq 1M | sort | tail — turbo 7.6x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 38.10 | 1.29 | 1.64 | — |
| zsh | 288.38 | 0.82 | 5.70 | — |
| sh | 289.07 | 0.71 | 7.00 | — |
| lash | 290.07 | 0.99 | 6.53 | — |
| dash | 292.85 | 0.46 | 2.98 | — |
| bash | 302.89 | 1.05 | 9.46 | — |
| fish | 318.90 | 10.70 | 3.90 | — |
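
The "turbo Nx vs forked" figure in the table captions matches the ratio of the lash median to the lash-turbo median. A small sketch of that arithmetic, with a hypothetical helper name:

```python
def turbo_speedup(forked_ms, turbo_ms):
    """Ratio of the forked (lash) median to the turbo (lash-turbo) median."""
    return forked_ms / turbo_ms


# seq 1M | sort | tail: lash 290.07 ms, lash-turbo 38.10 ms
print(round(turbo_speedup(290.07, 38.10), 1))  # 7.6
```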
echo through 5 cat stages — turbo 1.0x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| dash | 0.85 | 0.33 | 0.14 | — |
| lash | 1.02 | 0.67 | 0.46 | — |
| lash-turbo | 1.04 | 0.67 | 0.29 | — |
| sh | 1.12 | 0.50 | 0.16 | — |
| bash | 1.14 | 0.50 | 0.21 | — |
| zsh | 1.34 | 0.54 | 0.26 | — |
| fish | 9.33 | 8.18 | 2.21 | — |
echo through 10 cat stages — turbo 1.0x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| dash | 0.82 | 0.24 | 0.14 | — |
| lash-turbo | 1.10 | 0.61 | 0.09 | — |
| lash | 1.12 | 0.62 | 0.08 | — |
| bash | 1.14 | 0.46 | 0.09 | — |
| sh | 1.15 | 0.44 | 0.11 | — |
| zsh | 1.78 | 0.55 | 0.16 | — |
| fish | 7.19 | 5.95 | 0.58 | — |
sort 100K lines — turbo 52.7x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.49 | 0.67 | 0.01 | — |
| dash | 25.52 | 0.39 | 0.33 | — |
| sh | 25.63 | 0.57 | 0.28 | — |
| bash | 25.80 | 0.55 | 0.24 | — |
| lash | 25.83 | 0.79 | 0.16 | — |
| zsh | 26.03 | 0.69 | 0.48 | — |
| fish | 31.82 | 6.26 | 0.55 | — |
sort | head from 100K lines — turbo 9.5x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 2.48 | 0.63 | 0.09 | — |
| dash | 23.04 | 0.37 | 0.26 | — |
| sh | 23.37 | 0.55 | 0.12 | — |
| bash | 23.45 | 0.59 | 0.25 | — |
| lash | 23.61 | 0.79 | 0.16 | — |
| zsh | 23.95 | 0.72 | 0.32 | — |
| fish | 29.79 | 6.33 | 0.72 | — |
sort | tail from 100K lines — turbo 6.2x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 4.11 | 0.61 | 0.05 | — |
| dash | 25.43 | 0.37 | 0.19 | — |
| lash | 25.65 | 0.74 | 0.63 | — |
| bash | 25.87 | 0.56 | 0.12 | — |
| zsh | 25.93 | 0.79 | 0.19 | — |
| sh | 26.76 | 0.65 | 0.47 | — |
| fish | 31.14 | 5.88 | 0.59 | — |
grep | sort | head from 100K — turbo 5.8x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 1.73 | 0.61 | 0.02 | — |
| lash | 10.02 | 0.63 | 0.14 | — |
| dash | 10.13 | 0.37 | 0.13 | — |
| sh | 10.25 | 0.49 | 0.21 | — |
| bash | 10.26 | 0.53 | 0.09 | — |
| zsh | 10.54 | 0.65 | 0.17 | — |
| fish | 16.29 | 6.58 | 0.16 | — |
16MB through 1 cat stage
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.42 | 0.57 | 0.03 | 37736 |
| dash | 2.51 | 0.27 | 0.26 | 6376 |
| bash | 2.79 | 0.52 | 0.35 | 5742 |
| zsh | 2.93 | 0.63 | 0.38 | 5456 |
| lash | 2.94 | 0.71 | 0.26 | 5451 |
| sh | 3.02 | 0.53 | 0.39 | 5304 |
| fish | 8.53 | 6.00 | 0.83 | 1877 |
16MB through 2 cat stages
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.49 | 0.64 | 0.01 | 32922 |
| dash | 2.83 | 0.36 | 0.14 | 5652 |
| zsh | 3.24 | 0.66 | 0.27 | 4940 |
| bash | 3.30 | 0.56 | 0.28 | 4853 |
| lash | 3.33 | 0.70 | 0.39 | 4811 |
| sh | 3.44 | 0.55 | 0.40 | 4654 |
| fish | 8.58 | 6.51 | 0.54 | 1864 |
16MB through 4 cat stages
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.44 | 0.62 | 0.08 | 36364 |
| dash | 3.04 | 0.35 | 0.43 | 5269 |
| lash | 3.08 | 0.65 | 0.20 | 5189 |
| sh | 3.33 | 0.52 | 0.22 | 4812 |
| bash | 3.41 | 0.56 | 0.17 | 4698 |
| zsh | 3.46 | 0.64 | 0.22 | 4623 |
| fish | 9.28 | 6.10 | 0.91 | 1724 |
16MB through 8 cat stages
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.44 | 0.58 | 0.03 | 36199 |
| dash | 3.40 | 0.35 | 0.21 | 4713 |
| sh | 3.75 | 0.52 | 0.41 | 4269 |
| bash | 3.75 | 0.54 | 0.15 | 4263 |
| lash | 3.81 | 0.72 | 0.32 | 4200 |
| zsh | 4.22 | 0.66 | 0.24 | 3788 |
| fish | 10.15 | 6.92 | 0.63 | 1576 |
16MB through 16 cat stages
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.50 | 0.64 | 0.03 | 32258 |
| dash | 5.90 | 0.36 | 0.45 | 2711 |
| bash | 6.04 | 0.61 | 0.37 | 2649 |
| lash | 6.19 | 0.76 | 0.24 | 2583 |
| sh | 6.32 | 0.55 | 0.71 | 2531 |
| zsh | 7.39 | 0.72 | 0.80 | 2164 |
| fish | 12.05 | 6.67 | 0.74 | 1328 |
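
In the cat-stage tables above, lash-turbo stays between 0.42 and 0.50 ms regardless of stage count, which is what one would expect if identity stages are dropped before anything runs. A minimal sketch of that idea in Python, assuming a pipeline is represented as a list of argv lists and that only a bare `cat` counts as a passthrough. Both are illustrative assumptions, not lash's actual representation or rule set.

```python
def strip_passthrough(stages):
    """Drop stages that copy stdin to stdout unchanged (here: bare `cat`)."""
    kept = [stage for stage in stages if stage != ["cat"]]
    # Keep a single `cat` if every stage was a passthrough, so the
    # pipeline never becomes empty.
    return kept or [["cat"]]


pipeline = [["seq", "3"], ["cat"], ["cat"], ["wc", "-l"]]
print(strip_passthrough(pipeline))  # [['seq', '3'], ['wc', '-l']]
```

A stage such as `cat -n` changes its input, so it is not matched and stays in the pipeline.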
16MB direct write to file
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| dash | 2.14 | 0.25 | 0.41 | 7468 |
| zsh | 2.56 | 0.56 | 0.10 | 6261 |
| sh | 2.67 | 0.47 | 0.31 | 5991 |
| lash-turbo | 2.85 | 0.64 | 0.34 | 5619 |
| bash | 2.88 | 0.45 | 0.35 | 5560 |
| lash | 2.94 | 0.61 | 0.38 | 5435 |
| fish | 9.02 | 6.06 | 0.76 | 1774 |
16MB through pipe then to file
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 3.44 | 0.72 | 0.27 | 4648 |
| dash | 3.48 | 0.36 | 0.29 | 4599 |
| zsh | 4.00 | 0.66 | 0.23 | 4005 |
| bash | 4.06 | 0.57 | 0.24 | 3939 |
| lash | 4.22 | 0.75 | 0.34 | 3787 |
| sh | 4.23 | 0.54 | 0.31 | 3784 |
| fish | 9.23 | 6.10 | 0.66 | 1733 |
16MB read from file through pipe
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 3.54 | 0.65 | 0.55 | 4513 |
| lash | 5.70 | 0.73 | 0.63 | 2808 |
| dash | 5.82 | 0.32 | 0.48 | 2748 |
| bash | 6.03 | 0.55 | 0.37 | 2654 |
| sh | 6.23 | 0.51 | 0.35 | 2568 |
| zsh | 6.46 | 0.70 | 0.56 | 2479 |
| fish | 14.02 | 6.84 | 1.29 | 1141 |
16MB to /dev/null (overhead baseline)
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| dash | 0.56 | 0.26 | 0.49 | 28319 |
| bash | 0.83 | 0.44 | 0.02 | 19196 |
| lash | 0.84 | 0.61 | 0.02 | 19025 |
| sh | 0.87 | 0.45 | 0.05 | 18486 |
| zsh | 0.87 | 0.56 | 0.04 | 18401 |
| lash-turbo | 0.88 | 0.63 | 0.28 | 18223 |
| fish | 6.79 | 6.26 | 1.28 | 2356 |
Scripting Operations
Common data-processing patterns across shells.
Sort 1000 lines (reverse numeric) — turbo 2.0x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.44 | 0.63 | 0.02 | — |
| dash | 0.71 | 0.26 | 0.04 | — |
| lash | 0.90 | 0.63 | 0.12 | — |
| bash | 0.94 | 0.44 | 0.52 | — |
| sh | 0.97 | 0.45 | 0.46 | — |
| zsh | 1.13 | 0.57 | 0.12 | — |
| fish | 6.66 | 5.99 | 1.58 | — |
Sort 10000 lines (reverse numeric) — turbo 5.5x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.52 | 0.69 | 0.03 | — |
| dash | 2.83 | 0.28 | 0.22 | — |
| lash | 2.85 | 0.62 | 0.33 | — |
| bash | 2.97 | 0.44 | 0.43 | — |
| sh | 3.00 | 0.46 | 0.42 | — |
| zsh | 3.07 | 0.55 | 0.08 | — |
| fish | 11.19 | 6.74 | 1.73 | — |
Filter odd-ending numbers from 1K via grep — turbo 1.0x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.79 | 0.59 | 0.03 | — |
| lash | 0.82 | 0.61 | 0.04 | — |
| bash | 0.85 | 0.44 | 0.03 | — |
| sh | 0.87 | 0.43 | 0.13 | — |
| dash | 0.88 | 0.33 | 0.13 | — |
| zsh | 1.09 | 0.57 | 0.24 | — |
| fish | 7.42 | 6.88 | 1.00 | — |
Filter even numbers from 10K via awk — turbo 1.0x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| dash | 1.85 | 0.26 | 0.24 | — |
| lash | 1.98 | 0.63 | 0.10 | — |
| sh | 1.99 | 0.45 | 0.31 | — |
| bash | 2.00 | 0.44 | 0.07 | — |
| lash-turbo | 2.00 | 0.63 | 0.09 | — |
| zsh | 2.24 | 0.56 | 0.39 | — |
| fish | 9.46 | 7.51 | 1.10 | — |
Transform (x2) 1K lines via awk — turbo 1.0x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| dash | 0.96 | 0.26 | 0.04 | — |
| lash | 1.15 | 0.61 | 0.16 | — |
| lash-turbo | 1.15 | 0.62 | 0.04 | — |
| sh | 1.17 | 0.43 | 0.10 | — |
| bash | 1.19 | 0.45 | 0.12 | — |
| zsh | 1.33 | 0.58 | 0.47 | — |
| fish | 9.00 | 7.30 | 2.84 | — |
Transform (x2) 10K lines via awk — turbo 1.0x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| dash | 2.10 | 0.28 | 0.78 | — |
| lash | 2.21 | 0.62 | 0.13 | — |
| lash-turbo | 2.23 | 0.63 | 0.08 | — |
| sh | 2.36 | 0.45 | 0.38 | — |
| bash | 2.45 | 0.45 | 0.79 | — |
| zsh | 2.68 | 0.70 | 0.40 | — |
| fish | 9.13 | 6.53 | 1.86 | — |
Sort 1K then take first 10 (sort | head) — turbo 1.1x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| dash | 0.74 | 0.27 | 0.22 | — |
| lash-turbo | 0.81 | 0.78 | 0.38 | — |
| lash | 0.88 | 0.61 | 0.04 | — |
| bash | 0.93 | 0.43 | 0.03 | — |
| sh | 0.97 | 0.44 | 0.25 | — |
| zsh | 1.09 | 0.56 | 0.04 | — |
| fish | 8.31 | 7.64 | 1.67 | — |
Filter even then double from 1K (awk combo) — turbo 1.1x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| dash | 1.00 | 0.26 | 0.03 | — |
| lash-turbo | 1.16 | 0.62 | 0.26 | — |
| lash | 1.23 | 0.67 | 0.13 | — |
| bash | 1.26 | 0.47 | 0.03 | — |
| sh | 1.33 | 0.46 | 0.13 | — |
| zsh | 1.41 | 0.56 | 0.11 | — |
| fish | 7.38 | 6.54 | 2.25 | — |
Filter+sort+take pipeline from 1K — turbo 1.1x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| dash | 1.06 | 0.26 | 0.12 | — |
| lash-turbo | 1.08 | 0.60 | 0.04 | — |
| lash | 1.22 | 0.63 | 0.33 | — |
| bash | 1.28 | 0.46 | 0.14 | — |
| sh | 1.28 | 0.44 | 0.09 | — |
| zsh | 1.49 | 0.58 | 0.35 | — |
| fish | 7.36 | 6.06 | 2.12 | — |
Substring grep '42' in 10K lines — turbo 1.0x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| dash | 0.68 | 0.27 | 0.02 | — |
| lash | 0.82 | 0.62 | 0.15 | — |
| lash-turbo | 0.86 | 0.60 | 0.05 | — |
| bash | 0.89 | 0.45 | 0.35 | — |
| sh | 0.90 | 0.45 | 0.04 | — |
| zsh | 1.04 | 0.57 | 0.16 | — |
| fish | 7.73 | 6.14 | 1.83 | — |
Sort 1M lines (reverse numeric) — turbo 14.9x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 20.15 | 0.93 | 0.49 | — |
| bash | 299.44 | 0.72 | 3.82 | — |
| lash | 299.79 | 1.07 | 2.52 | — |
| dash | 299.98 | 0.64 | 2.66 | — |
| sh | 300.51 | 0.79 | 5.20 | — |
| zsh | 305.48 | 1.01 | 5.29 | — |
| fish | 321.15 | 10.48 | 2.79 | — |
Filter lines starting with even digit from 1M — turbo 20.0x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.58 | 0.79 | 0.03 | — |
| dash | 11.20 | 0.35 | 0.26 | — |
| bash | 11.48 | 0.49 | 0.20 | — |
| lash | 11.59 | 0.75 | 0.82 | — |
| sh | 11.61 | 0.58 | 1.26 | — |
| zsh | 12.16 | 0.66 | 1.50 | — |
| fish | 20.48 | 8.89 | 2.46 | — |
Prepend prefix to 1M lines via sed — turbo 75.6x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.47 | 0.61 | 0.04 | — |
| dash | 35.28 | 0.55 | 0.76 | — |
| bash | 35.40 | 1.05 | 1.12 | — |
| zsh | 35.50 | 1.03 | 1.66 | — |
| lash | 35.52 | 0.93 | 0.73 | — |
| sh | 38.32 | 0.68 | 1.31 | — |
| fish | 44.06 | 9.11 | 2.36 | — |
Sort+head+sort+tail pipeline from 100K — turbo 5.8x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 7.02 | 0.69 | 0.20 | — |
| dash | 38.79 | 0.41 | 0.34 | — |
| sh | 39.58 | 0.66 | 1.14 | — |
| zsh | 39.61 | 0.75 | 0.53 | — |
| lash | 40.71 | 1.23 | 0.78 | — |
| fish | 44.53 | 6.14 | 1.05 | — |
| bash | 45.04 | 0.86 | 7.15 | — |
5-stage pipeline: grep+sort+head+sort+wc on 500K — turbo 4.3x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 23.57 | 0.94 | 2.15 | — |
| lash | 101.64 | 0.99 | 2.73 | — |
| bash | 102.22 | 0.61 | 0.58 | — |
| sh | 102.27 | 0.64 | 0.42 | — |
| zsh | 102.79 | 0.76 | 1.09 | — |
| dash | 103.83 | 0.42 | 1.07 | — |
| fish | 110.06 | 6.97 | 0.76 | — |
Generate+sort+uniq+sort pipeline from 100K — turbo 8.8x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 22.54 | 0.85 | 0.81 | — |
| lash | 197.89 | 1.12 | 0.93 | — |
| bash | 200.91 | 0.74 | 4.42 | — |
| sh | 201.50 | 0.94 | 6.46 | — |
| dash | 202.53 | 2.10 | 3.46 | — |
| zsh | 205.52 | 1.26 | 3.10 | — |
| fish | 211.39 | 8.78 | 5.29 | — |
100k small lines through pipe — turbo 2.6x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.42 | 0.59 | 0.02 | — |
| dash | 0.90 | 0.24 | 0.07 | — |
| lash | 1.09 | 0.59 | 0.05 | — |
| bash | 1.11 | 0.43 | 0.03 | — |
| sh | 1.12 | 0.43 | 0.03 | — |
| zsh | 1.28 | 0.55 | 0.03 | — |
| fish | 6.70 | 5.62 | 0.53 | — |
1M small lines through pipe — turbo 10.0x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.43 | 0.59 | 0.04 | — |
| dash | 4.14 | 0.26 | 0.09 | — |
| lash | 4.29 | 0.61 | 0.10 | — |
| sh | 4.35 | 0.44 | 0.06 | — |
| bash | 4.41 | 0.44 | 4.56 | — |
| zsh | 4.50 | 0.55 | 0.18 | — |
| fish | 10.48 | 6.04 | 0.79 | — |
100k lines through grep filter — turbo 4.1x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.48 | 0.64 | 0.01 | — |
| dash | 1.84 | 0.27 | 0.08 | — |
| lash | 1.95 | 0.61 | 0.48 | — |
| sh | 2.03 | 0.45 | 0.09 | — |
| bash | 2.10 | 0.46 | 0.06 | — |
| zsh | 2.25 | 0.59 | 0.07 | — |
| fish | 8.00 | 5.70 | 0.54 | — |
single fork+exec (no pipe) — turbo 1.0x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| dash | 0.30 | 0.30 | 0.06 | — |
| bash | 0.44 | 0.47 | 0.04 | — |
| sh | 0.44 | 0.46 | 0.04 | — |
| zsh | 0.59 | 0.60 | 0.05 | — |
| lash-turbo | 0.61 | 0.61 | 0.04 | — |
| lash | 0.63 | 0.62 | 0.06 | — |
| fish | 5.69 | 5.68 | 0.46 | — |
2-stage no-op pipe setup — turbo 1.1x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| dash | 0.40 | 0.31 | 0.04 | — |
| lash-turbo | 0.60 | 0.56 | 0.05 | — |
| bash | 0.62 | 0.49 | 0.04 | — |
| sh | 0.62 | 0.49 | 0.04 | — |
| lash | 0.66 | 0.62 | 0.05 | — |
| zsh | 0.69 | 0.60 | 0.07 | — |
| fish | 5.82 | 5.77 | 0.58 | — |
5-stage no-op pipe setup — turbo 0.9x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| dash | 0.40 | 0.24 | 0.20 | — |
| bash | 0.77 | 0.49 | 0.05 | — |
| sh | 0.77 | 0.50 | 0.06 | — |
| lash | 0.78 | 0.61 | 0.04 | — |
| zsh | 0.81 | 0.55 | 0.14 | — |
| lash-turbo | 0.84 | 0.65 | 0.05 | — |
| fish | 5.83 | 5.82 | 0.77 | — |
10-stage no-op pipe setup — turbo 1.0x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| dash | 0.51 | 0.24 | 0.04 | — |
| sh | 0.79 | 0.42 | 0.12 | — |
| bash | 0.82 | 0.43 | 0.18 | — |
| lash | 0.99 | 0.59 | 0.05 | — |
| lash-turbo | 1.00 | 0.64 | 0.05 | — |
| zsh | 1.03 | 0.53 | 0.12 | — |
| fish | 5.77 | 5.72 | 0.56 | — |
Turbo Mode (lash vs lash-turbo)
Turbo mode rewrites common pipelines into native array operations, avoiding fork/exec overhead.
sort 1K lines — turbo 2.0x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.47 | 0.62 | 0.05 | — |
| lash | 0.94 | 0.63 | 0.03 | — |
sort -n 1K lines — turbo 2.3x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.40 | 0.54 | 0.05 | — |
| lash | 0.94 | 0.63 | 0.02 | — |
sort -rn 1K lines — turbo 1.9x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.49 | 0.66 | 0.03 | — |
| lash | 0.95 | 0.63 | 0.06 | — |
grep pattern from 1K lines — turbo 1.1x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.82 | 0.61 | 0.15 | — |
| lash | 0.92 | 0.68 | 0.11 | — |
grep -v pattern from 1K lines — turbo 1.0x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.90 | 0.64 | 0.06 | — |
| lash | 0.94 | 0.65 | 0.10 | — |
head -10 from 1K lines — turbo 2.1x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.39 | 0.56 | 0.05 | — |
| lash | 0.83 | 0.67 | 0.09 | — |
tail -10 from 1K lines — turbo 1.7x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.47 | 0.63 | 0.02 | — |
| lash | 0.81 | 0.65 | 0.19 | — |
uniq 1K sorted lines — turbo 1.1x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.91 | 0.56 | 0.05 | — |
| lash | 0.96 | 0.63 | 0.04 | — |
tac (reverse) 1K lines — turbo 1.8x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.47 | 0.62 | 0.03 | — |
| lash | 0.84 | 0.65 | 0.06 | — |
wc -l count 1K lines — turbo 1.8x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.46 | 0.62 | 0.01 | — |
| lash | 0.85 | 0.63 | 0.03 | — |
sort | head -10 from 1K — turbo 1.6x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.61 | 0.54 | 0.04 | — |
| lash | 1.00 | 0.63 | 0.06 | — |
grep | sort | tail from 1K — turbo 1.0x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.92 | 0.69 | 0.04 | — |
| lash | 0.94 | 0.62 | 0.05 | — |
sort | uniq from 1K — turbo 1.4x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.64 | 0.56 | 0.08 | — |
| lash | 0.90 | 0.63 | 0.05 | — |
sort | head -10 from 10K — turbo 2.6x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.87 | 0.57 | 0.07 | — |
| lash | 2.26 | 0.65 | 0.07 | — |
sort | tail -10 from 10K — turbo 2.8x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.87 | 0.57 | 0.01 | — |
| lash | 2.42 | 0.63 | 0.05 | — |
sort -n | head -10 from 10K — turbo 3.4x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.75 | 0.56 | 0.02 | — |
| lash | 2.57 | 0.63 | 0.05 | — |
sort -n | tail -10 from 10K — turbo 3.0x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.92 | 0.56 | 0.06 | — |
| lash | 2.78 | 0.64 | 0.06 | — |
sort -r | head -10 from 10K — turbo 2.8x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.86 | 0.55 | 0.02 | — |
| lash | 2.41 | 0.63 | 0.06 | — |
sort -r | tail -10 from 10K — turbo 3.1x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.84 | 0.55 | 0.08 | — |
| lash | 2.62 | 0.62 | 0.08 | — |
sort -rn | head -10 from 10K — turbo 3.1x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.91 | 0.55 | 0.02 | — |
| lash | 2.78 | 0.65 | 0.04 | — |
sort -rn | tail -10 from 10K — turbo 3.2x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.94 | 0.67 | 0.12 | — |
| lash | 3.04 | 0.69 | 0.17 | — |
sort | head -10 from 100K — turbo 5.7x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 3.51 | 0.60 | 0.11 | — |
| lash | 19.97 | 0.75 | 0.43 | — |
sort -n | head -10 from 100K — turbo 9.7x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 2.39 | 0.61 | 0.17 | — |
| lash | 23.23 | 0.72 | 0.27 | — |
sort -rn | head -10 from 100K — turbo 6.0x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 4.09 | 0.61 | 0.08 | — |
| lash | 24.70 | 0.80 | 0.20 | — |
sort | tail -10 from 100K — turbo 6.6x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 3.44 | 0.61 | 0.07 | — |
| lash | 22.65 | 0.73 | 0.23 | — |
grep | head -5 from 10K (early term) — turbo 1.9x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.50 | 0.64 | 0.02 | — |
| lash | 0.97 | 0.64 | 0.04 | — |
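
The "early term" label suggests that a fused `grep | head` stops reading as soon as enough matches are found. A minimal Python sketch of that technique follows; the helper names are hypothetical, and this is not lash's D implementation.

```python
import re
from itertools import islice


def grep_head(lines, pattern, limit):
    """Return the first `limit` matching lines, stopping once they are found."""
    regex = re.compile(pattern)
    matches = (line for line in lines if regex.search(line))
    return list(islice(matches, limit))


def numbers(count, consumed):
    """Yield "1".."count" and record every line that was actually pulled."""
    for n in range(1, count + 1):
        consumed.append(n)
        yield str(n)


consumed = []
print(grep_head(numbers(10_000, consumed), r"7$", 5))  # ['7', '17', '27', '37', '47']
print(len(consumed))  # 47
```

Only 47 of the 10,000 input lines are read, because the generator is never advanced after the fifth match.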
grep -v | head -5 from 10K (early term) — turbo 2.3x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.40 | 0.55 | 0.02 | — |
| lash | 0.91 | 0.66 | 0.09 | — |
grep | tail -5 from 10K (ring buffer) — turbo 1.9x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.41 | 0.57 | 0.12 | — |
| lash | 0.78 | 0.56 | 0.06 | — |
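
The "ring buffer" label suggests that a fused `grep | tail` keeps only the most recent N matches instead of collecting all of them. A minimal Python sketch of that approach, using `collections.deque` as the ring buffer; the function name is hypothetical, and this is not lash's actual code.

```python
import re
from collections import deque


def grep_tail(lines, pattern, limit):
    """Return the last `limit` matching lines using a fixed-size buffer."""
    regex = re.compile(pattern)
    ring = deque(maxlen=limit)  # oldest entries fall out automatically
    for line in lines:
        if regex.search(line):
            ring.append(line)
    return list(ring)


lines = (str(n) for n in range(1, 10_001))
print(grep_tail(lines, r"7$", 5))  # ['9957', '9967', '9977', '9987', '9997']
```

Memory use is bounded by `limit` rather than by the number of matches.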
grep | wc -l from 10K (count) — turbo 1.7x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.50 | 0.71 | 0.14 | — |
| lash | 0.85 | 0.62 | 0.18 | — |
grep -v | wc -l from 10K (count) — turbo 1.7x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.52 | 0.68 | 0.11 | — |
| lash | 0.90 | 0.57 | 0.17 | — |
grep | head -5 from 100K (early term) — turbo 2.7x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.40 | 0.55 | 0.01 | — |
| lash | 1.08 | 0.60 | 0.12 | — |
grep | tail -5 from 100K (ring buffer) — turbo 2.8x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.48 | 0.62 | 0.01 | — |
| lash | 1.34 | 0.69 | 0.11 | — |
grep | wc -l from 100K (count) — turbo 2.9x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.44 | 0.61 | 0.07 | — |
| lash | 1.26 | 0.60 | 0.02 | — |
tac | head -10 from 10K (rewrite) — turbo 1.3x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.67 | 0.57 | 0.05 | — |
| lash | 0.88 | 0.65 | 0.04 | — |
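
One plausible reading of the "rewrite" label is that `tac | head -N` is replaced by an equivalent that never reverses the whole input: keep the last N lines, then reverse only those. The Python sketch below shows that equivalence. It is an assumption about the rewrite, not a description of lash's internals, and the function names are hypothetical.

```python
from collections import deque


def tac_head(lines, limit):
    """Equivalent of `tac | head -limit`: the last `limit` lines, newest first."""
    ring = deque(lines, maxlen=limit)  # retains only the final `limit` lines
    ring.reverse()
    return list(ring)


def tac_head_naive(lines, limit):
    """Reference version: reverse everything, then truncate."""
    return list(reversed(list(lines)))[:limit]


print(tac_head((str(n) for n in range(1, 10_001)), 3))  # ['10000', '9999', '9998']
```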
tac | tail -10 from 10K (rewrite) — turbo 2.2x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.41 | 0.55 | 0.05 | — |
| lash | 0.92 | 0.64 | 0.06 | — |
sort 10K lines — turbo 5.9x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.41 | 0.58 | 0.04 | — |
| lash | 2.41 | 0.64 | 0.08 | — |
grep pattern from 10K lines — turbo 1.0x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash | 0.92 | 0.64 | 0.06 | — |
| lash-turbo | 0.93 | 0.65 | 0.05 | — |
sort | grep | head from 10K — turbo 2.3x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 1.28 | 0.58 | 0.06 | — |
| lash | 2.91 | 0.65 | 0.08 | — |
single true (no-op baseline) — turbo 1.0x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash | 0.62 | 0.62 | 0.05 | — |
| lash-turbo | 0.62 | 0.62 | 0.06 | — |
2-stage true pipe (not optimizable) — turbo 1.0x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 0.70 | 0.62 | 0.08 | — |
| lash | 0.72 | 0.62 | 0.05 | — |
10-stage true pipe (not optimizable) — turbo 1.0x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash | 1.05 | 0.64 | 0.08 | — |
| lash-turbo | 1.06 | 0.64 | 0.09 | — |
awk pipe (not optimizable) — turbo 1.0x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 1.18 | 0.63 | 0.13 | — |
| lash | 1.20 | 0.67 | 0.10 | — |
100k lines through grep filter — turbo 1.0x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash | 1.25 | 0.57 | 0.04 | — |
| lash-turbo | 1.29 | 0.74 | 0.06 | — |
sort+head | awk | sort+head from 100K — turbo 4.6x vs forked
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-turbo | 5.54 | 0.61 | 0.16 | — |
| lash | 25.21 | 0.78 | 1.30 | — |
Prompt Rendering
Prompt rendering latency with starship.
starship prompt render
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| bash | 5.85 | 0.51 | 0.57 | — |
| lash | 5.92 | 0.70 | 0.80 | — |
| dash | 6.23 | 0.35 | 0.77 | — |
| zsh | 6.65 | 0.70 | 0.72 | — |
| sh | 7.00 | 0.62 | 0.91 | — |
| fish | 13.99 | 7.56 | 1.67 | — |
starship prompt in git repo
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| dash | 3.46 | 0.32 | 0.50 | — |
| bash | 3.76 | 0.54 | 0.67 | — |
| sh | 3.85 | 0.50 | 0.63 | — |
| zsh | 3.92 | 0.63 | 0.53 | — |
| lash | 4.55 | 0.76 | 0.66 | — |
| fish | 11.06 | 7.03 | 1.61 | — |
Protocol Overhead
Internal protocol performance (lash-direct only).
10k small lines (protocol batching stress)
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-direct | 0.20 | 0.14 | 0.02 | — |
16 x 1MB writes (large chunk throughput)
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-direct | 2.89 | 0.18 | 0.49 | 5541 |
small lines then bulk data burst
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-direct | 1.98 | 0.15 | 0.25 | — |
16MB bulk data delivered to client
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-direct | 2.10 | 0.16 | 0.23 | 7628 |
1GB bulk data delivered to client
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-direct | 104.57 | 0.29 | 9.45 | 9792 |
~1MB as individual lines to client
| shell | median (ms) | startup (ms) | stddev (ms) | throughput (MB/s) |
|---|---|---|---|---|
| lash-direct | 0.62 | 0.16 | 0.07 | — |
How Turbo Mode Works
Turbo mode applies these optimizations automatically:
- Passthrough stripping: removes identity operations so they never execute
- Numeric sort key pre-computation: pre-computes keys in O(N) instead of parsing inside the comparator at O(N log N)
- Streaming `wc -l`: counts newlines in the byte stream without collecting lines
- C `strtod` for numeric conversion: calls C's `strtod` directly, avoiding the exception overhead of D's `to!double`
- Fused operations: `grep | head`, `grep | tail`, and `grep | wc` run in a single pass over the data
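
Two of the optimizations listed above, numeric sort key pre-computation and streaming line counting, can be sketched as follows. This is an illustration in Python rather than lash's D code. The function names are hypothetical, and the sort assumes every line parses as a number.

```python
import io


def sort_numeric(lines):
    """Sort lines numerically, parsing each line exactly once."""
    keyed = [(float(line), line) for line in lines]  # N parses, done up front
    keyed.sort(key=lambda pair: pair[0])             # comparisons see floats only
    return [line for _, line in keyed]


def count_lines(stream, chunk_size=64 * 1024):
    """Count newline bytes in a binary stream without building a list of lines."""
    total = 0
    while chunk := stream.read(chunk_size):
        total += chunk.count(b"\n")
    return total


print(sort_numeric(["10", "9", "2.5"]))        # ['2.5', '9', '10']
print(count_lines(io.BytesIO(b"a\nb\nc\n")))   # 3
```

Parsing inside a comparator would repeat the string-to-number conversion on every comparison, which is the O(N log N) cost the pre-computed keys avoid.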
Running Benchmarks
dub run :benchmarks
Options
| Flag | Description |
|---|---|
| `--runs N` | Number of iterations per scenario |
| `--warmup N` | Warmup iterations before measurement |
| `--scenario S` | Run only the named scenario |
| `--json` | Output results in JSON format |
| `--verbose` | Print per-iteration timings |
To reproduce these numbers:
dub run :benchmarks -- --runs 100 --warmup 10 --verbose