
Pipes and Chains

A pipe connects the output of one command to the input of another. You write it with the | symbol:

ls | sort | head -5

Each command runs as a separate process. The first writes to standard output, the second reads from standard input, and so on. Bytes flow left-to-right; nothing is held in memory longer than necessary.

Pipes are the original Unix way to compose work. They’re great for gluing together existing tools — find finds files, grep filters lines, wc counts. Each does one thing.
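For example, the three tools just mentioned compose directly. A hedged sketch — the directory, glob, and search string are placeholders — that counts how many lines across a tree’s .log files mention “ERROR”:

```shell
# find lists the files, grep filters the matching lines, wc counts them.
find . -name '*.log' -print0 \
  | xargs -0 grep -h 'ERROR' \
  | wc -l
```

Each stage is a separate process; the bytes stream through without any intermediate file.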

A chain is lash’s functional alternative to a pipe. You wrap a command in backticks and follow it with .method() calls. Each line of the command’s output becomes an element in a list, and the methods operate on the list:

`ls`.filter(x => x.endsWith(".log")).take(5)

The advantages over pipes:

  • The methods are typed — .filter, .map, .take, .sort work the same way on every list, no surprises.
  • No extra processes: the chain runs in the lash interpreter.
  • Lambda syntax (x => ...) gives you arbitrary per-element logic without reaching for awk or sed.
  • The structure reads like English, top to bottom.

You can mix the two — pipes are great for the I/O bookends; chains are great for the middle logic.

In this tutorial we’ll build a script that finds the largest files eating your disk space, sorted by size, with readable output. We’ll start with a plain pipe, move to a chain, add tests for the per-line transform, then put it all together as a script.

lash --test disk-hogs.lash # runs the unittests only
lash disk-hogs.lash --help # auto-generated help

The --test flag runs only the unittest { } blocks; the fn main body is skipped.


The simplest version uses standard POSIX pipes:

du -sh $HOME/* | sort -rh | head -20

That works, but sort -h (human-numeric sort) is a GNU extension that isn’t portable across systems, and the output is a wall of text that’s hard to filter further without another awk invocation. The chain form fixes both problems.
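If you do need a pipe that runs everywhere, the usual workaround is to give up the readable size column: have du emit plain kilobyte counts and sort those with POSIX sort -n. A sketch:

```shell
# -k: sizes in kilobytes (portable); sort -rn: numeric, descending.
du -sk "$HOME"/* | sort -rn | head -20
```

The trade-off is exactly what the chain version avoids: you lose the human-readable sizes the moment you need a sortable column.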


Wrap the command in backticks and add a dot. Each line of the output becomes a list element:

`du -sh $HOME/*`
    .sortNumeric()
    .reverse()
    .take(20)

The du command still runs as an external process — only the part after the dot changes. Reading the chain top-to-bottom: take du’s output, sort numerically, reverse so the biggest is first, keep the top twenty.


du -sh * reports directory totals. For individual files, find is the right tool: with -printf "%s\t%p\n", each line comes back as <size_bytes>\t<path>. We need to convert the bytes into a friendlier MB label.
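In plain shell, the raw listing looks like this — GNU find’s -printf emits one <size_bytes>\t<path> line per file (the $HOME root and 100 MB threshold are just example values):

```shell
# %s = size in bytes, %p = path; sort -rn puts the biggest first.
find "$HOME" -type f -size +100M -printf '%s\t%p\n' 2>/dev/null \
  | sort -rn \
  | head -15
```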

That’s a pure function — exactly the kind of thing to lock down with a test before writing.

/// format_size renders a byte count as an MB label
unittest {
    format_size(1048576).must.equal("1 MB") // exactly 1 MB
    format_size(1572864).must.equal("1.5 MB")
    format_size(0).must.equal("0 MB")
}

The middle case is the one to think about: 1.5 MB shouldn’t silently become “1 MB” or “2 MB” — the test pins down that we keep the fractional value. Run lash --test disk-hogs.lash — it fails (no format_size yet). Now write it:

fn format_size(bytes) {
    let mb = bytes / 1024 / 1024
    return "$mb MB"
}
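The same arithmetic can be sketched in plain shell with awk — %g drops a trailing .0 but keeps real fractions, which matches the behavior the tests pin down (the function name is just for illustration):

```shell
# Convert a byte count into an MB label; %g prints 1, 1.5, 0 as-is.
format_size() {
  awk -v b="$1" 'BEGIN { printf "%g MB\n", b / 1024 / 1024 }'
}

format_size 1048576   # 1 MB
format_size 1572864   # 1.5 MB
format_size 0         # 0 MB
```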

Re-run; passes.

The other piece is parsing the find -printf line into a record:

/// to_entry parses a "<bytes>\t<path>" line into a structured record
unittest {
    let line = "1048576\t/tmp/big.bin"
    let entry = to_entry(line)
    entry["size_bytes"].must.equal(1048576)
    entry["path"].must.equal("/tmp/big.bin")
}

fn to_entry(line) {
    let parts = line.split("\t")
    return { size_bytes: parts[0].toNumber(), path: parts[1] }
}
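The same split can be sketched in shell with parameter expansion, cutting the line at its first tab (the sample line is hard-coded for illustration):

```shell
line="$(printf '1048576\t/tmp/big.bin')"
tab="$(printf '\t')"
size_bytes=${line%%"$tab"*}   # everything before the first tab
path=${line#*"$tab"}          # everything after it
echo "$size_bytes"            # 1048576
echo "$path"                  # /tmp/big.bin
```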

`find $HOME -type f -size +100M -printf "%s\t%p\n" 2>/dev/null`
    .sortNumeric()
    .reverse()
    .take(15)
    .map(to_entry)
    .each(entry => echo "${format_size(entry["size_bytes"])}\t${entry["path"]}")

What each step does:

Step                 Purpose
.sortNumeric()       Sort lines by the numeric prefix
.reverse()           Largest first
.take(15)            Stop after 15
.map(to_entry)       Convert each line into a record
.each(entry => …)    Print each result

.map(), .filter(), and .each() take lambda expressions — anonymous functions written inline with =>:

`ls`.filter(x => x.endsWith(".log"))
`cat counts.txt`.map(x => x.toNumber() * 2)

A lambda body can be a single expression (above), or a block:

.map(x => {
    let parts = x.split("\t")
    let mb = parts[0].toNumber() / 1024 / 1024
    "$mb MB\t${parts[1]}"
})

The last expression in a block lambda is its return value.
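That block lambda does the same job awk is usually reached for. A shell sketch of the same transform, with one sample line piped in for illustration:

```shell
# Turn "<bytes>\t<path>" lines into "<mb> MB\t<path>".
printf '1572864\t/tmp/big.bin\n' \
  | awk -F'\t' '{ printf "%g MB\t%s\n", $1 / 1024 / 1024, $2 }'
```

This prints "1.5 MB", a tab, then the path — the fractional value survives, just as the format_size tests require.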


For machine-readable output, swap the per-element echo for a single list write:

let entries = `find $HOME -type f -size +100M -printf "%s\t%p\n" 2>/dev/null`
    .sortNumeric()
    .reverse()
    .take(15)
    .map(to_entry)

if format == "json" {
    entries.writeln(stdout)
} else {
    for entry in entries {
        echo "${format_size(entry["size_bytes"])}\t${entry["path"]}"
    }
}

Chains built from objects serialize as JSON arrays automatically. To go the other direction — parse JSON input from a command that produces it — use the .json accessor:

`cat servers.json`.json["hosts"]
    .filter(x => x["active"] == true)
    .map(x => x["hostname"])
    .each(x => echo $x)

#!/usr/bin/env lash

/// Find the largest files under a directory.
fn main(path: string = ".", top: int = 15, min_mb: int = 10, format: string = "text") {
    let entries = `find $path -type f -size +${min_mb}M -printf "%s\t%p\n" 2>/dev/null`
        .sortNumeric()
        .reverse()
        .take(top)
        .map(to_entry)

    if format == "json" {
        entries.writeln(stdout)
    } else {
        for entry in entries {
            echo "${format_size(entry["size_bytes"])}\t${entry["path"]}"
        }
    }
}

fn format_size(bytes) {
    let mb = bytes / 1024 / 1024
    return "$mb MB"
}

fn to_entry(line) {
    let parts = line.split("\t")
    return { size_bytes: parts[0].toNumber(), path: parts[1] }
}

/// format_size renders a byte count as an MB label
unittest {
    format_size(1048576).must.equal("1 MB")
    format_size(1572864).must.equal("1.5 MB")
    format_size(0).must.equal("0 MB")
}

/// to_entry parses a "<bytes>\t<path>" line into a structured record
unittest {
    let entry = to_entry("1048576\t/tmp/big.bin")
    entry["size_bytes"].must.equal(1048576)
    entry["path"].must.equal("/tmp/big.bin")
}

Run it:

lash --test disk-hogs.lash # the unittests
lash disk-hogs.lash --help # auto-generated help
lash disk-hogs.lash /home/alice 10 50 # top 10, at least 50 MB
lash disk-hogs.lash . 15 10 json # emit JSON

The doc-comment plus typed parameters become the script’s --help:

lash disk-hogs.lash --help
disk-hogs.lash — Find the largest files under a directory.

Arguments:
  path (string)     default: "."
  top (int)         default: 15
  min_mb (int)      default: 10
  format (string)   default: "text"

Output methods: write values to stdout, stderr, or a file:

"Fetching...".writeln(stderr)
[1, 2, 3].writeln(stdout)

Block-scoped file writing:

write to "output.txt" as f {
    "hello world".writeln(f)
    [1, 2, 3].writeln(f)
}
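The shell analogue of the write to block is a command group with a single redirection — every command inside shares the one open file (output.txt is a placeholder name):

```shell
{
  echo "hello world"
  printf '%s\n' 1 2 3
} > output.txt
```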

Common chain methods:

Method           Description
.map(fn)         Transform each element
.filter(fn)      Keep elements where fn returns true
.each(fn)        Run fn on each element, return nothing
.take(n)         Keep first n elements
.last(n)         Keep last n elements
.sort()          Lexicographic sort
.sortNumeric()   Sort by numeric prefix
.reverse()       Reverse order
.unique()        Remove duplicates
.wordcount()     Count occurrences, return {word, count} objects
.length          Number of elements
.trim()          Strip whitespace (on a string result)
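For comparison, the classic pipe spelling of .wordcount() is sort | uniq -c — uniq -c only counts adjacent duplicates, so the input has to be pre-sorted (the sample input here is hard-coded for illustration):

```shell
# Count word occurrences, most frequent first.
printf 'a b a c a b\n' \
  | tr ' ' '\n' \
  | sort \
  | uniq -c \
  | sort -rn
```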

Bracket access with variables:

let field = "hostname"
`cat servers.json`.json[field]