Script: Analyze Server Logs
What’s a log analyzer?
A web server writes one line per request to its access log. After a few days that’s tens of thousands of lines, and the questions you actually want answered (“which endpoints are hot? which IPs hit us hardest? how many errors today?”) are buried in there.
A log analyzer is just a script that reads the file once, groups lines by some key, counts the groups, and prints the top few. Two ingredients carry most of the weight:
- awk to pull a field out of each line.
- .wordcount() to count unique values in a list and sort by frequency.
We’ll wrap both behind small named functions so the analyzer’s main
reads as English.
Parse a Common Log Format access log and produce four summaries: the most-requested paths, the distribution of status codes, the top client IPs, and the count of 4xx/5xx error responses.
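The field numbers the script relies on (awk's $1, $7, and $9) follow from how a Common Log Format line splits on whitespace. As a rough illustration, written in Python rather than lash, the sketch below shows which token lands where. The sample line is the widely quoted example from the Apache documentation, and the helper name `clf_fields` is hypothetical.

```python
# Sketch only: mimics awk's default whitespace splitting on one log line.
# SAMPLE is an illustrative Common Log Format line, not real traffic.
SAMPLE = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326'


def clf_fields(line):
    """Return the client IP, request path, and status code of one line."""
    tokens = line.split()  # awk numbers fields from 1, Python from 0
    return {
        "ip": tokens[0],      # awk $1
        "path": tokens[6],    # awk $7
        "status": tokens[8],  # awk $9
    }


print(clf_fields(SAMPLE))
```

Whitespace splitting is why the bracketed timestamp occupies two tokens and the quoted request occupies three, which pushes the path to field 7 and the status to field 9.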
    lash --test analyze-log.lash    # the unittests only
    lash analyze-log.lash --help    # auto-generated help

The --test flag runs only the unittest { } blocks; the fn main body is skipped, so the tests don’t need a real log file.
Step 1: Decide what counts as an error
A lot of the script’s work comes back to one question: is this status code an error? That’s a small pure function we should pin down before the rest of the wiring.
    /// is_error: true for 4xx and 5xx codes, false otherwise
    unittest {
      is_error("404").must.equal(true)
      is_error("500").must.equal(true)
      is_error("200").must.equal(false)
      is_error("301").must.equal(false)
    }

    fn is_error(status) {
      return status.startsWith("4") || status.startsWith("5")
    }

Run lash --test analyze-log.lash; passes.
Step 2: Pretty-print one wordcount entry
The top-paths table prints rows of the shape <count> <value>, so factor that into a function and test it:

    /// format_entry renders a {word, count} object as "<count> <word>"
    unittest {
      let row = { word: "/api/users", count: 12045 }
      format_entry(row).must.equal("12045 /api/users")
    }

    fn format_entry(entry) {
      return "${entry["count"]} ${entry["word"]}"
    }

Tiny function, but pinning it down fixes the row format in one place. In the script below only the top-paths table calls it; the status-code and IP tables format their rows inline. If a future contributor swaps the order of count and word, the test catches it.
Step 3: Wire it into a main
The wiring is mostly chains over awk output and for loops over the top-N results. We don’t unittest main: it touches the filesystem and shells out to awk and wc. The pure helpers above already have tests.
    fn main(logfile: string, top: int = 10) {
      let check = `test -f $logfile`.capture
      if check.isFailure {
        exit "file not found: $logfile"
      }

      let total = `wc -l < $logfile`.first.trim()

      echo "Log Analysis: $logfile"
      echo "================================="
      echo "Total requests: $total"
      echo ""

      echo "Top $top requested paths:"
      let paths = `awk '{print $7}' $logfile`.wordcount().take(top)
      for entry in paths {
        echo " ${format_entry(entry)}"
      }
      echo ""

      echo "Status codes:"
      let statuses = `awk '{print $9}' $logfile`.wordcount()
      for entry in statuses {
        echo " ${entry["word"]}: ${entry["count"]}"
      }
      echo ""

      echo "Top $top IP addresses:"
      let ips = `awk '{print $1}' $logfile`.wordcount().take(top)
      for entry in ips {
        echo " ${entry["count"]} requests from ${entry["word"]}"
      }
      echo ""

      let errors = `awk '{print $9}' $logfile`.filter(x => is_error(x)).length
      echo "Error responses (4xx/5xx): $errors"
    }

.wordcount() counts occurrences of each unique line and returns a list of { word, count } objects sorted by frequency. .take(top) limits the result. The chain reads top-to-bottom: take awk’s output, group it, keep the most-frequent.
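For readers who think in Python, the grouping behavior described above can be approximated as follows. This is a sketch of the idea, not lash's implementation: the functions `wordcount` and `take` borrow the names of the lash methods but are defined here purely for illustration, and the ordering of entries with equal counts is an assumption.

```python
from collections import Counter


def wordcount(lines):
    """Count each unique line; return {word, count} dicts, most frequent first."""
    counts = Counter(lines)
    # most_common() orders by descending count; ties keep first-seen order.
    return [{"word": word, "count": count} for word, count in counts.most_common()]


def take(items, n):
    """Keep at most the first n items."""
    return items[:n]


paths = ["/a", "/b", "/a", "/c", "/a", "/b"]
for entry in take(wordcount(paths), 2):
    print(entry["count"], entry["word"])
```

With this input the loop prints the two most frequent paths, `/a` (3 occurrences) and `/b` (2 occurrences), and drops `/c`.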
The error-count line uses the function we tested in Step 1:
    let errors = `awk '{print $9}' $logfile`.filter(x => is_error(x)).length

.length on a list returns the element count. It’s not a method, so it takes no parentheses.
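The same filter-then-count step, sketched in Python for comparison. The names `is_error` and `count_errors` are illustrative stand-ins, and the prefix check mirrors the lash function from Step 1 rather than validating that the status is a three-digit number.

```python
def is_error(status):
    """True for status strings beginning with 4 or 5 (4xx/5xx)."""
    return status.startswith(("4", "5"))


def count_errors(statuses):
    """Count how many status strings are classified as errors."""
    return sum(1 for status in statuses if is_error(status))


print(count_errors(["200", "404", "500", "301", "503"]))
```

Because the predicate is a pure function, the counting step can be checked on a small in-memory list without touching a log file, which is the same reason the lash version tests is_error separately from main.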
Complete script
    #!/usr/bin/env lash

    /// Analyze a web server access log and print a summary.
    fn main(logfile: string, top: int = 10) {
      let check = `test -f $logfile`.capture
      if check.isFailure {
        exit "file not found: $logfile"
      }

      let total = `wc -l < $logfile`.first.trim()

      echo "Log Analysis: $logfile"
      echo "================================="
      echo "Total requests: $total"
      echo ""

      echo "Top $top requested paths:"
      let paths = `awk '{print $7}' $logfile`.wordcount().take(top)
      for entry in paths {
        echo " ${format_entry(entry)}"
      }
      echo ""

      echo "Status codes:"
      let statuses = `awk '{print $9}' $logfile`.wordcount()
      for entry in statuses {
        echo " ${entry["word"]}: ${entry["count"]}"
      }
      echo ""

      echo "Top $top IP addresses:"
      let ips = `awk '{print $1}' $logfile`.wordcount().take(top)
      for entry in ips {
        echo " ${entry["count"]} requests from ${entry["word"]}"
      }
      echo ""

      let errors = `awk '{print $9}' $logfile`.filter(x => is_error(x)).length
      echo "Error responses (4xx/5xx): $errors"
    }

    fn is_error(status) {
      return status.startsWith("4") || status.startsWith("5")
    }

    fn format_entry(entry) {
      return "${entry["count"]} ${entry["word"]}"
    }

    /// is_error: true for 4xx and 5xx codes, false otherwise
    unittest {
      is_error("404").must.equal(true)
      is_error("500").must.equal(true)
      is_error("200").must.equal(false)
      is_error("301").must.equal(false)
    }

    /// format_entry renders a {word, count} object as "<count> <word>"
    unittest {
      format_entry({ word: "/api/users", count: 12045 }).must.equal("12045 /api/users")
    }

Run it:
    lash --test analyze-log.lash                          # unittests
    lash analyze-log.lash --help                          # help
    lash analyze-log.lash /var/log/nginx/access.log       # default top 10
    lash analyze-log.lash /var/log/nginx/access.log 20    # top 20

    Log Analysis: /var/log/nginx/access.log
    =================================
    Total requests: 48231

    Top 10 requested paths:
     12045 /api/v1/users
     8923 /
     4521 /static/app.js
     3102 /api/v1/health
     ...

    Status codes:
     200: 41023
     304: 3891
     404: 2105
     500: 212

    Top 10 IP addresses:
     3421 requests from 10.0.0.1
     2918 requests from 10.0.0.5
     ...

    Error responses (4xx/5xx): 2317

The doc-comment plus typed parameters become the --help:
    analyze-log.lash — Analyze a web server access log and print a summary.

    Arguments:
      logfile (string)
      top (int) default: 10

What makes this readable
This script replaces what would otherwise be a stack of shell pipelines:

    awk '{print $7}' access.log | sort | uniq -c | sort -rn | head -10

For each summary, lash’s chain:

    `awk '{print $7}' $logfile`.wordcount().take(top)

is the same idea, but every step is a method, every method has a name, and the pure transforms (is_error, format_entry) get tested independently. Long chains can be split across lines for readability. A line starting with . continues the previous statement:

    let paths = `awk '{print $7}' $logfile`
      .wordcount()
      .take(top)