Kafka Compacted Topic Dumps

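A compacted Kafka topic logically retains only the latest value for each key. A file dump of the topic can still contain older values for the same key earlier in the file, because compaction runs asynchronously and may not have removed them yet: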

user-a|{"id":"a","version":1,"name":"Ada"}
user-b|{"id":"b","version":1,"name":"Bob"}
user-a|{"id":"a","version":2,"name":"Ada Lovelace"}
user-c|null

Here user-c|null is a tombstone: a null payload that marks the key user-c as deleted. With jetrocli, --payload-after '|' queries only the JSON payload after the separator, and tombstones are skipped by default:

jetrocli --ndjson -i users.topic --payload-after '|' -e '$.id'
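
To make the line format concrete, here is a minimal Python sketch of the same idea: split each line at the first separator, decode the payload, and drop tombstones. It illustrates the semantics only and is not how jetrocli is implemented; the helper names parse_dump_line and live_payloads are hypothetical.

```python
import json

def parse_dump_line(line, sep="|"):
    """Split one dump line into (key, payload); payload is None for a tombstone."""
    # partition() splits at the FIRST separator, so a '|' inside the JSON is safe.
    key, _, raw = line.rstrip("\n").partition(sep)
    payload = json.loads(raw)  # the JSON literal null decodes to None
    return key, payload

def live_payloads(lines, sep="|"):
    """Yield only non-tombstone payloads, in file order."""
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        _, payload = parse_dump_line(line, sep)
        if payload is not None:
            yield payload

sample = [
    'user-a|{"id":"a","version":1,"name":"Ada"}',
    'user-b|{"id":"b","version":1,"name":"Bob"}',
    'user-a|{"id":"a","version":2,"name":"Ada Lovelace"}',
    'user-c|null',
]
ids = [p["id"] for p in live_payloads(sample)]
print(ids)  # ['a', 'b', 'a']
```

Note that the older user-a row is still present: skipping tombstones does not de-duplicate keys.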

Latest N Unique Keys

Scan from the tail, keep the first row seen for each logical id, then project only the retained rows:

jetrocli --ndjson -i users.topic --payload-after '|' \
  -e '$.rows()
    .reverse()
    .distinct_by($.id)
    .take(100)
    .map({id: $.id, version: $.version, name: $.name})'

Why this works:

$.rows() switches from row-local mode to one stream over the file.
reverse() starts at the newest records.
distinct_by($.id) keeps the first row per key in that reverse order.
take(100) stops after 100 retained unique keys.
map(...) shapes only the rows that survived selection.
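
The steps above can be sketched in plain Python to show the order of operations. This is an illustration under the assumption that rows are already decoded dictionaries held in a list; latest_unique is a hypothetical helper, not part of jetrocli.

```python
from itertools import islice

def latest_unique(rows, key, limit):
    """Return up to `limit` rows, newest first, keeping one row per key."""
    seen = set()

    def newest_first():
        for row in reversed(rows):   # reverse(): start at the tail
            k = key(row)
            if k in seen:            # distinct_by: an older duplicate, skip it
                continue
            seen.add(k)
            yield row

    # take(limit): islice stops pulling rows once `limit` are retained
    return list(islice(newest_first(), limit))

rows = [
    {"id": "a", "version": 1, "name": "Ada"},
    {"id": "b", "version": 1, "name": "Bob"},
    {"id": "a", "version": 2, "name": "Ada Lovelace"},
]
latest = latest_unique(rows, key=lambda r: r["id"], limit=100)

# map(...): project only the rows that survived selection
projected = [{"id": r["id"], "version": r["version"], "name": r["name"]} for r in latest]
print(projected)  # newest first: "a" at version 2, then "b" at version 1
```

Projecting last means the shaping work is done only for the retained rows, not for every row in the file.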

Find One Recent Record

jetrocli --ndjson -i users.topic --payload-after '|' \
  -e '$.rows().reverse().find($.id == "user-42").first()'

Because the scan starts at the tail, the query can stop as soon as the newest matching record is found.
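
The early stop can be illustrated with a short Python sketch. It assumes rows are decoded dictionaries in file order; find_latest and the inspected list are hypothetical names used only to show how many rows get examined.

```python
def find_latest(rows, predicate):
    """Return the newest row matching `predicate`, or None if there is none."""
    # next() pulls from the generator only until the first match.
    return next((row for row in reversed(rows) if predicate(row)), None)

rows = [
    {"id": "user-42", "version": 1},
    {"id": "user-7", "version": 1},
    {"id": "user-42", "version": 2},
    {"id": "user-9", "version": 1},
]

inspected = []  # records which rows the predicate actually looked at

def is_target(row):
    inspected.append(row)
    return row["id"] == "user-42"

hit = find_latest(rows, is_target)
print(hit)             # {'id': 'user-42', 'version': 2}
print(len(inspected))  # 2: the scan stopped before the two older rows
```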

Keep Only Active Latest Records

Filter before de-duplication when the key should be unique among active rows. Each id then resolves to its newest active row, even when a newer inactive row exists for that id:

jetrocli --ndjson -i users.topic --payload-after '|' \
  -e '$.rows()
    .reverse()
    .filter($.active)
    .distinct_by($.id)
    .take(500)
    .map({id: $.id, email: $.email})'
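
The order of filtering and de-duplication changes the result. The Python sketch below compares the two orders on made-up sample rows; newest_per_id is a hypothetical helper and the email addresses are illustrative.

```python
def newest_per_id(rows):
    """Scan from the tail and keep the first row seen for each id."""
    seen, out = set(), []
    for row in reversed(rows):
        if row["id"] not in seen:
            seen.add(row["id"])
            out.append(row)
    return out

rows = [
    {"id": "a", "active": True, "email": "a@old.example"},
    {"id": "b", "active": True, "email": "b@example.com"},
    {"id": "a", "active": False, "email": "a@new.example"},  # newest "a" is inactive
]

# Filter first: "a" falls back to its newest ACTIVE row.
filter_first = newest_per_id([r for r in rows if r["active"]])

# De-duplicate first: "a" disappears because its newest row is inactive.
dedupe_first = [r for r in newest_per_id(rows) if r["active"]]

print([r["id"] for r in filter_first])  # ['b', 'a']
print([r["id"] for r in dedupe_first])  # ['b']
```

Pick the order that matches the question being asked: "latest active version of each id" or "ids whose latest version is active".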

If tombstones carry important delete semantics for your workload, use --null-payload keep and handle null explicitly. The default skip policy is best when you only want live JSON payloads.
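
As an illustration of what handling null explicitly can mean, the Python sketch below replays a dump in file order and treats each tombstone as a delete for its key. It works on the raw key|payload lines rather than on jetrocli output, whose exact shape with --null-payload keep is not assumed here; live_state is a hypothetical helper and the user-c row is made-up sample data.

```python
import json

def live_state(lines, sep="|"):
    """Replay a dump in order; a null payload deletes the key, anything else upserts it."""
    state = {}
    for line in lines:
        if not line.strip():
            continue
        key, _, raw = line.rstrip("\n").partition(sep)
        payload = json.loads(raw)
        if payload is None:
            state.pop(key, None)   # tombstone: forget the key if present
        else:
            state[key] = payload   # later rows overwrite earlier ones
    return state

dump = [
    'user-a|{"id":"a","version":1,"name":"Ada"}',
    'user-c|{"id":"c","version":1,"name":"Cy"}',
    'user-a|{"id":"a","version":2,"name":"Ada Lovelace"}',
    'user-c|null',
]
state = live_state(dump)
print(sorted(state))               # ['user-a']
print(state["user-a"]["version"])  # 2
```

With the skip policy, the earlier user-c row would still appear as a live record; honoring the tombstone removes it.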
