Kafka Compacted Topic Dumps
Kafka compacted topics keep the latest value for each key logically. A file dump can still contain older values earlier in the file:
user-a|{"id":"a","version":1,"name":"Ada"}
user-b|{"id":"b","version":1,"name":"Bob"}
user-a|{"id":"a","version":2,"name":"Ada Lovelace"}
user-c|null
Here user-c|null is a tombstone. With jetrocli, query only the JSON
payload after the separator and skip tombstones:
jetrocli --ndjson -i users.topic --payload-after '|' -e '$.id'
Latest N Unique Keys
Scan from the tail, keep the first row seen for each logical id, then project only the retained rows:
jetrocli --ndjson -i users.topic --payload-after '|' \
-e '$.rows()
.reverse()
.distinct_by($.id)
.take(100)
.map({id: $.id, version: $.version, name: $.name})'
Why this works:
$.rows()switches from row-local mode to one stream over the file.reverse()starts at the newest records.distinct_by($.id)keeps the first row per key in that reverse order.take(100)stops after 100 retained unique keys.map(...)shapes only the rows that survived selection.
Find One Recent Record
jetrocli --ndjson -i users.topic --payload-after '|' \
-e '$.rows().reverse().find($.id == "user-42").first()'
This can stop as soon as the newest matching record is found.
Keep Only Active Latest Records
Filter before de-duplication when the key should be unique among active rows:
jetrocli --ndjson -i users.topic --payload-after '|' \
-e '$.rows()
.reverse()
.filter($.active)
.distinct_by($.id)
.take(500)
.map({id: $.id, email: $.email})'
If tombstones carry important delete semantics for your workload, use
--null-payload keep and handle null explicitly. The default skip policy is
best when you only want live JSON payloads.