SEE: Schema-Aware Compression with Searchable JSON

kodomonocch1 14 hours ago

SEE (Semantic Entropy Encoding) is a schema-aware compression format for JSON. The data stays searchable while compressed, which cuts I/O and CPU cost. Benchmarks: compressed size is ~19.5% of raw size, and median (p50) lookup latency is ≈ 0.18 ms.
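
For readers new to the idea, here is a toy Python sketch of what "schema-aware" and "searchable while compressed" can mean in general. It is not SEE's actual format or API: the function names (`build_schema`, `pack`, `lookup`), the use of zlib, and the fixed block size are all assumptions made for illustration. Keys are replaced by small integer IDs from a shared schema, and records are compressed in independent blocks so a lookup only decompresses the block it needs.

```python
import json
import zlib


def build_schema(records):
    """Map each distinct key to a small integer ID (a toy 'schema')."""
    keys = sorted({k for r in records for k in r})
    return {k: i for i, k in enumerate(keys)}


def pack(records, schema, block_size=2):
    """Replace keys with schema IDs, then compress each block independently."""
    blocks = []
    for start in range(0, len(records), block_size):
        chunk = records[start:start + block_size]
        rows = [{schema[k]: v for k, v in r.items()} for r in chunk]
        payload = json.dumps(rows, separators=(",", ":")).encode("utf-8")
        blocks.append(zlib.compress(payload))
    return blocks


def lookup(blocks, schema, index, field, block_size=2):
    """Fetch one field of record `index`, decompressing only its block."""
    block = blocks[index // block_size]
    rows = json.loads(zlib.decompress(block))
    # JSON object keys are strings, so the integer ID comes back as a string.
    return rows[index % block_size][str(schema[field])]


records = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 3, "name": "c"},
]
schema = build_schema(records)
blocks = pack(records, schema)
print(lookup(blocks, schema, 2, "name"))
```

The block size is the main trade-off in a layout like this: larger blocks usually compress better, while smaller blocks mean less data to decompress per lookup.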

Article: [Medium link]
Slides: [SpeakerDeck link]
GitHub: [see_proto repo link]

Curious about your thoughts, especially from those using Zstd or Parquet in production. What would be your biggest blocker to adopting schema-aware compression?