ts_zip: Text Compression using Large Language Models
SMRTR summary
The ts_zip utility uses a large language model to compress text files to smaller sizes than conventional tools achieve. It requires a GPU and 4 GB of RAM, and processes up to 1 MB/s on an RTX 4090. It supports various languages and source code, but works best on English text.
Compression is measured in bits per byte (bpb): the number of compressed bits needed per byte of original input, where lower is better. For instance, ts_zip compressed the 100 MB "enwik8" file to 13.8 MB (1.106 bpb), compared with 24.9 MB (1.989 bpb) for xz. It employs the RWKV 169M v4 language model, quantized to
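The bits-per-byte figures quoted here can be checked with a little arithmetic. The sketch below is an illustration only and is not part of ts_zip; the function name `bits_per_byte` is hypothetical. It assumes that enwik8 is exactly 100,000,000 bytes and uses the rounded compressed sizes from this summary, so the results differ from the quoted values in the third decimal place.

```python
def bits_per_byte(compressed_bytes: float, original_bytes: float) -> float:
    """Return compressed size in bits divided by original size in bytes."""
    if original_bytes <= 0:
        raise ValueError("original_bytes must be positive")
    return 8 * compressed_bytes / original_bytes


# Assumption: enwik8 is 100,000,000 bytes; compressed sizes are the
# rounded figures from the summary, not exact byte counts.
ORIGINAL_BYTES = 100_000_000
ts_zip_bpb = bits_per_byte(13_800_000, ORIGINAL_BYTES)
xz_bpb = bits_per_byte(24_900_000, ORIGINAL_BYTES)

print(f"ts_zip: {ts_zip_bpb:.3f} bpb")
print(f"xz:     {xz_bpb:.3f} bpb")
```

With these rounded inputs the script gives roughly 1.104 bpb for ts_zip and 1.992 bpb for xz, close to the quoted 1.106 and 1.989. The remaining gap comes from rounding the compressed sizes to one decimal place.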
SMRTR provides this summary for quick context. The original article belongs to Daily.dev.