v0.1.1 released on crates.io

Lossless LLM memory compression for state in motion.

QATQ compresses exported KV caches and tensor migration blocks into portable QATC artefacts and restores them bit-for-bit. In the documented public-fixture and live-migration evidence, it produced smaller transfers than raw, zstd, and lz4.

Install with Cargo Read the docs

Restore: bit-for-bit
Live proof: 72.2% smaller than raw
Crate: 0.1.1
Licence: Apache

qatq encode

# exported KV tensor bytes in, lossless QATC out
qatq encode-chunked \
  --mode qatq-exact \
  --dtype bf16 \
  --max-values-per-chunk 65536 \
  kv-cache.bf16le kv-cache.qatc

# restore the same native bytes for runtime import
qatq decode kv-cache.qatc restored.bf16le

QATC v2 checksum validated

Corrupt or truncated payloads are rejected before restore.

Quaternion-chain path

Selected only when it beats simpler lossless transforms.

Measured evidence

A simple comparison table, with the claim scoped correctly.

The strongest production-facing result today is lossless exported-state compression. In the documented cloud live-migration proof, QATQ transferred fewer bytes than raw blocks, lz4, and zstd while preserving measured task continuation.

Live migration compression comparison between raw, lz4, zstd and QATQ
Codec	Transferred bytes	Ratio vs raw	Reduction	Bit-for-bit restore
Raw streamed blocks	50,331,648	1.0000	baseline	yes
LZ4 baseline	28,739,217	0.5710	42.9% smaller	yes
Zstd baseline	20,405,381	0.4054	59.5% smaller	yes
QATQ	14,004,990	0.2783	72.2% smaller	yes

Source: QATQ external runtime evidence, 2026-06-22 cloud live-migration proof. The result covers the measured integration path only; it is not a universal claim across every model, runtime, context length, dtype, or chunk layout.
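The ratio and reduction columns follow directly from the transferred byte counts: ratio is transferred bytes divided by raw bytes, and reduction is one minus that ratio. A minimal sketch of the arithmetic, using the raw and QATQ rows from the table (the function name is illustrative, not part of QATQ):

```python
RAW_BYTES = 50_331_648  # "Raw streamed blocks" row from the table


def ratio_and_reduction(transferred: int, raw: int = RAW_BYTES) -> tuple[float, float]:
    """Return (ratio vs raw, percent reduction) for one codec row."""
    ratio = transferred / raw
    return ratio, (1.0 - ratio) * 100.0


ratio, reduction = ratio_and_reduction(14_004_990)  # QATQ row
print(f"QATQ ratio {ratio:.4f}, {reduction:.1f}% smaller")
# prints: QATQ ratio 0.2783, 72.2% smaller
```

The same function can be pointed at your own measurements to check that a reported percentage matches the byte counts behind it.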

How it works

Tensor-aware strategy search, wrapped in a portable QATC container.

QATQ is not a transparent GPU memory layer yet. It is a production-shaped codec for exported tensor bytes: f32, f16, and bf16 data that can be chunked, compressed, transferred, checked, and restored bit-for-bit.

Exported tensors

KV caches, migration blocks, and fixture captures leave the runtime as typed little-endian bytes.
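As a rough illustration of what "typed little-endian bytes" means for bf16, the sketch below keeps the top 16 bits of each float32 bit pattern and writes them low byte first. It assumes simple truncation for the f32-to-bf16 step (runtimes usually round), and the helper names are hypothetical; a real export would come from the runtime's own tensor API.

```python
import struct


def f32_to_bf16_le(value: float) -> bytes:
    """Keep the top 16 bits of the float32 pattern and emit them low byte first."""
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    return struct.pack("<H", bits >> 16)


def bf16_le_to_f32(raw: bytes) -> float:
    """Widen two little-endian bf16 bytes back to a Python float."""
    (half,) = struct.unpack("<H", raw)
    (value,) = struct.unpack("<f", struct.pack("<I", half << 16))
    return value


payload = b"".join(f32_to_bf16_le(v) for v in (1.0, -2.0, 0.5))
print(payload.hex(" "))
# prints: 80 3f 00 c0 00 3f
```

A codec for exported state only ever sees bytes like `payload`; it never needs to interpret them as numbers in order to restore them.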

Lossless candidates

QATQ chooses among raw, byte-plane, zstd-backed, delta-XOR, and reversible quaternion-chain candidates.
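The general idea of a lossless strategy search can be sketched in a few lines: apply each reversible transform, compress the result, confirm it round-trips, and keep the smallest output. The sketch below is not QATQ's implementation. It uses zlib from the Python standard library as a stand-in for zstd, omits the quaternion-chain candidate (its details are not described here), and assumes the input length is a multiple of the value width.

```python
import zlib


def byte_plane_split(data: bytes, width: int) -> bytes:
    """Group byte 0 of every value, then byte 1, and so on."""
    return b"".join(data[i::width] for i in range(width))


def byte_plane_join(planes: bytes, width: int) -> bytes:
    """Inverse of byte_plane_split (length must be a multiple of width)."""
    count = len(planes) // width
    out = bytearray(len(planes))
    for i in range(width):
        out[i::width] = planes[i * count:(i + 1) * count]
    return bytes(out)


def delta_xor(data: bytes, width: int) -> bytes:
    """XOR each value with the previous value; the first value is kept as is."""
    out = bytearray(data[:width])
    out.extend(a ^ b for a, b in zip(data[width:], data))
    return bytes(out)


def undo_delta_xor(delta: bytes, width: int) -> bytes:
    """Inverse of delta_xor, rebuilding values front to back."""
    out = bytearray(delta[:width])
    for i in range(width, len(delta)):
        out.append(delta[i] ^ out[i - width])
    return bytes(out)


# name -> (encode, decode); every pair must be exactly reversible.
CANDIDATES = {
    "raw": (lambda d, w: d, lambda d, w: d),
    "zlib": (lambda d, w: zlib.compress(d, 9), lambda d, w: zlib.decompress(d)),
    "byte-plane-zlib": (
        lambda d, w: zlib.compress(byte_plane_split(d, w), 9),
        lambda d, w: byte_plane_join(zlib.decompress(d), w),
    ),
    "delta-xor-zlib": (
        lambda d, w: zlib.compress(delta_xor(d, w), 9),
        lambda d, w: undo_delta_xor(zlib.decompress(d), w),
    ),
}


def choose(data: bytes, width: int) -> tuple[str, bytes]:
    """Return (strategy name, encoded bytes) for the smallest verified candidate."""
    best = None
    for name, (encode, decode) in CANDIDATES.items():
        encoded = encode(data, width)
        if decode(encoded, width) != data:
            continue  # never accept a candidate that fails to round-trip
        if best is None or len(encoded) < len(best[1]):
            best = (name, encoded)
    return best


# A synthetic ramp of 16-bit patterns standing in for bf16 KV data.
ramp = b"".join(v.to_bytes(2, "little") for v in range(0x3F00, 0x3F00 + 4096))
name, encoded = choose(ramp, width=2)
print(name, len(encoded), "of", len(ramp), "bytes")
```

Because "raw" is always a candidate, the search can never make a block larger than the input, which is the property that lets a more elaborate transform be selected only when it actually wins.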

QATC transport

Large tensors are split into bounded chunks with ordered lengths and an aggregate checksum.
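To make the chunk-table idea concrete, here is a toy container that splits a payload into bounded chunks, records their lengths in order, and stores one aggregate checksum. It is a sketch under stated assumptions and not the QATC format: the `DEMO` tag, the field layout, the use of SHA-256, zlib as the per-chunk compressor, and a limit expressed in bytes rather than values are all choices made for illustration.

```python
import hashlib
import struct
import zlib

MAGIC = b"DEMO"  # hypothetical tag, not the QATC magic
DIGEST_LEN = 32  # SHA-256 digest size in bytes


def pack_chunks(data: bytes, max_chunk_bytes: int) -> bytes:
    """Compress bounded chunks and prepend ordered lengths plus one checksum."""
    chunks = [zlib.compress(data[i:i + max_chunk_bytes])
              for i in range(0, len(data), max_chunk_bytes)]
    header = MAGIC + struct.pack("<I", len(chunks))
    header += b"".join(struct.pack("<I", len(c)) for c in chunks)
    digest = hashlib.sha256(data).digest()  # aggregate checksum of the original bytes
    return header + digest + b"".join(chunks)


def unpack_chunks(blob: bytes) -> bytes:
    """Validate structure and checksum, then return the original bytes."""
    if len(blob) < 8 or blob[:4] != MAGIC:
        raise ValueError("not a demo container")
    (count,) = struct.unpack_from("<I", blob, 4)
    table_end = 8 + 4 * count
    if len(blob) < table_end + DIGEST_LEN:
        raise ValueError("truncated header")
    lengths = struct.unpack_from(f"<{count}I", blob, 8)
    digest = blob[table_end:table_end + DIGEST_LEN]
    offset = table_end + DIGEST_LEN
    if offset + sum(lengths) != len(blob):
        raise ValueError("payload length does not match the chunk table")
    parts = []
    for length in lengths:
        try:
            parts.append(zlib.decompress(blob[offset:offset + length]))
        except zlib.error as exc:
            raise ValueError("corrupt chunk") from exc
        offset += length
    data = b"".join(parts)
    if hashlib.sha256(data).digest() != digest:
        raise ValueError("checksum mismatch")
    return data  # only reached when every check has passed


tensor_bytes = bytes(range(256)) * 100
blob = pack_chunks(tensor_bytes, max_chunk_bytes=4096)
print(unpack_chunks(blob) == tensor_bytes)
# prints: True
```

Checking the declared lengths against the actual payload size catches truncation before any chunk is decompressed, and the aggregate checksum catches corruption that individual chunks might not reveal.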

Bit-identical restore

Decode returns the same native tensor bytes for runtime import or evidence comparison.
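When checking a restore yourself, compare bytes rather than decoded floating-point values. NaN never compares equal to itself, so a value-level comparison can report a mismatch on data that is in fact bit-identical. A small sketch, where `restored` is simply a copy standing in for a decoder's output:

```python
import math
import struct

original = struct.pack("<3f", 1.5, math.inf, math.nan)
restored = bytes(original)  # stand-in for what a decoder would hand back


def bit_identical(a: bytes, b: bytes) -> bool:
    """Byte-level equality: the only comparison that matters for a lossless restore."""
    return a == b


def values_equal(a: bytes, b: bytes) -> bool:
    """Element-wise float equality on little-endian f32 data; misleading when NaN is present."""
    fa = struct.unpack(f"<{len(a) // 4}f", a)
    fb = struct.unpack(f"<{len(b) // 4}f", b)
    return len(fa) == len(fb) and all(x == y for x, y in zip(fa, fb))


print(bit_identical(original, restored))  # True
print(values_equal(original, restored))   # False, because NaN != NaN
```

For large files, hashing the original and restored bytes with the same digest gives the same guarantee without holding both in memory.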

Public fixtures

Small table, reproducible evidence.

Public fixture rows are generated inside the QATQ repository so anyone can reproduce the comparison without private runtime captures.

View source tables

Public fixture compression ratios for QATQ, zstd and lz4
Dataset	Winning QATQ strategy	QATQ	zstd	lz4	Restore
bf16-kv-ramp	byte-plane-zstd	0.3817	0.4665	0.6901	yes
bf16-kv-wave	quaternion-chain-zstd	0.1153	0.2900	0.4693	yes
f32 noisy fixture	byte-plane-zstd	0.6532	0.9061	1.0040	yes
NaN/Inf stress	quaternion-chain-zstd	0.0121	0.0413	0.0673	yes

Lossless first

Lossless claims apply to QATQ and QATC. Lossy research paths remain comparators, not the product claim.

Fast enough to matter

The current target is storage and transfer of exported KV tensors, with throughput gates and fuzzing in CI.

VRAM reduction later

Live GPU VRAM reduction is a roadmap goal that needs runtime KV paging and latency proof before it becomes a claim.

Build against the crate. Read the evidence. Then try your own tensors.

QATQ is open source, Rust-native, and ready for experiments in exported KV-cache storage and transfer. Larger production integrations will broaden the evidence base.

GitHub Whitepaper