QATQ

v0.1.1 released on crates.io

Lossless LLM memory compression for state in motion.

QATQ compresses exported KV caches and tensor migration blocks into portable QATC artefacts and restores them bit-for-bit. In the documented public-fixture and live-migration measurements, it produced smaller transfer sizes than raw, zstd, and lz4.

Restore: bit-for-bit
Live proof: 72.2% smaller than raw
Crate: 0.1.1
Licence: Apache
qatq encode
# exported KV tensor bytes in, lossless QATC out
qatq encode-chunked \
  --mode qatq-exact \
  --dtype bf16 \
  --max-values-per-chunk 65536 \
  kv-cache.bf16le kv-cache.qatc

# restore the same native bytes for runtime import
qatq decode kv-cache.qatc restored.bf16le

QATC v2 checksum validated

Corrupt or truncated payloads are rejected before restore.
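
The reject-before-restore behaviour can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the frame layout (length, checksum, payload), the FNV-1a stand-in checksum, and the names `seal` and `open` are hypothetical and are not the QATC v2 format or the crate's API.

```rust
/// FNV-1a 64-bit hash, used here only as a stand-in checksum (assumption).
fn fnv1a64(data: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for &b in data {
        hash ^= b as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// Read a little-endian u64 from the first 8 bytes of `bytes`.
fn read_u64_le(bytes: &[u8]) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&bytes[..8]);
    u64::from_le_bytes(buf)
}

/// Hypothetical frame: [payload_len: u64 LE][checksum: u64 LE][payload].
fn seal(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(16 + payload.len());
    out.extend_from_slice(&(payload.len() as u64).to_le_bytes());
    out.extend_from_slice(&fnv1a64(payload).to_le_bytes());
    out.extend_from_slice(payload);
    out
}

#[derive(Debug, PartialEq)]
enum OpenError {
    BadLength,
    ChecksumMismatch,
}

/// Validate length and checksum before handing any payload bytes back.
fn open(frame: &[u8]) -> Result<&[u8], OpenError> {
    if frame.len() < 16 {
        return Err(OpenError::BadLength);
    }
    let declared_len = read_u64_le(&frame[0..8]);
    let expected = read_u64_le(&frame[8..16]);
    let payload = &frame[16..];
    if payload.len() as u64 != declared_len {
        return Err(OpenError::BadLength);
    }
    if fnv1a64(payload) != expected {
        return Err(OpenError::ChecksumMismatch);
    }
    Ok(payload)
}
```

The design point is ordering: the decoder checks the declared length and the checksum first, so a truncated or bit-flipped payload returns an error instead of partially restored tensor bytes.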

Quaternion-chain path

Selected only when it beats simpler lossless transforms.

Measured evidence

A simple comparison table, with the claim scoped correctly.

The strongest production-facing result today is lossless exported-state compression. In the documented cloud live-migration proof, QATQ transferred fewer bytes than raw blocks, lz4, and zstd while preserving measured task continuation.

Live migration compression comparison between raw, lz4, zstd and QATQ

| Codec | Transferred bytes | Ratio vs raw | Reduction | Bit-for-bit restore |
| --- | --- | --- | --- | --- |
| Raw streamed blocks | 50,331,648 | 1.0000 | baseline | yes |
| LZ4 baseline | 28,739,217 | 0.5709 | 43.0% smaller | yes |
| Zstd baseline | 20,405,381 | 0.4054 | 59.5% smaller | yes |
| QATQ | 14,004,990 | 0.2783 | 72.2% smaller | yes |

Source: QATQ external runtime evidence, 2026-06-22 cloud live-migration proof. The result covers the measured integration path only; it is not a universal claim across every model, runtime, context length, dtype, or chunk layout.
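
For readers checking the arithmetic, the ratio and reduction columns follow directly from the transferred byte counts. The helper names below are hypothetical and exist only for this sketch.

```rust
/// Transferred size divided by raw size (lower is better).
fn ratio_vs_raw(transferred: u64, raw: u64) -> f64 {
    transferred as f64 / raw as f64
}

/// Percentage of raw bytes saved by the codec.
fn reduction_percent(transferred: u64, raw: u64) -> f64 {
    (1.0 - ratio_vs_raw(transferred, raw)) * 100.0
}
```

Applied to the QATQ row, `ratio_vs_raw(14_004_990, 50_331_648)` formats to 0.2783 at four decimals and `reduction_percent` to 72.2 at one decimal.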

How it works

Tensor-aware strategy search, wrapped in a portable QATC container.

QATQ is not a transparent GPU memory layer yet. It is a production-shaped codec for exported tensor bytes: f32, f16, and bf16 data that can be chunked, compressed, transferred, checked, and restored bit-for-bit.

Exported tensors

KV caches, migration blocks, and fixture captures leave the runtime as typed little-endian bytes.
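
As a rough illustration of "typed little-endian bytes", the sketch below serialises bf16 values two bytes at a time, low byte first. The function names are hypothetical, and the f32-to-bf16 step uses plain truncation as a simplifying assumption; a real runtime may round instead.

```rust
/// Truncate an f32 to its bf16 bit pattern (upper 16 bits, no rounding).
/// Truncation is an assumption made for this sketch.
fn f32_to_bf16_bits(x: f32) -> u16 {
    (x.to_bits() >> 16) as u16
}

/// Serialise bf16 bit patterns as little-endian bytes, two per value.
fn bf16_to_le_bytes(values: &[u16]) -> Vec<u8> {
    values.iter().flat_map(|v| v.to_le_bytes()).collect()
}

/// Parse little-endian bytes back into bf16 bit patterns.
/// Returns None when the byte count is not a multiple of two.
fn le_bytes_to_bf16(bytes: &[u8]) -> Option<Vec<u16>> {
    if bytes.len() % 2 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(2)
            .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
            .collect(),
    )
}
```

For example, 1.0 has the bf16 bit pattern 0x3F80, which is written to the file as the bytes 0x80 then 0x3F.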

Lossless candidates

QATQ chooses among raw, byte-plane, zstd-backed, delta-XOR, and reversible quaternion-chain candidates.
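
A minimal sketch of the idea, under stated assumptions: it implements two reversible transforms (byte-plane split and delta-XOR) for 2-byte values and picks the cheapest candidate. The run-length cost model is a stand-in chosen so the example needs only the standard library; it is not the zstd-backed scoring QATQ uses. The quaternion-chain transform is not shown because this page does not describe it, and all function names are hypothetical.

```rust
/// Group the low bytes of 2-byte little-endian values, then the high bytes.
fn byte_plane_split(bytes: &[u8]) -> Vec<u8> {
    let lows = bytes.iter().step_by(2);
    let highs = bytes.iter().skip(1).step_by(2);
    lows.chain(highs).copied().collect()
}

/// Inverse of `byte_plane_split` for even-length input.
fn byte_plane_join(planes: &[u8]) -> Vec<u8> {
    let (lows, highs) = planes.split_at(planes.len() / 2);
    lows.iter().zip(highs).flat_map(|(&lo, &hi)| [lo, hi]).collect()
}

/// XOR each byte with its predecessor (the first byte is XORed with 0).
fn delta_xor(bytes: &[u8]) -> Vec<u8> {
    let mut prev = 0u8;
    bytes
        .iter()
        .map(|&b| {
            let d = b ^ prev;
            prev = b;
            d
        })
        .collect()
}

/// Inverse of `delta_xor`: a running XOR rebuilds the original bytes.
fn delta_xor_inverse(deltas: &[u8]) -> Vec<u8> {
    let mut prev = 0u8;
    deltas
        .iter()
        .map(|&d| {
            prev ^= d;
            prev
        })
        .collect()
}

/// Stand-in size estimate (assumption): two bytes per run of equal bytes.
fn rle_cost(bytes: &[u8]) -> usize {
    let mut runs = 0;
    let mut prev: Option<u8> = None;
    for &b in bytes {
        if prev != Some(b) {
            runs += 1;
            prev = Some(b);
        }
    }
    runs * 2
}

/// Return the cheapest candidate name and its estimated size.
fn pick_strategy(bytes: &[u8]) -> (&'static str, usize) {
    let candidates = [
        ("raw", bytes.len()),
        ("byte-plane", rle_cost(&byte_plane_split(bytes))),
        ("delta-xor", rle_cost(&delta_xor(bytes))),
    ];
    // `min_by_key` returns the first minimum, so raw wins any tie.
    candidates.iter().copied().min_by_key(|&(_, cost)| cost).unwrap()
}
```

Listing raw first means a transform is kept only when it is strictly cheaper, which mirrors the rule that a more elaborate path is selected only when it beats the simpler ones.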

QATC transport

Large tensors are split into bounded chunks with ordered lengths and an aggregate checksum.
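
The chunking step might look roughly like the sketch below. The struct layout, the function names, and the FNV-1a checksum are assumptions for illustration and do not describe the actual QATC container.

```rust
/// FNV-1a 64-bit hash, a stand-in for the real aggregate checksum (assumption).
fn fnv1a64(data: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for &b in data {
        hash ^= b as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// Hypothetical in-memory view of a chunked stream.
struct Chunked {
    lengths: Vec<usize>,  // chunk lengths, in transfer order
    chunks: Vec<Vec<u8>>, // chunk payloads, in the same order
    checksum: u64,        // checksum over the whole original byte stream
}

/// Split tensor bytes into chunks of at most `max_values_per_chunk` values.
fn split(bytes: &[u8], max_values_per_chunk: usize, bytes_per_value: usize) -> Chunked {
    let max_bytes = max_values_per_chunk * bytes_per_value;
    assert!(max_bytes > 0, "chunk size must be non-zero");
    let chunks: Vec<Vec<u8>> = bytes.chunks(max_bytes).map(|c| c.to_vec()).collect();
    Chunked {
        lengths: chunks.iter().map(|c| c.len()).collect(),
        chunks,
        checksum: fnv1a64(bytes),
    }
}

/// Reassemble in order; reject length mismatches and checksum failures.
fn reassemble(c: &Chunked) -> Option<Vec<u8>> {
    if c.lengths.len() != c.chunks.len() {
        return None;
    }
    let mut out = Vec::new();
    for (chunk, &len) in c.chunks.iter().zip(&c.lengths) {
        if chunk.len() != len {
            return None;
        }
        out.extend_from_slice(chunk);
    }
    if fnv1a64(&out) != c.checksum {
        return None;
    }
    Some(out)
}
```

With bf16 data, a limit of 65536 values per chunk corresponds to chunks of at most 131072 bytes; only the final chunk may be shorter.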

Bit-identical restore

Decode returns the same native tensor bytes for runtime import or evidence comparison.
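
A short sketch of why the check is made on bytes rather than on float values: NaN never compares equal to itself, and 0.0 compares equal to -0.0, so a value comparison can both fail on an exact restore and pass on an inexact one. The helper names are hypothetical.

```rust
/// Value comparison: fails on any NaN, and treats 0.0 and -0.0 as equal.
fn value_equal(a: &[f32], b: &[f32]) -> bool {
    a == b
}

/// Bit comparison: true only when every value has the same bit pattern.
fn bit_identical(a: &[f32], b: &[f32]) -> bool {
    a.len() == b.len() && a.iter().zip(b).all(|(x, y)| x.to_bits() == y.to_bits())
}

/// Offset of the first differing byte, or None when the buffers are identical.
/// A length difference is reported at the end of the shorter buffer.
fn first_mismatch(original: &[u8], restored: &[u8]) -> Option<usize> {
    if let Some(i) = original.iter().zip(restored).position(|(a, b)| a != b) {
        return Some(i);
    }
    if original.len() != restored.len() {
        return Some(original.len().min(restored.len()));
    }
    None
}
```

Running `first_mismatch` over the original export and the decoded file gives a simple evidence check: `None` means the restore was bit-for-bit.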

[Diagram: exported tensors flow into QATQ strategy search, then QATC transport, then bit-identical restore]

Public fixtures

Small table, reproducible evidence.

Public fixture rows are generated inside the QATQ repository so anyone can reproduce the comparison without private runtime captures.

Public fixture compression ratios for QATQ, zstd and lz4

| Dataset | Winning QATQ strategy | QATQ | zstd | lz4 | Restore |
| --- | --- | --- | --- | --- | --- |
| bf16-kv-ramp | byte-plane-zstd | 0.3817 | 0.4665 | 0.6901 | yes |
| bf16-kv-wave | quaternion-chain-zstd | 0.1153 | 0.2900 | 0.4693 | yes |
| f32 noisy fixture | byte-plane-zstd | 0.6532 | 0.9061 | 1.0040 | yes |
| NaN/Inf stress | quaternion-chain-zstd | 0.0121 | 0.0413 | 0.0673 | yes |

Lossless first

Lossless claims apply to QATQ and QATC. Lossy research paths remain comparators, not the product claim.

Fast enough to matter

The current target is storage and transfer of exported KV tensors, with throughput gates and fuzzing in CI.

VRAM reduction later

Live GPU VRAM reduction is a roadmap goal that needs runtime KV paging and latency proof before it becomes a claim.

Build against the crate. Read the evidence. Then try your own tensors.

QATQ is open source, Rust-native, and ready for exported KV-cache storage and transfer experiments before larger production integrations broaden the evidence base.