Lossy Compression of LLM Weights in Safetensors Format
DOI: https://doi.org/10.58445/rars.3284

Keywords: lossy compression, pysz, deepseek, huggingface, gpt, gemma, compression, safetensors, artificial intelligence, machine learning, large language model

Abstract
Large language models (LLMs) such as DeepSeek and Google’s Gemma require hundreds of gigabytes of storage when shared via Hugging Face. These models are typically split into many safetensors files (a secure, fast tensor storage format), each often 4-5 GB in size. Downloading or distributing such massive models is time-consuming. Common solutions such as 8-bit or 4-bit quantization reduce size by lowering precision, but they may still yield sizable files or require retraining. In contrast, error-bounded scientific compressors such as SZ can achieve much higher compression ratios while keeping the reconstruction error within a user-specified bound. This work applies the Python SZ implementation (PySZ) to a DeepSeek-V3 safetensors shard. A compression ratio of approximately 13.33× was obtained (from ~4.4 GB to ~0.33 GB), and the tensor file was successfully reconstructed for potential model use. The methodology, compression statistics, and implications for model fidelity relative to quantization are described.
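The core guarantee the abstract relies on is the error bound: an error-bounded compressor promises that every reconstructed value differs from the original by at most a user-chosen tolerance. The sketch below is not the PySZ API and not the actual SZ algorithm; it is a minimal, self-contained illustration of that guarantee, using uniform scalar quantization to bins of width 2·eb followed by zlib entropy coding, applied to a toy weight tensor.

```python
# Illustrative sketch of the error-bounded idea behind SZ-style compressors.
# NOTE: this is NOT PySZ; it only demonstrates the |x - x_hat| <= eb guarantee.
import zlib
import numpy as np

def compress_error_bounded(weights: np.ndarray, eb: float) -> bytes:
    """Quantize values to integer bins of width 2*eb, then deflate the codes."""
    codes = np.round(weights / (2.0 * eb)).astype(np.int32)
    return zlib.compress(codes.tobytes(), level=9)

def decompress_error_bounded(blob: bytes, shape, eb: float) -> np.ndarray:
    """Inverse: inflate the codes and map each bin index back to its center."""
    codes = np.frombuffer(zlib.decompress(blob), dtype=np.int32).reshape(shape)
    return codes.astype(np.float32) * (2.0 * eb)

rng = np.random.default_rng(0)
w = rng.normal(scale=0.02, size=(1024, 1024)).astype(np.float32)  # toy "weights"
eb = 1e-3                                                         # absolute error bound
blob = compress_error_bounded(w, eb)
w_hat = decompress_error_bounded(blob, w.shape, eb)
ratio = w.nbytes / len(blob)  # raw bytes vs. compressed bytes
```

Because rounding to the nearest bin center can move a value by at most half the bin width, the reconstruction error never exceeds eb (up to float32 rounding). Real SZ adds prediction stages before quantization, which is why it reaches far higher ratios on structured data than this toy scheme.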
References
Hugging Face. safetensors — Hugging Face Documentation. Hugging Face, n.d. Web.
DeepSeek AI. DeepSeek-V3-0324 — Model Files. Hugging Face Model Hub, 2024. Web.
Google. google/gemma-3-27b-it — Model Files. Hugging Face Model Hub, 2024. Web.
PySZ. pysz — Python Package Index (PyPI). Python Software Foundation, 2024. Web.
Lim, Seung Moo, and Seunghyeon W. Jin. “Neural Network Compression Using Error-Bounded Lossy Compression Techniques.” Electronics, vol. 11, no. 6, 2022, article 858. Web.
Hugging Face. Quantization and bitsandbytes — Transformers Documentation. Hugging Face, 2024. Web.
ZipNN. ZipNN: Lossless Compression for AI Models (GitHub). ZipNN Project, 2024. Web.
License
Copyright (c) 2025 Aaron Pinto

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.