Tag: quantization
All the articles with the tag "quantization".
- Large Language Model Formats and Quantization
GGUF, GGML, AWQ, GPTQ — LLM file formats and quantization levels explained: trade-offs between model quality, size, and inference speed.