<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>Aniss Djellal</title><description>I primarily work on LLM fine-tuning and distillation, generating synthetic data, and creating agents.</description><link>https://djellalmohamedaniss.github.io/</link><item><title>Trading MatMuls for SRAM Lookups: A 3-Bit Edge Architecture</title><link>https://djellalmohamedaniss.github.io/posts/data-free-3bit-quantization/</link><guid isPermaLink="true">https://djellalmohamedaniss.github.io/posts/data-free-3bit-quantization/</guid><description>By trading heavy FP16 MatMuls for SRAM lookups and 1-bit additions, our custom quantization pipeline squeezes state-of-the-art models down to approximately 3 bits per weight with minimal accuracy loss. Here is how bypassing Tensor Cores could reshape the design of future edge AI chips.</description><pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate></item><item><title>An approach to calibrating LLM reasoning effort</title><link>https://djellalmohamedaniss.github.io/posts/calibrating-llm-reasoning/</link><guid isPermaLink="true">https://djellalmohamedaniss.github.io/posts/calibrating-llm-reasoning/</guid><description>This post discusses controlling reasoning effort in LLMs (gpt-oss style) and calibrating LLM reasoning effort via label-free alignment.</description><pubDate>Tue, 17 Feb 2026 00:00:00 GMT</pubDate></item><item><title>Stop Using Embeddings for Everything in RAG</title><link>https://djellalmohamedaniss.github.io/posts/stop-using-embeddings-for-everything-in-rag/</link><guid isPermaLink="true">https://djellalmohamedaniss.github.io/posts/stop-using-embeddings-for-everything-in-rag/</guid><description>Why deterministic query translation should often come before embeddings in enterprise RAG systems, and how to combine both in a hybrid approach.</description><pubDate>Thu, 11 Dec 2025 01:34:00 GMT</pubDate></item></channel></rss>