Apple: Embarrassingly Simple Self-Distillation Improves Code Generation

arXiv:2604.01193 (cs) [Submitted on 1 Apr 2026]
Computer Science > Computation and Language
Title: Embarrassingly Simple Self-Distillation Improves Code Generation
Authors: Ruixiang Zhang, Richard He Bai, Huangjie Zheng, Navdeep Jaitly, Ronan Collobert, Yizhe Zhang

Abstract: Can a large language model (LLM) improve at code generation using only its own raw outputs, without a verifier, a teacher model, or reinforcement learning? We answer in the affirmative with simple self-distillation (SSD): sample solutions from the model with certain temperature and truncation configurations, then fine-tune on those samples with…
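The "temperature and truncation" sampling configuration the abstract refers to is the standard decoding setup of temperature scaling followed by nucleus (top-p) truncation. The sketch below is a generic illustration of that mechanism, not code from the paper; the function name and parameter values are hypothetical.

```python
import math

def truncated_sample_dist(logits, temperature=0.8, top_p=0.9):
    """Temperature-scaled softmax followed by nucleus (top-p) truncation.

    Returns a dict mapping kept token indices to renormalized probabilities.
    Illustrative only; the paper's exact configuration is not shown here.
    """
    # Temperature scaling: lower temperature sharpens the distribution.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Nucleus truncation: keep the smallest set of top tokens whose
    # cumulative probability mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Renormalize over the kept tokens; all others get probability 0.
    z = sum(probs[i] for i in kept)
    return {i: probs[i] / z for i in kept}
```

SSD would then sample completions from such a truncated distribution and fine-tune the model on its own samples, with no external verifier or teacher.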

Read more on Hacker News