Meta Superintelligence’s surprising first paper
Long-awaited first paper from Meta Superintelligence Labs is not a model-layer innovation. What does this mean?

rudyl.ai and Charles Pierse
Sep 17, 2025

TL;DR

- MSI’s first paper, REFRAG, is about a new way to do RAG.
- Its slightly modified LLM converts most retrieved document chunks into compact, LLM-aligned chunk embeddings that the LLM can consume directly.
- A lightweight policy (trained with RL) decides which chunk embeddings should be expanded back into full tokens under a budget; the LLM runs normally on this mixed input.
- The net effect is far less KV cache and attention cost, much faster first-byte latency, and higher throughput, while preserving perplexity and task accuracy in benchmarks.

Meta’s new Superintelligence labs made big…