Sophia: Scalable Stochastic 2nd-Order Optimizer for Language Model Pre-Training

Article URL: https://arxiv.org/abs/2305.14342

Comments URL: https://news.ycombinator.com/item?id=39959228

Points: 1

# Comments: 0