Yesterday, our friends at Unsloth shared an issue with gradient accumulation that affects the Transformers Trainer. The first report came from @bnjmn_marie (kudos to him!). Gradient accumulation is supposed to be the mathematical equivalent of full-batch training. However, the losses did not match between training runs with the setting turned on and off. Where did it come [...]
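To make the claimed equivalence concrete, here is a minimal sketch (a hypothetical illustration, not the Trainer's actual code) of why it can break: averaging the per-micro-batch mean losses only reproduces the full-batch mean when every micro-batch contributes the same number of elements. With variable-length sequences, micro-batches hold different token counts, and the naive average drifts away from the true full-batch loss unless each micro-batch is weighted by its size.

```python
# Hypothetical per-token losses for one "full batch" of 6 tokens.
def full_batch_mean(losses):
    # Mean over every token in the whole batch (the reference value).
    return sum(losses) / len(losses)

def naive_accumulated_mean(micro_batches):
    # Average of per-micro-batch means: only correct when sizes are equal.
    return sum(sum(mb) / len(mb) for mb in micro_batches) / len(micro_batches)

def weighted_accumulated_mean(micro_batches):
    # Weight each micro-batch by its size: always matches the full batch.
    total = sum(len(mb) for mb in micro_batches)
    return sum(sum(mb) for mb in micro_batches) / total

per_token_losses = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
equal = [per_token_losses[:3], per_token_losses[3:]]    # 3 + 3 tokens
unequal = [per_token_losses[:1], per_token_losses[1:]]  # 1 + 5 tokens

print(full_batch_mean(per_token_losses))   # 7.0
print(naive_accumulated_mean(equal))       # 7.0 -> matches
print(naive_accumulated_mean(unequal))     # 5.0 -> diverges
print(weighted_accumulated_mean(unequal))  # 7.0 -> matches again
```

The same reasoning carries over to gradients, since the gradient of a mean loss is the mean of the per-token gradients.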
The post Modifying gradient accumulation first appeared on Versa AI hub.
from Blog - Versa AI hub https://versaaihub.com/modifying-gradient-accumulation/?utm_source=rss&utm_medium=rss&utm_campaign=modifying-gradient-accumulation