MLmediumTraining instabilities~15m
Gradient Clipping vs Learning Rate for Exploding Gradients
Problem
Compare gradient clipping and reducing the learning rate as two approaches to handling exploding gradients. Explain when each is preferable and what their respective failure modes are.
Reference solution
Reference solution available after you attempt the question.
Ready to solve it?
Start a session on Mockbit #85. You'll get graded with specific critique when you submit.
Related ML questions
← Back homemockbit.io/q/85