Instruction: Discuss how the learning rate affects the convergence of gradient descent.
Context: This question tests the candidate's knowledge of the gradient descent optimization algorithm, particularly the role of the learning rate in achieving convergence to a minimum.
The way I'd explain it in an interview: the learning rate controls how large each parameter update is during optimization, which makes it one of the most important knobs in training. If it is too high, training can become unstable or bounce around the minimum without converging; if it is too low, convergence is slow and training may take many more steps to reach the minimum.
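This behavior can be seen in a minimal sketch (a hypothetical toy example, not part of the official answer): gradient descent on the one-dimensional function f(x) = x^2, whose gradient is 2x, run with a small and a large learning rate.

```python
# Hypothetical toy example: gradient descent on f(x) = x^2, gradient f'(x) = 2x.
def gradient_descent(lr, x0=1.0, steps=50):
    x = x0
    for _ in range(steps):
        x -= lr * 2 * x  # update rule: x <- x - lr * f'(x)
    return x

# lr = 0.1: each step multiplies x by (1 - 0.2) = 0.8, so x shrinks toward 0.
print(abs(gradient_descent(0.1)))  # small value near the minimum at x = 0
# lr = 1.1: each step multiplies x by (1 - 2.2) = -1.2, so |x| grows each step.
print(abs(gradient_descent(1.1)))  # large value: the iterates diverge
```

On this quadratic the update is x ← (1 − 2·lr)·x, so the iterates converge exactly when |1 − 2·lr| < 1, i.e. 0 < lr < 1; larger rates make each step overshoot the minimum by more than the previous error.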