Instruction: Describe the steps you would take to identify the root cause of a significant drop in model accuracy in a production system. Discuss how you would prioritize actions and what measures you might implement to rectify the issue.
Context: This question assesses the candidate's problem-solving skills and understanding of the complexities involved in maintaining ML model performance over time. Candidates should demonstrate their approach to troubleshooting, including potential areas of investigation (e.g., data quality, model drift, feature changes) and remediation strategies.
I would start by isolating whether the issue is model-related, data-related, or system-related. That means checking recent deployments, feature freshness, schema changes, label pipelines, upstream service behavior, and whether the drop is global or isolated to certain segments.
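The "global vs. isolated to certain segments" check can be sketched as a simple segment-wise accuracy breakdown. The record format here (dicts with `segment`, `prediction`, and `label` keys) is a hypothetical stand-in for whatever the production logging actually emits:

```python
from collections import defaultdict

def accuracy_by_segment(records):
    """Break overall accuracy down per segment so a drop confined
    to one slice (e.g. one platform or region) is visible instead
    of being averaged away in the global number."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for r in records:
        total[r["segment"]] += 1
        correct[r["segment"]] += int(r["prediction"] == r["label"])
    return {seg: correct[seg] / total[seg] for seg in total}

# Illustrative data: the "web" slice is healthy, "mobile" is degraded.
records = [
    {"segment": "web", "prediction": 1, "label": 1},
    {"segment": "web", "prediction": 0, "label": 0},
    {"segment": "mobile", "prediction": 1, "label": 0},
    {"segment": "mobile", "prediction": 0, "label": 1},
]
print(accuracy_by_segment(records))
```

A global accuracy of 50% here would suggest a model-wide problem; the per-segment view shows the failure is entirely in one slice, which points instead at a segment-specific feature or pipeline issue.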
Once the likely cause is clear, the response could be a rollback, traffic reduction, feature repair, retraining, or a temporary threshold change. The key is to avoid assuming the model itself is broken before ruling out instrumentation and data-pipeline failures.
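Part of ruling out a data-pipeline failure is checking whether a feature's production distribution has drifted from its training distribution. A minimal sketch using the Population Stability Index follows; the bin count and the common reading of PSI above roughly 0.25 as significant drift are illustrative conventions, not fixed standards:

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between a reference sample of a
    feature (e.g. training data) and a recent production sample.
    Bins are equal-width over the reference range; out-of-range
    production values are clamped into the edge bins."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0

    def bin_fractions(sample):
        counts = [0] * bins
        for x in sample:
            i = min(max(int((x - lo) / width), 0), bins - 1)
            counts[i] += 1
        n = len(sample)
        # Floor each fraction to avoid log(0) on empty bins.
        return [max(c / n, 1e-6) for c in counts]

    e = bin_fractions(expected)
    a = bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

reference = [i / 100 for i in range(100)]
shifted = [x + 0.5 for x in reference]
print(psi(reference, reference))  # ~0: distributions match
print(psi(reference, shifted))    # large: clear drift
```

A high PSI on a key feature would push the remediation toward feature repair or retraining on fresh data, while a low PSI across features would shift suspicion back to deployment or instrumentation changes.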
What matters in an interview is not just listing diagnostic steps, but connecting each finding back to a concrete modeling, evaluation, or deployment decision in practice.
A weak answer jumps straight to "retrain the model" without first checking whether the accuracy drop came from labels, features, or deployment changes.