Instruction: Provide definitions of both terms and discuss their differences.
Context: This question assesses the candidate's understanding of basic text preprocessing techniques in NLP.
The way I'd explain it in an interview is this: stemming and lemmatization both reduce inflected words to a base form, but they get there differently. Stemming applies heuristic suffix-stripping rules, so the result may not be a real dictionary word; the Porter stemmer, for example, turns "studies" into "studi". Lemmatization uses a vocabulary and linguistic knowledge, often including the word's part of speech, to return the dictionary form, or lemma, so "studies" becomes "study".
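To make the contrast concrete, here is a minimal Python sketch. It is a toy, not how any real library works: the suffix rules, the small lemma table, and the names toy_stem and toy_lemmatize are all invented for illustration. Real stemmers and lemmatizers use far larger rule sets and vocabularies.

```python
# Toy illustration only. The rules and the lexicon below are hypothetical
# and deliberately tiny; they are not taken from any real library.

# Suffix rules are tried in order and the first match wins.
SUFFIX_RULES = [
    ("ies", "i"),  # studies -> studi
    ("ing", ""),   # running -> runn
    ("ed", ""),    # jumped  -> jump
    ("s", ""),     # cats    -> cat
]


def toy_stem(word: str) -> str:
    """Strip a suffix using fixed rules, with no knowledge of the language."""
    word = word.lower()
    for suffix, replacement in SUFFIX_RULES:
        # Keep at least three characters so very short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: len(word) - len(suffix)] + replacement
    return word


# A lemmatizer needs a vocabulary; here it is a lookup keyed by
# (word, part of speech). Unknown words are returned unchanged.
LEMMA_TABLE = {
    ("studies", "NOUN"): "study",
    ("studies", "VERB"): "study",
    ("running", "VERB"): "run",
    ("better", "ADJ"): "good",
    ("saw", "VERB"): "see",
    ("saw", "NOUN"): "saw",
}


def toy_lemmatize(word: str, pos: str) -> str:
    """Return the dictionary form for a (word, part-of-speech) pair."""
    word = word.lower()
    return LEMMA_TABLE.get((word, pos), word)


if __name__ == "__main__":
    examples = [("studies", "NOUN"), ("running", "VERB"),
                ("better", "ADJ"), ("saw", "VERB")]
    for word, pos in examples:
        print(f"{word:8} stem={toy_stem(word):8} lemma={toy_lemmatize(word, pos)}")
```

Under these assumptions, toy_stem("studies") gives "studi" and toy_stem("running") gives "runn", neither of which is a dictionary word, while toy_lemmatize maps them to "study" and "run". The "saw" entries show why part of speech matters: as a verb it maps to "see", as a noun it stays "saw", and a suffix rule cannot tell the two apart.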
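In practice, stemming is faster and cruder, while lemmatization is slower but more accurate and language-aware. I would choose based on the task. If speed matters and occasionally merging unrelated words is acceptable, as in many search-indexing setups, stemming is often enough. If interpretability and linguistic precision matter, lemmatization is usually the better fit.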
What matters in an interview is not only knowing the definition, but being able to connect it back to how it changes modeling, evaluation, or deployment decisions in practice.
A weak answer says both techniques just shorten words, without explaining the rule-based versus language-aware difference.