Instruction: Explain what TF-IDF stands for and how it is used in NLP.
Context: This question is designed to test the candidate's knowledge of text representation techniques.
The way I'd think about it is this: TF-IDF stands for Term Frequency-Inverse Document Frequency, a weighting method that scores how important a term is to a document relative to a corpus. The weight is the product of term frequency (how often the word occurs within a document) and inverse document frequency (a penalty for words that appear in many documents), so a word gets a high weight when it appears often in one specific document but rarely across the corpus as a whole.
It is useful for search, document ranking, keyword extraction, and simple text classification baselines because it highlights discriminative terms without needing deep modeling. Even today, it is a strong baseline in many information-retrieval workflows.
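To make the balance between within-document frequency and corpus-wide rarity concrete, here is a minimal sketch of one common TF-IDF variant: raw counts normalized by document length for TF, and log(N / df) for IDF. The function name, the toy documents, and the choice of variant are illustrative assumptions; real systems (and libraries like scikit-learn) often add smoothing or sublinear TF.

```python
import math

def tf_idf(docs):
    """Compute TF-IDF weights for a list of tokenized documents.

    TF = term count / document length; IDF = log(N / df).
    This is one common variant; smoothing and sublinear TF are
    frequent alternatives in practice.
    """
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = {}
    for doc in docs:
        for term in set(doc):
            df[term] = df.get(term, 0) + 1
    weights = []
    for doc in docs:
        counts = {}
        for term in doc:
            counts[term] = counts.get(term, 0) + 1
        weights.append({
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical toy corpus for illustration.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "cats and dogs".split(),
]
w = tf_idf(docs)
# "the" occurs twice in the first document but also in a second
# document, so its corpus-wide rarity is low; "cat" occurs once
# but only in the first document, so it ends up weighted higher.
```

Note how the example captures exactly the interview point above: despite appearing more often in the first document, "the" scores below "cat" because it is not discriminative across the corpus.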
What matters in an interview is not only knowing the definition, but being able to connect it back to how it changes modeling, evaluation, or deployment decisions in practice.
A weak answer just says TF-IDF "counts important words"; a strong answer explains the balance between within-document frequency and corpus-wide rarity, and why that balance surfaces discriminative terms.