Describe the process of text normalization.

Instruction: Explain what text normalization is and why it is important in NLP.

Context: This question tests the candidate's knowledge of the preprocessing steps required to clean and standardize text data.

The way I'd think about it is this: Text normalization is the process of converting raw text into a more consistent form before modeling. That can include lowercasing, removing extra whitespace, standardizing punctuation, expanding contractions, normalizing numbers, or handling...
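To make those steps concrete, here is a minimal Python sketch that chains the operations named above: standardizing punctuation, lowercasing, expanding contractions, normalizing numbers, and collapsing whitespace. It is an illustration under assumptions, not a reference pipeline: the `normalize` function, the tiny `CONTRACTIONS` table, the `PUNCT_MAP` mapping, and the `<num>` placeholder are all hypothetical choices made for this example, and the right set of steps depends on the task and the model.

```python
import re
import unicodedata

# Hypothetical, deliberately tiny contraction table (assumption for this sketch);
# a practical table would be much larger.
CONTRACTIONS = {
    "don't": "do not",
    "can't": "cannot",
    "it's": "it is",
    "i'm": "i am",
}
CONTRACTION_RE = re.compile(
    r"\b(" + "|".join(re.escape(c) for c in CONTRACTIONS) + r")\b"
)

# Map curly quotes and long dashes to plain ASCII equivalents.
PUNCT_MAP = str.maketrans({
    "\u2018": "'", "\u2019": "'",
    "\u201c": '"', "\u201d": '"',
    "\u2013": "-", "\u2014": "-",
})


def normalize(text: str) -> str:
    """Return a lowercased, punctuation- and whitespace-standardized copy of text."""
    # Fold Unicode compatibility forms (e.g. non-breaking spaces, full-width letters).
    text = unicodedata.normalize("NFKC", text)
    # Standardize punctuation before matching contractions, so "don\u2019t" becomes "don't".
    text = text.translate(PUNCT_MAP)
    text = text.lower()
    # Expand the contractions listed in the table above.
    text = CONTRACTION_RE.sub(lambda m: CONTRACTIONS[m.group(1)], text)
    # Replace digit runs (with optional , or . separators) by a placeholder token.
    text = re.sub(r"\d+(?:[.,]\d+)*", "<num>", text)
    # Collapse any run of whitespace into one space and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    raw = "  I\u2019m  sure it\u2019s 3,500   items \u2014 don\u2019t worry!  "
    print(normalize(raw))  # prints: i am sure it is <num> items - do not worry!
```

The order of the steps matters in this sketch: punctuation is standardized and the text lowercased first so that the contraction lookup only has to handle one spelling of each form. Whether to lowercase or replace numbers at all is a design choice; tasks such as named-entity recognition often benefit from keeping case.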
