Describe your process for feature selection in a high-dimensional dataset.

Instruction: Explain how you identify and select the most relevant features for your model.

Context: This question tests the candidate's ability to handle high-dimensional data and select features that are most informative for the task at hand.

I start feature selection by removing obvious problems first: leakage, duplicated signals, unstable fields, high-missingness columns, and features that will not exist reliably at inference time. There is no value in sophisticated selection on top of a broken feature set.
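As a rough illustration of that first cleanup pass, here is a minimal sketch in Python with pandas. The function name `prescreen_features`, the `max_missing` threshold, and the `known_unavailable` list are assumptions made for this example and are not part of the original answer. Leakage and inference-time availability usually cannot be detected automatically, so the sketch assumes those columns are flagged by hand. Unstable fields are not handled here, since checking them needs time-split data.

```python
import pandas as pd


def prescreen_features(df, max_missing=0.5, known_unavailable=()):
    """Return (kept, dropped): kept column names and a {column: reason} dict.

    Hypothetical helper: names and thresholds are illustrative assumptions.
    """
    dropped = {}

    # 1. Columns flagged by hand as leaky or not available at inference time.
    for col in known_unavailable:
        if col in df.columns:
            dropped[col] = "unavailable_or_leaky"

    # 2. Columns whose fraction of missing values exceeds the threshold.
    missing_frac = df.isna().mean()
    for col in df.columns:
        if col not in dropped and missing_frac[col] > max_missing:
            dropped[col] = "high_missingness"

    # 3. Columns with at most one distinct non-null value carry no signal.
    for col in df.columns:
        if col not in dropped and df[col].nunique(dropna=True) <= 1:
            dropped[col] = "constant"

    # 4. Exact duplicate columns: keep the first occurrence, drop the rest.
    #    Transposing is simple but memory-hungry on very wide frames.
    remaining = [c for c in df.columns if c not in dropped]
    dup_mask = df[remaining].T.duplicated()
    for col in dup_mask.index[dup_mask]:
        dropped[col] = "duplicate"

    kept = [c for c in df.columns if c not in dropped]
    return kept, dropped
```

Returning the drop reasons alongside the kept columns makes the pass auditable: each removal can be reviewed before any model-based selection runs on what is left.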

After that, I use a mix...
