Instruction: Explain how you identify and select the most relevant features for your model.
Context: This question tests the candidate's ability to handle high-dimensional data and select features that are most informative for the task at hand.
I start feature selection by removing the obvious problems: leakage, duplicated signals, unstable fields, high-missingness columns, and features that will not exist reliably at inference time. There is no value in sophisticated selection on top of a broken feature set.
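As a rough illustration of that clean-up pass, here is a minimal sketch in plain Python. The function name `prefilter_features`, the `max_missing` threshold of 0.5, and the sample column names are all assumptions made for the example, not part of the original answer. It only covers the checks that can be automated (missingness, constant columns, exact duplicates, and a hand-maintained list of fields unavailable at inference); leakage and unstable fields still need domain review.

```python
from typing import Any, Dict, Iterable, List, Optional


def prefilter_features(
    columns: Dict[str, List[Optional[Any]]],
    max_missing: float = 0.5,  # assumed threshold; tune per dataset
    unavailable_at_inference: Iterable[str] = (),
) -> Dict[str, str]:
    """Return {column_name: reason} for columns to drop before any selection."""
    dropped: Dict[str, str] = {}
    blocked = set(unavailable_at_inference)
    seen: Dict[tuple, str] = {}  # column contents -> first column holding them

    for name, values in columns.items():
        # Fraction of missing entries (None marks a missing value here).
        missing = sum(v is None for v in values) / len(values) if values else 1.0
        present = {v for v in values if v is not None}
        key = tuple(values)

        if name in blocked:
            dropped[name] = "not available at inference"
        elif missing > max_missing:
            dropped[name] = "high missingness"
        elif len(present) <= 1:
            dropped[name] = "constant"
        elif key in seen:
            # Only exact duplicates are caught; near-duplicates need a
            # correlation or similarity check on top of this.
            dropped[name] = f"duplicate of {seen[key]}"
        else:
            seen[key] = name
    return dropped


# Hypothetical toy data, one list per column.
data = {
    "age": [34, 51, None, 29],
    "age_copy": [34, 51, None, 29],
    "country": ["US", "US", "US", "US"],
    "notes": [None, None, None, "x"],
    "refund_issued": [0, 1, 0, 0],
}
report = prefilter_features(
    data, max_missing=0.5, unavailable_at_inference=["refund_issued"]
)
for column, reason in report.items():
    print(f"{column}: {reason}")
```

Returning a reason per dropped column, rather than silently filtering, keeps the step auditable: the list can be reviewed with whoever owns the data before anything more elaborate runs on the remaining features.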
After that, I use a mix...