How would you explain chunking strategy to an interviewer?

Instruction: Describe chunking in plain language and explain why it changes RAG quality.

Context: Checks whether the candidate can explain the core concept clearly and connect it to real production decisions.

Example Answer

The way I'd put it in an interview is this: chunking strategy is the decision about what unit of meaning the retriever is allowed to fetch. The goal is not to hit an arbitrary token count. The goal is to preserve enough context that a chunk can support an answer on its own while still being small enough to retrieve precisely.

I usually start from document structure and query shape. Policies, manuals, tickets, and API docs want different boundaries. Headings, lists, table captions, and effective dates often matter more than raw length because that is what keeps the evidence interpretable.
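To make the structure-first idea concrete, here is a minimal sketch of a chunker that splits on headings before it ever looks at length. It assumes Markdown-style "#" headings; the names Chunk, chunk_by_structure, and the max_chars default are hypothetical, chosen only for illustration. Real documents such as PDFs or HTML manuals would need their own boundary detection.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    heading_path: list[str]  # e.g. ["Refund Policy", "Eligibility"]
    text: str


# Assumption: headings look like Markdown ("# Title", "## Subtitle", ...).
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_by_structure(doc: str, max_chars: int = 1200) -> list[Chunk]:
    """Split on headings first; split on paragraph breaks only if a section is too long."""
    chunks: list[Chunk] = []
    path: list[str] = []    # headings that enclose the text currently being buffered
    buffer: list[str] = []  # body lines collected since the last heading

    def flush() -> None:
        body = "\n".join(buffer).strip()
        buffer.clear()
        if not body:
            return
        current = ""
        for para in re.split(r"\n\s*\n", body):
            candidate = f"{current}\n\n{para}" if current else para
            if current and len(candidate) > max_chars:
                # Section is too long: close the chunk at a paragraph boundary.
                chunks.append(Chunk(list(path), current))
                current = para
            else:
                current = candidate
        chunks.append(Chunk(list(path), current))

    for line in doc.splitlines():
        match = HEADING.match(line)
        if match:
            flush()  # the buffered text belongs to the previous heading path
            level = len(match.group(1))
            del path[level - 1:]  # drop headings at the same or deeper level
            path.append(match.group(2).strip())
        else:
            buffer.append(line)
    flush()
    return chunks
```

Each chunk carries its heading path, so a bullet list retrieved from an "Eligibility" section still says which policy it belongs to. Length only matters as a fallback, and even then the split lands on a paragraph break rather than mid-sentence.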

A bad chunking strategy has a recognizable failure mode: the system finds the right document, but the retrieved passage is too narrow to support the claim or too bloated to be useful. Then teams try to compensate with larger prompts or heavier reranking. Good chunking keeps retrieval precise, citations clean, and synthesis easier.

Common Poor Answer

A weak answer is, "I would split documents every 500 tokens with overlap." That sounds mechanical and ignores document structure, query shape, and whether the chunk can actually support a real answer.
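For contrast, this is roughly what the mechanical approach amounts to. It is a sketch only: it uses whitespace-separated words as a stand-in for model tokens, and the function name fixed_size_chunks is hypothetical.

```python
def fixed_size_chunks(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Naive baseline: cut every `size` tokens, sharing `overlap` tokens between neighbors."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    tokens = text.split()  # assumption: whitespace words approximate tokens
    step = size - overlap
    chunks: list[str] = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # the last window already reached the end of the text
    return chunks
```

Every boundary here depends only on position. A heading can land at the tail of one chunk while its body starts the next, and overlap softens that without fixing it.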

Related Questions