How Claude Code works in large codebases: Best practices and where to start

C claude.com ↗

▲ 247 points • 160 comments • by shenli3514 • 1w ago • HN discussion ↗

Pangram verdict · v3.3

We believe that this document is a mix of AI-generated, AI-assisted, and human-written content

64 %

AI likelihood · overall

Mixed

21% human-written 68% AI-generated

SEGMENTS · HUMAN 1 of 4

SEGMENTS · AI 3 of 4

WORD COUNT 1,434

PEAK AI % 99% · §3

Analyzed

May 15

backend: pangram/v3.3

Segments scanned

4 windows

avg 359 words each

Distribution

21 / 68%

human / AI fraction

Verdict

Mixed

Pangram v3.3

Article text · 1,434 words · 4 segments analyzed

Human AI-generated

§1 Human · 11%

Claude Code is running in production across multi-million-line monorepos, decades-old legacy systems, distributed architectures spanning dozens of repositories, and at organizations with thousands of developers. These environments present challenges that smaller, simpler codebases don’t, whether that’s build commands that differ across every subdirectory or legacy code spread across folders with no shared root.This article covers the patterns we've observed that have led to successful adoption of Claude Code at scale. We use “large codebase” to refer to a wide range of deployments: monorepos with millions of lines, legacy systems built over decades, dozens of microservices across separate repositories, or any combination of the above. That also includes codebases running on languages that teams don't always associate with AI coding tools, such as C, C++, C#, Java, PHP. (Claude Code performs better than most teams expect it to in those cases, particularly as of recent model releases.) While every large codebase deployment is shaped by its specific version control, team structure, and accumulated conventions, the patterns here generalize across them and are a good starting point for teams considering adopting Claude Code.How Claude Code navigates large codebasesClaude Code navigates a codebase the way a software engineer would: it traverses the file system, reads files, uses grep to find exactly what it needs, and follows references across the codebase. It operates locally on the developer’s machine and doesn’t require a codebase index to be built, maintained, or uploaded to a server. The AI coding tools relied on RAG-based retrieval by embedding the entire codebase and retrieving relevant chunks at query time. At large scale, those systems can fail because embedding pipelines can’t keep up with active engineering teams. By the time a developer queries the index, it reflects the codebase as it existed days, weeks, or even hours ago. Retrieval then returns a function the team renamed two weeks ago, or references a module that was deleted in the last sprint, with no indication that either is out of date.Agentic search avoids those failure modes. There's no embedding pipeline or centralized index to maintain as thousands of engineers commit new code. Each developer's instance works from the live codebase. But the approach has a tradeoff: it works best when Claude has enough starting context to know where to look.

§2 AI · 76%

This means the quality of Claude's navigation is shaped by how well the codebase is set up, layering context with CLAUDE.md files and skills. If you ask it to find all instances of a vague pattern across a billion-line codebase, you’ll hit context-window limits before the work begins. Teams that invest in codebase setup see better results.The harness matters as much as the modelOne of the most common misconceptions about Claude Code is that its capabilities are solely defined by the model used. Teams focus on a model’s benchmarks and how it performs on test tasks. In practice, the ecosystem built around the model—the harness—determines how Claude Code performs more than the model alone.The harness is built from five extension points—CLAUDE.md files, hooks, skills, plugins, and MCP servers—each serving a different function. The order in which teams build them matters, as each layer builds on what came before. Two additional capabilities, LSP integrations and subagents, round out the setup. Below, we explain what each of these components and capabilities do: CLAUDE.md files come first. These are context files that Claude reads automatically at the start of every session: root file for the big picture, subdirectory files for local conventions. They give Claude the codebase knowledge it needs to do anything well. Because they load in every session regardless of the task, keeping them focused on what applies broadly will prevent them from becoming a drag on performance.Hooks make the setup self-improving. Most teams think of hooks as scripts that prevent Claude from doing something wrong, but their more valuable use is continuous improvement. A stop hook can reflect on what happened during a session and propose CLAUDE.md updates while the context is fresh. A start hook can load team-specific context dynamically so every developer gets the right setup for their module without manual configuration. For automated checks like linting and formatting, hooks enforce the rules deterministically and produce more consistent results than relying on Claude to remember an instruction.Skills keep the right expertise available on-demand without bloating every session. In a large codebase with dozens of task types, not all expertise needs to be present in every session. Skills solve this through progressive disclosure, offloading specialized workflows and domain knowledge that would otherwise compete for context space and loading them only when the task calls for it.

§3 AI · 99%

For example, a security review skill loads when Claude is assessing code for vulnerabilities, while a document processing skill loads when a code change is made and documentation needs to be updated.Skills can also be scoped to specific paths so they only activate in the relevant part of the codebase. A team that owns a payments service can bind their deployment skill to that directory, so it never auto-loads when someone is working elsewhere in the monorepo.Plugins distribute what works. One challenge with large codebases is that good setups can stay tribal. A plugin bundles skills, hooks, and MCP configurations into a single installable package, so when a new engineer installs that plugin on day one, they will immediately have the same context and capabilities as those who have been using Claude already. Plugin updates can be distributed across the organization through managed marketplaces. For example, a large retail organization we work with built a skill connecting Claude to their internal analytics platform so that business analysts could pull performance data without leaving their workflow. They distributed it as a plugin before the broad rollout to the business.Language server protocol (LSP) integrations give Claude the same navigation a developer has in their IDE. Most large-codebase IDEs already have an LSP running, powering "go to definition" and "find all references." Surfacing this to Claude gives it symbol-level precision: it can follow a function call to its definition, trace references across files, and distinguish between identically named functions in different languages. Without it, Claude pattern-matches on text and can land on the wrong symbol. One enterprise software company we worked with deployed LSP integrations org-wide before their Claude Code rollout, specifically to make C and C++ navigation reliable at scale. For multi-language codebases, this is one of the highest-value investments.MCP servers extend everything. MCP servers are how Claude connects to internal tools, data sources, and APIs that it can't otherwise reach. The most sophisticated teams built MCP servers exposing structured search as a tool Claude can call directly. Others connect Claude to internal documentation, ticketing systems, or analytics platforms.Subagents split exploration from editing. A subagent is an isolated Claude instance with its own context window that takes a task, does the work, and returns only the final result to the parent.

§4 AI · 84%

Once the harness is in place, some teams spin up a read-only subagent to map a subsystem and write findings to a file, then have the main agent edit with the full picture. Claude Code’s extension layer at a glance.The table below summarizes what each component does, when it loads, and the most common mistakes we see with each: Component What it is When it loads Best for Common confusion CLAUDE.md Context file Claude reads automatically Every session Project-specific conventions, codebase knowledge Using it for reusable expertise that belongs in a skill Hooks Scripts that run at key moments Triggered by events Automating consistent behavior, capturing session learnings Using prompts for things that should run automatically Skills Packaged instructions for specific task types On demand, when relevant Reusable expertise across sessions and projects Loading everything into CLAUDE.md instead Plugins Bundled skills, hooks, MCP configs Always available once configured Distributing a working setup across the org Letting good setups stay tribal Language server protocol (LSP)* Real-time code intelligence via language specific servers Always available once configured Symbol-level navigation and automatic error detection in typed languages Assuming that it's automatic MCP servers Connections to external tools and data Always available once configured Giving Claude access to internal tools it can't otherwise reach Building MCP connections before the basics are working Subagents* Separate Claude instances for specific tasks When invoked Splitting exploration from editing, parallel work Running exploration and editing in the same session *LSP is accessed through the plugin layer. Subagents are a delegation capability rather than a configured extension point. Three configuration patterns from successful deploymentsHow you configure Claude Code for a large codebase depends heavily on how that codebase is structured. Still, three patterns appeared consistently across the deployments we observed.