AI, Ashby Engineering, and the Future

A ashbyhq.com ↗

▲ 62 points • 55 comments • by fredley • 3w ago • HN discussion ↗

Pangram verdict · v3.3 · analyzed Jun 4

We believe that this document is fully human-written. AI likelihood overall: 0% (100% human-written, 0% AI-generated). Segments: 5 of 5 human, 0 of 5 AI (5 windows, avg 369 words each). Peak AI likelihood: 0% (§5).

Article text · 1,845 words · 5 segments analyzed

Since August 2025, more than half of the new code hitting Ashby’s production systems has been AI-generated, yet customer issues remain broadly stable. See the graph below. More customers. More AI-written code. The sky didn’t fall.

We have a blip in March / April every year; these cyclical patterns aren’t relevant to explain here. Cursor provides stats on how much of our code is generated by AI.

We’ve also not seen any regressions in code quality, velocity, or onboarding time for engineers (anecdotally, we’ve seen comprehension of the codebase increase!).

This isn’t a toy project. Ashby is a suite of talent acquisition software with over 100,000 weekly active users, millions of candidate applications per week, and features that resemble entire companies' worth of product (like Calendly and Looker). I’m Colin, Head of EMEA Engineering at Ashby. I want to share with you how Ashby Engineering is thinking about AI and the changes it brings to how we work. I’m going to assume you’re an engineer.

Our thesis is that the cost of producing code is heading towards zero. AI isn’t coming for our jobs, it’s coming for the mechanical parts of them: syntax, glue code, and the tip-taps of keystrokes. The parts that are less interesting, less challenging.

The part that matters for engineers - your judgment, your taste, your understanding of our customers - is getting more important, not less. Your value as an engineer was always weighted in your judgment. Every efficiency gain in producing high-quality code shifted the role further in that direction. AI will be a larger shift than we’ve seen before.

That shift is already here.

“Almost all my PRs are entirely AI-written now. I implemented an entire data ingestion via AI… It's ~40 PRs” - Tom, one of our engineers.

Like any emerging technology, the industry is figuring out how to use AI effectively to build software. When to trust it, when to override it, and what needs to change in our systems so that "move fast" doesn't become "move recklessly." It's a shared mental model, and I expect it to evolve as we learn.

The Ground Rules

As we use LLMs more and the world around us shifts, we believe there are two ground rules:

- Empathy cannot be replaced by AI
- You are responsible for what you ship

Empathy Cannot Be Replaced by AI

Building products is a human endeavor. LLMs do not have taste. They do not know our customers. They cannot feel or understand the frustration of using a bad product or the delight of using an exceptional one. That still requires judgment, and, in a world where building a functional product is insanely fast, the ability to build a great one is even more important.

We also value individual focus, so when we collaborate, it’s important we do it effectively. We don’t do mindless standups. We don’t do planning poker. We do write documents for our colleagues to read and understand. We do ask for help with reviewing changes.

Empathy means remembering to write these documents for the humans who will read them. LLMs can help with writing. But, without guidance, LLMs will write documents that seem convincing yet are hard for humans to read, full of unimportant details, and lacking joy and wisdom. Here’s an excerpt of a PR description I had an LLM write:

    Added .github/workflows/pr-relevant-test-coverage.yml:
      - Triggers on pull_request (excluding master) and
        workflow_dispatch with pr_number.
      - Resolves PR number, collects changed files, and asks
        Claude to output up to 15 relevant test files.

This is all information that we can trivially figure out from reading the PR, and the full description was close to 30 lines. The most useful line still missed the mark:

    Coverage is intentionally not full-suite coverage; it
    reflects only Claude-selected relevant tests against
    changed files.

Why? Why does this not run full-suite coverage? This description does not respect our colleague’s time. It is devoid of empathy for us as reviewers and future maintainers of this code.

    Coverage is intentionally not full-suite coverage. The
    full suite with coverage takes hours to run. We are
    using this to give guidance to engineers on where risks
    lie.

Remember what empowers our colleagues to help us. Don’t cede writing documents for humans to LLMs.

You are responsible for what you ship

LLMs can be wonderfully wrong.

Confidently incorrect. Inexplicably careless. The biggest risk with AI isn’t that it’s wrong. It’s that it sounds right.

“I didn’t mean to remove the tar-stream package - it was an accidental casualty when I was editing backend/package.json…” - Claude Code

You are responsible for what you ship. Whether every line is handwritten or an LLM generated the entire PR. You are responsible for understanding what the code does, why it does it, and what happens when it breaks.

As we use LLMs more, skepticism has to increase, not decrease. Ask for alternatives. Ask for edge cases. Ask it to critique itself. Understand the reasoning before accepting the output.

Think More, Think Harder

We must think more - and think harder than before. LLMs make it easy to no-brain your way through something. Resist that urge. Stay vigilant. It is easy to throw an issue at an LLM, have the PR description auto-generated, get the LLM to write the tests, and throw the PR out for review… all whilst fixing the entirely wrong issue or building a subpar solution.

A particularly nefarious manifestation of this is running lots of agents in parallel on disparate tasks. This is multi-tasking on steroids. Multitasking is ineffective because the human brain can’t focus on multiple high-level tasks simultaneously. It may feel super productive to have five agents working on five issues and flicking between them all, but are you really making your best decisions? Are you able to think deeply about the guidance each agent needs? Do you understand what is being built?

The current hype cycle often emphasizes quantity and velocity above all else, while ignoring quality and ingenuity, or with the promise that somehow these outcomes will follow. At Ashby, we are not succumbing to this pressure. It’s a myopic view of the world where everything can and should be shortcutted. Shortcuts always existed, and many of them reduce the quality and ingenuity of our work:

These external quotes reflect our own journey to success. Before AI, we could have always moved faster: we could have outsourced work to contractors, we could have built features instead of building blocks, we could have launched earlier. But the hours we spent hiring quality folks instead of outsourcing, thinking of abstractions instead of coding, and being patient with our product were often more impactful than the alternative.

Thinking deeply is part of why we’re a successful startup today, and we’re not stopping.

Specs are Still for Humans

One of the shifts we’re seeing is the intention to feed specs to LLMs.

We've always valued specs. They derisk development and ensure alignment. LLMs also benefit from the context that specs can provide.

However, what a human needs from a spec and what an LLM needs from a spec are different. As humans, we need something that is mindful of our time, engages us as readers, and focuses our attention on the decisions that matter. For example, we need something that tells us why you’ve decided to use Redis instead of Postgres, not a document detailing every single possible value for a new enum.

We need something empathetic to us.

We must continue writing specs for humans. Specs are focused on the expensive-to-change decisions. Specs reduce the risk that we build the wrong thing. They identify the abstractions that we’re going to need. For example, I was talking with one of our engineers about a requirement to perform an action on potentially millions of forms, and our framework doesn’t support that. Do they create a one-off implementation for their use case or figure out how to improve the bulk action building block? That’s the kind of decision we need to capture. It completely changes the implementation. It affects everyone else. Get it right, and we’ve created leverage for the next person.

Specs are written for humans. An LLM might consume them as useful additional context.

How to Think About LLMs

We’ve set the ground rules and discussed how we want to interact as humans. Now: how we think about operating with LLMs.

There’s a lot of great material out there that gives an introduction to how LLMs work (this post by Sam Rose is a good one).

First, LLMs are not lazy. They will produce code. And keep going. They won’t stop and ask, “Should I create an abstraction for this?” They’re great at summarizing swathes of information. They’re not novel thinkers. They won’t make the mental leaps that allow you to simplify something and delete thousands of lines of code.

A simple model I use is to think of the LLM as a set of dice, not the hyped superintelligence. Some problems LLMs are good at, and you don’t need to roll a high score for.

They’re great at summarizing documents, finding patterns, and continuing them.

Some problems they’re terrible at. You’ll never get enough sixes in a row. Counting the number of r's in “strawberry”, multiplying large numbers, or figuring out which direction you're facing after a series of turns. These feel like they should be easy, but they require a kind of precision that dice just can't reliably give you.

And then there are ways to load the dice to tip the odds in your favor, like giving them an example of what good looks like.

With that model in mind, here's how working with LLMs plays out in practice.

Two Modes of Working With AI

I see two distinct ways to work with LLMs: as a sidekick or as a delegate. Recognizing which mode you're in and which you should be in is the key skill.

Default to sidekick mode. You’re using AI to explore the codebase, find and digest large amounts of information, and implement the detailed specs you’ve written. You are making most of the decisions.

This is the mode for anything high-risk: database migrations, candidate data handling, security-sensitive code, and architectural decisions. These are the places where “looks right” isn’t good enough. You need to be in the driver's seat.

Switch to delegate mode when the blast radius is small. You review the output - or sometimes you don't. Prototyping, local tools, and operations tools are great candidates here. You can move fast because the cost of failure is low.

Most engineers will over-delegate at first. Then they’ll over-correct and under-delegate. The question is never "should I use AI?" - it's "how much should I trust it here?" Think about what happens if the code is wrong. Is it embarrassing? Expensive? Existential?

Blending between the two will be common. That’s where your judgment matters, and where breaking up the task pays off. You might start off by having the LLM build the scaffolding of the new feature you’re working on, hand-write some SQL queries, and finally jump back to full delegation for writing a few unit tests.

How We’re Using AI Today

We’re actively encouraging engineers to use AI tools to support their work. We’re doing this through education, workshops, pairing, and generous token budgets. We do not mandate the use of AI or measure token usage.
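
As a rough illustration of the sidekick/delegate choice described under Two Modes of Working With AI, here is a minimal Python sketch. It is not Ashby tooling, and the article does not describe any such script. The tag strings, the `Mode` enum, and the `choose_mode` function are all hypothetical names invented for this example. Only the categories come from the text: high-risk work (database migrations, candidate data handling, security-sensitive code, architectural decisions) stays in sidekick mode, and small-blast-radius work (prototyping, local tools, operations tools) can be delegated.

```python
from enum import Enum
from typing import Iterable


class Mode(Enum):
    SIDEKICK = "sidekick"   # you make most of the decisions
    DELEGATE = "delegate"   # the LLM does the work, review is lighter


# Categories taken from the article; the tag spellings are hypothetical.
HIGH_RISK = frozenset(
    {"database-migration", "candidate-data", "security", "architecture"}
)
LOW_BLAST_RADIUS = frozenset({"prototype", "local-tool", "ops-tool"})


def choose_mode(tags: Iterable[str]) -> Mode:
    """Suggest a working mode for a task described by a set of tags."""
    tag_set = set(tags)
    # Any high-risk tag wins, even when mixed with low-risk ones.
    if tag_set & HIGH_RISK:
        return Mode.SIDEKICK
    # Delegate only when every tag is known to have a small blast radius.
    if tag_set and tag_set <= LOW_BLAST_RADIUS:
        return Mode.DELEGATE
    # Unknown or untagged work falls back to the default: sidekick.
    return Mode.SIDEKICK


if __name__ == "__main__":
    print(choose_mode({"prototype"}).value)
    print(choose_mode({"prototype", "candidate-data"}).value)
```

The sketch deliberately falls back to sidekick mode for anything it does not recognize, mirroring the "default to sidekick" advice. A lookup table like this cannot replace the judgment call the article asks for. It only makes the default and the overrides explicit.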