The Last Technical Interview

steve-yegge.medium.com

Today we will pour one out for the vaunted technical interview process, which is on its last legs. And we’ll talk a little about what’s replacing it.

This post has been almost 35 years in the making; that’s how long I have been conducting technical interviews. And for a few of those decades, I also worked to try to improve the process itself. I’ve had to care a lot about it, because it’s so broken.

It turns out interviewing was broken long before I learned the trade, and despite the many attempts to band-aid it, it’s still broken today. It has managed to survive in spite of that. But it is finally dying on its own. People are a bit unclear on what’s next, so we’ll talk about some of our options.

But it’s not an easy path I bring you, no silver bullet. Remember that, grasshopper, when you get to the end and come back to yell at me.

The Two Big Dogs

Back at Amazon, we had this elite group, the Bar Raisers, which I hear is similar to a role at Microsoft they call As-Appropriate (AA). In both cases, a trusted interviewer (the BR/AA) is assigned to every interview loop, and has the power to veto any unqualified people the interviewers try to sneak through. I was a BR, and also part of “Bar Raisers Core”: a small group Bezos and Dalzell tasked with defining the BR role itself, choosing new bar raisers, training them, and reporting on the program’s efficacy.

The BR and AA roles are a tacit acknowledgement that you can’t trust your interview teams to make good hiring decisions. Which, even more broadly, suggests that if every single interview loop needs a babysitter, then it is a flawed process. But we were doing our best.

Of course we don’t frame it as babysitting, naturally. We cheerlead and rah-rah about keeping a “high hiring bar.” And BR/AA aren’t the end of it. There are tons of other interview process band-aids, all variations on how you conduct four to six interviews in a single day. Everything revolves around fitting the assessment into a single round or two of interviews.
That part never changes.

None of these band-aids really help: we still all hire tons of false positives (unqualified) and turn away false negatives (actually qualified), despite every attempt to make the process perfect, or even good.

The reality is that talent evaluation, a corner of our industry that hasn’t changed much in five-ish decades, is embarrassingly busted. It just doesn’t work that well in practice.

The people who know this best, and who feel it most day to day, and who ultimately have the least power to change it, are all in HR. The tech side of the tech industry doesn’t want to change it because of inertia: the process has worked just barely well enough all these years to resist an overhaul. So all HR can do is show us how bad it is, and then try to do damage control when it fails. I’ll share an amazing story about this in the next section.

After Amazon, I kept on trying to improve the interviewing process while I was at Google. I spent years on Google Kirkland’s “Hiring Committee”, where we processed thousands of Microsoft résumés, would-be escapees from over the hill in Redmond. I published a 30-page internal résumé screening guide. I even wrote a blog post, Get That Job At Google, which is still handed out by Google recruiters, seventeen years later. I took this stuff pretty seriously back then!

In short, there is very little that I don’t know about the tech-assessment business. And I’m about to pull a Kitchen Confidential on it.

This post comes with a diagnosis and a prescription. But the important part is the diagnosis. I want to convince you that we need a radical change in how we assess people, and that tech interviews are on their last legs.

Let’s see how well I do. As for the solution, we can figure that out once we agree that it’s a problem.

Remember That Time We All Fired Ourselves?

The outcomes from interviewing are statistically terrible. Google did wave upon wave of analysis over the years, and all the results were incredibly depressing.

To name just a few off the top of my head: interviewers barely agreed with each other. Put the same candidate in front of two of our sharpest people and you’d routinely get a confident “strong hire” from one and a flat “no” from the other. There’s no oracle interviewer, not even Jeff Dean.

And once people were actually on the job, their interview scores told you next to nothing about how they’d do, at least the way we ran the loops. Hell, some of our star performers failed their Google interviews four or five times, finally got in after 2+ years, and then outshone everyone else.

Interviewing, it turns out, is a big game of darts. A “do I like you” dating round.

All we could find at Google, from all the statistics, is that there are a bunch of horrible unconscious interviewer biases keeping us from hiring people who ought to be working there. Google never published the results internally and I just got to hear rumors from HR friends, but yeah, it was pretty bad.

Gergely asked me about the tech-assessment situation in one of our podcasts, and I told him a story that I think is relevant, so I’ll recount it here. You can hear the longer, funnier version over on his podcast.

At Google, rather than putting a little black desk on every interview loop, they just assumed all the interviewers had their heads up their arses (an evidence-based assumption), and they formed a committee at each site called Hiring Committee (HC). This committee acted as the final arbiter and gatekeeper on all hiring for the site. And I was on it for quite a few years.

Our HC group at the time was roughly fifteen strong, and included many local powerhouse Googler colleagues, including co-inventors of huge technologies, authors of interviewing book series, and people who are now very senior leaders.

We felt like we knew our shit.

One day, the recruiters gave us a special round of packets to review. In these special packets, we were able to read the interviewer notes and candidate responses. All personal details were stripped out, and we were told it was a “calibration exercise.” We had to do our regular voting job with these special packets, and see how it went. I think we may have assumed they were from another site, since cross-site calibration was common.

Our group did our job, and voted not to hire about 2/3 of the packets. This was about par for the course.

But surprise surprise, this time, those were our own packets from when we had all interviewed at Google. The recruiters had tricked us into reviewing our own interview packets, and we had voted not to hire most of our own group.

For that brief moment, we all had a glimpse into how utterly broken our process was. The people-team had rubbed our noses in it.

But we never fixed it.

[Image: The moment we realized it was us]

Change is Taboo

I once challenged the interview process at Google in a public mailing list, and an Important Eng Leader took me aside and told me I’d “farted in church.” People are not allowed to question whether the interview process is valid. Challenging it is akin to casting aspersions on the entire engineering staff. First of all, how dare you?

Moreover, many engineers and managers continue to defend the process, even knowing how broken it is, because it’s a gauntlet that they passed, so others must, too. Which sounds very silly when I say it like that, but you hear echoes of it whenever you suggest changing the process.

When you add it all up, and putter around with it for about 35 years, you ultimately find that interviewing, which aims to determine “can this person do this job,” is fundamentally unable to answer that question with any degree of reliability. It’s bordering on pseudoscience.

Bravo to Google for at least trying to measure it. They did find some actionable stuff. For instance, any more than four interviews and you’re just playin’ with your food. Stuff like that. But overall it was a bleak and mostly non-actionable picture.

Most companies never introspect at all on their own interviewing processes, at least not in any real “we’re aiming to change this” sense. They might shuffle around which competencies get covered, or what the debrief meeting might look like.

The fundamental process (a short series of 1-hour interviews) hasn’t changed in fifty years. We have all been using the same stupid, broken interview process since before most of us were born. Dunno about you, but I’ve had enough of it.

So in today’s blog post, I’m going to show you everything that’s been tried, then point in the direction of a better way. And as an industry, we can start moving in that direction. As long as enough of us are doing approximately the same thing, it should gain traction.

I believe that within a few years we will mostly stop conducting technical interviews, and they will fade into a cute historical footnote like phrenology (which I would not be shocked to hear some interviewers still use). Interviewing will become just another somethin’ weird we used to do.

You ready? I am. Let’s goooo!

The Signal Problem

Talent assessment is a problem of signal. You need a lot of it. You want to try to paint an increasingly clear picture of the candidate, and that takes a lot of data.

The first signal source you get is a candidate’s résumé. “A résumé,” as Dave Barry tells us, “is much more than just a piece of paper. It is a piece of paper with lies written all over it.” Companies get thousands to millions of résumés per year, and the signal-to-noise ratio is awful. AI-assisted writing is exacerbating the problem, and résumés have become all but useless as a signal today.

What we used to do first was a technical phone screen. They were super hard, for both sides. Everyone hated them. Imagine writing code for someone with only audio. Zoom made them obsolete. But even those awful phone screens were still a much better signal than a résumé, because you get to spend time with the person.

The next level of signal is “work completed”, which includes credentials and assessments like coding academies and challenges, as well as the person’s OSS work. They are both weak signal sources because you don’t get to work with the person while they’re doing it. But they can be a useful tie-breaker or foot-in-the-door signal that boosts you to the next evaluation stage.

The hiring signal doesn’t start to improve dramatically until you get to on-site interviews. But even in-person interviewing is infamously unlike real-world work. How does interviewing someone at a whiteboard compare to working with them in real life? I think we all know the answer to that.

So you get a few minutes with their résumé, a half-hour video screen, and a handful of in-person interviews. Today’s standard hiring process only involves collecting a few hours of signal, in order to make a decision that could last years.

Provisional Employment to the Rescue

The gold standard talent-assessment signal, when available, is an internship. This situates the candidate as a quasi-employee for typically around 3 months.