voice modems

2026-04-26
If you've done much with modern cellphones, you've probably noticed just how odd the architecture can be around audio. Specifically, I mean call audio: modern smartphones have made call audio less of a special case (mostly by just becoming more complicated in general), but in older phones you would often find arrangements where the cellular modem[1] had direct analog audio to the microphone and speaker, perhaps via some switching to share amplifiers. That design meant that the cellular modem functioned basically as a completely independent device, a fully-capable "cellular phone" with the ability to make and receive voice calls. The role of the rest of the smartphone, and its operating system, was just to provide control messages for starting and ending calls.
In modern phones the audio path to and from the modem is digital and it's more integrated into the operating system audio service, but still not fully. You might have noticed, for example, that it is excessively difficult to record call audio on most phones. Regulatory and liability pressures are one reason for this, but another is that it's actually kind of difficult: there may not be any physical way for software running on the main processor to receive audio from the cellular modem. The designer has to put in explicit effort to make that work, effort that only became common more recently to facilitate automatic transcription (and VoLTE, a whole complication that I will simply ignore for the sake of a cleaner historical narrative). You come here to read about old phones, not new ones.
You've probably read enough of my writing to know where this is going: the design of cellular radios, which assume call audio to be part of Their Exclusive Domain, is a legacy of an age-old architectural decision traceable to the original Hayes Smartmodem. It relates to a feature of modems that was widely available, but sparsely used, for much of the PC revolution. The details are odd!
First, for context, let's recede into our mind palaces and travel back to the 1980s. AT&T-designed modems like the Bell 103 had created a standardized family of protocols for data over voice lines, and a company called Hayes introduced a Bell 103-like implementation called the Smartmodem. The Smartmodem was quite successful on its own, but it was more significant for having introduced a common control interface between the modem and the computer.

Previous modems had acted as transparent devices that expected Something Else to perform call setup tasks, while the Hayes Smartmodem could pick up the line and dial all on its own. That required that the computer send commands to the modem to configure and start a call. Hayes designed a simple scheme for sending commands to the modem and switching it in and out of transparent data mode, and that protocol was then widely copied by other modem manufacturers. You could call it the "Hayes command set," and older documents often do, but these days it's more commonly known by the two characters that prefix most commands: the AT protocol.
From its origin in 1981, AT has shown remarkable staying power. Virtually all computer-connected modems, to this very day, continue to use AT commands for basic configuration. Likewise, the basic architecture of the Smartmodem persists: the Smartmodem connected to the host computer using a single RS-232 link that switched between carrying control messages and data. The very latest 5G modems still work the same way, complicated by the addition of multiple separate UART serial channels (so that, for example, control commands, data, and GNSS data can each have their own separate channel) and the adoption of the USB communications device class "Abstract Control Model," a standard UART-over-USB implementation mostly intended to simplify modems.
Plug a modern 5G modem into a Linux machine and you can easily observe this: virtually all cellular modems are USB-attached and will appear as a USB composite device with multiple serial adapters, usually attached as /dev/ttyACM* due to the USB-CDC ACM class. Courtesy of the V.250 standard (a formalization of AT commands) and considerable effort by driver implementers, USB-attached modems "Just Work" as network interfaces on modern Linux, but under the hood, the kernel is communicating with the modem over separate serial interfaces.
Back in the olden days, it was common to run PPP (point-to-point protocol) over one of the serial interfaces to use the actual data (bearer) channel, but now PPP has mostly given way to "Direct IP" where you just push packets over the serial link. Just to complicate things a touch more, there are vendor-specific standards like QMI (Qualcomm) that completely replace AT and find use in modern smartphones, but they're messy with regard to Linux support.
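
To make the command/response pattern concrete, here is a minimal Python sketch of a single AT exchange: write the command followed by a carriage return, then read lines until a final result code such as OK or ERROR arrives. The `send_at` helper and the `FakeModem` stand-in are names made up for this illustration, and the canned reply is invented rather than captured from real hardware. With a real modem you would pass in an open serial port object instead (for example one from the pyserial library, on whatever /dev/ttyACM* node the modem's control channel turns up as).

```python
import io

# Final result codes from V.250; "+CME ERROR" is the cellular extended error form.
FINAL_RESULT_CODES = {"OK", "ERROR", "NO CARRIER", "BUSY", "NO DIALTONE", "NO ANSWER"}


def send_at(port, command, max_lines=50):
    """Send one AT command and collect the reply.

    `port` is any object with write(bytes) and readline() -> bytes.
    Returns (final_code, info_lines); final_code is None if the stream
    ended (or timed out) before a final result code was seen.
    """
    port.write(command.encode("ascii") + b"\r")
    info = []
    for _ in range(max_lines):
        raw = port.readline()
        if not raw:  # timeout or end of stream
            break
        line = raw.decode("ascii", errors="replace").strip()
        if not line or line == command:  # skip blank lines and command echo
            continue
        if line in FINAL_RESULT_CODES or line.startswith("+CME ERROR"):
            return line, info
        info.append(line)
    return None, info


class FakeModem:
    """Hypothetical stand-in for a serial port: replays a canned reply."""

    def __init__(self, response):
        self.written = b""
        self._rx = io.BytesIO(response)

    def write(self, data):
        self.written += data
        return len(data)

    def readline(self):
        return self._rx.readline()


# Invented reply: command echo, one information line, then the final result code.
modem = FakeModem(b"ATI\r\r\nExample Modem 1.0\r\n\r\nOK\r\n")
code, info = send_at(modem, "ATI")
print(code, info)
```

Real modems add wrinkles this sketch ignores, such as unsolicited result codes (RING, for one) arriving in the middle of a reply.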

If you are personally interacting at this layer, messing with modems or writing communications software or whatever, you are almost certainly going to stick to AT commands. Modem vendors continue to build on AT. If you look at LTE modems made for IoT applications, for example, it's common for them to provide a complete HTTP implementation (and sometimes MQTT, and sometimes some kind of proprietary message broker protocol) accessible via AT commands. That means you can implement an IoT device without a network stack at all, deferring all network operations to the modem itself. With a JSON-over-HTTP backend, for example, you might send AT commands with JSON payloads over the serial control channel and then get JSON back. You never interact with the network at all; the modem is a completely self-contained system.
At the extreme, you might implement your entire device using exclusively the modem. This is a common approach for telematics devices like GPS trackers: they consist of nothing but a cellular modem, the telemetry application is built for the modem using an SDK from its vendor, and you interact with it using AT commands. IoT-class modems frequently provide GPIO and user flash for just this purpose.
None of that is actually what this article is about, but I want to make clear how profound the implications of the Smartmodem heritage are. In 1981, the Smartmodem was a standalone device controlled over serial because the limitations of the era's computers made that a practical necessity. Processors weren't fast enough to run the modem DSP alongside other workloads, certification requirements for telephone-connected devices were stricter, etc. Despite the late-'90s detour into "winmodems," most of those constraints still exist, just in the different forms of the cellular network. Today's modems are less v.54 and more 5G, but they still act as standalone devices controlled over serial channels.
Most telephone modems of the 1980s were exclusively data modems. You could use AT commands to make a call, switch into data mode, and then you basically had a very long serial cable from your device to the computer on the other end of the call. That was all these modems did; their only interaction with "The Telephone System" besides as a pair of wires was for basic call control like detecting dial tone and sending DTMF dialing.
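
The two-mode behavior of a data modem can be sketched as a toy state machine in Python. This is an illustration under assumptions, not an emulator: `ToyModem` is a made-up class, the commands it understands (ATD to dial, the `+++` escape, ATO to go back online, ATH to hang up) are standard Hayes-style commands that the text above doesn't spell out, and the guard-time silence a real modem requires around `+++` is ignored entirely.

```python
from enum import Enum


class Mode(Enum):
    COMMAND = "command"
    DATA = "data"


class ToyModem:
    """Toy model of command mode vs. transparent data mode."""

    def __init__(self):
        self.mode = Mode.COMMAND
        self.call_active = False
        self.line_out = []  # chunks passed through to the far end

    def feed(self, text):
        """Handle one chunk from the host; return a result code or None."""
        if self.mode is Mode.DATA:
            if text == "+++":  # escape sequence (real modems also need guard time)
                self.mode = Mode.COMMAND
                return "OK"
            self.line_out.append(text)  # transparent: just pass it along
            return None

        cmd = text.strip().upper()
        if not cmd.startswith("AT"):
            return "ERROR"
        if cmd.startswith("ATD"):  # dial, then switch to data mode
            self.call_active = True
            self.mode = Mode.DATA
            return "CONNECT"
        if cmd == "ATO":  # return to data mode on an existing call
            if self.call_active:
                self.mode = Mode.DATA
                return "CONNECT"
            return "NO CARRIER"  # this toy's choice; real modems vary
        if cmd == "ATH":  # hang up
            self.call_active = False
            return "OK"
        return "OK"  # accept any other AT command without acting on it


modem = ToyModem()
for chunk in ["ATDT5551234", "hello, remote computer", "+++", "ATH"]:
    print(repr(chunk), "->", modem.feed(chunk), modem.mode.value)
```

Once the modem answers CONNECT, everything the host writes goes to the far end untouched, which is the "very long serial cable" part. Only the escape sequence brings the modem back to listening for commands.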

That was quite natural considering their evolution from acoustic coupler modems (where you dialed the phone yourself and then set the handset on the modem), but by the late '80s, as devices like the Smartmodem with their own call control became common, it started to feel primitive. With Carterfone and the breakup of the AT&T monopoly, computers were starting to feel like first-class citizens on the telephone system. Shouldn't they have more complete support for, well, telephone things?
From a modern perspective, it might seem odd that fax came to modems before voice, but it makes technical sense. Fax machines use a digital protocol that is loosely derived from Bell 103 and belongs to the same extended family as other telephone modems, so modems already had the hardware. Implementing fax support was just a matter of software. With some extensions to the AT command set, your computer became a fax machine.
By the late 1980s, fax support was common in modems, usually distinguished by marketing the modem as "data/fax." For example, the command AT+FCLASS=1.0 changed the modem to T.31 fax mode (fax class 1.0). T.31/EIA-578 is a standard for sending and receiving faxes using a serial connection to a telephone modem, and it was widely implemented by commercial software packages. There were so many "PC fax" packages available in the 1990s that you could stock half an aisle of an OfficeMax with them, and indeed that's what happened. The legacy of this industry is that there are still dozens of "fax server" products built around data/fax modems, like the open source HylaFAX.
Fax modems also made a more general contribution to the modem state of the art: the concept of distinct modes. "Fax class 0" was data mode, while values like 1 and 2 and, oddly, 1.0 and 2.0 were used for different fax implementations. There was an obvious, and tantalizing, opportunity: more modes. Maybe, even, a modem mode for that most classic application of the telephone: voice.
Could you use your computer for telephone calls? The idea is obvious, so it's no surprise that several vendors were working on it all at once.
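
As an aside on those class values, here is a small Python sketch of how software might interpret a modem's list of supported classes, as returned by the test form of the command, `AT+FCLASS=?`. The exact reply format varies between modems, so the accepted shapes are assumptions, and `parse_fclass_list` and `describe_class` are names invented for the example.

```python
def parse_fclass_list(response):
    """Parse a reply to AT+FCLASS=? into a list of class strings.

    Accepts shapes like "0,1,2", "(0,1,1.0)" or "+FCLASS: (0,1,2.0)".
    Values stay as strings on purpose (see the note after this block).
    """
    text = response.strip()
    if ":" in text:  # drop a "+FCLASS:" prefix if present
        text = text.split(":", 1)[1]
    text = text.strip().strip("()")
    return [item.strip() for item in text.split(",") if item.strip()]


def describe_class(value):
    """Label a class value using only the distinction drawn in the text."""
    if value == "0":
        return "data"
    if value in {"1", "2", "1.0", "2.0"}:
        return "fax"
    return "other"


for value in parse_fclass_list("+FCLASS: (0,1,1.0,2)"):
    print(value, "->", describe_class(value))
```

The one design choice worth pointing out is keeping the values as strings. Parse them as numbers and 1 and 1.0 collapse into the same thing, which is exactly the distinction the oddball naming depends on.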

Early efforts at telephone-on-computer could be quite comical, consisting of a telephone that was more or less glued to a computer, with no electrical connectivity between them. The IBM Palm Top PC 110 is my favorite example of this form, a Japanese-market miniature laptop with a speaker and microphone on the front edge so that you could hold it up to your face to make a call. Besides amusement, it illustrates a fundamental challenge of merging computers with telephony: real-time media is hard.
It seems very funny to build a telephone into a computer because computers are general-purpose devices defined by software. Putting a phone in the computer should not mean physically putting a phone in the computer; the phone should obviously be a software application. Well, obviously from our modern perspective, but real-time media has always been difficult for computers (which, for architectural reasons, are mostly seen today as fundamentally asynchronous, non-real-time devices). Modern computers get away with it by brute force; they're just so fast that they can be wildly inefficient with media and still keep up to real-time. But things were different in the 1990s. Real-time audio processing was a fairly demanding application and most of the computer industry preferred to leave it to hardware.
Still, the voice modem was an inevitability. In 1991, the Los Angeles Times reported that at least three companies were working on some form of "modem with voice support" for 1992. The article focused mainly on Rockwell International, which proved the right call. We don't remember Rockwell as a semiconductor company today, but in the 1990s they very much were: Rockwell Semiconductor was later spun out into Conexant, now part of Synaptics. At the time, Rockwell was a major player in semiconductors, especially for communications.
Rockwell had particular expertise in answering telephones. During the 1970s, the Rockwell Galaxy Automatic Call Distributor just about invented the modern call center. It was the first digitally-controlled system that answered calls on a pool of telephone lines, placed them on hold, and distributed them to a pool of operators.