TLDR - here is the PoC

This write-up details a novel iPhone BootROM vulnerability discovered and exploited by our team. It covers the underlying bug, the associated exploitation techniques, and the post-exploitation steps required to compromise the application processor's boot chain. The exploit leverages both a hardware bug in the USB controller and a specific configuration flaw present in the device firmware.

Currently supported SoCs include Apple A12, S4/S5, and A13. While supporting A12X/Z is technically possible, it is not currently implemented. We limited our implementation to these devices, as demonstrating successful exploitation across this range was sufficient to validate both the vulnerability and the exploitation strategy.

By publishing this research and the accompanying proof of concept, we aim to document the real-world impact of this class of hardware vulnerabilities, contribute to the broader understanding of modern BootROM security, and demonstrate that even recent SecureROM generations remain susceptible to subtle hardware flaws.

As these vulnerabilities reside in immutable code, affected users should be aware that migrating to newer hardware remains the most effective mitigation.

USB Setup Packets Anatomy

In the context of the USB specification, every control transfer must begin with a Setup transaction. This is the mechanism the host uses to issue any kind of request to the attached device. A proper Setup transaction consists of two packets sent by the host:

(1) TOKEN PACKET host → device
    ┌───────┬──────────┬─────────┬─────────┬────────┬──────┐
    │ SYNC  │   PID    │  ADDR   │  ENDP   │  CRC5  │ EOP  │
    │ 8 bit │  8 bit   │  7 bit  │  4 bit  │ 5 bit  │      │
    │       │ (SETUP)  │ device  │endpoint │        │      │
    └───────┴──────────┴─────────┴─────────┴────────┴──────┘

(2) DATA PACKET host → device
                       ┌────────────────────────┐
    ┌───────┬──────────┤     DATA (8 bytes)     ├────────┬──────┐
    │ SYNC  │   PID    │  = USB Device Request  │ CRC16  │ EOP  │
    │ 8 bit │  8 bit   │                        │ 16 bit │      │
    │       │ (DATA0)  │  This is what the USB  │        │      │
    └───────┴──────────┤    driver receives     ├────────┴──────┘
                       └────────────────────────┘

According to the specification, the data payload of a Setup transaction must be exactly 8 bytes and adhere to a strict format. The device request structure contained within this payload is passed verbatim to the software driver for handling. For clarity, we will refer to this specific payload as the "Setup packet."

DWC2 Direct Memory Access

The USB controller used by Apple in their SoCs is the DWC2 by Synopsys. Detailed information on how this controller works can be inferred by analyzing other existing driver implementations, such as those found in the Linux kernel.

What we care about here is how the controller writes data to main memory. The AP configures DMA by allocating a memory region and writing its physical address to a specific register (DOEPDMA) in the controller's MMIO region. The controller uses this buffer to store data received in SETUP and OUT packets, which is then processed by the device.

We observed that when the USB controller writes data chunks, it directly increments the address stored in the DOEPDMA register. This is a crucial detail: it suggests that the register value is not just a configuration facility but the direct source of truth for the physical address used by the next DMA transfer.

The Bug

The DesignWare USB controller stores up to three consecutive Setup packets in memory. Upon receiving a fourth Setup transaction, the DMA base address gets reset to its starting position before writing, akin to a ring buffer mechanism. After writing each received packet, the controller increments DOEPDMA by the size of the data written. The reset operation is implemented by decrementing DOEPDMA by 24.
The core issue arises because the controller also accepts smaller packets (though it always stores data in 4-byte chunks). Since the pointer increment does not match the fixed decrement amount, we end up with a buffer underflow primitive in 12-byte steps.
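The arithmetic behind the 12-byte step can be sketched with a small model. This is a simplified illustration, not the controller's actual logic: it assumes that payloads are rounded up to a multiple of 4 bytes and that the rewind happens on every fourth Setup packet. The names `dma_model`, `dma_chunk`, and `model_setup` are hypothetical and do not come from any driver or from the PoC.

```c
#include <assert.h>
#include <stdint.h>

#define SETUP_SLOTS  3u   /* Setup packets stored before the pointer is rewound */
#define REWIND_BYTES 24u  /* fixed decrement: 3 slots x 8 bytes */

/* Hypothetical model of the controller state; not a real register layout. */
struct dma_model {
    uint64_t doepdma;  /* address of the next DMA write */
    unsigned stored;   /* Setup packets written since the last rewind */
};

/* Assumption: writes are rounded up to the 4-byte DMA granularity. */
static uint32_t dma_chunk(uint32_t len)
{
    return (len + 3u) & ~3u;
}

/* Feed one Setup payload of `len` bytes; returns the address it is written to. */
static uint64_t model_setup(struct dma_model *m, uint32_t len)
{
    assert(len >= 1 && len <= 8);

    if (m->stored == SETUP_SLOTS) {   /* fourth packet: rewind before writing */
        m->doepdma -= REWIND_BYTES;
        m->stored = 0;
    }

    uint64_t dst = m->doepdma;
    m->doepdma += dma_chunk(len);     /* advance by what was actually written */
    m->stored++;
    return dst;
}
```

In this model, four 8-byte packets starting at 0x1000 are written to 0x1000, 0x1008, 0x1010, and 0x1000 again, so the rewind is harmless. With 4-byte packets the pointer only advances by 12 per group of three, so the fourth write lands at 0x1000 - 12 and the seventh at 0x1000 - 24.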
We believe this is an inherent bug within the USB controller itself. While it potentially affects many devices, the vulnerability is exploitable only under specific circumstances. As of today, we have confirmed that the A12 and A13 SecureROMs are vulnerable, whereas A11 is not. The difference is that the A11 USB driver manually resets the DMA address to its initial value after receiving each packet. On A12 and A13, the USB DART is configured in bypass mode, allowing us to overwrite SRAM data freely. In contrast, A14 and later generations appear to configure the DART correctly in SecureROM, making the vulnerability unexploitable.

PC control on A12

Achieving PC control on A12 is straightforward because the USB controller's DMA buffer is allocated on the heap shortly after the USB task's stack. The simplest approach is to overwrite a saved LR on the stack and obtain direct PC control when the scheduler performs a context switch back to the USB task.

PC control on A13

Things aren't as easy on A13 SecureROM due to the introduction of PAC. From what we observed, PAC appears to be applied only to stack-stored LRs, which is nevertheless enough to prevent us from directly targeting the USB task's saved LR as we did on A12.

Several mitigations had to be bypassed along the way. These include heap metadata checksums, which are verified during heap operations, and LR signing during context switches, which occur whenever the USB task is woken up to process USB packets. After some iterations, we came up with the following multi-step technique to achieve PC control.

The first step is to overwrite some DART-related data located in the heap immediately before the USB controller's DMA buffer. This gives us very limited write primitives that can be triggered once when exiting the DFU loop. We leverage some of the cleanup routines, which use controlled data, to zero out the global pointer to the DART allocation.
This step is necessary to prevent the corrupted allocation from being freed, which would otherwise trigger a panic when the heap checksum is checked.

    void dart_stop(unsigned int dart_id)
    {
        // [1] we can fully overwrite the heap memory for this object
        dart = darts[dart_id];
        mmio_base = dart->info->mmio_base;
        v4 = 16 * dart->ctx->field_11 + 0x200;
        for ( int i = 0; i < 16; i += 4 )
        {
            // [2] this gives us a 16-byte zero write primitive
            *(_DWORD *)(mmio_base + v4 + i) = 0;
        }
        dart_flush_maybe(mmio_base);
    }

    void dart_free(__int64 dart_id)
    {
        // [...]
        dart_stop(dart_id);
        enter_critical_section();
        ref_count = info->ref_count - 1;
        info->ref_count = ref_count;
        if ( !ref_count )
            irq_mask(info->int_irq_num);
        exit_critical_section();
        // [3] we need to use the zero write primitive for this second deref to return 0 and make the free a no-op
        dart = darts[dart_id];
        free(dart);
        darts[dart_id] = 0;
    }

Next, as part of the same cleanup path, we use a 0xF write primitive to overwrite a global panic counter. This way, the next panic will cause the CPU to enter an infinite loop rather than triggering a reboot.

    void dart_flush_maybe(__int64 mmio_base)
    {
        __dmb();
        // [4] this gives us a 0xF write primitive targeting the panic depth counter
        *(_DWORD *)(mmio_base + 52) = 0xF;
        *(_DWORD *)(mmio_base + 32) = 0;
        ticks = get_ticks();
        while ( 1 )
        {
            v3 = get_ticks();
            if ( (*(_DWORD *)(mmio_base + 32) & 4) == 0 )
                break;
            if ( v3 - ticks >= 1000000 )
                panic();
        }
    }

    void __noreturn panic()
    {
        // [5] after overwriting this global we would enter an infinite loop on panic
        if ( ++panic_cnt >= 3 )
            spin();
        // [...]
    }

The next step is to avoid breaking the USB task's context, in particular the LR and SP registers, which are saved to and restored from the task structure during context switches. To achieve this, we time DMA writes to happen while the task is awake, so that when it yields, the correct register values overwrite the ones we corrupted.

    Task structure
    0x000 ┌─────────────────────┐
          │                     │
          │ Task state,         │
          │ scheduler data,     │
          │ crit. section depth │
          │                     │
    0x030 ├─────────────────────┤
          │                     │
          │   Other registers   │
          │                     │ <──┐
          ├──────────┬──────────┤    │ This area needs to
          │    LR    │    SP    │    │ be overwritten while
          ├──────────┴──────────┤    │ USB task is running
          │                     │ <──┘
          │  Safe to overwrite  │
          │                     │
    0x1b0 └─────────────────────┘

After this, we can target a field within the task structure itself that tracks the task's critical-section depth. This allows us to trigger a panic with IRQs enabled, causing execution to enter the infinite loop established in the first step while still allowing ISRs to run. Additionally, the USB controller remains in a state that allows us to continue writing data to memory.

    void enter_critical_section()
    {
        current_task = current_task();
        critical_section_depth = current_task->critical_section_depth;
        if ( critical_section_depth < 0 || critical_section_depth >= 10000 )
            panic();
        // [1] each call increments the task's critical_section_depth
        current_task->critical_section_depth = critical_section_depth + 1;
        if ( !critical_section_depth )
        {
            irq_disable();
        }
    }
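
The depth bookkeeping can be modelled in isolation to show why a lowered counter leads to a panic with IRQs enabled. This is a minimal sketch, assuming the exit path mirrors the decompiled `exit_critical_section` in this section; `cs_model`, `model_enter`, and `model_exit` are hypothetical names, and the `panicked` flag stands in for the call to `panic()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the task field and the CPU IRQ mask. */
struct cs_model {
    int  depth;         /* models task->critical_section_depth */
    bool irqs_enabled;  /* models the IRQ mask state */
    bool panicked;      /* set where the real code would call panic() */
};

static void model_enter(struct cs_model *t)
{
    assert(t != NULL);
    if (t->depth < 0 || t->depth >= 10000) {
        t->panicked = true;
        return;
    }
    if (t->depth++ == 0)
        t->irqs_enabled = false;  /* 0 -> 1 transition masks IRQs */
}

static void model_exit(struct cs_model *t)
{
    assert(t != NULL);
    if (t->depth <= 0) {
        t->panicked = true;       /* unbalanced exit */
        return;
    }
    if (--t->depth == 0)
        t->irqs_enabled = true;   /* 1 -> 0 transition unmasks IRQs */
}
```

If the model enters twice and the stored depth is then lowered from 2 to 1, the first `model_exit` already re-enables IRQs and the second one hits the `depth <= 0` check, so the panic path is reached while IRQs are enabled.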

    void exit_critical_section()
    {
        current_task = current_task();
        // [2] should be equal to the number of calls to `enter_critical_section`,
        //     but we can overwrite it with a smaller count
        critical_section_depth = current_task->critical_section_depth;
        // [3] on first entry it will enable interrupts and on second entry it will panic
        if ( critical_section_depth <= 0 )
            panic();
        critical_section_depth = critical_section_depth - 1;
        current_task->critical_section_depth = critical_section_depth;
        if ( !critical_section_depth )
            irq_enable();
    }

After all of this setup, we can freely overwrite memory until we reach the global variable containing the USB IRQ handler itself. By overwriting it with an arbitrary value, we gain PC control.

    // [1] an array of `irq_handler_ctx` structs lives in the BSS section, which we can reach with our bug
    struct irq_handler_ctx // sizeof=0x18
    {
        void (*handler)(void *arg); // offset 0x00
        __int64 *arg;               // offset 0x08
        _BYTE unk;                  // offset 0x10
        _BYTE shall_mask;           // offset 0x11
        // offsets 0x12..0x17: padding
    };
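
Why a single overwritten table entry is enough can be seen from a generic dispatch model. This is an illustrative sketch only: the table size, the `dispatch` function, and the two stand-in handlers are hypothetical and are not taken from SecureROM. The struct follows the layout listed above, which occupies 0x18 bytes on an LP64 target.

```c
#include <assert.h>
#include <stdint.h>

/* Same field order as the listing above; field meanings are assumed. */
struct irq_handler_ctx {
    void   (*handler)(void *arg);
    int64_t *arg;
    uint8_t  unk;
    uint8_t  shall_mask;
};

#define NUM_IRQS 4u  /* hypothetical table size */
static struct irq_handler_ctx handlers[NUM_IRQS];

static int last_called;  /* records which stand-in handler ran */
static void usb_isr(void *arg)  { (void)arg; last_called = 1; }
static void other_fn(void *arg) { (void)arg; last_called = 2; }

/* Dispatch trusts whatever pointer the table currently holds. */
static void dispatch(unsigned irq)
{
    assert(irq < NUM_IRQS);
    if (handlers[irq].handler)
        handlers[irq].handler(handlers[irq].arg);
}
```

Registering `usb_isr` in a slot and then replacing the `handler` field with `other_fn` changes what the next `dispatch` call for that slot executes, without the dispatcher itself being modified. The function pointer in this table is plain data, which is what makes it a useful target once the stack-stored LRs are protected.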
void handle_irq() { irq_num = MEMORY[0x23B102004]; if ( MEMORY[0x23B102004] ) { while ( irq_num ) { if ( (irq_num & 0x70000) !