Codex SQLite feedback logs can write ~640 TB/year and rapidly consume SSD endurance


▲ 503 points • 269 comments • by vantareed • 2d ago • HN discussion ↗


Issue

Codex is continuously writing a large amount of data to the local SQLite feedback log database:

~/.codex/logs_2.sqlite
~/.codex/logs_2.sqlite-wal
~/.codex/logs_2.sqlite-shm

On my machine, after about 21 days of uptime, about 37 TB has been written to the main SSD. Process/file-level checks show that the Codex SQLite logs are the main continuous writer. That extrapolates to roughly 640 TB/year (37 TB / 21 days × 365 days). On a 1 TB SSD, that is about 640 full-drive writes per year. Some consumer SSDs are rated around 600 TBW, so this could consume roughly a full drive's warranted write endurance in less than a year.

Evidence

Current retained rows in logs_2.sqlite:

| metric | value |
| --- | --- |
| retained rows | 681,774 |
| estimated retained log content | 1,035.6 MiB |

Level distribution:

| level | estimated MiB | byte % |
| --- | --- | --- |
| TRACE | 732.5 | 70.7% |
| INFO | 266.5 | 25.7% |
| DEBUG | 30.6 | 3.0% |
| WARN | 5.9 | 0.6% |

Largest target+level pairs:

| target | level | estimated MiB |
| --- | --- | --- |
| codex_api::endpoint::responses_websocket | TRACE | 527.4 |
| codex_otel.log_only | INFO | 141.2 |
| codex_otel.trace_safe | INFO | 121.2 |
| log | TRACE | 97.4 |
| codex_client::transport | TRACE | 60.1 |
| codex_core::stream_events_utils | DEBUG | 27.5 |
| codex_api::sse::responses | TRACE | 19.1 |

The top sources are mostly global TRACE logs, mirrored telemetry logs, and raw websocket/SSE payload logging. TRACE alone is about 70.7% of retained bytes. codex_otel.log_only + codex_otel.trace_safe add another 25.3%. Filtering these categories should remove roughly 96% of retained log bytes in this sample without fully disabling feedback logs.
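The percentages above can be rechecked from the table values alone. The following is a minimal sketch in Rust that only redoes the arithmetic; the constant and function names (`LEVEL_MIB`, `OTEL_MIRROR_MIB`, `removable_shares`) are made up for illustration and are not part of Codex.

```rust
/// Estimated retained MiB per level, copied from the level distribution table.
const LEVEL_MIB: [(&str, f64); 4] = [
    ("TRACE", 732.5),
    ("INFO", 266.5),
    ("DEBUG", 30.6),
    ("WARN", 5.9),
];

/// The two mirrored telemetry targets, copied from the target+level table.
const OTEL_MIRROR_MIB: [(&str, f64); 2] = [
    ("codex_otel.log_only", 141.2),
    ("codex_otel.trace_safe", 121.2),
];

/// Percentage of `total` represented by `part`.
fn share_pct(part: f64, total: f64) -> f64 {
    100.0 * part / total
}

/// Returns (TRACE %, otel mirror %, combined %) of retained bytes.
fn removable_shares() -> (f64, f64, f64) {
    let total: f64 = LEVEL_MIB.iter().map(|entry| entry.1).sum();
    let trace: f64 = LEVEL_MIB
        .iter()
        .filter(|entry| entry.0 == "TRACE")
        .map(|entry| entry.1)
        .sum();
    let otel: f64 = OTEL_MIRROR_MIB.iter().map(|entry| entry.1).sum();
    (
        share_pct(trace, total),
        share_pct(otel, total),
        share_pct(trace + otel, total),
    )
}

fn main() {
    let (trace, otel, combined) = removable_shares();
    println!(
        "TRACE: {:.1}%  otel mirrors: {:.1}%  combined: {:.1}%",
        trace, otel, combined
    );
}
```

With the table values this gives about 70.7% for TRACE, 25.3% for the two codex_otel targets, and about 96% combined, matching the figures quoted above.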


Sanitized examples from the most frequent TRACE source: target=log

These are high-frequency retained samples. Raw websocket/SSE payload bodies are intentionally not included because they may contain private conversation content.

128,764x TRACE log: inotify event: ... mask: OPEN, name: Some("ld.so.cache")
37,982x TRACE log: inotify event: ... mask: OPEN, name: Some("locale.alias")
23,843x TRACE log: inotify event: ... mask: OPEN, name: Some("passwd")
3,639x TRACE log: <tokio-tungstenite checkout>/src/compat.rs:131 AllowStd.with_context
3,505x TRACE log: <tokio-tungstenite checkout>/src/lib.rs:245 WebSocketStream.with_context
3,362x TRACE log: <tokio-tungstenite checkout>/src/compat.rs:154 Read.read
3,356x TRACE log: <tokio-tungstenite checkout>/src/compat.rs:157 Read.with_context read -> poll_read
3,230x TRACE log: <tokio-tungstenite checkout>/src/lib.rs:294 Stream.poll_next
3,227x TRACE log: <tokio-tungstenite checkout>/src/lib.rs:304 Stream.with_context poll_next -> read()
3,213x TRACE log: inotify event: ... mask: OPEN, name: Some("nsswitch.conf")
2,001x TRACE log: WouldBlock
1,217x TRACE log: Masked: false
1,169x TRACE log: Opcode: Data(Text)
1,169x TRACE log: First: 11000001

Sanitized examples from frequent INFO sources

The dominant INFO sources are mostly repeated OpenTelemetry mirror events. IDs are redacted.

843x INFO codex_client::custom_ca: using system root certificates because no CA override environment variable was selected ...

334x INFO codex_otel.trace_safe: session_loop{thread_id=<redacted>}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id=<redacted> codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=<redacted> ...}

333x INFO codex_otel.log_only: session_loop{thread_id=<redacted>}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id=<redacted> codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=<redacted> ...}

332x INFO codex_otel.log_only: session_loop{thread_id=<redacted>}:submission_dispatch{otel.name="op.dispatch.user_input_with_turn_context" submission.id=<redacted> codex.op="user_input_with_turn_context"}:turn{otel.name="session_task.turn" thread.id=<redacted> ...}

332x INFO codex_otel.trace_safe: session_loop{thread_id=<redacted>}:submission_dispatch{otel.name="op.dispatch.user_input_with_turn_context" submission.id=<redacted> codex.op="user_input_with_turn_context"}:turn{otel.name="session_task.turn" thread.id=<redacted> ...}

Write amplification

The retained DB size hides the real write volume. In a 15-second sample:

| metric | before | after |
| --- | --- | --- |
| retained rows | 681,774 | 681,774 |
| max row id | 5,003,347,015 | 5,003,383,226 |

About 36,211 rows were inserted in 15 seconds, while retained row count stayed flat.
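Both rate figures in this report are simple extrapolations, and they can be reproduced with a few lines of Rust. This is only a sketch of the arithmetic: the helper names are hypothetical, and the row count assumes that row ids are assigned sequentially, so that the difference in max row id equals the number of inserts.

```rust
/// Rows inserted between two samples of the maximum row id
/// (assumes ids are assigned sequentially).
fn rows_inserted(max_id_before: u64, max_id_after: u64) -> u64 {
    max_id_after.saturating_sub(max_id_before)
}

/// Scales a quantity observed over `observed_days` to a 365-day year.
fn per_year(amount: f64, observed_days: f64) -> f64 {
    amount / observed_days * 365.0
}

/// Days until a drive's rated endurance is reached at a constant daily write rate.
fn days_to_rated_endurance(rated_tbw: f64, tb_per_day: f64) -> f64 {
    rated_tbw / tb_per_day
}

fn main() {
    // 15-second sample of max row id, from the table above.
    let inserted = rows_inserted(5_003_347_015, 5_003_383_226);
    let rows_per_sec = inserted as f64 / 15.0;
    println!("{} rows in 15 s, about {:.0} rows/s", inserted, rows_per_sec);

    // SSD-level figures from the Issue section: about 37 TB over about 21 days.
    let tb_per_year = per_year(37.0, 21.0);
    let days = days_to_rated_endurance(600.0, 37.0 / 21.0);
    println!(
        "about {:.0} TB/year; a 600 TBW rating is reached in about {:.0} days",
        tb_per_year, days
    );
}
```

With the numbers reported here, that is roughly 2,414 inserted rows per second, about 643 TB/year, and about 341 days to reach a 600 TBW rating, provided the observed rates stay constant.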


This suggests continuous insert-and-prune write amplification: rows are inserted, indexed, written to WAL, then pruned.

Likely cause

The SQLite feedback log sink is installed with a global TRACE default:

Targets::new().with_default(Level::TRACE)

This persists all targets at TRACE level by default, including dependency/internal logs and large raw protocol payloads.

Proposed fix

Keep feedback logs enabled, but narrow what is persisted by default:

- Do not use global TRACE for the SQLite feedback log sink.
- Drop or raise thresholds for low-value dependency noise, especially target=log, hyper_util, tokio-tungstenite internals, inotify spam, and low-level OpenTelemetry SDK logs.
- Avoid persisting full raw websocket/SSE payloads by default. Store summaries instead: event kind, duration, success/error, token usage, and payload byte length.
- Avoid persisting mirrored codex_otel.log_only / codex_otel.trace_safe events unless they are explicitly useful for feedback debugging.
- Add a global logs DB size/write cap. Per-thread caps are not enough when many threads/processes exist.
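To make the first, second, and fourth points concrete, here is a stdlib-only Rust model of what a narrowed per-target persistence filter could look like. It is a sketch, not Codex code and not the `tracing_subscriber` API: `PersistFilter`, `would_persist`, `matches_target`, and `narrowed_default` are hypothetical names, and the chosen targets and thresholds (including the `tokio_tungstenite` target spelling) are assumptions for illustration only.

```rust
/// Severity levels, ordered from least to most verbose.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Level {
    Error,
    Warn,
    Info,
    Debug,
    Trace,
}

/// Hypothetical persistence filter: a default maximum level plus per-target
/// overrides. An override of `None` means "never persist this target".
struct PersistFilter {
    default_max: Level,
    overrides: Vec<(&'static str, Option<Level>)>,
}

impl PersistFilter {
    /// Returns true if an event from `target` at `level` should be written.
    /// The longest matching override wins; otherwise the default applies.
    fn would_persist(&self, target: &str, level: Level) -> bool {
        let best = self
            .overrides
            .iter()
            .filter(|(prefix, _)| matches_target(target, prefix))
            .max_by_key(|(prefix, _)| prefix.len());
        match best {
            Some((_, Some(max))) => level <= *max,
            Some((_, None)) => false,
            None => level <= self.default_max,
        }
    }
}

/// A prefix matches the whole target or a `::`-separated ancestor module.
fn matches_target(target: &str, prefix: &str) -> bool {
    target == prefix
        || target
            .strip_prefix(prefix)
            .map_or(false, |rest| rest.starts_with("::"))
}

/// One possible narrowed default. Targets and thresholds are illustrative
/// assumptions, not Codex's actual configuration.
fn narrowed_default() -> PersistFilter {
    PersistFilter {
        default_max: Level::Info,
        overrides: vec![
            ("log", Some(Level::Warn)),
            ("hyper_util", Some(Level::Warn)),
            ("tokio_tungstenite", Some(Level::Warn)),
            ("codex_otel.log_only", None),
            ("codex_otel.trace_safe", None),
            ("codex_core", Some(Level::Debug)),
        ],
    }
}

fn main() {
    let filter = narrowed_default();
    for (target, level) in [
        ("log", Level::Trace),
        ("codex_api::endpoint::responses_websocket", Level::Trace),
        ("codex_otel.log_only", Level::Info),
        ("codex_core::stream_events_utils", Level::Debug),
        ("codex_client::custom_ca", Level::Info),
    ] {
        let keep = filter.would_persist(target, level);
        println!("{:?} {} -> persist: {}", level, target, keep);
    }
}
```

In this model the three largest byte sources from the sample (websocket TRACE payloads, the two codex_otel mirrors, and target=log TRACE noise) are all rejected, while INFO events from first-party targets and DEBUG events under codex_core are still kept. A real change would express the same idea through the existing filter on the SQLite sink rather than a new type.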

An optional escape hatch such as sqlite_logs_enabled = false would still be useful, but the main fix should be better default filtering.

Related issues and discussions

- Excessive SQLite WAL writes during streaming due to TRACE logs ignoring RUST_LOG #17320
- Codex Desktop rapidly grows logs_2.sqlite / WAL during normal active use #24275
- app-server: feedback log sqlite (logs_N.sqlite) grows unbounded — ~0.75 GB/day, no retention/rotation #26374
- logs_2.sqlite-wal grows indefinitely and remains allocated after deletion because stale/suspended Codex TUI processes keep the deleted WAL open #22444
- Heavy I/O activity from idle codex processes. #20563
- Severe disk I/O / 100% disk active time on Windows WSL2 when using Codex extension / CLI #27020
- goals_1.sqlite write amplification: ~11 MB/s sustained writes (11 GB lifetime) on a 4 KB database #27911
- Codex Desktop becomes unusable on long active threads due to app-server/renderer memory and TRACE log churn #21134
- app-server: source /feedback logs from sqlite at trace level #12969