cost of enum-to-string: C++26 reflection vs the old ways

V vittorioromeo.com ↗


Two months ago I published “the hidden compile-time cost of C++26 reflection”, where I measured what including <meta> and doing some basic reflection actually costs per translation unit. If you haven’t read it, start there – this post builds directly on top of it.

That article used a prerelease GCC 16 snapshot. Since then, GCC 16 has been officially released [1] and is now widely available, which seemed like a good excuse to revisit the topic with a more realistic example: enum-to-string conversion.

Enum-to-string is the “hello world” of reflection – but it’s also genuinely useful in real projects, for things like logging, serialization, debugging, and so on. If you adopt reflection in a real codebase, it might be the first thing you’ll write.

So: how much does reflection-based enum-to-string actually cost, in compile time, compared to the alternatives?

the three approaches

I benchmarked three implementations of the same operation: given an enum value, return a std::string_view with its enumerator name.

1. reflection (c++26)

No macros, no boilerplate, works for any enum:

    #include <meta>
    #include <string_view>
    #include <type_traits>

    template <typename T>
        requires std::is_enum_v<T>
    constexpr std::string_view to_enum_string(T val)
    {
        template for (constexpr auto e :
                      std::define_static_array(std::meta::enumerators_of(^^T)))
        {
            if (val == [:e:])
                return std::meta::identifier_of(e);
        }

        return "<unknown>";
    }

The code was taken from “What the heck is Reflection?” by Murat Hepeyiler. I think it’s a pretty idiomatic example.

2. enchantum (c++17)

A C++17 header-only library by ZXShady that achieves enum reflection through __PRETTY_FUNCTION__ parsing tricks. No macros at the call site, no reflection flag needed:

    #include <enchantum/enchantum.hpp>

    enum class E { V0, V1, V2, V3 };

    std::string_view s = enchantum::to_string(E::V0);

3. x-macro (preprocessor)

The C-style solution. You list the enumerators once, and a single macro expands into both the enum class definition and a to_string function that uses a switch:

    #include "xmacro_enum.hpp"

    #define E_LIST(X) \
        X(V0) X(V1) X(V2) X(V3)

    DEFINE_ENUM(E, E_LIST)

    // Generates:
    //   enum class E { V0, V1, V2, V3 };
    //   constexpr std::string_view to_string(E e) { ... }

The implementation of DEFINE_ENUM is a few lines of preprocessor glue:

    #include <string_view>

    #define XMACRO_VALUE_(name) name,

    #define XMACRO_CASE_(name) \
        case EnumType_::name: return #name;

    #define DEFINE_ENUM(x_enum_name, x_list)                 \
        enum class x_enum_name { x_list(XMACRO_VALUE_) };    \
        constexpr std::string_view to_string(x_enum_name e)  \
        {                                                    \
            using EnumType_ = x_enum_name;                   \
            switch (e) { x_list(XMACRO_CASE_) }              \
            return "<unknown>";                              \
        }

The only header it pulls in is <string_view>. We’ll also test a variant of this approach that returns const char* instead, with zero standard library includes.

the benchmark

For each approach, I created several translation units that:

- Define a single enum class E with N enumerators (V0, V1, …, V(N-1));
- Include the enum-to-string header;
- Call the conversion function once with a runtime value to force instantiation.

I varied N across 4, 16, 64, 256, and 1024 to see how cost scales with enum size.
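The const char* X-macro variant mentioned above is not shown in the text. A minimal sketch of what it might look like follows. It assumes the variant differs only in its return type and in dropping the <string_view> include. The names DEFINE_ENUM_CSTR and same_text are hypothetical and not taken from the benchmark repository.

```cpp
// Hypothetical const char* flavour of the X-macro header: no #include at all.
#define XMACRO_VALUE_(name) name,

#define XMACRO_CASE_(name) \
    case EnumType_::name: return #name;

#define DEFINE_ENUM_CSTR(x_enum_name, x_list)              \
    enum class x_enum_name { x_list(XMACRO_VALUE_) };      \
    constexpr const char* to_string(x_enum_name e)         \
    {                                                      \
        using EnumType_ = x_enum_name;                     \
        switch (e) { x_list(XMACRO_CASE_) }                \
        return "<unknown>";                                \
    }

#define E_LIST(X) X(V0) X(V1) X(V2) X(V3)
DEFINE_ENUM_CSTR(E, E_LIST)

// Tiny constexpr comparison (illustration only) so the checks below
// do not need <cstring> or <string_view>.
constexpr bool same_text(const char* a, const char* b)
{
    while (*a != '\0' && *a == *b) { ++a; ++b; }
    return *a == *b;
}

static_assert(same_text(to_string(E::V2), "V2"));
static_assert(same_text(to_string(static_cast<E>(99)), "<unknown>"));
```

Because the function returns a pointer to a string literal, callers that need a length have to compute it themselves. That is the trade-off for avoiding any standard library header.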


For example, the reflection variant for N=4 looks like this:

    #include "enum_to_string.hpp"

    enum class E { V0, V1, V2, V3 };

    volatile int runtime_idx = 0;

    int main()
    {
        const E val = static_cast<E>(runtime_idx);
        const auto sv = to_enum_string(val);
        return static_cast<int>(sv.size());
    }

The enchantum and X-macro variants are structurally identical – only the header and the function call change. I also separately measured the cost of just including each header, to isolate the header tax from the reflection work itself.

⚠️ Note on enchantum’s default range: enchantum scans enum values across a configurable range (default [-256, 256]). For N=1024, I had to bump ENCHANTUM_MAX_RANGE to 1024, which also slows down every other enum in the same TU. Keep that in mind when reading the N=1024 row.

All benchmark files are available on GitHub.

benchmarking setup

Same as the previous article – hyperfine inside a Fedora 44 Docker container on a 13th Gen i9-13900K. Two things are different this time:

- Compiler: gcc 16.1.1 20260501 (Red Hat 16.1.1-1, release build) – the officially released GCC 16, not a prerelease snapshot.
- Noise control: the container ran with --cpuset-cpus=0-7, the host was set to the performance CPU governor, and the compiler process was pinned to a P-core with taskset -c 0. hyperfine was run with --warmup 5 --min-runs 20.

Usual disclaimer: measurements aren’t strictly rigorous, my hardware is beefy (YMMV), and single-TU numbers undersell project-wide cost.
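The per-N translation units follow a fixed pattern, so they lend themselves to generation. The helper below is a hypothetical sketch of such a generator for the reflection variant. The function name make_reflection_tu is an assumption for illustration, and the actual benchmark repository may produce its files differently.

```cpp
#include <string>

// Hypothetical helper: builds the source text of one benchmark TU for the
// reflection variant, with n enumerators named V0 .. V(n-1).
std::string make_reflection_tu(int n)
{
    std::string src = "#include \"enum_to_string.hpp\"\n\nenum class E { ";

    for (int i = 0; i < n; ++i)
    {
        if (i != 0) src += ", ";        // comma-separate the enumerators
        src += "V" + std::to_string(i);
    }

    // The rest of the TU is identical for every n.
    src += " };\n\n"
           "volatile int runtime_idx = 0;\n\n"
           "int main()\n{\n"
           "    const E val = static_cast<E>(runtime_idx);\n"
           "    const auto sv = to_enum_string(val);\n"
           "    return static_cast<int>(sv.size());\n"
           "}\n";

    return src;
}
```

Calling make_reflection_tu with 4, 16, 64, 256 and 1024 and writing each result to its own .cpp file would give one TU per data point.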


Also, these measurements are specific to GCC 16’s current reflection and module implementation; other compilers may exhibit very different behavior.

benchmark results

total per-TU compile time

| N | X-macro (const char*) | X-macro (string_view) | enchantum | Reflection |
|---|---|---|---|---|
| Baseline (int main()) | 25.8 ms | 25.7 ms | 25.8 ms | 25.7 ms |
| Header include only | 25.7 ms | 136.0 ms | 147.1 ms | 180.8 ms |
| 4 | 26.6 ms | 137.6 ms | 170.6 ms | 186.7 ms |
| 16 | 26.9 ms | 138.1 ms | 170.9 ms | 187.7 ms |
| 64 | 28.0 ms | 141.2 ms | 172.8 ms | 191.1 ms |
| 256 | 32.5 ms | 153.0 ms | 184.1 ms | 215.0 ms |
| 1024 | 54.7 ms | 204.5 ms | 272.0 ms [2] | 255.0 ms |

algorithm-only cost (TU time minus include-only time)

This approximates the additional work each approach does beyond header inclusion:

| N | X-macro (const char*) | X-macro (string_view) | enchantum | Reflection |
|---|---|---|---|---|
| 4 | 0.9 ms | 1.6 ms | 23.5 ms | 5.9 ms |
| 16 | 1.2 ms | 2.1 ms | 23.8 ms | 6.9 ms |
| 64 | 2.3 ms | 5.2 ms | 25.7 ms | 10.3 ms |
| 256 | 6.8 ms | 17.0 ms | 37.0 ms | 34.2 ms |
| 1024 | 29.0 ms | 68.5 ms | 124.9 ms | 74.2 ms |
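The subtraction behind this table, and one way to turn it into a rough per-enumerator figure, can be sketched as follows for the reflection column. The slope between N=4 and N=1024 is my own choice of estimator for illustration. The text does not say exactly how its per-enumerator numbers were derived.

```cpp
// Times in microseconds, copied from the reflection column of the tables.
constexpr long include_only_us = 180'800;  // "Header include only"
constexpr long tu_n4_us        = 186'700;  // N = 4
constexpr long tu_n1024_us     = 255'000;  // N = 1024

// Algorithm-only cost: full TU time minus the include-only time.
constexpr long algo_us(long tu_us) { return tu_us - include_only_us; }

static_assert(algo_us(tu_n4_us) == 5'900);      // 5.9 ms
static_assert(algo_us(tu_n1024_us) == 74'200);  // 74.2 ms

// One possible per-enumerator estimate: the slope between N=4 and N=1024.
// Integer division truncates 66.96 to 66 microseconds, about 0.07 ms.
constexpr long slope_us =
    (algo_us(tu_n1024_us) - algo_us(tu_n4_us)) / (1024 - 4);

static_assert(slope_us == 66);
```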


per-enumerator scaling

| Approach | ms / enumerator |
|---|---|
| X-macro (const char*) | ~0.027 |
| X-macro (string_view) | ~0.06 |
| Reflection | ~0.07 |
| enchantum | O(scan range), not O(N) |
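To see why a __PRETTY_FUNCTION__-based library pays per probed value instead of per enumerator, here is a deliberately simplified sketch of the general technique. It is not enchantum’s actual implementation. The names is_named and count_named are hypothetical, and the string checks assume the way GCC and Clang print template arguments (named enumerators as E::V0, other values as a cast such as (E)7).

```cpp
#include <cstddef>
#include <string_view>
#include <utility>

// True when Value prints as a named enumerator inside __PRETTY_FUNCTION__.
// Assumption: GCC/Clang print unnamed values as "(E)7", "(E)-1" or a number.
template <typename Enum, Enum Value>
constexpr bool is_named()
{
    constexpr std::string_view sig = __PRETTY_FUNCTION__;
    constexpr std::string_view key = "Value = ";
    constexpr std::size_t pos = sig.find(key);
    if (pos == std::string_view::npos) return false;

    const char first = sig[pos + key.size()];
    return first != '(' && first != '-' && !(first >= '0' && first <= '9');
}

// Probes every value in [Min, Min + sizeof...(Is)). Each probe is a separate
// template instantiation, whether or not the value has a name.
template <typename Enum, int Min, std::size_t... Is>
constexpr std::size_t count_named(std::index_sequence<Is...>)
{
    return (std::size_t{0} + ... +
            (is_named<Enum, static_cast<Enum>(Min + static_cast<int>(Is))>()
                 ? 1u : 0u));
}

enum class E { V0, V1, V2, V3 };

// Scanning [-8, 8) costs 16 instantiations to discover 4 enumerators.
static_assert(count_named<E, -8>(std::make_index_sequence<16>{}) == 4);
```

Widening the range from 16 to 513 values multiplies the instantiations accordingly, even though the enum still has four enumerators.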

enchantum does not scale with the actual enum size – it scales with the configured scan range, since it has to probe every possible value in that range. That’s why an N=4 enum costs almost as much as N=64.

reflection with PCH and modules

Since the reflection variant pays a ~155 ms header tax for <meta>, the obvious question is: does precompiling the header or switching to C++20 modules eliminate it?

I re-ran the reflection benchmark with two extra configurations:

- PCH: precompiled <meta>, <string_view>, <type_traits> once, then compiled the TUs with -include pch.hpp.
- Modules: pre-built the std and std.compat modules and the <bits/stdc++.h> header unit once via GCC 16’s new --compile-std-module flag (this cost is not included in the measurement), then compiled the TUs with -fmodules so that #include <meta> is transparently translated into import <bits/stdc++.h>.

| N | Reflection (plain #include) | Reflection + PCH | Reflection + modules |
|---|---|---|---|
| Header include only | 180.8 ms | 73.8 ms | 397.4 ms |
| 4 | 186.7 ms | 80.6 ms | 403.4 ms |
| 16 | 187.7 ms | 81.0 ms | 403.1 ms |
| 64 | 191.1 ms | 84.4 ms | 409.4 ms |
| 256 | 215.0 ms | 97.5 ms | 423.2 ms |
| 1024 | 255.0 ms | 147.9 ms | 482.5 ms |
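For readers who want to try the PCH configuration, a header along these lines is one way to set it up. The file name pch.hpp and the three standard headers come from the description above. The feature-test guard and the command lines in the comments are my assumptions (in particular the -freflection spelling), so check them against your compiler’s documentation.

```cpp
// pch.hpp: the headers to precompile once and reuse in every TU.
#include <string_view>
#include <type_traits>

// Assumption: guard <meta> behind the C++26 reflection feature-test macro so
// the header still compiles when reflection is not enabled.
#if defined(__cpp_impl_reflection)
#include <meta>
#endif

// Assumed GCC workflow (flag spellings are not taken from the article,
// apart from -include pch.hpp):
//   g++ -std=c++26 -freflection -x c++-header pch.hpp -o pch.hpp.gch
//   g++ -std=c++26 -freflection -include pch.hpp -c tu.cpp
```

GCC only uses a precompiled header when it was built with compatible options, so the dialect and reflection flags should match between the two commands.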

PCH is the clear winner – roughly a 2.2x to 2.3x speedup for enums up to N=256 (about 1.7x at N=1024), dropping N=4 from 187 ms to 81 ms. With PCH in place, reflection beats both enchantum and the string_view X-macro variant outright.

Modules are the opposite: roughly a 2.2x slowdown. I verified that the std module artifacts were genuinely cached (mtime unchanged across runs) and that GCC was loading them (confirmed via -flang-info-module-cmi and -ftime-report, which attributed ~190 ms to module import plus another ~190 ms to template instantiation work the module triggered). [3]

Explicit import std; performs essentially the same as the transparent translation, because GCC’s std module is currently implemented as a thin wrapper around the <bits/stdc++.h> header unit – both routes end up loading the same ~34 MB artifact.

insights

The header is the cost. Not the reflection. The reflection algorithm is fast – asymptotically ~0.07 ms per enumerator, essentially the same as the hand-rolled switch in the X-macro version (~0.06 ms). What makes reflection look expensive is <meta>: just including it costs ~155 ms per TU over the baseline.

The X-macro with const char* is the fastest tested approach. With zero standard library headers, an N=4 enum compiles in 26.6 ms – within the noise of the baseline. Even N=1024 (54.7 ms) is faster than just #include <meta> with no reflection work at all. Most of what we call “slow C++ compilation” is really slow standard library compilation.

enchantum has the smallest include cost of the non-trivial approaches [4] (~147 ms vs reflection’s ~181 ms), but the heaviest per-call work (~24 ms even for tiny enums, because it always scans the full configured range, regardless of how many enumerators you actually have). That’s why it wins on small enums and loses on large ones.

Reflection has the best ergonomics but the highest header tax. It works for any enum – sparse, scoped, unscoped – with no special setup at the declaration site. But every TU that touches <meta> pays ~155 ms before any reflection happens.


PCH closes the gap, modules widen it. Precompiling <meta> cuts reflection compile time by ~2.3x, which puts it ahead of enchantum and the string_view X-macro (both measured without PCH); only the const char* X-macro stays faster. C++20 modules in GCC 16, surprisingly, go the other way – ~2.2x slower than the plain include path.

what this means in a real codebase

The single-TU numbers look small. They are not, at scale.

A large C++ codebase can easily have a few hundred translation units that pull in the enum-to-string header, perhaps transitively. Picking 500 TUs as a round number, and an N=16 enum as a typical size:

| Approach | Per-TU cost | Project-wide cost (500 TUs) |
|---|---|---|
| X-macro (const char*) | 26.9 ms | ~13 seconds |
| X-macro (string_view) | 138.1 ms | ~69 seconds |
| enchantum | 170.9 ms | ~85 seconds |
| Reflection | 187.7 ms | ~94 seconds |
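The project-wide column is plain multiplication, sketched below together with an idealised wall-clock estimate. The 16-core figure and the assumption of perfect parallelism are simplifications for illustration, and the function names are hypothetical.

```cpp
// Per-TU compile times for N=16, in microseconds, from the table above.
constexpr long xmacro_cstr_us = 26'900;
constexpr long reflection_us  = 187'700;

constexpr int tu_count = 500;  // the round number used in the text

// Total CPU time in milliseconds if every TU pays the per-TU cost once.
constexpr long project_ms(long per_tu_us)
{
    return per_tu_us * tu_count / 1000;
}

static_assert(project_ms(xmacro_cstr_us) == 13'450);  // ~13 s
static_assert(project_ms(reflection_us) == 93'850);   // ~94 s

// Idealised wall-clock time: total CPU time spread evenly over all cores.
constexpr long wall_ms(long total_ms, int cores) { return total_ms / cores; }

static_assert(wall_ms(project_ms(reflection_us), 16) == 5'865);  // ~6 s
```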

Less than 200 ms per TU still turns into over a minute of compile time at the project level. That’s the difference between a sub-15-second clean build and a minute and a half. Incremental builds won’t always save you, because every TU that includes an affected header pays the full price.

This is multiplied by every header in your project that has similar overhead. Real codebases don’t have one heavy header – they have dozens. The few hundred milliseconds you see in a microbenchmark become minutes once you multiply.

On the other hand, the numbers shown here do not take parallelism into account. E.g. the ~94 CPU-seconds would be ~6 s on a 16-core machine, assuming perfect parallelism.

what to do about it

If you’re adopting reflection-based enum-to-string in a large codebase:

Use PCH for <meta> – not modules. As shown above, a PCH cuts the header cost by ~2.3x, which makes reflection faster than enchantum and the string_view X-macro; only the const char* X-macro remains faster.