Why your 'correct' C code is probably undefined behavior

What happened

A new blog post from Thomas Habets, making the rounds on Hacker News with 427 points, makes a provocative claim in its title: everything in C is undefined behavior. The argument is not mere hyperbole. It is a careful walk through how the C standard's UB rules, combined with modern optimizing compilers, make the line between "working code" and "compiler-permitted nasal demons" far thinner than most C programmers assume.

Habets enumerates the surface area: signed integer overflow, shifting by a count ≥ the width of the type, dereferencing a one-past-the-end pointer, relational comparison (`<`, `>`) of pointers into different objects, violating strict aliasing, reading uninitialized memory, calling `memcpy` with a NULL source (even when `n == 0`), and modifying the same scalar twice between sequence points. Each of these is something working C programmers do, deliberately or by accident, every day. The standard says the compiler is allowed to assume these never happen. GCC and Clang increasingly take that permission and run with it.
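
Two of the items on that list can be fenced off with small wrappers. The sketch below is illustrative only and is not taken from Habets's post; the names `shl32` and `copy_bytes` are hypothetical helpers invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: left shift that stays defined for any count.
 * Shifting a 32-bit value by 32 or more is undefined, so that case is
 * handled explicitly and yields 0. */
static uint32_t shl32(uint32_t value, unsigned count)
{
    return count < 32 ? value << count : 0u;
}

/* Hypothetical helper: memcpy wrapper that never hands memcpy a null
 * pointer. A zero-length copy returns early without calling memcpy. */
static void *copy_bytes(void *dst, const void *src, size_t n)
{
    if (n == 0) {
        return dst;
    }
    assert(dst != NULL && src != NULL);
    return memcpy(dst, src, n);
}
```

With these, `shl32(1u, 32)` evaluates to 0 instead of being undefined, and `copy_bytes(buf, NULL, 0)` returns `buf` without touching `memcpy`.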

The post lands at a moment when this debate is no longer academic. CISA has explicitly told vendors to plan migrations off memory-unsafe languages. The Linux kernel has accumulated a long tail of patches forced by compiler UB exploitation — the most famous being Kees Cook's running battle to keep null checks from being optimized away after a pointer dereference. Habets's contribution isn't novel research; it's a clean consolidation of the indictment, with code examples that compile to surprising assembly.

Why it matters

The gap between "what the standard says" and "what C programmers think it says" is the entire story. Most working engineers learned C from K&R, a colleague, or by reading existing code — none of which trains you to treat the abstract machine as the source of truth. When you write `int x = a + b;` you mean "add two integers, possibly overflowing." The standard means "if `a + b` overflows the signed range, the program's behavior is undefined and the compiler may assume it does not."
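
One way to say "add two integers, and tell me if it overflowed" within the standard's rules is to test the operands before adding. This is a minimal sketch; `checked_add` is a hypothetical name, not something from the post.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: stores a + b in *out and returns true, or returns
 * false (leaving *out untouched) if the sum would not fit in an int.
 * The range test runs before the addition, so no overflow ever occurs. */
static bool checked_add(int a, int b, int *out)
{
    assert(out != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *out = a + b;
    return true;
}
```

GCC and Clang also provide `__builtin_add_overflow`, and C23 adds `ckd_add` in `<stdckdint.h>`, which do the same job without the hand-written range test.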

That assumption isn't theoretical. GCC famously uses signed-overflow-is-UB to optimize loop induction variables: it may compile `for (int i = 0; i <= n; i++)` on the assumption that `i` never overflows. When `n == INT_MAX` that assumption is false, the behavior is undefined, and whatever the compiled loop does, the standard says it is correct. Clang's `-fsanitize=undefined` exists precisely because these bugs are so hard to find any other way; they're invisible to ordinary testing because the compiler's output "works" until the day it doesn't.
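
An inclusive loop can be written so the counter is never incremented past its last value. The sketch below is one such shape, with a hypothetical helper name; it simply counts iterations so the behavior is easy to check.

```c
#include <limits.h>

/* Hypothetical helper: counts the iterations of a loop over the inclusive
 * range [first, last]. The exit test runs before i++, so the increment is
 * never executed when i == last, even if last == INT_MAX. */
static unsigned long long count_inclusive(int first, int last)
{
    unsigned long long iterations = 0;
    if (first > last) {
        return 0;
    }
    for (int i = first; ; i++) {
        iterations++;
        if (i == last) {
            break; /* leave before i++ could overflow */
        }
    }
    return iterations;
}
```

Calling `count_inclusive(INT_MAX - 2, INT_MAX)` visits three values and stops, where the `i <= n` form would have to increment `i` past `INT_MAX`.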

The community reaction on HN split along familiar lines. The C defenders argue — fairly — that UB exists because the standard had to accommodate trap-on-overflow CPUs, segmented memory models, and non-IEEE float hardware that existed in 1989. The critics counter that none of that hardware has shipped in volume for thirty years, and the cost of keeping the abstraction is now paid in CVEs. Both sides are right, which is why the resolution isn't "fix C" — the resolution is that new code increasingly isn't being written in C.

Rust, Zig, and even modern C++ (with `-fwrapv`, sanitizers, bounds-checked containers, and `std::span`) have meaningfully smaller UB surfaces. The Linux kernel's Rust experiment, now four years in, has shipped Rust drivers in mainline. Android's media stack rewrite in Rust took memory-safety bugs in that subsystem to roughly zero. The data on counterfactuals is no longer ambiguous.

What's interesting about Habets's framing is that it sidesteps the language wars and just shows you the C. The point isn't "Rust is better" — the point is that the C you think you're writing is not the C the compiler is compiling. Once you internalize that, every line of pointer arithmetic in your codebase becomes a small bet on the compiler's mood.

What this means for your stack

If you maintain C — and most infrastructure engineers do, even if it's just OpenSSL, glibc, or the kernel headers your service depends on — there are concrete moves worth making in 2026.

Turn on UBSan in CI, not just locally. `-fsanitize=undefined,address` catches a wide class of UB at runtime, and the overhead is acceptable for test suites. The cost of leaving it off is that you ship UB to production and find it via crash reports. Several large codebases — PostgreSQL, SQLite, curl — now gate merges on sanitizer-clean test runs. If your C project's CI doesn't run a sanitizer build, that's the single highest-ROI change you can make this quarter.
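
As a rough illustration of what a sanitizer build buys you, the sketch below puts a plausible CI build line in a comment above a small function. The file name `midpoint.c` and the helper `midpoint` are hypothetical; the flags are standard GCC and Clang options.

```c
/*
 * A build line a CI job might use (GCC or Clang):
 *   cc -g -O1 -fsanitize=undefined,address -fno-sanitize-recover=all midpoint.c
 * With -fno-sanitize-recover=all, the first UBSan report aborts the test
 * run instead of only printing a warning.
 */
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: midpoint of [lo, hi], requiring 0 <= lo <= hi.
 * The naive (lo + hi) / 2 overflows for large inputs, which a UBSan build
 * reports at runtime as signed integer overflow. This form cannot
 * overflow under the stated precondition, up to and including INT_MAX. */
static int midpoint(int lo, int hi)
{
    assert(0 <= lo && lo <= hi);
    return lo + (hi - lo) / 2;
}
```

A test suite that exercises the naive form with inputs near `INT_MAX` would fail under this build, which is exactly the signal you want before merge rather than after deployment.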

Use `-fwrapv` and `-fno-strict-aliasing` if you can't audit every cast. These flags trade some optimization for behavior most programmers expect. The Linux kernel has used both for years. Yes, you give up a few percent on integer-heavy code. You also stop the compiler from deleting your overflow check.
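
Where a cast can be audited, the usual aliasing-safe replacement is to copy the bytes rather than reinterpret the pointer. A minimal sketch, assuming `float` is a 32-bit IEEE 754 value (true on mainstream platforms but not guaranteed by the standard); `float_bits` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the object representation of a float as a
 * 32-bit integer. Writing *(uint32_t *)&f instead would violate strict
 * aliasing; copying the bytes with memcpy is well defined, and compilers
 * typically turn it into a single register move. */
static uint32_t float_bits(float f)
{
    uint32_t bits;
    assert(sizeof bits == sizeof f); /* assumption: float is 32 bits wide */
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

This works with or without `-fno-strict-aliasing`, so code written this way does not depend on the flag being present in every build configuration.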

For new code: pick the smallest amount of C you can get away with. A 200-line `.c` file wrapping a stable C API and a 50KLOC Rust application around it is now a normal architecture. The boundary is the dangerous part; minimize it.

And if you're an SRE rather than a C author: the relevant question for your stack isn't "is C unsafe" — it's "which of my dependencies' release notes mention UBSan, fuzzing, or Rust ports?" That's the leading indicator of which transitive deps will eat you in 2027.

Looking ahead

The long arc here is unmistakable: C will remain the substrate of operating systems and embedded software for decades, but the volume of new C being written is in steady decline, while compiler authors grow steadily more willing to interpret the standard aggressively. Those two trends compose badly. Habets's post is best read not as an attack on C (he clearly loves the language) but as a practitioner's plea to take the standard literally, because your compiler already does. The C you're actually compiling is the C in the spec, not the C in your head. Acting otherwise has a name in the industry now, and it's pronounced "CVE."
