Making your own programming language is easier than you think (but also harder)
2026 May 6

In mid-December last year I started making my own programming language. It's waaay far from any production quality yet (though I did manage to write a working 1k LOC Monte-Carlo path tracer in it), but the project is on pause right now, so I figured it's a good time to write something about it.

Disclaimer #1: I'm not a professional PL designer or compiler implementor. Even though I do feel like I know what I'm talking about for most of this post, I might still end up talking some nonsense.

Disclaimer #2: it's not another C/C++/Rust/etc killer, and I doubt it'll ever be actually used to any noticeable extent. I'm just having fun and talking about me having fun.

Disclaimer #3: if you have some strong opinions about programming languages, please keep in mind that I'm not forcing you to use this language, and that it's a bit rude to be telling random people on the internet what they should do. If, on the other hand, you have constructive feedback and suggestions, I'm all ears!

Introduction

Why now?

I mean, most programmers dream of their own perfect programming language. I've been programming for about 17 years, so why did I decide to make a language at this specific point in time?

It just so happens that 3 different things converged in my mind.

Of course, I always wanted to make my own programming language as well. I made a bunch of silly interpreters for some esoteric languages in the past (FALSE is probably my favourite), as well as interpreters for various flavours of lambda calculus, but that doesn't scratch the itch of making a real language, one that is at least somewhat production-oriented and doesn't feel like a toy.

As you might know, I'm working on a big game which is highly susceptible to modding, and I've been thinking about how to approach modding since the start of this project.
I've analyzed a ton of options, and it just so happens that making a custom programming language is actually one of the simplest solutions.

In December 2025, the amazing Matt Godbolt introduced the Advent of Compiler Optimisations, where he'd post some fun examples of what C++ compilers are capable of, walking through the generated assembly. Apart from this being an excellent series, it really made me want to mess with some assembly once again.

Of course, making a non-toy programming language is a gargantuan endeavour, but somehow after looking at assembly for a few weeks, I felt like it shouldn't be that bad.

Modding

I want to elaborate on the modding thing. Essentially, I have 3 main concerns with respect to modding:

- My game is highly simulation-heavy. There are hundreds of thousands of entities simulated via a custom ECS engine. Ideally, I'd want the modding language to be able to just take a bunch of component pointers and iterate over them like you would in a C for loop.
- It's hard to control what's going on in mods, so some level of protection for the player would be nice to have. Ideally, I'd want the modding language to be easily sandboxable – i.e. I want to be able to disable all IO and similar stuff with a single switch.
- I want modding to be as easy as it can be. Ideally you'd throw a script in a certain folder and there you have it, a mod can be used.

It was somewhat of a surprise to me that there doesn't seem to exist a solution satisfying all three requirements. Let's go over common possibilities.

Lua

(Or any other JIT-compiled scripting language, for that matter.) That's a standard choice, but it turns out that it's really hard to sandbox. Apparently you need to prepend any untrusted Lua code with some kind of prelude that explicitly deletes all known standard library functions that can be used for IO and such. There are even lists of these functions online in the form of GitHub gists. Even if this probably does work, it doesn't sound like a reliable solution to me.

Furthermore, Lua is a high-level dynamically-typed language that doesn't know anything about C pointers.
Bridging ECS entity iteration into it will either force per-entity native \(\leftrightarrow\) Lua \(\leftrightarrow\) native jumps with nonzero overhead, or require constructing a Lua array from the native entities and then deconstructing it back.
Either way, this doesn't sound good.

Not to mention that standard Lua and LuaJIT diverged some versions ago, which might make things extremely confusing both for modders and for me.

C++

There's always the option to make mods "natively". All the iteration problems are gone, but distributing mods becomes a nightmare. If they were distributed as binaries, I'd have to provide some sort of a dev environment for all platforms, and a centralized storage for binary artifacts. If they were instead distributed as source code, I'd have to bundle a C++ compiler with the game, and C++ compilers are known to be heavy and slow (a basic LLVM installation takes about 10-20 times more disk space than my current version of the game).

Oh, and sandboxing becomes impossible. If you're loading a native DLL which declares and uses int open();, you're doomed – there's basically no way to prevent it from accessing the filesystem, network, etc.

And – that goes without saying – even though I personally do enjoy writing C++, I'd rather not force the modders to do that.

All this applies to a bunch of other languages like Rust, by the way.

Please note that while I do put modding as one of the goals for the language, I'm still very much unsure whether I'm going to actually use it this way, and I don't want to over-specialize the language to this use case. As I've said, I'm mostly messing around and having fun.

Design goals

Ok, so what do I want from my programming language?
Quite a lot, actually:

- Seamless C interop – so that bridging between native game code and modding code would be as simple as a function call
- Low level – which is mostly a consequence of having to handle raw arrays of entities
- Practical and ergonomic – I want the modders to be able to write code with reasonable ease
- Easy sandboxing – for reasons outlined earlier
- Small compiler footprint – I don't want to embed a 1 GB compiler into a 50 MB game
- Fast compilation – I don't want to force players to wait hours for mod compilation (though this can be partially solved by extensive caching)
- Cross-platform for real – I'm fine with supporting only a few widespread desktop platforms and making certain assumptions (like being 64-bit or having IEEE754 support)
- Reasonably fast – which is a relatively low bar compared to most dynamic languages
- Try not to just recreate C++ – I cannot but acknowledge that C++, being my favourite and primary language for years, has had a heavy influence on my views on programming languages; I really want to try to steer away from it when I can (spoiler: I don't think I succeeded much in that)

Honestly, if I were just making a programming language strictly for fun, I'd start with System F and then iterate from there. But, given the above constraints, that's not really an option.

The language

Let's have a look at what I've come up with. It's a weird blend of C++, Rust, Python, Zig, and maybe a few other languages.

Overview

The working title is pslang, from my pet game engine psemek. It is an imperative, eagerly evaluated, call-by-value, low-level programming language with a static, strict and nominal type system. It looks something like this:

    func min(x: i32, y: i32) -> i32:
        return if x < y then x else y

    struct vec3i:
        x: i32
        y: i32
        z: i32

    func apply(f: i32 -> i32, v: vec3i) -> vec3i:
        return vec3i(f(v.x), f(v.y), f(v.z))

    func as_array(v: vec3i) -> i32[3]:
        return [v.x, v.y, v.z]

Let's unpack that.

Scoping

As you can see, the language uses indentation-based scoping, mostly so that the language feels somewhat like a scripting language and thus looks more friendly to newcomers. Also there's less visual noise thanks to that.

Right now I'm using tab characters for indentation. Might replace them with spaces later, we'll see.

Each function, loop body, if body, etc. creates a new scope. Functions and structs can be defined inside any scope, and they are only visible within that scope. Note that local functions don't have access to variables in the scope they are defined in: they are not closures, the scoping only affects name resolution.
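To make the indentation-based scoping a bit more concrete, here's a minimal sketch of how tab indentation can be turned into explicit scope markers before parsing. It's written in Python rather than pslang, and it is not how the pslang compiler actually does it: the function name indent_tokens, the INDENT/DEDENT/LINE token names and the one-tab-per-level rule are all assumptions made for illustration.

```python
def indent_tokens(source):
    """Convert tab-indented source text into a flat token list.

    Hypothetical token shapes: the strings "INDENT" and "DEDENT" mark a
    scope being opened or closed, and ("LINE", text) carries one logical line.
    """
    tokens = []
    depth = 0  # current nesting depth, measured in leading tab characters
    for raw in source.splitlines():
        if not raw.strip():
            continue  # blank lines neither open nor close a scope
        text = raw.lstrip("\t")
        level = len(raw) - len(text)
        if level > depth + 1:
            raise SyntaxError("indentation jumps by more than one level: " + repr(raw))
        if level == depth + 1:
            tokens.append("INDENT")
        for _ in range(depth - level):  # empty range when not dedenting
            tokens.append("DEDENT")
        depth = level
        tokens.append(("LINE", text.rstrip()))
    # Close every scope that is still open at the end of the file.
    tokens.extend(["DEDENT"] * depth)
    return tokens


# A fragment of the sample program, with real tab characters for indentation.
SAMPLE = (
    "func min(x: i32, y: i32) -> i32:\n"
    "\treturn if x < y then x else y\n"
    "struct vec3i:\n"
    "\tx: i32\n"
)

for token in indent_tokens(SAMPLE):
    print(token)
```

Every INDENT ends up matched by a DEDENT, so a parser downstream can treat the two markers the same way it would treat opening and closing braces.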
The top-level scope (the one not inside any function) is treated just like any other scope, and it contains the file's entry point, i.e. code that runs when the file is loaded/initialized. It's the equivalent of main(), and allows initializing global variables at module import or writing scripts that simply consist of a sequence of commands to run. (Internally, the top-level scope is wrapped into an anonymous function.)

Primitive types

There are *checks notes* 13 primitive types: bool, 4 signed integer types, 4 unsigned integer types, 3 floating-point types, and unit. The numeric types fit nicely in a table:

    i8    i16   i32   i64
    u8    u16   u32   u64
          f16   f32   f64

The iNN types are signed integers, the uNN types are unsigned integers, and fNN are floating-point. As you can see, there's no f8 type, as it isn't supported by most desktop CPUs and there isn't a consensus about what 8-bit floating-point even means (afaik there are a bunch of competing standards for that).

f16 isn't useful for most people, but we use it routinely in graphics (for HDR colors, vertex attributes, etc), and not having it in the host language is always a noticeable inconvenience. Most desktop CPUs these days implement IEEE754 f16, so it doesn't really cost me anything to support this type out of the box.

Some people had very strong opinions that I should exclude unsigned types altogether. Having used unsigned types specifically in graphics and computations my whole life, I simply cannot fathom how that would even work.

Btw, all integer arithmetic is two's complement with overflow, no UB here.

The unit type is a bit special. It has a single value called unit(), and it is the formal return type of functions not returning anything. If you omit the return type of a function, it automatically returns unit. If you omit the return statement at the end of such a function, the compiler inserts it automatically (otherwise it is an error not to return anything from a non-unit function).
It can also be used for opaque pointers, though it's better to create empty structs for that.

Numeric literals

By default, numbers like 10 mean i32.
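Since a plain literal is an i32, and integer arithmetic was described earlier as two's complement with overflow, here's a small model of what that could mean in practice. It's a sketch in Python, not pslang, and it assumes that "with overflow" means wrap-around; the helper names wrap_unsigned and wrap_signed are made up for this example.

```python
def wrap_unsigned(value: int, bits: int) -> int:
    """Keep only the low `bits` bits, as a uNN type would under wrap-around."""
    return value & ((1 << bits) - 1)


def wrap_signed(value: int, bits: int) -> int:
    """Reinterpret the low `bits` bits as a two's complement iNN value."""
    low = wrap_unsigned(value, bits)
    sign_bit = 1 << (bits - 1)
    # If the sign bit is set, the value represents a negative number.
    return low - (1 << bits) if low & sign_bit else low


I32_MAX = 2**31 - 1

# Going one past the largest i32 wraps around to the smallest i32.
assert wrap_signed(I32_MAX + 1, 32) == -2**31
# Unsigned subtraction below zero wraps to the top of the range.
assert wrap_unsigned(0 - 1, 8) == 255
```

Under this model every result is well defined for every input, which is the practical content of the "no UB" remark.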