Making a Shading Language for my Offline Renderer

A agraphicsguynotes.com ↗

▲ 45 points • 3 comments • by ibobev • 2w ago • HN discussion ↗

Pangram verdict · v3.3

We believe that this document is fully human-written

0 %

AI likelihood · overall

Human

100% human-written 0% AI-generated

SEGMENTS · HUMAN 5 of 5

SEGMENTS · AI 0 of 5

WORD COUNT 1,984

PEAK AI % 0% · §1

Analyzed

Jun 11

backend: pangram/v3.3

Segments scanned

5 windows

avg 397 words each

Distribution

100 / 0%

human / AI fraction

Verdict

Human

Pangram v3.3

Article text · 1,984 words · 5 segments analyzed

Human AI-generated

§1 Human · 0%

As a graphics programmer, I don’t usually spend too much time on something that is not strictly related to computer graphics or game engine. However, I did spend four months in my spare time last year building a shading language for my renderer SORT, which I call Tiny Shading Language (TSL). In the beginning, I didn’t know how it would end up eventually due to the lack of knowledge about how compilers work in general, this is not something graphics programmers touch regularly. However, it does turn out without too much work, this thing can be done by one person in a few months. In this blog, I will briefly cover some of my thoughts in designing this shading language library. To be more specific, this blog is about how the system is designed and how it works with an offline CPU renderer, instead of the detailed language implementation. The following is a screenshot of the example tutorial that comes with the TSL library. The patterns on the surface of the two spheres are procedurally generated in TSL. Motivation Whenever people heard about my new shading language, the first thing they asked was always, ‘why do you want to make your own shading language since there is already OSL’. This is a fairly good question, I had the same doubts for more than half a year before putting my hands on it. There are generally several reasons, By working on my own shading language, I can learn everything from scratch by myself. This is clearly the biggest reason that I chose to work on it. The knowledge gained in the process of making it would be valuable to my career in the future, at least it should have some indirect impact on my work. It should allow me to have a much deeper understanding of how a programming language compiler works. Having my own code base will allow me to change the library anyway I see fit. This alone offers me a lot more flexibility than OSL since I’m not familiar with their implementation. Since Apple is currently in the transition from Intel chips to ARM, future MacOSs will be shipped on ARM architecture. Building OSL on ARM will require building all its dependencies on ARM too. If there is any implementation that is x86 specific in any of its dependencies, I will have to find a workaround implementation on ARM too. Without OSL on ARM, there is no way to port my renderer on Apple Silicon.

§2 Human · 0%

Supporting Apple Silicon is in OSL’s roadmap, but it is unclear when it will be available by the time this blog was written. OSL heavily uses this library called Open Image IO, which is another open-source project. While OIIO further depends on several other small libraries like, OpenExr, libpng, libtiff and a few others. Some of the basic data structures are only defined in OIIO, so decoupling OSL with OIIO would require quite some work. This was one of the options that I have considered, but after a second thought, it will also make updating OSL in my renderer very hard since it is locally modified. Having too many dependencies does make OSL a bit ‘heavy’ than expected. My original expectation was just to have one OSL lib as dependencies, it certainly didn’t end up the way I planned. Eventually, there are other libs too and on some platforms, some of the libraries have to be dynamically linked too. Ideally, I could have spent much more time to make sure all of the dependencies are statically compiled, but it is way more sophisticated than it sounds. I wouldn’t want to spend too much time building those libraries from source code. Also, some of the pre-compiled libraries are OS-dependent on Ubuntu, which means that I will have to compile it another time on another version of Ubuntu. At the end of the day, after solving all these problems, features in OIIO are not even used in my renderer at all and I do not have a plan to use them in my renderer since doing it from scratch would be lots of fun, it is a dependency purely because OSL needs it. There are other reasons that motivate me to implement my own shading language. However, those are mostly related to OSL. I would like to avoid having comments about OSL since at the time this blog was written, I was an employee of Sony (Naughty Dog). With the above-mentioned reasons, I hope I have made it clear why I decided to implement my own shading language. Of course, I am fully aware that my own implementation will be way less robust than OSL since there is a team behind it and this tech has been involved for more than ten years.

§3 Human · 0%

So my next question after finalizing my decision was whether this is doable by myself in a few months, I definitely didn’t want to deviate from my trail too much, I’m a graphics programmer anyway, no one will expect me to know too many details in designing a compiler. Taking Advantage of Existed Work The image below demonstrates some basic stages of compiling a programming language into machine code Starting from scratch and making everything by myself sounds crazy and it is not likely that I can finish everything in a few months. After some basic research and learning, I found some useful tools that could help me make my shading language a reality. Flex Flex is one of the commonly used tools as a lexical analyzer. Taking a string stream, it will basically tokenize all of the string in a pre-defined manner. Bison Bison is a syntax analyzer. It will take tokens generated from Flex and generate an abstract syntax tree. LLVM LLVM is short for low level virtual machine. It is a complex infrastructure that could help convert LLVM IR to machine code on different target architectures, like PC, X86, etc. It also does optimization too. With the these tools, it sounds like I only need to do the following things Make a configuration file for Flex to tokenize my shading language. Make a configuration file for Bison to use the tokens generated from Flex and generate AST. Generate LLVM IR with the AST generated from Bison. Compile the IR into JITed machine code with LLVM. This is way more manageable than writing everything by myself. And it did give me some confidence that this is doable by one person. However, apart from the basic language support, which is to convert my shading language to JITed machine code, this is far from enough since a programmable shading language for CPU ray tracers is fundamentally different from its GPU counterpart. A large proportion of the project is to design a user-friendly interface and let it fit well in my renderer. This is no less than the amount of work to be done just to convert high level language to machine code. This blog is mainly about the latter. It won’t mention anything about the former part since there is plenty of resources on the internet that could help in those topics. Kaleidoscope is a very good example that comes with the LLVM library.

§4 Human · 0%

Where does It Fit in a Ray Tracer Before we dive into the details of this shading language, I would like to put down some notes to offer the big picture of how this library fit in a ray tracer so that readers should have a brief idea about how it works with the rest of the system. A barebone path tracing algorithm could work like this, Spawn primary ray for each pixel sample. Find the nearest intersection with the scene. Use the material information to construct a BSDF. Evaluate the BSDF and update the throughput and accumulate it in the result. Importance sampling the BSDF and generate secondary rays if needed. Get back to step 2 and loop. The loop stops when there is no intersection found in step 2. Of course, this is also a simplified workflow since there is no volumetric rendering and subsurface scattering in it. But this is more than good enough for me to explain where TSL should fit. I guess all readers could have guessed at this point. The TSL execution should happen in step 3, which is exactly what the shader language is for. In PBRT 3rd, there are pre-defined materials with hard-coded BSDF and parameters of BXDF are exposed through pbrt input file with limited flexibility. With TSL, a ray tracer can translate shader script during runtime without itself being recompiled and this offers great flexibility. The purpose of TSL shader execution is to reconstruct BRDFs in the target BSDF, this purpose is very similar with what PBRT does with its material implementation, except that it is not hard coded in the ray tracer itself. Shader authoring can happen later when artists are working on assets. Language System Design TSL is designed to be simple and easy to use. The syntax of the language is very C-like, just like GLSL, HLSL, OSL. This is user friendly to even new shader authors. However, a CPU ray tracer shading language has lots of differences that are fundamentally different from a GPU shading language, which has a big impact on its language system design. TSL works in a very simliar manner with OSL, which also works on CPU. The followings are some of the major differences that TSL has compared with a GPU shading language.

§5 Human · 0%

In a graphics API, before issuing a draw call, we need to send information from host(CPU) to GPU, commonly including constant buffer, vertex buffer, index buffer, etc. Once all of the shader inputs are set, along with other render states, a draw call is issued. Depending on the exact situation, the number of shader executions could be up to a million or even more. In a nutshell, with everything setup once, shaders are usually executed on a massive scale. While TSL’s shader setup and execution is almost always one vs one. This is defined by the nature of CPU ray tracers. CPU’s parallelism is not utilized in the same way as GPU since different threads are not synchronized at all, there is simply nothing like Warp or Wavefront. Without synchronization, since different threads almost never run the same instructions like GPU does, executing shaders multiple times makes almost no sense at all. SIMD optimization would be a fairly bad fit for shaders since there are no four instances of shader executions at the same time. This is only true in my renderer. OSL has SIMD in its roadmap and some commercial renderers do take advantage of it to run batched shader execution for better performance. Modern game engines all support material graph, which is more of a visual programming language tool for technical artists to ‘code’. However, the GPU shader compilers are not aware of such a thing at all. There is something called ‘shader builder’ that builds the finalized shader source code from different shader fragments collected from the material graph. These shaders are usually called material shaders. The other type of shaders in game engines is commonly called in-game shader, which is usually programmed by graphics programers. And the GPU shader compiler will take the shader source code without even knowning whether it is an in-game shader or a composited material shader. There is generally no ‘in-game’ shader in an offline renderer. But material shader concept is a necessarity to support a wide variety of material appearance on the surfaces. Different from GPU shader compiler, I chose to implement the ‘shader builder’ algorithm inside TSL library just like OSL did so that renderers with TSL integrated only need to take shader fragments, which I call shader unit template, and TSL will be responsible for building the finalized shader code, which the renderer doesn’t even get a chance to see.