Fil-C achieves memory safety even for programs that behave adversarially. That includes casting function pointers to the wrong signature and then calling them, exporting a function with one signature in one module and then importing it with a different signature in another, or even exporting a symbol as a function in one module and importing it as data in another (and vice-versa). Passing too few arguments, arguments of the wrong type, misusing va_list (including escaping it), expecting too many values to be returned - these are all things that the Fil-C calling convention either catches with a panic or ascribes safe behavior to.

But in the common case - like when the programmer is behaving themselves - Fil-C generates reasonably efficient code for the call. For example, a call like this:

    int x = 42;
    const char* y = "hello";
    int z = foo(x, y);

in one module (say caller.c) with foo defined in another module (say foo.c):

    int foo(int x, const char* y) { ... /* whatever */ }

will be compiled at the callsite exactly as if you had done the following call in Yolo-C with an optimized arguments-in-registers ABI:

    foo(my_thread, x, y);

Where my_thread is a pointer to the current Fil-C thread, which Fil-C passes around as the first argument in all calls. So, my_thread, x, and y will be passed in registers. The implementation of foo will not check that x is an int and that y is a const char* (though if you use y, it will check that the pointer is in bounds of the capability and that the capability allows whatever kind of access you do). The return value will be passed in a register, too. In this regard, Fil-C is almost as efficient as Yolo-C!

And yet, if we changed foo to take extra arguments, we would get a panic. And if we changed the signature in any way (maybe x becomes a pointer and y becomes a double), we would either get a panic or a well-defined bitwise cast of the value to the other type.
This document explains how Fil-C manages to avoid doing any safety checks for the common case of calls while either panicking or strictly following well-defined GIMSO semantics in case calls are misused in some type-violating way. First, the generic calling convention is explained.
All optimizations obey identical semantics to the generic calling convention, and the generic calling convention is the fallback when the optimizations would not be legal under those semantics. Second, the register calling convention optimization is described. This is what allows arguments and return values to be passed in registers in the common case. Finally, the direct call optimizations are described. These optimizations make it possible for the caller to avoid doing any checks about whether the callee agrees on the function signature.

Generic Calling Convention

This section is almost identical to the call section in the GIMSO document, except it combines how to get the callee with executing the call.

In the generic case, calls proceed as follows.

1. The callee is resolved. For indirect calls, the callee is a flight pointer (tuple of capability pointer and pointer intval) we already have in hand, so this step is a no-op in that case. For direct calls, the callee is a symbol name. ELF linkers provide a built-in facility to automatically resolve symbol names to function pointers. But to support memory-safe linking and loading in Fil-C, we need symbol names to resolve to a flight pointer, so that we can then check that the thing that the pointer points at is suitable for whatever we want to do to it (the next step for calls is to check that we have a function capability; for global variable accesses we would check that the global is a data capability and that the access is in bounds). Hence, the Fil-C compiler lowers symbol resolution to a getter call. The getter returns the callee flight pointer.
2. The callee is checked. The following requirements must be met, or else a panic occurs:
   - Capability must not be null.
   - Capability must be a function capability.
   - The pointer's intval must match the capability's callable pointer value.
3. The size of the argument buffer is computed by rounding up each argument's size to 8. Additionally, argument type alignment is obeyed, which may mean adding padding. Note that byref arguments have their value copied into the argument buffer, so the argument's type for the purpose of the computation is the referenced type, not ptr.
4. Two thread-local CC (calling convention) buffers are allocated of that size. These buffers live only long enough for the callee to retrieve the arguments. One buffer is for the payload, and the other is for capabilities.
5. Each argument is copied into the CC buffers. For byref arguments, the pointed-at value is copied into the buffers.
6. Control is transferred to the callee's prologue and the callsite address is saved to a private callstack. The stack where the callsite address is stored is outside of Fil-C memory and cannot be accessed with any capability. The callee is told about the size of the arguments as well as the function capability. Passing the function capability is useful for libffi implementing closures, but is otherwise unused.
7. The callee's prologue heap-allocates (as if with alloca) any byref parameters.
8. All arguments are copied out of the CC buffers. For non-byref parameters, the arguments are copied into local data flow. For byref parameters, the arguments are copied into the allocations made in the previous step. If the callee uses any argument introspection (like va_arg or zargs), then the CC buffers are copied into a newly created readonly heap object.
9. At this point, the CC buffers are dead. In practice, the implementation may reuse the same CC buffers repeatedly.
10. The callee executes. If an exception throw happens, then we return to the callsite with a flag indicating that an exception is in flight.
11. When the callee returns normally, an almost identical process to argument passing happens, except for the return value. First, the size of the return buffer is computed by rounding up the return type's size to 8.
12. The CC buffers are allocated of that size. They will live until the callsite finishes retrieving the result.
13. The return value is copied into the CC buffers.
14. Control is transferred back to the callsite with a flag indicating that an exception is NOT in flight, as well as the size of the return value.
15. The callsite loads the return value from the CC buffers and produces it in local data flow.
16. If the callsite observes the exception flag being set, then the caller returns with the exception flag set.
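The argument buffer size rule (round each argument's size up to 8, and pad where an argument's alignment requires it) can be sketched in C. This is a minimal illustration, not Fil-C's implementation: `arg_desc`, `round_up`, and `cc_args_size` are hypothetical names, and the sketch assumes that padding is inserted before an argument to satisfy its alignment and that alignments are powers of two.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor for one argument's type. For a byref argument,
   this would describe the referenced type rather than the pointer. */
struct arg_desc {
    size_t size;
    size_t align; /* assumed to be a nonzero power of two */
};

/* Rounds value up to the nearest multiple of `multiple`. */
static size_t round_up(size_t value, size_t multiple)
{
    return (value + multiple - 1) / multiple * multiple;
}

/* Returns the total argument buffer size in bytes. If offsets is non-NULL,
   also records the offset of each argument's slot. */
static size_t cc_args_size(const struct arg_desc* args, size_t count,
                           size_t* offsets)
{
    size_t offset = 0;
    for (size_t i = 0; i < count; ++i) {
        size_t align = args[i].align;
        assert(align != 0 && (align & (align - 1)) == 0);
        /* Obey the argument's alignment; this may add padding. */
        offset = round_up(offset, align);
        if (offsets)
            offsets[i] = offset;
        /* Every argument occupies a multiple of 8 bytes. */
        offset += round_up(args[i].size, 8);
    }
    return offset;
}
```

For an argument list of int, char*, and double, the sketch yields slot offsets 0, 8, and 16 and a total of 24 bytes. Because every slot is a multiple of 8 bytes, alignment padding only appears in this model for types aligned to more than 8 bytes.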
Let's consider an example of an indirect call like:

    int arg1 = ...;
    char* arg2 = ...;
    double arg3 = ...;
    char* result = function_pointer(arg1, arg2, arg3);

The generic calling convention - before we did any of the optimizations in this document - would look like:

    check_function_call(function_pointer); /* all of the capability checks */

    /* Payload goes in cc_inline_buffer, capabilities in cc_inline_aux_buffer.
       Non-pointer arguments have no capability, so their aux slot is NULL. */
    *(int*)(my_thread->cc_inline_buffer + 0) = arg1;
    *(void**)(my_thread->cc_inline_aux_buffer + 0) = NULL;
    *(void**)(my_thread->cc_inline_buffer + 8) = arg2.intval;
    *(void**)(my_thread->cc_inline_aux_buffer + 8) = arg2.lower;
    *(double*)(my_thread->cc_inline_buffer + 16) = arg3;
    *(void**)(my_thread->cc_inline_aux_buffer + 16) = NULL;

    struct pizlonated_return_value {
        bool has_exception;
        size_t return_size;
    };

    /* Pass the thread, the function capability, and the argument size. */
    struct pizlonated_return_value rv =
        ((pizlonated_function_type)function_pointer.intval)(
            my_thread, function_pointer.lower, 24);
    if (rv.has_exception)
        goto unwind_handler;
    if (rv.return_size < 8)
        goto panic;

    /* Load the returned pointer's intval and capability from the CC buffers. */
    flight_ptr result;
    result.intval = *(void**)(my_thread->cc_inline_buffer + 0);
    result.lower = *(void**)(my_thread->cc_inline_aux_buffer + 0);

This calling convention is inefficient in three major ways:

1. Arguments and return values are passed using thread-local CC buffers rather than in registers.
2. The callee's capability must be checked.
3. Direct calls require calling a getter to get a capability to the callee.

The next two sections describe the optimizations that eliminate this overhead in the common case. The section that immediately follows describes how to pass arguments and return values in registers in the common case. The section after that describes how to avoid checking the callee's capability or even calling the getter.

Register Calling Convention Using Arithmetically Encoded Signatures And Generic Call Thunks

Fil-C function pointers are quite rich:

- The pointer value seen by the user (the intval) can be whatever we (the implementors of Fil-C) want it to be, so long as it's consistent. We can make it just be a pointer to the base of some kind of object rather than an actual code pointer, so long as the implementation of the LLVM call and invoke opcodes knows what to do with it.
- All pointers have an invisible lower pointer, which points to just above the capability object.
  For special objects like functions, the lower pointer also points to the bottom of an internal object that Fil-C controls and the user is not allowed to edit (all reads and writes are disallowed because special objects have the upper bound set to exactly the lower bound).
- Fil-C supports closures, which are function pointers that carry extra state that can be retrieved by the caller. Given any defined function, the user can create as many closure objects as they like, each with different data attached to them. This is necessary for supporting libffi closures without using JIT privileges. The presence of this feature means that when calling a Fil-C function, we are already passing it a pointer to the function object (aka the function capability) as one of the arguments.

This power gives us a lot of opportunities!

This first optimization makes the function object have these fields. Remember - these fields cannot be accessed directly by the Fil-C program, so they can make use of raw pointers.

- fast_entrypoint - this is a raw pointer to a function entrypoint that uses a native, register-based calling convention for whatever signature the function was defined to use. The only ways that the calling convention differs from the Yolo-C one are that the first two arguments are the thread and the function object, the return value is a struct that includes a bit that tells if there was an exception, and pointers are passed as tuples of lower and intval.
- generic_entrypoint - this is a raw pointer to a function entrypoint that uses the generic calling convention based on thread-local CC buffers. Note that this entrypoint takes the function object as one of its arguments. We will use this fact!
- signature - a 64-bit arithmetic encoding of the function signature. Think of this as a perfect hash of the signature. If this value is 0, then the function only has a generic entrypoint (so fast_entrypoint will be NULL).

In case the function object is a closure, there's one more field: the data_ptr, which is a user-controlled flight pointer (a tuple of lower and intval). We know that a function object is a closure if the READONLY object flag is not set.

Let's cover this optimization in three parts. First, what the callsite does. Second, which thunks are emitted by the caller and callee to rescue cases where the signature doesn't match. Finally, how the arithmetic encoding of signatures works.
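The function object fields listed above can be pictured as a C struct. This is only a sketch under stated assumptions: every type and function name below is hypothetical, the field order is arbitrary, and `choose_entrypoint` encodes an assumed policy (use the fast entrypoint only when the callsite's encoded signature equals the function's nonzero signature, otherwise fall back to the generic entrypoint) rather than Fil-C's actual callsite code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a raw native entrypoint; real entrypoints
   have per-function signatures. */
typedef void (*raw_entrypoint)(void);

/* Hypothetical model of a flight pointer: capability (lower) plus intval. */
struct flight_ptr_model {
    void* lower;
    void* intval;
};

/* Hypothetical model of the function object fields named in the text. */
struct function_object_model {
    raw_entrypoint fast_entrypoint;    /* NULL when signature == 0 */
    raw_entrypoint generic_entrypoint; /* CC-buffer based, always present */
    uint64_t signature;                /* 0 means generic entrypoint only */
};

/* Closures carry one extra, user-controlled field. */
struct closure_object_model {
    struct function_object_model base;
    struct flight_ptr_model data_ptr;
};

/* Hypothetical stub entrypoints, used only to exercise the sketch. */
static void example_fast(void) {}
static void example_generic(void) {}

/* True if the function has a register-based entrypoint. */
static bool has_fast_entrypoint(const struct function_object_model* f)
{
    return f->signature != 0 && f->fast_entrypoint != NULL;
}

/* Assumed callsite policy: take the fast path only on an exact
   signature match, otherwise use the generic calling convention. */
static raw_entrypoint choose_entrypoint(const struct function_object_model* f,
                                        uint64_t callsite_signature)
{
    assert(f->generic_entrypoint != NULL);
    if (has_fast_entrypoint(f) && f->signature == callsite_signature)
        return f->fast_entrypoint;
    return f->generic_entrypoint;
}
```

In this model, a function whose signature field is 0 always takes the generic path, which matches the rule that such functions have a NULL fast_entrypoint.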