A popular technique modern compilers use to improve the runtime performance of compiled code is to perform computations at compile time instead of at runtime. Constant expressions, however, are not optional: they have to be evaluated at compile time for a variety of reasons. To help with that, I've been working on improving Clang's constant interpreter. Here's a look at just how much work has been done since my previous article in November 2022:

$ git log --after=2022-11-23 | grep -i "\[clang\]\[Interp\]" | wc -l
308

A good chunk of those are NFC (no functional change) commits, which I like to split out from functional changes. All of those patches contain tons of small changes and refactorings, so in this article I concentrate on the largest and most important changes. The new constant expression interpreter is still experimental and under development. To use it, you must pass -fexperimental-new-constant-interpreter to clang.
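
To try the flag on a small input, something like the following should do. This is only a sketch: the file name and the square() function are made up for illustration, and only the flag itself comes from the text above.

```cpp
// demo.cpp (hypothetical file name). Checked without producing an object file:
//   clang++ -std=c++17 -fsyntax-only -fexperimental-new-constant-interpreter demo.cpp

// A function that is usable in constant expressions.
constexpr int square(int X) { return X * X; }

// static_assert needs a compile-time result, so this call goes through the
// constant interpreter (the new one, if the flag above is passed).
static_assert(square(4) == 16, "evaluated at compile time");
```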

Uninitialized local variables

This change feels like it landed an eternity ago, but it was in December 2022. Before C++20, local variables in constexpr functions had to have an initializer: https://godbolt.org/z/Mh893MoKq

In C++20, the constant interpreter needs to handle uninitialized local variables and diagnose if the program tries to read from them.

In the new interpreter, this information is handled via an InlineDescriptor struct, which simply contains one bit for the initialization state of the variable. Every local variable is preceded by an InlineDescriptor now, so checking whether a local variable is initialized is as simple as reading that one bit.

┌─────────────────┬─────────┐
│InlineDescriptor │ Data... │
└─────────────────┴─────────┘
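
The layout above can be modelled in a few lines. The following is a minimal sketch and not Clang's actual code: LocalSlot, store() and load() are hypothetical names, and the InlineDescriptor here holds nothing but the one initialization bit described above.

```cpp
#include <optional>

// Toy version of the per-variable metadata: a single "is initialized" bit.
struct InlineDescriptor {
  bool IsInitialized = false;
};

// Hypothetical slot for one local variable: the descriptor precedes the data.
template <typename T>
struct LocalSlot {
  InlineDescriptor Desc;
  T Data{}; // only meaningful once Desc.IsInitialized is set
};

// Writing a value also marks the slot as initialized.
template <typename T>
constexpr void store(LocalSlot<T> &S, T V) {
  S.Data = V;
  S.Desc.IsInitialized = true;
}

// Reading checks the bit first. An empty optional stands in for the
// "read of uninitialized object" diagnostic an interpreter would emit.
template <typename T>
constexpr std::optional<T> load(const LocalSlot<T> &S) {
  if (!S.Desc.IsInitialized)
    return std::nullopt;
  return S.Data;
}

// Mirrors `int x; /* read */ x = 42; /* read */` inside a constexpr function.
constexpr bool readBeforeWriteIsRejected() {
  LocalSlot<int> X{};
  bool RejectedBefore = !load(X).has_value();
  store(X, 42);
  return RejectedBefore && load(X) == 42;
}

static_assert(readBeforeWriteIsRejected(), "uninitialized read must be caught");
```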

This sounds simple enough, but as the Phabricator review for this change shows, it required quite a few changes in a lot of places, because the interpreter had so far assumed that local variables are always initialized.

Support for floating-point numbers

This one also feels like it landed ages ago: the new interpreter now supports floating-point numbers.

To support this, there is now a PrimType called PT_Float, which is used for all floating types, most typically float and double. It is backed by a new Floating class, which represents one such value. A Floating variable is basically a wrapper around LLVM's APFloat class, which does exactly the same thing.

For integer primitives, the PrimType fully defines the type. It specifies both the bit width and the signedness; e.g., PT_Sint32 is a signed 32-bit integer. For floating-point values, that is not the case, so we need more data when working with them. Typically, that means we need to pass the fltSemantics around so we know if we have a double or float value (or any of the many other floating-point types). In other cases, we need to pass on the RoundingMode. If you've worked with LLVM's APFloat before, both of those are probably well-known to you.
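
To illustrate why a single "float" tag is not enough, here is a small sketch that does not use LLVM at all. ToySemantics, ToyFloating and addToy() are hypothetical stand-ins (for fltSemantics, Floating and an add operation), and rounding through a C++ float is just a convenient way to mimic single precision.

```cpp
// Hypothetical stand-in for fltSemantics: which format does the value use?
enum class ToySemantics { Single, Double };

// Hypothetical stand-in for Floating: the value plus its semantics.
struct ToyFloating {
  double Value;
  ToySemantics Sem;
};

// Adds two values that are assumed to share the same semantics. The result
// is rounded to the precision the semantics ask for.
constexpr ToyFloating addToy(ToyFloating A, ToyFloating B) {
  double Sum = A.Value + B.Value;
  if (A.Sem == ToySemantics::Single)
    Sum = static_cast<double>(static_cast<float>(Sum)); // drop extra precision
  return {Sum, A.Sem};
}

constexpr double addAs(ToySemantics Sem, double L, double R) {
  return addToy({L, Sem}, {R, Sem}).Value;
}

// 2^24 + 1 is not representable in single precision, but it is in double,
// so the same operands give different results depending on the semantics.
static_assert(addAs(ToySemantics::Single, 16777216.0, 1.0) == 16777216.0, "");
static_assert(addAs(ToySemantics::Double, 16777216.0, 1.0) == 16777217.0, "");
```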

In practice, the new interpreter gained new opcodes for floating-point operations, such as Addf and Subf:

class FloatOpcode : Opcode {
  let Types = [];
  let Args = [ArgRoundingMode];
}
def Addf : FloatOpcode;
def Subf : FloatOpcode;

And when generating bytecode, we need to check whether we're dealing with a floating-point type or not.
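
That check could look roughly like the sketch below. None of these names are Clang's: ToyPrimType, ToyOpcode and emitAdd() are hypothetical, and the only thing taken from the opcode definitions above is that the floating-point variant carries a rounding-mode argument.

```cpp
// Hypothetical, trimmed-down versions of the interpreter's enums.
enum class ToyPrimType { Sint32, Uint32, Float };
enum class ToyRoundingMode { NearestTiesToEven, TowardZero };
enum class ToyOpcode { Add, Addf };

// What the code generator decided to emit for one addition.
struct ToyEmittedOp {
  ToyOpcode Op;
  ToyPrimType Type;
  bool HasRoundingMode;
  ToyRoundingMode RM; // ignored unless HasRoundingMode is true
};

// Floating-point additions get the dedicated opcode plus a rounding mode;
// integer additions are fully described by their primitive type.
constexpr ToyEmittedOp emitAdd(ToyPrimType T, ToyRoundingMode RM) {
  if (T == ToyPrimType::Float)
    return {ToyOpcode::Addf, T, true, RM};
  return {ToyOpcode::Add, T, false, ToyRoundingMode::NearestTiesToEven};
}

static_assert(emitAdd(ToyPrimType::Float, ToyRoundingMode::TowardZero).Op ==
                  ToyOpcode::Addf, "");
static_assert(!emitAdd(ToyPrimType::Sint32, ToyRoundingMode::TowardZero)
                   .HasRoundingMode, "");
```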

The "AP" in APFloat means "arbitrary precision," and to support this use case, each APFloat variable may heap-allocate memory to store excessively large floating-point values. This poses a particular problem for the new interpreter, since values are stored either on a stack or in a char array (the bytecode). Without special support, this results in either uninitialized reads or memory leaks. To avoid both, the new interpreter has special (de)serialization code to handle Floating variables.
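
The general idea of serializing such a value into a byte array, instead of copying the object itself, can be sketched as follows. This is an illustration under assumptions, not the interpreter's code: ToyBigValue stands in for a value that may own heap memory, and the function names and the "length prefix plus payload" format are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical value that may own heap memory, like a large APFloat.
struct ToyBigValue {
  std::vector<std::uint64_t> Words;
};

// Number of bytes the value occupies in the byte array.
inline std::size_t bytesToSerialize(const ToyBigValue &V) {
  return sizeof(std::uint32_t) + V.Words.size() * sizeof(std::uint64_t);
}

// Appends "word count, then words" to the code buffer. Only plain bytes end
// up in the buffer, so nothing in it owns memory that could leak.
inline void serialize(const ToyBigValue &V, std::vector<unsigned char> &Code) {
  const std::uint32_t N = static_cast<std::uint32_t>(V.Words.size());
  const std::size_t Off = Code.size();
  Code.resize(Off + bytesToSerialize(V));
  std::memcpy(Code.data() + Off, &N, sizeof(N));
  if (N != 0)
    std::memcpy(Code.data() + Off + sizeof(N), V.Words.data(),
                N * sizeof(std::uint64_t));
}

// Rebuilds a properly constructed object from the bytes at offset Off.
inline ToyBigValue deserialize(const std::vector<unsigned char> &Code,
                               std::size_t Off) {
  assert(Off + sizeof(std::uint32_t) <= Code.size() && "truncated header");
  std::uint32_t N = 0;
  std::memcpy(&N, Code.data() + Off, sizeof(N));
  assert(Off + sizeof(N) + N * sizeof(std::uint64_t) <= Code.size() &&
         "truncated payload");
  ToyBigValue V;
  V.Words.resize(N);
  if (N != 0)
    std::memcpy(V.Words.data(), Code.data() + Off + sizeof(N),
                N * sizeof(std::uint64_t));
  return V;
}

// Convenience check: does a value survive the trip through the byte array?
inline bool roundTrips(const ToyBigValue &V) {
  std::vector<unsigned char> Code;
  serialize(V, Code);
  return deserialize(Code, 0).Words == V.Words;
}
```

Copying the object's raw bytes instead would duplicate its internal heap pointer without duplicating the allocation, which is where leaks and invalid reads tend to come from.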

Handling floating-point values correctly was an important step forward, since parts that make them special will also apply to other types that are yet to come, like arbitrary-bitwidth integers (think _BitInt or 128-bit integers on hosts that don't support them).

Initializer rework

One of the larger changes I implemented since the last blog post is reworking initializers.

Previously, we had visitRecordInitializer() and visitArrayInitializer() functions, which initialized the pointer on top of the stack. For _Complex support, I added an additional visitComplexInitializer() function, but that never got merged. These functions all handled a few types of expressions differently than the normal visit() function. In short, the difference was that visit() created a new value, while visit*Initializer() initialized an already existing pointer with the values from the provided expression.

However, this caused problems in some cases, when the AST contained an expression of a composite type that was not initializing an already existing pointer. We had no way of differentiating these cases when generating bytecode.

In the new world, the bytecode generator contains more fine-grained functions to control code generation:

  • visitInitializer(): This sets an internal Initializing flag to true. When generating bytecode, we can check that flag and act accordingly. If it is true, we can assume that the top of the stack is a Pointer, which we can initialize.
  • discard(): Evaluates the given expression for side-effects, but does not leave any value on the stack behind.
  • visit(): The old visit() function is still being used, but if the given expression is of composite type (and a few other restrictions apply), it will now automatically create a temporary variable and call visitInitializer() to initialize it instead. This ensures that visit() always pushes a value of a valid PrimType to the stack.
  • delegate(): Simply passes the expression on, keeping all the internal flags intact. This is a replacement for the previous pattern of return DiscardResult ? this->discard(E) : this->visit(E).
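
The interplay of the four entry points can be modelled with a toy code generator. The sketch below is a simplification under assumptions: ToyExpr, ToyCompiler, FlagScope and the pseudo-opcode strings are all hypothetical, and only the two flags and the four function names are taken from the list above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, heavily simplified expression node.
struct ToyExpr {
  std::string Name;
  bool IsComposite;                // record, array, ...: not a primitive
  const ToyExpr *Wrapped = nullptr; // set for wrappers such as parentheses
};

class ToyCompiler {
  bool DiscardResult = false;
  bool Initializing = false;
  std::vector<std::string> Ops; // emitted pseudo-opcodes, for inspection

  // Sets both flags and restores the old values on scope exit.
  struct FlagScope {
    ToyCompiler &Owner;
    bool OldDiscard, OldInit;
    FlagScope(ToyCompiler &C, bool Discard, bool Init)
        : Owner(C), OldDiscard(C.DiscardResult), OldInit(C.Initializing) {
      C.DiscardResult = Discard;
      C.Initializing = Init;
    }
    ~FlagScope() {
      Owner.DiscardResult = OldDiscard;
      Owner.Initializing = OldInit;
    }
  };

  // Stand-in for the per-expression visitors, which consult the flags.
  bool emit(const ToyExpr &E) {
    assert(!(Initializing && DiscardResult) && "flags are mutually exclusive");
    if (E.Wrapped)
      return delegate(*E.Wrapped); // wrappers just pass the request on
    if (Initializing)
      Ops.push_back("InitTopOfStack(" + E.Name + ")");
    else if (DiscardResult)
      Ops.push_back("EvalForSideEffects(" + E.Name + ")");
    else
      Ops.push_back("Push(" + E.Name + ")");
    return true;
  }

public:
  // Initialize the pointer that is already on top of the stack.
  bool visitInitializer(const ToyExpr &E) {
    FlagScope Scope(*this, /*Discard=*/false, /*Init=*/true);
    return emit(E);
  }
  // Evaluate for side effects only.
  bool discard(const ToyExpr &E) {
    FlagScope Scope(*this, /*Discard=*/true, /*Init=*/false);
    return emit(E);
  }
  // Produce a value; composite types go through a temporary.
  bool visit(const ToyExpr &E) {
    if (E.IsComposite) {
      Ops.push_back("PushPointerToTemporary");
      return visitInitializer(E);
    }
    FlagScope Scope(*this, /*Discard=*/false, /*Init=*/false);
    return emit(E);
  }
  // Keep whatever flags are currently set.
  bool delegate(const ToyExpr &E) { return emit(E); }

  const std::vector<std::string> &ops() const { return Ops; }
};
```

In this model, visiting a primitive yields a plain push, discarding a parenthesized expression forwards the "discard" request to the inner expression through delegate(), and visiting a composite expression first creates a temporary and then initializes it.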

Invalid expressions

Even though every new C++ version supports more and more constructs in constant contexts, there are still some constructs that aren't supported. For those, we've added a new Invalid opcode that simply reports an error when interpreted.

Such an opcode is necessary since we can't reject a constexpr function right away when generating bytecode for it and encountering such an expression. For example, the following function can be executed just fine in a constant context, even though the throw statement is not supported in a constant context:

constexpr int f(bool b) {
    if (b)
        throw;

    return 1;
}

static_assert(f(false) == 1);  // Works
static_assert(f(true) == 1);   // Doesn't
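
To show why deferring the error to interpretation time works, here is a toy bytecode interpreter for a function shaped like f() above. It is a sketch under assumptions: the opcode set, the instruction encoding and the program listing are invented for this example, and only the idea of an Invalid opcode that fails when executed comes from the text.

```cpp
#include <optional>

// Hypothetical opcode set, just large enough for this example.
enum class ToyOp { LoadArg, JumpIfFalse, Invalid, Const, Ret };

struct ToyInstr {
  ToyOp Op;
  int Arg;
};

// Hand-written bytecode for: if (b) throw; return 1;
constexpr ToyInstr Program[] = {
    {ToyOp::LoadArg, 0},     // 0: push b
    {ToyOp::JumpIfFalse, 3}, // 1: if (!b) goto 3
    {ToyOp::Invalid, 0},     // 2: the unsupported throw
    {ToyOp::Const, 1},       // 3: push 1
    {ToyOp::Ret, 0},         // 4: return the top of the stack
};

// Runs the program. An empty optional means "not a constant expression".
constexpr std::optional<int> run(bool B) {
  int Stack[4] = {};
  int Sp = 0;
  int Pc = 0;
  while (true) {
    const ToyInstr I = Program[Pc++];
    switch (I.Op) {
    case ToyOp::LoadArg:
      Stack[Sp++] = B ? 1 : 0;
      break;
    case ToyOp::JumpIfFalse:
      if (Stack[--Sp] == 0)
        Pc = I.Arg;
      break;
    case ToyOp::Invalid:
      return std::nullopt; // only an error if control flow gets here
    case ToyOp::Const:
      Stack[Sp++] = I.Arg;
      break;
    case ToyOp::Ret:
      return Stack[--Sp];
    }
  }
}

static_assert(run(false) == 1, "the Invalid opcode is never reached");
static_assert(!run(true).has_value(), "the Invalid opcode is executed");
```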

Builtin functions

Clang has tons of builtin functions (starting with __builtin_), many of which are also supported at compile time. Since the last blog post, the new interpreter has gained support for quite a few of them, mostly floating-point builtins like __builtin_fmin():

static_assert(__builtin_fmin(1.0, 2.0) == 1.0);

Most of the builtin functions are not hard to implement, but they go a bit against what the new interpreter does: generate (target-dependent) bytecode. Instead, we have to do the computations on target-independent values and then convert them to target-dependent values again. This matters most for the sizes of types (e.g., int isn't always 4 bytes).
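
For a feel of how such builtins behave in constant expressions, here are a few more of the ones mentioned in this article. This is a sketch that assumes GCC or Clang, since these are compiler builtins and not standard C++; clampToNonNegative() is a made-up helper.

```cpp
// Builtins can be called directly in a constant expression...
static_assert(__builtin_strlen("clang") == 5, "");
static_assert(__builtin_fabs(-2.5) == 2.5, "");
static_assert(__builtin_copysign(3.0, -1.0) == -3.0, "");
static_assert(__builtin_isnan(__builtin_nan("")) != 0, "");
static_assert(__builtin_isinf(__builtin_inf()) != 0, "");

// ...or from inside a constexpr function (hypothetical helper).
constexpr double clampToNonNegative(double V) {
  return __builtin_fmax(V, 0.0);
}

static_assert(clampToNonNegative(-1.5) == 0.0, "");
static_assert(clampToNonNegative(4.0) == 4.0, "");
```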

Support for complex numbers

C has supported complex numbers since C99, and Clang also accepts them in C++ as a language extension. You might remember them from math class. For our purposes, the most interesting part is that they consist of two components: real and imaginary.

Here's a small demo in case you've never seen them:

constexpr _Complex float F = {1.0, 2.0};

static_assert(__real(F) == 1.0);
static_assert(__imag(F) == 2.0);

Because they always consist of exactly two elements, we model them as a two-element array and don't create a special PrimType. As an example, implementing the __real unary operator from the example above can be done by simply returning the first element of the array:

case UO_Real: { // __real x
  if (!this->visit(SubExpr))
    return false;
  if (!this->emitConstUint8(0, E))
    return false;
  return this->emitArrayElemPtrPopUint8(E);
}

This leaves a pointer to the first array element (the real part) on the stack, from which the floating-point value can then be loaded.
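
The "two-element array" model can be written down directly in plain C++. The sketch below avoids the _Complex extension on purpose; ToyComplex, realPtr() and imagPtr() are hypothetical names that only mirror the idea of indexing element 0 or 1.

```cpp
// Hypothetical model of a complex float: element 0 is the real part,
// element 1 is the imaginary part.
struct ToyComplex {
  float Elems[2];
};

// Like "push index 0, then take the element pointer" in the snippet above.
constexpr const float *realPtr(const ToyComplex &C) { return &C.Elems[0]; }

// The imaginary part is the same operation with index 1.
constexpr const float *imagPtr(const ToyComplex &C) { return &C.Elems[1]; }

constexpr ToyComplex F = {{1.0f, 2.0f}};

// Loading through the element pointer yields the component's value.
static_assert(*realPtr(F) == 1.0f, "");
static_assert(*imagPtr(F) == 2.0f, "");
```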

Of course, these are the simple operations that need to be supported for complex types. Arithmetic operations are still a work in progress. I have a series of patches for complex types that are already finished and approved to be pushed, but I'm trying to hold them back until I'm sure the design works out. This is important because the design carries over to the implementation of vector types and fixed-point types.

Google Summer of Code

As a side note, I have also been busy this past year mentoring a GSoC student, Takuya Shimizu, who improved Clang's diagnostic output.

You can read more about his changes and the improvements in Clang 17's diagnostic output in general in his blog post.

Smaller additions and future work

The remaining changes aren't as interesting, but here are a few:

  • Global variables of record and array type are now (recursively) checked for initialization
  • Support for the previously missing mul, div and rem compound assignment operators
  • Support for switch statements
  • Implemented builtin functions: __builtin_is_constant_evaluated(), __builtin_assume(), __builtin_strcmp(), __builtin_strlen(), __builtin_nan(), __builtin_nans(), __builtin_huge_val(), __builtin_inf(), __builtin_copysign(), __builtin_fmin(), __builtin_fmax(), __builtin_isnan(), __builtin_isinf(), __builtin_isinf_sign(), __builtin_isfinite(), __builtin_isfpclass(), __builtin_fpclassify(), __builtin_fabs(), __builtin_offsetof
  • Support for logical and/or operators
  • Support for C++ range-for loops
  • Support for destructors
  • Support for function pointers (and calling them)
  • Track frame depth (including diagnostics)
  • Support for virtual function calls
  • Support for lambda expressions
  • Support for SourceLocExprs
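
A few of the constructs from this list, as they might appear in code that ends up in the constant interpreter. These are made-up examples restricted to what C++17 allows in constant expressions, so destructors and virtual calls are left out.

```cpp
// switch statements
constexpr int classify(int V) {
  switch (V) {
  case 0:
    return 10;
  case 1:
    return 20;
  default:
    return 30;
  }
}

// mul, div and rem compound assignment operators
constexpr int compound(int X) {
  X *= 6;
  X /= 4;
  X %= 5;
  return X;
}

// range-for loops
constexpr int sumOfSquares() {
  int Values[] = {1, 2, 3};
  int Sum = 0;
  for (int V : Values)
    Sum += V * V;
  return Sum;
}

// logical and/or operators
constexpr bool inRange(int V) { return (V >= 0 && V < 10) || V == 42; }

// function pointers (and calling them)
constexpr int twice(int X) { return X * 2; }
constexpr int apply(int (*Fn)(int), int X) { return Fn(X); }

// lambda expressions
constexpr int viaLambda(int X) {
  auto AddOne = [](int V) { return V + 1; };
  return AddOne(X);
}

static_assert(classify(1) == 20, "");
static_assert(compound(3) == 4, ""); // 3*6 = 18, 18/4 = 4, 4%5 = 4
static_assert(sumOfSquares() == 14, "");
static_assert(inRange(42) && !inRange(10), "");
static_assert(apply(&twice, 21) == 42, "");
static_assert(viaLambda(1) == 2, "");
```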

My work in the following months will concentrate on supporting more of the constructs needed to handle standard headers. This includes, in particular, 128-bit integers and IntegralToPointer casts. As always, I'd like to use this opportunity to thank all the reviewers who spend so much time reviewing my many patches. This includes especially, but not exclusively, Aaron Ballman, Corentin Jabot, Erich Keane, and Shafik Yaghmour.