A popular technique modern compilers use to improve the runtime performance of compiled code is to perform computations at compile time instead of at runtime. But this is not only an optimization: C and C++ also require constant expressions to be evaluated at compile time in a number of places, such as array bounds, case labels, and static_assert conditions. To support this, I've been working on improving Clang's new constant expression interpreter. Here's a look at just how much work has been done since my previous article in November 2022:
$ git log --after=2022-11-23 | grep -i "\[clang\]\[Interp\]" | wc -l
308
A good chunk of those are NFC (no functional change) commits, which I like to split out from functional changes. All of those patches contain tons of small changes and refactorings, so in this article I concentrate on the largest and most important changes. The new constant expression interpreter is still experimental and under development. To use it, you must pass -fexperimental-new-constant-interpreter to Clang.
Uninitialized local variables
This change feels like it landed an eternity ago, but it was in December 2022. Before C++20, local variables in constexpr functions had to have an initializer: https://godbolt.org/z/Mh893MoKq
In C++20, the constant interpreter needs to handle uninitialized local variables and diagnose if the program tries to read from them.
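As a small illustration (my own example, not necessarily the one behind the Compiler Explorer link above), reading from an uninitialized local is only an error if the read is actually reached during evaluation:

constexpr int f(bool b) {
  int x;             // OK since C++20: no initializer required
  if (b)
    x = 10;
  return b ? x : 1;  // x is only read when it has been initialized
}
static_assert(f(true) == 10);  // OK
static_assert(f(false) == 1);  // OK: x is never read
// A call that read x before any assignment would be rejected with a
// diagnostic about reading an uninitialized variable.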
In the new interpreter, this information is handled via an InlineDescriptor struct, which simply contains one bit for the initialization state of the variable. Every local variable is now preceded by an InlineDescriptor, so checking whether a local variable is initialized is as simple as reading that one bit.
┌─────────────────┬─────────┐ │InlineDescriptor │ Data... │ └─────────────────┴─────────┘
This sounds simple enough, but as you can see from the Phabricator review linked above, it required quite a few changes in a lot of places because the assumption until then had been that local variables are always initialized.
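A heavily simplified sketch of the idea (Clang's real InlineDescriptor tracks more metadata than just this one bit):

struct InlineDescriptorLike {
  unsigned IsInitialized : 1; // set on the first store, checked on every load
  // ... the real InlineDescriptor carries more state than shown here.
};
// Conceptual layout of a local variable's storage:
//   [ InlineDescriptorLike ][ variable data ... ]
// A load first checks IsInitialized; if it is still 0, the interpreter
// diagnoses a read from an uninitialized variable.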
Support for floating-point numbers
This also feels like it has been supported for even longer, but it is in fact new: the interpreter now handles floating-point numbers.
To support this, there is now a PrimType called PT_Float, which is used for all floating types, most typically float and double. It is backed by a new Floating class, which represents one such value. A Floating variable is basically a wrapper around LLVM's APFloat class, which does exactly the same thing.
For integer primitives, the PrimType fully defines the type: it specifies both the bit width and the signedness; e.g., PT_Sint32 is a signed 32-bit integer. For floating-point values, that is not the case, so we need more data when working with them. Typically, that means we need to pass the fltSemantics around so we know whether we have a double or a float value (or any of the many other floating-point types). In other cases, we need to pass on the RoundingMode. If you've worked with LLVM's APFloat before, both of those are probably well known to you.
In practice, the new interpreter gained new opcodes for floating-point operations, like Addf, Subf, etc.:
class FloatOpcode : Opcode {
  let Types = [];
  let Args = [ArgRoundingMode];
}

def Addf : FloatOpcode;
def Subf : FloatOpcode;
And when generating bytecode, we need to check whether we're dealing with a floating-point type or not.
The "AP" in APFloat
means "arbitrary precision" and to support this use case, each APFloat
variable may heap-allocate memory to save excessively large floating-point values. This poses a particular problem for the new interpreter, since values are allocated in a stack or into a char array, the byte code. So without special support, this results in either uninitialized reads or memory leaks. To support this, the new interpreter has special (de)serialization code to handle Floating
variables.
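To give a rough idea of what that looks like (a simplified sketch, not the interpreter's actual serialization code): the bit pattern of the APFloat is copied into the buffer, and the value is later rebuilt from those bits plus the matching fltSemantics.

#include "llvm/ADT/APFloat.h"
#include "llvm/ADT/APInt.h"
#include "llvm/ADT/SmallVector.h"
#include <cstdint>
#include <cstring>
#include <vector>

// Copy the value's bit pattern into a byte buffer; no pointers into the
// APFloat (and thus no heap allocations) end up in the byte code.
static void serializeFloat(const llvm::APFloat &F, std::vector<char> &Buf) {
  llvm::APInt Bits = F.bitcastToAPInt();
  const char *Raw = reinterpret_cast<const char *>(Bits.getRawData());
  Buf.insert(Buf.end(), Raw, Raw + Bits.getNumWords() * sizeof(uint64_t));
}

// Rebuild the APFloat from the raw bits plus the float semantics, which have
// to be transported separately (they tell us float vs. double vs. ...).
static llvm::APFloat deserializeFloat(const llvm::fltSemantics &Sem,
                                      const char *Buf) {
  unsigned BitWidth = llvm::APFloat::semanticsSizeInBits(Sem);
  unsigned NumWords = (BitWidth + 63) / 64;
  llvm::SmallVector<uint64_t, 2> Words(NumWords);
  std::memcpy(Words.data(), Buf, NumWords * sizeof(uint64_t));
  return llvm::APFloat(Sem, llvm::APInt(BitWidth, Words));
}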
Handling floating-point values correctly was an important step forward, since the parts that make them special will also apply to other types that are yet to come, like arbitrary-bitwidth integers (think _BitInt or 128-bit integers on hosts that don't support them).
Initializer rework
One of the larger changes I implemented since the last blog post is reworking initializers.
Previously, we had visitRecordInitializer() and visitArrayInitializer() functions which initialized the pointer on top of the stack. For _Complex support, I've added an additional visitComplexInitializer() function, but that never got merged. These functions all handled a few types of expressions differently than the normal visit() function. In short, the difference was that visit() created a new value, while visit*Initializer() initialized an already existing pointer with the values from the provided expression.
However, this caused problems in some cases, when the AST contained an expression of a composite type that was not initializing an already existing pointer. We had no way of differentiating these cases when generating byte code.
In the new world, the byte code generator contains more fine-grained functions to control code generation (a small example follows the list):
- visitInitializer(): This sets an internal Initializing flag to true. When generating bytecode, we can check that flag and act accordingly. If it is true, we can assume that the top of the stack is a Pointer, which we can initialize.
- discard(): Evaluates the given expression for its side effects, but does not leave any value on the stack behind.
- visit(): The old visit() function is still being used, but will now automatically create a temporary variable and call visitInitializer() to initialize it instead, if the given expression is of composite type (and a few other restrictions). This ensures that visit() always pushes a valid PrimType to the stack.
- delegate(): Simply passes the expression on, keeping all the internal flags intact. This is a replacement for the previous pattern of return DiscardResult ? this->discard(E) : this->visit(E).
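To make the distinction concrete, here is a small example of mine, with an approximate mapping to the functions above:

struct S { int a, b; };
constexpr int f() {
  S s = {1, 2};     // fills in an existing object: visitInitializer()
  (void)S{3, 4};    // evaluated only for its side effects: discard()
  return S{5, 6}.a; // composite value with no object to fill: visit()
                    // materializes a temporary and initializes it
}
static_assert(f() == 5);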
Invalid expressions
Even though every new C++ version supports more and more constructs in constant contexts, there are still some constructs that aren't supported. For those, we've added a new Invalid opcode that simply reports an error when interpreted.
Such an opcode is necessary since we can't reject a constexpr function right away when generating bytecode for it and encountering such an expression. For example, the following function can be executed just fine in a constant context, even though the throw statement is not supported there:
constexpr int f(bool b) {
  if (b)
    throw;
  return 1;
}
static_assert(f(false) == 1); // Works
static_assert(f(true) == 1);  // Doesn't
Builtin functions
Clang has tons of builtin functions (starting with __builtin), many of which are also supported during constant evaluation. Since the last blog post, the new interpreter has gained support for quite a few of them, mostly floating-point builtins like __builtin_fmin():
static_assert(__builtin_fmin(1.0, 2.0) == 1.0);
Most of the builtin functions are not hard to implement, but they go a bit against what the new interpreter does: generate (target-dependent) byte code. Instead, we have to do the computations on target-independent values and then convert them to target-dependent values again. This is most interesting for the size of types (e.g., int isn't always 4 bytes).
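To illustrate, here are two checks of the kind the interpreter now has to get right (both builtins are on the list of implemented ones at the end of this post): __builtin_strlen() is fully target-independent, while __builtin_fmin() has to reproduce fmin()'s NaN handling on top of APFloat:

static_assert(__builtin_strlen("constexpr") == 9);
// fmin returns the non-NaN operand if exactly one operand is a NaN.
static_assert(__builtin_fmin(__builtin_nan(""), 2.0) == 2.0);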
Support for complex numbers
C and C++ have a commonly supported language extension called "complex numbers." You might remember them from math class. For our purposes, the most interesting part is that they consist of two components: real and imaginary.
Here's a small demo in case you've never seen them:
constexpr _Complex float F = {1.0, 2.0};
static_assert(__real(F) == 1.0);
static_assert(__imag(F) == 2.0);
Because they always consist of exactly two elements, we model them as a two-element array and don't create a special PrimType. As an example, implementing the __real unary operator from the example above can be done by simply returning the first element of the array:
case UO_Real: { // __real x
  if (!this->visit(SubExpr))
    return false;
  if (!this->emitConstUint8(0, E))
    return false;
  return this->emitArrayElemPtrPopUint8(E);
}
This will push a floating-point value equal to the first element in the array on the stack.
Of course, these are the simple operations that need to be supported for complex types. Arithmetic operations are still a work in progress. I have a series of patches for complex types that are already finished and approved to be pushed, but I'm trying to hold them back until I'm sure the design works out. This is important because the design carries over to the implementation of vector types and fixed-point types.
Google Summer of Code
As a side note, I have also been busy this past year mentoring a GSoC student, Takuya Shimizu, who improved Clang's diagnostic output.
You can read more about his changes and the improvements in Clang 17's diagnostic output in general in his blog post.
Smaller additions and future work
The remaining changes aren't as interesting, but here are a few (a small combined example follows the list):
- Global variables of record and array type are now (recursively) checked for initialization
- Implement missing mul, div and rem compound assign operators
- Implement switch statements
- Implemented builtin functions: __builtin_is_constant_evaluated(), __builtin_assume(), __builtin_strcmp(), __builtin_strlen(), __builtin_nan(), __builtin_nans(), __builtin_huge_val(), __builtin_inf(), __builtin_copysign(), __builtin_fmin(), __builtin_fmax(), __builtin_isnan(), __builtin_isinf(), __builtin_isinf_sign(), __builtin_isfinite(), __builtin_isfpclass(), __builtin_fpclassify(), __builtin_fabs(), __builtin_offsetof
- Support for logical and/or operators
- Support for C++ range-for loops
- Support for destructors
- Support for function pointers (and calling them)
- Track frame depth (including diagnostics)
- Support for virtual function calls
- Support for lambda expressions
- Support for SourceLocExprs
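Several of these come together even in small snippets. Here is an illustrative example of mine (not taken from the patches) that exercises range-based for loops and constexpr lambdas:

constexpr int sum() {
  int Arr[3] = {1, 2, 3};
  auto Add = [](int A, int B) { return A + B; }; // constexpr lambda
  int S = 0;
  for (int V : Arr) // range-based for in a constant expression
    S = Add(S, V);
  return S;
}
static_assert(sum() == 6);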
My work in the following months will concentrate on supporting more of the constructs needed by the standard headers, in particular 128-bit integers and IntegralToPointer casts. As always, I'd like to use this opportunity to thank all the reviewers who spend so much time reviewing my many patches. This includes especially, but not exclusively, Aaron Ballman, Corentin Jabot, Erich Keane, and Shafik Yaghmour.