C++ Reflection of Function Parameters: Which Way is Right? • Architect in Slippers

At the Core C++ conference that took place a few days ago, Inbal presented a dilemma the committee is facing. I’d like to share my view on it. But first, some short background.

Reflection

Many languages implement mechanisms of reflection, i.e., the ability of the code to refer to its own internal structure. In fully interpreted languages such as Python or JavaScript, this is almost trivial to implement. In languages like Java or C# (which compile to an intermediate bytecode) it is not trivial, but it is also not too big a challenge, and they introduced such mechanisms long ago. In fully compiled languages, however, this is a big issue. Using reflection at run time means that the compiler has to capture the structure of the code and put it into the machine code it produces. Doing that increases the size of the program and can hurt its performance, which is against the very purpose of compiled languages.

In C++, there is already the infamous RTTI (Run-Time Type Information) mechanism, which exposes the dynamic type of an object, and an implementation-defined name for that type, at run time. But it is limited to type identity and names, and using it is considered bad practice in most cases. So what kind of reflection is C++ going to introduce? The C++26 standard will include a wide range of reflection abilities at compile time. The more the language progresses, the more ways it provides for writing type-generic code, and the more ways it provides for writing logic that runs at compile time. The increase in C++ code that relies on those two abilities is what motivates the development of compile time reflection.
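For reference, here is a minimal sketch of what RTTI gives you today. The types Shape and Circle and both helper functions are made up for this illustration; only typeid and std::type_info come from the language and standard library.

```cpp
#include <string>
#include <typeinfo>

// Hypothetical polymorphic hierarchy; RTTI reports dynamic types only
// for classes with at least one virtual function.
struct Shape { virtual ~Shape() = default; };
struct Circle : Shape {};

// True when the object that `s` refers to is exactly a Circle.
bool is_circle(const Shape& s) { return typeid(s) == typeid(Circle); }

// The name of the dynamic type. Its spelling is implementation-defined
// and is often a mangled name rather than "Circle".
std::string dynamic_type_name(const Shape& s) { return typeid(s).name(); }
```

Note that nothing beyond the type itself is available this way: there are no members, no functions and no parameter names to query.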

The Challenges

While runtime reflection is a common programming concept, defining the exact logic of compile time reflection is not an easy task. Unlike a running program, the compiler has very limited knowledge at compile time, and that knowledge depends heavily on the specific location in the code it is currently compiling:

It knows nothing about types that it has not seen yet.
It knows very little about types that were declared but not yet defined.
It has none of the details that will only be resolved by the linker.

And this is the case for every translation unit (a compiled source file, such as .c or .cpp): the compiler knows nothing at the beginning of the compilation, and collects more and more data as it goes over the file.
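The second point in the list above is easy to observe in current C++, without any reflection. The following sketch uses a made-up type Widget and made-up helpers keep() and id_of(); it shows how the same type supports different operations depending on where in the file you ask.

```cpp
#include <cstddef>

struct Widget;  // declaration only: Widget is an incomplete type here

// Fine: pointers to incomplete types are allowed.
Widget* keep(Widget* w) { return w; }

// Not allowed at this point, the size and members are still unknown:
// constexpr std::size_t early_size = sizeof(Widget);

struct Widget { int id; double weight; };  // definition: Widget is complete

// From here on the compiler knows the layout and the members.
constexpr std::size_t widget_size = sizeof(Widget);
int id_of(const Widget& w) { return w.id; }
```

Compile time reflection inherits exactly this position-dependent view of the program.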

The C++ committee is working on reflection these days and, as Inbal presented in her great lecture, they ran into a question. Considering the challenges described above, how should the compiler treat the names of function parameters? The language does not require the parameter names in a function declaration to match those in the function definition. It even allows leaving them unnamed, or providing several declarations, each with possibly different parameter names. For example, consider a file with the following code:

int sum(int, int, int);
...
void f1() { ... }
...
int sum(int a, int b, int c);
...
void f2() { ... }
...
int sum(int x, int y, int z) { ... }
...
int f3() { ... }
... 
int sum(int i, int j, int k); 
... 
void f4() { ... }

Suppose that within each of the functions f1(), f2(), f3() and f4() the code refers to the function sum() and would like to log the names of its parameters. What will it print?

During the discussion in the lecture, there were several suggestions. For example, one suggestion was to use the names in the definition (so f4() will see the names x, y, z). Another one was to always let the latest declaration override former ones (so f4() will see the names i, j, k). I’m not sure that all of the suggestions are even feasible, but I’d like to share my own thoughts about the right behavior.
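To make the two suggestions concrete, here is a rough model of them in ordinary C++. This is not a reflection API: the names ParamNames, Decl, names_from_definition and names_from_latest are invented for this sketch, and the list of declarations stands in for what the compiler has seen so far.

```cpp
#include <array>
#include <optional>
#include <string>
#include <vector>

// Parameter names of one declaration of sum(); "" means unnamed.
using ParamNames = std::array<std::string, 3>;

// One declaration the compiler has already passed (hypothetical model).
struct Decl {
    ParamNames names;
    bool is_definition;
};

// Suggestion 1: use the names from the definition, if it was seen.
std::optional<ParamNames> names_from_definition(const std::vector<Decl>& seen) {
    for (const Decl& d : seen)
        if (d.is_definition) return d.names;
    return std::nullopt;
}

// Suggestion 2: the latest declaration overrides all former ones.
std::optional<ParamNames> names_from_latest(const std::vector<Decl>& seen) {
    if (seen.empty()) return std::nullopt;
    return seen.back().names;
}
```

Feeding the four declarations of sum() in file order gives x, y, z for the first policy and i, j, k for the second when queried from f4(), matching the outcomes described above. Queried from f1() or f2(), the first policy has nothing to return at all.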

A Methodological Approach

In her lecture, Inbal presented two concepts that the committee has already decided upon regarding reflection:

The reflection data reflects the compiler’s knowledge at the specific point, and can therefore return different data for the same query, when called from different locations.
By definition, the reflection knowledge might be partial; for example, calling reflection functions to query about a class that is already declared but is not yet defined will be able to provide its name but not much more than that.

Following these two guidelines, and accepting the fact that C++ allows multiple function declarations with different parameter names, I think that reflecting a function’s data should be able to provide:

A tuple of the parameter names in its definition, if it is already defined.
A collection (tuple) that holds, for each declaration the compiler has already passed, a tuple with that declaration’s parameter names.

So, if we refer to the former example, the data will be (schematically):

In f1(): defNames: none, declNames: { {"", "", ""} }.
In f2(): defNames: none, declNames: { {"", "", ""}, {"a", "b", "c"} }.
In f3(): defNames: {"x", "y", "z"}, declNames: { {"", "", ""}, {"a", "b", "c"} }.
In f4(): defNames: {"x", "y", "z"}, declNames: { {"", "", ""}, {"a", "b", "c"}, {"i", "j", "k"} }.

In my opinion, this is the most natural behavior that follows the two reflection guidelines.
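The proposed data can be modeled in ordinary C++ to check that it behaves as listed. This is only a sketch of the idea, not of any real or planned reflection API: ParamNameInfo, FunctionKnowledge and their members are hypothetical names, and the class simply imitates a compiler that learns more as it moves through the file.

```cpp
#include <array>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Parameter names of one declaration of sum(); "" means unnamed.
using ParamNames = std::array<std::string, 3>;

// What a query about the function would return at a given point
// (hypothetical shape, following the defNames/declNames scheme).
struct ParamNameInfo {
    std::optional<ParamNames> defNames;  // empty until the definition is seen
    std::vector<ParamNames> declNames;   // one entry per non-defining declaration
};

// Imitates the compiler's knowledge growing along the translation unit.
class FunctionKnowledge {
public:
    void see_declaration(ParamNames names) {
        info_.declNames.push_back(std::move(names));
    }
    void see_definition(ParamNames names) { info_.defNames = std::move(names); }

    // The answer depends on how much has been seen before the query.
    ParamNameInfo snapshot() const { return info_; }

private:
    ParamNameInfo info_;
};
```

Calling see_declaration() and see_definition() in the order of the example file, and taking a snapshot() where each of f1() to f4() appears, reproduces the four results listed above.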

What Do You Think?

Does that make sense to you?
Do you agree with it?
Do you think that its behavior should be different?
Let me know in the comments!