How the most upvoted C question was another question

I’m a fairly honest guy. I don’t mind admitting that I have taken most of my posts from Stack Overflow or Code Golf. I might be one of the few developers who actually read these websites, instead of only ending up there after pasting an error into the Google search bar.

However, I also think of myself as a somewhat knowledgeable developer. I’m no expert at all, but I’ve read a lot on and about C, including large parts of the specification itself. Because of this, I think I know almost every single operator and functionality. Which is why the following question surprised me a lot:

What is the “-->” operator in C++?

GManNickG

Well, I was wondering to myself: what does this operator do? I do know that the arrow operator is reasonably typical. It is widely used in functional programming, and C also has an arrow operator. However, that one uses only one -.

The functionality that was described was perhaps even more impressive. This is the code that was used as an example:

#include <stdio.h>
int main()
{
    int x = 10;
    while (x --> 0) // x goes to 0
    {
        printf("%d ", x);
    }
}

The output of this code is the following:

9 8 7 6 5 4 3 2 1 0

Because this also happens in gcc, the question asker was wondering whether this was an official operator. Based on the code given and the result, he had a feeling about what the operator does.

To start things off: he is wrong. The --> operator does not exist in C, especially not with this functionality. This raises the question: why does this even compile? Clearly, if the operator does not exist, the compiler should throw an error, right?

Well, this has to do with how C source is preprocessed and split into tokens. I have written about the preprocessor before, namely when `??!` in a comment broke code.

To understand this, we have to jump back into the C specification. When we look at the C11 standard, we find the following in section 5.1.1.2, translation phase 7:

White-space characters separating tokens are no longer significant. Each preprocessing token is converted into a token. The resulting tokens are syntactically and semantically analyzed and translated as a translation unit.

Perhaps the most interesting part here is “no longer”. Apparently, this has not always been the case. However, this part of the standard was taken directly from the C99 standard, so it is not that new. Sadly, older standards are no longer available on the internet, so I can’t check in which version this was added.

Most of the time, when I want to know what happened, I look at the rationale. In the C99 rationale, I could not find anything about the change, so I assume that it was already in there. I do have bad news for anyone who is bothered by that: it is also still in the C18 standard.

Sadly, we can’t change this. We do, however, know this has been the case for quite a while, and it most likely won’t change. The fact that whitespace is not relevant once the tokens have been formed results in this:

{P:whitespace.c;F:<NULL>;L:5;C:5;S:0;M:0x7fb35bc734e0;E:0,LOC:13748324,R:13748324}while {P:whitespace.c;F:<NULL>;L:5;C:11;S:0;M:0x7fb35bc734e0;E:0,LOC:13748512,R:13748512}({P:whitespace.c;F:<NULL>;L:5;C:12;S:0;M:0x7fb35bc734e0;E:0,LOC:13748544,R:13748544}x {P:whitespace.c;F:<NULL>;L:5;C:14;S:0;M:0x7fb35bc734e0;E:0,LOC:13748609,R:13748609}--{P:whitespace.c;F:<NULL>;L:5;C:16;S:0;M:0x7fb35bc734e0;E:0,LOC:13748672,R:13748672}> {P:whitespace.c;F:<NULL>;L:5;C:18;S:0;M:0x7fb35bc734e0;E:0,LOC:13748738,R:13748738}0{P:whitespace.c;F:<NULL>;L:5;C:21;S:0;M:0x7fb35bc734e0;E:0,LOC:13748832,R:13748832}){P:whitespace.c;F:<NULL>;L:6;C:5;S:0;M:0x7fb35bc734e0;E:0,LOC:13752416,R:13752416}

When I strip all of the unreadable information, we get this:

while ( x -- > 0 )

And this helps us a lot to understand what is going on. The --> operator does not exist. Instead, we are looking at two operators, -- and >, that are intertwined with each other because C does not care about whitespace between tokens.

It seems to be a total coincidence that the code does what you would expect if you don’t know any C at all. What it is actually doing is this:

#include <stdio.h>
int main()
{
    int x = 10;
    while (x-- > 0) // x goes to 0
    {
        printf("%d ", x);
    }
}

I do have good news for everyone who likes this way of writing loops: it also works almost flawlessly the other way around:

#include <stdio.h>
int main()
{
    int x = 10;
    while (0 <-- x) // x goes to 0
    {
        printf("%d ", x);
    }
}

The only downside to this is that it will never print 0, which the first piece of code did.

Post scriptum: I was later notified by an attentive lobste.rs user that I made a huge mistake in interpreting the standard. “No longer” does not reference a previous version of the standard, but instead refers to the translation phases.

2 thoughts on “How the most upvoted C question was another question”

  1. Just another example of cleverness that really only serves to obfuscate how the code actually works. I mean, I get it, the programmer is trying to have fun, but is not making sure one’s code is obvious central to the art of programming?

    1. Hi Darren,

      Yes, we agree about that completely. There are, however, two things that should be noted here:

      1. The person that asked the original question did so because he honestly wondered, not because he found a cool obfuscation.

      2. Many of the things I write about are meant to explain more of the details of the language, rather than to show off the cleverness of some developers. Writing about these kinds of examples does generate more interest in the story behind them.

      Nonetheless, I agree with you that code should be primarily easy to read. According to Robert C. Martin:

      Indeed, the ratio of time spent reading versus writing is well over 10 to 1. We are constantly reading old code as part of the effort to write new code. …[Therefore,] making it easy to read makes it easier to write.
