The beauty of creative problem solving

I read about a question that was given to a programmer in an Oracle Interview. I hope he did not get the job, because he simply posted his question on Stack Overflow, but the question itself, looked like a very interesting one.

How would you divide a number by 3 without using *, /, +, -, %, operators?

The question, as given on Stack Overflow.

Think about this question for a while. Perhaps your favourite programming language has a built in solution. Well, C does, so that would make the job as easy as using div:

The div() function computes the value numerator/denominator and returns the quotient and remainder in a structure named div_t that contains two integer members (in unspecified order) named quot and rem.

man page of div

This is, of course, the simplest solution available. If your interviewer is trying to look at your knowledge of the language, this would be a fit solution too. I do hope however, that you won’t meet many interviewers who are looking for solutions like these.

It is more interesting to see what you actually know about algorithms and creative problem solving. That’s why someone else responded with this piece of brilliant code:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    FILE * fp=fopen("temp.dat","w+b");
    int number=12346;
    int divisor=3;
    char * buf = calloc(number,1);
    fwrite(buf,number,1,fp);
    rewind(fp);
    int result=fread(buf,divisor,number,fp);
    printf("%d / %d = %d", number, divisor, result);
    free(buf);
    fclose(fp);
    return 0;
}

This actually writes 123456 bytes to a file, which can be of course, any number you are trying to check. After that, it reads chunks of 3 bytes each, and returns the amount of times it was able to read a chunk. Therefore, it divides the original number by 3.

Of course, this is a horrible solution, but that’s fine, according to the one who posted the code, because it fits the question.

Let’s look at another creative piece of code:

#include <stdio.h>

int main() {
  int dividend = -42, divisor = 5, quotient, remainder;

  __asm__ ( "cdq; idivl %%ebx;"
          : "=a" (quotient), "=d" (remainder)
          : "a"  (dividend), "b"  (divisor)
          : );

  printf("%i / %i = %i, remainder: %i\n", dividend, divisor, quotient, remainder);
  return 0;
}

This one is creative enough to use another language into C. He decided to use assembly to solve this problem, because even assembly has division instructions built in.

An even better solution came from someone who suggested to use the Setun computer. This was a computer from Moscow that used a ternary system instead of a binary one. This allows you to do one right shift to divide a number by 3.

Obviously, the answer would be to use left and right bit shifting to replace the + and - operators, of course:

// replaces the + operator
int add(int x, int y)
{
    while (x) {
        int t = (x & y) << 1;
        y ^= x;
        x = t;
    }
    return y;
}

int divideby3(int num)
{
    int sum = 0;
    while (num > 3) {
        sum = add(num >> 2, sum);
        num = add(num >> 2, num & 3);
    }
    if (num == 3)
        sum = add(sum, 1);
    return sum; 
}

But if that is too complicated for you, and you only need a demo, you can always try this:

if (number == 0) return 0;
if (number == 1) return 0;
if (number == 2) return 0;
if (number == 3) return 1;
if (number == 4) return 1;
if (number == 5) return 1;
if (number == 6) return 2;

Syntactic sugar in C: accessing arrays

Arrays in C are very easy to use, and that’s mostly because of the implementation behind the scenes. One of the things that makes this extremely clear, is the following line of code:

a[10] == 10[a]

This is always true, because C uses what is called syntactic sugar. This isn’t a real concept, but rather a name for syntax that makes a programming language sweeter to humans. This expression was coined for the first time by the computer scientist Peter John Landin, to express easy syntax to make programming easier.

Not everyone likes the term syntactic sugar. One of them is Jen Chan, who wrote a shitpost about it, calling almost everything in a programming language sugar. Perhaps he is right about that, but there are a few things that clearly are sweeter than others.

The best example in C is without a doubt a[10] == 10[a]. This can be tested with simple code like this:

int a[20];
if (a[10] == 10[a])
	printf("These are equal!\n");

This will print These are equal!. The reason for this is fairly simple: this syntax is only meant to hide the real syntax for accessing:

a[10] = *(a + 10)

Sadly, all good things must have a downside. The downside of this sugar, is that the compiler is unable to validate the left hand side of this expression. Because of this, this is also an option:

10[a] = *(10 + a)

Thankfully, that does not matter, because the operation is commutative. That means that the order of addition is not relevant to the output.

C, what the fuck??!

What do you think that the value of a will be here?

int a = 0;
// What will be the value of a????/
a++;

You probably know that it won’t be 1, but there is a big chance that you only know that because I asked this question this way.

a will actually not change, and that is because a++; will never be run. This is because of the comment above. There is something special about this line. Before we jump into that, let’s look at another example:

!didIMakeAMistake() ??!??! CIsWrongHere();

This actually compiles, which is already impressive on its own. The question is, however, what the fuck does this do?

To understand this, I have to admit one thing: I have to pass -trigraphs to a modern version of gcc before this actually works. Trigraphs are special combinations of characters that were invented because of a problem in C: it uses 9 characters that are not in the ISO/IEC 646 Invariant character set. That are these characters:

Image: Wikipedia

Who uses C on a regular basis, should be able to figure out which 9 characters are missing from here. Those are the following:

 # \ ^ [ ] | { } ~

The table in the image above is meant to be clarifying, but we discover that this table might be confusing. That is because the characters that are missing, can all be found in the table above. You should note however, that they are all grayed out. That is because they are national code points, and therefore not an international rule.

This could become very interesting. Let’s look at this simple line for example:

{ a[i] = '\n'; }

This would be written like this by a Swedish programmer:

ä aÄiÜ = 'Ön'; ü

This is because they would use different characters for the national code points, than an American programmer would use for example.

The ANSI C committee of course recognized this problem and therefore, they decided to introduce the trigraphs. That are nine combinations of characters that were meant to replace the non-standard characters.

Image: Wikipedia

Of course, this isn’t a beautiful solution, but it should do the trick.

Now, with this knowledge, let’s look at the lines that we started with. The latter is the easier one:

!didIMakeAMistake() ??!??! CIsWrongHere();

If we look at the table above, we see that ??! should be replaced with |. Therefore, this line actually says:

!didIMakeAMistake() || CIsWrongHere();

If you understand how short-circuit evaluation works, you can understand that this will result in the following:

if (didIMakeAMistake()) 
  CIsWrongHere();

The other example is actually more interesting, and a good reason to be cautious with trigraphs:

int a = 0;
// What will be the value of a????/
a++;

Earlier, I already explained that a will be 0, because a++; is never executed.

A trigraph is only a trigraph when the ??s are followed by one of the nine string literals. So in this case, the C preprocessor will replace the code above with the following:

int a = 0;
// What will be the value of a??\
a++;

This \ actually escapes the newline, which eventually results in the following:

int a = 0;
// What will be the value of a??a++;

And this is why a++; was never executed.

I would like to end with a note from the committee itself:

The Committee makes no claims that a program written using trigraphs looks attractive. As a matter of style, it may be wise to surround trigraphs with white space, so that they stand out better in program text. Some users may wish to define preprocessing macros for some or all of the trigraph sequences.

Rationale for International Standard Programming Languages C (5.2.1.1)

A function to sleep a 1000 years: explained

I encountered a code challenge that asked for a sleep function that sleeps a 1000 years. The current functions use an integer to declare the amount of seconds, but 232 is only around a 100 years long.

A smart guy posted this interesting line of C code:

i;f(){--i&&f(sleep(7));}

This works fairly correctly, surprisingly. Let’s look further into it!

First, we’ll dissect the code into more readable code:

i;
f() {
	--i && f(sleep(7));
}

The first line creates a signed integer, called i. According to the C99 standard, section 6.7.8.10, it will be initialized as 0:

If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static storage duration is not initialized explicitly, then:

— if it has pointer type, it is initialized to a null pointer;
— if it has arithmetic type, it is initialized to (positive or unsigned) zero;
— if it is an aggregate, every member is initialized (recursively) according to these rules;
— if it is a union, the first named member is initialized (recursively) according to these rules.

Section 6.7.1.2 tells us that we have to define a declaration specifier. Therefore, i would have to become int i. However, in C89, this was not yet a rule. This is what the rationale of C99 tells us about that:

A new feature of C99: In C89, all type specifiers could be omitted from the declaration specifiers in a declaration. In such a case int was implied. The Committee decided that the inherent danger of this feature outweighed its convenience, and so it was removed. The effect is to guarantee the production of a diagnostic that will catch an additional category of programming errors. After issuing the diagnostic, an implementation may choose to assume an implicit int and continue to translate the program in order to support existing source code that exploits this feature.

Now, we know that i would have been an int, if we use C89 rules. gcc will not give us an error if we try to do this, but just give us a warning:

warning: type defaults to ‘int’ in declaration of ‘i’ [-Wimplicit-int]

According to section 6.2.5.12, integers are arithmetic types. This, in combination with the second rule, makes that i will now be 0.

Then, the function f() is declared. This function takes no parameters and gets the default return type. This is also an int, because of the exact same rule as quoted above.

So, now we have an int called i and a function with return type int called f(). Let’s look at the contents of f()!

--i && f(sleep(7));

You have to understand that --i means that 1 will be substracted from i, before it is being evaluated. This means that we start with -1 in i.

The second part is actually more interesting. f() is called recursively, but with a parameter! Let’s look at another example that I wrote:

#include <stdio.h>

f() {
	printf("Bye world!\n");
}

int main(int argc, char const *argv[])
{
	f(printf("Hello world!\n"));
	return 0;
}

This actually outputs the following:

Hello world!
Bye world!

Apparently, we can still pass things to our function. They will be executed. If you want to avoid this, you will have to give void as the parameter type. So the following would not have worked:

f(void) {
	printf("Bye world!\n");
}

Now, we know why the recursive function works. So, when does it stop? According to the author, it stops at approximately a thousand years. He tells us it will run for around 952.69 years.

Let’s look at when the function technically stops. To start with this, you have to know that every number but 0, is true. Just looking at the function does not help a lot. There is not even a return value! For this, you have to be familiar with short-circuit operators. The following means that f(sleep(7))will only be executed, if --iis true. This is always the case, except for i == 0. This means that f(sleep(7)) will be run until i is 0:

--i && f(sleep(7));

So, when will i be 0? Well, i is a signed integer, which has a size of 32 bits. That means that i has a minimum value of -(2^31), which is -2147483648, and a maximum value of 2^31, thus 2147483648. We will go from -1 to -2147483648 and then we will flip to 2147483648. We will see 2^32 = 4294967296 values for i. -1though, because 0 is not an option. 0 would mean that f(sleep(7)) would not be run, and that’s the end of our f().

So, sleep(7) will be executed 4294967295 times. Let’s calculate the amount of years that this will run.

The amount of seconds: 4294967295 * 7 = 30064771065
The amount of hours: 30064771065 / 3600 = 8351325.3
The amount of days: 8351325.3 / 24 = 347971.9
The amount of years: 347971.9 / 365.25 = 952.7

So this code will indeed run for a thousand years, with a margin of ten percent.

With this, we have proven that this beautiful line of code is correct!