
Present Perfect


C

Filed under: Hacking — Thomas @ 11:45 pm

2009-5-11
11:45 pm

Somehow I spent all this time on this planet without ever learning that in C,
numbers[4] == 4[numbers]

Arrays are just simply syntactic sugar.

I feel robbed of my childhood!

Come on, go ahead and tell me how you realized this when reading K&R back when you were 6 and writing BASIC interpreters in assembler for fun!

37 Comments »

  1. I sincerely would have preferred never knowing, besides reading it in the textbook I first learnt C/C++ from.

    The reason for that is that it became _very_ clear when I was forced to study 8086 (yes, not x86, 8086) assembler in school.
    Together with the fact that the correct order of the 16-bit registers is AX CX DX BX, (and not the “natural” AX BX CX DX).

    Comment by Diego E. "Flameeyes" Petteno` — 2009-5-11 @ 11:59 pm

  2. I also did not know this

    Comment by Jan Schmidt — 2009-5-12 @ 12:24 am

  3. News to me too!

    Comment by Chris Lord — 2009-5-12 @ 12:41 am

  4. Something else very cool: "string literal"[i] gives the i-th char of the string literal.

    for (i = 0; i < n; i++) {
        printf("\rWorking… %c", "-\\|/"[i & 3]);
        /* …Do work… */
    }

    Comment by Luke Hutchison — 2009-5-12 @ 12:49 am

  5. that’s because i[t] == *(i+t) == *(t+i) == t[i]

    Comment by Andrés — 2009-5-12 @ 12:52 am

  6. Nope, I only found it out a few weeks ago when I walked into a room and a similar example was on the board.

    Comment by Nicholas Riley — 2009-5-12 @ 12:58 am

  7. noob!

    Comment by nerds — 2009-5-12 @ 1:11 am

  8. yep, http://www.fredosaurus.com/notes-cpp/arrayptr/arraysaspointers2.html

    Comment by teki — 2009-5-12 @ 1:36 am

  9. Never seen that in my life.

    Comment by Darwin Survivor — 2009-5-12 @ 2:04 am

  10. Not exactly when I was 6.. he he..
    Anyways, the C standard says: E1[E2] is equivalent to *(E1 + E2). And addition being commutative, it is equivalent to *(E2 + E1) and hence E2[E1].
    Another interesting example: "hello"[2] == 2["hello"]

    Comment by Syam — 2009-5-12 @ 2:18 am

  11. some_type a[];

    This holds because a[b] is equivalent to *(a + b) — I’d always assumed it would be equivalent to *(a + sizeof(some_type) * b) to take into account the size of the array elements. But a friend just enlightened me: adding a pointer to an integer in C automatically takes into account (multiplies the integer by) the size of the type of the pointer, so it doesn’t matter where that pointer occurs in the expression *(a + b) or *(b + a) — the result will be the same.

    Comment by Jeff — 2009-5-12 @ 2:21 am

  12. In the 3rd post, E1 and E2 are ‘expressions’, of course!

    Comment by Syam — 2009-5-12 @ 2:21 am

  13. Someone explain it! Please!

    Comment by anonymous — 2009-5-12 @ 2:48 am

  14. I knew that arrays were just syntactic sugar, and that numbers[4] == *(numbers + 4). But that’s a rather clever corollary I’d never realized.

    Comment by Shaun — 2009-5-12 @ 3:21 am

  15. “Arrays are just simply syntactic sugar.”

    Yes.
    Guru meditation:
    *(&numbers + 4)

    Comment by James Thiele — 2009-5-12 @ 3:34 am

  16. I like "abcd"[5]==5["abcd"] better, for some reason.

    It always made me sad that strings are special… you can’t write {1,2,3,4}[i] in plain C. ((int[]){1,2,3,4})[i] is possible in C99 but pretty darn ugly.

    Comment by ephemient — 2009-5-12 @ 4:30 am

  17. Any idea why the reversed array notation is actually accepted?

    I know that arrays are just syntactic sugar for pointers. Was that what you meant by your comment?
    eg. numbers[4] == *(numbers + 4)

    But I have never seen them written in that reversed form. Do you have any link or reference for more information?

    Comment by molasses — 2009-5-12 @ 5:00 am

  18. I didn’t know about that behaviour either :|

    Comment by The Geek Inside — 2009-5-12 @ 5:05 am

  19. K&R says under 5.3 Pointers and Arrays: “In evaluating a[i], C converts it to *(a+i) immediately; the two forms are equivalent.” Though, it’s open to interpretation, since “a” was previously defined as an array in that section, so it’s not clear what happens if it’s NOT an array.

    Comment by behdad — 2009-5-12 @ 5:28 am

  20. GCC’s treatment of this is interesting:

    int a[] = { 0, 11, 22, 33, 44, 55, 66, 77, 88, 99 };

    == Example 1 ==

    printf("%d, %d\n", a[4], 4[a]);

    This is fine; prints "44, 44".

    == Example 2 ==

    printf("%d\n", 4[4]);

    GCC: “error: subscripted value is neither array nor pointer”

    Expected failure, of course, but this suggests to me that 4[a] should not be legal. Is this just a slightly misleading error message, or does GCC simply consider 4 to be the subscript and a the subscripted value in 4[a], despite how it’s written?

    == Example 3 ==

    printf("%d\n", a[a]);

    GCC: “error: array subscript is not an integer”

    Expected failure once again, but this raises essentially the same question as the example immediately above.

    Comment by Jeff — 2009-5-12 @ 6:39 am

  21. Of course, somewhere the size of the datatype comes in for the pointer addition. You can only add integers to pointers, not add pointers. There is more typing in there than most of the comments want to make you believe.

    Comment by . — 2009-5-12 @ 8:53 am

  22. [...] wait, wut? http://thomas.apestaart.org/log/?p=887 [...]

    Pingback by shivan's status on Tuesday, 12-May-09 08:31:40 UTC - Identi.ca — 2009-5-12 @ 9:32 am

  23. C is just ANSI assembler, right. ;-) This construction just reminds you of that.

    Comment by Philip Paeps — 2009-5-12 @ 9:40 am

  24. The first step to enlightenment is to understand that arrays and pointers are pretty similar (as arguments to functions show). I never thought of something like your example, which is surprising and interesting (and if I ever find it in some of our code, someone will die).

    The second step of enlightenment is to understand that, no, arrays and pointers aren’t the same thing, as I discovered just a few months ago when playing around with objcopy-embedded data:
    http://sourceware.org/ml/binutils/2002-02/msg00846.html

    On a tangential note, I am surprised and disappointed by the number of people who think they HAVE to use a typedef to use struct/union/enum types.

    Comment by Jean de Largentaye — 2009-5-12 @ 12:38 pm

  25. Head over to http://ioccc.org/ and you’ll discover many more amazing obscure corners of the C language. (Only it seems to be down at the moment.)

    Also, an Apache rewrite rule to fix Planet GNOME links to your blog posts so they actually show up with some text in them without me manually removing the trailing slash in the address bar would be nice.

    Comment by Marius Gedminas — 2009-5-12 @ 1:56 pm

  26. Didn’t know this but then again I just think C is the biggest hack ever written that was successful.

    It is not even syntactic sugar for assembler, because C is still ugly as hell.

    Comment by mark — 2009-5-12 @ 3:23 pm

  27. Sorry for the off-topic comment, but it seems the blog breaks not because of the trailing slash but because of the Referer header being present. If you just click into the address field in your browser and press Enter, it loads correctly whether or not the slash is there.

    Comment by Patryk (Patrys) Zawadzki — 2009-5-12 @ 3:59 pm

  28. fortune(6) occasionally drops this gem for me:

    “Hey, Thompson, how can I make C’s syntax even more obfuscated and difficult to understand?”
    “How about you allow 5[var] to mean the same as var[5]?”
    “Wow; unnecessary and confusing syntactic idiocy! Thanks!”
    “You’re welcome, Dennis.”

    I thought it quite appropriate for your blog entry. :-)

    Comment by jmd — 2009-5-12 @ 4:04 pm

  29. See page 210 section 7.4.1 of “C A Reference Manual” 5th ed. By Harbison and Steele. It explains why this notation is valid.

    Comment by Bob D. — 2009-5-12 @ 4:13 pm

  30. Try http://www.ioccc.org/ (Yes, the www is required), which is also where I learned about that misfeature.

    Comment by Eric — 2009-5-12 @ 7:28 pm

  31. Trigraphs are known by a lot of people, but digraphs are not:

    #include <stdio.h>
    int main(void) {
        int a<:42];
        0<:a:> = 9;
        printf("%i\n", *a);
    }

    I could not even believe it the first time I heard of it. I thought the person telling me was joking, with a very weird and hard-to-understand sense of humour :p

    Comment by xilun — 2009-5-12 @ 11:38 pm

  32. Oops, paired < and > were filtered out.
    On the fourth line one must of course read:
    0<:a:> = 9;

    Also a little ref : C99 6.4.6 Punctuators, §3

    Comment by xilun — 2009-5-12 @ 11:43 pm

  33. /* addition without + */
    int add(int a, int b) {
        return (int)&a[&((char*)0)[b]];
    }

    Comment by someguy — 2009-5-13 @ 1:18 am

  34. xilun: That is hideous!

    void functionA(void)
    <%
        char a<::> = "Hi";
        printf("I'm a function, and I say %c%c!\n", a<:0:>, a<:1:>);
    %>

    Thank god this syntax isn’t used often.

    Comment by Henry — 2009-5-13 @ 3:07 am

  35. “[...] realized this when reading K&R back when you were 6 and writing BASIC interpreters in assembler for fun!”

    No, I /read/ it in K&R (1st ed) back when I was 16. And it was Forth interpreters.

    Did you know that (*p)->xxx aka (*(*p)).xxx could be written as p[0]->xxx?
    Very handy on a 68K Mac.

    Comment by Peter Lund — 2009-5-13 @ 12:56 pm

  36. A historical note:

    The reason for this, as I understand it, is that the original K&R C compiler did not actually feature a typechecker (or if it did, I can’t find it). All it featured was a mechanism to propagate variable sizes (and size-of-thing-pointed-to) so that it could handle the difference between char and int. Thus, by the time that it analyzed a[b], there was no way to tell the difference.

    Now, I could be completely off base, because it’s been a few weeks since I’ve looked at the K&R C compiler source, and it’s rather obtuse, but in my experience writing a K&R compiler*, there is no need for typechecking.

    * The reason I’ve been looking at this is to compile Unix v6 or v7 to x86, so it can run on a modern machine. And that code is so non-portable that it won’t work with a standard compiler, even after updating it to use ANSI syntax.

    Comment by TheQuux — 2009-5-14 @ 8:51 am

  37. meh. I’d fire anyone that intentionally used this.

    Comment by Doug — 2009-5-18 @ 2:17 am
