A Quick Look behind a "for loop"

On a modern processor, such as x86_64 or ARM, what goes on behind a simple ‘for’ loop is remarkably complicated. Here’s a quick example I knocked up to demonstrate.
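The source would look something like the sketch below, reconstructed from the function names and the line and column information in the assembly further down. The "%d\n" format string and the exact layout are assumptions; the assembly only shows that printf is handed a string constant and the value of i.

```c
#include <stdio.h>

/* Forward declaration: loop_body is defined after the loop that calls it. */
void loop_body(int i);

void example_for_loop(void)
{
	for (int i = 0; i < 100; i++)
	{
		loop_body(i);
	}
}

/* The format string is an assumption; the assembly only shows that
   printf receives a string constant (L_.str) and the value of i. */
void loop_body(int i)
{
	printf("%d\n", i);
}
```

Calling example_for_loop() from main would run the body once for each value of i from 0 to 99.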

I pulled the loop body out into a second function to stop the compiler from optimizing the empty loop into nothing, and so that the loop itself is separate from the work it does. When the example is compiled for x86_64, you get this:


_example_for_loop: ## @example_for_loop
Lfunc_begin0:
 .file 2 "/Users/grahamcox/Projects/test" "/Users/grahamcox/Projects/test/test/forloop.c"
 .loc 2 13 0 ## /Users/grahamcox/Projects/test/test/forloop.c:13:0
 .cfi_startproc
## %bb.0:
 pushq %rbp
 .cfi_def_cfa_offset 16
 .cfi_offset %rbp, -16
 movq %rsp, %rbp
 .cfi_def_cfa_register %rbp
 pushq %r14
 pushq %rbx
 .cfi_offset %rbx, -32
 .cfi_offset %r14, -24
 leaq L_.str(%rip), %r14
 xorl %ebx, %ebx
Ltmp0:
 ##DEBUG_VALUE: i <- 0
 .p2align 4, 0x90
LBB0_1: ## =>This Inner Loop Header: Depth=1
 ##DEBUG_VALUE: loop_body:i <- %ebx
 ##DEBUG_VALUE: i <- %ebx
 .loc 2 24 2 prologue_end ## /Users/grahamcox/Projects/test/test/forloop.c:24:2
 xorl %eax, %eax
 movq %r14, %rdi
 movl %ebx, %esi
 callq _printf
Ltmp1:
 .loc 2 14 27 ## /Users/grahamcox/Projects/test/test/forloop.c:14:27
 incl %ebx
Ltmp2:
 ##DEBUG_VALUE: i <- %ebx
 .loc 2 14 20 is_stmt 0 ## /Users/grahamcox/Projects/test/test/forloop.c:14:20
 cmpl $100, %ebx
Ltmp3:
 .loc 2 14 2 ## /Users/grahamcox/Projects/test/test/forloop.c:14:2
 jne LBB0_1
Ltmp4:
## %bb.2:
 .loc 2 18 1 is_stmt 1 ## /Users/grahamcox/Projects/test/test/forloop.c:18:1
 popq %rbx
 popq %r14
 popq %rbp
 retq
Ltmp5:
Lfunc_end0:
 .cfi_endproc

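This was compiled with optimization at its maximum setting, which makes the code as short and efficient as the compiler knows how. Note, for example, that the call out to loop_body has been inlined: the loop starting at LBB0_1 calls _printf directly, and the ‘DEBUG_VALUE: loop_body:i’ comment is all that remains of the separate function.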

Some of the listing is just debugging information, such as the various ‘.loc’ lines and the DEBUG_VALUE comments, so it is not part of the executable code itself. The ‘.loc’ lines do, however, tell you which file, line and column of the source the instructions that follow were derived from.
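With the directives stripped away, the optimized loop reads back into C roughly as follows. This is a hand-written sketch, not compiler output: the name example_for_loop_lowered, the "%d\n" format string and the returned counter are assumptions added so the result can be checked.

```c
#include <stdio.h>

/* Hand-written C approximation of the optimized loop in the listing.
   The function name, format string and return value are assumptions. */
int example_for_loop_lowered(void)
{
	const char *fmt = "%d\n";   /* leaq L_.str(%rip), %r14 */
	int i = 0;                  /* xorl %ebx, %ebx */
	do {
		printf(fmt, i);         /* inlined loop_body: callq _printf */
		i++;                    /* incl %ebx */
	} while (i != 100);         /* cmpl $100, %ebx ; jne LBB0_1 */
	return i;                   /* final counter value, for checking only */
}
```

The test has moved to the bottom of the loop and become ‘not equal to 100’. Since i starts at 0, the first iteration always runs, so the check before entering the loop can be dropped.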

Not all machine code is this difficult. Back in the day when the 6502 was popular, a ‘for’ loop like this could be written in 4 or 5 instructions. This was just as well, given that 1 MHz counted as ‘fast’.


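        LDA #100        ; A = 100, the number of iterations
        TAX             ; copy A into X, which serves as the loop counter
LOOP:   JSR LOOP_BODY   ; call the loop body (assumed to leave X unchanged)
        DEX             ; decrement X; this also sets the zero flag
        CPX #0          ; explicit compare with 0 (redundant after DEX, kept for clarity)
        BNE LOOP        ; branch back until X reaches 0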

(N.B. this might not be genuine 6502 assembly code, but it follows the general idea.) Hope it’s useful.
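For comparison, the same countdown idea in C might look like this. It is only a sketch: the names countdown_loop and count_call are hypothetical, the call counter stands in for whatever LOOP_BODY would do, and the 8-bit X register is modeled with an unsigned char.

```c
/* Hypothetical stand-in for LOOP_BODY: just counts how often it runs. */
static void count_call(int *calls)
{
	(*calls)++;
}

/* Counts down from 100 to 0 and returns how many times the body ran. */
int countdown_loop(void)
{
	int calls = 0;
	unsigned char x = 100;      /* load the counter register with 100 */
	do {
		count_call(&calls);     /* JSR LOOP_BODY */
		x--;                    /* DEX */
	} while (x != 0);           /* BNE LOOP: repeat until X reaches zero */
	return calls;
}
```

Counting down to zero lets the decrement double as the loop test, which is why so few instructions are needed.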
