Trouble understanding gcc's assembly output

While writing some C code, I decided to compile it to assembly and read it. I do this from time to time; it's an exercise that keeps me thinking about what the machine does every time I write an expression in C.

Anyway, I wrote these two lines in C:

asm(";move old_string[i] to new_string[x]");
new_string[x] = old_string[i];
asm(";shift old_string[i+1] into new_string[x]");
new_string[x] |= old_string[i + 1] << 8;

      

(old_string is an array of char, and new_string is an array of unsigned short, so given two characters 42 and 43, this would put 4342 in new_string[x].)

This produced the following output:

#move old_string[i] to new_string[x]

movl    -20(%ebp), %esi         #put address of first char of old_string in esi
movsbw  (%edi,%esi),%dx         #put first char into dx
movw    %dx, (%ecx,%ebx,2)      #put first char into new_string

#shift old_string[i+1] into new_string[x]

movsbl  1(%esi,%edi),%eax       #put old_string[i+1] into eax
sall    $8, %eax                #shift it left by 8 bits
orl     %edx, %eax              #or edx into it
movw    %ax, (%ecx,%ebx,2)      #?

      

(The comments are my own, added so I could follow what happens.) I compiled it with -O3 so I could also see how the compiler optimizes certain constructs. Anyway, I'm sure this is probably simple, but here's what I don't get:

The first section copies a char from old_string[i] and then moves it (from dx) to (%ecx,%ebx). The next section then copies old_string[i+1], shifts it, ORs it, and puts it in the same place from ax. Doesn't it put two 16-bit values in the same place? How would that work?

Also, it shifts old_string[i+1] into the top of the dword eax, then ORs edx (new_string[x]) into it ... then puts ax into memory! Wouldn't ax just contain what was already in new_string[x]? So it stores the same thing in the same place in memory twice?

Is there something I'm missing? Also, I'm fairly sure the rest of the compiled program isn't relevant to this snippet. I read the code before and after it to find where each array and variable is stored and what the register values would be on reaching this code, and I think this is the only part of the assembly that matters for these lines of C.

- Oh, and it turns out GNU assembler comments start with a #.



3 answers


Okay, so it was pretty simple after all. I figured it out with pen and paper, writing down every step it performed on each register and then writing down the contents of each register given an initial starting value...

What I worked out is that it uses 32-bit and 16-bit registers for 16-bit and 8-bit data types. This is what was going on, compared with what I thought:

  • The first value is put into memory as, say, 0001 (I thought 01).
  • The second value (02) is loaded into a 32-bit register (so it was 00000002; I thought 0002).
  • The second value is shifted left by 8 bits (00000200; I thought 0200).
  • The first value (00000001; I thought 0001) is OR'd into the second value (00000201; I thought 0201).
  • The 16-bit register is put into memory (0201; I thought only 01 again).

I didn't understand why it wrote to memory twice or why it used 32-bit registers (well, actually, my guess is that a 32-bit processor is faster with 32-bit values than with 8- and 16-bit values, but that is a completely uneducated guess), so I tried rewriting it:



movl -20(%ebp), %esi       #gets pointer to old_string
movsbw (%edi,%esi),%dx     #old_string[i] -> dx (0001)
movsbw 1(%edi,%esi),%ax    #old_string[i + 1] -> ax (0002)
salw $8, %ax               #shift ax left (0200)
orw %dx, %ax               #or dx into ax (0201)
movw %ax,(%ecx,%ebx,2)     #doesn't write to memory until end

      

It worked exactly the same.

I don't know whether this is an optimization or not (apart from removing one memory write, which obviously is), but if it is, I know it isn't really worth it and didn't gain me anything. Anyway, I get what this code does now. Thanks for the help.



I'm not sure what's not to understand, unless I'm missing something.

The first three instructions load a byte from old_string into dx and store it in your new_string.



The next four instructions take what is already in dx, combine old_string[i+1] with it, and store the result as a 16-bit value (ax) in new_string.



Also, it shifts old_string[i+1] into the top of the dword eax, then ORs edx (new_string[x]) into it ... then puts ax into memory! Wouldn't ax just contain what was already in new_string[x]? So it stores the same thing in the same place in memory twice?

Now you can see why optimizers are a good thing. Such redundant code shows up quite often in unoptimized, generated code, because the generated code comes more or less from templates that don't "know" what happened before or after.
