[Cluster-devel] Re: [PATCH] gfs2: better code for translating characters
H. Peter Anvin
hpa at zytor.com
Mon Aug 13 05:51:53 UTC 2007
rae l wrote:
>>>
>> Only if the compiler is stupid.
> What? Do you know what I am really saying? Could you explain a little more clearly?
>
> if the string in sdp->sd_table_name has many '/' chars, the latter
> algorithm will be better.
> but if there's no '/' char, one assignment will be wasted.
>
You seem to have confused modern compiled C with an old BASIC interpreter.
Consider the code in point:
-       while ((table = strchr(sdp->sd_table_name, '/')))
+       table = sdp->sd_table_name;
+       while ((table = strchr(table, '/')))
                *table = '_';
sdp->sd_table_name refers to a memory location, and will have to be
loaded from memory into a register before it can be passed to the
strchr() function. In the latter case, we call this register "table";
since the value is immediately killed after the function call, there is
no reason for the compiler to carry it across the call.
Consider x86-64 as an example:
        # Assume sdp is held in %r15 at this point, and assume
        # the offset of sd_table_name is 0x30.

        # First case
.L1:
        movq    0x30(%r15), %rdi        # First argument register
        movb    $'/', %sil              # Second argument register
        call    strchr
        testq   %rax, %rax              # Result register
        jz      .L2
        movb    $'_', (%rax)
        jmp     .L1
.L2:

        # Second case
        movq    0x30(%r15), %rdi
.L1:
        movb    $'/', %sil
        call    strchr
        testq   %rax, %rax
        jz      .L2
        movq    %rax, %rdi
        movb    $'_', (%rax)
        jmp     .L1
.L2:
As you can see, in the zero case (no '/' in the string), the instruction
sequence executed is exactly the same, whereas in the nonzero case, we
have replaced a memory load with a register-register copy. On most
architectures (x86-64, Alpha and MIPS are the oddballs here) we wouldn't
even need the copy, since the return value comes back in the same
register that carries the first argument.
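
For anyone who wants to convince themselves that the two loops in the
patch give the same result, here is a minimal sketch in plain C. The
struct sd_stub type, its 256-byte field size, and the three function
names are made up for illustration; they are not the real gfs2
definitions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the real structure: only the field
 * discussed in this thread is modelled, and its size is assumed. */
struct sd_stub {
	char sd_table_name[256];
};

/* Original loop: restart the search from the head of the string
 * on every iteration. */
void fix_slashes_reload(struct sd_stub *sdp)
{
	char *table;

	while ((table = strchr(sdp->sd_table_name, '/')))
		*table = '_';
}

/* Patched loop: resume the search from the last match. */
void fix_slashes_resume(struct sd_stub *sdp)
{
	char *table = sdp->sd_table_name;

	while ((table = strchr(table, '/')))
		*table = '_';
}

/* Returns 1 if both loops turn `name` into the same string. */
int variants_agree(const char *name)
{
	struct sd_stub a, b;

	assert(name != NULL && strlen(name) < sizeof(a.sd_table_name));
	strcpy(a.sd_table_name, name);
	strcpy(b.sd_table_name, name);
	fix_slashes_reload(&a);
	fix_slashes_resume(&b);
	return strcmp(a.sd_table_name, b.sd_table_name) == 0;
}
```

Calling variants_agree() on a name with no '/', one '/', or several
should return 1 every time. The only behavioural difference is how much
of the string strchr() has to rescan, not what ends up in the buffer.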
-hpa