[Cluster-devel] Re: [PATCH] gfs2: better code for translating characters

H. Peter Anvin hpa at zytor.com
Mon Aug 13 05:51:53 UTC 2007


rae l wrote:
>>>
>> Only if the compiler is stupid.
> What? Did you know I really say? Could you tell a little more clear?
> 
> if the string in sdp->sd_table_name has many '/' chars, the latter
> algorithm will be better.
> but if there's no '/' char, one assignment will be wasted.
> 

You seem to have confused modern compiled C with an old BASIC interpreter.

Consider the code in question:

-       while ((table = strchr(sdp->sd_table_name, '/')))
+       table = sdp->sd_table_name;
+       while ((table = strchr(table, '/')))
  		*table = '_';

sdp->sd_table_name refers to a memory location, and its value has to be
loaded from memory into a register before it can be passed to the
strchr() function.  In the latter case, that register is simply what we
call "table"; since its old value is dead as soon as strchr() returns,
there is no reason for the compiler to preserve it across the call.
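
For reference, the two forms of the loop can be written as standalone C
functions.  This is only a sketch: the function names are made up for
illustration, and a plain char buffer stands in for sdp->sd_table_name.
Both produce the same string; they differ only in where each strchr()
call starts scanning.

```c
#include <string.h>

/* Hypothetical stand-in for the original loop: every strchr() call
 * starts again from the beginning of the buffer. */
void slash_to_underscore_rescan(char *name)
{
	char *table;

	while ((table = strchr(name, '/')))
		*table = '_';
}

/* Hypothetical stand-in for the patched loop: each strchr() call
 * resumes from the character that was just replaced. */
void slash_to_underscore_resume(char *name)
{
	char *table = name;

	while ((table = strchr(table, '/')))
		*table = '_';
}
```

In both functions the loop body is a single store, and strchr() is called
once per '/' found plus one final call that returns NULL.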

Consider x86-64 as an example:

	# Assume sdp is held in %r15 at this point, and assume
	# the offset of sd_table_name is 0x30.

	# First case
.L1:
	movq	0x30(%r15), %rdi	# First argument register
	movb	$'/', %sil	# Second argument register
	call	strchr
	testq	%rax, %rax	# Result register
	jz	.L2
	movb	$'_', (%rax)
	jmp	.L1
.L2:

	# Second case
	movq	0x30(%r15), %rdi
.L1:
	movb	$'/', %sil
	call	strchr
	testq	%rax, %rax
	jz	.L2
	movq	%rax, %rdi	# Resume the search from the last match
	movb	$'_', (%rax)
	jmp	.L1
.L2:


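As you can see, when the string contains no '/' at all, the instruction
sequence executed is exactly the same in both cases, whereas when it
does, we have replaced a memory load with a register-register copy on
each further pass through the loop.  On most architectures (x86-64,
Alpha and MIPS are the oddballs here) we wouldn't even need the copy,
since the return value comes back in the first argument register.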

	-hpa



