Re: Jakub's Recommendations for ia32 Support

On Tuesday, 03 February 2009 at 21:01, Ulrich Drepper wrote:
> Dominik 'Rathann' Mierzejewski wrote:
> > I'd like to see a case (not involving Pentium 4) where using cmov is slower
> > than not using it. It definitely is faster for decoding H.264 in FFmpeg
> > for example.
> I don't have a specific test case.  But I do talk to the CPU
> architectures at Intel regularly.

I didn't know architectures could talk. ;)

> They always say that cmov should be
> avoided.  Especially with the introduction of fused micro-ops, the
> various cmp+jcc pairs are likely to be faster.
> And from the code generation perspective using cmp+jcc is also more
> flexible.  With cmov you have to tie up two registers.  This is
> particularly bad with the x86 ABI.
> There are certainly cases where cmov can be faster.  Perhaps exclusively
> on older microarchitectures (P4s, early Core2, maybe AMD, haven't
> checked).  But in general it's no win.

Well, I talk to people who write hand-optimized assembly and care about
squeezing every cycle out of various CPUs, and they say it's definitely
a win. So please, show me some code instead of hand-waving.
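
To make that concrete, here is a minimal sketch of the kind of test case
that could settle it. The function names are made up for illustration, and
this is an assumption about what a fair comparison looks like, not code
taken from FFmpeg or anywhere else. The first loop is written with an
explicit branch, the second with a select that the compiler is free to
turn into cmov; whether it actually does so depends on the compiler and
flags, so the generated assembly has to be checked before timing anything.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical branchy variant: usually compiled to cmp + jcc. */
int64_t sum_above_branch(const int32_t *v, size_t n, int32_t threshold)
{
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (v[i] > threshold)
            sum += v[i];
    }
    return sum;
}

/* Hypothetical select variant: the ternary lets the compiler use cmov,
 * but it is not guaranteed to do so. */
int64_t sum_above_select(const int32_t *v, size_t n, int32_t threshold)
{
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t add = (v[i] > threshold) ? v[i] : 0;
        sum += add;
    }
    return sum;
}
```

Build with something like "gcc -O2 -S" and look at the output to confirm
which instructions were emitted, then time both functions on sorted input
(branches predict well) and on random input (they do not). I would expect
the ranking to change between the two, which is why a single number
without the input pattern and the CPU model says very little.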


