
Re: RFC: Optimizing for 386

Sean Middleditch wrote:
On Wed, 2005-01-19 at 10:26 -0600, Joseph D. Wagner wrote:

They're optimized for pentium4, but use the i386 instruction set.

There are instructions which have come out since the 386, like MMX, that could improve the performance of programs. In this case, graphics programs. Why should my graphics programs suffer because some fool is running a 20 year old computer?

File bugs for the apps that you *know* could use the optimization.  Make
sure the code actually can use those newer instructions and that the
compiler actually does any auto-vectorization for the code in question.
Provide benchmarks proving that performance increases instead of simply
blind and likely incorrect assumptions.

Benchmarking is important in this case. The new instructions only help in very specific cases, where the code is CPU bound and the operations fit the style of parallel processing those instructions support. In some cases performance could even get worse because of the overhead of packing and unpacking data for the MMX/SSE/3DNow! registers. Also, on P4 processors the SSE hardware runs at only half the speed of the integer units, so the gains are not going to be as big on P4.
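As a rough illustration of the kind of measurement meant here, a minimal timing harness might look like the sketch below. It is written in Python only to keep it short; the names best_time, speedup, generic_sum and tuned_sum are made up for this example, and the two sum functions are just stand-ins for a generic and a tuned build of the same routine.

```python
import time

def best_time(fn, repeats=5):
    """Return the fastest wall-clock time (seconds) over several runs of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best

def speedup(baseline_fn, candidate_fn, repeats=5):
    """Ratio above 1.0 means the candidate ran faster than the baseline."""
    base = best_time(baseline_fn, repeats)
    cand = best_time(candidate_fn, repeats)
    return base / cand if cand > 0 else float("inf")

# Hypothetical stand-ins for a "generic" and a "tuned" version of one routine.
def generic_sum():
    total = 0
    for value in range(100_000):
        total += value
    return total

def tuned_sum():
    return sum(range(100_000))

print(f"speedup: {speedup(generic_sum, tuned_sum):.2f}x")
```

Taking the best of several runs reduces noise from other activity on the machine. The ratio is only meaningful if both versions produce the same result and are measured on the hardware you actually care about.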

OProfile could be used to identify areas of code that are CPU bound. Some analysis could then be done to determine whether those sections of code could benefit from the special instructions. However, improvements from changing instructions are not going to give the order-of-magnitude gains that algorithm changes can. Instruction changes are also very processor specific, while improvements in the algorithms will work on virtually everything.
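To make the algorithm point concrete, here is a small sketch (Python, with made-up function names) that counts the work done by two ways of checking a list for duplicates. It counts operations rather than measuring time, so the numbers do not depend on the processor.

```python
def has_duplicate_quadratic(items):
    """Compare every pair: n*(n-1)/2 comparisons when there is no duplicate."""
    comparisons = 0
    n = len(items)
    for i in range(n):
        for j in range(i + 1, n):
            comparisons += 1
            if items[i] == items[j]:
                return True, comparisons
    return False, comparisons

def has_duplicate_linear(items):
    """One set lookup per element: at most n lookups."""
    seen = set()
    lookups = 0
    for item in items:
        lookups += 1
        if item in seen:
            return True, lookups
        seen.add(item)
    return False, lookups

data = list(range(1000))  # worst case for both: no duplicates at all
_, slow_ops = has_duplicate_quadratic(data)
_, fast_ops = has_duplicate_linear(data)
print(slow_ops, fast_ops)  # 499500 1000
```

Even if newer instructions made each comparison in the pairwise version, say, 4 times cheaper (an assumed figure, purely for illustration), it would still do far more work than the set-based version, and the gap widens as the input grows.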

Blindly compiling all the RPMs to use the latest instruction set isn't going to magically improve the performance of the code. Lots of other things besides processor instructions affect performance, like the memory hierarchy, interprocessor communication, and algorithms.

