Re: Performance again
- From: Ed Hall <edhall terminus ayched com>
- To: Herwin Jan Steehouwer <herwinjs palet nl>
- Cc: axp-list redhat com
- Subject: Re: Performance again
- Date: Sun, 03 Jan 1999 12:22:34 -0800
> On Sun, 3 Jan 1999, geerten kuiper wrote:
> > Herwin Jan,
> > - David Mosberger once wrote an article on optimizing code for Alpha. You
> > may be able to find it somewhere on the web. Cache optimization is the main
> > thing needed to realize the Alpha's full potential.
> I cannot find it ;-( Also, the e-mail address I found for him was not working ;(
It's at "http://www.mostang.com/~davidm/papers/expo97/paper.ps.gz". I've
obtained 2:1 speedups using the techniques in his paper.
> > - On some older Alpha boxes, memory bandwidth limits performance. On your XL
> > this should be less of an issue (is it 128 or 256 bits wide?)
>
> 366Xl is 128 bits wide !
Memory bandwidth usually isn't as much the issue as memory latency. The
Mosberger paper presents some techniques for avoiding processor stalls due
to memory latency that work quite well.
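As a generic illustration of why access order matters for latency (a minimal sketch, not taken from the Mosberger paper; the array name `grid`, the size `N`, and the function names are all hypothetical), compare two ways of summing the same two-dimensional array:

```c
#include <stddef.h>

#define N 512  /* hypothetical dimension, chosen only for illustration */

static double grid[N][N];

/* Fill the grid with small integer values so both sums are exact. */
void fill_grid(void)
{
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            grid[i][j] = (double)((i + j) % 7);
}

/* Column-first walk: consecutive loads are N * sizeof(double) bytes
 * apart, so nearly every access lands on a different cache line. */
double sum_column_first(void)
{
    double total = 0.0;
    for (size_t j = 0; j < N; j++)
        for (size_t i = 0; i < N; i++)
            total += grid[i][j];
    return total;
}

/* Row-first walk: consecutive loads are adjacent in memory, so each
 * cache line fetched from memory is fully used before moving on. */
double sum_row_first(void)
{
    double total = 0.0;
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            total += grid[i][j];
    return total;
}
```

Both functions return the same value; only the order of memory accesses differs. How large the timing gap is depends on the cache sizes and memory system of the machine, so it has to be measured rather than assumed.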
> A test program (see attachment) will give this output! How can I speed
> this up?? ( gcc -o preftest preftest.c ) NO optimizing for this test!
No optimization? That, IMHO, is silly! Just about any advanced processor
can benefit from compiler optimizations, some more than others. This is
especially true of highly pipelined multi-issue CPUs like the Alpha.
Concerning your benchmark: you are performing an unusual number of divisions.
The Alpha is especially weak on integer division, while its floating-point
division is only so-so. Avoiding divisions when possible (e.g. multiplying
by the reciprocal, such as "val1 *= val2/(val2*val2 + val3)" instead of
"val1 /= (val2*val2 + val3)/val2", which replaces two divisions with one)
is a win on just about any architecture, including Pentium, but it is
especially desirable on Alpha.
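The rewrite quoted above can be sketched in C as follows. This is a minimal illustration, not code from the attached benchmark: the function names are hypothetical, and `scale_all` adds a related, assumed case where the divisor is loop-invariant and the division can be hoisted out of the loop.

```c
#include <assert.h>

/* Original form: two divisions per update. */
double update_two_div(double val1, double val2, double val3)
{
    val1 /= (val2 * val2 + val3) / val2;
    return val1;
}

/* Rewritten form: one division, then a multiplication. */
double update_one_div(double val1, double val2, double val3)
{
    val1 *= val2 / (val2 * val2 + val3);
    return val1;
}

/* Loop-invariant divisor: divide once, multiply n times. */
void scale_all(double *data, int n, double divisor)
{
    double inv = 1.0 / divisor;
    for (int i = 0; i < n; i++)
        data[i] *= inv;
}

/* Sanity check that the two forms agree to within rounding error. */
void self_check(void)
{
    double a = update_two_div(10.0, 2.0, 1.0);
    double b = update_one_div(10.0, 2.0, 1.0);
    double diff = a - b;
    if (diff < 0.0)
        diff = -diff;
    assert(diff < 1e-12);
}
```

The two forms are algebraically identical but can differ in the last bit of a floating-point result, which is why the check uses a tolerance rather than exact equality. For integer arithmetic a reciprocal is not directly available, so the same idea does not carry over unchanged.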
Just for fun, I ran your benchmark (unaltered and unoptimized) on a 533MHz
LX164 system:
time perftest i
2.77
time perftest f
0.93
time perftest d
1.28
Not too bad, really. (The P2-300's were just coming out when I bought this
system, BTW, and only cost about 20% less.)
-Ed Hall
edhall@ayched.com