Re: SV: suggestions for udb
- From: Shane Sturrock <sss nova bru ed ac uk>
- To: axp-list redhat com
- Subject: Re: SV: suggestions for udb
- Date: Fri, 10 Dec 1999 08:31:33 +0000 (GMT)
On Thu, 9 Dec 1999, Wes Bauske wrote:
> Interesting. Similar to what graphics folks do. Given
Yup. I am rather hoping the chip manufacturers will see fit to add more
suitable instructions to their processors in the future, particularly as
128 bit and wider processors appear.
> that sort of implementation, the Intel should actually
> run very fast since it has those MMX instructions for
> just that sort of problem. It would be similar to what
> the paper talks about for the I860. You could build a
> simple asm macro instead of what the authors do for the
> i860.
The speedups from 64 bit integer microparallelism were way above what
FP/MMX type instructions gained, though. Those were typically in the 20%
bracket, whereas 64 bit integer microparallelism sees 700% in our case.
> Not sure if I'd bother though unless it was really
> fast and #ifdef'd with equivalent C.
No, the advantage of our code is that it is perfectly normal and legal C,
so it compiles on any 64 bit architecture. Very important.
> Still, I see no reason why one couldn't put two 15 bit words
> in a 32 bit word on an intel and improve its performance
> as is done with 4 15 bit words on Alpha. Also, one could
> implement the FP version of the algorithm and cut the
> Alpha's lead that way too.
In order to do the folding and extraction of words you end up roughly
doubling the number of instructions. This means that 15 bit words on a 32
bit architecture would be at best the same speed as doing plain 32 bit
work, assuming a single integer pipeline: two words per operation, but
twice the instructions. Now the latest x86s have more than one pipeline,
but the lack of registers would most likely negate any gain. My experience
with normal 32 bit code suggests that the Alpha is 2x as fast clock for
clock as an x86 (PII core). PIII and Athlon may make a little difference,
but not enough to bring them to parity with a 164 Alpha for our work. So
it might be possible to get an Intel to go twice as quick as it currently
does by microparallelism, but that would only bring it close to the
non-microparallel code on Alpha, so we don't consider it worth the effort.
64 bit processors give you a much bigger bang for the buck.
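To make the folding and extraction overhead concrete, here is a minimal
sketch in plain C. It is not our production code: the names fold4 and
extract, and the layout of four 16 bit fields (15 value bits plus one spare
bit each) in a uint64_t, are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define FIELD_BITS 16        /* assumed layout: 15 value bits + 1 spare bit */
#define VALUE_MASK 0x7FFFu   /* low 15 bits of a field */

/* Fold four 15 bit values into one 64 bit word (hypothetical helper). */
static uint64_t fold4(const uint16_t v[4])
{
    uint64_t w = 0;
    for (int i = 0; i < 4; i++) {
        assert(v[i] <= VALUE_MASK);   /* each value must fit in 15 bits */
        w |= (uint64_t)(v[i] & VALUE_MASK) << (i * FIELD_BITS);
    }
    return w;
}

/* Extract field i (0..3) from a folded word (hypothetical helper). */
static uint16_t extract(uint64_t w, int i)
{
    return (uint16_t)((w >> (i * FIELD_BITS)) & VALUE_MASK);
}
```

Every value costs a shift and a mask on the way in and again on the way
out, which is where the extra instructions come from. With only two fields
per 32 bit word, that overhead eats the gain.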
> One question though. Why are you comparing only 15 bits
> at a time? Is this something intrinsic to the problem?
> Why not directly use 32 or 64 bits??
You need to have a mask bit in each word, which lets you identify where
the word ends and detect when it overflows. If you don't, well, it can
get tricky :-)
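As a rough illustration of what the spare bit buys you, here is a sketch
of a four-way add on packed 15 bit fields. The saturating add is only an
example operation, and the names GUARD and add4_sat are made up for this
sketch. It assumes 16 bit lanes in a uint64_t with bit 15 of each lane
kept clear in the inputs.

```c
#include <assert.h>
#include <stdint.h>

#define GUARD 0x8000800080008000ULL   /* bit 15 of each 16 bit lane */

/* Add four packed 15 bit fields with one 64 bit add (hypothetical helper).
   On return, *overflow has the guard bit set for every lane whose sum did
   not fit in 15 bits; those lanes are clamped to 0x7FFF in the result. */
static uint64_t add4_sat(uint64_t a, uint64_t b, uint64_t *overflow)
{
    assert(((a | b) & GUARD) == 0);   /* inputs must have guard bits clear */

    /* Each lane sum is at most 0xFFFE, so a carry lands in the lane's own
       guard bit and never spills into the neighbouring lane. */
    uint64_t sum = a + b;
    uint64_t ov  = sum & GUARD;       /* which lanes overflowed */

    /* Per overflowed lane: 0x8000 - 0x0001 = 0x7FFF; other lanes stay 0. */
    uint64_t sat = ov - (ov >> 15);

    *overflow = ov;
    return (sum & ~GUARD) | sat;
}
```

Without the spare bit, a carry out of one field would silently corrupt the
field above it, and there would be nothing left to test afterwards.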
Shane