
XP1000 stream result



Summary: It's nearly as good as the 264DP motherboard. egcs gets only
about 50% of Digital Fortran's performance. The wh64 instruction accounts
for only part of that difference. This machine has only 1 bank of memory,
so that isn't a factor either.

I forget how to print which version of Digital Fortran I have. Ah well.

Anyway:

(digital) f77 -O5 -tune ev6 foo.f:

Function     Rate (MB/s)  RMS time   Min time  Max time
Copy:        896.7368      0.0358      0.0357      0.0364
Scale:       879.5568      0.0364      0.0364      0.0365
Add:         892.0275      0.0538      0.0538      0.0539
Triad:       888.3963      0.0541      0.0540      0.0541

(digital) f77 -O5 -tune ev56 foo.f: [shouldn't generate the wh64 instruction]

Function     Rate (MB/s)  RMS time   Min time  Max time
Copy:        725.3111      0.0444      0.0441      0.0463
Scale:       675.5626      0.0474      0.0474      0.0475
Add:         736.6600      0.0652      0.0652      0.0653
Triad:       692.9703      0.0693      0.0693      0.0694

egcs-1.1.2 -O foo.f:

Function     Rate (MB/s)  RMS time   Min time  Max time
Copy:        364.0914      0.0887      0.0879      0.0898
Scale:       360.0898      0.0889      0.0889      0.0889
Add:         413.0420      0.1170      0.1162      0.1172
Triad:       409.6017      0.1181      0.1172      0.1182

egcs-1.1.2 -O9 foo.f:

Function     Rate (MB/s)  RMS time   Min time  Max time
Copy:        555.3898      0.0587      0.0576      0.0596
Scale:       461.5275      0.0696      0.0693      0.0703
Add:         416.5441      0.1155      0.1152      0.1162
Triad:       423.7250      0.1134      0.1133      0.1143
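
To put numbers on the comparison, here is a minimal Python sketch that divides each build's rate by the ev6-tuned Digital Fortran rate. The rates are copied from the tables above; the function names are made up for illustration. The second function assumes that the whole gap between the ev6 and ev56 tunings comes from wh64, which is only a guess based on the flag, not something measured here.

```python
# Rates in MB/s, copied from the four tables above.
KERNELS = ("Copy", "Scale", "Add", "Triad")
RATES = {
    "f77 -tune ev6":  (896.7368, 879.5568, 892.0275, 888.3963),
    "f77 -tune ev56": (725.3111, 675.5626, 736.6600, 692.9703),
    "egcs -O":        (364.0914, 360.0898, 413.0420, 409.6017),
    "egcs -O9":       (555.3898, 461.5275, 416.5441, 423.7250),
}
BASELINE = "f77 -tune ev6"


def fraction_of(config, baseline=BASELINE):
    """Per-kernel rate of `config` as a fraction of `baseline`."""
    return {k: r / b
            for k, r, b in zip(KERNELS, RATES[config], RATES[baseline])}


def wh64_share_of_gap(slow="egcs -O9"):
    """Share of the ev6-vs-`slow` gap that also separates ev6 from ev56.

    Assumption (not measured): the ev6/ev56 difference is entirely wh64.
    """
    share = {}
    for k, fast, no_wh64, s in zip(KERNELS, RATES[BASELINE],
                                   RATES["f77 -tune ev56"], RATES[slow]):
        share[k] = (fast - no_wh64) / (fast - s)
    return share


if __name__ == "__main__":
    for config in RATES:
        row = fraction_of(config)
        print(f"{config:15s}", "  ".join(f"{k} {row[k]:.2f}" for k in KERNELS))
    gap = wh64_share_of_gap()
    print("wh64 share of gap:", "  ".join(f"{k} {gap[k]:.2f}" for k in KERNELS))
```

With these figures egcs -O9 lands at roughly 0.47 to 0.62 of the ev6-tuned rate depending on the kernel, and plain -O at roughly 0.41 to 0.46.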


Here's the complete report for the f77 -O5 -tune ev6 run. Yeah, I could use a more accurate timer:

----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size =    2000000
 Offset     =          0
 The total memory requirement is   45 MB
 You are running each test  10 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity/precision appears to be      1 microseconds
 The tests below will each take a time on the order 
 of        24636  microseconds
    (=        24636  clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function     Rate (MB/s)  RMS time   Min time  Max time
Copy:        896.7368      0.0358      0.0357      0.0364
Scale:       879.5568      0.0364      0.0364      0.0365
Add:         892.0275      0.0538      0.0538      0.0539
Triad:       888.3963      0.0541      0.0540      0.0541
 Sum of a is =   2.306601562591874E+018
 Sum of b is =   4.613203124856438E+017
 Sum of c is =   6.150937500141256E+017


