
Re: lapack, blas, gemm under linux/axp



Greg Lindahl wrote:
>Do these RISC blas things have tunable parameters for block sizes?
Here's a snippet from the makefile header of the ENSEEIHT BLAS project.  The generic architecture settings expand into a whole slew of further settings and compiler switches, including cache sizes (and going well beyond them).  FWIW, the Meiko configuration was the extent of my involvement with this project.

Regards,
Stig Telfer

# Two variables, ARCHI and ORG, have to be set to machine-dependent values.
# ARCHI is used to set the cache size to the correct value,
# and thus the block size. The block sizes are included in
# two files blockd.h and blocks.h that give the block size
# for double and single precision, respectively. Additional
# architectures can be included by editing these two files.
#
# The block size NB is the largest even number such that
#
#               3*(NB**2)*PREC < CS
#
# where PREC is the number of bytes corresponding to the
# precision (4 bytes using single precision and 8 bytes
# using double precision in IEEE format) and CS the cache size in bytes.
#
# The valid options for ARCHI are the following:
# STANDARD, SPARC10, RS6K32, RS6K64, and DECALPHA.
# In the STANDARD option the block size
# is determined using a cache size equal to 64KB.
#
# The valid options for ORG are NOTRIADIC or TRIADIC.
# ORG should be set to TRIADIC on
# architectures where the hardware supports triadic operations
# (for example the IBM RS6000 that implements floating-point multiply-and-add).
# Moreover, this option should always be used except when you are sure
# that the compiler you are using does not generate efficient code for
# triadic operations (early SUN compilers for example).
#
# It is important to select the correct version for the IBM since
# we use different organization of computations
# and timing procedures in these two versions.
# RS6K32 corresponds to an RS6000 using a 32KB data cache (750 or SP1)
# and RS6K64 to processors using a 64KB cache (550 or SP2).
#
# Finally on some DEC Alpha compilers it is necessary to use the alternate
# call to the c preprocessor (cpp) since it is not called properly from
# cc for this machine.  The cc call and its alternate can be found near
# the end of the makefile.
#
# Options for some RISC processors :
# ================================
#
#      - DEC Alpha EV4, CRAY T3D : ARCHI = DECALPHA4,        ORG = TRIADIC
#      - DEC Alpha EV5           : ARCHI = DECALPHA5,        ORG = TRIADIC
#      - HP  PA7100              : ARCHI = STANDARD,         ORG = TRIADIC
#      - IBM RS6000              : ARCHI = RS6K32 or RS6K64, ORG = TRIADIC
#      - SUN SPARC10 and SPARC20 : ARCHI = SPARC10,          ORG = NOTRIADIC
#      - MEIKO CS2-HA            : ARCHI = CS2HA,            ORG = NOTRIADIC
#      - SGI Power Challenge     : ARCHI = SGI,              ORG = TRIADIC
#
# By default select ARCHI = STANDARD and ORG = TRIADIC.
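
As a rough illustration of the block-size rule quoted above, here is a minimal Python sketch. It is not part of the ENSEEIHT distribution, and the function name block_size is made up for this example. It simply searches for the largest even NB satisfying 3*(NB**2)*PREC < CS. Reading the factor 3 as room for three NB-by-NB blocks in cache is my assumption; the header does not spell that out.

```python
def block_size(cache_bytes, prec_bytes):
    """Return the largest even NB such that 3 * NB**2 * prec_bytes < cache_bytes.

    cache_bytes: data cache size CS in bytes (e.g. 64 * 1024 for STANDARD).
    prec_bytes:  PREC, 4 for single precision or 8 for double precision.
    Hypothetical helper for illustration only.
    """
    nb = 0
    # Step through even candidates while the next one still satisfies the rule.
    while 3 * (nb + 2) ** 2 * prec_bytes < cache_bytes:
        nb += 2
    return nb


if __name__ == "__main__":
    for label, cs in (("32KB", 32 * 1024), ("64KB", 64 * 1024)):
        print(label, "single:", block_size(cs, 4), "double:", block_size(cs, 8))
```

For the STANDARD 64KB setting this rule gives NB = 52 in double precision and NB = 72 in single precision; for a 32KB cache it gives 36 and 52. The values actually shipped in blockd.h and blocks.h may differ, since they can be tuned by hand.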
--
Any opinions expressed are my own and not those of Alpha Processor Inc.
