Evaluating Hardware/Software Tradeoffs - Hardware extensions and the compilers that make them real



Extension models

ISA extensions do more than add a "mess" of instructions and leave their use to assembly writers and compiler developers. In essence, an ISA extension is a computing model: a set of new instructions and a structured, ordered way of using them. These new instructions are designed to work together, to support one another, and to add new capabilities to the existing ISA.

For example, DSP extensions add instructions that let developers set up and iterate through a DSP loop, accumulating values for a series or for matrix or vector operations. The "model" in this case includes setting up addressing pointers, setting loop values, and running an iterative DSP processing loop. The key to efficient DSP operation is to minimize "bookkeeping" operations and run very tight inner-loop code.
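The loop model described above can be sketched in C. This is a minimal illustration, not any vendor's DSP instruction set: the function name mac_loop and the 16-bit sample format are assumptions chosen for the example. The pointers are set up once, and the inner loop does nothing except one multiply-accumulate and two pointer increments per sample, which is the kind of pattern DSP extensions aim to run with little or no bookkeeping overhead.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: multiply-accumulate over two arrays of 16-bit samples.
   The name and the sample format are assumptions made for this sketch. */
int64_t mac_loop(const int16_t *x, const int16_t *h, size_t n)
{
    int64_t acc = 0;                /* wide accumulator, so partial sums do not overflow */

    while (n--) {                   /* loop count set up once, outside the loop body */
        /* one multiply-accumulate per sample; pointers advance as a side effect */
        acc += (int32_t)(*x++) * (int32_t)(*h++);
    }
    return acc;
}
```

Calling mac_loop on the arrays {1, 2, 3, 4} and {5, 6, 7, 8} with n = 4 returns 70.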

Thus extensions add both a set of instructions to the hardware and a "computing model" to the ISA. The computing model gives programmers a pattern of behavior to follow when crafting efficient code. The compiler is also structured to optimize for that computing model, or way of coding. This goes beyond traditional compiler optimization, which optimizes individual instructions or sequences of instructions.

Computing models make the point that ISA extensions are more than a few new instructions. Instead, they are a collection of operations that collectively support a model of processing. The more closely library and application code follow those models, the more efficiently the code will execute.

For example, 16-/32-bit RISC enables a 32-bit machine to execute 16-bit code and maximize code density. To shrink 32-bit instructions to 16 bits, both the number of instructions and the number of registers that can be referenced are reduced. Smaller instruction-word fields mean fewer instructions and fewer register resources. Fewer register resources mean that the compiler must be stingy with registers and that users should minimize the number of registers they use.
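The density trade-off in this paragraph can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that every instruction is recompiled to the 16-bit form and that the smaller instruction and register set forces the compiler to emit more instructions by some factor. That factor, called expansion here, is a hypothetical parameter and not a measured value.

```c
/* Fractional code-size reduction when 32-bit code is re-expressed in
   16-bit instructions. `expansion` is a hypothetical factor: the 16-bit
   instruction count divided by the original 32-bit instruction count. */
double code_size_reduction(double expansion)
{
    const double bytes32 = 4.0;     /* size of one 32-bit instruction */
    const double bytes16 = 2.0;     /* size of one 16-bit instruction */

    return 1.0 - (expansion * bytes16) / bytes32;
}
```

With expansion = 1.0 (no extra instructions) the function returns 0.5, the best case under these assumptions. With expansion = 1.4 it returns about 0.3, and at expansion = 2.0 the size advantage disappears entirely.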

Benefits of extending the Instruction Set
The benefits of using ISA extensions and deploying them in assembled and compiled code are not trivial. These extensions can deliver a large boost in performance or usability for specific application classes. Benefits include:

  • Graphics/Math-Intensive ISA Extensions
    Graphics or visual (math-intensive) extensions can deliver 8x or greater performance boosts for math and graphics operations that work on fields smaller than the full data word. A single instruction can perform logical, math, and graphics operations on multiple fields at once. This also cuts the number of loop iterations by a factor of n (the number of fields processed in parallel).
  • Dynamic 16-/32-bit RISC
    Embeds a 16-/32-bit processor in a 32-bit RISC. It has a 16-bit instruction set (and registers) with a 32-bit data path. While the reduction in code size for a 16-/32-bit RISC can theoretically approach 50%, the typical reduction is on the order of 25% to 35%. Dynamic operation lets programmers use the 16-bit and 32-bit ISAs as needed.
  • DSP Extensions
    Adds DSP instructions and operations to RISC. Since RISCs run faster than DSPs (up to 3x or 4x), RISC DSP iterations can match lower-speed, multifunction DSP operation. With these extensions, many DSP functions can be moved into the main processor instead of requiring an external DSP.
  • Java
    Java does not deliver a performance boost over C or C++; it is actually much slower. Instead, it provides a much more controlled, structured, object-oriented development and runtime environment. Java is a cleaned-up, modernized C++ with a built-in runtime environment and multimedia support.
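To illustrate the first item in the list, operating on several sub-word fields with one operation, the sketch below emulates a packed 8-bit add in plain C. It is only a software stand-in for what a single packed-add instruction would do in hardware; the function name add_4x8 and the choice of four unsigned 8-bit fields in a 32-bit word are assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical helper: add four unsigned 8-bit fields packed into each
   32-bit word. Every field wraps modulo 256 and never carries into its
   neighbor. */
uint32_t add_4x8(uint32_t a, uint32_t b)
{
    const uint32_t low7 = 0x7F7F7F7Fu;  /* low 7 bits of every field */
    const uint32_t top  = 0x80808080u;  /* top bit of every field */

    /* Add the low 7 bits; any carry lands in the field's own top bit. */
    uint32_t sum = (a & low7) + (b & low7);

    /* Fold in the original top bits with XOR so nothing crosses a field boundary. */
    return sum ^ ((a ^ b) & top);
}
```

For example, add_4x8(0x01020304, 0x10203040) returns 0x11223344. A loop over a byte array that calls this once per 32-bit word runs a quarter as many iterations as a byte-at-a-time loop, which is the factor-of-n reduction described above with n = 4.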

These ISA extensions allow developers to stay with standard microprocessor architectures, yet get critical performance boosts (or usability boosts) for their applications. MIPS16, for example, enables programmers to use MIPS RISC for applications that need 32-bit processing power but are constrained by memory cost. See Figure 2 for an example of MIPS16 instruction decompression.

Figure 2: MIPS16 decompression

