100GB, 300GB, 1,000GB TPC-H results.

On October 29th, 2007, Sun Microsystems announced three new TPC-H performance results that are dramatically better than any previous result. These benchmarks are based on Red Hat Enterprise Linux 4.4 running the ParAccel Analytic Database on a cluster of fifteen SunFire x4100 systems (each configured with two dual-core AMD Opteron processors). The chart above provides a high-level summary of the results, extracted from the TPC-H website. It shows the quantum leap in performance that these results represent.

There are three separate benchmark results, one for each database size: 100GB, 300GB and 1,000GB. Measured in QphH (queries per hour), each new #1 ranked result is at least four times faster than the #2 result, the previous world-record holder. The price/performance figures, measured in $/QphH (dollars per query per hour), are less than a quarter of those of the #2 results. Clearly, these results represent a whole new order of performance for TPC-H, the industry’s leading decision support benchmark, which the TPC web site describes as follows:

"It consists of a suite of business oriented ad-hoc queries and concurrent data modifications. The queries and the data populating the database have been chosen to have broad industry-wide relevance. This benchmark illustrates decision support systems that examine large volumes of data, execute queries with a high degree of complexity, and give answers to critical business questions."

This is a real-world benchmark, and these results demonstrate that customers do not need a high-priced, heavyweight, legacy database to run their business. So how were these incredible numbers achieved?

Clearly it was necessary to take a new approach to the problem of high-performance decision support and analytics. A primary feature of the ParAccel Analytic Database is that it is column-based, not row-based like Oracle, DB2 and many others. Its query engine therefore does not have to read a full row of data to answer a query: only the relevant columns are retrieved, whereas a row-wise DBMS would pull all columns and typically discard 80-95 percent of them. To further increase performance, all operations are done in parallel (a non-parallel DBMS must scan all of the data sequentially). Additionally, adaptive compression reduces disk overhead, while the memory-centric design maximizes in-memory processing.
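The difference between the two layouts can be illustrated with a small sketch. This is not ParAccel's implementation; the table, its column names and the helper functions below are all hypothetical, and the code only counts how many values each layout has to touch to total a single column.

```python
from typing import Dict, List, Tuple

# Hypothetical toy table: 4 rows of 5 columns (names and values invented for illustration).
ROWS: List[Dict[str, object]] = [
    {"order_id": 1, "customer": "A", "region": "EU", "quantity": 3, "price": 10.0},
    {"order_id": 2, "customer": "B", "region": "US", "quantity": 1, "price": 25.0},
    {"order_id": 3, "customer": "C", "region": "EU", "quantity": 2, "price": 7.5},
    {"order_id": 4, "customer": "D", "region": "US", "quantity": 5, "price": 4.0},
]


def to_columns(rows: List[Dict[str, object]]) -> Dict[str, List[object]]:
    """Pivot row-oriented records into one list per column."""
    return {name: [row[name] for row in rows] for name in rows[0]}


def row_store_sum(rows: List[Dict[str, object]], column: str) -> Tuple[float, int]:
    """Row layout: every field of every row is fetched, but only one is used."""
    total = 0.0
    values_read = 0
    for row in rows:
        values_read += len(row)  # the whole row comes along
        total += row[column]
    return total, values_read


def column_store_sum(columns: Dict[str, List[object]], column: str) -> Tuple[float, int]:
    """Column layout: only the requested column is fetched."""
    values = columns[column]
    return float(sum(values)), len(values)


row_total, row_reads = row_store_sum(ROWS, "price")
col_total, col_reads = column_store_sum(to_columns(ROWS), "price")
print(f"row store:    total={row_total}, values read={row_reads}")
print(f"column store: total={col_total}, values read={col_reads}")
```

With five columns and one of them needed, the row layout reads 20 values to use 4, so 80 percent of what it fetched is discarded. Wider tables push that fraction higher.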

To minimize costs, Sun decided to use a locally attached 2-disk configuration rather than more expensive array-attached storage, such as Fibre Channel or iSCSI. This further improved the price/performance result.
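TPC-H price/performance is the total system price divided by QphH, so any saving on hardware at the same throughput lowers $/QphH directly. The sketch below shows the arithmetic only. The prices and the QphH value are hypothetical placeholders, not figures from these or any other published results, and it assumes for simplicity that throughput is identical under both storage options.

```python
def price_per_qphh(total_system_price_usd: float, qphh: float) -> float:
    """Return the price/performance figure in $/QphH."""
    if qphh <= 0:
        raise ValueError("qphh must be positive")
    return total_system_price_usd / qphh


# Hypothetical figures for illustration only; NOT taken from any TPC-H result.
QPHH = 40_000.0                   # assumed throughput, same for both options
PRICE_LOCAL_DISKS = 800_000.0     # assumed price with locally attached disks
PRICE_WITH_ARRAY = 1_000_000.0    # assumed price with array-attached storage

local = price_per_qphh(PRICE_LOCAL_DISKS, QPHH)
array = price_per_qphh(PRICE_WITH_ARRAY, QPHH)
print(f"local disks: {local:.2f} $/QphH")
print(f"with array:  {array:.2f} $/QphH")
```

Under these made-up numbers the cheaper storage takes the figure from 25 to 20 $/QphH without any change in query throughput.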

And, as the highest performance platform on which to run these benchmarks, Red Hat Enterprise Linux was, once again, the winner’s choice.

To view TPC benchmark results, visit