Where data warehouse appliances come from

Data Warehouse Appliances Overview

Philip Howard of Bloor Research says,

"This is what Netezza has done in the data warehousing market: it has totally changed the way that we think about data warehousing."

Overview

Your organization has spent millions of dollars implementing a data warehouse or data mart using the corporate standard database, such as Oracle or DB2. After all, in-house expertise is available for these databases, and volume purchase discounts and site license agreements with the vendors are in place. But despite that considerable expertise, service breaches are occurring frequently, users are dissatisfied with the response times they are getting, a backlog of enhancements is mounting and any big ideas your organization had about extending Business Intelligence (BI) have dissipated. Faced with these problems, you have employed large teams of database administrators to tune the database so that it responds better to complex queries, and you have upgraded your system as far as your budget will allow. But all of this is in vain: every step you take just seems to take you right back to the beginning.

As with most first-time visitors to www.netezza.com, it is exactly this frustration that has led you to where you are today.

At Netezza, we believe that data warehousing shouldn’t be this complicated. Instead of spending time making data warehouses run efficiently, Netezza users deploy purpose-built data warehouse appliances that solve business problems. Netezza data warehouse appliances are installed quickly and easily, integrate with your preferred ETL and BI tools and can be largely left alone to get on with the job.

Netezza appliances combine database, server and storage in a single purpose-built system designed to run complex queries against large volumes of stored data. Netezza data warehouse appliances are designed to analyze terabytes of data 10 to 100 times faster than traditional solutions, with a lower total cost of ownership (TCO) and greater ease of use. Netezza uses massively parallel processing and an architecture that puts processing right inside storage to provide a brute-force solution that can handle complex analytics against large data volumes.
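To illustrate the idea of putting processing next to storage, here is a minimal sketch. It is not Netezza's implementation: the names `scan_slice` and `parallel_query` are hypothetical, the "storage units" are plain in-memory lists, and threads in one process stand in for independent hardware. It shows only the principle that each unit filters its own slice of the data in parallel, so that matching rows alone travel back to the caller.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Row = Dict[str, int]


def scan_slice(rows: List[Row], predicate: Callable[[Row], bool]) -> List[Row]:
    """Runs 'beside' one storage slice: only matching rows leave the unit."""
    return [row for row in rows if predicate(row)]


def parallel_query(slices: List[List[Row]],
                   predicate: Callable[[Row], bool]) -> List[Row]:
    """Scan every slice concurrently and merge the (small) partial results."""
    workers = max(1, len(slices))  # ThreadPoolExecutor needs at least 1 worker
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda s: scan_slice(s, predicate), slices))
    return [row for part in partials for row in part]


# Hypothetical data: 15 rows spread evenly over 3 storage slices.
slices = [
    [{"id": i, "amount": i * 10} for i in range(start, start + 5)]
    for start in (0, 5, 10)
]

# Only rows with amount > 100 are returned; the rest never leave their slice.
result = parallel_query(slices, lambda row: row["amount"] > 100)
total_rows = sum(len(s) for s in slices)
print(f"{len(result)} of {total_rows} rows returned to the caller")
```

The point of the design is that the volume of data moved is proportional to the size of the answer rather than the size of the table.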

But what’s wrong with conventional databases?

Because they were designed for a completely different application profile, OLTP databases provide a poor platform for data warehousing. They are designed to read small indexed records from disk, moving them into memory where they are updated and then written back to disk as the persistent record. Analytical systems do the reverse: they trawl through vast amounts of data looking for exceptions. At one terabyte and above, databases designed for transaction processing struggle to deliver acceptable response times, and their owners begin to spend inordinate amounts of time and money on performance tuning.

However, the root problem is not in the database software per se. It is the time taken to move vast data volumes from disk across a network and into the memory of the computer before the database can start to do its job. That problem is outside the control of the database. In analytical systems, database tuning is a self-defeating task: to tune, the administrator must be able to predict which queries will access what data. Analytical processing has at its core the "problem→question→better question cycle"; the analyst, let alone the database administrator, cannot predict which complex query will be used next.

A traditional database architecture cannot process data until it has been transferred from disk, across a network and into the memory of the server running the DBMS. This movement of data is a technology bottleneck caused by several factors:

  • Disk I/O transfer rates are too low to read terabytes of data quickly enough
  • Network transfer rates are too low to move terabytes of data quickly enough from disk to memory
  • Memory density growth cannot keep up with data growth, making traditional caching less effective over time

Increasing CPU performance therefore makes little difference, since the gating factor in traditional implementations is moving the data off disk to the CPU.
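A back-of-the-envelope calculation makes the bottleneck concrete. The sketch below is illustrative only: the function `scan_seconds` is hypothetical, and the 100 MB/s throughput and the count of 100 parallel units are assumed round numbers, not measurements of any real system.

```python
def scan_seconds(data_bytes: float, bytes_per_second: float,
                 parallel_units: int = 1) -> float:
    """Time to read data_bytes once, split evenly across parallel_units."""
    if bytes_per_second <= 0 or parallel_units < 1:
        raise ValueError("throughput must be positive and units at least 1")
    return data_bytes / (bytes_per_second * parallel_units)


TB = 10**12  # one terabyte (decimal)
MB = 10**6   # one megabyte (decimal)

# Assumed figures: a single 100 MB/s path from disk to the database server,
# versus 100 units that each scan their own share at the same rate.
single_path = scan_seconds(1 * TB, 100 * MB)
many_units = scan_seconds(1 * TB, 100 * MB, parallel_units=100)

print(f"single path: {single_path / 3600:.1f} hours")
print(f"100 units:   {many_units:.0f} seconds")
```

Under these assumptions a faster CPU changes neither figure, because the scan time depends only on how quickly the data can be moved and on how many units move it at once.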
