of servers, storage, operating system(s), DBMS and software specifically pre-installed and pre-optimized for data warehousing (DW).
• Typically supplied on a preconfigured set of hardware as a complete system, acting as a true appliance.
• Can apply to software-only systems, purportedly easy to install on specific hardware configurations.

Drivers
• Performance
• Scalability
• Cost
• Availability
• Portability
Parallel Processing (MPP)
• Data Warehouse Appliance architectures provide high query performance and platform scalability
• Most MPP architectures implement a "shared-nothing architecture" where each server operates self-sufficiently and controls its own memory and disk

Basis of MPP Scalability
• Divide the data, workload, and system resources evenly among many parallel processing units
• No single point of control for any operation
• I/O, Buffers, Locking, Logging, Dictionary
• Nothing centralized
• Nothing in the way of linear scalability

[Figure: DW Appliance, showing Logs, Locks, Buffers, I/O]
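The "divide the data evenly" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`owner`, `distribute`, `local_totals`, `parallel_sum_by_key`), the CRC32 hash, and the sample rows are all hypothetical and are not taken from any vendor's implementation. The loop over slices stands in for units that would run at the same time on a real system.

```python
import zlib
from collections import defaultdict

def owner(key: str, units: int) -> int:
    """Map a row key to the one processing unit that owns it (stable hash)."""
    return zlib.crc32(key.encode("utf-8")) % units

def distribute(rows, units):
    """Give every unit its own private slice of the rows; nothing is shared."""
    slices = [[] for _ in range(units)]
    for key, amount in rows:
        slices[owner(key, units)].append((key, amount))
    return slices

def local_totals(slice_rows):
    """Work a single unit can do using only the rows it owns."""
    totals = defaultdict(int)
    for key, amount in slice_rows:
        totals[key] += amount
    return dict(totals)

def parallel_sum_by_key(rows, units):
    """Aggregate inside each unit, then concatenate the partial results.

    A key always hashes to the same unit, so no unit needs another unit's
    data and the partial results never overlap.
    """
    result = {}
    for slice_rows in distribute(rows, units):  # stands in for one unit each
        result.update(local_totals(slice_rows))
    return result

# Hypothetical sample data: 1000 rows spread over 50 customer keys.
rows = [(f"cust{i % 50}", i) for i in range(1000)]
slices = distribute(rows, 4)
totals = parallel_sum_by_key(rows, 4)
```

Because each key lives on exactly one unit, the final step is a plain concatenation rather than a coordinated merge, which is the property the slide summarizes as "no single point of control".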
[Figure: VPROCs]

Advantage of Shared Nothing Architecture
• Delivers linear scalability
• Maximizes utilization of resources
• To any size configuration
• Allows flexible configurations
• Incremental upgrades
• Linear with a slope of 1 at any size

[Figure: Performance plotted against # Nodes]
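One way to read "linear with a slope of 1" is that doubling the nodes doubles the performance. The sketch below expresses that as a ratio; the function name `scaling_efficiency` and the sample measurements are invented for illustration and do not come from any benchmark.

```python
def scaling_efficiency(base_nodes, base_perf, nodes, perf):
    """Observed speedup divided by the growth in node count.

    A value of 1.0 corresponds to a slope of 1: performance grows in exact
    proportion to the number of nodes. Values below 1.0 mean sublinear scaling.
    """
    if base_nodes <= 0 or base_perf <= 0 or nodes <= 0:
        raise ValueError("node counts and baseline performance must be positive")
    speedup = perf / base_perf
    growth = nodes / base_nodes
    return speedup / growth

# Hypothetical measurements (queries per hour), not real results.
ideal = scaling_efficiency(base_nodes=2, base_perf=100.0, nodes=8, perf=400.0)
sublinear = scaling_efficiency(base_nodes=2, base_perf=100.0, nodes=8, perf=320.0)
```

Here `ideal` evaluates to 1.0 because four times the nodes gave four times the throughput, while `sublinear` comes out near 0.8.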
not just entry level
• High performance algorithms
• Join, Aggregation, Sort etc.
• Compiled expressions
• Complex query features
• Derived tables, Case expressions, all forms of sub-queries, Samples etc.
• Big limits: 256 table joins, 256 nesting levels etc.
• 1MB SQL/Views/Macros
• The more complex the better
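To make the listed query features concrete, the sketch below combines a derived table, a CASE expression, and a scalar sub-query in one statement. It runs against Python's built-in SQLite purely as a stand-in; the `orders` table, its rows, and the tier threshold are hypothetical, and appliance SQL dialects may differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL);
    INSERT INTO orders VALUES
        (1, 'acme', 120.0), (2, 'acme', 80.0),
        (3, 'bolt', 15.0),  (4, 'core', 300.0), (5, 'core', 50.0);
""")

query = """
    SELECT t.customer,
           t.total,
           -- CASE expression: label each customer by total spend
           CASE WHEN t.total >= 300 THEN 'large' ELSE 'small' END AS tier
    FROM (
        -- derived table: one row per customer with summed amounts
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer
    ) AS t
    -- scalar sub-query: keep customers whose total exceeds the average order
    WHERE t.total > (SELECT AVG(amount) FROM orders)
    ORDER BY t.customer
"""
result = conn.execute(query).fetchall()
conn.close()
```

With the sample rows above the average order is 113, so `bolt` (total 15) is filtered out and `result` is `[('acme', 200.0, 'small'), ('core', 350.0, 'large')]`.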
Migration Drivers                  Approach
Performance                        1:1 or Redesign or Evolution
Availability                       1:1 or Evolution
Scalability                        1:1 or Evolution
Single View of the Business        Redesign or Evolution
Business Questions (Complexity)    Redesign or Evolution
Cost of Ownership                  1:1

There are several options for migrating from Oracle to a DWA. The best strategy depends on the overall goals of the project, existing capabilities, and organizational constraints.
using commodity technologies rather than proprietary assembly of commodity components.
• Implemented applications show usage expansion from tactical implementations to strategic and enterprise data warehouse use
• Most analysts see DW appliances gaining market share vs. traditional DBMS solutions
• Vendors have begun providing the ability to incorporate 'in-database' analytic algorithms to take advantage of their MPP architectures
Linux Operating System
• Fully-integrated cabinet design
• Intel Six Core Xeon Westmere Processors @ 2.93GHz
• Nine servers per cabinet
• (216) 300GB or 600GB enterprise-class drives per fully populated cabinet
• (108) 2TB SAS drives per fully populated cabinet
• 16.4TB customer data per fully populated cabinet with 300GB drives
• 32.2TB customer data per fully populated cabinet with 600GB drives
• 54.9TB customer data per fully populated cabinet with 2TB drives
• Scales up to 6 cabinets, ~275TB customer data with 2TB drives
• Teradata Managed Server Options
• RAID1 disk mirroring, Automatic Node Failover
• 864GB memory per cabinet
• Teradata BYNET® over Gigabit Ethernet
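The cabinet figures above can be cross-checked with some simple arithmetic. The sketch below is a rough reading of the slide, assuming decimal units (1TB = 1000GB) and that RAID1 mirroring halves the raw space; the names `CABINET_OPTIONS` and `cabinet_capacity` are hypothetical, and the slide does not itemize what consumes the space that remains after mirroring.

```python
# Drive size in GB -> (drives per cabinet, quoted customer data in TB),
# both values copied from the list above.
CABINET_OPTIONS = {
    300: (216, 16.4),
    600: (216, 32.2),
    2000: (108, 54.9),
}

def cabinet_capacity(drive_gb, drives, customer_tb):
    """Derive raw and mirrored capacity and the share quoted as customer data."""
    raw_tb = drive_gb * drives / 1000   # assumption: decimal TB
    mirrored_tb = raw_tb / 2            # RAID1 keeps two copies of every block
    return {
        "raw_tb": raw_tb,
        "mirrored_tb": mirrored_tb,
        "customer_fraction": customer_tb / raw_tb,
    }

report = {gb: cabinet_capacity(gb, *spec) for gb, spec in CABINET_OPTIONS.items()}

# 864GB of memory spread over nine servers per cabinet.
memory_per_server_gb = 864 / 9
```

Under these assumptions each configuration quotes roughly a quarter of its raw disk space as customer data, and each server holds 96GB of memory.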
Platform, Aster Data nCluster, delivers a massively parallel software solution that embeds MapReduce analytic processing with data stores for big data analytics that incorporate new data sources and types to deliver new analytic capabilities with breakthrough performance and scalability. Its unique Applications-Within® architecture runs analytic application logic inside the system, leveraging its massively-parallel processing architecture and patent-pending SQL-MapReduce® to fully parallelize processing for deep and ultra-fast analysis of massive data sets.

Appliance Highlights
• "Always-Parallel" Pervasive Parallelism
• Embedded MapReduce
• Unlimited Scalability
• "Always-On" Fault Tolerance
• Hybrid Row and Column Stores
• Dynamic Mixed Workload Management
• Extensibility Framework
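The general MapReduce pattern that this kind of platform embeds can be sketched in plain Python. This is not the SQL-MapReduce® interface or any Aster Data API; the partition contents and the function names `map_partition` and `reduce_counts` are hypothetical, chosen only to show a map step that runs independently per partition followed by a reduce step that merges the partial results.

```python
from collections import Counter
from functools import reduce

def map_partition(rows):
    """Map step: count event types using only this partition's rows."""
    return Counter(event for _user, event in rows)

def reduce_counts(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right

# Hypothetical click data, already split across three partitions.
partitions = [
    [("u1", "view"), ("u2", "view"), ("u1", "buy")],
    [("u3", "view"), ("u3", "buy"), ("u4", "view")],
    [("u5", "view")],
]

# Each map call touches one partition only, so on a parallel system the
# calls could run at the same time; here they run one after another.
partials = [map_partition(p) for p in partitions]
event_counts = reduce(reduce_counts, partials, Counter())
```

For the sample data, `event_counts` holds 5 `view` events and 2 `buy` events.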
Greenplum Database is at the core … switching between them can be effortless as requirements change and …

DATA COMPUTING APPLIANCE FAMILY
The EMC Greenplum Data Computing Appliance (DCA) is available in two models: DCA and High Capacity DCA.

DCA
The DCA is a purpose-built, highly scalable next generation data warehousing appliance that architecturally integrates database, compute, storage, and network into an enterprise-class, easy-to-implement system. It is the industry leader in price and performance.
• Supports linear scalability

HIGH CAPACITY DCA
The High Capacity DCA is designed to host multiple petabytes of data without taking up additional space, surging power consumption, or increasing costs. For businesses that require detailed analysis of extremely large amounts of data or those looking for a longer term archive, this model offers the lowest cost-per-unit data warehouse.
• Ability to host multiple petabytes of data without taking up additional space, surging power consumption, or increasing costs
• Best price per unit data warehouse appliance

DCA FAMILY SPECIFICATIONS OVERVIEW
                                  DCA GP1000    High Capacity DCA GP1000C
Master Servers                    2             2
Segment Servers                   16            16
Total CPU Cores                   192           192
Total Memory                      768 GB        768 GB
Segment HDDs/SSDs                 192           192
Usable Capacity (uncompressed)    36 TB         124 TB
Usable Capacity (compressed)      144 TB        496 TB
Maximum Expansion                 6 racks       6 racks
Scan Rate                         24 GB/sec     14 GB/sec
Data Load Rate                    … TB/Hour     … TB/Hour
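A few per-server figures fall out of the specification table with simple division. The sketch below assumes the totals for cores, memory, and scan rate are spread evenly across the 16 segment servers, which the table does not state explicitly; the dictionary layout and the `derived` helper are hypothetical.

```python
# Values copied from the specification table above.
DCA_SPECS = {
    "DCA GP1000": {
        "segment_servers": 16, "cores": 192, "memory_gb": 768,
        "usable_tb": 36, "compressed_tb": 144, "scan_gb_s": 24,
    },
    "High Capacity DCA GP1000C": {
        "segment_servers": 16, "cores": 192, "memory_gb": 768,
        "usable_tb": 124, "compressed_tb": 496, "scan_gb_s": 14,
    },
}

def derived(spec):
    """Per-segment-server figures, assuming totals are split evenly."""
    n = spec["segment_servers"]
    return {
        "cores_per_server": spec["cores"] / n,
        "memory_gb_per_server": spec["memory_gb"] / n,
        "scan_gb_s_per_server": spec["scan_gb_s"] / n,
        # Ratio implied by the compressed vs. uncompressed capacity rows.
        "compression_ratio": spec["compressed_tb"] / spec["usable_tb"],
    }

summary = {model: derived(spec) for model, spec in DCA_SPECS.items()}
```

Under that assumption both models work out to 12 cores and 48 GB of memory per segment server, and the compressed capacity rows imply a 4:1 compression ratio in each case.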