Data warehouse modernization programme - Keynote by TOBY WOOLFE at Big Data Spain 2014
General Motors (GM) is in the process of constructing a single global information warehouse that will become the foundation for all business analytics and decision support across the enterprise.
help explain what is working in the world of Hadoop, real time analytics and internet connected devices.
IBM delivers a significant data warehouse modernization programme to General Motors
Toby Woolfe, Big Data Industrial and Automotive Solutions Leader, IBM Europe
[email protected] +44 7795 328 742
We are on the precipice of awareness of, and adoption of, internet connected devices far beyond the smartphone...
Devices: 145 actuators, 70 on-board computers, 4,700 relays and 70 sensors, including radar, sonar, accelerometer, camera and rain sensors.
433 MB per minute
1 car year = 1 TB
Plus existing enterprise data, social media data, 3rd party suppliers, etc.
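The two volume figures above can be related with a little arithmetic. The sketch below takes both numbers at face value and assumes decimal units (1 TB = 1,000,000 MB); the slide does not say whether the yearly figure is raw, filtered or compressed, so the "implied hours" result is only a consistency check, not a statement about how the data is actually collected.

```python
# Figures quoted on the slide, taken at face value.
MB_PER_MINUTE = 433
TB_PER_CAR_YEAR = 1
MB_PER_TB = 1_000_000  # assumption: decimal units, 1 TB = 10^6 MB


def hourly_volume_gb(mb_per_minute: float) -> float:
    """Data generated in one hour at the given per-minute rate, in GB."""
    return mb_per_minute * 60 / 1000


def implied_hours_per_year(tb_per_year: float, mb_per_minute: float) -> float:
    """Hours of generation at the full rate that add up to the yearly volume."""
    minutes = tb_per_year * MB_PER_TB / mb_per_minute
    return minutes / 60


if __name__ == "__main__":
    print(f"Per hour: {hourly_volume_gb(MB_PER_MINUTE):.2f} GB")
    hours = implied_hours_per_year(TB_PER_CAR_YEAR, MB_PER_MINUTE)
    print(f"1 TB corresponds to about {hours:.1f} hours at the full rate")
```

At the full rate, 1 TB is reached after roughly 38 hours, which suggests the yearly figure describes a retained subset rather than every minute of raw output; that reading is an inference, not something the slide states.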
But how fast does information currently get delivered?
Monthly reporting? Based originally on the lunar cycle, the Julian calendar was introduced by Julius Caesar in 46 BC.
Quarterly reporting? Celebrated by Druids for centuries, the four seasons result from the yearly revolution of the earth around the sun and the tilt of the earth’s axis relative to the plane of revolution.
Annual reporting? In temperate and polar regions, the seasons are marked by changes in the intensity of sunlight that reaches the Earth's surface, variations of which may cause animals to go into hibernation or to migrate, and plants to be dormant.
Weekly reporting? Recorded in Babylonian carvings dated 6th century, the origin of the 7 day week was probably based on a quarter of the lunar cycle (though inaccurate).
“GM Opens New Data Center Modeled on Google, Facebook”
CIO Randy Mott launched a new $130 million data center.
Chief Executive Dan Akerson said: “If we are going to win, we must turn IT into a competitive advantage and not treat it as something that is just a utility.”
The IT Operations and Command Center in the $130-million Enterprise Data Center at the General Motors Technical Center in Warren, Michigan.
GM’s people, process and technology evolution
• GM has hired about 1,500 software developers and engineers, up from “close to zero” just one year ago, according to Tim Cox, CIO of GM global development services
• About 70% of the 10,000 IT staff will be focused on innovation
• Until recently, GM was using 23 different data centers. Now the company is moving to two
• The site in Warren has the capacity to hold more than 10,000 ‘pizza box’ size servers, as well as smaller-size servers and larger mainframes
• The data warehouse incorporates both the cutting edge IBM BigInsights for Hadoop technology and the more traditional Massively Parallel Processing technologies historically used for data warehousing
Source/thanks to
General Motors wiki
Business profile
• Global vehicle sales leader for 77 consecutive years, from 1931 to 2007
• 212,000 staff in 157 countries
Business problem
• GM has issued 45 recalls in 2014 involving 28 million cars worldwide
• The cost of one recall due to faulty ignition switches, which have been linked to at least 13 deaths, is $1.3 billion
IT transformation
General Motors (GM) is in the process of constructing a single global information warehouse that will become the foundation for all business analytics and decision support across the enterprise.
The business objectives are to:
– reduce operating expenses for data warehouses by 25-30%
– deliver 10x the measurable business value
– double IT project capability
– integrate existing data warehouse systems with new technology to offload data, reduce latency, reduce costs and improve performance
Abstract: IBM delivers a significant data warehouse modernization programme to General Motors
• High performance and continuously available data management environment
• 329,000 users with a 10% concurrency rate
• Grow to approximately one petabyte in size over three years
• IBM proposed a centralized enterprise architecture leveraging a traditional data warehouse matched with IBM BigInsights Hadoop technology for big data analytics
Data types:
• Inventory control of parts
• Manufacturing equipment and assembly line data
• Warranty and services data from dealers
• Telemetry data from vehicles
• Customer services and social media data
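The user and capacity figures in the abstract translate into two rough planning numbers. This is a minimal sketch, assuming decimal units (1 PB = 1,000 TB) and, purely for illustration, linear growth over the three years; the source gives neither the growth curve nor the starting size.

```python
# Figures quoted in the abstract.
TOTAL_USERS = 329_000
CONCURRENCY_RATE = 0.10
TARGET_SIZE_TB = 1_000  # assumption: "approximately one petabyte" as 1,000 TB
GROWTH_YEARS = 3


def concurrent_users(total_users: int, rate: float) -> int:
    """Number of users expected to be active at the same time."""
    return round(total_users * rate)


def linear_growth_tb_per_year(target_tb: float, years: int) -> float:
    """Average yearly growth if the target is reached at a constant pace."""
    return target_tb / years


if __name__ == "__main__":
    print(f"Concurrent users: {concurrent_users(TOTAL_USERS, CONCURRENCY_RATE):,}")
    yearly = linear_growth_tb_per_year(TARGET_SIZE_TB, GROWTH_YEARS)
    print(f"Average growth: {yearly:.0f} TB per year")
```

Under these assumptions the platform has to serve about 32,900 simultaneous users and absorb on the order of 330 TB a year.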
Capabilities required
1. Efficient data protocols
2. Real time analytics
3. Capture all data to a landing zone
4. High performance analytics platform
5. Application platform
6. Data governance & integration
IBM conclusions applicable to enterprise data warehouse modernisation, learned from work with GM
GM’s objectives published in WSJ:
– Reduce operating expenses for data warehouses by 25-30%
– Deliver 10x the measurable business value
– Double IT project capability
– Integrate existing data warehouse systems with new technology to offload data, reduce latency, reduce costs and improve performance
• Educate enterprise architecture staff on the new disruptive technology capabilities delivered by Hadoop, and its integration into existing data architectures
• Engage in a review of data warehouse architecture to harness the cost savings, speed advantages and business focus that new technologies such as IBM BigInsights for Hadoop bring to market
• Experiment with new data asset/business objective combinations with IBM’s Big Data experts
For a volume production car fleet using telematics data, IBM correctly predicted warranty codes/claims 43 days out with 86% accuracy and only 1% false positives.
For a new electric car model, IBM predicted warranty claims 60 days out with 96% accuracy and 1% false positives on 619 vehicles using Big Data and analytics technology.
Case study illustrates the hidden value of machine data.
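The slide quotes "accuracy" and "false positives" without defining them. The sketch below shows one common way such figures are derived from a confusion matrix: accuracy as the share of correct predictions, and false positive rate as the share of vehicles without a claim that were wrongly flagged. Both definitions are assumptions about what the slide means, and the counts are made-up round numbers for illustration, not GM or IBM data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Outcome counts for a binary 'will this vehicle file a claim?' prediction."""
    true_positives: int   # claim predicted, claim occurred
    false_positives: int  # claim predicted, no claim occurred
    true_negatives: int   # no claim predicted, no claim occurred
    false_negatives: int  # no claim predicted, claim occurred

    @property
    def total(self) -> int:
        return (self.true_positives + self.false_positives
                + self.true_negatives + self.false_negatives)

    def accuracy(self) -> float:
        """Share of all vehicles whose outcome was predicted correctly."""
        return (self.true_positives + self.true_negatives) / self.total

    def false_positive_rate(self) -> float:
        """Share of claim-free vehicles that were wrongly flagged."""
        return self.false_positives / (self.false_positives + self.true_negatives)


if __name__ == "__main__":
    # Hypothetical counts for 1,000 vehicles, chosen only to make the arithmetic easy.
    example = ConfusionMatrix(true_positives=90, false_positives=9,
                              true_negatives=891, false_negatives=10)
    print(f"Accuracy: {example.accuracy():.1%}")
    print(f"False positive rate: {example.false_positive_rate():.1%}")
```

Reporting both numbers matters because warranty claims are rare: a model can score high accuracy simply by predicting "no claim" everywhere, so a low false positive rate only means something alongside the count of claims actually caught.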
Broad range of role based visualisation and analytical capabilities
• SPSS predictive analytics
• R script support
• Integration with Cognos BI
• Data scientist
• Business person
• Spreadsheet visualisation of data
• 360 degree view of machine/client/location
Delivered using the BigInsights wizard to accelerate the use of Hadoop data
How is Big Data Being Used in Industry?
Connected Vehicle: V2V, V2C, V2I, V2X; Automotive Telematics; Real-time Alerts; Geospatial Analytics; Predictive Failsafe Analytics; Driver Specific Predictive Analytics; Condition Based Monitoring
After Sales: Fraud Analytics; Regulatory Analytics; Social Media Analytics; Warranty Analytics; Customer 360; Next Best Action
Marketing & Sales: Social Media Analytics; Text Analytics; Customer 360; Actionable Customer Intelligence; Next Best Action
Product Development & Engineering: Patent Analytics; Social Media Analytics; Text Analytics; Quality & Warranty Analytics
Manufacturing & Quality: Predictive Maintenance; Quality Early Warning; Operational Efficiency; Supply Chain Analytics
Suggested next action: Create an inventory of your structured and unstructured data and assess where business value will be found
Axes: Internal / External; Structured / Unstructured; Relatively easy to acquire / Difficult to acquire
Data shown: Complaints, MDM Customer Data, Financial Data, Txn History, Payment History, Warranty History, Workflow Data, Call Recordings, Maintenance records, Activities, Emails, Web Traffic, Shipping schedule data, Telematic Data, Machine failure data, Geolocation Data, Google Alerts, Catastrophic Data, Machine Performance Data, Social Media Data
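One way to start such an inventory is to record each data asset against the axes named on the slide and then group the assets by quadrant. The sketch below is a hypothetical structure, not an IBM tool. The placement of the four sample assets is an illustrative guess, because the original diagram's layout is not recoverable from this transcript.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DataAsset:
    name: str
    origin: str            # "internal" or "external"
    structure: str         # "structured" or "unstructured"
    easy_to_acquire: bool  # True for "relatively easy", False for "difficult"


def group_by_quadrant(assets):
    """Group asset names by their (origin, structure) quadrant."""
    quadrants = defaultdict(list)
    for asset in assets:
        quadrants[(asset.origin, asset.structure)].append(asset.name)
    return dict(quadrants)


# Sample entries: names come from the slide, placements are assumptions.
INVENTORY = [
    DataAsset("Warranty History", "internal", "structured", True),
    DataAsset("Emails", "internal", "unstructured", True),
    DataAsset("Geolocation Data", "external", "structured", False),
    DataAsset("Social Media Data", "external", "unstructured", False),
]

if __name__ == "__main__":
    for quadrant, names in group_by_quadrant(INVENTORY).items():
        print(quadrant, names)
```

Once every asset is recorded this way, the quadrants that are valuable but hard to acquire stand out as candidates for the experiments suggested earlier in the deck.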