At Strata 2013 last week, Pentaho had the privilege to host a speaking session with Ben Lloyd, Sr. Program Manager, AutoSupport (ASUP) at NetApp. Ben leads a project called ASUP.Next, which has the goal of implementing a mission-critical data infrastructure for a worldwide customer support program for NetApp’s storage appliances. With design and development assistance from Think Big Analytics and Accenture, NetApp has reached the “go-live” milestone for ASUP.Next and will go into production this month.
A Big Data Problem
More than 250,000 NetApp devices are deployed worldwide; they “phone home” with device statistics and diagnostic information and represent a continuously growing collection of structured data that must be reliably captured, parsed, interpreted and aggregated to support a large collection of use cases. Ben’s presentation highlighted the business and IT challenges of the legacy AutoSupport environment:
- The total cost of processing, storing and managing data represents a major ongoing expense ($15M / year). The storage required for ASUP-related data doubles every 16 months — by the end of 2013 NetApp will have more than 1PB of ASUP-related data available for analysis
- The legacy ETL (PL/SQL) and data warehouse-based approach has resulted in increased latency and missed SLAs. Integrated data for reporting and analysis is typically only available 72 hours after the receipt of device messages
- For NetApp Customer Support, the information required to resolve support cases is not easily available in the time required
- For NetApp Professional Services, it’s difficult or impossible to aggregate the volume of performance data needed to provide valuable recommendations
- For Product Engineering, failure analysis and defect signatures over long time periods are impossible to identify
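To put the growth figure in perspective, here is a rough back-of-the-envelope projection. It is only a sketch: it takes the 1PB expected at the end of 2013 as the baseline and assumes the 16-month doubling period simply continues unchanged, which the presentation did not claim. The function name `projected_size_pb` is made up for this illustration.

```python
def projected_size_pb(current_pb, months_ahead, doubling_months=16.0):
    """Project storage size, assuming steady exponential growth.

    Assumption: the doubling period stays constant over the whole horizon.
    """
    return current_pb * 2 ** (months_ahead / doubling_months)


# Baseline assumption: 1 PB at the end of 2013.
for months in (0, 16, 32, 48):
    print(f"{months:>2} months: {projected_size_pb(1.0, months):.1f} PB")
```

Under those assumptions the volume would reach roughly 8PB four years out, which is the kind of curve a fixed-capacity warehouse struggles to follow.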
Cloudera Hadoop: at the Core of NetApp’s Solution
The ASUP.Next project aims to address these issues by eliminating data volume constraints and building a Hadoop-centered infrastructure that will scale to support projected volumes. Ben discussed the new architecture in detail during his presentation. It enables a complete end-to-end workflow including:
- Receipt of ASUP device messages via HTTP and e-mail
- Message parsing and ingestion into HDFS and HBase
- Distribution of messages to case-generation processes and downstream ASUP consumers
- Long-term storage of messages
- Reporting and analytic access to structured and unstructured data
- RESTful services that provide access to AutoSupport data and processes
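For readers who find it easier to follow a workflow in code, the steps above can be sketched in a few lines of Python. This is a toy illustration, not NetApp's implementation: plain dictionaries stand in for HDFS and HBase, the `key=value` message layout is invented, and the names `AsupPipeline` and `open_case_on_failure` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AsupPipeline:
    # In-memory stand-ins (assumptions): a raw archive for long-term storage
    # and a keyed store for parsed records.
    raw_store: dict = field(default_factory=dict)
    parsed_store: dict = field(default_factory=dict)
    consumers: list = field(default_factory=list)

    @staticmethod
    def parse(raw_text):
        """Parse a hypothetical 'key=value' message body into a dict."""
        record = {}
        for line in raw_text.splitlines():
            key, sep, value = line.partition("=")
            if sep:  # skip lines that do not contain '='
                record[key.strip()] = value.strip()
        return record

    def receive(self, message_id, raw_text):
        """Archive the raw message, parse it, store it, then notify consumers."""
        self.raw_store[message_id] = raw_text
        record = self.parse(raw_text)
        self.parsed_store[message_id] = record
        for consumer in self.consumers:
            consumer(message_id, record)
        return record


cases = []


def open_case_on_failure(message_id, record):
    # Hypothetical downstream consumer: open a support case for failed devices.
    if record.get("status") == "FAILED":
        cases.append(message_id)


pipeline = AsupPipeline(consumers=[open_case_on_failure])
pipeline.receive("msg-1", "serial=ABC123\nstatus=OK")
pipeline.receive("msg-2", "serial=XYZ789\nstatus=FAILED")
print(cases)  # ['msg-2']
```

The point of the sketch is the shape of the flow: every message is archived as received, parsed once, and then fanned out to whichever consumers need it.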
Pentaho’s Data Integration platform (PDI) is used in ASUP.Next for overall orchestration of this workflow as well as implementation of transformation logic using Pentaho’s visual development solution for MapReduce. Pentaho’s main value to NetApp comes from shortening the development cycle and providing ETL and job control capabilities that span the entire data infrastructure, from HDFS, HBase and MapReduce to Oracle and SAP. Pentaho also worked closely with Cloudera to ensure compatibility with the latest CDH client libraries.
NetApp’s use of Hadoop as a scalable infrastructure for ETL is increasingly common. Pentaho is seeing this use case across a variety of industries including capital markets, government, telecommunications, energy and digital publishing. In general, the reasons these customers use PDI with Hadoop include:
- Leveraging existing team members for rapid development and ongoing maintenance of the solution. Most organizations have a core ETL team that can bring a decade or more of subject matter expertise to the table. By removing the requirement to use Java, a scripting language or raw XML, team members are able to actively help with the build-out of jobs and transformations. This also lessens the need to recruit, hire and orient outside developers
- Increasing the “logic density” of transformations. As you can see in the demo example below, it’s possible to express a lot of transformation logic in a single mapper or reducer task. This makes it possible to reduce the number of unique jobs that must be run to achieve a complete workflow. In addition to improving performance, this can result in designs that are easier to document and explain
- Focusing on the “what”, not the “how” of MapReduce development. I was surprised (actually shocked) to see how many of the speakers at Strata were still walking through code examples to illustrate a development technique. The typical organization has no desire and little ability to turn itself into a software development shop. The language-based approach may work for the Big Data “Titans”, but not for businesses that need to implement Big Data solutions quickly and with minimal risk
Since this was a Pentaho-sponsored session, Ben summarized his experience working with the Pentaho Services and Engineering teams. His main points are illustrated in the photo above. Most of his points revolve around how Pentaho provided support during early development and testing. A large number of Pentaho employees contributed their time, energy and brain-power to ensure the project’s success. Many enhancements in PDI 4.4 are a direct result of improvements needed to support ASUP.Next use cases.
What has Pentaho learned from this project? Pentaho gained a number of valuable insights:
- Big Data architectures to support low-latency use cases can be complex. Not only are multiple functional components needed, but they must integrate with existing systems such as enterprise data warehouses. These architectures demand a high degree of flexibility
- Big Data projects require customers, system integrators and technology providers to “plumb the last 5%” as the solution is being developed. Inevitably, new capabilities are used for the first time and need to be fine-tuned to support real-world use cases, data volumes and encoding formats. A good example is PDI’s support for Avro. Although we anticipated needing to adapt the existing Avro Input step to work with NetApp’s schemas, we only understood the full set of requirements after seeing their actual data during an early system test
- Pentaho’s plugin-based architecture isolates the core “engines” from the layer where point-functionality is implemented. Pentaho was able to implement all of the required enhancements without a single architectural change. The Avro enhancements and other improvements (such as HTableInput format support for MapReduce jobs) were all coded and field-deployed via updates to plugins, completely eliminating the possibility of introducing defects into PDI’s data flow engine.
- Open source is a significant “enabler” making it easy for everyone to understand how integration works. It’s hard to overestimate the importance of code transparency. It allows the customer, the system integrators and the technology partners to get right to the point and experiment quickly with different designs.
It’s been a pleasure working with NetApp and its partners on the ASUP.Next solution. We look forward to continuing our work with NetApp as their use of device data evolves to exploit new opportunities not previously possible with their legacy application.
Dave Henry, SVP Enterprise Solutions