June 9, 2010
Top 5 advances in ETL and BI in the past 15 years:
- Near Real-time Data Warehouses
- Drag and Drop ETL
- Ad hoc Analytics
- Metadata Driven ETL
What is missing from the list above? BI use has not penetrated everyday business life the way we would have expected. But wasn’t that the point we were all heading towards, BI for the masses? The goal of information transparency that everyone was looking for seems to have gotten lost by most vendors in the race to have the biggest and baddest feature list. Very powerful BI technology exists today to get at pretty much any data and slice and dice it in ways that we could only have imagined years ago. Yet the complexity and disparity of that technology continues to be a major roadblock to getting everyday business managers the information they need to best run their businesses. Add in the ever-changing, dynamic nature of business today, and we find the chasm between the IT department and business users largely intact, with business users seeking more and more self-service BI on their own.
That is why we set out on our Agile BI initiative, to solve these obvious problems that other vendors ignore. The market and competitive response to our Agile BI initiative has been fun to watch. Suddenly, lots of competitors are talking about how to make their technology more agile and industry analysts are again writing about agile as well.
Unfortunately, competitors miss the point. Pentaho’s Agile BI initiative looks to make things simpler, not more complex. This isn’t about adding more technology to the mix; it is about using the technology that we already have in more agile and elegant ways so that we can bridge the chasm between IT and business users. It’s not about long-winded explanations of technology infrastructure that only technology geniuses can understand; it’s about opening up the BI process so that IT and business users can collaborate and deploy business-relevant BI applications more quickly.
Isn’t that the real point of BI, getting at more information quicker?
Keep it simple, folks.
VP, Product Marketing
May 19, 2010
Earlier today Pentaho announced support for Apache Hadoop – read about it here.
There are many reasons we are doing this:
- Hadoop lacks graphical design tools – Pentaho provides pluggable design tools.
- Hadoop is Java – Pentaho’s technologies are Java.
- Hadoop needs embedded ETL – Pentaho Data Integration is easy to embed.
- Pentaho’s open source model enables us to provide technology with great price/performance.
- Hadoop lacks visualization tools – Pentaho has those.
- Pentaho provides a full suite of ETL, Reporting, Dashboards, Slice ‘n’ Dice Analysis, and Predictive Analytics/Machine Learning.
The thing is, taking all of these together, Pentaho is the only technology that satisfies every one of these points.
You can see a few of the upcoming integration points in the demo video (above). The ones shown in the video are only a few of the many integration points we are going to deliver.
Most recently I’ve been working on integrating the Pentaho suite with the Hive database. This enables desktop and web-based reporting, integration with the Pentaho BI platform components, and integration with Pentaho Data Integration. Between these use cases, hundreds of different components and transformation steps can be combined in thousands of different ways with Hive data. I had to make some modifications to the Hive JDBC driver and we’ll be working with the Hive community to get these changes contributed. These changes are the minimal changes required to get some of the Pentaho technologies working with Hive. Currently the changes are in a local branch of the Hive codebase. More specifically they are a ‘Short-term Rapid-Iteration Minimal Patch’ fork – a SHRIMP Fork.
Technically, I think the most interesting Hive-related feature so far is the ability to call an ETL process within a SQL statement (as a Hive UDF). This enables all kinds of complex processing and data manipulation within a Hive SQL statement.
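To make the idea concrete, here is a minimal sketch of calling a transformation from inside a SQL statement. It is not Pentaho or Hive code: it uses Python's built-in `sqlite3` module as a stand-in database, and the `clean_email` function, the `leads` table and the sample rows are all hypothetical. The point is only the shape of the technique: register a function with the SQL engine, then call it per row in a query.

```python
import sqlite3


def clean_email(raw):
    """Hypothetical stand-in for an ETL step chain: trim, lowercase, validate."""
    if raw is None:
        return None
    value = raw.strip().lower()
    # Reject anything that does not look like an email address.
    return value if "@" in value else None


def run_demo():
    # SQLite in memory stands in for Hive here (an assumption for illustration).
    conn = sqlite3.connect(":memory:")
    # Register the Python function so SQL statements can call it by name.
    conn.create_function("clean_email", 1, clean_email)
    conn.execute("CREATE TABLE leads (id INTEGER, email TEXT)")
    conn.executemany(
        "INSERT INTO leads VALUES (?, ?)",
        [(1, "  Ann@Example.COM "), (2, "not-an-email"), (3, None)],
    )
    # The transformation runs once per row, inside the SQL statement.
    rows = conn.execute(
        "SELECT id, clean_email(email) FROM leads ORDER BY id"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_demo():
        print(row)
```

In Hive the registered function would be a UDF, and the body could hand each row to a full ETL process rather than a three-line cleanup, but the calling pattern from SQL is the same.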
There are many more Hadoop-related ETL and BI features and tools to come from Pentaho. It’s gonna be a big summer.
Learn more - watch the demo
March 31, 2010
I’m a happy guy today. Not only was I able to make it onto the lake this morning for some early morning water skiing, I also just reviewed the results and feedback from Pentaho’s PDI 4.0 launch yesterday. This release was a record setter for us on a number of fronts. Here are a few highlights:
- Over 2,600 PDI 4.0 EE downloads in one day
- We had 1,850* people registered for the live WebEx today, ‘Pentaho Defines a Better Way to Build BI Solutions’ – the most people we’ve ever had!
- Pentaho.com reached an all time high for unique visitors.
- Forum activity is up 251% over normal days.
- Traffic to Demo.pentaho.com is up 192%.
- Great coverage and positive response from industry press and the community on Twitter.
To me this proves that the market is hungry for a new approach to an old idea. If you are one of the people looking for a BI solution that is quicker to build, easier to adapt and faster to deliver value, you can see that you are not alone.
*If you were not one of the 1,850 who attended the live webinar, ‘Pentaho Defines a Better Way to Build BI Solutions,’ you can watch the replay here and access additional resources and recordings.
If you were one of the 1,850 who attended the webinar, what did you think? What were some of your key takeaways?
March 30, 2010
Sometime in Q3 2009 I was working with our CTO, James Dixon, using Pentaho tech to extract information from Salesforce.com and Eloqua to analyze our own leads. Yes, we do eat our own dog food. We were constantly going back and forth through the process of extract, model and visualize. As you can imagine, after the 7th iteration James was sick and tired of repeating this process, especially with me changing requirements with each iteration. This problem is widespread across all BI tech, and we knew that somebody needed to improve it.
Over happy hour at The Miller Ale House, or “West Campus” as we call it, we got to thinking and brainstorming about how to:
- Remove the pain in the process
- Streamline the process
- Make it easier so non-techies can achieve success
We also knew that whatever we could do to remove any barriers in building and deploying BI applications would certainly come back to benefit our community and customers.
This was the start of our path to create the Agile BI initiative. With this concept, we also looked for feedback from our open source community and SI partners who are currently doing real world implementations.
Today, I am very proud to announce the release of Pentaho Data Integration (PDI) 4.0, the next product milestone in the company’s Agile BI initiative. PDI 4.0 is the world’s first fully unified ETL, modeling and data visualization development environment that enables developers and business users to work side by side to build and deploy BI applications in minutes to hours rather than weeks to months. In addition, we have added Enterprise Edition improvements in security, team collaboration, content versioning, repository, and integrated scheduling. Available either on premise or in the cloud, PDI 4.0 is a solution that enables BI projects that are quicker to build, easier to adapt and faster to deliver value.
Now I can sit down with James and together we can quickly extract data from our CRM and marketing applications, model the data and visualize the data all in the same development environment. I can get the information I need much more quickly and James’ life will be much easier. And isn’t that what we all want?
Let the new BI era begin!