Saturday, February 19, 2011

Agile BI at Pentaho

Ian Fyfe, Chief Technology Evangelist at Pentaho, showed us what the Pentaho Open Source Business Intelligence Suite is and where they are going with it when he spoke to the February meeting of the SDForum Business Intelligence SIG on "Agile BI". Here are my notes on Pentaho from the meeting.

Ian started off with positioning. Pentaho is an open source Business Intelligence Suite with a full set of data integration, reporting, analysis and data mining tools. Their perspective is that 80% of the work in a BI project is acquiring the data and getting it into a suitable form, and the other 20% is reporting and analysis of the data. Thus the centerpiece of their suite is the Kettle data integration tool. They have a strong Mondrian OLAP analysis tool and Weka Data Mining tools. Their reporting tool is perhaps not quite as strong as those of other Open Source BI suites that started from a reporting tool. All the code is written in Java. It is fully embeddable in other applications and can be branded for that application.

Ian showed us a simple example of loading data from a spreadsheet, building a data model from the data and then generating reports from the data. All of these things could be done from within the data integration tool, although they can also be done with standalone tools. Pentaho is working in the direction of a fully integrated set of tools with common metadata between them all. Currently some of the tools are thick clients and some are web-based clients. They are moving to have all their client tools be web-based.

We had come to hear a presentation on agile BI and Ian gave us the Pentaho view. In an enterprise, the task of generating useful business intelligence is usually done by the IT department in consultation with the end users who want the product. The IT people are involved because they supposedly know the data sources and they own the expensive BI tools. Also, the tools are complicated and using them is usually too difficult for the end user. However, IT works to its own schedule and through its own processes, and takes its time to produce the product. Often, by the time IT has produced a report, the need for it has moved on.

Pentaho provides a tightly integrated set of tools with a common metadata layer so there is no need to export the metadata from one tool and import it into the next one. The idea is that the end-to-end task of generating business intelligence from source data can be done within a single tool or with a tightly integrated suite of tools. This simplifies and speeds up the process of building BI products to the point that they can be delivered while they are still useful. In some cases, the task is simplified to such an extent that it may be done by a power user rather than being thrown over the wall to IT.

The audience was somewhat sceptical of the idea that a sprinkling of common metadata can make for agile BI. All the current BI suites, commercial and open source, have been pulled together from a set of disparate products and they all have rough edges in the way the components work together. I can see that deep and seamless integration between the tools in a suite will make the work of producing Business Intelligence faster and easier. Whether it will be fast enough to call agile, we will have to learn from experience.
