Thursday, 20 September 2012

Reporting framework with Apache POI, MS Excel and BIRT


Recently I came across a unique requirement, where in an inventory entitlement analysis model was developed in MS Excel and the customer wanted to move away from MS Excel model and turn it into a more scalable, extendable model, which could then further be deployed as SaaS model. Though the requirement was to move away from MS Excel, considering the business domain peculiarities, practicalities and user behavior, using MS Excel in some way was a strong possible solution option. Also the requirements demanded usage of statistical computation for forecasting and estimations where in the calculation power of MS Excel would have come real handy, so that was finally the option of choice. 

Problem Statement

The pressing need of moving away from MS Excel based model for the customer was result of few things. Firstly, the MS Excel based model was “evolved” to what it was and so a lot of static references were introduced all over the sheets and it had reached a state where making a slight change like adding a new product or a warehouse would mean a lot of work and making it very rigid and practically impossible to scale. Another reason to move away from MS Excel based model was the need to have a relational database and a way to maintain full history of actual and forecast data, to be able to do better prediction and enhanced reporting. 

Solution

Basically, I had three options to choose from – 
1. Develop the entire app in excel by re-modeling the data storage and using macros 
2. Use excel as front end and plug-in a database engine like MS Access 
3. Use excel only as calculation engine and develop a web based application that ‘uses’ the excel model using third party libraries for excel api. 

After studying the customer problem statements, requirements and understanding their business, and knowing the typical user behavior, I decided to still make use of MS Excel, in a more optimal fashion, which led me to choose the third option. 

Overall solution then went on evolving as we dug deeper into each component of the high level architecture. JSF was chosen to cater for the pluggable user interface need on account of its recent popularity and active community (struts community didn’t show any activity for last few months). For relational database we chose MySql’s community version – that was an easy choice. Another very crucial requirement was to be able to generate crosstab reports, line and pie charts. Keeping the reporting dynamism in mind, I decided to use a reporting framework – again, an open source one. There were couple of options to choose from where in Jasper and BIRT topped the list. Some of the pros we saw in BIRT compared to Jasper were in terms of ease of development, implementation of crosstabs and drill-down and linked reports. After few POCs, we choose BIRT over Jasper. BIRT also had a third party library (JSF4BIRT) which made it very easy to incorporate BIRT report into JSF. 

The application being a forecasting application, involved some complex evaluation models and formulae which, I thought, would be very easy to implement in MS Excel and fairly complex, otherwise. So I decided to deploy MS Excel to this task. Apache’s POI was then deployed to facilitate integration of the java application with MS Excel documents. 

For other calculations, which were huge in number but not so complex, PLSQL seemed to be right place since if they were to be done in the middle layer, would involve huge number of database calls hampering the performance, so in order to do these calculations closer to where data resides, PLSQL Stored Procedures were deployed. 

There we had, as envisioned, a web based J2EE application where in, the business user could easily maintain the master data and during every forecast run (typically weekly or monthly), download an automatically generated, dynamic MS Excel based input template. He/she would then fill it in at leisure and upload it back into the system. Basic data validations being handled using the MS Excel validation framework itself, System would then do further business validations and dump the data into the staging area. User would then “run” the model to trigger the system to generate forecasts and variety of reports in batch and using BIRT reporting engine, just render the filter based parameterized crosstabs and charts, on the report pages.

Future

This framework could further be enhanced to incorporate various goodies like BI kind of features letting users create their own reports, adapters with ERPs to automate process of data population and report generation, tagging exceptional data points and preserving tags for future reference and so on.

2 comments:

  1. Hello Abhay, Nice info. I am pretty much designing for similar requirement. Can you calrify the following. Why did you require a PL/SQL for calculation if you maintain Excel template. Are the calculations so complex that if cant be done in Excel Macros. You can have input sheets populated and render the other sheets based on caculation. Doesnt BIRT+Apache POI support this without PL/SQL?

    Thanks in advance

    ReplyDelete
  2. Hi Vijay, yes, excel macros could also be used, but then there were limitations in cases where users wanted dynamic reports, which mean dynamic filters e.g. dates , where I needed to cover more than one business cycles, data for which was not present in the Excel templates which had data only for one business cycle. Involving history data along with current data in calculations was another reason for moving some calculations to PLSQL layer.

    ReplyDelete