
Sunday, 24 October 2010

Too many choices for the modern analytics Solution Architect

We analytics practitioners have always had the luxury of alternatives to the RDBMS as part of our data architectures. OLAP of one form or another has been providing what one of my colleagues calls ‘query at the speed of thought’ for well over a decade. However, the range of options available to a solution architect today is bordering on overwhelming.

First off, the good old RDBMS offers hashing, materialised views, bitmap indexes and other physical implementation options that don’t really require us to think too differently about the raw SQL. The columnar database, and implementations of it in products like Sybase IQ, are another option. The benefits are not necessarily obvious. We data geeks always used to think the performance issues were about joining, but then the smart people at InfoBright, Kickfire et al told us that shorter rows are the answer to really fast queries on large data volumes. There is some sense in this: given that disk I/O is the absolute bottleneck, fewer columns mean less redundant data being read. The Oracle and Microsoft hats are in the columnar ring (if you will excuse the mixed geometry and metaphor) with Exadata 2 and Gemini/VertiPaq, so these are becoming mainstream options.
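A toy sketch makes the I/O argument concrete. This is not any vendor’s implementation, just an illustration of the principle: a query that touches one column scans only that column’s bytes in a column store, whereas a row store must read every record in full.

```python
# Toy illustration of why columnar storage reduces disk I/O for
# analytic queries. Data, sizes and the byte estimate are all
# hypothetical; real engines also add compression on top of this.

# Row store: each record is stored (and scanned) whole.
rows = [
    {"order_id": i, "customer": f"c{i % 100}", "amount": float(i), "notes": "x" * 200}
    for i in range(10_000)
]

# Column store: each column is stored (and scanned) independently.
columns = {
    "order_id": [r["order_id"] for r in rows],
    "customer": [r["customer"] for r in rows],
    "amount": [r["amount"] for r in rows],
    "notes": [r["notes"] for r in rows],
}

# SELECT SUM(amount): the row store reads every full record...
row_bytes_scanned = sum(len(str(r)) for r in rows)
# ...while the column store reads only the 'amount' column.
col_bytes_scanned = sum(len(str(v)) for v in columns["amount"])

print(f"row store scans    ~{row_bytes_scanned:,} bytes")
print(f"column store scans ~{col_bytes_scanned:,} bytes")
```

The wide `notes` column dominates the row scan, which is exactly the redundant reading a columnar layout avoids.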


Data warehouse appliances are yet another option. These combined hardware, operating system and software solutions, usually built on massively parallel processing (MPP), deliver high performance on really large volumes. And by large we probably mean Peta not Tera. Sorry NCR, Tera just doesn’t impress anyone anymore. And whilst we are on the subject of Teradata, it was probably one of the first appliances, but then NCR strategically decided to go open shortly before the data warehouse appliance market really opened up. The recent IBM acquisition of Netezza and the presence of Oracle and NCR are reshaping what was once considered niche and special into the mainstream.


We have established that the absolute bottleneck is disk I/O, so in-memory options should be a serious consideration. There are in-memory BI products, but the action is really where the data is. In-memory databases include TimesTen (now Oracle’s) and IBM’s solidDB. Of course, TM1 fans will point out that they had in-memory OLAP when they were listening to Duran Duran CDs, and they would be right.

The cloud has to get a mention here because it is changing everything. We can’t ignore the databases and data platforms that have grown out of the need for massive data volumes, like Google’s BigTable, Amazon’s RDS and Hadoop. They might not have been built with analytics in mind, but they offer ways of dealing with unstructured and semi-structured data, and this is becoming increasingly important as organisations include data from online editorial and social media sources in their analytics. All of that being said, large volumes and limited pipes are keeping many on-premises for now.

So, what’s the solution? Well, that is the job of the Solutions Architect. I am not sidestepping the question (well actually, I am a little). However, it’s time to examine the options and identify which information management technologies should form part of your data architecture. It is no longer enough to simply choose an RDBMS.

Thursday, 30 July 2009

BI Application Design – The Missing Step

If Amazon applied typical BI application design techniques to their web site, the user experience might be very different. For example, I might be browsing the DVDs for a gift and need a little inspiration, so I hit the 'reports' tab. Here I get presented with a long list of reports, one of which is 'Top n Products'. I then get prompted with a pick list of products where I select 'DVD', and finally I select '10' to get the Top 10. The list is pretty interesting, but there is nothing that grabs me so I decide to look at the next 10. I re-run the report, selecting 20 instead of 10. This time, I spy the ideal box set and go back to the main site to make my purchase. I am sure you get the picture by now. It sounds awful, clunky and not at all like the actual Amazon experience.

BI reports are sometimes designed with little or no understanding of the decisions they support. More often than not, the requirement is communicated as a report layout, and the underlying business need is at best inferred or at worst lost in a fixed specification of rows, columns, filters, sorting and grouping. Without an understanding of the audience for the report or how the information is used, the layout communicates a general requirement in a one-size-fits-all report.

A BI specialist on a project we are currently assisting with recently demonstrated how it should really be done.

The requirement is for a team of internal sales reps, each targeted with a number of customers to call each month. Our BI specialist was asked to provide a report comparing actual calls made with target. Rather than creating one multi-purpose report, he created three specific solutions: one for the rep, one for the sales managers and one for the senior management team. The report for the rep contains the number of calls they have made, their target and how many calls they need to make today, this week and for the remainder of the month. It is run daily, so the rep can plan their daily, weekly and monthly activity based on this information. The report for the sales managers is ranked so that they can manage the individuals accordingly. This report is weekly, reflecting the frequency with which they review and action the performance of their teams. The senior management team report is monthly and is a team summary for monthly and quarterly sales performance meetings with the sales managers.
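The rep’s daily view boils down to a small calculation. A minimal sketch, assuming hypothetical names and figures (nothing here is from the actual project; the "spread over remaining working days" rule is my own illustrative assumption):

```python
import calendar
from datetime import date

def rep_daily_report(calls_made: int, monthly_target: int, today: date) -> dict:
    """Hypothetical calculation behind a rep's daily call report:
    calls remaining this month, spread over the working days left."""
    last_day = calendar.monthrange(today.year, today.month)[1]
    # Count remaining weekdays (Mon-Fri), including today.
    workdays_left = sum(
        1 for d in range(today.day, last_day + 1)
        if date(today.year, today.month, d).weekday() < 5
    )
    remaining = max(monthly_target - calls_made, 0)
    # Ceiling division: reps should not fall behind by rounding down.
    per_day = -(-remaining // workdays_left) if workdays_left else remaining
    return {
        "calls_made": calls_made,
        "target": monthly_target,
        "remaining_this_month": remaining,
        "needed_today": per_day,
        "needed_this_week": min(per_day * 5, remaining),
    }

print(rep_daily_report(calls_made=42, monthly_target=120, today=date(2009, 7, 30)))
```

Run daily, this turns a bare actual-versus-target comparison into the numbers a rep can actually plan a day around.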

Of course, the Amazon comparison is not completely fair. Whilst it is using highly summarised information for the purpose of decision making, the decisions are discrete and simple. Which computer game should I buy? What computer games did others who bought this one also buy? It is also an application that has a single user type – customer. Not only that, there are millions of users, which makes the design effort not only cost-effective but critical to generating revenue.

However, some of the design principles absolutely apply:

  1. Understand the decision (which reps do I need to coach to make more calls?)
  2. Understand the information required to make the decision (how many calls they have made, what is their target and what is the variance)
  3. Identify the actions that can be made as a result of decision making (reps may be incented, coached or disciplined)
  4. Link the information as closely as possible to the business process it supports (sales managers review activity levels every Friday morning)
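Principles 1 and 2 come together in the sales managers’ weekly ranking: the decision is who to coach, and the information is calls, target and variance. A sketch with made-up data (names and figures are illustrative, not from the project):

```python
# Hypothetical weekly ranking behind the sales managers' report:
# actual calls vs target, ranked worst-first by variance so coaching
# effort goes to the reps furthest behind (the action in principle 3).
reps = [
    {"rep": "Alice", "calls": 30, "target": 25},
    {"rep": "Bob",   "calls": 12, "target": 25},
    {"rep": "Carol", "calls": 20, "target": 25},
]

for r in reps:
    r["variance"] = r["calls"] - r["target"]

# Worst variance first: the top of the list needs attention on Friday morning.
ranking = sorted(reps, key=lambda r: r["variance"])
for position, r in enumerate(ranking, start=1):
    print(f'{position}. {r["rep"]}: {r["calls"]}/{r["target"]} ({r["variance"]:+d})')
```

Principle 4 is then just scheduling: the same calculation, run weekly to land before the managers’ Friday review.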

Finally, this doesn't mean there is not a case for analysis. BI supports managers and not all decision making is predictable and routine. The business environment is continuously changing and each month or quarter will throw up new and interesting business challenges to solve. These can be subtle variants of historical business challenges or completely new. This is where decision makers need the flexibility to explore and analyse trends, exceptions and patterns to validate what is happening and determine the actions that they will take next.

The answer, as is often the case, lies in a combination: BI reports that support decision makers as close as possible to the point at which they need to make the decision, along with the flexibility we have come to expect from good OLAP tools.