Tuesday 23 June 2009

Business Intelligence and the Semantic Web

Analytics strategist Seth Grimes was in town last week speaking in Covent Garden on the subject of Web 3.0. For those still catching up with what this means, the evolution of the web is generally thought of as:

Web 1.0

Retronym which refers to the web as largely a publishing paradigm

Web 2.0

The interactive web characterised by the rise of social networking

Web 3.0

The semantic web. Functionally rich and understandable to machines as well as people

Die Hard 4.0

Fourth instalment of the Bruce Willis franchise released in the US as Live Free or Die Hard

Whilst Web 3.0 is some way off, there are some early glimpses into the world of possibilities with search engine innovations from Google and Wolfram Alpha. Fellow BI blogger Peter Thomas, who also writes about this subject in his blog Literary Calculus, uses the example http://www.google.co.uk/search?&q=age+of+the+pope.

What interests me about the semantic web, though, has much less to do with what might be described as contextually aware searching and more to do with the impact on Business Intelligence.

BI, since the 1970s, has been almost entirely focused on numbers. This is understandable given that these tend to be highly structured, organised and available in databases. Arguably though, this wasn't the original vision. In his 1958 (yes, 1958) article "A Business Intelligence System", IBM visionary Hans Peter Luhn describes the objective as "to supply suitable information to support specific activities carried out by individuals, groups, departments, divisions.." and what Luhn goes on to describe is statistical analysis of text and documents as information sources.

Seth Grimes, in his Intelligent Enterprise article BI at 50 Turns Back to the Future, makes the point that BI has lately returned to this original vision of analysing text and document sources which, after all, make up the vast bulk of corporate data.

Early innovators, including my own company Artesian, are already making progress in the field of analysing documents for the purpose of business intelligence. This new breed of semantically aware business intelligence technology can "supply suitable information" to "support activity" by answering questions like:

  • 'which of my competitors are growing and which are declining?'
  • 'are my customers launching initiatives that could be supported by our products or services?'
  • 'do market behaviours indicate a declining need for our products?'
  • 'how did customers respond to our competitors when they changed their business in a way that we are also considering?'

This isn't to suggest that semantically aware BI should function like search engines. Indeed, I would strongly argue that it should not. Search engines deliver a single set of answers to a single ad-hoc question. Business processes are much more frequent, diverse and repeatable, and they involve wider audiences. Further, businesses require scalability and high degrees of automation. Here, the business questions need to be regularly monitored, visualised, distributed, shared and collaborated around. Interestingly, this has more in common with today's best practice BI systems.

So there it is. The shape of semantically aware BI is emerging through the fog of future developments and it is, unsurprisingly, an evolution of what is best about business intelligence as we know it today, with some breakthrough thinking that will unlock meaning from the colossal volume of corporate and online documents.

Wednesday 17 June 2009

BI Project Managers and Eyebrows

Like eyebrows, you don't really notice project managers when they are there but if you are rash enough to let them go you will end up looking startled and stupid.

I point this out because over a period of more than 10 years I have had the opportunity to observe many, many BI projects and one of the most surprising patterns is the scaling back of project management largely because the project is going well!

The openly declared reason is usually cost or some other misdirection, but it is invariably preceded by pointed questions about what value the project manager has been adding to a project that is going so well. Perversely, the better the project is doing, the higher the risk that there will be murmurings about things like the overhead of project reporting and that project management activity will ultimately be reduced or even removed altogether. It has become as common and predictable as it is deeply and logically flawed.

Perhaps this is one of the phenomena that explain why the trend for project failure is not getting any better. According to the latest Standish Group report, which is covered by Peter Taylor, author of 'The Lazy Project Manager', in his blog 'Are your Project Managers working too hard to be successful?', instances of challenged (late, over budget or reduced deliverable) projects continue to rise.

As BI practitioners we often value technical skills, competency in the reporting tool and the deep musing of the data architect, and yet have a blind spot when it comes to project management. This may be partly because early BI projects were often departmental in scale. It may also be because many of today's BI Competency Centres originated as 'skunk works' initiatives and see project management as all methodology and meetings. Either way, we ignore it at our peril.

It is true that project management can be at its most obviously valuable when priorities need resetting, additional resources have to be secured or controlled management escalation is called for. However, we shouldn't assume that if a Project Manager is not doing these things that they are not doing anything.

Planned projects with predictable timescales, along with accurate project reporting, are rewarded with confidence from our business sponsors. A considered set of risks, based on real-life experience of BI projects, will stop those risks becoming time-sucking issues, and properly managed issues will not become show-stoppers.

A good Project Manager may make it look easy, but don't take the lack of firefighting and crisis meetings as an indication that nothing is being done. Look deeper for the benefits of order over chaos or be prepared to invest in an eyebrow pencil for a look that is decidedly a poor second best.



Monday 1 June 2009

Small Children, Energy and Efficient Data Warehousing

Last week, I referred to Peter Thomas and his article Using multiple BI tools in a BI implementation – Part II. In the article, Peter points out that the way to drive consistency across dimensions and measures is to define as much logic as possible in the data warehouse.

I was musing over this again this weekend (I hear the cries of what an interesting life you have) whilst out for lunch with friends and their small children (ages 9 and 7). On the short walk to the restaurant I was amused by how different their approach to getting from A to B was from that of the 'grown ups'. We were focused on getting to the destination in a relatively direct and efficient way. However, the children ran to and fro, stopped, doubled back, looped a few circles and even randomly waved their arms in the air. They generally spent as much time in a state of motion as possible. Clearly the criterion of small children, when en route to a destination, is to use the maximum amount of energy possible!

This, if you will go with me on this, is rather like trying to implement data warehouse consistency in the BI tools rather than further upstream in the data warehouse. You do eventually get to the destination, but will probably be exhausted, out of breath and hungry. This is fine if you are two children out for dinner with some stuffy grown-ups, but it is not an efficient use of the somewhat limited time of a BI practitioner.

A typical BI architecture comprises tiers that include:

  • Source Systems
  • ETL
  • Data (Data Warehouse, Data Marts, OLAP Cubes)
  • BI Metadata
  • BI Application (Reports, Scorecards, Dashboards)

A properly architected data warehouse (more on this in later blogs) should have been built against an enterprise schema and is therefore *the* consistent representation of business information. Common definitions of customers, departments, profit and products live here. If there is one good reason for this (although there are many) then it is simply that these can all be defined once in the data warehouse but would have to be defined many, many times in what can often be hundreds of reports that comprise a BI solution.
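To make the "define it once" point concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for a data warehouse. The table, view and function names (sales, sales_with_profit, profit_by_product, profit_by_region) and the sample figures are all invented for illustration, and a real warehouse would of course be far richer. The idea is simply that profit is defined in one place and every report reads that definition rather than recalculating it.

```python
import sqlite3


def build_warehouse():
    """Create a toy in-memory 'warehouse' (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sales (product TEXT, region TEXT, revenue REAL, cost REAL);
        INSERT INTO sales VALUES
            ('Widget', 'North', 100.0, 60.0),
            ('Widget', 'South', 150.0, 90.0),
            ('Gadget', 'North', 200.0, 150.0);

        -- The single, shared definition of profit lives in the warehouse.
        CREATE VIEW sales_with_profit AS
            SELECT product, region, revenue, cost,
                   revenue - cost AS profit
            FROM sales;
        """
    )
    return conn


def profit_by_product(conn):
    """Report 1: reuses the warehouse definition of profit."""
    rows = conn.execute(
        "SELECT product, SUM(profit) FROM sales_with_profit "
        "GROUP BY product ORDER BY product"
    )
    return dict(rows.fetchall())


def profit_by_region(conn):
    """Report 2: same definition, different dimension."""
    rows = conn.execute(
        "SELECT region, SUM(profit) FROM sales_with_profit "
        "GROUP BY region ORDER BY region"
    )
    return dict(rows.fetchall())


if __name__ == "__main__":
    conn = build_warehouse()
    print(profit_by_product(conn))
    print(profit_by_region(conn))
```

If the business later decides that profit should also net off, say, delivery charges, only the view changes and both reports pick up the new definition. Had each report carried its own revenue minus cost calculation, every one of them would need finding and fixing.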

One of the reasons that we fail to do this is that inconsistency is often made visible for the first time by the BI tools. At this point the project momentum is around building metadata models and reports. Inconsistencies are fixed where the resources are focused ... in reports. Add to this that revisiting the design may need involvement from the ETL developer, the DW designer and the business analyst and, if there is a lack of clarity, the business users. It is no surprise that the report author is inclined to fix it where they stand. After all, the tools make the fix simple, and it is only when the report author has built the same calculation for the tenth time that they begin to doubt the wisdom of starting down that road in the first place. And of course, the initial build of the BI application is only the beginning. Many more reports will be built over the life of the BI application.

So work hard to establish the correct definitions during the design of the data warehouse and you will reap productivity gains. Leave it to the BI application only if you have the carefree attitude, the free time and the energy levels of an eight-year-old.