I have been around OLAP databases since 1994. By “around” I mean that I have used them off and on for all those years without actually becoming a true BI dev – rather I am an occasional consumer.
This is because I have focused on Excel and Access and SQL Server (and lately SharePoint), developing transactional databases and budgeting and reporting documents using standard aggregations of business data. Having the data ensconced in a Data Warehouse with OLAP dimensions and Fact tables occasionally has been a big asset for me and frankly I’ve never understood why BI isn’t more broadly used.
Of course the answer has been that BI has been the realm of the BI gate-keepers in every organization I have known and the last things they seem to want is for anyone to actually get any value out of their precious snowflake designs without their help :-).
Also it seems that to be a BI dev you have to be an MDX guru and frankly it is your job to make the whole thing so complex and inpenetrable that the clients or their company needs them to use the technology – job security anyone?
Now with the release of PowerPivot Microsoft is trying to make BI available to the general IW marketplace but to date it seems to me to have been designed primarily for those same BI gurus with the introduction of the DAX “language” (or whatever you call it) into the PowerPivot environment.
As I have mentioned before, I have found DAX an almost impossible hill to climb. I always thought and now I am convinced that this is because I am fundamentally a relational database guy and I believe that nearly everything I need to do in BI can be accommodated in the data sources themselves (either in SQL Views or Functions or in Access queries), but that you needed complex, convoluted DAX expressions if you wanted to do the one thing that I believe is the core of BI – Hierarchies. Without intrinsic hierarchies I see very little value to a product like PowerPivot (other than its amazing capacity) as anything but just another way to do aggregations and to source Pivot Tables.
But Hierarchies in PowerPivot CTP3 change everything for me. You see, I have used and plan to use BI hierarchies as the secret to a reliable, flexible accurate reporting capability. With Hierarchies I only have to define the hierarchical order of the Dimensions of my database and I can then roll up my data at any level up and down and across my financial, sales, production or budgeting data in a flash, and as new values show up or disappears in the data it gets automatically assigned to the correct level in the hierarchy and reflects immediately in all levels of roll-ups. This means a reporting capability that does not require the client to call me every time they add a new office or SKU or reorganize their regions and restructure their management team responsibilities. Everything is managed from tables through the established hierarchies of the “Cube”.
A properly designed OLAP data store should be able to handle all these changes easily and consistently while maintaining the value of historical data and be automatically able to accommodate the future. And it is Hierarchical Dimensions related to Fact tables of core data that are how most of us can get value out of BI that we can’t get any other way. Of course there are other more complex uses of OLAP design, but for the vast majority of business users THIS is the core use of BI … to do aggregations up and down and across their business data, across time in various increments, looking for jewels of revelation, for planning and budgeting help, for tracking KPIs and for standard period-end management reporting.
Lets explain … By assigning your corporate structure to a Hierarchy such as World, Region,Country,State or Province, County, City you can expand a level in a pivot table and it will KNOW what the CHILD level should be shown automatically by design. Without Hierarchies those relationships will have to be defined in every Pivot Table of CUBE Formulas usage you make from the data. This way the levels of the hierarchy can now be defined in PowerPivot at the table level and are inherited every time you create a report. The key is that the user does not need to know that hierarchy and they can’t screw it up as easily on a report – it is “baked in” the data source.
So now that PowerPivot will have Hierarchies I believe that I will be able to do the vast majority of the things I used to have to do using DAX functions by using Hierarchies instead. I also believe that if I am a SQL Server or Microsoft Access Developer I can do most if not all of my convolutions (transformations?) of data there rather than by using DAX, thereby being able to apply my existing knowledge to this new capability and also it will allow me to manage my data at the source rather than in some abstract middle-ware. I will use PowerPivot simply to manage my Hierarchies in my dimension tables to allow me to create cool new reports in Excel and/or SharePoint while managing the data at source as much as possible. Use the right tool for the right job.
This is NOT a criticism at all of DAX or PowerPivot, this is just an alternative way to approach using this product. Frankly it looks to me like DAX is a product for use by BI people who aren’t necessarily strong in Relational databases (which seems odd to me) and who definitely are not Excel gurus. If you are going to use ALL these tools together (data sources in SQL or Access, etc, Hierarchies in your Snowflake OLAP design in PowerPivot and your newly powerful and capable Excel Pivot Tables and CUBE formula-driven spreadsheets in either SharePoint or on the client) then the addition of Hierarchies in PowerPivot is the final shoe to drop.
Now we can get to work.
DAX is for those of us that are not SQL experts.
That’s kinda my thinking too. So fine . You can use DAX and I will use SQL statements to accomplish most of the same … ? Is that a problem?
Actually David…. I do believe that in order to properly do BI you could really benefit from an understanding of Relational Database Theory to begin with.
This is because your data sources are relational data tables anyway. Online Analytical Processing (OLAP) or Multi-Dimensional Databases were a concept added by Dr. Codd to his relational theories in order to optimize performance in analysis of relational data by “de-normalizing” the data to the maximum by aggregating data at every potential level of possible analysis. This originally meant grinding out the data overnight and storing the data in HUGE tables on expensive disks thereby allowing for the fastest possible querying of the aggregations at as many levels as you have defined in advance.
And yet as computers got faster and faster (and cheaper) and RAM and disk space and connectivity increased exponentially (and got cheaper and cheaper) and databases became more sophisticated, indexes worked faster and with more complexity the basic concept of OLAP still applied – it’s just that you can make it happen a whole lot faster on a whole lot more data than old E.F. ever imagined. In today’s world much of the OLAP processing can be done on-the-fly in real-time directly out of relationally joined tables.
But fundamental to using OLAP cubes to me is a fundamental understanding of how to use databases that respond to SQL like SQL Server and even Microsoft Access (especially for the departmental applications that PowerPivot can get a lot of value from). I think those that think they do not have an understanding of the practical use of relational databases probably know a lot more than they realize already – instinctively. But I suggest they take the time to get their heads down into the table architecture and learn basic Structured Query Language. It will be a big help.
I on the other hand will continue to work on my knowledge of DAX. The answer to me is to know SQL AND Relational database concepts, OLAP AND OLAP data concepts (such as hierarchies) , PowerPivot functionality (including DAX), AND Excel Pivot Tables AND overall Excel functionality. Then use the right tool for the right part of your solutions. I don’t think you can do BI right using PowerPivot without them all.