I have just completed a major conversion of one of my “traditional” spreadsheet reporting solutions to PowerPivot for a major coal mining company. It has been one of the most satisfying solutions I have ever done and convinces me even more than before of the importance of PowerPivot to all Excel users.
While the older solution relied on data gleaned from SQL Server, the presentation layer relied on standard Excel formulas. This meant that if there was ever a change, deletion or addition to the levels of the model (i.e. new mine, closed mine, reorganization, etc.) I personally had to make a detailed and frightening change to the core report that impacted levels up and down, right and left throughout the model (new Sub-Sub-Totals, Sub-Totals, Totals and Grand Totals – new variance analyses up and down through the model). This was a tedious and painful job and one that often cropped up at inconvenient times for me - always when I was busy working somewhere else in the world.
The second problem was that this business model was just too complex for a standard Pivot Table solution – so that was not an answer. One thing that always makes me shake my head is how complex businesses can make their business processes and how it is usually legitimate.
PowerPivot on the other hand is capable of multi-table, DAX-driven calculations that can then be dropped into the enhanced PowerPivot Pivot Tables in a way that accomplishes exactly what the original model did – but totally data-driven and needing zero maintenance by me if any of the things I mentioned earlier happen (additions deletions, reorgs, etc.)
What I Learned
I learned (or re-learned) the following very important things in this process:
- This shit isn’t easy!! Anyone who says that PowerPivot makes reporting in Excel easy should move beyond Adventureworks and try doing REAL solutions for REAL businesses
- I could not have done any of this without my knowledge of relational databases in general and SQL Server in particular.
- “Pre-Aggregation” is critical to a successful PowerPivot solution. The idea of bringing millions of records up into Excel and then doing all the work there is not only just not practical but simply ridiculous. Why would you want to bring detailed transactional data into the backend of an analytical solution? For example in my solution I only cared about how much of each type of coal was mined at each mine by day and only for the period I want to analyze. In this case it shrunk a nearly million row data table of 4 years of truck data into 22,000 rows of aggregated data for the year 2013. I had all the info I needed and the size was manageable and most importantly the time for refresh is acceptable. It also means that my solution can live in client-side Excel and does not need to be promoted to SharePoint to succeed – although SharePoint is going to also be part of the solution as well.
- SQL Server Functions are the most efficient way to return data to your PowerPivot solution. A straightforward SQL statement like
select * from fnTruckSummary(’01/01/2013′,’12/31/2013′)
is efficient to work with as you move through all the various data sources involved. I have also found that Functions are significantly faster than Views for reasons I have yet to explain.
- My code to change the source data dynamically through VBA works as advertised and I am finding that my personal “Best Practices” around this technique work well for me. For example is created a table on a separate spreadsheet like this:
If I need to feed a value chosen by the user on the spreadsheet through clicking a slicer to choose a particular year of data I use a formula. So for example underlying
spFcst_Plan 2013,F above is this formula =”spFcst_Plan “&CurrYear&”,F”.
Then my code to refresh all these connections is just
For Each r In Range(“tblConnections[Connection]“)
Connect r.Value, r.Offset(0, 1).Value, Range(“txtProvider”).Value
Sub Connect(ConnName, SQLText, txtProvider)
With ActiveWorkbook.Connections(ConnName).OLEDBConnection .CommandText = Array(SQLText) .Connection = txtProvider .Refresh End With
Notice that Range “txtProvider” in that code is a cell reference to a cell in the model where I manage the OLEDB provider for all these links. This allows me to easily change the Provider for all these links in one place one time.
Once I have refreshed all the data sources I set the value of a cell that is visible to all users of the model to capture the data and time last refreshed
Range(“LastRefresh”).Value = Now()
Then I save the spreadsheet and announce this to the user\
MsgBox “Refresh Complete … File Has Been Saved With The Current Data”, vbInformation + vbOKOnly, “Data Refresh”
In this way when a user opens this file they may not need to refresh it if, for example, the first user of each day runs this refresh code (driven by a button on the first Worksheet). This way all other users only need to open and use the file (no one really cares about the changes in the data until the end of a day anyway so today’s data means nothing to them).
I know that there are those who think that my posts are too long (??) but the fact is that I could go much, much further into the details of all this but I won’t anymore then. Instead I will give you highlights from now on and you gotta figure the rest out for yourself, I guess.
In the end every solution is a snowflake (they’re ALL different) and all I can do to help promote this technology is give you high-end things I have figured out. I see a lot of folks coming here but very few comments to tell me whether I’m helping or not. That frustrates me a lot I’m afraid :-(.
I will however keep posting here because I believe that this technology is important to us all and to our clients internal and external.
I also believe that most of those posting about PowerPivot and Excel futures are ignoring the capabilities available to client-side Excel users in a way that risk the acceptance of PowerPivot broadly.
Funny final story …. Whenever I mention to ANY of my clients that my analytical solutions in Excel using PowerPivot can be published to a SharePoint “Gallery” and shared easily by Senior management and others throughout the organization they always say “That’s the LAST thing I want to do !!! I want this available only to those who can get REAL business value out of this !! I can’t spend all my time answering stupid questions from people drawing stupid conclusions from all this !! “ I hear this EVERY time I mention the SharePoint Gallery option….. Too good – and makes too much sense. Data distribution has to be controlled as well.