How Many Versions Of My Data Do I Really Want?

I got into a bit of a disagreement with someone today over the idea of Pre-Aggregation of data before bringing it into PowerPivot in Excel as opposed to bringing in all the source data for all the tables to be analyzed and then doing all the necessary aggregations exclusively in Excel using PowerPivot.

Let me make it perfectly clear that I am not pleased that there has to be a disagreement .. the fact that it looks like it is becoming one is sad.   Unfortunately I feel the need to aggressively promote what I believe is true because there are people who are in my opinion limiting the prospects for PowerPivot and Excel by their promotion of a limited vision for PowerPivot.  I know in their hearts they believe they are right – But I beg to offer an alternative view.  I believe that those who do not share my vision and I are discussing 2 separate uses for PowerPivot and thus there is less disagreement than there would appear to be.

Here’s the nub of this thing … I have a vision of PowerPivot as a “Report Generator”.  I am personally not especially interested in the use of PowerPivot for ad-hoc analysis – the proverbial “ad-hoc slicing and dicing” of data.  Oh sure I do that from time to time but mostly I develop databases for clients and the reports that they need to see on a regular basis for management.  I have found PowerPivot when combined with VBA to be an extraordinary Report Generator – second to none IMHO.  I think people will benefit from knowing how I am using PowerPivot.  Frankly, if I was to keep it to myself (which I could), then I believe that MY business would suffer.  I believe we all would miss a BIG opportunity to get more companies using PowerPivot ever day.

I have a long experience with databases (for what that’s worth) and one of the most important things I learned a long time ago is “You do not want to accumulate multiple copies of your data all over the place!” In a perfect world there should be only one version of the “Raw” truth.  Yes Corporations have moved to a Data Warehouse model for the most part and that’s awesome but to me that’s where the raw data to be used for analysis should stay. This is important for several obvious reasons that I don’t need to go into here.

One of my big problems with PowerPivot, especially the Cloud versions of PowerPivot within PowerPivot SharePoint services and now PowerBI, has always been the fact that it ends up creating it’s own private data warehouse in addition to existing Corporate data warehouses, to say nothing of the data stores of the applications themselves that create and manage the data through business activity.

Now that does NOT mean that I do NOT believe in the value of the functionality of Power BI and Microsoft’s Cloud offering for data storage and use in Browser-based Analytics or  as a source for Client-Side Analysis.   I do, however, believe that I would rather store pre-aggregated and somewhat pre filtered data “up there” if necessary.  In this way I would not have a Data Warehouse but rather a “Reporting Data Store for PowerPivot Reports” all of which would be based on a moment in time – but would only include the smallest amount of data necessary.

I have also found that by using my technique with VBA I can develop powerful Client-Server solution against SQL Server (or other database servers) without going through SharePoint or PowerBI.  But by using Pre-Aggregation and the power of the database server I can deliver solutions that refresh in seconds and that are small enough files that they can be distributed in “old-fashioned” ways if so desired – or simply published to PDF files and then thrown away.  I have trouble seeing what is wrong with that.

When I use the techniques I have been promoting here, I am able to source data from as close to the source of the data as possible, in a  timely fashion through the Client Excel, thus removing the need for all the overhead and complexity of publishing and managing my reports in the Cloud and not recreating yet another data warehouse with it’s attendant issues around timeliness, maintenance, access and management in general.  I can create a stand-alone solution that I manage and that I can make as secure and reliable as I can.  This way MY users will get exactly what they want when they want it and can rely on that…..

I do believe also that there are a LOT of developers of Corporate reports that this is all they really want to do.

I must also add that this in no way means that I think any less of the capabilities of PowerPivot because I am promoting an alternative way of using it.  Quite the contrary – it is because I see what PowerPivot is capable of that I am promoting this as yet another way to use the product, to get better use out of Excel and to give our customers (internal or external) what they REALLY want … not what I think is good for them.

Why this has to create so much negativity from certain quarters disappoints me no end. And this from people I call friends or at least friendly fellow members of a community that wants the same thing in the end.  We’re all trying to do the same thing … provide better solutions for our customers to get better results with their efforts – and to get paid for doing so of course :-) … but paid for good results!  I believe that PowerPivot is THE tool to allow us to do that.

Dick

About Biggus Dickus

Dick is a consultant in London, ON Canada who specializes in Microsoft Excel and Microsoft Office Development.
This entry was posted in Excel, Excel 2013, OLAP, PowerPivot, Spreadsheets, VBA. Bookmark the permalink.

13 Responses to How Many Versions Of My Data Do I Really Want?

  1. timrodman says:

    Dick,

    Could you elaborate on the pros/cons of using the VBA approach to aggregation versus using Power Query to aggregate data before bringing it into Power Pivot? I have used Power Query to do aggregation, but I haven’t attempted it yet with VBA. The idea of using VBA to interact with Power Pivot is still very new to me and I’m trying to wrap my head around it.

    Tim

    • Biggus Dickus says:

      Actually the key is not VBA but rather it is the use of SQL statements in Excel Data Connections that make this work. And if you want or are able to go deeper you can create Functions or Views in SQL Server (or Oracle or MYSQL or Access) you can create them there and call them from Excel using VBA (or refreshing them manually I guess)…

      The best of my posts to read is

      http://dmoffat.wordpress.com/wp-admin/post.php?post=802&action=edit

      Let me know if you have any more questions on this.

      Dick

      • timrodman says:

        Thanks for the working link which you posted further down in the comment thread:
        http://dmoffat.wordpress.com/2013/08/14/building-a-flexible-and-efficient-client-side-powerpivot-solution-in-excel-2013-2/

        I actually read that post last week and I think I even put a comment on it although it’s not showing up now (maybe it’s in your pending comments queue?). I liked the post very much and that is what prompted me to add your blog to my RSS feed over the weekend. I also posted a link to the post in the discussion thread that got you fired up and prompted you to write this current post :)

        I like the VBA method because I agree that’s it’s better to only bring in the data that you need and I’m wondering, in your opinion, is there any benefit to using VBA over Power Query? I know that you can now feed your Power Pivot model with Power Query by checking the “Load to Data Model” box in Power Query. When you do this, you can refresh the Power Pivot model by using the standard “Data -> Connections -> Refresh All” button without having to write any VBA code (although I think you would need VBA to associate the refresh with a button on the spreadsheet as you propose in your post). Also, I believe that Power Query can check the parameters that a user has selected before running the query which I think would accomplish the same thing that you’re doing with VBA.

        So, as far as I can tell, Power Query is just as powerful as VBA, but much more user-friendly. I’m just wondering if you can think of something additional that VBA has to offer.

  2. Dick,
    If you are an end user you probably don’t have access to the databse itself, i mean you can’t to the aggregating on the database level. In such cases I think you have to do aggregations in Excel.
    But of course if we are talking of professional solutions made by IT pros then you should use all the tools available. Stuff like creating views and functions in the database layer and then do the rest in Excel. This ist exactly me vision, to have a powerful report generator in Excel alongside with all the database goodies.

    • Dick Moffat says:

      Andreas :

      Good points… It seems that there is a conscious decision to ignore a professional approachvto using PowerPivot in favour of an almost amateur one. Unfortunately developing serious Reporting Solutions is a professional activity in my opinion.

      Btw … If you do not have rights to create your own objects on the server you can still do aggregations in your Connection string in Excel itself and even filter the results on the fly through VBA as I have outlined here… If you get a copy of the SQL client tools you can create views there and instead of saving them in the database you copy the SQL back into your Excel solution… Do it all the time ;-).

      An alternative would be to create passthrough queries on the fly in an Access middle-ware file that would then talk to the server on your behalf.. Kinda complex but doable… There might be an issue with using Access passsthroughs though – I will have a look at and get back here on but it SHOULD work :-).

      Regardless, not having rights on the server beyond readonly access to the tables can be accomodated with some effort…

      Dick Moffat Sent from my Cell Phone 1-519-200-7133

      >

  3. I assume you’re being intentionally vague about where the negativity comes from (or I’m just missing something), but I am interested in the other side of this argument. What is the case to be made for aggregating in Excel?

    • Biggus Dickus says:

      Hey “Other” Dick … Hope all is well…. I’m celebrating a new Grandson today :-)

      The person on the other side is at the self-proclaimed tippity-top of the PowerPivot Hierarchy (but he doesn’t like Hierarchies either btw).

      He said:

      “Also, I’d like to say that your opinions on Power Pivot are a bit “off.” It is TOTALLY, 100% practical to bring millions of rows of data into Power Pivot and do the aggregations there.
      In fact it’s superior to ANY other way of doing things. As long as the data fits into the file, that is.
      If not, ok, do some pre-aggregation in the database and then import. Great!
      And Office 365, or SharePoint, is in no way required. It is helpful, but you will still be FAR ahead of the game on the desktop alone.
      But the old way of doing things is no longer the right “default.” Your experiences are valid, they are just a bit out of date.”

      in an email yesterday.

      So apparently the “correct” way to use PowerPivot is to bring in raw tables of data (no matter how large apparently) but pre-aggregation is in fact “just a bit out of date.”. ??? “It is TOTALLY, 100% practical to bring millions of rows of data into Power Pivot and do the aggregations there. ” I just TOTALLY disagree with that if you have any plan to use the Client- side version of PowerPivot … refreshes would just be painful.

      Even if you end up using PowerBI, users should know the can of worms that opens up for them. Moving to PowerBI means you are moving your solutions into the realm of Corporate IT and Enterprise technology.
      1. Do you really think that IT departments are going to be super excited about a bunch of enduser Excel spreadsheets being published up to their servers, using up bandwidth and ending up leaving a majority of dead solutions up there that never get used after a short time?
      2. You will have to convince the “gate-keepers” of this realm of the wisdom of what you’re doing and you will have to work within their rules – which are not necessarily designed to make things easy for publishing non-Enterprise solutions.
      3. And who REALLY thinks that Corporations are going to accept an Excel-based solution as the technology for their Enterprise BI solutions? Not me :-)

      So if that is the approach we are to take then PowerPivot must ONLY be used with raw data table inside PowerBI in Office365 and Corporations we will actually be limiting the opportunity for the technology and I wonder if in the end it will just kill it. The fact that success would involve Excel developers synchronizing their efforts with IT has never worked and I believe it never will.. No matter how much Microsoft would like it to.

      I am also not sure what “but you will still be FAR ahead of the game on the desktop alone.” means.. I find that on the desktop using VBA and managing my sources on the fly I am FAR ahead of the game … the game being giving clients timely, flexible reliable reports to help them run their business. I simply cannot agree with that point – quite the contrary.

      I am not pleased to be told that because I am able to use multiple technologies together to create the most efficient result is somehow “old-fashioned” and to say it is insulting is putting it mildly. I do not believe it is in Microsoft’s interests either in the long run IF they are interested in PowerPivot as anything more than just an excuse to push more content to Office365 – which would be tragic to me.

      My argument is to use the technologies specifically designed to manage large amounts of data (not all of which are from Microsoft BTW) to deliver an optimal data set to my PowerPivot solution in order to produce efficient, productive report solutions for clients EVERYWHERE…

      But I guess I’m just an old man so what do I know :-) ..

      • Hmmm, I’m pretty sure I replied to this, but there’s nothing here. It wasn’t much of a reply, but I guess I get a second chance at it. I’m not a PowerPivot guy yet. I’m sure I will be someday, but it’s a technology that’s in evangelical stage and I’m currently preaching keyboard shortcuts and class modules and I don’t really have room for another religion.

        But I appreciate you taking the time to tell the other side. I want to keep up on the technology because someday Mike Alexander will force me to write a chapter on it and I’ll have to learn it.

  4. dick@plogic.ca says:

    Sorry … try this:

    http://dmoffat.wordpress.com/2013/08/14/building-a-flexible-and-efficient-client-side-powerpivot-solution-in-excel-2013-2/

    Dick

    Dick Moffat Personal Logic Associates Inc. P – 519-474-4224 C – 519-200-7133

    On Tuesday 14/01/2014 at 11:10 pm, “Dick Moffat’s Spreadsheet Blog” wrote: > New comment on your post “How Many Versions Of My Data Do I Really > Want?” > Author : timrodman (IP: 76.188.65.184 , > cpe-76-188-65-184.neo.res.rr.com) > E-mail : tim@rodman.us > URL : http://www.linkedin.com/in/timrodman > Whois : http://whois.arin.net/rest/ip/76.188.65.184

  5. Dick Moffat says:

    Jason Wiser wrote to me the following which somehow didn’t come through as a comment:

    I’m going to fall in the “there’s a time and place for everything” camp. I’m in the early stages of learning PowerPivot, and like what I see. But there is a movement going on with younger professionals – of which I would still consider myself :). Too often, the answer is “pull everything, and figure out the rest later.” With computers being as powerful as they are now, that is not as much of a problem as it used to be. But that doesn’t mean that is the right answer.

    Likewise, there are times where you really do need all the information.

    I think at its most basic it comes down to good design principles. Pull what you need, report what you need and move on.

    My reply is:

    Hey Jason ….

    Of course there is a time for detail (which is a good time to use my VBA tehcniques to pre-FILTER your data ifvuou not going pre-aggregate it) as wellbas for summary. My rule is you should be as efficient with the tools you have as possible.

    Just sucking im everything sounds great but I think it’s just not being smart … In addition to the refresh issue there is the question of complexity…

    I use queries referring to subqueries referring to subqueries, etc. to get at the data I really need in SQL or any DB tool and then bring the result ONLY into PP where I join it with other data from other sources in a way that can be managed in DAX and Pivot tables..

    I cannot see how you can do all of that in the PowerPivot UI ? Having it ALL laid out in the Powerpivot UI is just not practical…. The fact that people don’t see the value of that just strengthens my argument… Is there no value to experience anymore?

    To say this debate has depressed me is an understatement. It’s devasting … Have we decided that experience means nothing ?… We’re screwed..

    Dick Moffat Sent from my Cell Phone 1-519-200-7133

    >

  6. Biggus Dickus says:

    But apparently there is one “official” opinion in the Powerpivot world and mine are out of date ….

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s