Search
Twitter

Entries in Power Query (3)

Saturday
Sep072013

Power Query Data–Should it be Loaded to the Worksheet?

Overview:  This post discusses situations when you may *not* want to load Power Query data to the worksheet, but instead to the data model only.  One reason relates to file size limits in Power BI for Office 365; the other reason is to simplifying & de-duplicate the fields displayed in the Pivot Table Field List.

Power Query Settings

When you bring data into Excel via the Power Query add-in, you have some options on the Query Settings pane. 

Enable download:  Needs to be set to On in order to retrieve or download the data from the source.

Load to worksheet:  Specifies whether the data downloaded will be stored in an Excel table within the worksheet.  Default = Yes.

Load to data model:  Specifies whether the data downloaded will be stored in the data model (i.e., Power Pivot).  Default = No.

     image

There’s a couple of reasons why you might want to flip-flop the above settings.  Specifically:

Load to worksheet:  Change to No.

Load to data model:  Set to Yes.

Following are a couple of reasons why you may want to consider loading to the data model only.

File Size Limits for Browser Display in Power BI for Office 365

The default file size limit in Office 365 for browser display is 10MB.  That limit for browser display has been increased to 250MB if all 3 of the following are true:

  • The workbook is enabled for Power BI (which requires SharePoint Online Enterprise E3 or E4 plan), and
  • The workbook contains a data model (i.e., an embedded PowerPivot model), and
  • Data in the workbook outside of the data model doesn’t exceed 10MB.  <--includes Power Query data residing in an Excel table

This means you want to be very careful when working with Power Query to not necessarily import its data to the workbook, particularly if the volume of data is high.  The two-step process to import only to the data model would be:

  1. Move the slider “Load to worksheet” to Off.  <--Important to click this one 1st before it becomes greyed out when you click #2.
  2. Click the “Load to data model” option.

     image

The reason this distinction makes a big difference in the file size is because the data model uses xVelocity compression.  This is a much higher compression level than standard xlsx data.

Duplicates in the Pivot Table Field List

Loading the data to only the data model and not the accompanying workbook has another benefit.  You won’t see the set of data twice in the pivot table field list.

     image

It’s much cleaner having one clean set of fields to choose from.  You can still use the “Filter & Shape” functionality of Power Query without loading the data to the worksheet.

Also, be aware that there’s some situations where Power Pivot will implicitly convert data to the data model, such as:

  • When Power Map is used within an Excel workbook.  Power Map has a dependency on the data model for its functionality, so if you have not explicitly created a Power Pivot data model, it will do so on the fly when you insert a Power Map worksheet.
  • When you import two or more tables at once.
  • When you insert a Power View worksheet using a Table range of cells.

Finding More Information

Office – File Size Limits for SharePoint Online

 

Wednesday
Jun052013

Integrating Data from Data Explorer into PowerPivot

Overview:  A few thoughts about using Data Explorer with and without the Excel Data Model (i.e., PowerPivot).

My first reaction when I heard about Data Explorer being released in preview mode as an add-in to Excel was “Why isn’t it part of PowerPivot?” Now that I’ve thought about it further, it’s occurred to me that its independence means you can use Data Explorer with or without PowerPivot.

Before we illustrate that further, let’s pull a bit of data into Data Explorer. One nice piece of functionality is that when you hover on the items in the Online Search pane on the right, the left side populates with a preview of the data immediately – this is nice to give you an idea if the data will be useful before you bother downloading it. Using the Online Search button on the Data Explorer ribbon, I’ve done a search for “Charlotte North Carolina” (just for grins) which is shown as follows:

     clip_image002

Note that with this initial preview of Data Explorer, the only website being searched is Wikipedia. Options here will be enhanced over time.  There’s numerous other ways to get external data as well (some of which are just really super cool, but that’s for another time…).

Using Data Explorer to Import Data to an Excel Table

The settings we are concerned with are in the Query Settings pane, shown as follows:

     image

The default option, as shown above, is “Load to worksheet.” This is apparent because its slider is set to On.  Also, by observing the Table Tools > Design ribbon we can see that the data downloaded went to an Excel table named Table2.

Note that this isn’t a Pivot Table – it’s an Excel Table applied to a range of cells.

Using Data Explorer to Import Data to a PowerPivot Model

Conversely, if you wish to import your data into the Excel Data Model (i.e., PowerPivot), then you want to select the “Load to data model” option on the Query Settings pane. After the data has been uploaded to the PowerPivot model, it will change your Query Settings pane as follows:

     clip_image006

If you don’t see the Query Settings Pane, just click anywhere in the table & the pane should pop up. If it doesn’t, check the Table Tools > Query ribbon & select the “Show Query Settings” button.

Tip: if you need to modify the data and/or clean it up, you might want to handle this before you load it into the data model.

Just What is the Data Model Anyway?

If you are familiar with PowerPivot, then you are already familiar with the data model.  I’d imagine the Excel team is going for ease of use & understandability with the data model terminology.

You can import data into the data model by using the PowerPivot add-in.  Alternatively, a range of data can be added to the data model.  The following screen shot shows a small range of cells I have highlighted & then selected “Format as Table” from the Home ribbon:

     image

Next you are presented with a dialog box which confirms whether or not the first row is a header:

     image

Now we have some options on the Table Tools > Design ribbon.  If you choose “Summarize with PivotTable” you can interact with this range of data within the PivotTable field list. 

     image

Creating a PivotTable from a range of data is very cool, but not the reason I’m mentioning all this…note at the bottom of the dialog box there’s an option for you to “Add this data to the Data Model.”  If you choose to add the range of data to the data model, you can then launch your PowerPivot window and see it there represented as a table.

     image

This opens up a lot of possibilities considering you can bring data from a range of cells like this, and/or you can get external data from PowerPivot, and/or you can get external data from Data Explorer –> and then choose whether or not to integrate it with PowerPivot.  That really is a lot of flexibility.

Deciding Where to Store the Data

The way I see it, you could keep the data in an Excel table if a tabular result set (i.e., the output from Data Explorer) is really all you need.  You can even use a PivotTable with that Table range of data, which is very cool.

Here’s some reasons you might want to load the data obtained via Data Explorer into the data model (i.e., PowerPivot):

  • You have an existing PowerPivot data model, or other data sources, you wish to integrate this data with (although there is Merge functionality in Data Explorer that could work too).
  • You want to utilize other PowerPivot functionality such as Key Performance Indicators or Hierarchies.
  • You want to use the map feature of Power View (requires the geographic data to have a certain data category).
  • You want to change the default aggregation type from Sum to something else like Avg or Count.
  • You are more comfortable creating DAX calculations in PowerPivot than using the Data Explorer Formula Language (called “M”).
  • You wish to compress the data significantly.
  • You have a large set of data which will perform better using the xVelocity engine.

There’s times that Excel will implicitly create a data model for you, such as:

  • If you’re importing 2 or more tables at once
  • If you’ve inserted a Power View worksheet using the Table range of cells
  • If you've inserted a Power Map (its functionality requires a data model)

The implicit creation of the data model had me a bit stumped at first, but Excel is just being helpful.  Folks without the PowerPivot add-in enabled won't even know when the implicit conversion to the data model happens.

Happy hunting for data!

Finding More Information

Office – Create a Data Model in Excel

Office – Microsoft “Data Explorer” Preview for Excel Help

 

Sunday
Apr222012

What is Data Explorer? And How Does it Compare to PowerPivot?

March 2013 Update:  The post below was written prior to the release of the Data Explorer Preview which is an Excel Add-In, so some information below will differ somewhat.

Overview: When investigating the Discover > Enrich > Publish capabilities of Data Explorer, my immediate question was: “Hmmm…Sounds awfully similar to PowerPivot. How is Data Explorer different?” This entry shares my initial impressions of Data Explorer.

What is Data Explorer?

clip_image001[4]Data Explorer is a self-service tool to Discover, Enrich (Transform), and Publish datasets. Data Explorer is currently in a SQL Azure Labs “Preview” (i.e., experimental) mode. You can view the datasheet here.

For a business user, the primary purpose of Data Explorer is to improve or eliminate manual processes to consolidate, standardize & cleanse data before it’s consumed by an end-user tool (such as PowerPivot).

For an IT user, the primary benefit of Data Explorer is likely to be the capability to publish the results as an OData feed for consumption by another application.

Discovering Data:

    image_thumb[4]

Enriching Data:

    image_thumb[10]

    image_thumb[28]

Publishing Data:

    image_thumb[26]

    image_thumb[27]

A Few Differences Between Data Explorer and PowerPivot

Discovering Data:

  • Data Explorer has functionality to “suggest” additional data sources for you (from Azure Marketplace or Bing) based upon the data you already have (i.e., if it detects a phone number or an address in your dataset).  While both tools are similar in that they facilitate bringing together disparate data sources, PowerPivot does not attempt to make suggestions.  Data Explorer places more emphasis on finding & bringing in additional related data from the web.
  • Data Explorer can extract data from some interesting data sources not available to PowerPivot, such as Web Content (lists & tables), Office 365 including SharePoint Online, or JSON documents.
  • For relational data, Data Explorer can currently connect only to SQL Server or SQL Azure.  PowerPivot can natively connect to many more relational data sources.
  • Data Explorer cannot connect to Analysis Services from what I can tell (whereas PowerPivot can).

Enriching Data:

  • Data Explorer does have some nice capabilities to allow a non-technical user to perform transforms on the data (nowhere near the capabilities of SSIS though – don’t forget Data Explorer is intended as a self-service tool). Some transformations might be easier to do in the Data Explorer interface than in PowerPivot.  It also provides a way to centralize the transformations, if the dataset will be consumed by multiple downstream applications.

Publishing Data:

  • Publishing data is where Data Explorer really distinguishes itself – really, the main purpose of Data Explorer is to publish data so it can be consumed downstream by another tool.  Conversely, PowerPivot’s main purpose is to support data visualization rather than publish it outside of the Excel (.xlsx) file it resides in.
  • The ability to publish OData feeds via Data Explorer is, perhaps, its biggest benefit.  Publishing data as a feed allows other applications to consume it (including PowerPivot).
  • Currently only the cloud version has publishing capabilities (the Data Explorer team is working on this capability for the desktop add-in).

Other:

  • Data Explorer has its own Formula Language called “M” (um, yeah, we were hoping for another language to learn…I just had to say it).
  • Data Explorer has a cloud version, and a desktop version which works as an add-in to Excel (both 32-bit and 64-bit downloads of the Excel add-in are available). It may be easier to access on-premise company data behind a firewall using the desktop client version of Data Explorer.  (PowerPivot is currently only available as an Excel add-in; no word on when it might be available as part of Office 365.)
  • The cloud version of Data Explorer uses Silverlight.

Finding More Information

MSDN – Learn More about Microsoft Codename “Data Explorer”

MSDN – Data Explorer – Getting Started

Chris Webb’s Blog – Self-Service ETL with Data Explorer

MSDN Magazine Article by Julie Lerman – Let Power Users Create Their Own OData Feeds

Jamie Thomson’s Blog – Thoughts on Data Explorer

Jamie Thomson’s Blog – Data Explorer Feedback Part 1

Data Explorer Team Blog

Oakleaf Systems Blog – Mashup Big Data with Microsoft Codename “Data Explorer” – an Illustrated Tutorial