Sometimes the performance of Power Query reading your source data is restricting your dashboard development.It takes too long to reload all your data.I Recently had a case where my source was a set of log files on an Azure Blob. While developing the dashboard I often times had a very bad internet connection. This meant that any change in cleansing and preparing the data set for my dashboard meant waiting for minutes. Sometimes even not comming to a refresh. Very annoying.
Wouldn’t it be nice when you can develop using a reduced dataset; while in production (PowerBI.com) use a full set of data.
The basis of my solution is using parameters.
STEP 1: Add dataset
For an example I read a folder with several very simple CSV files. Imagine this will take more than 15 minutes to load ;).
Combine and edit
This leads to combined dataset for reporting:
STEP 2: Add Parameter
Next step is to add a parameter with the name “IsProduction”. This variable determines wether we are running a production report; or busy developing.
STEP 3: Use parameter as a filter in PowerQuery
We can reduce the amount of data being read inside the query, In this example we reduce the amount of files to one specific file.
Insert a new filter step:
Change the default M query to:
= if IsProduction then Source else Table.SelectRows(Source, each ([Name] = "data 201901.csv"))
STEP 4: Publish and set parameter to production
Publish the dashboard to PowerBI.com and set the IsProduction parameter on the dataset to true.
The big bad in this solution; you need to set this parameter every time you republish 🙁
The good is that this is a design pattern that can be applied to all kinds of datasources.
Sources can be found in: https://github.com/schalkje/PowerBIBlog/tree/master/DataSourceFiltering