
This article first appeared on the blog of my employer IT-Logix.

In our article What is Power BI Dataflow, we gave a brief overview of this new option for data preparation and storage in Power BI. On my personal blog I have also published a short assessment of the topic.

Today’s article deals with the question of how Power BI Dataflows can be put to use.

If you want to read more about Power BI, subscribe to our newsletter.


 

Reuse data in multiple reports

You have probably already experienced the situation of needing the same data in several Power BI reports. The data, or rather the respective query, is stored within each Power BI Desktop file and is not available to other reports.

Until now, the Power Query query had to be copied into the various Power BI files. Alternatively, a dataset already published to the Power BI Service could be used as the basis for further reports. With that approach, however, the data cannot be enriched any further. Power BI Dataflows circumvent these limitations.

The time dimension, i.e. the calendar, can serve as an example. Every model requires a time dimension, and the query is either recreated for each Power BI file or existing M code is copied and pasted.
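To illustrate the point, here is a minimal Power Query (M) sketch of such a calendar query. Defined once in a Dataflow, it can feed any number of reports; the date range and the derived columns are illustrative assumptions and would be adapted to your own model:

```
let
    // Illustrative date range; adjust to your data
    StartDate = #date(2015, 1, 1),
    EndDate = #date(2025, 12, 31),
    DayCount = Duration.Days(EndDate - StartDate) + 1,
    // One row per calendar day
    Dates = List.Dates(StartDate, DayCount, #duration(1, 0, 0, 0)),
    AsTable = Table.FromList(Dates, Splitter.SplitByNothing(), {"Date"}),
    Typed = Table.TransformColumnTypes(AsTable, {{"Date", type date}}),
    // Typical calendar attributes
    AddYear = Table.AddColumn(Typed, "Year", each Date.Year([Date]), Int64.Type),
    AddMonth = Table.AddColumn(AddYear, "Month", each Date.Month([Date]), Int64.Type),
    AddMonthName = Table.AddColumn(AddMonth, "MonthName", each Date.MonthName([Date]), type text)
in
    AddMonthName
```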

Using customer data as a further example, this can be illustrated as follows: the same file is reused several times for different reports, and the necessary data transformations must be performed separately in each Power BI file. Data inconsistencies are very likely to occur here.

Multiple use of the same source data

 

By using Power BI Dataflows, the data can now be stored centrally, transformed and then made available within a Power BI workspace. This eliminates the need to rewrite or copy the query each time and reduces development time. A customer dimension, for example, is provided centrally and can be imported via the Dataflows connector into a Power BI file or directly into a new report in the Power BI Service.
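For consumption in Power BI Desktop, the Dataflows connector generates navigation code along the following lines. Take this as a hedged sketch: the GUIDs are placeholders and “Customer” is a hypothetical entity name:

```
let
    // Entry point of the Power BI Dataflows connector
    Source = PowerBI.Dataflows(null),
    // Navigate to workspace, Dataflow and entity;
    // the IDs and the "Customer" entity name are placeholders
    Workspace = Source{[workspaceId = "00000000-0000-0000-0000-000000000000"]}[Data],
    Dataflow = Workspace{[dataflowId = "00000000-0000-0000-0000-000000000000"]}[Data],
    Customer = Dataflow{[entity = "Customer"]}[Data]
in
    Customer
```

In practice you would not write this by hand; the connector builds it for you when you pick the entity in the navigator.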

 

Using the source data via Dataflow in different reports

“Centralised”, uniform and consistent data

In many cases, data is not imported into Power BI from a central storage location but over and over again directly from the source. To a limited extent, this approach may make sense. However, as soon as many reports repeatedly access the same source(s), in the worst case even simultaneously, the implications for the source system and the performance of the BI system are not insignificant. To avoid this, an interim data store is usually used, be it persistent storage in an additional database or a data warehouse. Power BI Dataflows now join these options.

I deliberately do not use the word “data warehouse” in connection with Power BI Dataflows, because for me some components are still missing at the moment, such as incremental loading in the Pro version (see also the explanations in my blog post). Admittedly, this can be achieved with a workaround: the older data is loaded once into a separate table and then excluded from the refresh. In the considerably more expensive Premium version, incremental loading is already available.
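A minimal M sketch of this workaround, assuming a hypothetical frozen entity SalesHistory (loaded once, then excluded from the scheduled refresh), a hypothetical source query SalesSource and a date column OrderDate, could look like this:

```
let
    // Frozen history: loaded once, afterwards excluded from refresh
    History = SalesHistory,
    // Only the recent slice is re-read on every refresh
    Current = Table.SelectRows(SalesSource, each [OrderDate] >= #date(2019, 1, 1)),
    // Stack the frozen history and the refreshed slice into one table
    Combined = Table.Combine({History, Current})
in
    Combined
```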

Even more important is the fact that, without the incremental loading option in the Pro version, the data in the Dataflows is overwritten with every refresh. Data historization is therefore not possible either. In such cases it is particularly advisable to store the required data centrally in a data warehouse. This also ensures that all reports are based on the same data and are created on a uniform and consistent foundation.

 

Individual data update

In the Power BI Service, a dataset is refreshed as a whole, i.e. all of its tables are updated at once. For smaller datasets this is not a big issue. For larger datasets, however, it is desirable to refresh different parts at different times, be it because the refresh takes too long, or because some data sources are not updated every day and a daily refresh therefore makes no sense.

Power BI Dataflows offer an elegant solution here, as the refresh schedule can be set individually for each Dataflow.

 

Faster report development

One of the main criticisms of Business Intelligence projects is that the clean development of BI solutions can sometimes take a very long time, while users want to receive a new report immediately. For a fast and agile approach, the user can create the required report directly on the data stored in the Power BI Dataflows. Later, the report can be handed over to the IT department so that it can be included in the managed report portfolio.

This procedure corresponds to a development from Quadrant IV to Quadrant II in Ronald Damhof’s Data Quadrant Model (see also our report in Netzwoche). In Quadrant IV, reports are created rather opportunistically and unsystematically, whereas in Quadrant II they are created systematically and in a controlled manner.

Data Quadrant Model

Data storage at low cost

The use of Dataflows requires at least a Power BI Pro license, which currently costs about CHF 10 per user per month. There are no additional costs as long as you can do without the Premium functionality (see the restrictions below).

As long as a customer does not operate a data warehouse, Power BI Dataflows can be an interesting option under certain circumstances. However, this must be assessed in detail in each individual case.

 

Unified Datamart

At IT-Logix we are currently working on a model of how data can be combined into a Unified Datamart using Power BI Dataflows and provided in a semantic layer built exclusively from Power BI components. We will publish more on this at a later date.

 

Current restrictions

At the time of writing, Dataflows in the Pro version are scoped to a single workspace: a Dataflow can only be used within the workspace in which it was created. As a result, duplicate datasets may end up existing across different workspaces. Power BI Premium capacity does not have this limitation; there, Dataflows in one workspace can be accessed from another.

In addition, the Power BI Pro version can only refresh data eight times a day. In most cases, however, this is completely sufficient. Premium capacity does not have this limitation either.

And, as already mentioned, the Pro version does not support incremental refresh; Premium capacity is required for this as well.

 

Conclusion and outlook

Power BI Dataflows are still very young and are being developed continuously. I consider their possibilities promising, so I will stay on the topic and follow the ongoing developments in this field closely.

Happy Querying!

If you want to read more about Power BI, subscribe to our newsletter.
 


 
