Seems I’m the last one to pitch in… and @BrianJ and @NajahS have already provided great content on the subject. So I just want to share my experience.
Think of the Dataflow feature as “Power Query online”, where you prepare and store your data in Azure Data Lake Storage Gen2 (in the cloud). Basically, you only need Power Query skills to work with dataflows, but there are some differences I want to share with you:
- The interface. Microsoft has already made significant improvements in this regard, but the interface differs from the one you are used to in Power BI Desktop. A good thing to note: if you run into an issue where you can’t find a certain feature, just prepare your query in Power BI Desktop, copy the M code from the Advanced Editor, and paste it into the Advanced Editor in your dataflow.
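For example, a query built in Power BI Desktop might look like this in the Advanced Editor (the URL and column names below are just placeholders, not from the original post); the same M code can be pasted unchanged into a dataflow’s Advanced Editor:

```
let
    // Hypothetical source: a CSV file fetched over the web
    Source = Csv.Document(Web.Contents("https://example.com/sales.csv"), [Delimiter = ","]),
    // Use the first row as column headers
    Promoted = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
    // Set column types explicitly, as you would in Desktop
    Typed = Table.TransformColumnTypes(Promoted, {{"OrderDate", type date}, {"Amount", type number}})
in
    Typed
```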
- Linked entities between dataflows are only available in Premium. This means you can’t reuse entities that have already been ingested, cleansed, and transformed by other dataflows. If that need arises, for example for merge operations, you’ll have to duplicate that query and disable its load.
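As a sketch of that workaround: suppose you duplicate a `Products` query into the same dataflow and uncheck “Enable load” on it (the entity and column names here are hypothetical). You can then merge it into `Sales` without a linked entity:

```
let
    // Left outer join against the duplicated, load-disabled Products query
    Joined = Table.NestedJoin(Sales, {"ProductID"}, Products, {"ProductID"}, "Product", JoinKind.LeftOuter),
    // Pull only the columns you need out of the nested table
    Expanded = Table.ExpandTableColumn(Joined, "Product", {"Category"})
in
    Expanded
```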
- You have the ability to map your data to the Common Data Model, a set of standardized, extensible data schemas that Microsoft and its partners have published. This collection of predefined schemas includes entities, attributes, semantic metadata, and relationships.
- Finally, because you can choose whether to schedule a refresh for a certain dataflow, it is possible to use this to create an archive for legacy data/systems, which you only have to refresh once, manually.
I hope this is helpful, and remember: it’s just Power Query!