Data Modelling Workout 01 - Star Schema

Difficulty Rating: 2 out of 5

Data Modelling Workout 01 - Star Schema

The starting place to many successful Power BI data solutions is the star schema data model.

Goals
• create a relational data model from a flat-file data source

Tasks
Produce a Power BI file that:
• creates fact and dimension tables from the flat-file sales data provided
• creates a [Dates] table from the flat-file dates data provided
• creates a [Key Measures] table to house the DAX measures created from the DAX data provided
• links the various tables into a full star schema using only 1-to-many relationships and a waterfall layout
• includes slicers, a table, and cards to showcase the star schema data model “in action” (i.e., while making slicer selections)

Notes
• use only the DAX code provided; do not modify nor create additional measures or calculated columns
• (data source: a 10,000 row subset of the Microsoft [ContosoRetailDW] sample database)

Submission
Using the only the data and DAX files provided, create your solution, and reply to this post. Upload the following:
• a screenshot of the queries pane of the [Power Query Editor] with all groups expanded
• a screenshot of the star schema data model as displayed in the [Model View]
• an animated GIF (or, alternatively, a set of screenshots) of your Power BI file “in action” (i.e., while making slicer selections)
• your PBIX file

Please blur your screenshots and GIFs or place them in a hidden section.

Period
This workout will be released on Monday April 10, 2023, and the author’s solution will be posted on Sunday April 16, 2023.

Greg
Data Modelling Workout 01 - Data.xlsx (641.0 KB)
Data Modelling Workout 01 - DAX Measures.txt (1.4 KB)

10 Likes

Excited about this workout. Great starting problem

3 Likes
Summary
1 Like

Hello everyone, here is my submission thank you!

Summary
1 Like

Hi,
below is my attempt.

My Submission

ezgif.com-resize (1)
W01_1

1 Like

Hi All,
Hope my attempt is good

![image|690x3
Data Modeling
34](upload://6LzxmhNviuogStL8inrzMfltLThis text will be blurred9w.png)


Data Modeling - Mustafa.pbix (202.4 KB)This text will be blurred

1 Like

hi,
happy to participate to this important topic workout.
Please find attached my solution.
ciao !
Roberto Ravani - data modeling workout 01.pbix (193.8 KB)

1 Like

Here’s the notes I kept during the development of my solution to this workout.

Pre-Modelling Activities - General:

  • • create new, empty PBIX
  • • use File\Options\Options and settings\Global\Data Load\Time intelligence to ensure Auto Date/time for new files is disabled
  • • use File\Options\Options and settings\Current File\Data Load\Time intelligence to ensure Auto Date/time is disabled
  • • use File\Options\Options and settings\Current File\Data Load\Relationships to ensure Autodetect new relationships after data is loaded is disabled
  • • use File\Options\Options and settings\Current File\Data Load\Relationships to ensure Import relationships from data sources on first load is disabled

Pre-Modelling Activities - Data Loading:

  • • load raw sales data from [Sales] tab of the Data Excel file
  • • load raw dates data from [Dates] tab of the Data Excel file

Modelling Activities - Power Query:

  • • rename sales query to [RAW Sales] and un-select Enable load
  • • rename dates query to [RAW Dates] and un-select Enable load
  • • create [Staging Queries] group and move [RAW Sales] and [RAW Dates] queries into this group
  • • create [Dates] query by referencing the [RAW Dates] query
  • • create [Data Model] group and move [Dates] query into this group
  • • create [Sales] query by referencing the [RAW Sales] query
    • • right-click on [Country] and [Continent] columns, select Remove Columns
    • • right-click on [Colour], [Product Subcategory], and [Product Category] columns, select Remove Columns
  • • create [Channels] query by referencing the [RAW Sales] query; move into [Data Model] group
    • • right-click on [Channel] column, select Remove Other Columns, select Remove Duplicates
  • • create [Products] query by referencing the [RAW Sales] query; move into [Data Model] group
    • • select [Colour], [Product], [Product Subcategory], and [Product Category] columns, right-click on one of them, select Remove Other Columns
    • • select all columns, right-click and select Remove Duplicates
  • • create [Stores] query by referencing the [RAW Sales] query; move into [Data Model] group
    • • select [Store], [Country], and [Continent] columns, right-click on one of them, select Remove Other Columns
    • • select all columns, right-click and select Remove Duplicates

Modelling Activities - Model View:

  • • move [Dates], [Channels], [Products], and [Stores] tables to top row; these are my Dimension tables
    • • (not at all necessary, but I like to always place my [Dates] table in the top-left corner so its in the same place in every model; helps me to ensure I don’t miss it)
  • • move [Sales] table to row below dimensions; this is my Fact table
  • • if there are any relationships the were “automatically” created by Power BI, right-click and delete them
  • • use the Manage Relationships dialog and select New; in the Create Relationship dialog, select the source table and column and the target table and column, creating relationship for:
    • • Dates[Date] → Sales[Order Date]
    • • Channels[Channel] → Sales[Channel]
    • • Products[Product} → Sales[Product]
    • • Stores[Store] → Sales[Store]
  • • for all relationships, ensure the Cardinality is set to Many to one (asterisk:1) and the Cross filter direction is set to Single; adjust if necessary

Post-Modelling Activities - Measures:

  • • create a [Key Measures] table using Enter data icon on the toolbar
  • • create a new measure for each entry in the DAX Measures text file

Post-Modelling Activities - Visualizations:

  • • create slicers for Stores[Country], Products[Colour] and Dates[Date] columns
  • • create a table for the Sales[Order Date] column and the [Total Sales] and [Total Costs] measures
    • • (not at all necessary, but I like to always set the format of date columns (in this case, Dates[Date] and Sales[Order Date]) to dd-mmm-yyyy to avoid confusion with dates that are just numeric)
  • • create a multi-row card for dynamic measures
  • • create a multi-row card for static measures

Congratulations to all who participated, and good luck.
Greg
Data Modelling Workout 01 - Star Schema.pbix (239.8 KB)

5 Likes

Hi @Greg,

Sorry for the latest of this. Great workout!!!

Thanks
Keith

1 Like

This is my first workout in eDNA hope I followed it right.

Queries Pane

Frym Louie Maderas Data Modeling Workout 01 - Star Schema.pbix (196.2 KB)

My submissions. thanks. i have a question though. I started to create a colour and a country tables do reduce even more the model but it creates a snowflake. is that good or bad?

Summary

Thanks

1 Like

Here is my submission

@Jerome_Pomelos ,

Thanks for participating! Good question about further normalization of dimension tables to snowflake schema. Generally speaking, in Power BI it’s best to stop at the star schema and not continue to higher order normalization.

Disadvantages of snowflake schema relative to start schema:

:small_orange_diamond: DAX usually becomes more complicated due to more complex filter context
:small_orange_diamond: Slicers become much more difficult to synch properly
:small_orange_diamond: Performance may decrease due to have to traverse longer paths
:small_orange_diamond: Cannot create heirarchies across multiple dimension tables

This question was actually the focus of Data Modeling Workout 003, in which you had to denormalize a snowflake schema INTO a star schema.

I hope that’s helpful.

  • Brian

Hi Greg,

Nice and easy workout. Thanks for setting this up!
One could argue about a Geography table or combining the Store/Channel in one. IMHO even on simple models there are always things to argue about.
One thing I don’t understand is why in your solution file you didn’t clean up the Sales fact table.
In Power Query I added the index keys to my fact Sales table and then removed all the columns I modelled out to the dim tables. I was surprised you kept them all in. Slip of the pen or on purpose?

KR
Jonathan

Hi @Johnny. It was on purpose … the intent was to take “baby steps” in data modelling issues … the clean-up is covered in subsequent data modelling workouts (we’re on #5 now …).
Greg

1 Like

OK, fair, if it’s meant as a step-by-step teaching workout, I better understand the proposed solution.
I am a bit late to the party, but will try to catch up and join in!

Hello everyone,
Here is my submission for Data modeling workout-001.
@Greg @BrianJ really enjoyed this workout and it will be really helpful in real time as well.




Data Modeling workout 001.pbix (266.6 KB)
Thanks,
Pankaj

Power query

view

ZaraArshad_workout01.pbix (235.4 KB)

Slight differences in my model:

  1. I kept Channel column in Sales Table
  2. I moved Unit_Price and Unit_Cost to Products Table

Data Modeling Workout 01 Completed.pbix (198.2 KB)



QueryView



Workout 1.pbix (216.2 KB)


[details="Summary"]
[spoiler]type or paste code here[/spoiler]
[/details]