Power BI Accelerator Week #2 Is Live!

All,

Terrific discussions going on around this week’s Accelerator problem.

I’ve decided to push the solution session for Week #2 back a week, to 5pm EDT on Wednesday, July 21st (originally scheduled for the same time on the 14th).

This is partly to give some extra time to people who are also participating in Data Challenge 14, which is running at the same time, and partly to maximize the number of people who have the opportunity to participate in this week’s Accelerator.

I would say this is probably the most important week of the series, because getting your data model correct is fundamental to everything else that happens afterwards. A good data model will make your DAX and visualization much, much easier, while a bad data model will sink you right from the outset, even if your DAX and visualization skills are top-flight.

I will provide the link for registration for the solution session on Monday.

Keep up the great work – I’m loving what I’m seeing so far…

– Brian

1 Like

Ok, here is my submission. It’s really my first for any challenge or weekly problem set. I think I’m on a good track, but I’m sure there are issues in the data set that I didn’t consider.

I chose not to break out the locations, since customers have many shipping locations and I didn’t see a good way to create a proper relationship. Best to leave the shipping location at the order level, in my view.

2 Likes

@mhough ,

Welcome to the forum – great to have you here, and awesome that your first post is participation in this initiative!

You absolutely are on a good track here – a really well-done entry, and I very much like what you did with your measures.

You have correctly identified that shipping location is the problem child in this group. In the solution session, we will talk about a number of different approaches for handling that dimension, and the pros, cons and trade-offs associated with those alternatives.

Thanks for participating!

– Brian

1 Like

Here goes. This was a great problem to solve.

Noticed the discussion regarding shipping locations. I used the postcode as the key identifier as there was no street address and a postcode usually covers a specific area. Is this logic incorrect?

There are a number of orders where the ship date precedes the order date. Shipping on consignment?

EDNA Power BI Accelerator – Week 2 Problem Final.pbix (13.6 MB)

2 Likes

Hi there, sorry I’m late uploading the file.

I did a little something different with the date table by automating the process. If the fact file ever changes, the date table will automatically find the first and last transaction dates in the data, then extend itself to the first day of that calendar year and the last calendar day of the following year (see the sketch below).
Example:
  • First date in the data is Mar 1, 2012, so the Date Table starts on Jan 1, 2012
  • Last date in the data is Oct 28, 2014, so the Date Table ends on December 31, 2015
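For reference, here is a minimal Power Query M sketch of that dynamic start/end logic, assuming a fact query named "Sales" with an [Order Date] column (the actual query and column names in the submission may differ):

    let
        FirstTxnDate = List.Min(Sales[Order Date]),
        LastTxnDate  = List.Max(Sales[Order Date]),
        // Snap the start back to Jan 1 of the first transaction's year
        StartDate    = #date(Date.Year(FirstTxnDate), 1, 1),
        // Extend the end out to Dec 31 of the year after the last transaction
        EndDate      = #date(Date.Year(LastTxnDate) + 1, 12, 31),
        DayCount     = Duration.Days(EndDate - StartDate) + 1,
        DateList     = List.Dates(StartDate, DayCount, #duration(1, 0, 0, 0)),
        DateTable    = Table.FromList(DateList, Splitter.SplitByNothing(), {"Date"}, null, ExtraValues.Error),
        Typed        = Table.TransformColumnTypes(DateTable, {{"Date", type date}})
    in
        Typed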

Any suggestions will be appreciated.

I’m looking forward to the July 21 live solution session.
Thanks,
Keith

2 Likes

Hi eDNA Members and Staff,

First of all, thank you for this opportunity to develop our skills. Please find below my submission for this 2nd week of the PBI Accelerator.

Here are some comments about the work I have done.
When creating the dimension tables, especially the “Products”, “Customers” and “Locations” tables, I first investigated the data to check whether the key values were unique or duplicated.

  • Customers table
    The pair (Customer ID, Customer Name) did not reveal any duplicate values.

  • Locations table
    One Postal Code, “92024”, had two different cities, “Encinitas” and “San Diego”. After checking on “Postal Codes United States (geonames.org)”, I replaced “San Diego” with “Encinitas” where Postal Code = “92024”.
    Some Postal Codes were not matching the correct cities, so I made adjustments, but I did not go through every Postal Code.

  • Products table
    As mentioned before, 32 Product IDs have two Product Names. For each of these, I differentiated the two Product Names sharing the same Product ID by adding a suffix to the Product ID (a sketch of this kind of check follows this list).
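To illustrate, here is a minimal Power Query M sketch of the kind of duplicate check and conditional replacement described above. The query and column names ("Client Flat File", "Product ID", "Product Name", "Postal Code", "City") follow the post, but the exact steps in the submission may differ:

    let
        Source       = #"Client Flat File",
        // Product IDs that map to more than one distinct Product Name
        GroupedIDs   = Table.Group(Source, {"Product ID"},
                           {{"NameCount", each List.Count(List.Distinct([Product Name])), Int64.Type}}),
        DuplicateIDs = Table.SelectRows(GroupedIDs, each [NameCount] > 1),
        // Conditional replacement: force City to "Encinitas" where Postal Code = "92024"
        FixedCities  = Table.ReplaceValue(Source,
                           each [City],
                           each if [Postal Code] = "92024" then "Encinitas" else [City],
                           Replacer.ReplaceValue, {"City"})
    in
        [Duplicates = DuplicateIDs, Cleaned = FixedCities]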

I made the adjustments in the “Client Flat File” query, and I left in the queries I used to check for duplicates.

For the “Date Table” I used Melissa de Korte’s script from the eDNA “Extended Date Table (Power Query M function)” post.

I also applied these steps:

  • Marking the Dates table as a Date Table
  • Setting Summarize By to None for all columns
  • Categorizing the columns of the Locations table
  • Setting Sort By Column where needed
  • Hiding the columns used for sorting other columns
  • Creating some basic measures (a small illustration follows this list).
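Purely as an illustration of the kind of basic measure mentioned in the last bullet (the "Sales" table and its columns are assumed names, not necessarily the ones in the submission):

    // Hypothetical fact table "Sales" with a [Sales] amount and an [Order ID] column
    Total Sales = SUM ( Sales[Sales] )
    Total Orders = DISTINCTCOUNT ( Sales[Order ID] )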

EDNA Power BI Accelerator – Week 2 Problem Final - MehdiH.pbix (10.6 MB)

Any comments will be appreciated.

Best Regards
Mehdi HAMMADI

3 Likes

Hello @BrianJ

Please find attached my presentation. I look forward to your comments. Also, I couldn’t register for the Live Solution session via the link you sent, as it appears to be broken. Can you please re-send the registration link?

  1. I did something a bit different, which was to rename all the columns after I had imported the source CSV file. I realised that all the column names start with the text “Superstore Flat File”, so I had to find a way to rename all the columns in one go (see the M sketch after this list). I did this by reading about how Matt Allington solved a similar issue at
    https://exceleratorbi.com.au/renaming-all-the-columns-in-a-table-at-once-using-power-query-m-language/

  2. I created six dimension tables and one fact table from the given source file. The DIMs are as follows:
    Customers
    Segment
    Ship Mode
    Address
    Dates
    Products
    and had ClientTransactionData as the Fact Table

  3. On the Date table, I used the CALENDARAUTO function to start with and created some relevant columns in the table, such as YEAR, QUARTER, WEEKDAY and MONTH (see the DAX sketch below).
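For reference, here is a minimal Power Query M sketch of the rename-all-columns idea from point 1, assuming the imported CSV query is named "Superstore Flat File" (the actual step and query names in the submission may differ):

    let
        Source  = #"Superstore Flat File",
        // Strip the "Superstore Flat File" prefix from every column name in one step
        Renamed = Table.TransformColumnNames(Source,
                      each Text.Trim(Text.Replace(_, "Superstore Flat File", "")))
    in
        Renamed

And a hedged DAX sketch of the CALENDARAUTO-based date table from point 3; the added columns below are illustrative rather than the exact ones used:

    Dates =
    ADDCOLUMNS (
        CALENDARAUTO (),
        "Year", YEAR ( [Date] ),
        "Quarter", QUARTER ( [Date] ),
        "Month", FORMAT ( [Date], "MMM" ),
        "Weekday", FORMAT ( [Date], "ddd" )
    )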


Cheers
Eze

2 Likes

This is the error I get when I click on the registration link

@Eze ,

That’s really strange – I just tried the link myself and it’s working fine for me. I’m looping in the @EnterpriseDNA team to check this out, and I would also ask you to please send a brief message and your screenshot to rebecca.walker@enterprisedna.co – she will make sure you get registered.

Thanks for letting me know. Is anyone else having the same problem?

– Brian

1 Like

Hi Brian

A bit of a rush to submit something before deadline.

How do I register for the live solution session?

Jonathan

1 Like

@stevens ,

Beautifully done. Great-looking model and effective measures, nicely presented.

Thanks very much for participating!

– Brian

@Keith ,

Very nice job! Kudos for stretching yourself and incorporating the dynamic start and end dates. Certainly every situation is different, but in most of the data I work with the start date of the fact table doesn’t change over time, so you can buy some speed by hardwiring the first of the year corresponding to the earliest fact table entry into the date table and just making the end date dynamic.
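A minimal M sketch of that hybrid approach, with the hardwired year taken from the example above and assuming a fact query named "Sales" with an [Order Date] column (both names are assumptions):

    let
        // Hardwired start: Jan 1 of the year of the earliest known fact record (2012 in the example above)
        StartDate = #date(2012, 1, 1),
        // Dynamic end: Dec 31 of the year of the latest Order Date in the fact table
        EndDate   = #date(Date.Year(List.Max(Sales[Order Date])), 12, 31)
    in
        {StartDate, EndDate}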

The snowflake structure you used for your customer dimension is by no means wrong, but does have some pros and cons that we will discuss in the live session.

Thanks very much for participating!

– Brian

1 Like

@MehdiH ,

Exceptional job on this! Really impressed by the thorough job you’ve done in the data review prior to the modeling, and the careful thought given to how to resolve the problems identified.

Also, great job following a wide range of data modeling best practices.

Thanks very much for participating!

– Brian
1 Like

@Eze ,

Nice work! Again, I love the way people are challenging themselves relative to their own skill sets, in your case incorporating the custom M code for renaming columns.

My one strong suggestion here would be to use the EDNA Extended Date Table, rather than the CALENDARAUTO generated date table. When we get to time intelligence in a few weeks, I think you will find that the former offers some incredible power (particularly relative to the offset fields) that will greatly simplify your DAX.
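As an illustration of the kind of thing those offset fields enable once the Extended Date Table is in place (a generic example, not part of the Accelerator solution; the [Total Sales] measure and the Dates[MonthOffset] column name are assumptions):

    Sales Previous Month =
    CALCULATE (
        [Total Sales],
        FILTER ( ALL ( Dates ), Dates[MonthOffset] = MAX ( Dates[MonthOffset] ) - 1 )
    )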

Were you able to get your registration straightened out? If not, let me know and I will make sure we get you into the session.

Thanks for participating!

– Brian

@KieftyKids ,

Thanks for participating! Nice job on your measures.

In terms of your data model, what you’ve done isn’t technically wrong, but I think it’s far from optimal. By normalizing your dimension tables to that extreme, I think you’ve made your model far more complex than it needs to be, and also created some problems in terms of syncing slicers, while gaining only marginal file size efficiency. We will talk a lot about this sort of trade-off at the live session tomorrow.

Here’s the link for registration.

See you there!

– Brian

Hello @BrianJ

Thanks for the feedback. I have now been able to register for the session using the link you sent in your comment on @KieftyKids’ nice work :smiley:

1 Like

Thank you @BrianJ for the feedback. I look forward to attending the solution session.

All,

A HUGE thanks to all who participated this week. We’re closing in on 700 forum views for this week’s problem, with double-digit submissions and some great discussions on the forum thread.

Thanks also to those who joined us today for the live solution event. @sam.mckay and I really enjoyed it, and hopefully you found it beneficial. For those who weren’t able to attend, we’ve got the recording of today’s event, a copy of the PowerPoint slides, the original problem with the data file, and both @sam.mckay 's and my PBIX solution files up on the portal now at this location under Power BI Accelerator:

We covered a lot of ground today. Some of the topics we addressed were:

  • Why do we do data modeling at all?

  • The Four Horsemen of the Data Modeling Apocalypse (and how to defeat them)

  • Identifying fact and dimension tables

  • What are star schemas and snowflake schemas, and what are the pros and cons of each?

  • Why do star schemas work so well in Power BI?

  • The Art and Science components of data modeling

  • And a detailed walkthrough of the specific Week #2 solution.

We’ll be back next Wednesday 7/28 with the Week #3 problem, which will be focused on exploring how CALCULATE can be used to alter filter context to produce powerful analyses.
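As a generic teaser of what that looks like (not the Week #3 problem itself; the base measure, table, and segment value below are illustrative assumptions):

    Consumer Sales =
    CALCULATE (
        [Total Sales],                      // assumed base measure
        Customers[Segment] = "Consumer"     // overrides the filter context on Segment
    )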

See you then!

– Brian
2 Likes

Really enjoyed the session last night – it was great to be on a live session, and very useful to have interaction from fellow EDNA’ers asking questions, etc.

2 Likes

@DavieJoe ,

Thanks very much for the feedback. I really enjoyed it as well, and there were some great questions from the participants. @sam.mckay and I are still experimenting with the best platform for these events, but if folks feel most comfortable asking questions via chat, we may switch back to Re-Stream, which allows Sam to flash the questions asked at the bottom of the screen.

– Brian
3 Likes