Help with Data Modeling for dynamic hierarchical data

Hi All,

I need help with creation of data model to help me create representation for a dynamical hierarchical data that we have.

Essentially, we have lot of free form reports via an application. Data has been gleaned from the form but the next challenge is to understand how to represent the data.

Here is how my sample form looks like -

Am also attaching the sample Data for two months. The actual data runs into about 15-20 main categories followed by multiple sub categories and further divisions in them. There are various other forms with different hierarchical structure. For now I am tackling one form at a time.

Form data.csv (1.5 KB)

I have worked with Power BI before and have decent exposure to DAX and have built medium complexity dashboards. Essentially, what I am looking for is to be able to create say a time-series report of Main Categories - Books, Electronics etc. User should then be able to further breakdown on any of them into sub categories and see the movement.

In some cases such as the Kindle and Kobo, where there are sub-stages, one should be able to slice it across them (say seperate visual or something, not via drillable from top).

Have gone through various videos, tutorials on the net but am not able to get proper direction on how to proceed. I am able to get few pieces at a time by dragging in ‘Description’ and setting the filters on it. I want to understand how to set-up a dynamic hierarchy and relation and how to define such model.

Any help or links to relevant sources would be highly appreciated.

@vkarthik21,

Welcome to the forum, great to have you here!

Generally, I think that the basic Parent-Child hierarchy pattern on the SQLBI DAX Pattern site will get you most of the way there:

https://www.daxpatterns.com/parent-child-hierarchies/

However, I am scratching my head as to how to incorporate the Stage 1/Stage 2 breakdown info into that heirarchical structure. I will give it some additional thought and hopefully someone else here can come up with an approach to handing that wrinkle.

I hope this is helpful.

  • Brian
1 Like

Thanks Brian for the reply. Am now going through that article and will check to see how it has been developed.

I will await your reply on Stage 1/ Stage 2. Basically as I was saying there are various subsections in the actual data that having this info will help in creating the appropriate analytics.

@vkarthik21,

In relooking at this, I think there’s a very simple solution that’s completely compatible with the basic parent-child hierarchy pattern – just make the stages the fourth level down on the hierarchy the child level of the publisher. See below for example:

1.2.1. Harper Collins
1.2.1.1 Harper Collins - Stage 1
1.2.1.2 Harper Collins - Stage 2

– Brian

Hello @vkarthik21,

Thank You for posting your query onto the Forum.

As @BrianJ suggested, you’ll be required to create a hierarchy into the table. I’ve created two tales based on the data that you’ve provided. One is “Fact Table” and other one is “Dimension Table” and have also added the “Dates” table into the model and have marked date field as a “Date Table”.

So this is how your model looks like now. Below is the screenshot provided for the reference -

And this is how the results looks like in a hierarchial manner and also more importantly numbers matches with the data that you’ve provided. Below is the screenshot provided for the reference -

Please go through the transformation steps into the Power Query. I’m not writing the entire explanation here since it’ll just increase the length of the post by huge margin.

I’m attaching the PBIX file of my working for the reference. Also I recommend you to please go through the “Data Modelling” course which is already available onto our education portal. Below is the link of the course provided for the reference.

Please give a shout in case wherever you find it difficult to understand any of the step.

Hoping you find this useful and meets your requirements that you’ve been looking for. :slightly_smiling_face:

Thanks and Warm Regards,
Harsh

Data Modelling - Dynamic Hierarchial Data.pbix (71.0 KB)

3 Likes

@Harsh,

Wow - nicely done! I like your PQ-based approach better than my recommended DAX approach above.

  • Brian
1 Like

Hello @BrianJ,

As you’ve always said, if we can do the transformations in Power Query than we need not spend much time in writing complex DAX measures. :+1:

And that’s what I tried to do once I established the hierarchy I had to write only one DAX just to check whether it’s properly established or not.

Thanks and Warm Regards,
Harsh

2 Likes

Thor Yes

3 Likes

Thanks @Harsh for your reply. Yes, this definitely is what I was looking for. Could you please explain me how you built the Dimension table? What logic went into it? In my actual data the levelling goes up to depth of 5.

Hello @vkarthik21,

Sure, would be really happy to help you and provide the brief explanation behind the logic that went through. For understanding the steps that’s been applied you can go throught the “Applied Steps” option into the Power Query by clicking onto the individual steps as well as by clicking onto the “Gear Icon” so that you can see what was actually done there.

Firstly, the “Fact” and the “Dimension” Table are created by referencing the original table that you had provided into the CSV file. If you want to know what’s referencing you can find a video into the “Data Modelling” course.

So as you suggested in your initial post that you’ve “Categories”, “Sub-Categories” and “Stages” but than for some only of the cases you also have “Sub-Stages” so this is where it gets trickier. And the minute details of the original table was also very important at the first instance otherwise splitting of the column will provide ambiguous results. In your original table (as you can see from the screenshot provided below) some of the hierarchy numbers ends with symbol - “.”.

And therefore, it was important to end all the hierarchy numbers with the symbol - “.”. This you can check it out how it was done by going through the “Applied Steps” option. And now since we’ve achieved the consistency into the hierarchy numbers now we’ve to split the column based onto the symbol - “.”. Once the columns are split this is how it’ll look like. Below is the screenshot provided for the reference -

Now, once I split the column I removed some of the un-necessary columns as well as some of the un-necessary data from some of the columns. Again go through the “Applied Steps” option. After removing all the un-necessary things from the table, this is how the table will look like. Below is the screenshot provided for the reference -

Now, just select the “Description.2” column and right-click on it and use the “Fill Down” option. Below is the result in the form of screenshot of that column provided for the reference.

Now, you must be wondering let’s follow the same process for “Description.3” column as well. Or why I didn’t applied the same steps on “Description.3” column as well. So here’s the screenshot of the result had we followed the same process for “Description.3” column as well.

So if you observe first part of the screenshot everything looks good but in the second part you’ll find the error that “Paper and Hardbacks” which is the sub-category of the category “Books” is now getting dragged under the category of “Electronics” part as well. And this is where we need to write the logic of “IF AND ELSE” in the form of “M Code” by adding a “Custom Column” in order to avoid this type of ambiguous results. Below is the M Code provided for the reference -

if [Description.2] = "Books" and [Description.3] = "e-Readers" then "e-Readers" else
if [Description.2] = "Books" and [Description.4] = "Kindle" then "e-Readers" else
if [Description.2] = "Books" and [Description.4] = "Kindle - Stage 1" then "e-Readers" else
if [Description.2] = "Books" and [Description.4] = "Kindle - Stage 2" then "e-Readers" else
if [Description.2] = "Books" and [Description.4] = "Kobo" then "e-Readers" else
if [Description.2] = "Books" and [Description.4] = "Kobo - Stage 1" then "e-Readers" else
if [Description.2] = "Books" and [Description.4] = "Kobo - Stage 2" then "e-Readers" else

if [Description.2] = "Books" and [Description.3] = "Paper and Hardbacks" then "Paper and Hardblacks" else
if [Description.2] = "Books" and [Description.4] = "Harper Collins" then "Paper and Hardblacks" else
if [Description.2] = "Books" and [Description.4] = "Harper Collins - Stage 1" then "Paper and Hardblacks" else
if [Description.2] = "Books" and [Description.4] = "Harper Collins - Stage 2" then "Paper and Hardblacks" else
if [Description.2] = "Books" and [Description.4] = "Penguin" then "Paper and Hardblacks" else
if [Description.2] = "Books" and [Description.4] = "Penguin - Stage 1" then "Paper and Hardblacks" else
if [Description.2] = "Books" and [Description.4] = "Penguin - Stage 2" then "Paper and Hardblacks" else

if [Description.2] = "Electronics" and [Description.3] = "Speakers" then "Speakers" else
if [Description.2] = "Electronics" and [Description.4] = "Outdoor Speakers" then "Speakers" else
if [Description.2] = "Electronics" and [Description.4] = "Sound Bars" then "Speakers" else

if [Description.2] = "Electronics" and [Description.3] = "Television" then "Television" else
if [Description.2] = "Electronics" and [Description.4] = "Samsung" then "Television" else
if [Description.2] = "Electronics" and [Description.4] = "75 Inch" then "Television" else
if [Description.2] = "Electronics" and [Description.4] = "65 Inch" then "Television" else
if [Description.2] = "Electronics" and [Description.4] = "LG" then "Television" else

null)

The logic basically states that -

If Description.2 column is “Books” and Description.3 is “e-Readers” than result should be “e-Readers” and than since Description.3 column contains the null so now we’ll take help of Description.4 column which also an attribute or sub-part of Description.3 column. And that is, If Description.2 column is “Books” and Description.4 is “Kindle” than result should be “e-Readers” and now keep following this same logic till the very end”.

And this is how the result will look like. Below is the screenshot of the result of “Custom Column” provided for the reference -

And now, in the screenshot you’ll see that “Electronics” is blank for the first line whereas previously it was getting over-lapped with the sub-category “Paper and Hardbacks”.

Now, once this is done. We’ll move on towards the last important part. And that is, for some of the Stages you’ve Sub-Stages and for some you don’t have so will add a “Conditional Column” here. So the logic will be like this -

If you’ve Sub-Stages than provide us the result of Sub-Stages otherwise provide us with the result of Stages. That is, if Description.4 column has a Sub-Stage than provide results of the Description.4 column otherwise provide the results of Description.3 column”.

Below is the screenshot provided for the reference of the conditions as well as of the results for the reference (See Custom Column) -

And than lastly since we need to create a relationship between the “Fact” and “Dimension” table I’ve added an “Index Column” into it. And this is how the final result will look like after performing all the transformations -

For Fact table, I’ve applied basic transformation steps and that you can go through those steps throught the “Applied Steps” steps.

Now, just click on “Close and Apply” option, load the tables into the Power BI and create a relationships between them. Once this is done, just drag and drop the fields from the “Dimension” table into the Table visualization and the one simple basic measure that I had created. This is how the result will look like at the first instance below is the screenshot of the result provided for the reference -

Since it includes the blanks as well. In order to remove them just go to the “FILTER PANE” and un-check the “(BLANK)” option. Below is the screenshot of the results provided for the reference -

Un-check Blanks from the FILTER PANE

And voila!!!. This is how the results were achieved.

Note: It absolutely doesn’t matter whether you’ve 4 hierarchy levels or 10 hierarchy levels the logic will remain absolutely same in order to derive the results.

Hoping you find this explanation useful and meets your requirements that you’ve been looking for. :slightly_smiling_face:

And Best of Luck with the analysis that you’re trying to achieve. :+1:

Happy Weekends!!!

Thanks and Warm Regards,
Harsh

4 Likes

Hi @vkarthik21

Using DAX you can create hierarchies.

dynamic hierarchical data.pbix (30.4 KB)

2 Likes

Thank you so much @Harsh for putting in so much effort and going through the process of deconstructing the logic and explaining the steps in methodical fashion.

I do see that it has manual hardcoded steps to be put into and am afraid it is just going to add complexity in building up my logic as I need to be able to adapt to various types of forms and hierarchies. This would entail adding in the manual changes everytime.

@Rajesh - Your solution is more adaptable and thanks a ton for providing the file. I am planning to re-use this.

On the whole I must say, I am very impressed by the vibrant community here. I just hope I would be able to build my skills gradually as I begin my journey.

One question I would like to ask is, I have just started the Mastering DAX course. I do have fair bit of Power BI Knowledge and created some dashboards and overall not adept. Are there any particular exercies where I can practise what I am learning here as in example dataset and exercises to get the desired result?

Thanks again all for the wonderful responses.

You have come to the right place. Every month Enterprise DNA runs a “Data Challenge” where you are given a sample dataset, and a short brief explaining the overall goal and the type of questions your report should answer. Participants post their reports along with a detailed writeup explaining how they created their report. In addition, Enterprise DNA founder and CEO Sam McKay provides a written evaluation with detailed constructive feedback for every single entry submitted. Entrants range from relatively new Power BI users to expert professional developers, and the community is incredibly supportive. I think almost every single person who participates regularly in the Challenges would tell you that it is the absolute best way to improve your Power BI skills quickly, and that it’s also a lot of fun. Data Challenge #12 should be kicking off this coming week, and you can find all the past Challenges, reports and PBIX files in the Data Challenge Showcase.

I would really encourage you to give it a try. In addition, we also offer some great prizes for the top submission each challenge by a first time entrant.

In addition to the Challenges, on the first and third Wednesday of each month we run “Problem of the Week”. These are intermediate level problems based on applied, practical problems that would be commonly seen in a business environment. The first problem each month is a DAX problem, while the second problem each month is focused on the use of Power Query. The problems are run by different members of our expert team, so you get exposed to a wide variety of techniques and problem-solving styles, both from the lead expert and from all the other entrants as well.

Past Problems of the Week, , along with submissions and expert solution videos can be found in forum Problem of the Week category:

https://forum.enterprisedna.co/c/problem-of-the-week/51

The DAX Problem of the Week for March kicks off this Wednesday.

So, lots of great structured practice opportunities to improve your Power BI skills. Hope to see you in one or both of these, and give a shout if you have any questions.

  • Brian

Thanks again @BrianJ for your response. Greatly appreciate it. Yes, I will certainly take an effort to put in my responses and learn somethings on the way.

1 Like

@BrianJ - Just a query. Going through the link you gave for Power BI Challenge Showcase, it is asking for Enrollment. Is this a separate payment again? I already have a Enterprise DNA Membership.

@vkarthik21,

Nope - that’s all included with your membership. However, the forum and the training portal do require two separate logins.

  • Brian

Thanks Brian. All good. :slight_smile: