Pareto Rule Total Issue

Matty · July 10, 2020, 8:17am

Hi,

I watched Pareto Rule, there are no resources but is the total correct for 20 %?

If we would add every individual customer for each state the total doesn’t add up to 10k is way over.

Harsh · July 10, 2020, 9:04am

Thank You for posting your query onto the Forum.

As you’ve already seen in a video that how Sam has calculated the Sales of Top 20% Customers. I’m just providing the formula below for the reference.

Part 1: Calculating Sales of Top 20% of Customers

Sales of Top 20% of Customers =

VAR CustomerPercent = DISTINCTCOUNT( Sales[Customer ID] ) * 0.2

RETURN
CALCULATE( [Total Sales],
   FILTER( VALUES( Sales[Customer ID] ),
     RANKX( VALUES( Sales[Customer ID] ), [Total Sales], , DESC ) <= CustomerPercent ) )

Now, in the given image you’ll observe that the analysis is done based on the State Code and not on Customer Name or Customer Code so the context of State Code while calculating the “Total Sales of 20% Customers” has been ignored. Now to add back the context of “State Code” in your total I’ve used a “Measure Branching Technique”. And below is the formula you can use after calculating the “Sales of Top 20 Customers”.

Part 2: Calculating Total Sales of Top 20% Customers

Total Sales of Top 20% Customers = 
SUMX(
    SUMMARIZE( Locations , Locations[State Code] , "Total Sales of Top 20% Customers" , [Sales of Top 20% of Customers] ) , 
        [Total Sales of Top 20% Customers] )

Now, after writing the above DAX you’ll observe that the “Total Sales of Top 20% Customers” will show you the correct total at below. I’m also providing a screenshot below as well as the PBIX file of my working for the reference.

I’ve laid emphasis on the “Measure Branching Technique” for which Sam has always advocated to us. We could have also used the “Variable Technique” in order to compress it into one formula but as Sam I also lay emphasis on “Measure Branching Technique” so that I can understand the concept behind the formula which also helps you to correct the formula if you find something unusual in order to achieve the desired result.

Whereas in “Variable Technique” it would be difficult to analyze the steps about where one went wrong since all the calculations happens virtually.

Hoping you’ll find this useful and helps you in your analysis.

Thanks & Warm Regards,
Harsh

The Pareto Principle (80-20) Rule.pbix (433.1 KB)

Matty · July 10, 2020, 9:59am

Hello @Harsh,

Thank you for your help, I am trying to get my head around.

So when we are applying “State Code” context the formula is re-valuating/ re-calculating values to each State?

I am a bit confused now as I tried to apply filters customer name or ID from customer or sales data but I just receive a only total. Is it because we used customer ID filters in the formula so we can only see a whole value?

I did use the same formula in my file and I would like to have a better understanding of how DAX treats filters in these contexts.

For Example In my context

Total Revenue is 47 million and as I knew it 20% of customer will generate around 80% of Sales which actually comes to 35.5 milion but if I apply my filter similar to State code and then I used your formula to summarize column I get 17 milion and for few states there are no values, even though I know that there was a sale for 5 milion (if we would apply base rank formula this would be ranked around 3-5th)

Apologies for asking but I just want to understand the context between formula and the filter that we apply.

Harsh · July 10, 2020, 11:33am

Hello @Matty,

Now as you said that you tried different variations such as customer name or ID from customer or sales data. The context changes from State Code to Customers.

I’m attaching the screenshot below to explain my case with the example -

In the given image as shown above, I’ve adopted two approaches to calculate the “Top 20% Customers by Customer Name”.

Now, you may seem confused that although I’m analyzing the table by Customer Name why I’ve given name to my measure as Top 20% Customers by ID**". There are 2 reasons which I would like you to explain here -

Part 1: Measure Branching Technique

Since my Sales Table and Customers Tables is created on the basis of relationship between Customer IDs so while calculating the Unique Customers I’ve used Customer ID from the Sales Table since my Customer ID column from Customer Table flows down and creates a relationship with the Customer ID column from the Sales Table. (Waterfall Effect i.e. when we create this type of effect our relationship moves down from Dimension Table to the Fact Table and not vice - versa).

So below is the formula provided for the reference -

Unique Customers 1 = DISTINCTCOUNT( Sales[Customer ID] )

After this, since I’ll be adding my Customer Names from the Customer Table I will be required to calculate their Ranks. Below is the formula provided for the reference -

Customer Rank = RANKX( ALL( Customers ), [Total Sales], , DESC )

Now once I’ve my Unique Customer ID and their Ranks I’ll proceed further to calculate “Top 20% Sales by Customer Name”. Below is the formula provided for the reference -

Top 20% Customers by Customer ID 1 = 
VAR UniqueCustomers =  CALCULATE( [Unique Customers 1], ALL( Customers ) )
VAR Top20 = UniqueCustomers * 0.2

RETURN
IF( ISFILTERED( Customers[Customer Name] ),
    IF( [Customer Rank] < Top20, [Total Sales], BLANK() ),
        CALCULATE( [Total Sales], TOPN( Top20, ALL( Customers ), [Total Sales], DESC ) ) )

Part 2: Variable Technique

Now to avoid all these process one can also use the Variables provided one has really better understanding of the context. Below is the formula provided for the reference -

Top 20% Customers by Customer ID 2 = 
VAR CustomerRank = RANKX( ALL( Customers ), [Total Sales], , DESC )
VAR UniqueCustomers =  CALCULATE( DISTINCTCOUNT( Sales[Customer ID] ) , ALL( Customers ) )
VAR Top20 = UniqueCustomers * 0.2

RETURN
IF( ISFILTERED( Customers[Customer Name] ),
    IF( CustomerRank < Top20, [Total Sales], BLANK() ),
        CALCULATE( [Total Sales], TOPN( Top20, ALL( Customers ), [Total Sales], DESC ) ) )

Now, after considering all these you might be wondering why “IF( ISFILTERED( Customers[Customer Name] )” has been used. This is where the main answer lies.

As you said whats difference between the formula and filter context. The reason is as follows -

When we are analyzing the Customer Names without applying any filters from outside the table then the context is there within the table only. i.e. Customer Name in this case. (So Customer Name is as a context within the table).
But now if we apply any filter/slicer in any form from outside the table then alongwith the Customer Name context which is there within the table we are also evaluating and considering the context which is coming from outside the table and therefore, in order to consider the context coming from outside the table we’ve to include “IF( ISFILTERED( Customers[Customer Name] )” it our formula . And this is what context transition is all about.

Now, if I apply filters in on any state we’ll get the right values of “Top 20% Customers Sales by Customer Name”. Below is the screenshot provided for the reference -

I’m also attaching the PBIX file of my working as well as I’m providing few links below of videos which is specifically related to Ranking Techniques which was created by Sam during his “Learning Summit Series” and how it involves context transition. As well as there are several other videos available on our education portal and I sincerely recommend you to go through those videos for much more better understanding of the “Context” concept before you starting practicing with formulas.

Hoping you find this useful and helps you in your analysis.

Please note: If you change the Customer Name with Customer ID again the answer will change and you’ll be required to adjust your formulas accordingly. The given PBIX file is for example purpose.

Thanks & Warm Regards,
Harsh

The Pareto Principle (80-20) Rule.pbix (431.9 KB)

Matty · July 10, 2020, 4:57pm

@Harsh Once again big massive thank you.

I will read it tomorrow morning few times to understand better (my brain today is knackered long day :D)

I will definitely watch both Learning Summit Series before I try to apply formulas in my analysis

I will come back to you @Harsh as I slowly start to understand

Harsh · July 11, 2020, 7:30am

Hello @Matty,

You’re Welcome.

I’m glad I was able to help you.

And I forgot to mention it during my previous response that there was no need for apologies. I’m glad you asked your queries for better understanding of the concept. Always happy to help because we also learn some new things when members ask questions on the forum. Ask your queries freely without any hesitation.

And lastly before I sign off I’m providing a link below of an article from Alberto Ferrari. I recommend you to go through this article once where he has explained about although “DAX is simple, but not easy.”

Hoping you find this useful.

Thanks & Warm Regards,
Harsh

Matty · July 12, 2020, 11:06am

Hello @Harsh ,

I hope you had a great weekend.

I watched both Summits and read the article.

I agree that DAX is simple but not easy. With DAX you need to understand how each function will behave it’s not like excel where you can dive into the formula bar and read each step by step.

I think slowly I start to understand how the advanced RANKX formula works, as I understood the formula is summing up so we need to be careful if there are any duplicate names.

I am attaching the file that I work with. Pareto.pbix (122.9 KB)

There are few customers that have the same name but unique customer ID.

I have used different contexts to ensure I will get the correct answer.

Page 1 I Ranked by Customer and By Customer ID to display customerID and customer names, I add ALLSelected so it’s applying extra context.

On-Page 2 I add another context City, but then while I review the data, the rankx worked nearly correct, the data was kind of duplicating creating extra customer names for each City. I believe I could merge all 3 and then Rank them by unique name then the formula would work as I assume.

I used If statement to rank everything in order but I run into a small issue

If Rank City/Customer/ID = IF([Total Sales] = 0,BLANK(),CALCULATE( RANKX(ALL(Customer[Customer ID]), [Total Sales],DESC),
ALLSELECTED(City[City]),
ALLSELECTED(Customer[Customer Name])))

Is it possible to adjust the if statement to put 0 value in rank order? Or I should actually for this task merge 3 columns in the sales table since City and Customer are in two different tables?

On page 3 I tried to calculate 20% of RANKX

To ensure it works correctly, I applied the TOPN formula if you notice both Column Charts are correct.

But when we apply customer name or city context to the table I don’t believe the formula works correctly. As I do know in City 2 (see matrix table) there are sales over 6.5 and there is one customer only.

To see my TOP20% customer, I just used TOPN formula (see TOP27)
I thought I can apply Variable function to my top20% but I just received a correct total amount but when the customer names context is applied there rows are empty. I tried to apply your Formula “ISfiltered” but didn’t work.

I divided your formula into two sections to have a better understanding and this is how I understand:

At Page 4 I played with grouping measures

If I apply the filter for the city then I get all customer names grouped correctly but if apply the filter from treemap only one customer is visible.

Should I turn off the interactions between grouping and treemap or the formula needs to be adjusted?
As I understand treemaps are good for showing the proportions between each part and the whole, display large numbers shows patterns, outliers.

I will appreciate your help.

Harsh · July 12, 2020, 4:46pm

Hello @Matty,

Can I request you one thing?

Since this thread has been closed can you post this same question by creating a new thread so that I can provide the solution.

As per the forum guidelines, one question can be asked in one thread and since it’s been accepted as "Solution" just create a new thread and I’ll post my solution over there.

Sorry, for the inconveniences caused.

Thanks & Warm Regards,
Harsh