DAX Workout 010 - Pareto Analysis

Welcome to another workout. This is a 2-star workout.

This is a straightforward workout. With the existing dataset, attempt to do a Pareto Analysis as shown.

Have fun !!

I am looking forward to the results.

Submission

Load the supplied data file into a new Power BI file, create your solution, and reply to this post. Upload a screenshot of your solution along with the DAX measure. Please format your DAX code and blur it or place it in a hidden section.

Period
This workout will be released on Thursday, April 20, 2023, and will end on Sunday, April 23, 2023. But you can always come back to any of the workouts and solve them.

DAX _10 PARETO RULE.pbix (563.1 KB)

4 Likes

I’ve never been impressed with out-of-the-box pareto chart capabilities of power bi mainly because of the limited ability to annotate and enhance the visual with, say, x- and y-axis reference lines showing 80% of y and 20% of x or, even better, dynamic reference lines controlled by parameters. The real value for me is in drawing insights like, in the current example, what percentage of customers are responsible for what percentage of sales is that x to y ratio consistent across countries? Maybe less than 5% of customers account for more than 90% of sales in one country…Anyway, I find myself building charts like these in vega-lite more often than with the core visuals.

This is how I would calculate Pareto Chart (Customer) measure in your chart:

Pareto Chart (Customer)
Pareto Chart (Customer) = 
VAR __salesAll = 
    CALCULATE( 
        [Total Sales] 
        , ALLSELECTED ( Sales ) 
    ) 
VAR __salesSelected = [Total Sales]
VAR Result = 
    DIVIDE(
        SUMX(
            FILTER(
                SUMMARIZE(
                    ALLSELECTED( Sales )
                    , Customers[Customer Names]
                    , "Cum. Sales" , [Total Sales]
                )
                , [Cum. Sales] >= __salesSelected
            ) 
            , [Total Sales]
        ) 
        , __salesAll 
        , 0
    )
RETURN 
    Result

DAX _10 PARETO RULE.pbix (403.2 KB)

Thank you very much for planning this workout!

2 Likes

The recent updates to ORDERBY make this simpler than it used to be.

Dax Code
Pareto Chart (Customer) =
VAR _TotalSales =
    CALCULATE (
        [Total Sales],
        ALLSELECTED ( Customers[Customer Names] )
    )
VAR _CumulativeSales =
    CALCULATE (
        [Total Sales],
        WINDOW (
            1, ABS,
            0, REL,
            ALLSELECTED ( Customers[Customer Names] ),
            ORDERBY ( [Total Sales], DESC )
        )
    )
VAR _Result = DIVIDE ( _CumulativeSales, _TotalSales )
RETURN
    _Result
6 Likes

Nice @AlexisOlson. Not an approach that came to mind, and I think I will change my pattern script and adopt your method in the future.

1 Like

Thanks for the workout, below is my submission;

I got some ideas by watching @SamMcKay Pareto’s video https://www.youtube.com/watch?v=rlUBO5qoKow

Findings: The percentage of customers accounting for the highest sales
across each country is between 1 and 6%

DAX Code

Summary

3 Likes

Great job @Eze !! Sam’s tutorials are great.

1 Like

@Paul.Gerber,

Another really excellent, practical DAX workout - thanks!

Click for DAX Code
Pareto Sales = 

VAR __SelSales = [Total Sales]

VAR __vTable_ =
ALLSELECTED( Sales )

VAR __Denominator = 
SUMX( __vTable_, [Total Sales])

VAR __Numerator =
SUMX(
    FILTER(
        SUMMARIZE(
            __vTable_,
            Customers[Customer Index],
            Customers[Customer Names],
            "@TotSales", [Total Sales]
        ),
        [@TotSales] >= __SelSales
    ),
    [@TotSales]
)

VAR __Result =
DIVIDE( __Numerator, __Denominator )

RETURN __Result

This text will be hidden

3 Likes

Thanks @BrianJ I liked the approach of changing the color of the bar chart

1 Like

Here is my submission for this workout. I could have done this using windows functions but that has been covered already so used another method.
We can analyze sales as well as cost Pareto for each country and can see best and worst customers for product/s

Pareto.pbix (577.7 KB)
Pareto.pdf (152.7 KB)
DAX.txt (765 Bytes)

2 Likes

DAX _10 PARETO RULE Steven Stanford.pbix (565.0 KB)

Hi, great challenge.

1 Like

Hi Paul,

I have never used Pareto chart before but found Sam’s tutorial very useful. Thanks for a great workout

My dax:

Summary

1 Like

Great job, @KimC I used it a few times with revenue and inventory at work.

Hi @Paul.Gerber, can you explain more about the context in this DAX. I am still unclear about comparing the sales in variance and sales in the virtual column created.

Hello @collier.brittany Are speaking about the virtual “sales” column in your DAX? Can you be specific with your question as to the problem you are facing? Or are you trying to understand the Pareto Rule?

Paul

Hi @Paul.Gerber
For example, in my Dax code below, I compare [Sale] and _sales. I want to know why these two are different based on the context.

@collier.brittany

In the context of a Pareto analysis, the purpose of the DAX code you provided is to calculate the cumulative sales for each customer and to identify the top 80% of customers that account for 80% of the total sales.

The measure [Sale] represents the individual sales values for each customer, while the measure [Total Sales] represents the total sales value across all customers. The variable _sales are assigned the value of [Total Sales].

The FILTER function in the code filters the table generated by the SUMMARIZE function to include only those customers whose sales are greater than or equal to the total sales value. This effectively filters out customers whose sales are not significant enough to be included in the top 80%.

The SUMX function then iterates over the filtered table and calculates the cumulative sales for each customer. This allows you to identify the top 80% of customers that account for 80% of the total sales, which is a key step in a Pareto analysis.

Overall, the difference between [Sale] and _sales in this context is that [Sale] represents individual sales values for each customer, while _sales represents the total sales value across all customers. By comparing these two measures and filtering out customers whose sales are not significant enough, you can identify the top customers that account for the majority of the sales in your data set.

I hope this helps you.

Paul

Thanks for your explanation @Paul.Gerber

1 Like

Did that help?

1 Like

@Paul.Gerber
Yub. A little tricky but now I understand.

1 Like

Hello @Paul.Gerber,

Thanks for the amazing DAX workout!

As the concept was new for me, it made me curious to understand and then proceed with implementation. Entreprise DNA YT Channel helped me clear the concept and doubts.

Attached is the outcome of the DAX workout.

Below is the DAX Code:

Pareto_Score =
/* Purpose of asssigning measure to a variable:variable helps evaluate before the below formula context*/

var total_sales = [Total Sales]
var allsales = CALCULATE([Total Sales], ALLSELECTED(Sales))
return

/* Main purpose:Divide cumumative sales by Total sales to derive upon 80%-20% rule per Pareto Principle*/

DIVIDE(
SUMX(
/* Cummulative Total Code */
FILTER(
SUMMARIZE(ALLSELECTED(Sales), Customers[Customer Names],
“Sales”, [Total Sales]),
[Sales] >= total_sales),
[Total Sales]), /Cumulative Total code ends here/
allsales, 0)

2 Likes