DAX Workout 010 - Pareto Analysis

@pragnya_puranik I am so glad you enjoyed the workout. I love how you used diverging colors in your example.

Thank you!!

Paul

2 Likes

Thanks for linking Samā€™s video. It helped me answer mine too

3 Likes

I used the method from Samā€™s tutorial on the Pareto chart to solve this.

However, before I discovered the tutorial, I tried to come up with my own method, and I donā€™t still understand why it doesnā€™t work. Iā€™ll share it below and I hope someone out there can help me understand why it doesnā€™t, and if thereā€™s potentially a way to remedy it. I was able to calculate the cumulative sales by creating a new physical table. However, when I tried to do the same thing using a virtual table, it doesnā€™t work.

This code doesn't work

Table = //this code creates a physical table in the model VAR customerranks = RANKX(ALLSELECTED(Customers[Customer Names]), [Total Sales],,DESC) VAR ranktable = ADDCOLUMNS( VALUES(Customers[Customer Names]), "Revenue", [Total Sales], "Rank", RANKX(ALLSELECTED(Customers[Customer Names]), [Total Sales],,DESC)) RETURN ranktable

cumulativesales =
//This code correctly calculates the cumulative sales
CALCULATE(SUM(ā€˜Tableā€™[Revenue]),
FILTER(ALLSELECTED(ā€˜Tableā€™),
ā€˜Tableā€™[Rank] <= MAX(ā€˜Tableā€™[Rank]))
)

This is the code I used to produce the final chart:

Correct working code

Pareto value =

VAR customerrev = [Total Sales]
VAR Allsales = CALCULATE([Total Sales], ALLSELECTED(Sales))
VAR cumulative =
SUMX(
FILTER(
SUMMARIZE(ALLSELECTED(Sales), Customers[Customer Names],
ā€œRevenueā€, [Total Sales]),
[Revenue] >= customerrev),
[Revenue]
)
VAR paretoval =
DIVIDE(cumulative, Allsales, 0)
RETURN paretoval

2 Likes

@MubarakBabs I appreciate you having a go at it.

The code you provided is not directly related to generating a Pareto curve. The code seems to be creating a table with customer sales information and calculating the cumulative sales based on the customer ranking by sales.

To generate a Pareto curve, you need to plot the cumulative percentage of a certain variable (e.g., sales) against the percentage of the total population that contributes to that cumulative percentage. In other words, the Pareto curve shows the proportion of the total value that is attributed to the top X% of items, where X is on the horizontal axis, and the cumulative proportion of the total value that they represent, which is on the vertical axis.

1 Like

Thank you for the explanation.

Calculating the cumulative sales was meant to be an intermediate step in building out the Pareto curve. Iā€™m still confused as to why Power BI wouldnā€™t let me calculate the cumulative sales within that measure using virtual tables, while the exact same code would work when creating a physical table. Does this mean that virtual table columns have limits to their usage, which can only be overcome by using physical tables?

Case in point, in the calculation above, I referenced 'Table'[Rank] <= MAX('Table'[Rank]), which succeeded in achieving the intended results. But when I made ā€˜Tableā€™ a virtual table within a measure, I was unable to reference the virtual column [Rank] within the MAX function. This seemed to be the main limitation I faced. Is it possible to overcome this limitation? Or should I just give up and accept that virtual columns donā€™t always function like physical columns do?

@MubarakBabs

In Power BI, virtual columns, also known as calculated columns, can have some limitations compared to physical columns. One such limitation is that virtual columns cannot be directly referenced within aggregation functions like MAX, MIN, or SUM. This is because virtual columns are evaluated at the row level, while aggregation functions operate at a higher level, such as the filter context or groupings.

To overcome this limitation and calculate the cumulative sales within a measure using virtual columns, you can use the EARLIER function in combination with the virtual column. The EARLIER function allows you to refer to the value of a column in the previous row context.

I hope this helps and please donā€™t give up. Read up on virtual tables and columns. I am still learning new things every day.

Paul

1 Like

Thanks for the suggestion @Paul.Gerber. The EARLIER function doesnā€™t allow me to reference the virtual column. I guess Iā€™ll just move on from it until Iā€™ve learned more.

Thanks once again

1 Like

Amazing solution. So impressive

1 Like

This is an excellent workout, I go through the Lab and understood another way of calculating cumulative sales on different context/types of data. Thanks

DAX Measures

This text will be hidden

Cumulative Cost =
VAR _custCost = [Total Costs] 

RETURN
SUMX(
    FILTER(
        SUMMARIZE(
            ALLSELECTED( Sales ),
            Customers[Customer Names],
            "Cost", [Total Costs]
        ),
        [Cost] >= _custCost
    ),
    [Cost]
)

Pareto Chart =
VAR _allCost = CALCULATE( [Total Costs], ALLSELECTED( Sales ) ) 

RETURN
DIVIDE( [Cumulative Cost], _allCost, 0 )
1 Like

answer:
solution:


DAX Workout No10 MB.pbix (597.7 KB)