@pragnya_puranik I am so glad you enjoyed the workout. I love how you used diverging colors in your example.
Thank you!!
Paul
@pragnya_puranik I am so glad you enjoyed the workout. I love how you used diverging colors in your example.
Thank you!!
Paul
Thanks for linking Samās video. It helped me answer mine too
I used the method from Samās tutorial on the Pareto chart to solve this.
However, before I discovered the tutorial, I tried to come up with my own method, and I donāt still understand why it doesnāt work. Iāll share it below and I hope someone out there can help me understand why it doesnāt, and if thereās potentially a way to remedy it. I was able to calculate the cumulative sales by creating a new physical table. However, when I tried to do the same thing using a virtual table, it doesnāt work.
Table = //this code creates a physical table in the model VAR customerranks = RANKX(ALLSELECTED(Customers[Customer Names]), [Total Sales],,DESC) VAR ranktable = ADDCOLUMNS( VALUES(Customers[Customer Names]), "Revenue", [Total Sales], "Rank", RANKX(ALLSELECTED(Customers[Customer Names]), [Total Sales],,DESC)) RETURN ranktable
cumulativesales =
//This code correctly calculates the cumulative sales
CALCULATE(SUM(āTableā[Revenue]),
FILTER(ALLSELECTED(āTableā),
āTableā[Rank] <= MAX(āTableā[Rank]))
)
This is the code I used to produce the final chart:
Pareto value =
VAR customerrev = [Total Sales]
VAR Allsales = CALCULATE([Total Sales], ALLSELECTED(Sales))
VAR cumulative =
SUMX(
FILTER(
SUMMARIZE(ALLSELECTED(Sales), Customers[Customer Names],
āRevenueā, [Total Sales]),
[Revenue] >= customerrev),
[Revenue]
)
VAR paretoval =
DIVIDE(cumulative, Allsales, 0)
RETURN paretoval
@MubarakBabs I appreciate you having a go at it.
The code you provided is not directly related to generating a Pareto curve. The code seems to be creating a table with customer sales information and calculating the cumulative sales based on the customer ranking by sales.
To generate a Pareto curve, you need to plot the cumulative percentage of a certain variable (e.g., sales) against the percentage of the total population that contributes to that cumulative percentage. In other words, the Pareto curve shows the proportion of the total value that is attributed to the top X% of items, where X is on the horizontal axis, and the cumulative proportion of the total value that they represent, which is on the vertical axis.
Thank you for the explanation.
Calculating the cumulative sales was meant to be an intermediate step in building out the Pareto curve. Iām still confused as to why Power BI wouldnāt let me calculate the cumulative sales within that measure using virtual tables, while the exact same code would work when creating a physical table. Does this mean that virtual table columns have limits to their usage, which can only be overcome by using physical tables?
Case in point, in the calculation above, I referenced 'Table'[Rank] <= MAX('Table'[Rank])
, which succeeded in achieving the intended results. But when I made āTableā a virtual table within a measure, I was unable to reference the virtual column [Rank] within the MAX function. This seemed to be the main limitation I faced. Is it possible to overcome this limitation? Or should I just give up and accept that virtual columns donāt always function like physical columns do?
In Power BI, virtual columns, also known as calculated columns, can have some limitations compared to physical columns. One such limitation is that virtual columns cannot be directly referenced within aggregation functions like MAX, MIN, or SUM. This is because virtual columns are evaluated at the row level, while aggregation functions operate at a higher level, such as the filter context or groupings.
To overcome this limitation and calculate the cumulative sales within a measure using virtual columns, you can use the EARLIER function in combination with the virtual column. The EARLIER function allows you to refer to the value of a column in the previous row context.
I hope this helps and please donāt give up. Read up on virtual tables and columns. I am still learning new things every day.
Paul
Thanks for the suggestion @Paul.Gerber. The EARLIER function doesnāt allow me to reference the virtual column. I guess Iāll just move on from it until Iāve learned more.
Thanks once again
Amazing solution. So impressive
This is an excellent workout, I go through the Lab and understood another way of calculating cumulative sales on different context/types of data. Thanks
This text will be hidden
Cumulative Cost =
VAR _custCost = [Total Costs]
RETURN
SUMX(
FILTER(
SUMMARIZE(
ALLSELECTED( Sales ),
Customers[Customer Names],
"Cost", [Total Costs]
),
[Cost] >= _custCost
),
[Cost]
)
Pareto Chart =
VAR _allCost = CALCULATE( [Total Costs], ALLSELECTED( Sales ) )
RETURN
DIVIDE( [Cumulative Cost], _allCost, 0 )