CustomerAnalytis - Pareto Analysis

thanks to team edna for such a great portal and the contents!

I built a pareto analysis, taking inspiration from enterprise DNA portal content, which provides a great insight into who our key customers are, however i could use some help in addressing the performance issues.

The report takes a really long time as it creates a virtual table for creating a cumulative sales for each customer, so wanted to know if others have faced this issue and if you any suggestion to overcome is much appreciated…

  • Cumulative sum measure that is causing performance issue is written as

      CumulativeSales = 
          var totalsales = CALCULATE(SUM(FactSales[AmountAvgUSD]),ALLSELECTED(FactSales))
          var currentsales = SUM(FactSales[AmountAvgUSD])
          var summarydata = SUMMARIZE(ALLSELECTED(FactSales),FactSales[CustomerKey],"Revenue",SUM(FactSales[AmountAvgUSD]))
          var cumulativesum = 
              SUMX(FILTER(summarydata,[Revenue] >= currentsales),[Revenue])
              Return (cumulativesum)
    

the table is laid out as below

image

Hi @sanappi09,

After a quick look, I can see that you are calculating below measure in 3 places and I would assign it to variable and use in 3 places. In this way, it will be evaluated once.

var sales_amt = SUM(FactSales[AmountAvgUSD])

So, the formula would become and hopefully it will evaluate faster.

CumulativeSales =

VAR Sales_amt =
    SUM ( FactSales[AmountAvgUSD] )
VAR totalsales =
    CALCULATE ( Sales_amt, ALLSELECTED ( FactSales ) )
VAR currentsales = Sales_amt
VAR summarydata =
    SUMMARIZE (
        ALLSELECTED ( FactSales ),
        FactSales[CustomerKey],
        "Revenue", Sales_amt
    )
VAR cumulativesum =
    SUMX ( FILTER ( summarydata, [Revenue] >= currentsales ), [Revenue] )
RETURN
    ( cumulativesum )

thanks hafiz, tried the solution you have provided… The performance is pretty bad as before, and the new measure is not calculing ‘cumulative sum’ either…

Create a version of the measure as -
CumulativeSalesv2 =

VAR Sales_amt2 =
    SUM ( FactSales[AmountAvgUSD] )
VAR totalsales2 =
    CALCULATE ( Sales_amt2, ALLSELECTED ( FactSales ) )
VAR currentsales2 = Sales_amt2
VAR summarydata2 =
    SUMMARIZE (
        ALLSELECTED ( FactSales ),
        FactSales[CustomerKey],
        "Revenue", Sales_amt2
    )
VAR cumulativesum2 =
    SUMX ( FILTER ( summarydata2, [Revenue] >= currentsales2 ), [Revenue] )
RETURN
    ( cumulativesum2 )

image

Hi @sanappi09,

Thanks for the quick test. If you have sample file, it would be much easy to test different solutions and analyze. You can also try GROUPBY/ADDCOLUMN-SUMMARIZE combination etc to test performance gain/degradation etc.

is it possible to get on a webconference call? it is a connected data model with real data…

Hi @sanappi09,

I am afraid, at the moment, I might not be able to do webcall. I will try to simulate it with some test data and get back to you.

Ok thanks Hafiz, look forward to hearing from you… once again appreciate your help!

1 Like

Hi @sanappi09,

I belive context transition in SUMX is creating delay in measure calculation. I have tried other way to create calculated column to bring same result. This below formula seem quite fine to me in terms of speed.

Cummulative = 

VAR rnk =
    VALUES ( 'Customer'[Rank1] )
RETURN
    IF (
        ISFILTERED ( 'Customer'[CustomerKey] ),
        CALCULATE ( [Total Sales], FILTER ( ALL ( Customer ), Customer[Rank1] <= rnk ) ),
        BLANK ()
    )

image

1 Like

Hi @sanappi09, Please check out this video by Sam to learn more about - Testing The Pareto Principle (80/20 Rule) in Power BI w/DAX

Sorry for the delayed response Hafiz… did try you updated dax and still have problem, dont get the cumulative sum working… i am still trying few other options but appreciate your help…

Also please check this tutorial as well

Hi @sanappi09, we’ve noticed that no response has been received from you since the 7th of March. We just want to check if you still need further help with this post? In case there won’t be any activity on it in the next few days, we’ll be tagging this post as Solved. If you have a follow question or concern related to this topic, please remove the Solution tag first by clicking the three dots beside Reply and then untick the checkbox. Thanks!

Hello, I have a problem with my Pareto charts, because I have sometimes 2, 3, 4, or even more unique values for the Pareto analysis. By using SUMMARIZE to create a virtual table and iterating through that, the Pareto line jumps because of the FILTER statement and the grouping of this unique values. Do you have a solution for this problem ? I guess there is a additional IF statement necessary, but I haven’t got the right idea yet.

1 Like

To make my problem more transparent:

I have the same problem. for a week by now i am trying to solve it! please tell me if you found a solution. i am still new here and i am searching for it.
thanks

please start a new topic as this is already solved
thanks

Hi Sam,
This is really great, and I’ve been able to repeat what you did in the video and create a pareto chart for ‘customers’ and a pareto chart for ‘products’. I’ve used the pareto to dynamically segment products and customers based on revenue. The first 80% of revenue from each category is labeled an A or B customer/product. Now, I’m trying to put these labels together to segment m sales into quads. An A Cust/A Prod= Quad1, A Cust./B Prod = Quad2, etc. However, when I put the two together I’m losing data. I should have total sales of $228 million, but I’m only getting total sales of $180 million when I do this segmenting. I get the full $228 million when I segment only on Customers or only on Products. It’s only when I put them together that I lose values.
I thought combining what you did in the pareto chart lecture with this lecture would work to do this, but I haven’t been successful.
Segment Customers Based On Historical Performance - Power BI & DAX Techniques (enterprisedna.co)

My Pareto formula:

Pareto Chart (Customers) =
VAR CustomerRev = [Total Sales]
VAR AllCustomerSales = CALCULATE( [Total Sales], ALLSELECTED( Sales) )

RETURN
DIVIDE (
SUMX(
FILTER(
SUMMARIZE( ALLSELECTED( Sales ), Sales[Cust],
“Revenue”, [Total Sales] ),
[Revenue] >= CustomerRev ),
[Revenue] ),
AllCustomerSales, 0)

My customer & product segmenting formula:

80% of Revenue Customers AB =
IF( [Pareto Chart (Customers)] <= 0.8, “A” , “B”)

My Quad segmenting formula:

Quad 1 Sales = CALCULATE(‘Formula Table’[All Sales], Customers[Customer Type Local] = “A”, Products[Product Type Local] = “A”)

Lastly, I’ve attached my table relationships.


I hope this is possible and thanks for any help you can provide.