Hey all,
Can someone please explain to me why the Sumx measure in my model is not right. (Side note, I am very new to BI and yes I have taken a few of Sam’s Dax courses , and plan to take more as it is clear I have more to learn). As you can see in my example I tried setting up measures to help sum by shipping type (inbound, outbound, stat). I know that the Sumx measures are wrong and the calculate ones are right.
However I would appreciate further explanation as to why my Sumx measures are wrong and what the expression is really doing, (ex: Sumx Inbound =
SUMX( FILTER( Sheet1, Sheet1[Type] = “Inbound Shipping”), [Total Cost]) )
I thought in my Sumx measures (I know that it is iterating (thanks to my coursework!), that I was saying in the expression; look row by row to see if the type = Inbound, if yes than sum that expression.
As you can see in my example that is not the case and the numbers are, for the most part, WAY off. It is interesting on the lower counts of shipments (about 12 or less) the 2 types of measures (Calculate and SumX ) do still produce the same results???
The reason why “CALCULATE()” version is yielding the correct results versus the “SUMX()” is due the following reasons. Let’s see first how “CALCULATE()” evaluates the results -
Important Note:-
1). Just couple of days back, I’ve provided the exact same explanation over “CALCULATE()” on a different thread. I would sincerely recommend you to please go through that thread for more better and in-depth understanding. Below is the link of that thread provided for the reference.
2). I’ll be considering only one “Type” as an example i.e., “Type = Inbound Shipping” and that same will be applicable, in the case of “Outbound Shipping” and “Stat Shipping” as well.
For “Calculate Inbound”, you’ve written the measure like this -
It means that, “CALCULATE()” automatically creates a row and filter context specifically over the “Type” column only and not over the entire “Sheet” table i.e., over the Fact table, in this case and then performs the context transition. And therefore, results are evaluated correctly. (Please refer the link of the thread about how CALCULATE executes the order/steps, row and filter context i.e., evaluation context, context transition and context modifiers. It contains all the details in that thread.).
Now, let’s see why “SUMX()” is yielding incorrect results in-depth since this part is not covered in that thread.
Note: Please go through the link/thread first which I’ve provided above to have better understanding about the “context transition” otherwise this explanation will still be not understood to it’s fullest extent.
Firstly, these are the mistakes that are committed in the data model (including measures) -
You’re simply using the “Flat File” in the data model i.e., there’re no dimension tables.
In the “SUMX()” related measure’s, you’re iterating the measure - “Total Cost” over the entire table i.e., over a flat file in this case.
It’s not a recommended best practice at all to iterate over the entire table, especially over the fact table. Always limit your iterations to the requisite/necessary columns. That is, in this case, “Type” column.
Explanation:-
Since you’re using a single flat file, the “Sheet” table is considered as a fact table, in this case. One of the characteristics of the fact table is that, they tend to contain “Duplicate Values”. And when you iterate over the entire fact table they provide an unexpected results which are hard to trace. See how -
This is the measure you’ve written for “Total Cost” -
Total Cost =
SUM( Sheet1[Amount] )
And then, you’ve referenced that same measure inside the “SUMX()” measure for calculating “Inbound”. This is the measure which you’ve written -
This is where things start to go downhill. As I suggested, fact tables tend to contain the duplicate values and on top of that it’s iterated over the entire fact table. So see, how that exact same measure is evaluated at the back-end of the Power BI engine.
Below is the formula alongwith the screenshot of the results evaluated at the back-end by Power BI provided for the reference - (Note: I’m showcasing the results where “Type = Inbound Shipping”, “Company Name = P” and “Amount = 6.61” to simply the explanation but that same logic is applicable throughout.)
That is, measure of “[Total Cost]” is surrounded by “CALCULATE()”. When we reference a previous measure in a new measure, previous measure is always surrounded by “Hidden CALCULATE()” in a new measure. And when this happens, context transition will automatically take place.
So since “Type = Inbound Shipping”, “Company Name = P” and “Amount = 6.61” are identical values and they’re repeated 10 times so all 10 rows will be muliplied with the “Amount = 6.61”. That is, “6.61 x 10 rows = 66.1”.
Similarly, when you’ve “Type = Inbound Shipping”, “Company Name = P” and “Amount = 1” where they’re repeated 4 times so all 4 rows will be muliplied with the “Amount = 1”. That is, “1 x 4 rows = 4”. See this -
The reason why this happens is because Power BI is not able to identify any unquie values for each row and therefore, it consider them as one single group. So the logic that gets applied is -
“1 single group x No. of Rows within that group x [Measure Referenced] (in this case, [Total Cost])”
This is why, I wrote those 3 three points before the explanation. If you would have written your measure like this inside the “SUMX()” -
Wherein context would’ve been applied/retained over the “Type” column and not over the entire table. In that case, context transistion would’ve happened over the “Type” column only and not over the entire table and finally, “SUMX()” would’ve provided the aggregated results and most importantly same results as “Calculate Inbound” measure . See this -
In case, you want to cross-verify my explanation with the results that I’ve showcased then I’m also attaching the working of the PBIX file for the reference purposes.
Hoping you find this useful and helpful.
EDIT:- The reason why I had to do multiple edits is because I had used the [Sumx Inbound] measure by mistake inside the ADDCOLUMNS() section rather than using [Total Cost] measure which did double context transition.