Why is this measure working? It shouldn't?

Diane · July 8, 2020, 4:11pm

I’m trying to help optimize a PBIX for someone pretty new with DAX. I dug into the code and they’ve got the following GROUPBY measure (this is also the main culprit for performance for somewhat obvious reasons). What I don’t get is how/why its working without CURRENTROW(). It does provide the correct values, just very slowly. Can you shine any light on what’s happening with this? The only way I can think of to replicate it is a summarize within the sumx, but that is going to also be a performance killer.

Measure = 
sumx(
    GROUPBY(
        'Sales',
        'Sales'[Record Num]        
    ),
    if(
        [OtherMeasure] < ([OtherMeasure2]),
        divide(
            (OtherMeasure3]*[OtherMeasure4]),
            [OtherMeasure2]
        ),
        0
    )
)

AntrikshSharma · July 8, 2020, 4:24pm

Can you try this once for me and share if there is any performance gain? Also what are the definitions of othermeasure2…4?

Measure  =
SUMX (
    FILTER (
        GROUPBY ( Sales, Sales[Record Num] ),
        [OtherMeasure] < [OtherMeasure2]
    ),
    DIVIDE ( [OtherMeasure3] * [OtherMeasure4], [OtherMeasure2] ) + 0
)

BrianJ · July 8, 2020, 4:28pm

@Diane,

Do you have a copy of The Definitive Guide to DAX, 2nd ed? Starting on page 421 they talk about this issue specifically, and recommend that in the vast majority of cases the ADDCOLUMNS/SUMMARIZE pair will accomplish the same thing as GROUPBY, but with much better performance.

Brian

Diane · July 8, 2020, 4:29pm

I could - but I was boggled that groupby is working without currentgroup. Everything I’ve read says that its required. I don’t want to tell them to use a function that will only work sporadically. The other measures are pretty complicated as well, but return results instantly.

BrianJ · July 8, 2020, 5:35pm

@Diane,

GROUPBY is not a function I use, so I had to do some additional digging on this at lunch and frankly what I found was really confusing. The article below from SQLBI does indicate that CURRENTGROUP() is necessary, but they also make reference to the fact that Microsoft may change this (and the SQLBI article is at least a few years old):

However, in the Microsoft documentation on this function it does not indicate that CURRENTGROUP() is required.

Finally, in the Definitive Guide to DAX, it says “columns added by GROUPBY need to be iterations over CURRENTGROUP…”

My takeaways from this:

try to steer clear of GROUPBY if ADDCOLUMNS/SUMMARIZE will work instead
if you have to use GROUPBY, use in conjunction with CURRENTGROUP even if technically the measure will run without it.

Hope this is helpful.

– Brian

BrianJ · July 8, 2020, 5:49pm

@Diane,

Sorry – one more thing. Russo indicates that the main difference between SUMMARIZE and GROUPBY is that the latter can group by columns whose data lineage does not correspond to columns in the data model, while the former cannot.

However, since you are grouping on Sales[Record Num] which is a physical column within the data model, then the use of GROUPBY is not necessary, and can be replaced by ADDCOLUMNS/SUMMARIZE.

Brian

Diane · July 8, 2020, 6:10pm

Unfortunately, inside the sumx, they’re using measures that utilize MAXX and others that have SELECTEDVALUE - which means that the addcolumns method won’t work as a measure. This is also the reason that currentgroup doesn’t work. Though I’m considering making it a calculated table with extra columns for the selectedvalue choices (since they are limited) and then using a measure to select the answer.

I can’t share this PBIX as is, but may try to put together a pseudo copy to explain the issue if that doesn’t work.

Nick_M · July 8, 2020, 9:27pm

@Diane
This is an interesting topic for sure. Though with the info provided already it will be hard, if not impossible, to really fix this issue as their are just too many unknowns. If you can, a redacted version of the PBIX of what is going on would be ideal.

But high level:

What does the data model look like? What type of relationships are there?
What do the tables/columns look like? The higher the level of unique values in a column will slow down performance and increase storage
I’d be weary of Grouping/Summarizing a Fact table with a column from the fact table, probably want to use a column from a Dimension. That is why you will usually see something like :

EVALUATE

GROUPBY(
Sales
,‘Date’[Date]
)
What are those other measures really doing?
Can try setting those as variables and see if that helps
Or can even try using the fully extended code and see
DIVIDE will slow down performance try to catch errors in the measures and can just use “/”

Diane · July 8, 2020, 10:46pm

This code is just a nightmare, to be honest. I received it as a single fact table (that shouldn’t be) and a bunch of dimensions that are kind of simulating a star schema - emphasis on “kind of”. The previous version was also a giant lump of spaghetti as far as the measures were concerned, which made it worse. Its like someone set out to make a “here’s what not to do” example and then needed to know how to make it usable.

I’m going to try to pare it down to something I can scrub and share, as I’m really at my limit for ideas on how to make this better without making the performance even worse. I’ll post later today. Maybe fresh eyes will see it in a way I can’t right now.

EnterpriseDNA · July 9, 2020, 1:24am

Hii @Diane, please be sure to check out our DAX Clean Up tool it’s a great way to ensure to make your DAX code easy to read. Thanks!

https://analysthub.enterprisedna.co/dax-clean-up

Nick_M · July 9, 2020, 11:26am

I think we’ve all been in that situation one way or another. But the first step in optimizing a DAX querry is making sure the data model is set up as best we can to get DAX to work most effecitively (star schema (long skinny tables are always better than short wide ones), reduce cardinality as best we can, remove columns we don’t need, etc). If we fix that as much as we can and the DAX is still slow it will be easier to troubleshoot and we can do some different optimizing techinques, but why go through all of that if we can just fix the data model?!

See what you can come up with. In the meantime, are you familiar with DAX studio? If so, can you connect to your PBIX file and run teh “View Metrics” in the Advanced tab? (be sure to have the latest version of DAX studio)

That will provide a new tab “Vertipaq Analyzer Metrics” with some more tabs in there. Could you screen shot the “Tables” , “Columns” (please sort by “Col Size”), “Relationshi[s” (please expand them all)

And if possible a screenshot of the actual data model would be great.

Diane · July 9, 2020, 4:30pm

Hi Nick,

Yes, I do use DAX studio and have already done some of that. Unfortunately, these measures were written as strange, cascading spaghetti with most only there to use in calculations with each other, most of which utilize slicers with selectedvalue. If not for that, a lot of this could be done in power query.

It’s a huge amount of data to scrub, so what I’m going to do is cut the underlying data down by at least half the columns (all just more fact columns that aren’t needed for these calcs) before I upload something. Hopefully I’ll have the time today to get it there as well as the Vertipaq screenshots.

Thanks!

EnterpriseDNA · July 13, 2020, 2:46am

It’s great to know that you are making progress with your query @Diane. Please don’t forget if your question has been answered within the forum it is important to mark your thread as ‘solved’. Also, we’ve recently launched the Enterprise DNA Forum User Experience Survey, please feel free to answer it and give your insights on how we can further improve the Support forum. Thanks!