Help Optimize DAX Measure

Hi all,

I have created a complex DAX measure that runs on a filtered dataset but becomes unusable on the full dataset. The calculations are accurate, but I need to optimize it for performance.

The Problem
Here is the error I get when trying to run it on the full dataset:

eyates_0-1692841157673.png

However, when I use some filters, it works as intended:

In the screenshot above, the numbers on the side represent customer ids. The numbers in the values section of the matrix are indicators showing when a customer is counted in the ‘beginning balance’ or ‘new sale’, etc.

The matrix acts as a type of ledger table that tracks when a customer is considered active or inactive.
Begin Balance means that a customer is active as of the first of the month.
New Start means that a customer became a new member or returning member during that month.
Canceled and Expired indicate that a customer is no longer active within that month.
End balance means that the customer is active as of the end of the month.

The ledger works by the following formula
Begin Balance + New Start - Canceleld - Expired = End Balance.

Each measure is built similarly so I will choose the Expired measure since it is the most complex

.

Model and Sample Data:

Here is a screenshot of my model:

The relationships are as follows:

Branch Service Agreements by Customer

Branch Sk 1:* Branch Sk

Agreement Event Type Service Agreements by Customer

Agreement Event Type Sk 1:* Agreement Event Type Sk

Location Service Agreements by Customer

Location Sk 1:* Location Sk

Here is a sample of what the Service Agreements by Customer data looks like:

Location Sk Service Agreement Sk Branch Sk Agreement Event Type Sk Offset Agreement Start Date Agreement Begin Date Agreement Expired Or Canceled Date Agreement End Date Previous Agreement Expired Or Canceled Date Location Sk - Copy Service Agreement Id Previous Service Agreement Id Future Agreement Start Date
7707 1270133 54 5 1/7/2023 1/8/2022 4/8/2023 1/8/2023 7707 1308929 1/7/2023
7707 1356440 54 6 1/7/2023 4/6/2024 1/7/2024 4/8/2023 7707 1395552 1308929
12864 1268235 37 4 11/8/2022 5/11/2023 5/11/2023 12864 1304419
12874 1266588 41 6 11/6/2022 2/4/2024 11/6/2023 12874 1304024

DAX Code Summary

Here is a summary of the code. Thanks to Chat GPT :slightly_smiling_face:

Variables

  1. MaxDate: Calculates the maximum date in the Calendar table.
  2. CurrentMonthStart: Finds the first day of the month based on MaxDate.
  3. CurrentMonthEnd: Finds the last day of the month based on MaxDate.
  4. AgreementsByCustomer: Creates a summarized table from the ‘Service Agreements by Customer’ table. It includes columns like Location Sk, Agreement Event Type Sk, Agreement Begin Date, and Agreement Expired Or Canceled Date. It also removes any filters on the Calendar table and excludes rows where Location Sk is blank.
  5. PreviousAgreements: Filters AgreementsByCustomer to only include agreements that both started and expired or were canceled before the start of the current month.
  6. CurrentAgreements: Filters AgreementsByCustomer to only include agreements that expire or are canceled within the current month and have an Agreement Event Type Sk of 5.
  7. FutureAgreements: Filters AgreementsByCustomer to only include agreements that will expire or be canceled after the current month ends.

Additional Calculations

  1. IterationTable: Adds two new columns to CurrentAgreements:
  • PreviousActiveAgreement: Checks if there are any previous agreements for the same location that expired or were canceled on or after the current agreement’s expiration or cancellation date.
  • FutureActiveAgreement: Checks if there are any future agreements for the same location that begin before the current agreement’s expiration or cancellation date.
  1. FilterTable: Filters IterationTable to only include rows where both PreviousActiveAgreement and FutureActiveAgreement are 0 (i.e., there are no overlapping previous or future agreements).

Final Output

  1. Result: Counts the number of unique Location Sk in FilterTable.
  2. RETURN Result: Returns the final count as the output of the DAX expression.

DAX Code:

Expired =
VAR MaxDate =
    CALCULATE ( MAX ( Calendar[Calendar Date] ), ( Calendar ) )
VAR CurrentMonthStart =
    EOMONTH ( MaxDate, -1 ) + 1
VAR CurrentMonthEnd =
    EOMONTH ( MaxDate, 0 )
VAR AgreementsByCustomer =
    CALCULATETABLE (
        SUMMARIZE (
            'Service Agreements by Customer',
            'Service Agreements by Customer'[Location Sk],
            'Service Agreements by Customer'[Agreement Event Type Sk],
            'Service Agreements by Customer'[Agreement Begin Date],
            'Service Agreements by Customer'[Agreement Expired Or Canceled Date]
        ),
        ALL ( Calendar ),
        NOT ISBLANK ( 'Service Agreements by Customer'[Location Sk] )
    )
VAR PreviousAgreements =
    FILTER (
        AgreementsByCustomer,
        'Service Agreements by Customer'[Agreement Begin Date] < CurrentMonthStart
            && 'Service Agreements by Customer'[Agreement Expired Or Canceled Date] < CurrentMonthStart
    )
VAR CurrentAgreements =
    FILTER (
        AgreementsByCustomer,
        'Service Agreements by Customer'[Agreement Expired Or Canceled Date] >= CurrentMonthStart
            && 'Service Agreements by Customer'[Agreement Expired Or Canceled Date] <= CurrentMonthEnd
            && 'Service Agreements by Customer'[Agreement Event Type Sk] = 5
    )
VAR FutureAgreements =
    FILTER (
        AgreementsByCustomer,
        'Service Agreements by Customer'[Agreement Expired Or Canceled Date] > CurrentMonthEnd
    )
VAR IterationTable =
    ADDCOLUMNS (
        CurrentAgreements,
        "PreviousActiveAgreement",
            VAR CurrentAgreementExpired = 'Service Agreements by Customer'[Agreement Expired Or Canceled Date]
            VAR LocationSk = 'Service Agreements by Customer'[Location Sk]
            VAR PreviousActiveAgreements =
                FILTER (
                    PreviousAgreements,
                    'Service Agreements by Customer'[Location Sk] = LocationSk
                        && 'Service Agreements by Customer'[Agreement Expired Or Canceled Date] >= CurrentAgreementExpired
                )
            RETURN
                IF ( COUNTROWS ( PreviousActiveAgreements ) = 0, 0, 1 ),
        "FutureActiveAgreement",
            VAR CurrentAgreementExpired = 'Service Agreements by Customer'[Agreement Expired Or Canceled Date]
            VAR LocationSk = 'Service Agreements by Customer'[Location Sk]
            VAR FutureActiveAgreement =
                FILTER (
                    FutureAgreements,
                    'Service Agreements by Customer'[Location Sk] = LocationSk
                        && IF (
                            ISBLANK ( 'Service Agreements by Customer'[Agreement Begin Date] ),
                            FALSE (),
                            'Service Agreements by Customer'[Agreement Begin Date] < CurrentAgreementExpired
                        )
                )
            RETURN
                IF ( COUNTROWS ( FutureActiveAgreement ) >= 1, 1, 0 )
    )
VAR FilterTable =
    FILTER (
        IterationTable,
        [PreviousActiveAgreement] = 0
            && [FutureActiveAgreement] = 0
    )
VAR Result =
    COUNTROWS (
        SUMMARIZE ( FilterTable, 'Service Agreements by Customer'[Location Sk] )
    )
RETURN
    Result

@Everett,

That is definitely a complex measure.

Donald Knuth famously said, “Premature optimization is the root of all evil,” which underscores the importance of understanding when to optimize. But demands of real-world applications often compel us to commit this very sin :slight_smile: With Power BI, the efficiency of DAX measures can dramatically influence UX and report responsiveness—especially with large datasets—and we sometimes find ourselves in a position where (early) optimization is unavoidable. We are in one such scenario here.

Some things you might want to try to potentially improve performance include:

  1. The variables CurrentAgreementExpired and LocationSk are being calculated twice within IterationTable. Consolidating these could potentially improve performance:

    This consolidates the repeated filtering of AgreementsByCustomer into a single IterationTable using the SWITCH() function and may work for you:

          .
          .
          .
          VAR IterationTable = `ADDCOLUMNS()`(
              AgreementsByCustomer,
              "AgreementType", 
                  `SWITCH()`(
                      TRUE(),
                      'Service Agreements by Customer'[Agreement Begin Date] < CurrentMonthStart &&
                      'Service Agreements by Customer'[Agreement Expired Or Canceled Date] < CurrentMonthStart, "Previous",
                      'Service Agreements by Customer'[Agreement Expired Or Canceled Date] >= CurrentMonthStart &&
                      'Service Agreements by Customer'[Agreement Expired Or Canceled Date] <= CurrentMonthEnd &&
                      'Service Agreements by Customer'[Agreement Event Type Sk] = 5, "Current",
                      'Service Agreements by Customer'[Agreement Expired Or Canceled Date] > CurrentMonthEnd, "Future",
                      BLANK()
                  )
          )
          .
          .
          .
    
  2. AgreementsByCustomer is being filtered 3x to create PreviousAgreements, CurrentAgreements, and FutureAgreements. That might lead to performance overhead. It may be possible to filter the table once and then derive the other tables from the filtered table.

  3. The ALL() function removes all the filters applied to the table. This can lead to performance overhead, especially on large datasets. In your Expired measure, you are using ALL( Calendar ). If you can avoid that or limit its scope, it might help.

    ALL() removes all filters applied to a given table or column(s). You’re telling DAX to “forget” any context and look at the entire table. When you remove filters from a large table, Power BI has to load more data into memory. This means any subsequent calculations, aggregations, or operations will be performed on a much larger dataset, leading to more memory consumption and potentially longer computation times.

    Sometimes, ALL() is used more broadly than necessary. For example, if you only need to remove filters from specific columns rather than an entire table, you can use ALL() on just those columns. By limiting its scope, you can potentially improve performance.

  4. Your relationship view didn’t show your Calendar table, but if there is a relationship between the Service Agreements by Customer table and your Calendar table, you can leverage it instead of using the ALL() function. By “leveraging” I mean making use of a relationship to your advantage.

    For instance, if you have a relationship between the ‘Service Agreements by Customer’ table and the ‘Calendar’ table, you can use related functions like RELATED() or RELATEDTABLE() to access columns from the related table. You can also use the USERELATIONSHIP() function to activate a specific relationship if there are multiple relationships. By using the relationships effectively, you can reduce the need for removing filters, which can improve performance.

  5. The COUNTROWS() and SUMMARIZE() combination can be performance heavy on large datasets. Some possible alternatives include:

    • GROUPBY() is similar to SUMMARIZE(), but it allows for the inclusion of aggregate functions directly. It can be faster than SUMMARIZE();

    • Instead of summarizing and then counting rows, you can use iterators like SUMX() or COUNTX() to iterate over the table and aggregate the results directly. This can be more efficient;

    • If you can, avoid creating intermediate tables (like with SUMMARIZE()) and instead perform your calculations and aggregations directly on the source table.

  6. Additionally, instead of using IF( COUNTROWS( PreviousActiveAgreements ) = 0, 0, 1 ), you can use IF( ISEMPTY( PreviousActiveAgreements ), 0, 1 ). ISEMPTY() is more efficient than COUNTROWS() for checking if a table is empty.

The main things I considered were minimizing redundant calculations, efficiently filtering tables to reduce computation overhead, leveraging table relationships, and opting for more efficient DAX functions where applicable. The goal is to streamline computations, reduce memory usage, and ultimately speed the evaluation process.

hth

2 Likes

Hi @HufferD , I appreciate your thorough reply. I realize how my code could be overcomplicated, but my background is in SQL, and I’m not as familiar with DAX as I’d like to be.

That said, I tried out some of your solutions and can see some improvements. However, I was hoping you could help me better understand how to implement some of your solutions because I’m running into an issue I’ve had in the past, but don’t know how to solve with this new implementation.

The Problem
Here are some screenshots of the ledger table with the new “Expired” measure added. However, it is currently marking “Expired” for every “Agreement Expired or Canceled Date,” which is not the intended behavior. Even if an agreement is marked as expired, the customer should only be marked as expired if they have no active agreements. Therefore, in this example, the customer should only be marked as expired on 2/11/2022.



image

Note that NEW EXPIRED is not included in the End Month Balance.

Here is my revised DAX code with your implementation of the IterationTable

VAR MaxDate =
    CALCULATE( MAX( Calendar[Calendar Date] ), ( Calendar ) )
VAR CurrentMonthStart = EOMONTH( MaxDate, -1 ) + 1
VAR CurrentMonthEnd = EOMONTH( MaxDate, 0 )
    
    VAR AgreementsByCustomer =    CALCULATETABLE(
        SUMMARIZE(
            'Service Agreements by Customer',
            'Service Agreements by Customer'[Location Sk],
            'Service Agreements by Customer'[Agreement Event Type Sk],
            'Service Agreements by Customer'[Agreement Begin Date],
            'Service Agreements by Customer'[Agreement Expired Or Canceled Date]
        ),
        ALL( Calendar ),
        NOT ISBLANK( 'Service Agreements by Customer'[Location Sk] )
    )

VAR IterationTable =
    ADDCOLUMNS(
        AgreementsByCustomer,
        "AgreementType",
            SWITCH(
                TRUE(),
                'Service Agreements by Customer'[Agreement Begin Date] < CurrentMonthStart &&
                'Service Agreements by Customer'[Agreement Expired Or Canceled Date] < CurrentMonthStart, "Previous",
                
                'Service Agreements by Customer'[Agreement Expired Or Canceled Date] >= CurrentMonthStart &&
                'Service Agreements by Customer'[Agreement Expired Or Canceled Date] <= CurrentMonthEnd &&
                'Service Agreements by Customer'[Agreement Event Type Sk] = 5, "Current",
                
                'Service Agreements by Customer'[Agreement Expired Or Canceled Date] > CurrentMonthEnd, "Future",
                BLANK()
            )
    )

VAR FilterTable =
    FILTER(
        IterationTable, [AgreementType] = "Current"
    )
        
VAR Result =
    COUNTROWS(
        SUMMARIZE(
            FilterTable,
            'Service Agreements by Customer'[Location Sk]
        )
    )
RETURN
    Result

Hi @Everett!

This represents a common issue when transitioning from SQL to DAX: thinking in set-based operations vs. row-wise evaluations. The problem seems to be the unintentional flagging of all expired agreements, regardless of the current active status.

The [Expired] measure is constructed by first creating a summarized table of all agreements (removing any date context with ALL( Calendar ) ) and then iteratively adding the “AgreementType” based on the date context. The final evaluation counts rows in the “Current” type agreements.

if your requirements are that a) you want to flag as “Expired” only those customers who have no active agreements and b) the “Expired” status should be marked only on the Agreement Expired Or Canceled Date then you could adjust the measure as:

Expired = 
VAR MaxDate =
    CALCULATE( MAX( Calendar[Calendar Date] ), ALL( Calendar ) )
VAR CurrentMonthStart = EOMONTH( MaxDate, -1 ) + 1
VAR CurrentMonthEnd = EOMONTH( MaxDate, 0 )

VAR AgreementsByCustomer = CALCULATETABLE(
    'Service Agreements by Customer',
    ALL( Calendar ),
    NOT ISBLANK( 'Service Agreements by Customer'[Location Sk] )
)

VAR CurrentAgreements = FILTER(
    AgreementsByCustomer,
    'Service Agreements by Customer'[Agreement Begin Date] <= MaxDate &&
    (ISBLANK('Service Agreements by Customer'[Agreement Expired Or Canceled Date]) || 
     'Service Agreements by Customer'[Agreement Expired Or Canceled Date] > MaxDate)
)

VAR ExpiredAgreements = FILTER(
    AgreementsByCustomer,
    'Service Agreements by Customer'[Agreement Expired Or Canceled Date] = MaxDate
)

VAR Result = 
    IF(
        COUNTROWS(CurrentAgreements) = 0 && COUNTROWS(ExpiredAgreements) > 0,
        1,
        0
    )

RETURN Result

That is untested, of course, because I don’t have your .pbix file or a toy dataset :wink: but i think it will perform as expected and it’s a more optimized than your original.

Hey @HufferD

Thank you for your response. I tried your solution, but I did not get any results.

I don’t understand why you’re using MaxDate to compare dates. It always returns the latest date in my calendar table, which results in no rows. I tried switching out MaxDate for CurrentMonthStart or CurrentMonthEnd, but that didn’t work either.

I would share my pbix file, but unfortunately, I can’t because it is sensitive data.

Okay, sorry for that.

The use of MaxDate in the measure was intended to dynamically determine the current context’s maximum date, which would help in defining the “current month” for the analysis. The idea was that by looking at the largest date in the filtered context, we could define the boundaries of the current month (i.e., the start and end dates).

However, if the MaxDate consistently returns the latest date in the entire Calendar table, it indicates that the calculation isn’t respecting the current context because of the use of ALL( Calendar ). This would make all comparisons in the context of the latest available date in the calendar.

could you try dropping the MaxDate VAR definition and replacing CurrentMonthStart and CurrentMonthEnd with:

VAR CurrentMonthStart = CALCULATE(MIN(Calendar[Calendar Date]))
VAR CurrentMonthEnd = CALCULATE(MAX(Calendar[Calendar Date]))

Please try integrating this change into the measure and see if it resolves the issue.

Otherwise, you might want to think about mocking-up some toy data and a sample pbix file.

Hi @HufferD

Thanks again for your reply. Unfortunately, that didn’t work either.
This is one of the versions I’ve tried, but it didn’t work for me.

VAR CurrentMonthStart = CALCULATE(MIN(Calendar[Calendar Date]))
VAR CurrentMonthEnd = CALCULATE(MAX(Calendar[Calendar Date]))

VAR AgreementsByCustomer = CALCULATETABLE(
    'Service Agreements by Customer',
    ALL( Calendar ),
    NOT ISBLANK( 'Service Agreements by Customer'[Location Sk] )
)


VAR CurrentAgreements = FILTER(
    AgreementsByCustomer,
    'Service Agreements by Customer'[Agreement Expired Or Canceled Date] >= CurrentMonthStart &&
    'Service Agreements by Customer'[Agreement Expired Or Canceled Date] <= CurrentMonthEnd &&
    'Service Agreements by Customer'[Agreement Event Type Sk] = 5
)

VAR ExpiredAgreements = FILTER(
    AgreementsByCustomer,
    'Service Agreements by Customer'[Agreement Expired Or Canceled Date] = CurrentMonthEnd
)

VAR Result = 
    IF(
        COUNTROWS(CurrentAgreements) = 0 && COUNTROWS(ExpiredAgreements) > 0,
        1,
        0
    )

    
RETURN Result

I made some alterations to my previous one and still get accurate results, but it won’t work on the full-size dataset. I’d like to implement your solution but I don’t know how I’d be able to reference a future or previous agreement like how I did in my implementation of the IterationTable like this:

FILTER(AgreementsByCustomer, [AgreementType] = "Current"),
            "PreviousActiveAgreement",
                VAR CurrentAgreementExpired = 'Service Agreements by Customer'[Agreement Expired Or Canceled Date]
                VAR LocationSk = 'Service Agreements by Customer'[Location Sk]
                VAR PreviousActiveAgreements =
                    FILTER(
                         AgreementsByCustomer, 
                        'Service Agreements by Customer'[Location Sk] = LocationSk
                        && [AgreementType] = "Previous"
                            && 'Service Agreements by Customer'[Agreement Expired Or Canceled Date] >= CurrentAgreementExpired

Full Set of DAX code

VAR MaxDate = CALCULATE( MAX( Calendar[Calendar Date] ), ( Calendar ) )
VAR CurrentMonthStart = EOMONTH( MaxDate, -1 ) + 1
VAR CurrentMonthEnd = EOMONTH( MaxDate, 0 )

    VAR AgreementsByCustomer =
    CALCULATETABLE(
            ADDCOLUMNS(
                GROUPBY(
                    'Service Agreements by Customer',
                    'Service Agreements by Customer'[Location Sk],
                    'Service Agreements by Customer'[Agreement Event Type Sk],
                    'Service Agreements by Customer'[Agreement Begin Date],
                    'Service Agreements by Customer'[Agreement Expired Or Canceled Date]
                ), "AgreementType",
                SWITCH(
                    TRUE(),
                    'Service Agreements by Customer'[Agreement Begin Date] < CurrentMonthStart &&
                    'Service Agreements by Customer'[Agreement Expired Or Canceled Date] < CurrentMonthStart, "Previous",
                    'Service Agreements by Customer'[Agreement Expired Or Canceled Date] >= CurrentMonthStart &&
                    'Service Agreements by Customer'[Agreement Expired Or Canceled Date] <= CurrentMonthEnd &&
                    'Service Agreements by Customer'[Agreement Event Type Sk] = 5, "Current",
                    'Service Agreements by Customer'[Agreement Expired Or Canceled Date] > CurrentMonthEnd, "Future",
                    BLANK()
                )
            )
    , ALL( Calendar )
    --, 'Service Agreements by Customer'[Location Sk] = 4110790
    )

  


    VAR IterationTable =
        ADDCOLUMNS(
            FILTER(AgreementsByCustomer, [AgreementType] = "Current"),
            "PreviousActiveAgreement",
                VAR CurrentAgreementExpired = 'Service Agreements by Customer'[Agreement Expired Or Canceled Date]
                VAR LocationSk = 'Service Agreements by Customer'[Location Sk]
                VAR PreviousActiveAgreements =
                    FILTER(
                         AgreementsByCustomer, 
                        'Service Agreements by Customer'[Location Sk] = LocationSk
                        && [AgreementType] = "Previous"
                            && 'Service Agreements by Customer'[Agreement Expired Or Canceled Date] >= CurrentAgreementExpired
                            
                    )
                RETURN
                    IF( ISEMPTY( PreviousActiveAgreements ), 0, 1 ),
                    
             "FutureActiveAgreement",
                VAR CurrentAgreementExpired = 'Service Agreements by Customer'[Agreement Expired Or Canceled Date]
                VAR LocationSk = 'Service Agreements by Customer'[Location Sk]
                VAR FutureActiveAgreement =
                    FILTER(
                        AgreementsByCustomer, 
                        'Service Agreements by Customer'[Location Sk] = LocationSk
                        && [AgreementType] = "Future"
                            && IF( ISBLANK( 'Service Agreements by Customer'[Agreement Begin Date] ), FALSE(), 'Service Agreements by Customer'[Agreement Begin Date] < CurrentAgreementExpired )
                            
                    ) 
                RETURN
                    IF( ISEMPTY( FutureActiveAgreement ), 0, 1 )
        )
        
    
    VAR Result =
        COUNTROWS( 
            GROUPBY( FILTER( IterationTable, [PreviousActiveAgreement] = 0 && [FutureActiveAgreement] = 0 )
                , 'Service Agreements by Customer'[Location Sk]
            )
        )
        
    RETURN Result

@HufferD

Here is a sample PBIX file that contains the ledger measures.
As a recap, we discussed the ‘# Expired’ measure.

Note that I’ve anonymized this data and have changed the source to Excel files.

The file only contains 2 customers that cover 2 different scenarios:

  • Customer starts and stops an agreement in the same month while having an already active agreement.
    Location ID: 2809061

  • Customer’s previous agreement expired after the current agreement was canceled
    Location ID: 2826865

Memberships Sample File.pbix (536.3 KB)

Hey @HufferD, Sorry to bother, but I was hoping you could take a look at my latest version. I hope this will be the final stretch of this puzzle :slight_smile:

I made progress with the new version, but there’s an issue with the V2 Expired measure. The 1 indicator only shows up when I select that customer’s Location SK. I want it to appear regardless of the selection, but I’m not sure how to handle the filters. Any help would be appreciated.

Latest PBIX file:
Memberships Sample File.pbix (532.9 KB)