Calculating Difference Between Dates in Same Column with Multiple Events on Same Day

Hi. Relatively new to Power BI and having trouble calculating the difference between dates in the following situation. The first four columns below are part of a report I am creating to track HR hiring actions by the action ID and the individual process steps. I am trying to create a measure or column (DaysBtwSteps) that calculates the elapsed days between each step and the previous one for each HR action.

I’ve tried to adapt the DAX code I’ve found in similar forum posts, but where I get stuck is the situation below where multiple steps get completed on the same day. In this situation, I want DaysBtwSteps to return the elapsed days for the first event on that day, and 0 for all subsequent events on that same day. The DaysBtwSteps column below is how I would like the measure/column to work.

Any assistance on this would be greatly appreciated.

Thanks.

  • Brian

This missing part here is how do you want to show this?

Do you only want this is a calculating column or should you be working on doing this via a measure (which I highly highly recommend)

Measures allow you to create more dynamic formulas that are re-usable, so I would really recommend going down this path compared to where you are right now.

This is how you would do it with a measure

DaysBetween = 
VAR CurrentPurchaseDate = MAX( 'New Table'[Purchase Date] )
VAR PriorMaxDate = CALCULATE( MAX( 'New Table'[Purchase Date] ), FILTER( ALL( 'New Table' ), 'New Table'[Purchase Date] < CurrentPurchaseDate ) )
VAR FirstOverallDate = CALCULATE( MIN( 'New Table'[Purchase Date] ), ALL( 'New Table' ) )

RETURN
IF( MIN( 'New Table'[Purchase Date] ) = FirstOverallDate,
    1,
        VALUE( CurrentPurchaseDate - PriorMaxDate ))

This also works for the multiple dates scenario. See below

image

See how you go with this.

Measures are the way to go in nearly all cases when running calculations. Check out here to learn more.

Thanks very much for your response – this is really helpful. I definitely preferred to do this as a measure rather than a column, but my previous attempts involved the EARLIER function, and I ran into problems using that in a measure. I find your approach using variables instead of the EARLIER function to be a lot more intuitive. I have adapted your DAX code to aggregate on my HR ID field (comparable to your Customer ID). It’s working perfectly, except for the situations in which I have multiple events occurring on the same day. The graphic below illustrates how the code is currently working, versus how I would like it to work.

My plan is to use this measure within other Sum and Cumulative Total Measures, but until I fix this problem they will return overstated values. Below is the modified DAX code I wrote based on yours. Any help in fixing the multiple event on the same date problem would be greatly appreciated - it’s critical to my model and has me totally stumped.

Thanks!

  • Brian

DaysBetweenProcessSteps = 
VAR CurrentDate =
    CALCULATE (
        MAX ( FactHrActionsStatus[ProcessStepCompletionDate] ),
        ALLEXCEPT (
            FactHrActionsStatus,
            FactHrActionsStatus[HrActionID],
            FactHrActionsStatus[ProcessStepID]
        )
    )
VAR PriorMaxDate =
    CALCULATE (
        MAX ( FactHrActionsStatus[ProcessStepCompletionDate] ),
        FILTER (
            ALLEXCEPT ( FactHrActionsStatus, FactHrActionsStatus[HrActionID] ),
            FactHrActionsStatus[ProcessStepCompletionDate] < CurrentDate
        )
    )
VAR FirstOverallDate =
    CALCULATE (
        MIN ( FactHrActionsStatus[ProcessStepCompletionDate] ),
        ALLEXCEPT ( FactHrActionsStatus, FactHrActionsStatus[HrActionID] )
    )
RETURN
    IF (
        MIN ( FactHrActionsStatus[ProcessStepCompletionDate] ) = FirstOverallDate,
        1,
        DATEDIFF ( PriorMaxDate, CurrentDate, DAY )
    )

Ok I see, I thought it was the other way.

I think this actually isn’t to difficult at all then but needs a slightly different approach to what I just had.

I think your existing table is actually setup well because you basically need an index column to do this.

All you need to do is this.

First you need to make sure you have an index. I didn’t in mine so I created one in the query editor

image

Now I have it in my table

Then I can actually use this column to working out the day difference.

DaysBetweenSteps =
VAR IndexNum = MAX( Testing[Index] )
VAR PreviousIndexNum = CALCULATE( MAX( Testing[Index] ), FILTER( ALLSELECTED( Testing ), Testing[Index] &lt; IndexNum ) )
VAR CurrentDate = VALUE( SELECTEDVALUE( Testing[Purchase Date] ) )
VAR PriorDate = VALUE( CALCULATE( SELECTEDVALUE( Testing[Purchase Date] ), FILTER( ALL( Testing ), Testing[Index] = PreviousIndexNum ) ) )

RETURN
CurrentDate - PriorDate

See how you go with these ideas

1 Like

Thanks. I thought I understood the logic of your revised approach, but when I modified it to aggregate on HR ID and also to deal with blanks in the ProcessStepCompletionDate field when calculating the MAX of the index, I clearly broke something. The CurrentDate variable appears to be working fine, but the PriorDate appears stuck on the date immediately prior to the MAX date within each HR ID. Per the sample screenshot below, instead of DateBetweenProcessSteps calculating dynamically as the difference between CurrentDate and PriorDate, instead it is calculating as CurrentDate minus 04/02/19 (static). Hopefully, this is an obvious and easily fixable error, but I have spent the better part of today debugging the DAX code below and can’t spot the problem.

DaysBetweenProcessSteps =
VAR IndexNumMax =
    CALCULATE (
        MAX ( FactHrActionsStatus[ProcessStepIndex] ),
        ALLEXCEPT ( FactHrActionsStatus, FactHrActionsStatus[HrActionID] ),
        NOT ( ISBLANK ( FactHrActionsStatus[ProcessStepCompletionDate] ) )
    )
VAR PreviousIndexNum =
    CALCULATE (
        MAX ( FactHrActionsStatus[ProcessStepIndex] ),
        FILTER (
            ALLSELECTED ( FactHrActionsStatus ),
            FactHrActionsStatus[ProcessStepIndex] < IndexNumMax
        )
    )
VAR CurrentDate =
    SELECTEDVALUE ( FactHrActionsStatus[ProcessStepCompletionDate] )
VAR PriorDate =
    CALCULATE (
        SELECTEDVALUE ( FactHrActionsStatus[ProcessStepCompletionDate] ),
        FILTER (
            ALL ( FactHrActionsStatus ),
            FactHrActionsStatus[ProcessStepIndex] = PreviousIndexNum
        )
    )
RETURN
    DATEDIFF ( PriorDate, CurrentDate, DAY )

Any further insight and assistance would be greatly appreciated. Thanks much.

  • Brian

I’m a bit confused as you formula above doesn’t look anything like the solution I’ve provided here? This is likely why it’s not working.

Why can’t you just use what I’ve provided? It should work.

I would need to see a demo file to assist further here as I worked on this for a little while and thought I provided what you required.

One thing I recommend here is to break this out in a table. Work through every piece of the formula and try to identify where it is breaking down. This is exactly how I worked through it myself.

Thanks
Sam

You’re absolutely right – I was WAY overcomplicating this. With a fresh look at it, all I needed to do to adapt the solution you provided was to create one more variable to calculate the first date for each action, and then wrap the RETURN expression in a CALCULATE ( IF ( ) ) to handle the first date issue and any subsequent blank dates.

Greatly appreciate your expertise and willingness to work through this with me. I’ve learned a ton through this process, and will be able to reuse this measure repeatedly across a lot of different models that deal with sequential process data.

Here’s the final measure:

DaysBetweenProcessSteps = 
VAR FirstOverallDate =
    CALCULATE (
        MIN ( FactHrActionsStatus[ProcessStepCompletionDate] ),
        ALLEXCEPT ( FactHrActionsStatus, FactHrActionsStatus[HrActionID] )
    )
VAR IndexNum =
    MAX ( FactHrActionsStatus[ProcessStepIndex] )
VAR PreviousIndexNum =
    CALCULATE (
        MAX ( FactHrActionsStatus[ProcessStepIndex] ),
        FILTER (
            ALLSELECTED ( FactHrActionsStatus ),
            FactHrActionsStatus[ProcessStepIndex] < IndexNum
        )
    )
VAR CurrentDate =
    VALUE ( ( SELECTEDVALUE ( FactHrActionsStatus[ProcessStepCompletionDate] ) ) )
VAR PriorDate =
    VALUE (
        (
            CALCULATE (
                SELECTEDVALUE ( FactHrActionsStatus[ProcessStepCompletionDate] ),
                FILTER (
                    ALL ( FactHrActionsStatus ),
                    FactHrActionsStatus[ProcessStepIndex] = PreviousIndexNum
                )
            )
        )
    )
RETURN
    CALCULATE (
        IF (
            MIN ( FactHrActionsStatus[ProcessStepCompletionDate] ) = FirstOverallDate,
            1,
            IF (
                SELECTEDVALUE ( FactHrActionsStatus[ProcessStepCompletionDate] ) = BLANK (),
                0,
                CurrentDate - PriorDate
            )
        ),
        FactHrActionsStatus[HrActionID]
    )

Thanks again!
Brian

That’s great

Sam

Hi,
I tried to apply this pattern to my case but I couldn’t make it.
I have to create a measure (a column would fit but the source is about 1M of rows) that calculate the difference in seconds between the activities of the same change case.
Summing up the gaps durations of a single case change would give the duration of the case.
The duration of cases with a single activity is zero.
Thanks for your assistance and help

Roberto

elapsed.xlsx (169.2 KB)

Hi,
I think I made it.
I had to use ALLEXCEPT to filter out all but Change ID

Any suggestion is very welcome

Thanks

Roberto

Elapsed correct:=VAR
CurrentPurchaseDate = MAX(changes[Activity timestamp])
VAR
PriorMaxDate = CALCULATE( MAX(changes[Activity timestamp]), FILTER( ALLEXCEPT(changes, changes[Change ID]), changes[Activity timestamp] < CurrentPurchaseDate ) )
VAR
FirstOverallDate = CALCULATE( MIN(changes[Activity timestamp] ), ALLEXCEPT( changes, changes[Change ID] ) )
RETURN

IF( MIN(changes[Activity timestamp]) = FirstOverallDate,    0,    DATEDIFF( PriorMaxDate ,CurrentPurchaseDate,SECOND))

elapsed.xlsx (169.4 KB)

Not sure if you have seen this video here, but it also details how to do this.

Thanks
Sam

Hi.
My solution worked on a limited number of rows, but when I try to use it on 800,000 records it simply hangs.
Trying to use the pattern with Index I realized that in Excel SELECTEDVALUE does not exist. is there any other way?

Thanks

Roberto

This does do some heavy computation, so for something of that size it could run a bit slow. This is unfortunately the reality sometimes.

There may be some further optimization but will depend on you data model and data.

Regarding an alternative to SELECTEDVALUE

Try IS( HASONEVALUE( column), VALUES( column ), BLANK )

…something like this will return the same thing.

Sam

1 Like

Where we can Get the resource please? send me the link

Hi Sam,

After trying your solution, I got the results I expected but the performance is really slow. I have a large data set to deal with (around 30,000 rows for the demo and 400,000 rows for the real data set). The solution for this issue also can be done by using calculated columns according to the Curbal video but I am using Datamart to deliver this report and Datamart does not support calculated columns or tables. Is there any efficient way to work around with this?

@trungsoaica ,

Welcome to the forum - great to have you here!

Two things that will help you get a good, quick answer to your question:

  1. Don’t post questions in threads that have already been solved. Many of us who scan the forum regularly for questions to answer focus only on open threads, so a question in a closed thread will often get overlooked.

  2. Generally, don’t address questions to specific people. There is a large group of very knowledgeable experts and members who repond to questions, and by addressing you question to a particular individual you limit the scope of people who possibly could help you. (also, while @sam.mckay answered lots of Qs in the earlier days of the forum, he rarely does so now).

Note that it’s fine to add a link back to a closed thread in your message to provide additional context re: your problem.

So, when you start a new thread on this topic, I do have some thoughts on improving performance of this solution that I will provide.

Thanks!

  • Brian
1 Like