Identify non duplicates, to remove, and then calculate time spend on two steps, and average as well

@Jetuke_ES

I know that Table functions are generally more optimized, so I changed the M code for that step into:

Table.AddColumn(#"Grouped Rows for Duration", "TempTable", each Table.AddColumn(Table.AddColumn([AllRows], "Min", (x)=> Record.Field(Table.Min([AllRows], "Created On"), "Created On")), "Max", (y)=> Record.Field(Table.Min([AllRows], "Completed On"), "Completed On")))

Which does exactly the same as the previously used List function, I was hoping however to also be able to add a Buffer step but I just learned that is not possible for nested tables :slightly_frowning_face:

Fingers crossed
eDNA - Identify data sets to keep (2).pbix (37.0 KB)

1 Like

Thank you so much, giving it a go now.

Ok here are some other variations to try…

eDNA - Identify data sets to keep (3).pbix (40.2 KB)

eDNA - Identify data sets to keep (4).pbix (40.5 KB)

Now I’m all out of ideas, so let’s hope one of these will do the trick. :crossed_fingers:

1 Like

Somewhere between start and to remove duplicates and M Magic from 3MB it turns into 1.67GB. Do you think its because of group by or something? When i did the “Remove Duplicates” it reduced it significantly. I did manage to finish it this time though - Yaaay!! i wonder though what will happen when i will start adding data on a monthly basis :slight_smile:

Check out this thread hopefully you can implement some other optimizations.

1 Like

Ok so I’ve been exploring ways to avoid the second Group By operation and succeded :wink: So would you be so kind to test these variatons as well and let me know if performance improved?
Thanks @Jetuke_ES

Now for future reference, this is what I’ve done:

Next I extracted the Min value for [Created On] in two different ways:

Did the same for the Max value for [Completed on] also two approaches:

Here are the updates files. I hope this is helpful.
eDNA - Identify data sets to keep (3b).pbix (44.7 KB)
eDNA - Identify data sets to keep (4b).pbix (40.6 KB)

1 Like

Thank you Melissa!! Made such a difference. Thank you so much for your patience and help! :smile: Stay safe!

Excellent. That’s great to hear.
Thanks for letting me know Erika.

Melissa, i wanted to calculate average duration in Hours per month, would you use the Quick Measures for that? : )

Hey Brian,
Sorry to be bothering again. If i wanted to amend measure for Average per month, could you help amend the above dax for that possibly? So to see how many Hours (not days) does it take to complete both steps. I tried to Change to elapsed Hours instead, and removed the *60 in your measure. Something didn’t work though. Any help would be really appreciated. : )

@BrianJ made it work… :slight_smile: please disregard!!

@BrianJ is there way to exclude weekends for the Min and the Max dates for the duration/ elapsed calculation?

@Jetuke_ES,

Funny you should mention that - @Melissa were talking about that yesterday in another context. There are lots of ways to do this, but I think the easiest way is to include the NETWORKDAYS custom function within your date table. See this forum post for details:

Then what you will do, is subtract the NETWORKDAYS column values associated with your beginning and end date values to give you the duration not including weekend days (you can also exclude holidays if you want).

Take a look at this, and if you have any questions about implementing this within your calculations, I’m happy to work through the details with you.

  • Brian

That would be amazingly helpful, Brian. I tried the link, but got this message: " Oops! That page doesn’t exist or is private." If you do have a second, could help implement, or higher level step by step, or I could work it out from pbix if you would create it in the example. By the way, your dax worked perfectly with a couple tweaks. was so pleased!!

@Jetuke_ES,

I’ve been getting that message too when I click on forum links that others have sent to me. Thought it might be a problem at my end, but it looks like it’s a more widespread issue. I will alert @sam.mckay and the @EnterpriseDNA team.

The search function is still working though, so just click on the magnifying glass at the top right of the screen, enter “networkdays” and you should see the following post at the top of the list:

Click on that, and it will bring you to the instructions and the M code for the NETWORKDAYS function.

After you implement the function, take a look at your date table in the Data View and you will see exactly how Imke implemented this, and how incredibly powerful and straightforward it now makes this sort of calculation. Sort of inexplicable that Microsoft has not included NETWORKDAYS as a functioning DAX, given that it has existed in Excel for years.

  • Brian

i tried this one to calculate duration with excluding weekends. created Date table, also custom column for Workdays. And tried this dax with couple variations, but dont seem to make it work…

Closure Time =
CALCULATE(SUM(Dates[IsWorkdayDay]),
DATESBETWEEN(Dates[Date], Table1[Min Created On], Table1[Max Completed On]-1),
Dates[IsWorkdayDay] = TRUE,
ALL(Table1[Overall Process])
)

any tips would be really appreciated, please…

@Jetuke_ES,

Going back through this thread, I’ve lost track of what the most current version of the PBIX we are working with is. If you could please post that, I’d be glad to work through the NETWORKDAYS calculation.

Thanks.

  • Brian

I can see why it would be confusing. So i made this version work and it didn’t increase too massively when working with the actual file. I created Date table with “IsWorkingDay” - True/False, in hopes to somehow be able to exclude False/Weekends from the “Elapsed Days” and “Average” calculations. Thank you Brian for any help.
eDNA Forum - Process Duplicates Solution.pbix (49.9 KB)

@Jetuke_ES,

OK, I think I got it working, and it’s an interesting scenario for the use of the NETWORKDAYS function, which I first applied to the Dates table. Here’s the measure that does the heavy lifting. Basically, it computes the fractional days on the min and max ends (zeroing out one or both if they fall on weekends) and then adding those two fractional days to the full net work days in between:

Total NetWorkDays = 

VAR FirstDay = [Min Created On]
VAR YrMin = YEAR( FirstDay )
VAR MoMin = MONTH( FirstDay )
VAR DyMin = DAY( FirstDay )
VAR FirstDayWeekend = 
    SWITCH( TRUE(),
        WEEKDAY( FirstDay ) = 7, 0,
        WEEKDAY( FirstDay ) = 1, 0,
        1
    )

VAR LastDay = [Max Completed On]
VAR YrMax = YEAR( LastDay )
VAR MoMax = MONTH( LastDay )
VAR DyMax = DAY( LastDay )
VAR LastDayWeekend = 
    SWITCH( TRUE(),
        WEEKDAY( LastDay ) = 7, 0,
        WEEKDAY( LastDay ) = 1, 0,
        1
    )

VAR FirstDayEnd =
    DATE( YrMin, MoMin, DyMin ) + TIME(23, 59, 59 )

VAR LastDayBegin =
    DATE( YrMax, MoMax, DyMax ) + TIME( 0, 0, 0 )

VAR SecondsPerDay = 24 * 60 * 60

VAR FirstDayDays =
    DIVIDE(
        DATEDIFF(
            FirstDay,
            FirstDayEnd,
            SECOND
        ),
        SecondsPerDay,
        0
    )

VAR LastDayDays =
    DIVIDE(
        DATEDIFF(
            LastDayBegin,
            LastDay,
            SECOND
        ),
        SecondsPerDay,
        0
    )

VAR NetWorkDaysDiff = 
[Networkdays Created On] - [Networkdays Completed On]

VAR FullNetWorkDays = 
IF(
    NetWorkDaysDiff < 2,
    0,
    NetWorkDaysDiff - 1
)

RETURN
(FirstDayDays * FirstDayWeekend) + FullNetWorkDays + (LastDayDays * LastDayWeekend) 

Then computing the average is a straightforward matter of substituting the measure above into your prior AVERAGEX calculation:

Average NetWorkDays Elapsed = 

VAR vTable =
    FILTER(
        Data,
        Data[Keep Record] = "Yes"
    )

VAR vTable2 =
    ADDCOLUMNS(
        SUMMARIZE(
            vTable,
            Data[Overall Process]
        ),
    "Total NWD", [Total NetWorkDays]
    )

RETURN
AVERAGEX(
    vTable2,
    [Total NWD]
)

I hope this is helpful. Full solution file attached below.

1 Like

To complement another excellent post by @BrianJ

Here’s a Power Query approach using a custom function made by Bill Szysz, for more details on how to implement that see this thread. Time format 37:30:55 greater than 24 hours - #11 by Melissa

Net Duration incl workhours, calculates the difference between the Min and Max datetime values and excludes non-workinghours, weekends and if provided holidays.

Net Duration excl non workdays, calculates the difference between the Min and Max datetime values and excludes weekends and if provided holidays.

Simple Duration, calculates the difference between the Min and Max datetime values.

.
Hope this is helpful.
eDNA - Identify data sets to keep (5).pbix (53.5 KB)

1 Like