Calculating Date Periods

I’m trying to this in this SQL, planning to chuck it into a view and plug it into PowerBI - but if it’s possible in Power Query then that’s fine! Couldn’t see a SQL section.

I’m trying to calculate instances and date ranges of sickness. My data is running downwards like this:

image

Sickness Dates Example.xlsx (12.9 KB)

It needs to take into account weekends/bank holidays though. So for example, with these example dates the results should be like this, with 11/6/21 - 14/6/21 being counted as one instance as it spans a weekend:

image

I have a SQL table with all the dates in, each marked up whether it’s a working day or not. My plan was to take the rolling year from this of the working days, index them from earliest to latest. Join this to my sickness dates, so each sickness day now has an ID as to their position in the rolling year of working days. Then planned to loop somehow to check if the next row is a sequential ID and group them all together with the minumum/maximum dates of each instance. I’m a bit rusty on recursive CTE loops in SQL so can’t seem to crack it.

Here’s my dates table to help:

Rolling Dates Table.xlsx (16.7 KB)

Hope explained that well enough and appreciate any help, as said fine in SQL or Power Query!

Hi @jamie.bryan. I’d do this in SQL before even getting to Power BI (Roche’s Maxim [slightly rephrased]: “Do your data transformations as far upstream as possible and as far downstream as necessary”). In order to help further, please provide an Excel sample dataset with the raw data, a full copy of the relevant period of your SQL [Dates] table, and a mock-up of the expected results.
Greg

1 Like

Hi @Greg ,

Thanks, I thought SQL may have been best. Those should be in the original post - please let me know if need anything else.

Thanks,

Hi @jamie.bryan. Unfortunately they’re not sufficient, for example your Employees sample table has only 5 rows for 1 employee and your Dates sample table has only [Date] and [DateOrder] columns. Please revise.
Greg

Fair enough, I’m only working with made up employee data at the moment, attached are some more, if need more please let me know. The expected results for each are there too:

Sickness Dates Example.xlsx (13.3 KB)

And here’s the dates table for the rolling year from today:

Rolling Dates Table.xlsx (29.3 KB)

Anything else please let me know.

OK @jamie.bryan, this looks a bit better. I’m still not clear on what you’re looking for: if its one-row per instance, is there a reason why the sick day for employee 1 on Aug 25 is not included in the expected results, or is it just a miss?
Greg

Sorry yeah that’s a miss, attached with that included.

Sickness Dates Example.xlsx (13.3 KB)

Yeah looking for 1 row per instance please. It’s for the bradford method of calculating absences if that helps gives context, which is Instances² x Total Duration. The sickness data is each day sick running downwards as given, so I need to combine any continuous periods of sickness with a start/end date to get the instances. Then I’ll get a sum of the working days between those dates for the duration. Sorry that may have helped to begin with!

Hi @jamie.bryan. Nope, never heard of Bradford method before … a quick online search shows it to be an HR metric.

I gave T-SQL a go, and while I made progress, it wasn’t starting out clean, and it looked like a prime candidate for DAX, so I gave that a go. Neither approach got me to a solution, though, as I wasn’t able to incorporate the rolling dates properly, so tried without. While trivial to look at and determine an instance manually, an algorithm didn’t jump out at me to define the end of an instance.

In any event, I’m attaching my work-in-progress SQL and PBIX attempts in case they help. (The T-SQL approach uses CTEs.)


-- Create View SickInstances_vw.sql

/*
-- check
SELECT x.* FROM dbo.SickDays_t AS x (NOLOCK);
SELECT x.* FROM dbo.SickInstances_vw AS x (NOLOCK);
*/

USE [Test]
GO

CREATE OR ALTER VIEW [dbo].[SickInstances_vw]
AS
WITH base_cte (Row_ID, Employee_ID, SickDate)
AS
(
	SELECT
		Row_ID		= row_number() OVER (ORDER BY sd.Employee, sd.[Date]),
		Employee_ID = sd.Employee,
		SickDate	= sd.[Date]
	FROM
		dbo.SickDays_t AS sd
),

interim1_cte (Row_ID, Employee_ID, SickDate, Next_Row_ID, Next_Employee_ID, Next_SickDate, DaysDifference)
AS
(
	SELECT
		curr_row.Row_ID,
		curr_row.Employee_ID,
		curr_row.SickDate,
		next_row.Row_ID,
		next_row.Employee_ID,
		[Next_SickDate] = 
			CASE
				WHEN curr_row.Employee_ID <> next_row.Employee_ID THEN NULL
				ELSE next_row.SickDate
			END,
		[DaysDifference] = 
			CASE
				WHEN curr_row.Employee_ID <> next_row.Employee_ID THEN NULL
				ELSE datediff(d, curr_row.SickDate, next_row.SickDate)
			END
	FROM
		base_cte AS curr_row
		LEFT JOIN base_cte AS next_row
			ON curr_row.Row_ID = next_row.Row_ID - 1

),

interim2_cte (Row_ID, Employee_ID, SickDate, Next_SickDate, DaysDifference)
AS
(
	SELECT
		d.Row_ID,
		d.Employee_ID,
		d.SickDate,
		d.Next_SickDate,
		d.DaysDifference
	FROM
		interim1_cte AS d
	WHERE
		d.DaysDifference IS NOT NULL
		AND d.DaysDifference < 7
)

SELECT * FROM interim2_cte AS x
GO

I imagine others can find additional solutions using PQ.

But ideally those more familiar with HR process already have online resources for an available, easier algorithm.

Hope it helps.
Greg
Create Table SickDays_t and Populate.sql (2.7 KB)
Create Table RollingDates_t and Populate.sql (188.8 KB)
Create View SickInstances_vw.sql (1.4 KB)
eDNA Forum - Sick Instances.pbix (50.3 KB)

Hi @Jamie.bryan , did the response provided by @Greg help you solve your query? If not, how far did you get and what kind of help you need further? If yes, kindly mark as solution the answer that solved your query. Thanks!

@Greg Thanks! I’ll take a look over those and see where I can get from there. I’m away for a bit after today so I’ll mark this one closed for now and come back/repost at a later date.