Latest Enterprise DNA Initiatives

Stuck with 'easy' data model (6 tables only)

Hello there.
Since more than 1 year I’m following EnterpriseDNA (meanwhile becoming a member to support this excellent platform).
I’m using Excel, PowerBI, PowerQuery…on a daily basis. I’m (normally) also familiar with creating effective data models (dimension -> fact tables).
But this time I’m a bit lost.

I have 3 dimension tables:

  • Sites (Address data of construction sites)
  • Employees (personal data)
  • Degrees (well…degrees)

and mainly 1 facts table:

  • SiteHrs (holds data of employees’ hours worked on Sites)

So far so good, BUT…
the hourly rates depend on the employee’s degree.
And this can change over time (one always starts as a beginner, than becomes advanced and finally ends up being a master). So there is a time dependancy.
I cannot integrate the table HourlyRates (See screenshot) into my model.

I’m thinking now about this since 2 days and no solution will come to my mind…

Every hint is welcomed.
BR Martin

Hi @Mjuds,

First thing i would look at is your naming of your data elements. Make sure all your data names are the same. for example emplNo (tbl _employee) same as employee (tbl_sitehrs)?

Tbl_roles using LastName will get you into trouble if you have more than one person with the same last name should be unique maybe use emplNo.

Site-no (Tbl_sitehrs) and SiteNo (tbl_site) should be the same naming.

Could you combine roles into the tbl Employees because its unique to the employee information

Where is your Date Table in your model?

I’m not sure about this but maybe you can use the Tbl_HourlyRates as lookup table and you don’t have to link them up

This is only my opinion but others in the group will have more experience then myself.

thanks
Keith

1 Like

@Mjuds,

This is a classic “Slowly Changing Dimension” problem. There are a lot of different approaches you can use to address SCDs, but this is the approach I would take:

  1. In Power Query, combine the three tables I’ve marked in red into a single employee table. Each employee may have multiple entries, depending on how many degrees they have.

  1. In PQ, merge the hourly rate info into that newly created employee table using Degree as the key

  2. In PQ, add an index column starting at 1 to the merged table created above

  3. In PQ (or as a DAX created column, though PQ is preferred) create a lookup function to add the index value from the Employee dimension table (3. above) to your fact table. This function will basically match on Employee and pull the index value corresponding to the MAX start date for that employee. @Martin has provided an example of this approach here:

  1. Join your consolidated Employee dimension table to your fact table using that index column in a 1:M unidirectional relationship, and you should be good to go.

I hope this is helpful.

  • Brian
3 Likes

@Mjuds,

BTW - @Keith is spot on that none of this will work properly w/o a valid date table marked as such. I strongly recommend using @Melissa’s awesome Extended Date Table.

Also, here’s a video that @sam.mckay did on slowly changing price dimensions that explores another way to solve this problem through disconnected tables and the use of TREATAS to create a virtual relationship to the fact table:

  • Brian
1 Like

Hi @Mjuds,
if you want to report on costs (or payment, from employee perspective) there is no need to integrate tbl_HourlyRates. You already have all the necessary information in your fact table. If you are interested in the applicable hourly rate, it’s tbl_SiteHrs.Costs diveded by tbl_SiteHours.Hours. If Costs contains additional positions to the hourly rate and you need the hourly rate then, in this specific case, I would store the applicable hourly rate with each fact row to solve the problem.
If you have additional functional requirements that cannot be solved with these approaches, please specify.
BR
Martin

1 Like

Solving this as a SCD is possible but be aware that this is kind of the “second derivative” of SCD: The degree of the employee changes over time and the rate for each degree changes over time independently. So you have dimension changes when the degree of the employee changes and when the rate of the current degree changes. The solution would be to join degrees and rates on the Employees tabel, then create a surrogate key on the Employee SCD table (e.g. using index), and then relate tbl_Employees and tbl_SiteHours using the new surrogate key. Depends on your functional requirements whether it’s worth the effort or not.

1 Like

Hi @Mjuds, did the response provided by the experts help you solve your query? If not, how far did you get and what kind of help you need further? If yes, kindly mark the thread as solved. Thanks!

Hi Keith,

thanks for your advices, I will take care of them.
The date table is not yet in the model, but I never create one without…

BR Martin

1 Like

Hello Brian.

Everything fully understood, after watching a few YT videos that went deeper into that SCD-problem.
During my nearkly 40 years working life I never came to this thing…interesting issue…interesting solutions.
I will will try out yours first and the see if it works for me.

Thx
Martin

@Mjuds,

It is an interesting issue. For me, it comes into play a lot in the HR context with employee titles/job categorizations changing over time as people get promoted, move into different roles, etc.

Here’s an article I’ve found to be a useful summary of the different approaches that can be taken to SCDs:

  • Brian
1 Like

You absolutely judged the situation right.
And I’m going to solve this for me using the join tables/surrogate key approach.

Thanks again to everyone who gave valuable input to me.

Best regards
Martin (Mjuds)

1 Like