Strange Data Model

Since I’m in an very early stage (and I foresee some problems when proceeding), I don’t have a PBIX-file right now.
In the past I built already several data models for customers and I’m familiar with snowflake- and star schema.
But this time it’s a bit weird when I sketch my model in Power Point (as I always do).

The situation is as follows:

  • There are real estates - they are unique (there are no two same buildings in the world)
  • Each real estate contains several apartments (they are also unique (by logic and in reality))
  • Each apartment can be rented ( e.g. from Jan - Mar, then from another tenant from Apr - Aug and so on); for every rental period there should be one entry in the database (possibly Excel)
  • Payments will be received every month for the duration of the rental period

Doesn’t sound too complicated, does it?

But I don’t see anywhere some of the usual data model structures here. So I’m fearing that I will struggle later when it comes to measures (DAX) and I will not be able to receive the correct or (worst case) no answers, because I don’t have the correct relations between the tables (this happened to me in my first business job ).

So therefore some advice would be very much appreciated.

Best regards
Martin

@Mjuds ,

The first thing I always do before starting the data modeling exercise is to draft an analysis plan, which in its simplest form is just a list of the questions you will ultimately need the analysis to answer. That way in evaluating potential data models, you’ve got a clear set of objectives that you need the model to be able to fulfill.

My initial take on this is that it you will have three main dimension tables (Dates, Tenants and Units). Units will be based on a concatenated key field capturing the building/real estate ID and the apartment ID. That unit ID will then be used in each of the two fact tables along with the tenant ID and a series of dates. For the rental fact table, you will have a start rental date and an end rental date for each tenant and unit combination so at least one of those will need to be an inactive relationship. The second fact table, payments, will have payment date, payment amount, tenant and unit.

I think this is quite a flexible structure that will allow for many different analyses, but it will be valuable to see your analysis plan question list to ensure that the PQ/M/DAX to accomplish that will be doable and efficient.

I hope that’s helpful. Happy to work more with you on this as it evolves with more detail.

– Brian

Yes, indeed, that’s helpful, especially due to the fact that I have tomorrow my first meeting with this customer.
And your tip concerning the list of questions comes exactly to the point: I can discuss them with my customer…(that was also my mistake during my first job: I didn’t ask and assumed a lot, and finally had to rework my data model completely).

So, tomorrow is the day and I will be prepared and hopefully a bit smarter.

Thanks a lot Brian and I’ll keep you updated (that’s for sure).

BR Martin

@Mjuds ,

Sounds good. In the interim, I’ll take a good look at the Order Management chapter of Kimball and Ross’ Data Warehouse Toolkit to see if some good info about structuring this type of data model can be gleaned from that.

  • Brian

@BrianJ ,
I visited the customer this week and showed him some draft, I had created.
It was very rudimentary, but the customer liked already couldn’t imagine to need more.
So, the result is, that I have to guess what he might need (or what I think I would like to know, if I were him).
Therefore I decided to start with also having a good look at Kimball and Ross’ Data Warehouse Toolkit, which I ordered yesterday (should arrive midth of next week); then try to make up my mind to think like someone who owns more than 100 appartments in Berlin…create some sort of plan how to get my questions answered and finally send out an offer to the customer.

I think I might come back, if I start creating my final data model and get lost at some point.
Hopefully Kimball and Ross can give me some directions how to deal with this new challenge.

Nevertheless, thanks a lot Brian for your help until now and see you next time (soon?)

Best regards
Martin

@Mjuds ,

Given the customer’s lack of clarity on what they want, I think your proposed approach is the way to go. What I tend to do in these sorts of cases is build out a model that handles the typical process well first, and then “stress test” it to see if it can handle a range of special cases (e.g., what if someone prepays 12 months’ rent up front?, what if someone is three months late on rent and then pays it all off in the third month?, etc.). You may even want to talk to the client up front about all the potential deviations from the standard rental/payment process that they may want you to capture in your model/report.

By stepping through the deviations after building the standard model, it often reveals new fields you need to add, reshaping (e.g., unpivoting) you need to do, or new tables and/or relationships you need to add.

I’m working from the 2nd Ed. of Kimball and Ross, and you probably ordered the new 3rd ed., but in my version, there’s a chapter on Order Management with a subsection on Invoice Transactions that I think you will find very helpful.

Sounds like you are on a good track here, but if you run into problems, just give a shout.

Good luck!

  • Brian

@BrianJ ,
looks like this is an international problem with customers not knowing what they want.

I was a bit smiling when I read your reply, especially the part with the ‘stress test’. I asked the customer already nearly exactly the same questions you mentioned.

The customer is completely new in this business and so he seems to be happy to be asked the questions he hasn’t thought about yet. Looks like some process consulting is needed in the first place, otherwise (and I had this case with another customer) the changes needed to get all requirements into a proper data model after he finally made up his mind about his processes, nearly blew my mind (and needed a lot of time).

Can’t wait to get Kimball and Ross’ books…to have a look in Invoice Transactions.

Keep you updated.

  • Martin
1 Like

@Mjuds ,

Honestly, it sounds like the hardest part of this project might be defining the within scope/outside of scope boundaries of the contract. Unless you are on a strict time and materials contract, clients like this can just end up killing you with continual change orders (“oh wait, now can we change the model to make it do this?”…).

Sounds like you are asking exactly the right questions and have been down this road before. I completely agree that starting with some pretty detailed process consulting is going to be time and money well spent.

Talk to you soon. All the best,

  • Brian

@BrianJ ,

even worse, sometimes clients like this tend to ask questions about some information that is not in the model (“Can we also see when the facility manager is on holiday?” “Sure you can, but you have to enter that information up front.” “Oh, I see, but that’s to much work…Is there another way?” “Ehm, no…” and so on).
But if we finally can convince the customer to follow a certain process flow or pay for changes, everything is fine and the fun of building everything that’s needed starts.
Looks like this time it will take a bit longer :wink:.

But since I’m relaxed that’s no problem.

Hear you next time.

  • Martin
1 Like

Hello @Mjuds, were you able to find the solution for your question?
Kindly mark as solution the answer that solved your query.

If not, how far did you get and what kind of help you need further?

Hello @Mjuds, we noticed no response from you since November 22nd.

Did the responses above help you with your inquiry?

I am currently waiting for customer input, before I proceed.
He promised to contact me beginning of next week.
So, until then there will be no progress here.

@BrianJ

So, finally I got (at least most of) my questions answered by the customer.

The result is that your and mine first considerations were right.
There are only 2 things I have to learn to deal with:

    1. : 2 FACT tables (never done before and I hope I will not run into too many difficulties).
    1. : a DIM-table that will ‘slowly change’ (SCD) over time (the rents will (for sure) increase)

Especially concerning the second point I watched already some youtube videos and read the corresponding chapter in Kimball/Ross’ ‘Data Warehouse Toolkit’.
Now I ‘only’ have to choose the correct Type (luckily there are only 7…).
At the moment of writing I tend to use Type 2, but I have to evaluate this a bit more (I have used this Type already in another project, and it worked and still works).

That said it looks like I can start setting up my tables and so on.
There’s one thing left I would like to ask your opinion about:

  • Are there pitfalls I should expect when dealing with 2 FACT-tables?

Any advice is (as always) highly appreciated.

Best regards
Martin

Here’s my final data model…

@Mjuds ,

OK, we’re making progress but we’ve still got ourselves some major issues.

First, I totally agree with you that Type 2 is the right way to go here for SCD. Want to maintain the full history. Would absolutely recommend going with Kimball’s suggestion to add a “current records” flag to that dimenion table, since it will make your DAX much easier and more performant if it’s not always having to run the calc to determine which record is the most current. So, good to go there.

But with regard to your data model, you’ve got some major violations here. In a star schema, dimension tables are not connected to each other, nor are fact tables.

Your current model is chockablock with ambiguous paths will undermine the integrity of the DAX results. For example, Dates can filter Payments in one of three (!) different ways:

1-2-3 or 4 or 5/6-3.

To resolve this, remove the relationships between Rentals and Payment and Date and Apartments.

Also, connecting Buildings and Apartments in a Snowflake structure while not wrong has a lot to argue against it (see Accelerator Week #2 for an in depth discussion of this). I would prefer to see you denormalize these two table and combine into one.

Once you get all this worked out, I don’t see any immediate problems from having multiple fact tables. However, at that point I’d like to take a relook at the revised model and see if denormalizing the two fact tables into one makes any sense.

I hope that’s helpful.

  • Brian

@BrianJ

First of all, I have to apologize for my ‘data model’, it is no such thing, not even close.
I totally mixed my mind, because I sketched what is supposed to show the relationships between the tables in the Excel application I’m currently developing for the customer.

The data model in Power BI of course has to have another layout.

Although I can follow your suggestions mostly, I struggle a bit with some topics.

The denormalization of Buildings and Apartments is no problem.
But it seems I have a key problem in understanding how I can get reliable results from measures that involve Rentals and Payments, if they don’t have a connection (e.g. I want to show detail information from Rentals when calculating Payments for a specific Rental).
And also the same is true for the missing connection between Dates and Apartments.
Apartments need to have a date column, because when I want to calculate the development of the rent for Apartments over time, how can this be done without having a date table connected?

What I know so far is, that a filter flows from 1 to many, not vice versa…

Looks like I have to go back and adjust my basic understanding.

Nevertheless, many thanks for your help until now. I think your tips have saved me from making a big mistake.

I would be very happy if you could give me some advice on the topics I mentioned above.

Best regards
Martin

@Mjuds ,

Whew! Much better. :grinning:

Let’s look at the question another way – what is preventing you from combining the payments and rentals table into a single Transactions table with a transaction ID, transaction type (rental or payment), amount, date and unit ID (tied to the denormalized building/apartment dimension table)?

Now that were getting very close to a final model, it might be helpful to have a full-scale mockup with the actual table and field structure and some dummy data.

What do you think?

– Brian

@BrianJ

from my point of view the combination of payments and rentals is not useful, because this will then become a table with totally different contents, because there are a lot of different columns for either type (P or R).
Additionally as MANY payments are based on ONE rental I don’t see the usefulness of combining them.

So, I don’t think I will go for it (at least not now).
Maybe I’m still having a lack of understanding here.

And yes, a full-scale mockup should and will follow soon.
Possibly it will help clarifying some topics.

Best regards
Martin

@Mjuds ,

You’re very likely right in that assessment that combining the two doesn’t make sense. However, we’re at the point where I’m losing the ability to take the next steps conceptually, and will actually need the full-scale mockup to play with.

When you send it, please include the PBIX, the XLSX or CSV, and a mockup of what you want the final results to look like. From that, I think we should be able to wrap this one up nicely.

Thanks!

  • Brian

@BrianJ

absolutely right…meanwhile a mockup is also needed by myself, to get the full picture.

But the customer has to do one more step , namely to send the order confirmation.
Until I get it I will pause any activity, because he promised to send one 1 week ago, I sent a friendly reminder, but still nothing.

So, we must not waste our precious time and work for ‘nothing’.
Beginning of next week I will make another attempt to get that confirmation, and then either proceed or stop finally (in negative case).

Many thanks, Brian, so far.

Best regards
Martin

1 Like

@Mjuds ,

Thanks for the update- makes sense. I hope the client follows through. This has been a fun problem, and I’m eager to see it through to the end.

  • Brian