One Dimension, Multiple Facts -and- Multiple Interconnected Facts

CStaich · January 27, 2020, 7:37pm

I am in the midst of refactoring my PBI Data Model.

I have a unique situation; my company is a drop-shipper, so for each Sales Order we may have multiple corresponding Purchase Orders (one per vendor). This means our Fact Tables have an interesting relationship I’m trying to model with mixed results.

There are several relationships I have so far opted not to model to avoid circular relationships. Very rarely (if ever) will we need to look up a specific line of a Sales Order in a different fact table; the header-level grain is sufficient for foreseeable uses.

Purchase Order Line has foreign keys to Sales Order Line
Purchase Receipt Line has foreign keys to Sales Order Line and Purchase Order Line

These connected Fact tables make for some interesting dimension connections. I believe I’ve determined how best to model each of them, but would like more eyes on the diagram to ensure I won’t get any major issues. Take Dim Vendor as an example. This dimension links to the line-level tables on the sales documents, but the header-level tables on the purchase documents. This is possible because the function that generates purchase orders and sales orders from a quote always generates

Exactly 1 Sales Order
Exactly 1 Purchase Order per Vendor

In addition to general feedback on my data model diagram above, here are some specific questions I am stuck on.

Dim Date: I have multiple dates I would like to do time intelligence on; Fact Sales Invoice Header[Posting Date] and Fact Sales Order Header[Order Date] as an example. These are the two primary dates we want to report by, but these two Fact Tables are already linked in a 1:many relationship. How do I build Dim Date into the model and build time intelligence functions on both of these two date fields?

→ Is this a case where USERELATIONSHIP() is necessary? If so, I could use some guidance on the starter measure to branch from.

Dim Date: Same question as above, except both values are in the same table. For example, Fact Sales Order Header[Order Date] and Fact Sales Order Header[Award Date].
Dim Contract: This dimension connects to both the Sales Header and Sales Line tables. The values are not necessarily the same on the header and lines. How do I model both relationships and build measures to show either header contract or line contract? I will also need to connect to Sales Invoice Lines and Sales Order Lines, while also capturing Sales Order Header value.

I’m making good progress on this, but the following guidance and resources would be most helpful to me now;

Specific cases when inactive relationships are recommended, and how to judge this for myself
Specific DAX Measures and formulas to use when relationships are inactive
Enterprise DNA tutorials on inactive relationships and complex data models such as mine
Enterprise DNA tutorials on multiple interconnected fact tables, including my order:invoice and order header:order lines tables

Note: This conversation began as a different topic in Enterprise DNA concerning modeling my Sales Invoices and Sales Orders, and how best to build measures for each. Here is that post

mickeydjw · January 27, 2020, 8:54pm

The short version is yes, you use the USERELATIONSHIP() command. The starter measure I’d use it with would be something along the lines of Total Sales by Delivered Date = CALCULATE ( [Total Sales], USERELATIONSHIP ( ‘Dates’ [Date], ‘Sales’ [Delivery Date] ))

BrianJ · January 28, 2020, 4:39am

@CStaich,

Per your questions, here are some recommended videos and forum posts:

BTW, nice data model graphic. What program did you use to create that?

Brian

CStaich · January 28, 2020, 1:49pm

@BrianJ Draw.io is very easy to use for diagrams. I find myself drawing process flow diagrams on the site very often and exporting them to png or pdf for requirements documentation.

Thank you for all the resources. It took me a small epiphany to get to this point with the data model. I feel like Power BI’s learning curve is a very bumpy roller coaster, and I’m climbing another hill now trying to understand the indirect relationships and how to properly use them.

I will watch some of these tutorials and most likely return with more questions.

BrianJ · January 28, 2020, 4:25pm

@CStaich,

I think you are absolutely right - my experience is that the PBI learning curve is definitely not linear. I’ll just keep hammering away at an issue or concept, and one day it will suddenly make complete sense to me. However, it looks like you’ve made amazing progress since we worked together in December

Thanks for the tip on Draw.io. I’ll definitely check that out.

Brian

mickeydjw · January 28, 2020, 5:09pm

One thing I didn’t answer was, “How do you determine which relationship to make active?”

The way I determine which I’m going to make my primary active relationship is what data will be manipulated the most. When I’m working with support teams, the primary will be created date/time, with an inactive on resolved and closed dates.

I hope this helps.

BrianJ · January 28, 2020, 6:02pm

@CStaich,

With regard to your question of “Specific DAX Measures and formulas to use when relationships are inactive”, I use the same measures and formulas I do when the relationship is active, but just add a USERELATIONSHIP statement in the appropriate place to activate the inactive relationship.

Piggybacking off of @mickeydjw’s comment, another decision is when to use physical (active and inactive) relationships versus virtual relationships (handled via TREATAS). The decision usually comes down to the relative granularities of the tables being connected. Same granularity typically connected via physical relationship, different granularities connected virtually. The latter is not a requirement - just a personal preference to avoid more complex constructs in the data model, like bridge tables, that may be needed to address many-to-many relationships that can result from the granularity mismatch, if you are using physical relationships. @sam.mckay covers all of this thoroughly in the videos I cited above.

Brian

CStaich · January 28, 2020, 7:26pm

Here’s my plan:
Leave all connections to ‘Dim Date’ inactive and specify in my base measures which date relationship to utilize for that measure. Then, when I branch from my base measures, the date relationship desired will be part of that branch.

Ext. Cost = 
	CALCULATE(
		SUMX('Fact Sales_Order_Line','Fact Sales_Order_Line'[Quantity] * 'Fact Sales_Order_Line'[Unit Cost]),
		USERELATIONSHIP('Dim Dates'[Date], 'Fact Sales_Order'[Order Date]))
		
Ext. Price = 
	CALCULATE(
		SUMX('Fact Sales_Order_Line','Fact Sales_Order_Line'[Quantity] * 'Fact Sales_Order_Line'[Unit Price]),
		USERELATIONSHIP('Dim Dates'[Date], 'Fact Sales_Order'[Order Date]))

Gross Profit = [Ext. Price] – [Ext. Cost]

Inv. Ext. Cost = 
	CALCULATE(
		SUMX('Fact Sales_Invoice_Line', 'Fact Sales_Invoice_Line'[Quantity] * 'Fact Sales_Invoice_Line'[Unit Cost]),
		USERELATIONSHIP('Dim Dates'[Date], 'Fact Sales_Invoice'[Posting Date]))
		
Inv. Ext. Price = 
	CALCULATE(
		SUMX('Fact Sales_Invoice_Line', 'Fact Sales_Invoice_Line'[Quantity] * 'Fact Sales_Invoice_Line'[Unit Price]),
		USERELATIONSHIP('Dim Dates'[Date], 'Fact Sales_Invoice'[Posting Date]))

Inv. Gross Profit = [Inv. Ext. Price] – [Inv. Ext. Cost]

This seems to quite nicely summarize a given sales order by date

And finally, ‘Dim Contract’ is my main remaining issue. This table ought to connect to all of the following:

‘Fact Sales_Order’
‘Fact Sales_Order_Line’
‘Fact Sales_Invoice_Line’

All of these fact tables are already modeled together.

Here’s the real-life scenario: Sometimes I want to filter orders or invoices by header contract, and sometimes I want to filter them by line contract. I want to be able to filter using the contract’s name rather than the code in either case.

The way I’ve modeled it today, filtering on ‘Dim Contracts’[Contract Name] will filter by header contract, ignoring line contracts, because ‘Dim Contracts’ is only tied to ‘Fact Sales_Order’

Is this a case where duplicating my dimension table is the best practice? I’m not having any other ideas here. Building a DAX Measure doesn’t seem to make sense in my head, because I’m not aggregating anything here, just trying to use a different relationship for a filter.

mickeydjw · January 28, 2020, 8:11pm

I don’t know that it’s best practice, but it is something I’ve had to do in the past and people looked at me funny for it. In the query editor, it was something along the lines of “Table Part Deux = Table1”

I told the PM in question that if a solution works, it’s not stupid.

CStaich · January 29, 2020, 10:34pm

@BrianJ, @mickeydjw

I’ve run into trouble!

Here are two measures:

Inv. Ext. Cost = 
	CALCULATE(
        SUMX('Fact Sales_Invoice_Line', 'Fact Sales_Invoice_Line'[Quantity] * 'Fact Sales_Invoice_Line'[Unit Cost]),
		USERELATIONSHIP('Dim Dates'[Date], 'Fact Sales_Invoice'[Posting Date])
	)

Ext. Cost = 
	CALCULATE(
        SUMX('Fact Sales_Order_Line','Fact Sales_Order_Line'[Quantity] * 'Fact Sales_Order_Line'[Unit Cost]),
		USERELATIONSHIP('Dim Dates'[Date], 'Fact Sales_Order'[Order Date])
	)

In the screenshot below, everything works as expected.
The Sales Order No is from ‘Fact Sales_Order’
The Sales Invoice No is from ‘Fact Sales_Invoice’

However, in the screenshot below, when attempting to do the same thing except with the Ext. Cost, something goes wrong. The Ext. Cost value is correct for each Sales Order, but the mapping between Sales Invoice No and Sales Order No is all wrong and is showing every possible combination between the two values.

For some reason, the Sales Invoice No. is being repeated for every single Sales Order. Considering the model shown below, I would except that Sales Invoice Line can be traced back to a single Sales Order, and that Ext. Cost measure would intake the Sales Order No in my visualization as context.

When only providing Sales Order No , everything works as expected.

However, something’s screwy when providing Sales Invoice No.

I think my Ext. Cost measure needs some additional context in order to total by Sales Order, or something’s wrong with my data model, but I’m struggling to find out the specific issue here. Any guidance or links to tutorials would be greatly appreciated.

Something something filter propagation? Something something sales order lines aren’t connected correctly? Something something my calculate functions need another FILTER?

BrianJ · January 29, 2020, 11:42pm

@CStaich,

Can’t speak for @mickeydjw, but we passed the point of complexity where I can diagnose what’s going on just by looking at screenshots about two exits ago.

If you can please post your PBIX file, I’d be glad to dig in and see if I can figure out the problem here.

Thanks.

Brian

mickeydjw · January 29, 2020, 11:55pm

I’m wondering why there’s a Sales_Invoice and Sales_Invoice_Line as well as well as the same with Orders. Why the 2 tables? When I’ve seen issues where totals aren’t doing what they’re supposed to, it’s been when data has been getting duplicated because of data getting counted multiple times through table relationships.

Also, when it comes to simplifying, a Key_Measures table fixes a ton, that way you’re not recreating calculations within new measures and instead measure branching.

CStaich · January 30, 2020, 3:56pm

@mickeydjw

I’m wondering why there’s a Sales_Invoice and Sales_Invoice_Line as well as well as the same with Orders. Why the 2 tables?

In my original model, I had a single query for both Sales_Invoice and Sales_Invoice_Line, combined.
Similarly, I had a single query for both Sales_Order and Sales_Order Line, combined. Header values were repeated on every line.

I didn’t think it would be good to leave those header and line tables combined. The main thing driving me to refactor this data model was to create relationships between my various fact tables, specifically:

Sales Orders [1:*] with Sales Invoices,
Sales Orders [1:*] with Purchase Orders,
Purchase Orders [1:*] with Purchase Receipts
Sales Orders [1:*] with Sales Shipments.

If I leave these fact tables connected as they are (see original post’s data model diagram), I think I need lines and headers in separate tables to preserve the 1:* relationships between fact tables. Conversely, If I was to combine header and line tables, I wouldn’t be able to connect my fact tables to each other without creating : relationships.

Also, when it comes to simplifying, a Key_Measures table fixes a ton, that way you’re not recreating calculations within new measures and instead measure branching.

I basically do have a Key Measures table, but am having a great deal of difficulty determining how to create both Booked (Sales Order) and Invoiced (Sales Invoice) measures, which is the current issue I’m facing.

Some of my key measures, all of which are struggling to cope with my data model:

Ext. List =
	CALCULATE(
SUMX('Fact Sales_Order_Line','Fact Sales_Order_Line'[Quantity] * 'Fact Sales_Order_Line'[Unit List]),
		USERELATIONSHIP('Dim Dates'[Date], 'Fact Sales_Order'[Order Date])
	)

Ext. Cost = 
	CALCULATE(
SUMX('Fact Sales_Order_Line','Fact Sales_Order_Line'[Quantity] * 'Fact Sales_Order_Line'[Unit Cost]),
		USERELATIONSHIP('Dim Dates'[Date], 'Fact Sales_Order'[Order Date])
	)

Ext. Price = 
	CALCULATE(
SUMX('Fact Sales_Order_Line','Fact Sales_Order_Line'[Quantity] * 'Fact Sales_Order_Line'[Unit Price]),
		USERELATIONSHIP('Dim Dates'[Date], 'Fact Sales_Order'[Order Date])
	)

Ext. Discount = [Ext. List] - [Ext. Price]

Gross Profit = [Ext. Price] - [Ext. Cost]

Inv. Ext. List = 
	CALCULATE(
SUMX('Fact Sales_Invoice_Line', 'Fact Sales_Invoice_Line'[Quantity] * 'Fact Sales_Invoice_Line'[Unit List]),
		USERELATIONSHIP('Dim Dates'[Date], 'Fact Sales_Invoice'[Posting Date])
	)

Inv. Ext. Cost = 
	CALCULATE(
SUMX('Fact Sales_Invoice_Line', 'Fact Sales_Invoice_Line'[Quantity] * 'Fact Sales_Invoice_Line'[Unit Cost]),
		USERELATIONSHIP('Dim Dates'[Date], 'Fact Sales_Invoice'[Posting Date])
	)

Inv. Ext. Price = 
	CALCULATE(
SUMX('Fact Sales_Invoice_Line', 'Fact Sales_Invoice_Line'[Quantity] * 'Fact Sales_Invoice_Line'[Unit Price]),
		USERELATIONSHIP('Dim Dates'[Date], 'Fact Sales_Invoice'[Posting Date])
	)
	
Inv. Ext. Discount = [Inv. Ext. List] - [Inv. Ext. Price]

Inv. Gross Profit = [Inv. Ext. Price] - [Inv. Ext. Cost]

@mickeydjw, @BrianJ

Here’s some demo data specifically pertaining to this issue. PBI Demo Order Invoice Relationship (2).zip (148.5 KB)

I felt so confident two days ago, and now I’m reconsidering whether I’ve had an epiphany or just a rush of hubris. There’s some crucial rule I don’t know I’m violating in my data model, and some part of my approach is entirely wrong. I don’t know where to look.

mickeydjw · January 30, 2020, 7:05pm

Here’s what I tossed together. Is this what you were looking to achieve?invoice cost updated.pbix (132.0 KB)

CStaich · January 30, 2020, 7:37pm

@mickeydjw,

Hmm, so the bi-directional relationship could be a missing ingredient.

You’ve got a bi-directional many:many relationship here, and have combined the header and line data into a single table.

I need to go train on bi-directional relationships. @sam.mckay has specifically forbidden me from utilizing them, but he also doesn’t include multiple linked fact tables in most (any?) of his content.

I’ll also need to train on many:many relationships, because again I feel I have been shepherded away from using them whenever possible. I need to make sure I know what I’m getting into before I rebuild this model using these methods.

If you know of any resources, which may need to be external to Enterprise DNA, please share.

[Edit: I wonder if I can leave my headers and lines separate, turn on this bi-directional relationship, and not have to deal with the many:many portion. Researching…]

mickeydjw · January 30, 2020, 8:44pm

Everyone will tell you to shy away from bidirectional relationships. Myself included. The only thing that I did to what you sent was simplify the fact tables and ensure the calculations worked. As you add complexity to your model, bi-directional filtering becomes more risky, and isn’t recommended.

CStaich · January 30, 2020, 8:59pm

I think a saving grace may be that all the bi-directional relationships I would like to add are in essentially the same layer of the waterfall, and are always between facts. I’m hoping these consistencies will squelch some of the ill effects I would otherwise see.

BrianJ · January 30, 2020, 10:05pm

@CStaich,

Below are some excellent free resources on how to handle many-to-many relationships. However, once you’ve been through the Enterprise DNA data modeling resources, the next one I would highly recommend if you or your office are willing to spend some $ is the SQLBI data modeling course. I took it in person at MBAS last year, but their video training also is excellent (and much cheaper):

Articles and videos:

Different-types-of-many-to-many-relationships-in-Power-BI.pdf (1.8 MB)

I’ll also take a look at your PBIX file. May be able to avoid bi-directional entirely via data modeling changes, but if not, may be able to make the necessary relationship inactive, but utilize it as needed via CROSSFILTER.

Hope this is helpful.

Brian

BrianJ · January 31, 2020, 5:27am

@CStaich,

I think I struck gold here - check out this link to Google Books, containing the full text of the chapter on invoice/order data models from Russo and Ferrari’s, “Analyzing Data with Power BI and Power Pivot for Excel”, which walks through in great detail how to properly set up exactly the model you are wrestling with.

Brian

BrianJ · January 31, 2020, 1:39pm

@CStaich,

One other thought – have you taken a look at this showcase?:

It doesn’t have the header/multiple line structure that makes your specific case more complex, but you may be able to draw some useful insights from it regardless.

I agree with @mickeydjw that denormalizing your model down to two fact tables, one for orders and one for invoices is going to be key.

Brian