Fact table merge columns

Hi there!

Does it matter how many columns the fact table grows to, given that I am merging my tables into a single fact table?

I have merged all my tables into a single fact table and included mainly foreign keys and columns I need to do counts and date calculations on. The rest of the columns in each table, I model as lookups. Most of the calculations are DAX count and various date calculations. However, I am finding a few tables that have no unique ids. How do I merge those that need calculations against some of the table’s columns?

So, after merging my tables into a single fact table I am finding columns names that are the same or similar. What is the best practice for this situation? Does this require a good business understanding because I’m sure one needs to know which columns to use in the DAX formulas when filtering and doing calculations.

Thank you!

Hi @powerbideveloper. Please upload a work-in-progress PBIX (with sample data if desired) that illustrates your situation for the forum members to review. I always strive to make fact tables as thin as possible, and to only include the data that is required to answer the questions that is the report’s intent. An option might be to create separate reports each with thinner fact tables to answer separate sets of questions, but it’s hard to answer generally as each data situation is unique.
Greg

1 Like

Hello @powerbideveloper, just following up if the response above help you solve your inquiry?

If you do, kindly provide the information the experts requested above so they can help you further.

In case there won’t be any activity on it in the next few days, we’ll be tagging this post as Solved.

Hi @Greg, I will put something together and upload today.

Hello @powerbideveloper, it’s been a while since we got a response from you.

Just following up the information the experts requested above so they can help you further.

Please be reminded that In case there won’t be any activity on it in the next few days, we’ll be tagging this post as Solved.

@Greg, I am trying to find the time to put the sample together, but I will get to it, so hopefully I can do so today. Thanks for your patience!

Hello @powerbideveloper, following up the sample that @Greg requested.

Hi @EnterpriseDNA, @Greg, yes, I should be done with the sample by tomorrow PST time. There are about 90 columns for me to get through and try and carefully anonymize, so thanks for your patience.

2 Likes

Hi @Greg, @EnterpriseDNA!

Thanks for your patience. I have attached a pbix and he Excel data files which contains the sample.

I first joined all my tables in SSMS and exported the file as csv. I then did my best to anonymize the data. You’ll notice that the one file has the entire SQL merged data in one data set and the other file has that exact same data but broken up into individual tables. I loaded the large merged data set into Power BI.

After joining the tables in SQL Server I noticed that duplicate column names were numbered.

SQL Merge Model.zip (16.6 MB)

SQL Merge Anonymized 2.xlsx (13.8 MB)

1 Like

Hi @powerbideveloper

As @Greg already mentioned that Fact table shall be as thin as possible. Looking at your tables and data, it will be advisable to go with SQL Merge Anonymized 1.xlsx where Data is split into multiple tables instead of single ones.

Also, it is advisable to have minimum Fact tables as possible. Can further look to join few tables like Bids/Projects etc into single Fact table and keep Contenders/Offers etc separate to have only few Fact tables.

Ensure to only load the columns that are required for Calculations. This will make Analysis simpler and easier.

Thanks
Ankit J

1 Like

@ankit,

I actually merged the the tables because there were issues with some of the relationships of the tables when they were separate tables.

So, based on what you’re saying, it seems I have to split this merged data set up again. The question then becomes: on what basis should I split the data set up? The lookup tables are easy to figure out and extract, but deciding on the fact tables is a bit of a challenge and I’m hoping for some more guidance on that.

Thanks!

Hi @powerbideveloper. Following on from @ankit’s and my comments, a long and thin fact table with short and wide lookup tables is a more desirable model. From your data, a modelling exercise is a needed next step, along with identifying the questions the report is designed to ask and a mockup of the desired outcome for each of the visuals you’ll be using to answer these questions.

The PBIX you provided contained only a single table of data with no model or visuals; as such, very little about the report’s purpose could be gleaned. Nevertheless, after an initial review, here’s a few comments:

  • have one table for [Projects] (this will be a lookup table, I expect)
  • have one table for [Bids] (this will be a lookup table, I expect)
  • have one table for [Bid Steps]; I’m guessing this is the fact table you’re looking for) (which I expect should have the [Bid Id] and also have the [Project Id])
  • remove the [Bids & Projects] table
  • name all columns as clearly as possible
    • e.g., don’t leave the [Id] column in the [Projects] table alone, but rather rename it to [Project Id]
    • e.g., don’t leave the [Name (2)] column in the [Projects] table alone, but rather rename it to [Project]
  • add a dedicated [Dates] table and mark it as such
  • remove all unnecessary columns from all tables
    • I doubt there is a need for the [Created Date] and [Updated Date] columns, but the questions will provide insight

(Unclear if/how a project can have multiple bids (I’m guessing there’s only one bid per project, but multiple bid steps per bid)… the data suggests it but, again, without the questions, its hard to evaluate.)

It sounds like you have access to the SQL source for this data; I would probably do another round of SQL view development using SSMS (SQL Server Management Studio), setting all of the appropriate column names in the SQL views before even pulling the data into Power BI, so you can verify the linking of data in SQL before using Power BI to link the separate tables. (If you can’t link the data properly in SQL, Power BI will unlikely to be of assistance.)

Have another go and start a new thread with a specific question(s) if you need more help.

Greg

1 Like

@Greg and @Ankit Thank you for always providing this group with great solutions. We greatly appreciate your efforts and your share of knowledge.

We hope this helped you @powerbideveloper

If not, how far did you get and what kind of help you need further?

If yes, kindly mark as solution the answer that solved your query.

@Greg and @ankit,

Thanks for your time and effort!

I was hoping for help with building the model, and how to get there, so the solutions fell a bit short of what I was hoping for, but I will choose one. However the important points you both made are noted and relevant.

I will build this out further and pursue additional questions as I move this along.

Thank you!