I need to consolidate data from three SQL database schemas having the same table structures into a single dataset. Using views, the tables can easily be appended together. However there are common primary keys across all three schemas resulting in duplicates and triplicates. What is the standard solution to resolve such an issue?
My first thought would be to add an extra column to the views that represent the dim and fact tables in the data source. This column would store the Power BI PK as a decimal number. If the db primary key is 2256 then the Power BI primary key would be 2256.1 when originating from the first schema, 2256.2 when originating from the second schema and so on.
Is there a better way to do this? This is probably a common scenario in data warehousing but I have nil experience there.
You’re right, if you want the records to be easily individually identifiable and traceable, you should have a unique key before you append records. The best practice would be to do this as far upstream as possible, so if you can alter the SQL views (say, to create a unique primary key with the decimal point as you say or perhaps another unique field), that would be worth a first try. If you can’t alter the SQL views, then compose a new “composite” key in Power BI before you append the tables together (e.g., perhaps by adding the “table name” to the primary key, etc.)
If you’re not going to be using the new “composite” key in any visuals, however, you might want to just not import the primary key at all.
(Hard to offer an opinion without seeing the data, so …)
I did test the proposed solution and it works. For each fact and dim table, I created a view that is a UNION query consolidating data from each site (schema). In the view I added extra columns to represent the primary/foreign key + 0.x. These were aliased as PBI_ID_XXX . In the view I also removed the now unnecessary original PK and FK so as to reduce PBI import size.
After importing in Power BI, I was able to replicate in a table visual a couple of existing legacy Crystal Reports just to test the data. Everything matches.