Multi-Company Data Consolidation Data Model

Hi,

I want to get the best practice for data consolidation from multiple companies. Each company has its own way wording or pattern of product items or main accounts. We created a common word for all and will be use in our Power BI reports.

My question is, what is the best data model approach to handle this scenario. As much as possible we need it to be extensible as more company will join this consolidation project.

Please see this link for sample table format data.

Thanks,
JazZ

Dear Jazz,

There are several ways for doing this.

  1. Option 1, If desired value for all the products is already identified and created, you can need to create a data base link table. You can link, company A ( first column) with Company A Data, Company B (2nd column) with Company B Data and you can slide and dice on the bases of desired value.

  2. Option 2, If Desired Value is Not you prepared, You can try to develop a common Syntex ( Say first 5 words) and this will be your desired value.

  3. You can also do the product group ( right click on the product table and make a group.)

I am sure there will be several other ways of doing this. You, as a business user, or other experts in EDNA can advice.

Regards,
KP

I had done a similar exercise and the most flexible was to build a mapping model schema structure. you create group codes and then map individual account codes for each company under these group codes. The set up process was tedious but it offered the most flexibility. I had built the structure in sql server but you can ofcourse replicate the same process in power bi by using the group functionality. Define a parent child hierarchy where parent is the group code and the child are the individual account codes from the chart of accounts

Assumption is each company have their own chart of account

image

@JazZ0003,

I certainly agree with @piryani that moving forward you will need to develop clear business rules on nomenclature. However, I interpreted the problem as being that you are starting with a lot of inconsistent “dirty” data and looking for how to standardize that. If that’s the challenge, then Power Query definitely seems to me to be the way to go with this. Taking your example data, here’s how I cleaned it:

  1. took the original data

image

and transposed it so that you are working in columns, rather than rows

image

  1. for the first column use the Split transformation to break the data into three components (first number(A), x and second number (B). Then created the following custom function to place the smaller number first in the concatenation

  1. used simple split transformations to extract only the relevant piece for columns 2 and 3

  2. re-transposed the cleaned data back to original form

image

Here’s the full M code for all the transformations:

let
    Source = Excel.Workbook(File.Contents("C:\Users\brjul\Desktop\Company Data.xlsx"), null, true),
    Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
    #"Changed Type" = Table.TransformColumnTypes(Sheet1_Sheet,{{"Column1", type text}, {"Column2", type text}}),
    #"Promoted Headers" = Table.PromoteHeaders(#"Changed Type", [PromoteAllScalars=true]),
    #"Changed Type1" = Table.TransformColumnTypes(#"Promoted Headers",{{"Company A", type text}, {"Company B", type text}}),
    #"Transposed Table" = Table.Transpose(#"Changed Type1"),
    #"Duplicated Column" = Table.DuplicateColumn(#"Transposed Table", "Column1", "Column1 - Copy"),
    #"Split Column by Character Transition" = Table.SplitColumn(#"Duplicated Column", "Column1 - Copy", Splitter.SplitTextByCharacterTransition({"0".."9"}, (c) => not List.Contains({"0".."9"}, c)), {"Column1 - Copy.1", "Column1 - Copy.2"}),
    #"Split Column by Character Transition1" = Table.SplitColumn(#"Split Column by Character Transition", "Column1 - Copy.2", Splitter.SplitTextByCharacterTransition((c) => not List.Contains({"0".."9"}, c), {"0".."9"}), {"Column1 - Copy.2.1", "Column1 - Copy.2.2"}),
    #"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Character Transition1",{{"Column1 - Copy.1", Int64.Type}, {"Column1 - Copy.2.2", Int64.Type}}),
    #"Renamed Columns" = Table.RenameColumns(#"Changed Type2",{{"Column1 - Copy.1", "A"}, {"Column1 - Copy.2.1", "x"}, {"Column1 - Copy.2.2", "B"}}),
    #"Added Custom" = Table.AddColumn(#"Renamed Columns", "Custom", each if [A] <= [B] then Number.ToText([A])&[x]&Number.ToText([B]) else Number.ToText([B])&[x]&Number.ToText([A])),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Column1", "A", "x", "B"}),
    #"Renamed Columns1" = Table.RenameColumns(#"Removed Columns",{{"Custom", "Column 1"}}),
    #"Reordered Columns" = Table.ReorderColumns(#"Renamed Columns1",{"Column 1", "Column2", "Column3"}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Reordered Columns", "Column2", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, false), {"Column2.1", "Column2.2"}),
    #"Changed Type3" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Column2.1", type text}, {"Column2.2", Int64.Type}}),
    #"Trimmed Text" = Table.TransformColumns(#"Changed Type3",{{"Column2.1", Text.Trim, type text}}),
    #"Removed Columns1" = Table.RemoveColumns(#"Trimmed Text",{"Column2.2"}),
    #"Renamed Columns2" = Table.RenameColumns(#"Removed Columns1",{{"Column2.1", "Column 2"}}),
    #"Split Column by Delimiter1" = Table.SplitColumn(#"Renamed Columns2", "Column3", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, false), {"Column3.1", "Column3.2"}),
    #"Changed Type4" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Column3.1", type text}, {"Column3.2", Int64.Type}}),
    #"Trimmed Text1" = Table.TransformColumns(#"Changed Type4",{{"Column3.1", Text.Trim, type text}}),
    #"Removed Columns2" = Table.RemoveColumns(#"Trimmed Text1",{"Column3.2"}),
    #"Renamed Columns3" = Table.RenameColumns(#"Removed Columns2",{{"Column3.1", "Column3"}}),
    #"Transposed Table1" = Table.Transpose(#"Renamed Columns3"),
    #"Renamed Columns4" = Table.RenameColumns(#"Transposed Table1",{{"Column1", "Company 1"}, {"Column2", "Company 2"}})
in
    #"Renamed Columns4"

I hope this is helpful. Full solution file and original data file posted below, so you can take a look at the detailed applied steps in Power Query if you want to.

In terms of turning the result into a usable fact table, you may want to unpivot it in the following way, Into the “narrow and long” shape well-suited to DAX:

  • Brian

Company Data.xlsx (8.6 KB) eDNA -Standardizing Company Data.pbix (23.3 KB)

Edit: looking at the two responses above that came in as I was putting mine together, I realize I may have completely missed the boat on this one – misinterpreting the question, and providing a detailed but highly irrelevant answer. If that’s the case – sorry, my bad…

From a modelling perspective, I would go with a link table and bidirectional filtering. (Point 1 from @piryani)

Products Company A <-> Link Table <-> Product Company B

Be VERY cautious with bi-directional filtering, often it seems like the ‘easy’ solution, but it can trip you up unexpectedly down the road and break your relationship logic.

1 Like

100% agree with Heather. I know I’ve posted the article below before, but I think it should be almost mandatory reading for anyone considering using bidirectional relationships in their data models. Bidirectional relationships can not only break the logic of your model, but can do so in a way that produces results that are not accurate, but on the face of them not obviously wrong - which IMO is a lot worse than producing results that are obviously crazy.

  • Brian
1 Like

Thank you for everyone for great inputs.

I really considering the bi-directional filtering, but with the updates from @Heather and @BrianJ, This is not a good solutions.

I can’t consider also transforming the text as its very tedious and data involve comes on different formats and sometime desired data doesn’t match from raw data.

I can consider what @C9411010 is suggesting. This might work for us.

Still open for other suggestions.

Hi @JazZ0003, a response on this post has been tagged as “Solution”. If you have a follow question or concern related to this topic, please remove the Solution tag first by clicking the three dots beside Reply and then untick the check box. Also, we’ve recently launched the Enterprise DNA Forum User Experience Survey, please feel free to answer it and give your insights on how we can further improve the Support forum. Thanks!