Combining Databases

I have 18 companies each with their own databases. All these are stored in the same location. The databases all have the same database structure. [The 18 databases are subsidiaries of a parent Company]

How can I have these treated as one database when using Power Query (or Power BI)? I want to be able to choose tables and have the tables from all, or a selection, of the databases queried. The end target is I get the selected tables from the database(s) I selected in a data model.

Hi @AllisterB

When you choose to get data from your selected database, there is an option to write a query, skip that step and select “Ok” and on next instance, you’ll see all your tables is a database and you can select your tables from there.

Thanks
Ankit

Hello @AllisterB,

Not sure why would you need that but it is possible and I would do it like this:

  1. Create a single connection to the SQL server, SO NOT choose any DB and set it to NOT LOAD
  2. Reference first connection, connect to each DB and NOT LOAD
  3. For each DB connection add a custom column (with the same name) with the DB Name
  4. Append ALL individual DB queries in a single table with ALL DATA from ALL DB
  5. Reference this One table and choose tables you need for the model

Please see below the Query Dependency Sample I built for your reference

If you liked my solution please give it a thumbs up :+1:.

If I did answer your question please mark my post as a solution :white_check_mark:.

Thank you !

Cristian

3 Likes

Thank You Ankit
I am wanting to combine all 18 Databases so that I can easily select the same table(s) (eg the Customer Table) without having to select the table(s) in each of the 18 databases separately.

Can you explain how you solution will do this

Thank You.

Allister

Hi @AllisterB,

It’s not clear to me if you have in fact one DB containing 18 Companies or not and a suitable answer will depend on what type of DB you actually have…

If you have one DB try to connect to the outer DB level so without selecting a single Company. You should see a list of Companies with all available Tables across the columns for that DB.
Like @cristian.angyal suggested I would keep this base query in tact and (right-click) Reference it for all others.

When you’ve created a Reference, select the Companies you need from the List, then select a single Table by removing all other columns. That way you end up with a 2 column table the first column contains the Company and the second the data in a nested table object. All that’s left is to expand the selected Table column containing the nested Table object and you are done.
Repeat the process for every table you need.
.

However if you’ll need to connect to each DB and Company separately. Consider the use of Dataflows if you are on a Pro licence - that way you’ll only have to set up the Combined tables for all Companies once and not for each report you’re going to built…

I hope this is helpful.

1 Like

Thank You, Melissa

For Clarity, I have 18 subsidiary companies each with their own db. The 18 dbs have the same structure…

So am I right to say that for Power BI I will need to use Dataflow - is there an alternative.

What is the approach if I want to use Power Query

Thank You

Allister

For the creation of Dataflows you will use Power Query online… So either way Power Query rules :wink:

The difference is that you will set up and design once and can than use the output over and over again because it is stored in Azure Datalake gen 2. If you do a search on Dataflows within the forum you’ll find several topics with more details and links to MS documentation.

I hope this is helpful.

1 Like

Would the easiest solution to have Database 19.
This then has a view called customer and the view is created as
Select * from Database1.Customer
UNION
Select * from Database2.Customer
UNION
…
Select * from Database18.Customer

And then you just query one view. This assumes there is no issues with security etc.
E

1 Like

Hi @AllisterB, did the response provided by the contributors help you solve your query? If not, how far did you get, and what kind of help you need further?

This looks interesting… can expand / explain how to do the Custom Column in point 3. What is the code / function used for this.

thanks

Allister

Hi @AllisterB,

You can find “Custom Column” on the “Add Column” tab on the Ribbon.
This will open a dialog box where you can enter your logic in M code, for example:

This will create a DB Name column and when you Expand in step 4, that Name will be filled on all rows from that DB, allowing you to filter, slice and dice your data on Company level.

Alternatively you could leave the DB connection as is AND add a Company dimension table with that DB Connection and DB Name instead.

1 Like

Thank You Melissa (and cristian)

The solution looks very straightforward, When would one use the Dataflows approach rather than this approach - what factors would need to be considered in making the chioce.

Thanks

Hi @AllisterB,

Some advantages of dataflows in my view are:

  • design and set up the data for a specific Table just once.
  • the bulk of the transformations is done when you connect to that (dataflow) table in your model
  • it speeds up the building of new models (because the dataflow tables ready for consumption)
  • consistent output for each table: columns, datatypes, naming conventions etc.
  • easy to maintain, one dataflow instead of every report containing some version of a table.

Here are some links.

2 Likes

Hi @AllisterB,

I also encourage you to read documentation from Radacad, they nicely explain Dataflow & Dataset: https://radacad.com/?s=dataflow

I am also in the process of combining data sources with PBI Online Dataflows, and then aim to structure dataset for relationships and DAX formulas.

Best,
Alexandre

1 Like

I now find out that the 18 dbs are SQL dbs but are not on a SQL server. How could cristians’ approach work in this siuation?

Thanks @Melissa!

Hey @AllisterB.
Could you provide more details on SQL DBs not stored on SQL server?
Not sure I follow