Customers Table

Hi

I have a Customers table and a Sales table.

The Customers table is the whole database. I would like to reduce it so that it only contains the Customer IDs that are relevant to the Sales table - that is, if the Customer ID is not in the Sales table then we don’t need that Customer in the Customer table.

Not sure how to do this as it would have to be dynamic. That is, every time the data is refreshed there could be new sales from customers not in our current truncated Customer table that would need to be added.

Is this something that could be sorted in the Query Editor or would it need to be done in Power BI?

Thanks.

Before you start to design a model it’s important to keep a few things in mind:

  1. The business process
  2. The grain of the model
  3. The Dimension tables
  4. The Fact tables

I like to “only bring what I need” but mostly that’s reducing the number of columns not so much the number of rows. Generally this reduces the file size and improves performance, so Power Query is the way to go in my opinion.

To illustrate I made a small test data set with just 2 tables (customers and sales).

Loaded it into Power Query and added a custom column to Customers with List.Contains

Finally add a filter, so you’re left with only the TRUE’s and you’re done. You can remove the KeepRows columns after the filter or just hide it in your model. Here are my samples.

Data Customers Table.xlsx (10.7 KB)

Customers Table.pbix (30.0 KB)

If you using power query to connect to a customer SQL database you are querying the entire Customer list for the first time. You can filter down the rows but you need to fetch the list of customer ids from the sales table.

You can make the choice of customers dynamic by right clicking the CustomerId “coming from the Sales table” and selecting “Add new Query”. This will give you a query list where only the customerid sales exist and each time you refresh the data the list will be updated. Again right click and remove duplicates so you have a unique list.

You can now use this list to filter the CustomerId on the Customer table but it needs to be turned into a parameter first. Create a new parameter as shown below:

Once parameter is created filter the customerid on the customer table using the contains option as shown below:

Now the list should filter the customer id’s based on the dynamic list from the sales table. Let me know how you go

Hi Meliisa

Thanks. I tried your suggestion and it works - although it does slow down the refreshing.

I will also try to reduce the customer list at source.

Jonathan

Hi Garry

I’m using CSV files. Will I be able to do the same thing with CSV files?

Thanks.

It feels very inefficient but in Query Editor I merged the Customer table into the Sales table.

This removed the customers I didn’t want.

I then created a Location table and a Customer table from the Sales table and removed unwanted columns from the three tables.

I then imported into Power BI. It seems to be working but does feel like an unusual way to remove the unwanted customers.

Can you see if adding Table.Buffer improves the query speed? Example below

let
Source = Excel.Workbook(File.Contents(FileLocation), null, true),
tCustomers_Table = Source{[Item="tCustomers",Kind="Table"]}[Data],
#"Changed Type" = Table.TransformColumnTypes(tCustomers_Table,{{"Customers", type text}, {"Name", type text}, {"Adres", type text}}),
BufferSales = Table.Buffer(Table.Distinct(tSales[[Customers]])),
#"Added Custom" = Table.AddColumn(#"Changed Type", "KeepRow", each List.Contains(BufferSales[Customers], [Customers]), type logical ),
#"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([KeepRow] = true))
in
#"Filtered Rows"

The Table.Buffer(Table.Distinct(tSales[[Customers]])) will load the distinct Customers in memory - you need the double square brackets to turn it into a Table because a single Column is of type List.
As you can see, you do not reference the previous step in the BufferSales line but in the next line you reference the step before BufferSales.

@KieftyKids this should work with CSV files. Importing CSV files should be fairly fast. How many customer records do you have?

Once the data is power query you can create a dynamic customer sales list and filter the customer table.

You don’t need extra steps to merge the two data tables and then to create a third table with the results. It would obviously increase the size of your model. How big is your pbix file at the moment?