Latest Enterprise DNA Initiatives


Function needed for dynamically generated distance table

I am looking for some assistance in creating a distance table in Power Query that is dynamically generated for each combination of locations from my locations table.

My data model consists of the following table where every store and distribution centre is listed. A short sample is presented here. In reality my table holds 45.000 locations with their geographic longitude and latitude and receives a weekly update with new locations being added and others being removed.


From this table I would like to generate another table listing the distance between each and every location, with the condition that the distance is less than 3 kilometers. Given the 45.000 entries I reckon this will generate a table with max. 300.000 – 500.000 rows.

The resulting distance table should look something like this:

I managed to insert the proper formulas to calculate the distance between two geographic points (latitude, longitude) in Power Query. See attachment.

The main issue is how to generate a table holding all possible combinations of locations, with the condition that only combinations with a distance less than (e.g.) 3 kilometer should be kept.

I see how functions are used in Power Query to generate date tables, with a dynamic number of rows. Perhaps invoking a function is the correct way to generate this table?

Perhaps an approach is to have a function that generates a table with all possible combinations, calculate the distance and afterwards delete all rows with a distance > 3 KM. But it seems rather brute force to generate 45.000 x 45.000 = 2 billion rows in order to keep only some 400.000 of them.

Is it possible to have a function that already calculates the distance and based on the outcome creates a row for it or not? And how to update this distance table when de locations table gets a refresh?

Hopefully someone can help me out with this.

Regards,

SebastiaanForum question M-code distance table.pbix (34.8 KB)

Hi Sebastiaan, in order to calculate whether the distance is > 3 km, the initial calculation will have to be executed, so indeed 45000x45000 variations.
As a logistics consultant in location intelligence (in the Netherlands) I regularly come across this type of projects. With these numbers I turn to other software for the distance calculation, ( never tested this amount though) which on top gives me the actual distance as well as straight line distance. The distance table is then used in the datamodel. Is this an option for you?
Paul

Hi Paul, I see your point. In the current situation a solution through other software is not preferred. Perhaps the issue shouldn’t focus to much on the distance aspect.

The challenge is to create a Power Query function that generates a table with all combinations of location-ID’s that fit a certain condition. Instead of distance, the condition could also have been something like: ‘combinations with the same building colour’, if building colour was a given attribute in the location table.

Sebastiaan.

Hi Sebastiaan,
ok, was obviously focussing on distance. In this case you could create your selections in a separate table and merge this with the main table, so you keep the “valid” records.
If you create the tables in PQ from the source data, it will become dynamic.
Paul

Yes, you can do a cross join in Power Query to generate all the combinations by simply adding a custom column that reference the locations table. This could then have calculations and filters applied to it.

Check out this article that explains the concept.

Hi @Sebastiaan, we’ve noticed that no response has been received from you since the 31st of May. We just want to check if you still need further help with this post? In case there won’t be any activity on it in the next few days, we’ll be tagging this post as Solved. If you have a follow question or concern related to this topic, please remove the Solution tag first by clicking the three dots beside Reply and then untick the checkbox. Thanks!