Many 2 many and last value of a group

Roberto · August 21, 2020, 12:41pm

I have two tables. One lists repositories, keyed by the combination of scope, org and repo name, along with some totals. The other lists the users of each repo. Each user has a number indicating their order within that repo (a repo can have multiple users, and a user can be in multiple repos).
I need to get, for each repo, the most recently added user (the user with the highest user order).
I tried a virtual relationship between the two tables using the key as the link, but I cannot get the latest user for each key.
Can you help?

Thanks in advance

Roberto

db.xlsx (17.0 KB) many2many.pbix (29.1 KB)

Rajesh · August 21, 2020, 1:28pm

Hi @Roberto

I created a calculated column on the Repositories table. Please find the logic below.

LastUser =
VAR _key = Repositories[Key]
VAR _Tab =
    FILTER ( Users, Users[Key] = _key )
RETURN
    CALCULATE (
        MAX ( Users[User] ),
        TOPN ( 1, _Tab, Users[User order], DESC )
    )

many2many.pbix (30.6 KB)

JarrettM · August 21, 2020, 1:31pm

@Roberto,

Here is my solution using a measure, not a calculated column. Both solutions are viable, but I prefer a measure over a calculated column as often as possible.
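
The measure itself is not reproduced in the text of this post. As a rough illustration of the measure-based approach only, and not necessarily the formula that was shared, here is a minimal sketch. It assumes the column names used elsewhere in this thread (Users[User], Users[User order]) and that the filter context already narrows Users to a single repo key, for example through a relationship on Key or a Users[Key] column in the visual. The measure name is hypothetical.

```dax
Last User Sketch =
-- Highest user order among the Users rows visible in the current filter context
-- (assumption: that context is limited to one repo key)
VAR _MaxOrder = MAX ( Users[User order] )
RETURN
    -- Return the user holding that order; MAX only serves to pick one value
    -- if several users happen to share the same order
    CALCULATE (
        MAX ( Users[User] ),
        Users[User order] = _MaxOrder
    )
```

Placed in a table visual next to the repo key, a measure like this would be evaluated once per key, so it needs no stored column, at the cost of not being usable in a slicer.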

Thanks
Jarrett

Melissa · August 21, 2020, 1:32pm

Hi @Roberto,

Option 1. Power Query solution

Option 2. Measure
Because your visual uses dimensions from both tables, this measure handles both cases.

Last User v2 = 
VAR MaxValue = CALCULATE( MAX( Users[User order] ), ALLEXCEPT( Users, Users[Key] ))
VAR vTable =
ADDCOLUMNS(
    SUMMARIZE( Repositories, Repositories[Key] ),
    "@LastUser", CALCULATE( MAX( Users[User] ), 
        FILTER( ALL( Users ),
            Users[Key] = EARLIER( Repositories[Key] ) &&
            Users[User order] = CALCULATE( MAX( Users[User order] ), ALLEXCEPT( Users, Users[Key] )))
        )
    )
RETURN

IF( ISINSCOPE( Users[Key] ),
    CALCULATE(
        MAX( Users[User] ),
        FILTER( ALLEXCEPT( Users, Users[Key] ), Users[User order] = MaxValue )
    ),
    MAXX( vTable, [@LastUser] )
)


Here’s your sample file. If you change the FileLocation parameter the queries will be restored.
eDNA - many2many.pbix (38.2 KB)

I hope this is helpful

Melissa · August 21, 2020, 1:54pm

This is amazing!
1 question, 4 solutions (and counting…)

Rajesh · August 21, 2020, 2:00pm

Yes @Melissa

This way we can learn more.

Let me know which one is best for this scenario.

I prefer a calculated column for dimensions, because it can be used as a slicer in the report.
A measure cannot be used in a slicer.

What do you think?

Melissa · August 21, 2020, 2:03pm

If it’s a Dimension/Attribute then go for the Power Query solution.
From a performance standpoint, always try to push that as far back to the source as you can.

Rajesh · August 21, 2020, 2:23pm

Pushing it back is a good idea, but here we are merging two tables.

In practice (from my personal experience), merging tables with large datasets causes performance issues.

Melissa · August 21, 2020, 2:29pm

Hi @Rajesh,

I understand, and if there is a performance issue with the merge, the process can be optimized.
There’s a multitude of ways to solve this in Power Query - I guess “cross that bridge when we get there…”

Roberto · August 22, 2020, 2:28pm

@Rajesh thanks for your elegant solution. My grasp of TOPN still needs work.
So I went through a video where @sam.mckay provides a clear explanation.
Thanks again!

Roberto · August 22, 2020, 2:46pm

@JarrettM thanks for your kind reply. I do not understand how it works (though it actually does).
Does the LASTNONBLANK function rely on some order? The user order can be random.
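
On the ordering question: as far as I understand it, LASTNONBLANK does not depend on the order in which rows were loaded. It walks the values of the column you pass in, in that column's own ascending sort order, and returns the last one for which the expression is not blank. A minimal sketch of that behaviour, assuming the column names from this thread and a filter context limited to one repo key (the measure name is hypothetical, and this is not necessarily the measure from the reply above):

```dax
Last User via LASTNONBLANK Sketch =
CALCULATE (
    MAX ( Users[User] ),
    -- The constant 1 is never blank, so LASTNONBLANK returns the highest
    -- Users[User order] value visible in the current filter context,
    -- which CALCULATE then applies as a filter
    LASTNONBLANK ( Users[User order], 1 )
)
```

So the order values can arrive in any sequence in the data. What matters is that the highest number identifies the latest user.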

Roberto · August 22, 2020, 2:49pm

This is a very lively community!
In the meantime, I found this post where @sam.mckay shows these two techniques:
Calculate the most recent value with DAX in Power BI - #2 by sam.mckay

Roberto · August 22, 2020, 2:55pm

In this scenario, slicing is required, and it will make building the report easier (the focus is on users).
I'll try both in production, where the data is massive. What I do is simple, but the data volume is always overwhelming, so even when several approaches give the same result, some of them are not workable at that scale.

Roberto · August 22, 2020, 3:03pm

Thanks @Melissa! Your DAX solution will take me some time to understand. I'll try the Power Query option as well, but as mentioned, with large datasets I have found that Power Query is not always the most efficient solution.

Melissa · August 22, 2020, 3:31pm

Hi @Roberto,

If you provide a small sample (column headers and query names matter more than the actual data), I'm confident we can improve performance.