Ranking Equal Values based on alternative parameters

Anferrig · April 19, 2020, 7:01pm

Dear Enterprise DNA forum,

I am working with some football data and have some issues in understanding how to solve a sorting/ranking issue. Basically, when two teams have the same number of points I would need them to be ranked using as second parameter, the goals difference. And potentially, there could be the need of using a third parameter if also the goal difference is the same (not in this case though).

What would you suggest?

Thanks a lot in advance for your support

Ciao

Mark · April 19, 2020, 7:24pm

Hi @Anferrig,

I think this post from Sam should give you what you need:

It essentially creating one measure to give a rank based on 3 combined ranks.

Cheers

Mark

BrianJ · April 19, 2020, 8:03pm

@Anferrig,

Here’s how I do this:

do a RANKX of your secondary criteria first, but make sure you ASC as the sort order, so the best performer gets the highest rank, not a rank of 1
CONCATENATE your primary metric (in this case points) with the result of the reverse RANKX above using “.” as your delimiter. For example, Juventus would get a value of 90.20 (90 goals, and the 20th reverse ranking in 1). Be sure to wrap your CONCATENATE statement in a VALUE() function so that it returns a number
now do your primary RANKX on the values created in 2) above. This will properly separate out ties in points based on the secondary criteria, unless for the tied point values the secondary criteria are also tied. In that case, you can add in a second tie-breaking criteria and concatenate that onto the value created in 2) above.

I hope that makes sense. If you have any problems, please just post your PBIX and we can work through the specific DAX.

Brian

Anferrig · April 19, 2020, 8:43pm

Thanks a lot for your support @Mark

Will have a look at it soon and try to make it work for my case

Ciao

Anferrig · April 19, 2020, 8:46pm

Thank you very much for your support @BrianJ

The solution is great, it works perfectly and it’s also pretty easy to implement!

Really Helpful

Grazie Mille

Ciao

BrianJ · April 19, 2020, 9:04pm

@Anferrig,

Great – glad to hear that worked well for you. Just looking at your data made me realize how much I miss sports during this quarantine…

Brian

Anferrig · April 20, 2020, 9:27am

@BrianJ

Same here, missing sports a lot, especially Tennis in my case. PowerBI is the only ‘sport’ I am watching and practicing these days Perfect time for improving my skills, really happy to be member of Enterprise DNA with it’s enormous learning potential

Regarding the topic above, it’s getting tougher as the overall goal difference is only the 3rd alternative parameter, here them in order:

Do you have any idea on how to work out the Head to Head parts? I guess there’s the need to work with Virtual Tables, true? Was thinking to refer to something like this, but do not know how to include in the calculations (just picked random example):

Thanks a lot for your help

Best Regards

BrianJ · April 20, 2020, 1:30pm

@Anferrig,

OK, that’s an interesting curveball (to mix sports metaphors…). While I’m a huge fan of using virtual tables for a lot of things, I don’t think that’s the way to go here. Because the solution is going to need a regularly updated supporting table with the head-to-head information in it anyway, we can use a physical table for our head-to-head calculations, since it doesn’t need to be dynamic (i.e., change within the course of a session – recalculating upon refresh/reload will be fine here) and physical tables afford us a level of simplicity and flexibility that virtual tables don’t, which will be beneficial since the solution is going to get complicated as is.

Conceptually, I think what we need to do here is evaluate every head-to-head matchup for both records and goal difference, and assign the results to the proper decimal places to be added to the initial points metric before doing the primary RANKX. I think this will use an ADDCOLUMNS (CROSSJOIN) ) combination to evaluate all possible combinations of head-to-head against the supporting table.

If this doesn’t provide sufficient information for you to move forward, please post a copy of your PBIX with the head-to-head statistics information included as a disconnected supporting table in your data model, and I’ll be glad to work with you on the specific DAX to resolve this.

Very cool problem - enjoying working with you on this.

Brian

Anferrig · April 20, 2020, 7:00pm

Hello @BrianJ

Thanks a lot for your support, maybe too big challenge for my current Power BI knowledge and skills, but really great learning opportunity

Here attached the whole PBIX with the head-to-head statistics included as disconnected tables. Hope I understood what you meant. I created two of them, one keeping also the match day (giornata) detail, maybe it could help look at the ranking dynamically over time

Really appreciate your help, thanks a lot once again

Serie A 18-19 II.pbix (111.2 KB)

BrianJ · April 21, 2020, 4:53am

@Anferrig,

This actually might be the most difficult problem I’ve ever seen on the forum. Here are some of the things that make this so tough:

Per your results below, you can have multiple different ties, and each of those ties can consist of a different number of teams. Thus your measures have to be able to identify the different “pools’” of ties, construct the round robin dynamically based on the number of teams in each tie pool, and evaluate those results in a way that breaks the ties but doesn’t alter the correct order.
In addition, unlike goal differential, I haven’t yet figured out a way to utilize the head-to-head secondary criteria until AFTER you’ve run the first level ranking, since you won’t know who is involved in ties (and thus how to filter your huge cross-joined head to head table down to the relevant records) until after you run your initial RANKX
Finally, even if you do 1) and 2) correctly, is still possible you end up with ties after the second level RANKX, especially in the case of the three way tie, like you have below. DAX is unable to do recursion without the assistance of Python functions, so you will need to keep building conditionals until you are fairly certain that the possibility of subsequent ties is minuscule.
There is absolutely no prior work in this area I could find to build on. Did all sorts of searches on RANKX, DAX, ranking, matchups, head-to-head, etc. and found nothing. This is literally the first time I’ve ever come up empty on a search like this related to DAX and Power BI.

If you still want to give this a go, I’ll be glad to assist but I’ll admit at this point I don’t have anything more than a very, very vague idea of how to craft a solution for this one. I think the best we can do is break it down piece by piece, starting with the identification of the different ties pools.

Dumb question - how do you interpret the results of these head-to-head matchups. Is the first column the squad score, and the second the opponent score? If so, what’s the third column?

Let me know what you think of all this, and whether you still want to give it a go.

Brian

Anferrig · April 21, 2020, 4:05pm

Dear @BrianJ,

thanks a lot for your feedback on this, I had the feeling it was a pretty complex problem, but was not sure to what extent as had no idea about how to solve it. Now it’s much clearer and I would say, it looks almost impossible for my skills

The only solution I found around is an excel one, without VBA, but looks very complex and it’s based on a large number of tables. Am still studying it to understand how it works. Anyhow, even when done with the understanding, I won’t be in the position of saying that even if doable in excel, it can be done also in PBI… but there can be nice ideas hidden there. I am posting it here in case you would like to see it (have just randomly populated the match scores).

Regarding the last question, in order they are points (obtained in the direct HtH matches), goals scored and goals conceded.

Here the example:
Atalanta 6 points vs Bologna (2 Wins), 6 goals scored, 2 conceded
Bologna 0 points vs Atalanta (2 Losses), 6 goals conceded, 2 scored

I would say, if you agree, we can try to work some little pieces to see if we get somewhere, maybe with the ideas from the excel file. If too demanding in terms of effort or time, we can abandon it shortly

Let me know what you think, anyway I am really grateful to you for the time spent on this so far

Thanks a lot

Ciao

SerieA_2019-2020.xlsx (203.7 KB)

BrianJ · April 22, 2020, 5:14am

@Anferrig,

Sure – I’m game to give this a go if you want to. I definitely agree with the idea of attacking this one in small pieces. It’s not surprising to me that the solution was found in Excel. The one thing I’ve found much tougher in Power BI than Excel is analysis that involves vertically traversing a column (e.g., comparing a value in one row to the row above it). In Excel, you can obviously just point to each cell individually. In Power BI, getting the value in the previous row looks like this:

Previous Value ToTOU =

VAR PreviousRow =
TOPN(
    1,
    FILTER(
        pTable,
        pTable[MonthnYear] < EARLIER( pTable[MonthnYear] )
    ),
    [MonthnYear],
    DESC
)

VAR PreviousValue =
MINX(
    PreviousRow,
    pTable[TotOU]
)

RETURN
PreviousValue

And this solution is going to involve a LOT of vertical traversing in identifying and analyzing ties.

If you aren't already very familiar with it, I would recommend deep diving into the use of EARLIER, and the more modern construct of it using variables:

http://portal.enterprisedna.co/courses/108877/lectures/13346826

This thread may also be quite relevant and helpful:

https://forum.enterprisedna.co/t/count-consecutive-months-that-a-measure-was-positive-or-negative/5725/15?u=brianj

See what you think.  Would be interesting to see how far we and others on the forum who are interested can get on this...

- Brian

Anferrig · April 23, 2020, 12:39pm

@BrianJ

Thanks a lot, I am going to deep dive into the suggested topics, to get more familiar with them and let’s see

Will come back posting here as soon as I manage to sort out and move on with something

Thanks a lot

BrianJ · April 25, 2020, 4:34am

@Anferrig,

Check out this video just posted today. Super relevant to our identification and counting of the “tie pools” in the ranking results:

Brian

Anferrig · April 25, 2020, 7:17pm

@BrianJ

Super interesting and really useful.

With this technique I have been able to do a kind of sub sorting of the teams with equal number of points based on a second type of ranking. Have tried to combine Points and Goals Difference and it works perfectly.

Most of the columns are not needed, they are there just for me to understand how to proceed:

If I understand correctly, a combination of the Earlier formula could be also used to extract the subset of Teams (with equal points) for which the other conditions (Head-to-Head) should be tested. Working just with those pools should help reducing the amount of evaluations needed toward the HtH results tables (as not needed for all the Teams but only for those with equal points).

What I still have no idea about is how to combine the information in the HtH table with these small virtual pools and what key to use

Let’s imagine for this pool:

We would need to use this table to extract the HtH Data:

For pools of 3 teams is even more complex as they should be considered in pairs, am I right?

Thanks a lot

Ciao

BrianJ · April 28, 2020, 6:36am

@Anferrig,

Great progress. You clearly mastered the use of EARLIER (one of the more confusing DAX functions IMO) quickly, and are using it exactly as envisioned to identify the tie pools. I started working on a CROSSJOIN application of the three-way tie pool when I realized if we are going to make this work end-to-end, we are going to need to take a different approach. Rather than starting with DAX, I think we should pull the data directly into Power Query from the web, clean it, identify the tie pools via PQ Groupby, break out the tie pool rows into separate supporting tables, merge them with the head-to-head data all still in PQ and THEN perhaps start our DAX work.

Tonight, I worked out how to automatically identify the tie pools, number them, count the members in each pool, and identify the specific members using the PQ Group by, Aggregate and Index functions. I’ve attached my PBIX so that you can look at the individual applied steps, and here’s the M code:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("TZDBDoIwDIbfZWcOikLgqEaNRo0RPREOTVi0ybaSDTj49K4bCR6/b3+7tnUtzsMoTT84kYhyIZqkFjfoSKHnogy86UGB6cGbPJqT6aWd8Yr+nbEI+CAdsnmgJ1k0xLwKfIEvMmaxtgLdtT7CFVlMbEnR27BYr6eMc4OXbGLk1aKRTs6i6kAxpYEOSNYvhbHJMrgdvBWCxdncweq/wFEaYlzFLfZ6usLEB0vOL2L40zSLLT8oR54q9Ydrfg==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Squad = _t, Points = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Squad", type text}, {"Points", Int64.Type}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Points"}, {{"All Rows", each _, type table [Squad=text, Points=number]}, {"Dist", each Table.RowCount(Table.Distinct(_)), type number}}),
    #"Expanded All Rows" = Table.ExpandTableColumn(#"Grouped Rows", "All Rows", {"Squad"}, {"Squad"}),
    #"Grouped Rows1" = Table.Group(#"Expanded All Rows", {"Points", "Dist"}, {{"All Rows", each _, type table [Points=number, Squad=text, Dist=number]}}),
    #"Filtered Rows" = Table.SelectRows(#"Grouped Rows1", each ([Dist] <> 1)),
    #"Added Index" = Table.AddIndexColumn(#"Filtered Rows", "Index", 1, 1),
    #"Renamed Columns" = Table.RenameColumns(#"Added Index",{{"Index", "Tie Pool Number"}, {"Dist", "Number Squads in Tie Pool"}}),
    #"Expanded All Rows1" = Table.ExpandTableColumn(#"Renamed Columns", "All Rows", {"Squad"}, {"Squad"}),
    #"Reordered Columns" = Table.ReorderColumns(#"Expanded All Rows1",{"Squad", "Points", "Tie Pool Number", "Number Squads in Tie Pool"})
in
    #"Reordered Columns"

Next step I think is will need to figure out how to dynamically break down each tie pool automatically into a separate table and merge it with the head-to-head data. Agree?

Long ways to go, but I think we’re making progress.

Do you have a website where we can pull all the data we need directly in through the PBI web connector?

Brian
eDNA Forum - Ties an HtH Sol1.pbix (59.8 KB)

BrianJ · April 28, 2020, 2:52pm

@Anferrig,

OK - it looks like the Table.Split function is going to be the key that gets us past the next milestone in this solution.

https://www.poweredsolutions.co/2019/12/19/split-segment-partition-section-your-table-in-power-bi-power-query/

Do you want to take a crack at implementing the Table.Split process on the Tie Pools table I created above?

Brian

Anferrig · April 28, 2020, 7:15pm

@BrianJ

Thanks a lot Brian, fantastic.

Once again, many great learning points: had never used the columns profile, distribution and quality before (really cool feature) and even if I was familiar with the group by transformation, had never used the All Rows operation. It’s super useful as it allows grouping, get the column with the summarized info needed, and ungrouping again to show up something that was previously grouped like the teams in our case (awesome technique).

So, on the first piece I am aligned, have created both points and goal difference tie pools tables:

The second part, yes, I would like to give it a try. I am going to review the posted article. I have used in the past some splitting techniques, hope I manage to apply it to our case. Will come back to you soon with the elaborated file and the connection to the data source.

Ciao

BrianJ · April 28, 2020, 8:39pm

@Anferrig,

Source data link and file will be great, but hold off for a moment on the Table.Split approach. Player 3 (@Melissa) has now joined the game, and has a better approach to the merge and pairing problem then using Table.Split, which she will be posting soon…

Brian

Melissa · April 28, 2020, 9:37pm

Hi @Anferrig,

I’m late to the party but hope I can contribute

Thought @BrianJ did an amazing job pushing back some of the analysis to Power Query - made a few changes to limit the number of GroupBy transformations and thought I could make a suggestion in how to handle the Tie Pool merge with the Head2Head table.

So I added a Custom Column, created a variable for the Squad and Opponent from the Outer table (HtH Sample (2)) and inside Table.SelectRows I called the Tie Pools (2) query and filtered on the Squad from the Inner table ( Tie Pools (2)), so to end up with max 2 rows.

I hope this is helpful and that we can get this web connection in place… here’s the updated file.
eDNA Forum - Ties an HtH Sol1.pbix (87.7 KB)