Near Duplicates - Power Query

Ericadyson · September 24, 2020, 7:54pm

OK. This can’t be that difficult. I’m 50% of the way there but not 100%

I have COVID 19 people that have gone into isolation more than once, but I want to analyze the date difference/s.

So, I was fine with using PQ to group by ID number and get the date for each “duplicate” or even “triplicate” row but how do I compare the dates between rows with the same ID?

So my data looks like this

A date 1
A date 2
B date 1
B date 2
B date 3
C date 1
C date 2

All help would be much appreciated (of course)!
Thanks
Erica

Melissa · September 24, 2020, 8:15pm

Hi @Ericadyson,

It’s not clear to me what you’re after, especially since the sample provided doesn’t seem to match your description about multiple rows nor does it include actual dates making it hard to work off.

Please provide more details and a mock up with desired results would be greatly appreciated as well.

EnterpriseDNA · September 25, 2020, 12:36am

Hi @Ericadyson, we aim to consistently improve the topics being posted on the forum to help you in getting a strong solution faster. While waiting for a response, here are some tips so you can get the most out of the forum and other Enterprise DNA resources.

Use the forum search to discover if your query has been asked before by another member.
When posting a topic with formula make sure that it is correctly formatted to preformated text </>.
Use the proper category that best describes your topic
Provide as much context to a question as possible.
Include demo pbix file, images of the entire scenario you are dealing with, screenshot of the data model, details of how you want to visualize a result, and any other supporting links and details.

I also suggest that you check the forum guideline https://forum.enterprisedna.co/t/how-to-use-the-enterprise-dna-support-forum/3951. Not adhering to it may sometimes cause delay in getting an answer.

Ericadyson · September 25, 2020, 4:37am

Hi Melissa

Thanks for reaching out. This is what I want to do. I have people who have been insolation more than once. I want to check the days’ difference for each of them. So when I say duplicates… they aren’t “duplicates” in the strict sense but their ID appears more than once in the list of people who are or who have been in isolation is a given locality.

I have grouped the IDs and identified where they appear more than one. Now say I have 3 rows for that ID and 3 different dates … how do I compare each date with the previous, and then move onto the next person with multiple isolations?

Does that make it a bit clearer? Hope you can help me out.

Thanks so much.

Erica

Melissa · September 25, 2020, 11:11am

Let me know if you have any questions.

Ericadyson · September 26, 2020, 4:52am

Hi Melissa

Thanks for getting back to me. I did go through your solution… great but I had difficulty adding the first date as a second column… What did you mean by " Note that if both dates should be included in the day count you’ll need to add 1 to the calculation.". Where do I add +1 and why does it work?

Melissa · September 26, 2020, 12:34pm

Hi @Ericadyson,

A couple of techniques showcased in this video could be helpful in your scenario as well, you may want to review that if you have the time.

As for what solution to go for…
Let’s start by stating that it’s not a competition to write complex- or as little M as possible. Choosing a solution you understand which performance wise is acceptable is very important. I fully agree with you there.

Kudos to you for exploring other solutions to your own question that is so important for increasing your understanding of M - there is no better way than just diving in and explore…

Ericadyson · September 26, 2020, 12:56pm

Ooops. I meant to write "you’re a real powerhouse WITH endless patience! So so sorry… for writing the wrong word.