Power Query- Solution

Hi guys,

I have a question. I have 4 rows in a table with the same name and email id.

These 4 records are basically enquiries on the website.

The first enquiry came at (Created On Date) 06 then 08 and then 09.

Because there is just 1 enquiry on 06 and 1 enquiry on 08 then we need to capture both enquires but on 09 we have 2 enquires we need to capture only one enquiry that is basically the enquiry where Contact id and Opportunity is not null otherwise we need to take the first enquiry (oldest enquiry).

Any idea how to come about this in Power Query please see the sample dataset attached.
Data Need to sorted.xlsx (9.7 KB)

1 Like

Hey Federico,

are you only concerned by the oldest enqury by date or will you want it to consider the oldest enquiry per Fullname?

1 Like

Thanks David,

The name is not important thus the oldest enquiry.

Federico

1 Like

I’ll need to come back to this later, Valentines meal. So if anyone else can jump in on this.

I’d create a date only field, then group by on that field using the minimum of the date enquired which has date & time and also use ALLROWS to being back the data. Conditional column to compare new grouped date column vs min date/time enquired. Then you will need a conditional column which will flag when you have more than one enquiry on the date and whether the two columns is not null (this was the bit i struggled with )

Fed1.pbix (61.4 KB)

This is a very rough way of doing it, not elegant at all.

DJ

2 Likes

@FPastor ,

Very interesting problem. I think @DavieJoe had exactly the right approach, but in his rush out the door to dinner didn’t have the logic quite right. I picked up from him, built out the logic and created a custom function (RowsToKeep) to process the selected data table and remove the rows that don’t meet the logic condition.

Full solution file attached. Here’s the M code for the custom function:

(EnquiriesTable as table) as table =>
  let
    #"Changed Type" = Table.TransformColumnTypes(
      EnquiriesTable,
      {
        {"LeadID", type text},
        {"ContactID", type text},
        {"OpportunityID", type text},
        {"MYAReference", type text},
        {"DateEnquiried", type datetime},
        {"Fullname", type text},
        {"Email", type text}
      }
    ),
    #"Inserted Date" = Table.AddColumn(
      #"Changed Type",
      "Date",
      each DateTime.Date([DateEnquiried]),
      type date
    ),
    #"Grouped Rows" = Table.Group(
      #"Inserted Date",
      {"Date"},
      {
        {"MinDateTime", each List.Min([DateEnquiried]), type nullable datetime},
        {
          "Grouping",
          each _,
          type table [
            LeadID = nullable text,
            ContactID = nullable text,
            OpportunityID = nullable text,
            MYAReference = nullable text,
            DateEnquiried = nullable datetime,
            Fullname = nullable text,
            Email = nullable text,
            Date = date
          ]
        },
        {"NumEnqOnDate", each Table.RowCount(_), Int64.Type}
      }
    ),
    #"Expanded Grouping" = Table.ExpandTableColumn(
      #"Grouped Rows",
      "Grouping",
      {"LeadID", "ContactID", "OpportunityID", "MYAReference", "DateEnquiried", "Fullname", "Email"},
      {"LeadID", "ContactID", "OpportunityID", "MYAReference", "DateEnquiried", "Fullname", "Email"}
    ),
    #"Added Custom" = Table.AddColumn(
      #"Expanded Grouping",
      "IsFollowedUp",
      each if List.AllTrue({[ContactID] <> "", [OpportunityID] <> ""}) then 1 else 0
    ),
    #"Added Custom2" = Table.AddColumn(
      #"Added Custom",
      "IsMinRecord",
      each if [DateEnquiried] = [MinDateTime] then 1 else 0
    ),
    #"Grouped Rows1" = Table.Group(
      #"Added Custom2",
      {"Date"},
      {
        {
          "AllData",
          each _,
          type table [
            Date = date,
            MinDateTime = nullable datetime,
            LeadID = nullable text,
            ContactID = nullable text,
            OpportunityID = nullable text,
            MYAReference = nullable text,
            DateEnquiried = nullable datetime,
            Fullname = nullable text,
            Email = nullable text,
            NumEnqOnDate = number,
            IsFollowedUp = number,
            IsMinRecord = number
          ]
        },
        {"SumFollowedUp", each List.Sum([IsFollowedUp]), type number}
      }
    ),
    #"Expanded AllData" = Table.ExpandTableColumn(
      #"Grouped Rows1",
      "AllData",
      {
        "MinDateTime",
        "LeadID",
        "ContactID",
        "OpportunityID",
        "MYAReference",
        "DateEnquiried",
        "Fullname",
        "Email",
        "NumEnqOnDate",
        "IsFollowedUp",
        "IsMinRecord"
      },
      {
        "MinDateTime",
        "LeadID",
        "ContactID",
        "OpportunityID",
        "MYAReference",
        "DateEnquiried",
        "Fullname",
        "Email",
        "NumEnqOnDate",
        "IsFollowedUp",
        "IsMinRecord"
      }
    ),
    #"Added Custom1" = Table.AddColumn(
      #"Expanded AllData",
      "Keep",
      each
        if [SumFollowedUp] > 0 then
          (if [IsFollowedUp] = 1 then 1 else 0)
        else
          (if [IsMinRecord] = 1 then 1 else 0)
    ),
    #"Changed Type1" = Table.TransformColumnTypes(
      #"Added Custom1",
      {
        {"NumEnqOnDate", Int64.Type},
        {"IsFollowedUp", Int64.Type},
        {"IsMinRecord", Int64.Type},
        {"SumFollowedUp", Int64.Type},
        {"Keep", Int64.Type}
      }
    ),
    #"Filtered Rows" = Table.SelectRows(#"Changed Type1", each ([Keep] = 1)),
    #"Removed Columns" = Table.RemoveColumns(
      #"Filtered Rows",
      {"NumEnqOnDate", "IsFollowedUp", "IsMinRecord", "SumFollowedUp", "Keep"}
    )
  in
    #"Removed Columns"

Hope this is helpful.

  • Brian

eDNA Forum - PQ Rows to Keep Logic Solution.pbix (109.8 KB)

3 Likes

@FPastor Here is my solution as well:
Federico - Rows To Keep.pbix (23.2 KB)

let
    Source = Table.FromRows (
        Json.Document (
            Binary.Decompress (
                Binary.FromText (
                    "hdDNSkQxDAXgV5G7njhpm6bJXZn+7Vy6kMssLoIiOI44vj92UBRn4UChUNLzcbIsUzAKlKsAmxCQ9A6CWkA8Ng3WUgw0bSYMrWhFAlelQCvOgVRmQIwYWZQ42Biz6oVTVGDCDhSYQKwGsEyUnZfGlsbY7b1pYOfxlKxb9FuP3l95Nzs/nu6O+/V13I9v6/Hj8H7ztF+fX64fDvtpt1kmqbXLyIPIaEBWEBTNA49wZ6n2WvL4/HW+JUdnEs7+suRKlBFWoZsbu2EZRUJKoNo9JWnYXforRdV4kuRHcjrTZamVoIW6g2yio5N2yDEz5JDMh0Yq551iinyS+FdKM8b/pd0n",
                    BinaryEncoding.Base64
                ),
                Compression.Deflate
            )
        ),
        let
            _t = ( ( type nullable text ) meta [ Serialized.Text = true ] )
        in
            type table [
                LeadID = _t,
                ContactID = _t,
                OpportunityID = _t,
                MYAReference = _t,
                DateEnquiried = _t,
                Fullname = _t,
                Email = _t
            ]
    ),
    DuplicatedColumn = Table.DuplicateColumn ( Source, "DateEnquiried", "DateEnquiriedDate" ),
    ChangedType = Table.TransformColumnTypes (
        DuplicatedColumn,
        {
            { "LeadID", type text },
            { "ContactID", type text },
            { "OpportunityID", type text },
            { "DateEnquiried", type datetime },
            { "DateEnquiriedDate", type datetime }
        }
    ),
    ExtractedDate = Table.TransformColumns (
        ChangedType,
        { { "DateEnquiriedDate", DateTime.Date, type date } }
    ),
    KeepDates = Table.SelectColumns ( ExtractedDate, { "DateEnquiriedDate" } ),
    DistinctDates = Table.Distinct ( KeepDates ),
    MergedQueries = Table.NestedJoin (
        DistinctDates,
        { "DateEnquiriedDate" },
        ExtractedDate,
        { "DateEnquiriedDate" },
        "RemovedDuplicates",
        JoinKind.LeftOuter
    ),
    Custom = 
        Table.AddColumn ( 
            MergedQueries, "Custom", each 
            let 
                Data = _[RemovedDuplicates],
                RowCount = Table.RowCount ( Data ),

                // When there are more than 1 row
                DoesBlanksExists = 
                    Table.SelectRows ( 
                        Table.AddColumn ( Data, "BlankCheck", each List.AllTrue ( { [ContactID] ="", [OpportunityID] = "" } ) ),
                        each [BlankCheck] = false
                    ),
                RemoveBlankCheckColumn = Table.RemoveColumns ( DoesBlanksExists, { "BlankCheck" } ),
                FinalTable = if Table.RowCount ( DoesBlanksExists ) > 0 then RemoveBlankCheckColumn else Data,
                FirstEnquiry = List.Min ( FinalTable[DateEnquiried] ),
                SelectTheFirstEnquiry = Table.SelectRows ( FinalTable, each [DateEnquiried] = FirstEnquiry ),
                RemoveDateColumn = ( SourceTable as table) as table => Table.RemoveColumns ( SourceTable, {"DateEnquiriedDate"}),
                RowCountCheck = if RowCount = 1 then RemoveDateColumn ( Data ) else RemoveDateColumn ( SelectTheFirstEnquiry )
            in
                RowCountCheck
        ),
    RemovedOtherColumns = Table.SelectColumns ( Custom, { "Custom" } ),
    ExpandedCustom = Table.ExpandTableColumn (
        RemovedOtherColumns,
        "Custom",
        {
            "LeadID",
            "ContactID",
            "OpportunityID",
            "MYAReference",
            "DateEnquiried",
            "Fullname",
            "Email"
        },
        {
            "LeadID",
            "ContactID",
            "OpportunityID",
            "MYAReference",
            "DateEnquiried",
            "Fullname",
            "Email"
        }
    ),
    ChangedType2 = Table.TransformColumnTypes (
        ExpandedCustom,
        {
            { "LeadID", type text },
            { "ContactID", type text },
            { "OpportunityID", type text },
            { "MYAReference", type text },
            { "DateEnquiried", type datetime },
            { "Fullname", type text },
            { "Email", type text }
        }
    )
in
    ChangedType2
3 Likes

Thank you David, you know I am bias - your solutions is the best :+1:

1 Like

Thank you very much Antriksh, will test it and keep you posted.

Federico

1 Like

Thanks Brian, will check it out and back to you if I need further clarifications.

2 Likes

It really isn’t my friend lol…but thank you all the same.

@BrianJ & @AntrikshSharma - I will check your solutions out later

1 Like

@DavieJoe unfortunately it does not work the way intended, still my best no solution :raised_hands:

1 Like

Ah, where does it fall down?

Can we help?

1 Like

Hi Antriksh,

Following on this I tested your solution and it works fine but my dataset contains 33,000 rows and take quite a bit to refresh, any idea how to fix or make it shorter?

Thanks once more for your help.

Federico

1 Like

@FPastor What about @BrianJ’s solution? is it fast?

@AntrikshSharma back to you with your answer in a few minutes. Thanks

This is still ongoing… any clue by the pic.

@FPastor The performance can be incresed by 11x but you need to create a Date Column in the Excel file itself.

Tested with 30,000 rows.

Federico - Rows To Keep Optimized.pbix (20.5 KB)

Data Need to sorted (1).xlsx (1.1 MB)

3 Likes

Nice! I think the solution on this and the subsequent optimization would make for an excellent video. Just sayin’…

  • Brian
3 Likes

@BrianJ Yeah, let’s see if it is actually optimized for his original data.

3 Likes