Need help in Power Query Removing Duplicates, below is the logic for duplicates removal:
[PS Status] = Complete & Sent and [Date Out] is in Current Month then Keep the row which is latest. If [Date Out] is null check the Report Date and keep the latest row based on [Report Date] column.
[PS Status] = Closed then check the [Date Out] if its null check [No Bid Date ] and if it’s in current month keep that row otherwise check the Report Date and keep the latest Report Date Row.
Thank you!
Hi @kurmi.nikhil3886 ,
let
Source = YourSourceHere,
AddIndex = Table.AddIndexColumn(Source, “Index”, 1, 1, Int64.Type),
SortTable = Table.Sort(AddIndex, {{“PS Status”, Order.Descending}, {“Date Out”, Order.Descending}, {“No Bid Date”, Order.Descending}, {“Report Date”, Order.Descending}}),
GroupTable = Table.Group(SortTable, {“PS Status”}, {
{“KeepRows”, each
let
FilterCompleteSent = if [PS Status] = “Complete & Sent” then
let
DateOutCurrentMonth = Table.SelectRows(, each [Date Out] <> null and Date.IsInCurrentMonth([Date Out])),
LatestDateOut = if Table.RowCount(DateOutCurrentMonth) > 0 then Table.First(DateOutCurrentMonth) else null,
ReportDateNull = if LatestDateOut = null then Table.First(Table.SelectRows( , each [Report Date] <> null and Date.IsInCurrentMonth([Report Date]))) else LatestDateOut
in
ReportDateNull
else null,
FilterClosed = if [PS Status] = “Closed” then
let
DateOutNull = Table.SelectRows(, each [Date Out] = null),
NoBidDateCurrentMonth = Table.First(Table.SelectRows(DateOutNull, each [No Bid Date] <> null and Date.IsInCurrentMonth([No Bid Date]))),
LatestReportDate = if NoBidDateCurrentMonth = null then Table.First(Table.SelectRows( , each [Report Date] <> null and Date.IsInCurrentMonth([Report Date]))) else NoBidDateCurrentMonth
in
LatestReportDate
else null,
Result = if [PS Status] = “Complete & Sent” then FilterCompleteSent else FilterClosed
in
Result
}}
),
ExpandTable = Table.ExpandTableColumn(GroupTable, “KeepRows”, Table.ColumnNames(Source))
in
ExpandTable
For detailed support and tailored assistance, you can also post your query on Data Mentor (https://mentor.enterprisedna.co/queries ).
Cheers,
Enterprise DNA Support Team
@kurmi.nikhil3886 ,
You can check Data Mentor thread available at:
This analysis guides on deduplicating data in Power Query by keeping rows with the latest dates, based on conditions related to columns like [PS Status], [Date Out], [No Bid Date], and [Report Date]. Detailed code and steps are provided
Hope it helps.