Name split in PQ

Hi Team,
I have the data from ERP loaded as attached and I will require the data as per expected description. Can you please guide me how to do that in power query.
Thank you.
Regards,
A
QSplitNamesforPQ.xlsx (16.5 KB)

@ammu,

Give this a go. It’s not elegant, but it works… :grinning:

let
    Source = Excel.Workbook(File.Contents("C:\Users\brjul\Downloads\QSplitNamesforPQ.xlsx"), null, true),
    Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
    #"Changed Type" = Table.TransformColumnTypes(Sheet1_Sheet,{{"Column1", type text}, {"Column2", type text}}),
    #"Promoted Headers" = Table.PromoteHeaders(#"Changed Type", [PromoteAllScalars=true]),
    #"Changed Type1" = Table.TransformColumnTypes(#"Promoted Headers",{{"Description", type text}, {"Expected Decription", type text}}),
    #"Split Column by Delimiter1" = Table.SplitColumn(#"Changed Type1", "Description", Splitter.SplitTextByEachDelimiter({"-"}, QuoteStyle.Csv, true), {"Description.1", "Description.2"}),
    #"Split Column by Delimiter2" = Table.SplitColumn(#"Split Column by Delimiter1", "Description.1", Splitter.SplitTextByEachDelimiter({"-"}, QuoteStyle.Csv, true), {"Description.1.1", "Description.1.2"}),
    #"Split Column by Character Transition1" = Table.SplitColumn(#"Split Column by Delimiter2", "Description.1.1", Splitter.SplitTextByCharacterTransition({"0".."9"}, (c) => not List.Contains({"0".."9"}, c)), {"Description.1.1.1", "Description.1.1.2"}),
    #"Added Custom1" = Table.AddColumn(#"Split Column by Character Transition1", "Custom", each if Text.Length([Description.1.2] ) < 4 then null else [Description.1.2]),
    #"Replaced Errors" = Table.ReplaceErrorValues(#"Added Custom1", {{"Custom", null}}),
    #"Reordered Columns" = Table.ReorderColumns(#"Replaced Errors",{"Description.1.1.1", "Description.1.1.2", "Description.1.2", "Description.2", "Custom", "Expected Decription"}),
    #"Removed Columns" = Table.RemoveColumns(#"Reordered Columns",{"Description.1.2"}),
    #"Added Custom" = Table.AddColumn(#"Removed Columns", "Final", each if [Description.2] = null then [Description.1.1.2] else if [Custom] = null then [Description.2] else [Custom]),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Expected Decription", "Final"}),
    #"Reordered Columns1" = Table.ReorderColumns(#"Removed Other Columns",{"Final", "Expected Decription"}),
    #"Split Column by Character Transition" = Table.SplitColumn(#"Reordered Columns1", "Final", Splitter.SplitTextByCharacterTransition({"0".."9"}, (c) => not List.Contains({"0".."9"}, c)), {"Final.1", "Final.2"}),
    #"Added Custom2" = Table.AddColumn(#"Split Column by Character Transition", "Custom", each Text.Select( [Final.1], {"A".."Z", " ", "&", ".", "/"})),
    #"Reordered Columns5" = Table.ReorderColumns(#"Added Custom2",{"Final.1", "Custom", "Final.2", "Expected Decription"}),
    #"Reordered Columns4" = Table.ReorderColumns(#"Reordered Columns5",{"Final.1", "Custom", "Final.2", "Expected Decription"}),
    #"Reordered Columns2" = Table.ReorderColumns(#"Reordered Columns4",{"Final.1", "Custom", "Final.2", "Expected Decription"}),
    #"Removed Columns3" = Table.RemoveColumns(#"Reordered Columns2",{"Final.1"}),
    #"Replaced Value" = Table.ReplaceValue(#"Removed Columns3","PM E A ","",Replacer.ReplaceText,{"Custom"}),
    #"Merged Columns" = Table.CombineColumns(#"Replaced Value",{"Custom", "Final.2"},Combiner.CombineTextByDelimiter("", QuoteStyle.None),"Cleaned")
in
    #"Merged Columns"

2 Likes

@ammu It looks as if you not only want a name split, but also a name correction, Leo.

If the names never change you can do this by writing it in the code.

The upper lines bear another problem, they include a dash - which makes the splitting process a lot more difficult. These would be easy without that one dash in your expected description, but with it gets difficult to differentiate [upper letters behind the dash].

image

Again, if the names never change you can solve this single dash issue by writing it in the code.

let
    Source = Excel.CurrentWorkbook(){[Name="Description"]}[Content],
    #"Split Column by Character Transition1" = Table.SplitColumn(Source, "Description", Splitter.SplitTextByCharacterTransition({"0".."9"}, {"-"}), {"Pref", "Description"}),
    #"Replaced Value" = Table.ReplaceValue(#"Split Column by Character Transition1",null,each[Pref],Replacer.ReplaceValue,{"Description"}),
    #"Split Column by Character Transition" = Table.SplitColumn(#"Replaced Value", "Description", Splitter.SplitTextByCharacterTransition({"0".."9"}, (c) => not List.Contains({"0".."9"}, c)), {"Description.1", "Description"}),
    #"Replaced Value1" = Table.ReplaceValue(#"Split Column by Character Transition",null,each[Description.1],Replacer.ReplaceValue,{"Description"}),
    #"Trimmed Text" = Table.TransformColumns(#"Replaced Value1",{{"Description", Text.Trim, type text}}),
    #"Split Column by Character Transition2" = Table.SplitColumn(#"Trimmed Text", "Description", Splitter.SplitTextByCharacterTransition({"-"}, {"A".."Z"}), {"Description.1.1", "Description.2", "Description.3"}),
    #"Replaced Value2" = Table.ReplaceValue(#"Split Column by Character Transition2",each if [Description.1.1] = "-" or Text.Length([Description.1.1])<5 or Text.Upper([Description.1.1])<>[Description.1.1] then [Description.1.1] else null,null,Replacer.ReplaceValue,{"Description.1.1"}),
    #"Replaced Value3" = Table.ReplaceValue(#"Replaced Value2",each if [Description.2] = "-" or Text.Length([Description.2])<5 or Text.Upper([Description.2])<>[Description.2]then [Description.2] else null,null,Replacer.ReplaceValue,{"Description.2"}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Replaced Value3", "Description.2", Splitter.SplitTextByEachDelimiter({"-"}, QuoteStyle.Csv, false), {"Description.2"}),
    #"Replaced Value4" = Table.ReplaceValue(#"Split Column by Delimiter", each if [Description.3] = "-" or Text.Length([Description.3])<5 or Text.Upper([Description.3])<>[Description.3]then [Description.3] else null,null,Replacer.ReplaceValue,{"Description.3"}),
    #"Merged Columns" = Table.CombineColumns(#"Replaced Value4",{"Description.1.1", "Description.2", "Description.3"},Combiner.CombineTextByDelimiter("", QuoteStyle.None),"Description"),
    #"Added Check" = Table.AddColumn(#"Merged Columns", "Check", each if [Expected Decription]=[Description] then null else [Description]),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Check",{"Description", "Expected Decription", "Check"})
in
    #"Removed Other Columns"

QSplitNamesforPQ.xlsx (29.1 KB)

2 Likes

Hi @ammu,

The current POTW is relevant to your question, you can find all about that here.

.
Wow amazing to see so many responses already to this question.
Here’s my attempt - just paste this M script into a new blank query.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("xVjbcuJIEv2VjH7aiRj1SOKqfitEGYSFpJEEXkdvPxRQxooWEquLPd6v3yxdjAAB7p7eWD84oOpkVuXtZBZfv34ayP3OH7Ki9BR50JNGC8syrIkHD8SlU3vh0U+/f2pZ/Pb710+GtZSVgSrdw79yWVb74INuUiKg4FF3aejUQ+mr2wc93b+tR9HknjRx6QSmi8m0kGl+KyGa1pXVoUTWiWRYhm8QE6aPE4NawtDTlWMRyaWWb98bJtQ4x38E0x+j5MUtoUKTB0O8GLnzbQu8KaX+nPrELO53tlh4ZPkliF4kWVZki9yPFobpN846WxIiug7g8T2osjSZ0ZG9MH3JC7ZRCk9xAmy3ykMWrbk4s9o+FUMDRoYlzVkQZTwSYAgyvisVvARhyMTZBehU1JtjhkhjtuZvAsghC3YrnhSSSbzi8IyiBejCoSYdgxlsnzPYhohPL540mxu+a2PMp5LFX+E7fxPYw2rtPmMpl3+qqsgPxNenju02nXi+Vp9kr7ODTTpLgmi7Ys+w4tH6GVLOssq69MymSrKyyeVPebQpfHDJqErgPWAO3/A0SwIWgYhdW7BORbw955sCnX4E7udJwnbiH3jsiWdvMA+SJE7aZK345Uh2lLD1d56VCXHj3FPZEp6+J+T7PX5MGFblHa5IzafEsKgnjflzvgs2wdMbF4dUyzfRwRkas6mrDu8My6ITYhUcNPAAv1MQZThGHkKJ6/uVHllT5J466EzJhLj+Y43V7bljUp+Ca9t3gtb+YXkPv4GDaj+KrM0a8/Wx68ogH5jA/uttyzG92jxYCZeZ74T5NoU5S+IcnQ4u0c/y/Qj/IGLERMKLD9mzSKs8vCTzfsH6Pi3pjg4jvb42oGRiUhjTJTVtZ45ECxPXXgjvXNqppfuKqvyctLjqDMtQVST9kbhjkz5Kep5kSI7IaKWhLMvY+nnHxcVr0EXhB5ZmHDZBuo9TFrZJ3PGVkPCXdD6nrnGPTJxx5IIQtnG8EVX2vnUqc8a/Jemy/7BkAwnfs+AsDrWkNzWoOfYkEmL+R0G+g32I58I2YU9ZLEyrIe2HFtqTI/WCfq0ldi95oMlqx3cNcMzFfNRs2A0mvr7fVNjtqiNstBaxriq8jamUYm7NqD8yiX5fgqnb1HN574RExo6L6qUNRw/yhIVBihXwKqJXlNy4yJASdFY7tf9HeRRhp0khexX1gzTX5vrT6vHZvq2QT4OLtbxboXbYsyRLWxTP8vBNKNadhWlSS6IhX2MfWmPyea9Bthb9u9orRUAXqeUnLErZOgviSHKSeC5yPAH61z5OMiD5FlWiYO1uRZG7fXVCTIe6VllrDW+3r18abXzq+UCsMfhkAgucuVrIQ+kOO7Y3txFFTapjnuk4mHm2ufAN2xIMeG23UjLo6y4xJqAT10GKoC6296kFrrEUCVFNpLpNPJzdUJHw04/g64P/H8c1MrisZb8i7RDCwywm+CMss7Gd/InvT+mDJ/lxHIrY8yjl5WxU7ZyK1D23zCxe9qURZhuHNMT/be36tFx8nJMg5GxTSmfiawQZ216pmsNVdaxRbGoib1svWtUPmREkaTJDklvzYJ9B/IQMzlMewj5P1s9MmFmDLhafW/iPY58QN95Uvp0krO2qRwN1yF6jXfyKJZXy5CUopvfjIFThq8vWE21pxbbpabmeQ7Gx/zvnHBa7VcJxtm8TqXpYNQnHebQOQmD7PQbp4LtrTUVnaRDFgszgYMtTzsMrbaUxDGdJHLbUdXc49B4xzalv6EgXxPPAc6iOry7D80VZX9s9cAMx6T9BN+aYZ6ZpiEgeL1T3KmlMckeuPTaoJB5HSbBDK5IoyPIEe3qO750YNiI6FeqCrHEmWzyt2JpF57J1KpiuIZo+Pj4dbEa0mFP0hetjfcDUtu+FwRXmfczsyYNBR1MV0wGLCDp7MMa0wbXt67VrNFW7/opGkIcNuTccdvuy49roZTx+DPgRKRy1NRsuyt+CNE+vFXe1X6wYzerfNAtB3eugxqPKmdmebdVjoSigcuU6ToyPx9Di3I5883JVWDvyT4W1dKymaf1f7Nj6WsOfzLbODY+XoNspqWhd5XbsKhqsYuLt4u9iLmIJZxDs9kn8UhB1SzCrJlIJzoN0LWgz4nGOLSiJd1BPcBdF77HAHywTicARv7GIn7TqlctQnTjISi69W7ijaxI1TfyZY2/DF18jK5vsQBwwopcvgM7qwnVvNaD9j0OHH4Bayy9Q/DSj9jUgpgl3RDdM4yjH/jALRr6016Jq8DdVlQb05A/aWj5x1B5+uBM/+tFDpcyRm32KTxC9/FHx2n6Lyv4vVIkGqR8N3lAGauhTG4dwR6gTT6dqghxT3XaJX/7U8QFQU29n0Jcse6nK4PnUmVILRgtvii1WdOqTlaZcxSv9Dvw4sTSHOA9veydGDTGJ4VtVTHEs2QHOb8XD9QA4dsdA+5+4A81SVVlV4CPzx/tl+pqiDmH2mX7WPxPDxZHeGhvC9Mb7VjxZrwO+ffsv", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Start = _t, #"Expected Decription" = _t]),
    GetName = Table.AddColumn(Source, "Custom", each 
        let
            fxSplit = Splitter.SplitTextByAnyDelimiter({"0".."9"}&{"/","-"}, QuoteStyle.Csv),
            InvokeSplit = fxSplit([Start]),
            RemoveBlanks = List.RemoveItems( List.RemoveFirstN( InvokeSplit, 1), {"", null, "A", "NOV", "CAPEX REFURB", "REPLACEMENT CURTAIN HOOKS", "TEST AND TAG UNIT"}),
            KeepStrings = Text.Replace( Text.Trim( Text.Combine( List.Select( RemoveBlanks, each Text.Upper(_) = _ ), " " )), "P L", "P/L")
        in
            KeepStrings, type text)
in
    GetName

I hope this is helpful.

3 Likes

@Melissa,

That is an absolute work of art. :clap::clap:

  • Brian
2 Likes

@ammu Leo, @BrianJ called the solution from @Melissa an absolute work of art. It is, I used 11 lines for the split itself and she only 5.

But even with this advanced code you will still have the same issues I described above (plus STEPHEN BUSHELL). You would need to deal with these cases individually in the code.

EDIT: The exceptions were added, so you are now fully covered for the given data.

I thought about how to cover as many cases as possible for the given data, without needing to write long text fragments for the exceptions. This combination of Text.BetweenDelimiters and Text.SplitAny does it quite well.

let
    Source = Excel.CurrentWorkbook(){[Name="Description"]}[Content],
    #"Added BetweenDelimiters" = Table.AddColumn(Source, "Custom", each if Text.Length( Text.BetweenDelimiters( [Description], "-", "-"))>3 then Text.BetweenDelimiters( [Description], "-", "-") else [Description] ),
    #"Added SplitAny" = Table.AddColumn(#"Added BetweenDelimiters", "Split", each List.Select( Text.SplitAny([Custom], "01234567899,-"), each _<>"" and _<>"/" and _<>"A" and _<>"NOV" and not Text.Contains(_, "INV") and not Text.Contains(_, ":") and Text.Upper(_)= _) ),
    #"Extracted Values" = Table.TransformColumns(#"Added SplitAny", {"Split", each Text.Combine(List.Transform(_, Text.From), " "), type text}),
    #"Trimmed Text" = Table.TransformColumns(#"Extracted Values",{{"Split", Text.Trim, type text}})
in
    #"Trimmed Text"
3 Likes

Many thanks for all your help, Matthias, Melissa and Brian J.
I am so much appreciated.
I can use these code repeatedly in the future.
Regards,
Aye

Hi @Ammu, did the response provided by the users and experts help you solve your query? If not, how far did you get and what kind of help you need further? If yes, kindly mark as solution the answer that solved your query. Thanks!