PQ remove a string of numbers

Hello EDNA - looking for help w/ removing job # from job description column in Power Query. Please see screen shot below. attaching file.

PQ_RemoveJ_No.pbix (28.5 KB)

the most I managed to remove w/ find replace were the Parentheses :slight_smile:

Thank you in advance,

Frankee.

@Frankee,
That’s a nice task and the first rows can be cleaned relatively safely if you assume as there is always a combination of numbers and a dash within parantheses. You can already see there that you have a problem with inconsistent writing, but because of the combination of conditions it can be done safely.
Last row is without parantheses, so I am not sure how you would deal with e.g. missing spaces like you see in the first row.

This transforms the given data in Before in two steps:

let
    Source = Sheet1,
    #"Cleaned (numbers & -)" = Table.TransformColumns(Source, {{"Before", each Text.Combine(List.Select(Splitter.SplitTextByAnyDelimiter({" (","(",") ",")"})(_), each Text.Remove(_, {"0".."9", "-"}) <> "")," "), type text}}),
    #"Cleaned numbers & -" = Table.TransformColumns(#"Cleaned (numbers & -)",{{"Before",each Text.Combine(List.Select(Splitter.SplitTextByDelimiter(" ")(_), each Text.Remove(_, {"0".."9"}) <> "-")," "), type text}})
in
    #"Cleaned numbers & -"

And this one does the same in a new column After:

let
    Source = Sheet1,
    #"Added After" = Table.AddColumn(Source, "After", each Text.Combine(List.Select(Splitter.SplitTextByAnyDelimiter({" (","(",") ",")"})([Before]), each Text.Remove(_, {"0".."9", "-"}) <> "")," ")),
    #"Cleaned numbers & -" = Table.TransformColumns(#"Added After",{{"After",each Text.Combine(List.Select(Splitter.SplitTextByDelimiter(" ")(_), each Text.Remove(_, {"0".."9"}) <> "-")," "), type text}})
in
    #"Cleaned numbers & -"

Regards,
Matthias

1 Like

Hi Frankee,

As I understand you want to remove parentheses and whatever is inside it and it seems you have just one parenthesis or nothing.

The below code would be one solution:

=Table.AddColumn(Source, "Custom", each try Text.RemoveRange([Before],Text.PositionOf([Before],"("),Text.PositionOf([Before],")")-Text.PositionOf([Before],"(")+1) otherwise [Before])

Hello @Frankee

Did the responses above help solve your query?

If not, can you let us know where you’re stuck and what additional assistance you need?

If it did, please mark the answer as the SOLUTION by clicking the three dots beside Reply and then tick the check box beside SOLUTION

Thank you

@Matthias - this worked, from what I can tell. how can this be rewritten into a “Custom” column, to keep the original column , incase I need to refer back to it?

@Frankee - for the data you provided it works perfectly. Just if you had in the row without parantheses a missing blank it wouldn’t detect the number string, but that would be a mistake in the data.

The second code creates a custom column “After”, so that you can refer back to the original column. Obviously you can choose any other name for this custom column.

Hey @Matthias - thank you for the reply. I thought, why not just duplicate the column and apply your code. easy enough.

see [Before - Copy] column … right most column:

let
    Source = Excel.Workbook(File.Contents("C:\TEMP\EDNA\PQ_RemoveJ_No.xlsx"), null, true),
    Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
    #"Changed Type" = Table.TransformColumnTypes(Sheet1_Sheet,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}}),
    #"Promoted Headers" = Table.PromoteHeaders(#"Changed Type", [PromoteAllScalars=true]),
    #"Changed Type1" = Table.TransformColumnTypes(#"Promoted Headers",{{"Before", type text}, {"After", type text}, {"Clean up", type text}}),
    #"Duplicated Column" = Table.DuplicateColumn(#"Changed Type1", "Before", "Before - Copy"),
    #"Cleaned (numbers & -)" = Table.TransformColumns(#"Duplicated Column",  {{"Before - Copy",each Text.Combine(List.Select(Splitter.SplitTextByAnyDelimiter({" (","(",") ",")"})(_), each Text.Remove(_, {"0".."9", "-"}) <> "")," "), type text}}),
    #"Cleaned numbers & -" =   Table.TransformColumns(#"Cleaned (numbers & -)",{{"Before - Copy",each Text.Combine(List.Select(Splitter.SplitTextByDelimiter(" ")(_), each Text.Remove(_, {"0".."9"}) <> "-")," "), type text}})
in
    #"Cleaned numbers & -"```

@SoftwareTrain - your code worked as well, for the first part. where it removed
the following “(######-###)” .

Thank you for the help,

Frankee

1 Like

@Matthias - BTW - went back to your code…second part where you offered “NEW COLUMN” :slight_smile: …doe.

and that worked too. as opposed to me duplicating the column.

Again, right most column:

let
    Source = Excel.Workbook(File.Contents("C:\TEMP\EDNA\PQ_RemoveJ_No.xlsx"), null, true),
    Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
    #"Changed Type" = Table.TransformColumnTypes(Sheet1_Sheet,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}}),
    #"Promoted Headers" = Table.PromoteHeaders(#"Changed Type", [PromoteAllScalars=true]),
    #"Changed Type1" = Table.TransformColumnTypes(#"Promoted Headers",{{"Before", type text}, {"After", type text}, {"Clean up", type text}}),
    #"Added After" = Table.AddColumn( #"Changed Type1", "Before2", each Text.Combine(List.Select(Splitter.SplitTextByAnyDelimiter({" (","(",") ",")"})([Before]), each Text.Remove(_, {"0".."9", "-"}) <> "")," ")),
    #"Cleaned numbers & -" = Table.TransformColumns(#"Added After",{{"Before2",each Text.Combine(List.Select(Splitter.SplitTextByDelimiter(" ")(_), each Text.Remove(_, {"0".."9"}) <> "-")," "), type text}})
in
    #"Cleaned numbers & -"

thank you again,

Frankee

1 Like