Transforming Data by parsing, transposing and pivoting data from a text column

@AntrikshSharma and @Melissa,

Thanks for the alternative solutions, but I’m also trying to understand why splitbyanydelimiter fails in this particular scenario? The Microsoft documentation is less than no help. If only my preorder copy of The Definitive Guide to M had arrived yet…

  • Brian

@BrianJ I have been silently watching this post and learning the crazy crazy stuff unleashed by you, @AntrikshSharma & @Melissa. The one part that grabbed my attention is

Is there any such book?

@MudassirAli,

Unfortunately not. At this point it’s just a running joke that we’re going to start a KickStarter campaign to bribe @Melissa to write it.

  • Brian
1 Like

@BrianJ I actually searched for it in excitement. If @Melissa writes it, it’s going to be a very expensive guide for sure.

LOL

@BrianJ the issue is with Table.SplitColumn and not with splitbyanydelimiter, that’s why I changed to Table.TransformColumns. My solution post above is now updated.

You’ll want to check out this video that illustrates the issue.

@BrianJ Looks like there is some kind of column limitation, check this, I imported only 1 row and columns # goes till 15 but yours stopped at 14 with full data:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUXJ0VLBWcHICEo6O+lAahMFMCN/J0QkkAhF3dARrcnJSio0FAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, TEXT = _t, RESULT = _t]),
    #"Removed Columns" = Table.RemoveColumns(Source,{"RESULT"}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Removed Columns", "TEXT", Splitter.SplitTextByAnyDelimiter({"; ", " ", "/", "$"}))
in
    #"Split Column by Delimiter" 

your solution with 14 columns:

@Melissa,

Great video and super entertaining explanation. I totally get it now.

@AntrikshSharma - absolutely right. Per @Melissa’s post and video reference, Table.SplitColumn has the static column number problem, while Table.TransformColumn does not, which is why the latter picks up the final AAAA in Column 15.

Huge thanks to both of you. This thread turned into a five star DAX and Power Query masterclass. :+1:

  • Brian
1 Like

Previous DAX solution was mixing up text and dates, and I wanted DAX to align with PQ solution so added few more lines to rank text and dates separately and then combine them.

image

Table  = 
VAR Temp =
    ADDCOLUMNS (
        ADDCOLUMNS (
            VALUES ( Ibesmond[TEXT] ),
            "@Clean String",
                SUBSTITUTE ( SUBSTITUTE ( SUBSTITUTE ( SUBSTITUTE ( SUBSTITUTE ( SUBSTITUTE ( Ibesmond[TEXT], " ; ", "^" ), ";" , "^"), "$", "^" ), " ", "^" ), "^^", "^" ), "AA/BB", "AA^BB" )
        ),
        "@Word Count",
            1 + LEN ( [@Clean String] ) - LEN ( SUBSTITUTE ( [@Clean String], "^", "" ) )
    )
VAR UpperBound =
    MAXX ( Temp, [@Word Count] )
VAR GenerateSerialNumber =
    SELECTCOLUMNS ( GENERATESERIES ( 1, UpperBound, 1 ), "@Serial Number", [Value] )
VAR TempCrossJoin =
    CROSSJOIN ( Temp, GenerateSerialNumber )
VAR FilterSerialLessThanDelimiterCount =
    FILTER ( TempCrossJoin, [@Serial Number] <= [@Word Count] )
VAR SplitString =
    ADDCOLUMNS (
        FilterSerialLessThanDelimiterCount,
        "@Extracted Child", PATHITEM ( SUBSTITUTE ( [@Clean String], "^", "|" ), [@Serial Number] )
    )
VAR FinalTable =
    SELECTCOLUMNS (
        SplitString,
        "Final String",
            VAR CurrentValue = [@Extracted Child]
            VAR Result =
                IF (
                    LEFT ( CurrentValue, 1 ) = "/",
                    MID ( CurrentValue, 2, LEN ( CurrentValue ) ),
                    CurrentValue
                )
            RETURN
                Result
    )
VAR SummarizeFinalTable =
    SUMMARIZE ( FinalTable, [Final String] )
VAR PrelimResult =
    ADDCOLUMNS (
        SummarizeFinalTable,
        "ValueCount",
            VAR CurrentValue = [Final String]
            VAR Result =
                COUNTROWS ( FILTER ( FinalTable, [Final String] = CurrentValue ) )
            RETURN
                Result
    )
VAR TextTable =
    FILTER ( PrelimResult, NOT CONTAINSSTRING ( [Final String], "/" ) )
VAR TextTabletWithRank =
    ADDCOLUMNS (
        TextTable,
        "Ranking",
            VAR CurrentValue = [ValueCount]
            VAR Result =
                COUNTROWS ( FILTER ( TextTable, [ValueCount] >= CurrentValue ) )
            RETURN
                Result
    )
VAR MaxRankOfTextTable =
    MAXX ( TextTabletWithRank, [Ranking] )
VAR DateTable =
    FILTER ( PrelimResult, CONTAINSSTRING ( [Final String], "/" ) )
VAR DateTableWithRank =
    ADDCOLUMNS (
        DateTable,
        "Ranking",
            VAR CurrentString =
                CONVERT ( [Final String], DATETIME )
            VAR Result =
                COUNTROWS (
                    FILTER ( DateTable, CONVERT ( [Final String], DATETIME ) <= CurrentString )
                )
            RETURN
                MaxRankOfTextTable + Result
    )
VAR FinalResult =
    UNION ( TextTabletWithRank, DateTableWithRank )
RETURN
    FinalResult
2 Likes