Transformation in Power Query

Iwona · January 4, 2022, 12:36pm

I need to create PBI file using data from attached .doc achieve this view below :

duty.docx - original source of data , text data

duty.csv - I created to show how it should look like in PBI

Could you please help me to find appropriate and correct steps in Power BI to complete this task. Many thanks in advance - Iwona

Anurag · January 4, 2022, 3:21pm

HI

Pls share the expected ouput also for the provided data becuz it will easier for form member to help you.

Thanks,
Anurag

Iwona · January 4, 2022, 3:46pm

From this text view ( file.docx):

Duty - example.docx (12.2 KB)

to this view from csv file but I have to do it in Power BI desktop :

duty example.csv (996 Bytes)

Melissa · January 4, 2022, 5:48pm

Hi @Iwona,

The sample provided in the docx doesn’t seem to match the data in the depicted outcome (can’t match Col2 and Col3 for example). Another consideration, will you ever only have one set of data, like your sample OR can you have multiple groups consisting of 6 lines?

Whatever the case, can you see if this meets your requirement.
Just paste this code into a new blank query and amend to your needs.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("pVBNC8IwDP0vPY/sJV1XdpWBF1GYHyhlR88e/P+wpnZugl8opElfk7yXJgTTmsJwPAJYvaK0XAoEESxYM+tNdG0HzpXtfsJ3O52vWnp57foimFWm2Cn5UURihKPK36Lz6V2LmLhOYNltNQkC5npPbBSQScA1s162lGYeOdkR6kfM/IXGQTnzRzzZ5kPLexJr09apcn/SpBUSfpimHwA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t, Column4 = _t, Column5 = _t, Column6 = _t, Column7 = _t, Column8 = _t, Column9 = _t, Column10 = _t, Column11 = _t, Column12 = _t, Column13 = _t, Column14 = _t, Column15 = _t, Column16 = _t, Column17 = _t, Column18 = _t, Column19 = _t, Column20 = _t]),
    GroupRows = Table.Group(Source, {"Column1"}, {{"AllRows", each Table.RemoveColumns(_, "Column1" ), type table [Column2=nullable text, Column3=nullable text, Column4=nullable text, Column5=nullable text, Column6=nullable text, Column7=nullable text, Column8=nullable text, Column9=nullable text, Column10=nullable text, Column11=nullable text, Column12=nullable text, Column13=nullable text, Column14=nullable text, Column15=nullable text, Column16=nullable text, Column17=nullable text, Column18=nullable text, Column19=nullable text, Column20=nullable text]}}),
    AddIndex = Table.AddIndexColumn(GroupRows, "Index", 0, 1, Int64.Type),
    AddColCount = Table.AddColumn(AddIndex, "ColCount", each if [Column1] = "D" then Table.ColumnCount( [AllRows] ) else List.Count(List.RemoveMatchingItems( Record.ToList( [AllRows]{0}? ), {""} ) ), type number),
    UpdatedTables = Table.AddColumn(AddColCount, "Temp", each
        let 
            StartNum = if [Column1] = "D" then 1 else List.Sum( List.FirstN( AddColCount[ColCount], [Index] )),
            OldNames = List.Transform( List.Numbers( 2, [ColCount] ), each "Column" & Text.From(_) ),
            NewNames = List.Transform( List.Numbers( if [Column1] = "D" then 1 else StartNum +1, [ColCount] ), each "Column" & Text.From(_)),
            nTable = Table.RenameColumns( Table.SelectColumns( [AllRows], OldNames), List.Zip( {OldNames, NewNames } ))
        in 
            nTable
    )[[Index], [Temp]],
    Pivot = Table.Pivot(Table.TransformColumnTypes(UpdatedTables, {{"Index", type text}}), List.Distinct(Table.TransformColumnTypes(UpdatedTables, {{"Index", type text}})[Index]), "Index", "Temp"),
    Result = List.LastN(
        List.Generate(
            ()=> [ t = Table.ExpandTableColumn(Pivot, Text.From(n), Table.ColumnNames( Record.Field( Pivot{0}, Text.From(n) )), Table.ColumnNames( Record.Field( Pivot{0}, Text.From(n) ))), n = 0],
            each [ n ] < Table.ColumnCount( Pivot ),
            each [ t = Table.ExpandTableColumn( [t], Text.From(n), Table.ColumnNames( Record.Field( Pivot{0}, Text.From(n) )), Table.ColumnNames( Record.Field( Pivot{0}, Text.From(n) ))), n = [n]+1 ],
            each [ t ]
        ), 1 ){0}
in
    Result

.
Here’s your sample file:
eDNA - Recursive ExpandTableColumns.pbix (81.6 KB)

This also includes a solution for multiple groupings of 6 lines.

Reference material:

I hope this is helpful

Iwona · January 4, 2022, 11:51pm

Hi Melissa, I am so grateful for your help and support . My data is huge and I have thousands of these “paragraphs” like below( I copy some of them in attachment):

and I still have the same problem how to adjust the code .
There is another issue , because the number of lines starting with “V” is not fixed:

Duty1-example.docx (15.2 KB)

Melissa · January 5, 2022, 1:20am

Hi @Iwona,

Okay so you’ll need the grouped version, I’ve adjusted the ‘grouping’.
Paste this code into a new blank query.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("xVpNb9wgEP0vPkeEGcA21ypSLlUrpR9qFeXYcw/9/1KBwfaAnbIG1pWyy84mCxu/N2+Gh19fh6fhYQD3QCn9APCo4BElShd8AP/Wp8/u6emFfu1fftvi9efnrz/+T3+///T28Dp8jFN8dY/nH4j+tTQCJhpxCu/7YBIaQvD88sXHUkjJ1zv4WRbAbQEz+s/OQuk4zssCgAL4Aj4eb1jgu58o/hejkKWP/HsSpZR/JYxqnMaEaaSumoYxAC9nwEjI68gEAytAUqgODEBknwVYKBZB10LrLLYnSTALwD4kqOMSQ09djh5G9GL+mjV/3ZUO6dSav5ajp4RMUtZQ/rEY4IY1OHqTUCXAb0EPhDaN04RLKGTdt2Ek0HsSjHclgYF3SeBIje0kQD0kKWySFHag6yw+n8JBFdpJIPuQAOq+DSOBuZwEOpLARhLYhQTSEiCtSjBxEsRrxHQ7VvZNx0slda8ElQK80/G5Fb3xcvTCxR0JKDfiip670oFMrejBkKSw5LLgUhYzXcezrZirwqWvdSN6zbk3XY7eTKipiJ7m6CVXthY9laKnMvSMzOKSDO7Ra1ZOEuBi/S9NE6tw3TSMBPPlJAgYTTGFJ57CrulVHUgwchJg2t7BSFssHv+nKtxhNxVJUEdJRgK7J8Gd+3EdwZ9oVLwf17xi1rZiSRVGoRMlyPbsPj6r41Zg3SZ2R4IerRi0b6lBXk4CIPDDtXejyWyOHjwwSUuuaWTJLzMxKG5r9t1Yl3ruKkKdGDAAD1yxO0u5iQBGKTdcyk2HXTUm3RhmG6ppWcMHc9ni2qdw0Ui7DbrKKrxL4Tp7hjHgelcsyKy79opGUCsgmHZftQxIOjqVuWJz6rz5+DwJVN026E46Dq2bMjgw1+4sA5bAD1i4UW/WKPaRAZOSwGQkQDNsMqDO9vSW2NvKgE4yIMvGbpEBB87anWXARAZEOdBbO6fIcGwt4/OQlHHDjRqwVINpQdeY3bJA1sv18VahtSUkDcBmDTiw1e7MgFgAdGSC2VJSdTHY0+MRTeYVZ8C2oCVBOMOAudYM27nrrUcskQGtWzq43poLAFkCSCYpqcgm62rNmcU4WBmgsliXknFPAt2q4J4E2F5PIgmaC8H1Dp8m8KkVsNSOrZioRRY6nrIYUstlTnT1U2bx+YagCw9AqD48MM08uN7kCytKatf9AXliv+g+p22Jz2fSiuBw3/YiLsCzpywzmcXtJGi+8yJ2ha1OL1xv8o0R/EgGzc1dB1gSdzF7x/QuCYe7MsNeHM7wwLa39X13B63HPnjg811gEy2OkNsSpvdGjF1OXVO/d059PgTy9Vhc7NQP7r/oc/TeXFui6V8hK29/AQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t, Column4 = _t, Column5 = _t, Column6 = _t, Column7 = _t, Column8 = _t, Column9 = _t, Column10 = _t, Column11 = _t, Column12 = _t, Column13 = _t, Column14 = _t, Column15 = _t, Column16 = _t, Column17 = _t, Column18 = _t, Column19 = _t, Column20 = _t]),
    AddedGroup = Table.FillDown( Table.ReplaceValue( Table.AddIndexColumn(Source, "Group", 1, 1), each [Group], each if [Column1] = "D" then [Index] else null,Replacer.ReplaceValue,{"Group"}), {"Group"}),
    GroupRows = Table.Group(AddedGroup, {"Group"}, {{"Grouped", each Table.RemoveColumns(_, {"Group"} ), type table [Column2=nullable text, Column3=nullable text, Column4=nullable text, Column5=nullable text, Column6=nullable text, Column7=nullable text, Column8=nullable text, Column9=nullable text, Column10=nullable text, Column11=nullable text, Column12=nullable text, Column13=nullable text, Column14=nullable text, Column15=nullable text, Column16=nullable text, Column17=nullable text, Column18=nullable text, Column19=nullable text, Column20=nullable text]}}),
    GetLevels = Table.AddColumn(GroupRows, "AllRows", each Table.Group( [Grouped], {"Column1"}, {{"AllRows", each Table.RemoveColumns( _, "Column1"), type table [Column1=nullable text, Column2=nullable text, Column3=nullable text, Column4=nullable text, Column5=nullable text, Column6=nullable text, Column7=nullable text, Column8=nullable text, Column9=nullable text, Column10=nullable text, Column11=nullable text, Column12=nullable text, Column13=nullable text, Column14=nullable text, Column15=nullable text, Column16=nullable text, Column17=nullable text, Column18=nullable text, Column19=nullable text, Column20=nullable text]}})),
    Result = Table.AddColumn( GetLevels, "nTable", each
        let 
            AddIndex = Table.AddIndexColumn([AllRows], "Index", 0, 1, Int64.Type),
            AddColCount = Table.AddColumn(AddIndex, "ColCount", each if [Column1] = "D" then Table.ColumnCount( [AllRows] ) else List.Count(List.RemoveMatchingItems( Record.ToList( [AllRows]{0}? ), {""} ) ), type number),
            lColCount = List.Buffer(AddColCount[ColCount]),
            UpdatedTables = Table.AddColumn(AddColCount, "Temp", each
                let 
                    StartNum = if [Column1] = "D" then 1 else List.Sum( List.FirstN( lColCount, [Index] )),
                    OldNames = List.Transform( List.Numbers( 2, [ColCount] ), each "Column" & Text.From(_) ),
                    NewNames = List.Transform( List.Numbers( if [Column1] = "D" then 1 else StartNum +1, [ColCount] ), each "Column" & Text.From(_)),
                    nTable = Table.RenameColumns( Table.SelectColumns( [AllRows], OldNames), List.Zip( {OldNames, NewNames } ))
                in 
                    nTable
            )[[Index], [Temp]],
            Pivot = Table.Buffer( Table.Pivot(Table.TransformColumnTypes(UpdatedTables, {{"Index", type text}}), List.Distinct(Table.TransformColumnTypes(UpdatedTables, {{"Index", type text}})[Index]), "Index", "Temp")),
            Result = List.LastN(
                List.Generate(
                    ()=> [ t = Table.ExpandTableColumn(Pivot, Text.From(n), Table.ColumnNames( Record.Field( Pivot{0}, Text.From(n) )), Table.ColumnNames( Record.Field( Pivot{0}, Text.From(n) ))), n = 0],
                    each [ n ] < Table.ColumnCount( Pivot ),
                    each [ t = Table.ExpandTableColumn( [t], Text.From(n), Table.ColumnNames( Record.Field( Pivot{0}, Text.From(n) )), Table.ColumnNames( Record.Field( Pivot{0}, Text.From(n) ))), n = [n]+1 ],
                    each [ t ]
                ), 1){0}
        in
            Result
    )[[nTable]],
    ExpandnTable = Table.ExpandTableColumn(Result, "nTable", List.Distinct( List.Combine( List.Transform( Result[nTable], each Table.ColumnNames( _ )))), List.Distinct( List.Combine( List.Transform( Result[nTable], each Table.ColumnNames( _ )))) )
in 
    ExpandnTable

I hope this is helpful

Iwona · January 6, 2022, 12:28am

This grouping concept here is so difficult to me.
How should I modify this code if I need to transform from this version

to this view :

D…L…V…V…V…
D…L…V…V
D…L…V…V…V… and at the end separate based on delimiter “,” .

Would you please help ,
So grateful for your ongoing support and technical knowledge. Thank you - Iwona
Duty1-example.docx (15.2 KB)

Melissa · January 6, 2022, 7:16am

Hi @Iwona,

Can you provide a small mock up of the expected outcome, like you did in post 1? That be really helpful. Because it seems to me your requirement changed, if so please close this thread and create a new one. You can @-mention me so I’ll be notified.

Thanks and all the best
Melissa