PowerQuery and Refresh issue

Hey Guys,

Scenario Background:
I have a dataset that gets lots of Text- which is Feedback from customers. and have been facing this challenge in powerquery, particularly when moving between steps.

I have attempted increasing the Cache to 16 GB for this file, and used Buffer Function at a few places in different queries. These have somewhat helped me in some occasions.

Scenario

Usually when we click on a step, Power Query takes a few seconds and starts loading the data [ it starts showing data load , mb or gb numbers in bottom right corner of the screen]

In this specific scenario, on the step hilighted in grey, that i need to debugg, due to some error that is being generated in following steps, I am getting stuck where the 5 dots on top keep moving, but download of data doesnt start.

I have closed and restarted the PBX file,
Refreshed all the tables referenced in this query before attempting to refresh this.

Even 30 minutes of waiting data load is not begnining on any of the intermediate steps. It does begin refresh at the last step though.

This happens Even for the first step that refers to another table ( that has just been loaded),

Question
Any Idea… What Power Query is trying to do when only the dots are moving but its not loading any data ?
Any ideas on how I can Improve or fix this ? or any ideas to put me in right direction?

Hi @jsodhi,

Let’s start with this:

  1. What data source are you connecting to?

  2. To decrease the load for testing alone, enter a filter rows step following the ‘Source’ or ‘Removed Other Columns’ step. Keep either a n-number of rows from top, bottom or at an interval - whatever provides a suitable mix. Again, this is just for testing, you will remove this step once you have resolved the issue.

  3. What is happening in these 3 steps?
    image

  4. See this post, looks like that might be helpful

Hey Melissa,
Thank you for your response.

    1. this query refers to another query that has all feedback comments.
      but the actual data Source: - is a data flow, and post my development at some point i have to split it into power bi dataset and then connect the report to this dataset.
      This isnt a great scenario because it is preventing any query folding :frowning:
    1. So the data here has text feedback from customers that i am spliting into single words (for a word cloud) and expanding it to multiple rows. The function cleans the words and removes any numbers in the words
Source = #"FactComments*",
    #"Removed Other Columns" = Table.SelectColumns(Source,{"FeedbackID", "Clean Comments", "Sentiment"}),
    #"Added Custom" = Table.AddColumn(#"Removed Other Columns", "Custom", each Text.Split([Clean Comments]," ")),
    #"Expanded Custom" = Table.ExpandListColumn(#"Added Custom", "Custom"),
    #"Invoked Custom Function" = Table.AddColumn(#"Expanded Custom", "CleanWords", each #"Clean Text (no numbers)"([Custom])),
    1. Filter step: so after the words have been split and expanded, i am calculating length of the words and removing for example words less than 3 characters or more than 20 characters. So, that cant really move.
    1. Thank you for the reference resource, It was very helpful and ill keep it for my future reference. the only white space i had to deal here was resolved by cleaning and trimming.

Okay so there are some things we can try and switch around to see if that has a positive impact. Consider this simplified example.

let
    Source = Table.FromColumns(
        {{"There are 50 pigeons enjoying themselves in the square."}},
        type table [String=text]
    ),
    Transform = Table.TransformColumns( Source, 
        {
            {"String", each 
                List.Select( 
                    Text.Split( 
                        Text.Select( _, {"A".."Z", "a".."z", " "}), " "
                    ), (x)=> let n = Text.Length(x) in (n >3 and n <20)
                ) 
            }
        }
    ),
    Expand = Table.ExpandListColumn(Transform, "String")
in
    Expand

Don’t know it this is an option for you but I am tranforming the original input. Next I am only retaining the space, all capital- and lower case letters, before splitting the string. This will yield a list of values that can be transformed before it is expanded - limiting the number of rows to add to the table.

I expect there will be other transformations, we can consolidate, for example you can transform all list values to lower case before expanding and so on but then I will need to know more about what is and needs to happen in your query…

Let me know if this helps

Hey @Melissa

Thank you for the code. It took me some time to test, but i do see that it has helped reduce some of the load time .

I am finding it really interesting and loving it to see how nesting the steps together can help reduce the load time. -

here is the entire code that i have used by incorporating the code that you have mentioned above.

let
    Source = #"FactComments*",
    #"Removed Other Columns" = Table.SelectColumns(Source,{"FeedbackID", "Clean Comments", "Sentiment"}),
    Transform = Table.TransformColumns( #"Removed Other Columns", 
        {
            {"Clean Comments", each 
                List.Select( 
                    Text.Split( 
                        Text.Select( _, {"A".."Z", "a".."z", " "}), " "
                    ), (x)=> let n = Text.Length(x) in (n >3 and n <20)
                ) 
            }
        }
    ),
    Expand = Table.ExpandListColumn(Transform, "Clean Comments"),
    #"Trimmed Text" = Table.TransformColumns(Expand,{{"Clean Comments", Text.Trim, type text}}),
    #"Cleaned Text" = Table.TransformColumns(#"Trimmed Text",{{"Clean Comments", Text.Clean, type text}}),
    #"Lowercased Text" = Table.TransformColumns(#"Cleaned Text",{{"Clean Comments", Text.Lower, type text}}),
    //  - removing stop words be creating an AntiJoin
    #"Merged Queries" = Table.NestedJoin(#"Lowercased Text", {"Clean Comments"}, StopWords, {"Column1"}, "StopWords", JoinKind.LeftAnti), 
    #"Removed Other Columns1" = Table.SelectColumns(#"Merged Queries",{"FeedbackID", "Sentiment", "Clean Comments"}),
    #"Grouped Rows" = Table.Group(#"Removed Other Columns1", {"FeedbackID", "Sentiment", "Clean Comments"}, {{"Count of Words", each Table.RowCount(_), Int64.Type}}),
    // adding category words ( or word groups) from another list
    #"Merged Queries1" = Table.NestedJoin(#"Grouped Rows", {"Sentiment", "Clean Comments"}, #"Word Groupings", {"Sentiment", "Word"}, "Word Groupings", JoinKind.LeftOuter),
    #"Expanded Word Groupings" = Table.ExpandTableColumn(#"Merged Queries1", "Word Groupings", {"WordGroup"}, {"WordGroup"})
in
    #"Expanded Word Groupings"

Looking forward and super excited to see your suggestions on consoldiation of this code. :slight_smile:

Hi @jsodhi,

let
    Source = #"FactComments*",
    #"Removed Other Columns" = Table.SelectColumns(Source,{"FeedbackID", "Clean Comments", "Sentiment"}),
    Transform = Table.TransformColumns( #"Removed Other Columns", 
        {
            {"Clean Comments", each List.Transform(
                List.Select( 
                    Text.Split( 
                        Text.Select( _, {"A".."Z", "a".."z", " "}), " "
                    ), (x)=> let n = Text.Length(x) in (n >3 and n <20)
                ), (y) =>  Text.Lower(Text.Trim(Text.Clean(y))))
            }
        }
    ),
    Expand = Table.ExpandListColumn(Transform, "Clean Comments"),
    // #"Trimmed Text" = Table.TransformColumns(Expand,{{"Clean Comments", Text.Trim, type text}}),
    // #"Cleaned Text" = Table.TransformColumns(#"Trimmed Text",{{"Clean Comments", Text.Clean, type text}}),
    // #"Lowercased Text" = Table.TransformColumns(#"Cleaned Text",{{"Clean Comments", Text.Lower, type text}}),
    //  - removing stop words be creating an AntiJoin
    #"Merged Queries" = Table.NestedJoin(Expand, {"Clean Comments"}, StopWords, {"Column1"}, "StopWords", JoinKind.LeftAnti), 
    #"Removed Other Columns1" = Table.SelectColumns(#"Merged Queries",{"FeedbackID", "Sentiment", "Clean Comments"}),
    #"Grouped Rows" = Table.Group(#"Removed Other Columns1", {"FeedbackID", "Sentiment", "Clean Comments"}, {{"Count of Words", each Table.RowCount(_), Int64.Type}}),
    // adding category words ( or word groups) from another list
    #"Merged Queries1" = Table.NestedJoin(#"Grouped Rows", {"Sentiment", "Clean Comments"}, #"Word Groupings", {"Sentiment", "Word"}, "Word Groupings", JoinKind.LeftOuter),
    #"Expanded Word Groupings" = Table.ExpandTableColumn(#"Merged Queries1", "Word Groupings", {"WordGroup"}, {"WordGroup"})
in
    #"Expanded Word Groupings"

For optimizing merges, see here:

I hope this is helpful

Hey @Melissa

Thank you for helping with the code. I’ll also be going through the Chris webs website for more reading.

Regards,
Japjeet

1 Like