Power Query Slow

Hi,

Just after a bit of advice on the Query Editor.

In Excel or Power BI Desktop, the Query Editor can take a while, sometimes what seems like ages, to perform a basic change such as changing a whole column's data type, sorting a column, filling up or down, etc.

In other cases it's quite quick.

Is this normal for a data set of around 1 million rows? (Or is the data set maybe corrupt?)

My import source is generally a folder of CSV files, text files, or an Excel table from SAP.

Usually I grab one large data set initially and then reference off that main import query to generate smaller lookup tables. I tend to use Reference because I may add more files to the folder, and I want my main source query to pick up that change and feed my smaller tables; I don't believe Duplicate does that?
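
For context, a referenced query is just one whose Source step points at the main query by name, so it re-uses whatever that query returns. A minimal sketch, with a placeholder name for the main query:

// Lookup query created via Reference: its Source IS the main import query,
// so anything collected by that query flows through here automatically.
// #"Main Import" is a placeholder for whatever the main query is called.
let
    Source = #"Main Import"
in
    Source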

What is the best method to speed things up?

Would I be better off creating smaller tables on a tab in Excel, which in turn are linked to a query source data set, so I have the ability to add more if needed?

Is there a limit before my processor starts to struggle and run slowly?

I'm running a laptop with an i7 and 8 GB of RAM.

In the attached video, the last query broken off from the merged table query took 4 minutes to complete (Materials query, 25 rows).

It's almost as if the data is being retrieved from a SharePoint server, but it's actually local on the same PC.

Thanks in advance

Darren

(Excel file sizes: 500k-row files)

There are a lot of different factors that can slow down a query model - without seeing the steps you are taking in the query editor, it's hard to say what might be the issue here.
I've seen all sorts of operations, including grouping rows, act as resource hogs.
Sorting is one that a colleague of mine reported as an issue for one of his Excel PQ models (he finally opted to just sort the final result once it was back in Excel).

You mention referencing tables, but I don't see that in your queries - I do see that you have merged tables (I assume Period10 & Period11 were loaded and then merged into a whole).

Have you considered combining those tables as you load them? (A sketch of one way to do that follows below.)
Does this work need to be done in Excel, or can you work in Power BI Desktop? It has some optimizations that the Excel Power Query option doesn't seem to have.
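
Purely as a sketch, assuming the two period queries are named Period10 and Period11, appending (stacking) them into a single table in one query would look something like this:

// Append the period tables into one combined table.
// Period10 and Period11 are illustrative query names; any further
// period queries can simply be added to the list.
let
    Source = Table.Combine({Period10, Period11})
in
    Source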

Hi Heather,

Thanks for your reply and information.

Please see the screenshots of my PQ.

I'm using just the P11 Import Data file this time.

500k rows.

I created a PQ reference of that query to create a Plants lookup table (25 rows).

That took around 4 mins to complete and load into the model :frowning:

I did filter the table in Excel before loading into PQ - thanks for that tip.

And when generating the Plants table, which took five minutes - what steps were taken?
If this were my data, the steps would be (a quick sketch follows this list):

  1. remove all extra columns (select "Plant" and "Name 1", then choose Remove Other Columns)
  2. select the "Plant" column and remove duplicates
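
A rough sketch of those two steps in M, assuming the lookup references the main query (called P11 here) and the columns carry the names used in its code (Plnt and Name 1):

// Plants lookup: reference the main query, keep two columns, then
// remove duplicate plant rows.
let
    Source = P11,
    #"Removed Other Columns" = Table.SelectColumns(Source, {"Plnt", "Name 1"}),
    #"Removed Duplicates" = Table.Distinct(#"Removed Other Columns", {"Plnt"})
in
    #"Removed Duplicates"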

Hi Heather,

I think I did what you suggested - that is how I normally do it (still learning).

https://drive.google.com/file/d/1k7p7OB6Xrwh9qTm4SVYJnuE5m1hZ3w0C/view?usp=sharing

Power Data Model - GDrive

Okay - so it appears that the issue is with your primary table; something in the source table is probably recalculating each time that you access it.

Is that source table loaded back to your model, or is it just a connection-only table?
One thing that can occasionally speed a model up in Power BI Desktop is to do all your clean/shape/transform steps on a connection-only version of the table (one that isn't loaded back to the model), and then reference that table to load it to the model. I don't know if the same logic holds true for Excel or not.
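
As a very rough sketch of that pattern (query names are illustrative, and the "connection only" part is a load setting on the staging query rather than anything in the M code):

// Staging query "P11 Staging" - holds all the clean/shape/transform steps
// and is set to Connection Only, so it never loads to the model itself.
// "Table1" is a placeholder for the source table's name.
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source, {{"Entry Date", type date}, {"Pstng Date", type date}})
in
    #"Changed Type"

// Load query "P11" - simply references the staging query and is the one
// that gets loaded to the data model.
let
    Source = #"P11 Staging"
in
    Source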


Hi Heather,

On this occasion to make sharing easier, I have the dataset on a tab within the workbook.

I've added a link to the file in my last post, if that helps you understand the file better.

GDrive link

Thanks for your help :+1:

My queries are connection-only and added to the data model.

The source for the main query is an Excel table that feeds the query only.

Unfortunately, I get a "request access" prompt on your Google link, so I've sent that request, and I'll have to check back later today to see if it came through.

I realize from this answer that my question may not have been clear (simply because I'm not immediately sure what your answer means). So, this screenshot shows that my table, which is loaded to the data model, is NOT returned to the Excel spreadsheet:

(screenshot)

While this shows what it would look like with the table added to the spreadsheet after going through the query process:

(screenshot)

Please forgive me if this seems like an oversimplification - caffeine has not kicked in yet this morning :slight_smile:

Hi Heather,

Thanks for looking, I have just granted access to the GDrive file

Thanks

Darren

The source data is a table on a tab of the spreadsheet.

Any changes that I make in PQ are not loaded back to the Excel table.

Only back to the data model, if that helps.

Thanks again

Darren

Okay Darren, I copied your table into my own Excel file and duplicated your query steps exactly - my time to have both tables load to the data model was under 2 minutes. I then tried loading both tables back to the Excel file, and it took about 2½ minutes.

So I'm guessing there may be an issue with processing power, or even with your version of Excel?

That being said, I did go back and look at your query steps, and I think we can speed a few of them up.

This is your code for the P11 table:

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Entry Date", type date}}),
    #"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"Qty in UnE", "Units"}}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Renamed Columns",{{"Pstng Date", type date}, {"Time", type time}}),
    #"Renamed Columns1" = Table.RenameColumns(#"Changed Type1",{{"Reference", "Del Ref"}, {"PO", "PO No"}, {"Document Header Text", "Receipting Operative"}}),
    #"Filled Up" = Table.FillUp(#"Renamed Columns1",{"Receipting Operative", "PO No"}),
    #"Duplicated Column" = Table.DuplicateColumn(#"Filled Up", "Receipting Operative", "Receipting Operative - Copy"),
    #"Split Column by Position" = Table.SplitColumn(#"Duplicated Column", "Receipting Operative - Copy", Splitter.SplitTextByPositions({0, 5}, false), {"Receipting Operative - Copy.1", "Receipting Operative - Copy.2"}),
    #"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Position",{{"Receipting Operative - Copy.1", type text}, {"Receipting Operative - Copy.2", type text}}),
    #"Reordered Columns" = Table.ReorderColumns(#"Changed Type2",{"Plnt", "Name 1", "SLoc", "Material", "Material Description", "MvT", "User name", "Receipting Operative - Copy.1", "Receipting Operative", "Time", "Mat. Doc.", "Del Ref", "PO No", "Pstng Date", "Units", "Entry Date", "Receipting Operative - Copy.2"}),
    #"Removed Columns" = Table.RemoveColumns(#"Reordered Columns",{"Receipting Operative", "Receipting Operative - Copy.2"}),
    #"Renamed Columns2" = Table.RenameColumns(#"Removed Columns",{{"Receipting Operative - Copy.1", "Receipting Operative"}})
in
    #"Renamed Columns2"

Compare that to my code for the same table:

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Plnt", type text}, {"Name 1", type text}, {"SLoc", type text}, {"Material", Int64.Type}, {"Material Description", type text}, {"MvT", Int64.Type}, {"User name", type text}, {"Document Header Text", type text}, {"Time", type time}, {"Mat. Doc.", Int64.Type}, {"Reference", type text}, {"PO", Int64.Type}, {"Pstng Date", type date}, {"Qty in UnE", Int64.Type}, {"Entry Date", type date}}),
    #"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"Qty in UnE", "Units"}, {"Reference", "Del Ref"}, {"PO", "PO No"}, {"Document Header Text", "Receipting Operative"}}),
    #"Filled Up" = Table.FillUp(#"Renamed Columns",{"Receipting Operative", "PO No"}),
    #"Extracted Text Before Delimiter" = Table.TransformColumns(#"Filled Up", {{"Receipting Operative", each Text.BeforeDelimiter(_, " "), type text}})
in
    #"Extracted Text Before Delimiter"

Basically:

  • I put all the Changed Type steps into one
  • renamed columns only once
  • it looked like your copy step on Receipting Operative was only there to split the column and keep the first part, so I removed that step and just did an extraction instead
  • this meant there was no need to reorder the columns (NOTE - reordering columns is a step that hogs resources, so avoid it unless you need the data returned in a particular layout as a table in Excel)

Hi Heather,

Really appreciate your time and effort in having a look, and for offering some great, sound advice :+1:

I will adapt my models to follow your advice above.

thanks

Darren

Good luck with it, Darren - I know these big models are very frustrating, particularly when you have to work in Excel Power Query instead of Power BI Desktop.
