Only refresh new files from folder in Power Query

Hi everyone,

My sales data is saved at the beginning of each month in a CSV file on a local folder. I’ve created a data model in Power Query/PBI using as source the local folder.

Whenever the PBI file is refreshed, Power Query will run through all the files on the folder to update the data. This can be take some time with many historic data and will get slower as the data (history) grows.

Is there a way in Power Query (or any alternative solution) to only refresh the new files added to that folder and not everytime all the files in the folder? In other words, to load the file not previously loaded before, leaving everything else as-is?

Thank you,
Mariano

Hi @Mariano,

Check out incremental refresh.

I hope this is helpful

Hi @Melissa,
Thank you for your answer! I tried in the past with incremental refresh but was not able to make it work. I will for sure give it another try :slight_smile:
I am curious if there is a solution in PowerQuery by creating a kind of “historic” table (that is not being refreshed in its totallity) but it grows by appeding the rows from (only) the new files saved on the local folder?
Best,
Mariano

Theoretically, you can pull the file names from the already imported files (Usually SourceName column) and create a separate query. Then compare them against the files in input folder and do an outer merge (retain only files that are not listed in the data model) and import those files in. This cycle could continue in succeeding months too.

Hi @Mariano,
if incremantal refresh doesn’t work for you, this could be an option for you as you seem to work with desktop:
Incremental Load in Power BI using DAX UNION – The BIccountant
Other option would be possible if you are only interested in some aggregated extract of the data and not the complete data set. In that case you could create a dump for the processed data and then build your own incremental refresh by processing only new files and append the previously processed data. I have this working in Excel for cases where otherwise I’d have to process hundreds of historical files.
Regards,
Matthias

2 Likes

Thank you @Matthias, that seems to work for my current case!

For future reference, this is how I tested it with a smaller size of the data:
The historic years files are saved on the local folder: “SalesHistoricYears”
The (new) current year files are saved on the local folder: “SalesCurrentYear”

In PowerQuery I load from Folder the two tables:
SalesHistoricYearsTable => “Untick” Include in Report Refresh
SalesCurrentYearTable => Include in Report Refresh (as usual)

With DAX create a new Table using UNION (SalesHistoricYearsTable, SalesCurrentYearTable)
As the article says, both Tables need to have the same columns in the same order.

At the beginning of every year there would be a manual step of moving the files from SalesCurrentYear folder to SalesHistoricYears folder and refresh once manually the SalesHistoricYearsTable.

PBI incremental refresh would be the ideal solution but this solution (for my current case) seems to work.

Thank you again for all the answers! @Melissa @pranamg @Matthias
Mariano

2 Likes