Avoid refreshing table

Hi all,

I have two tables with identical structure:

  • Sales2022
  • SalesPrior2022 (in which the Sales2022 is appended). All model measures are based on this one.

I need to shorten refresh time by not refreshing SalesPrior2022 data and also keep model size small (by avoiding another table).

How do you suggest I should handle that?

Hi @Thimios, we aim to consistently improve the topics being posted on the forum to help you in getting a strong solution faster. While waiting for a response, here are some tips so you can get the most out of the forum and other Enterprise DNA resources.

  • Use the forum search to discover if your query has been asked before by another member.

  • When posting a topic with formula make sure that it is correctly formatted to preformatted text </>.

image

  • Use the proper category that best describes your topic
  • Provide as much context to a question as possible.
  • Include the masked demo pbix file, images of the entire scenario you are dealing with, screenshot of the data model, details of how you want to visualize a result, and any other supporting links and details.

I also suggest that you check the forum guideline How To Use The Enterprise DNA Support Forum. Not adhering to it may sometimes cause delay in getting an answer.

Please also check the How To Mask Sensitive Data thread for some tips on how to mask your pbix file.

Hello @Thimios,

Thank You for posting your query onto the Forum.

The answer to your first query - “I need to shorten refresh time by not refreshing SalesPrior2022 data”.

Answer: If the historical numbers are not going to change anymore i.e., in this case, Sales Prior 2022, then you can implement “Incremental Refresh” option in order to reduce the refresh time since it’ll only refresh the data for the period which is mentioned under the that option, in this case, Sales 2022. Below are few of the links provided for reference pertaining to the “Incremental Refresh” topic.

The answer to your second query - “also keep model size small (by avoiding another table)”.

Answer: Since all the measures in your model are based on the table - “Sales Prior 2022”, then there’s no need to bring or load the data of “Sales 2022” in your Power BI front-end side. You can un-check the option of “Enable Load” option inside the Query Editor and therefore, that additional table will no more be loaded in your data model thereby ultimately reducing the file size as a whole.

Hoping you find this useful and meets your requirements that you’ve been looking for.

Thanks and Warm Regards,
Harsh

1 Like

Thank you @Harsh for recommending these guides.

A. I’m aware of both Ruth’s and Reid’s guide to incremental refresh but -unfortunately- my tables are stored in Excel files in SharePoint so that won’t work.

B. Sales 2022 should be always loaded as new data are entered. So if SalesPrior2022 is appended in Sales2022, there is no gain in model size.

I’m open to any workarounds :slight_smile:

1 Like

Do you have access to Dataflows?

Yes @DavieJoe I do.
Already created a test one in the related workspace. What are you suggesting?

Have a data flow that only contains the data that you don’t need to refresh. Don’t set a refresh schedule for it, you now have your static piece of data.

And, if for some reason you need to refresh it in the future, you could just do a manual refresh on it.

1 Like

So once I get data from this DataFlow into my BI Desktop model and disable refresh, how do I “connected” the current year sales table so that I have one sales table to which all measures are based?

1 Like

I’d have the other data as a data flow also and append the static data to the updated data. You now have one table.

Thanks @DavieJoe.
Wouldn’t append increase the model size as old data will be included in two different tables?

Data Flow 1 - this is your static data

Data Flow 2 - this is your “live” data

Data Flow 3 - Append Data Flow 1 onto Data Flow 2.

Your Power BI model will only connect to data Flow 3.

How big are each sets of data?

1 Like

Now I get it @DavieJoe :sunglasses:
Past data is about 2m rows and current about 500k rows.
Will give it a try and let you know.

1 Like

Should work without any issues

1 Like

This is my first time with DataFlows, so could you guide me how to create Data Flow 3?
I created a copy of Data Flow 2 and appended Data Flow 1 but I get a message that “Computed tables require Premium to refresh. To enable refresh, upgrade this workspace to Premium capacity, or remove this table”

Ah so you don’t have premium capacity then?

Just store you old data in the static data flow. Ingest your new data as normal and then append the static data onto the new data.

Power BI will cope with this amount of data no problem.

1 Like

Let me know if I understand right:

  1. Create Data Flow 1 (past sales) in Service and create new Query in BI Desktop for that Data Flow. Refresh once and then disable refresh.
  2. In BI Desktop, create table Sales Data Current (refresh enabled).
  3. In BI Desktop, append Data Flow 1 Query in table Sales Data Current.

Will this lead to loading only Sales Data Current rows?

1 Like

Your data model will load both Past Sales and Current Sales Data. However, it will not have any effect on your refresh times as during refresh only your Sales Data Current will be re-imported. If you need to reduce Past Sales Data size further, you can opt for Group by and Summarize Past Sales at a higher granularity. Say for example, month, quarter, Business Unit etc. Also, if you do not want to load Prior Sales Data to data model, you can always Disable Load in Power Query.

1 Like

Thanks @pranamg.
What is the difference in model size and refresh time between what you’re suggesting and importing past sales tables directly in BI Desktop and then disabling refresh?

Model Size is the total size of the data model (compressed into VertiPaq) that is loaded into memory in the Storage Engine. Every time you refresh the report (for a report in which you have used import method; this does not apply to Direct Query or Live connection), data will be imported into all tables (except tables where Load is disabled and tables not included in Report Refresh) and ETL steps in these tables will be applied. The time taken in re-loading the data from data sources and re-processing the ETL steps is called Refresh time. However, in your case, you can safely Disable Load and uncheck include in Report Refresh for the Past Sales data table as the you do not expect this data to change after the initial import. Hence every time you refresh, only the Current Sales data is re-imported. As suggested above, you can add a final step in your Current Sales Data to append the Prior Sales Data to the Current Sales Data. Also, by disabling load, the Prior Sales Data will not be visible in Data model and will not be available for use in Visualizations or measures.

Reference:

2 Likes

I found out that none of the solutions above improves performance (refresh time, file size) compared to my current setup.
What I have right now is past sales imported from SharePoint Excel tables (with disabled refresh) while current sales table has enabled refresh and appended in past sales table. When manually refreshing in BI Desktop, ALL sales lines are loaded (past & current).
Unfortunately, this is also the case in the suggested solutions.