Need to improve the amount of data we can load

To Summarise the issue:
I am loading in fact data from 3 views (really needs to be 6 views)
Each view is circa 5million records
Each view contains around 20 surrogate (foreign) keys that are needed

I cant load in the data in Power BI desktop as the laptop is incapable of the volume of data (CPU etc just max out)

I had moved to Power BI Service. However I can not get more than three years into one dataflow without it failing (timeout generally). So in mu dataflow I have three queries that are diasabled and one Sales Query that appends all three sales queries (one Query that rules them all :slight_smile: ).

I just cant see a way forward. The workspace is premium and I cant see any way forward apart from one dataflow per fact view and then join them in the data set.

Any suggestions? Having looked at the data I dont think Aggregations would help. The query to get the data is optimised, however runing it through SQL Server Management Studio it is not fast which I take as a sign the SQL box is bad or the network is bad.

Bit stuck now.
Thanks
E

Hi @ells. I agree with you … it sounds like there are issues with your SQL Server and/or network … I’ve used SSIS packages to read, process, and load a data warehouse with 60 million rows from 8 different data sources (database types actually) in under 20 minutes (and this was with SQL Server 2008 [old]) (I’ve used large datasets many times, but this case is the only one I can off-the-top-of-my-head remember stats on…). The “cheapest” way forward is often new hardware … a new box with the latest version of SQL Server you can get might be a plan.
Greg

@Greg
Thanks. 2008 - thats old School BIDS. Unfortunatelly more and more details are coming to light about the SQL box. There are a lot of things between me, the data, and Power BI so there could be other issues. I cant see the SQL box being replaced in the near future.

However something unexplained might have happened. As I was exporting and importing the dataflow json file (and changing the native query option in the file). I noticed an option for “link entities from other workflows”.

This started me thinking. I know I can have 2 views in a data flow and it will load so what about 3 dataflows (each containing 2 of the fact views), add a third to contain the appended Query and we might be done.

There is a note about needing Premium for the linked entity refresh. The three dataflows that get the data each have two queries (they are not enabled for load) Then the fourth is the new query that appends all 6 underlying. This is enabled for load. This will be the query that supplies the data set.

I have an identifier in the data that shows which view so if the report cant sustain 6 years of data I can cghnage the query in the data set.

Just need to read up on Linked entities to see if there are any down sides to this approach.
Thanks
E

Hi @ells, did the response provided by @Greg help you solve your query? If not, how far did you get and what kind of help you need further? If yes, kindly mark the thread as solved. Thanks!

Hi @ells, a response on this post has been tagged as “Solution”. If you have a follow question or concern related to this topic, please remove the Solution tag first by clicking the three dots beside Reply and then untick the check box. Thanks!