Hello,
We have developed some Direct Query Power BI reports which connect to our AWS SQL Server database and refresh every 5 minutes. Leadership loved the POC and wanted to scale to multiple locations, but now we are finding it creates to heavy a load on the server so have paused the project.
The query is not terribly complex. It has a few subqueries, many joins on 10 tables, some of which are > 1 million rows and outputs only ~5k rows.
What are some potential solutions and best practices given this situation? Here are some things I am considering
-Use staging tables with SQL Server Agent job
-Send data to our datalake in Snowflake instead of querying the production database (not sure how to productionalize this, I made a Python demo which sends the data to Snowflake, so in theory this could work). Maybe use third party service for this?
-Enable CDC in database (again not sure which product(s) we would use after this is enabled)
-Use Azure Synapse Analytics since it is purpose built to handle Direct Query at scale (probably use the same Python script, but this does not seem like a robust solution, surely there is an out of the box product in Azure)
Any help, ideas, suggestions are greatly appreciated.
Thanks for your time,
Joe