Native Query against on premise SQL

ells · September 28, 2020, 4:00pm

I am currently having a lot of pain with getting data into Power BI. I have a query that I use to get data from a SQL view. When you open Advanced editor when transforming the data you can see the select statement.
Now I thought this would be classed as a native query. But when I right click on the Source step Native Query is not on the context menu?

The query seems to be very slow.

If you open advanced editor you get
let
Source = Sql.Database(“Instance”,“DB”,[Query = SELECT col1,col2,…coln from v_someview" ,createNavigationProperties = false, CommandTimeout=#duration(0,4,0,0])

This is used against three views each one has circa 5 m rows.

I have no control over the view so I can not change the SQL or remove columns etc. This is why I am using a Query.

Is there a more efficient way to do this?

Thanks
E

EnterpriseDNA · September 29, 2020, 12:46am

Hi @ells, we aim to consistently improve the topics being posted on the forum to help you in getting a strong solution faster. While waiting for a response, here are some tips so you can get the most out of the forum and other Enterprise DNA resources.

Use the forum search to discover if your query has been asked before by another member.
When posting a topic with formula make sure that it is correctly formatted to preformated text </>.
Use the proper category that best describes your topic
Provide as much context to a question as possible.
Include demo pbix file, images of the entire scenario you are dealing with, screenshot of the data model, details of how you want to visualize a result, and any other supporting links and details.

I also suggest that you check the forum guideline https://forum.enterprisedna.co/t/how-to-use-the-enterprise-dna-support-forum/3951. Not adhering to it may sometimes cause delay in getting an answer.

TonyaM · October 1, 2020, 12:37am

Hi ells!
Instead of looking at Advanced Editor, you can click on the gear icon next to Source. That will open up a window to paste in an actual query against the View. I have written queries in SQL and pasted them right in. You can join Views, Tables, etc. You can use SubQueries, CTE’s, etc. The tables I deal with are huge - and I am coming from a SQL background, so it’s easier for me to use my own queries.

Good Luck!!
-Tonya

EnterpriseDNA · October 1, 2020, 12:42am

Hi @ells, a response on this post has been tagged as “Solution”. If you have a follow question or concern related to this topic, please remove the Solution tag first by clicking the three dots beside Reply and then untick the check box. Thanks!

ells · October 1, 2020, 7:03am

@TonyaM
Thanks. I have just read what I posted. I am looking for query performance. Thanks I will look at that as it has got to be easier to read.

I have a similar background. I am struggling in that 15m rows will take longer than 5 hours to load. I am pretty sure it is the SQL Server that is the issue and looking at Query execution plans but if there is anything I am doing wrong in Power BI I want to correct that.

Many Thnaks
E

Greg · October 1, 2020, 12:06pm

Hi @ells. When I encounter a performance issue getting data from SQL Server, I use the same query in SQL Server Management Studio and check the database server’s execution plan and optimize the query there first; Power Query is unlikely to be able to do much to make a slow query faster. Would it be possible to reduce the amount of data retrieved by the query (fewer rows, fewer columns, integer keys instead of string or date/time values, etc.)?
Greg

ells · October 1, 2020, 6:41pm

@Greg
Thanks.
Cant reduce the amount of data. I am not even retrieving half of what the users want. In all honsesty the voulmes dont seem to be the issue. Sure there is 15M rows and with a lot of dimensions there are some 15 foreign keys.
On the SQL side the query runs in 15monutes for each 5M row chunks. However when loading into Power BI desktop to build the data set this fails after 5 hours - due to timeout.
SQL is controlled by another company so waht I can achieve is limited. I can see some issues but they dont make the inroads that I need, There are a couple of Hash Matches which I have not come across before.

I can see the laptop I am using is struggling (common to see CPU and memory hit 90% or more), can also see the disk filling up so assume it is spooling to disk.

I am going to give it a go sticking the fact in a data flow. Although if that works then we may have some issue with security so not ideal.

Thnaks
M

ells · October 2, 2020, 6:07am

@Greg
I spent a bit of time yesterday looking further into this. I decided to create a data flow and see how that performs. Went and refreshed it with 4 years of data. Came back an hour later and ut had completed the refresh.

It appears that the combination of laptop and Powewr BI desktop are causing the issue.

Does it load the data from SQL to memory When I have clicked ‘close and apply’ this is the point it pauses and struggles?

Cheers

Greg · October 2, 2020, 9:16am

Yes … when transforming your data, only schema and a subset preview are loaded into the model … as you said, it’s when you hit “close-and-apply” that all of the data is loaded into the model. Glad it’s now better for you. Greg

ells · October 8, 2020, 8:16am

@Greg
ok so now I have 15M records in the repoort I have to build another report with cut down dimensions and all 30 M rows.

The Power BI Service will not even complete validating the queries in the data flow. At this rate I dsont think we will ever let Qlik go.

Cant load the data into the data model in a timely manner either vi desktop or Power BI Service
Queries in the report take too long to get and show the data.

To clarify how long is too long - I can go put the kettle on make a coffee and the screen has not moved.

Rather lost
E

Greg · October 8, 2020, 11:46am

Hi @ells. Just a thought … have you investigated aggregations, as this could potentially vastly reduce your rowcount if your report is going to roll-up results anyway… Greg

ells · October 8, 2020, 2:17pm

@Greg,
I did start to look at it.
i think one of the problems is the number and width of the dimensions. If I did a count of rows on the table, Then did a distinct count based on the foreign keys I am using I would get circa 85% of the total.
I am not sure that Aggregations would work with that sort of percentage?

Have not been able to find any deep knowledgable info on them so have not placed it at the top of the list and as I am struggling to get the data into workflow / model this is where I am focussing.
Thanks
E

Greg · October 8, 2020, 2:25pm

The only other things I can think of are:

revisit the report design and be really specific about the question it is supposed to answer; perhaps it doesn’t need the 30M rows but could “get-away” with a subset
do your rollups in your on-premises data source and only pull the results table into Power BI

I don’t know if one or both of these would be of any help to your situation …
Greg

ells · October 9, 2020, 6:15am

@Greg,
Many thanks. Currently in the standard “between a rock and a hard place” position.

“Qlik does this with all of the data. Qlik does the security we want.”

I cant even match the volumes of data - I am at 50%
Anything to help with column level security will bloat the bloating model
I had been thinking about Aggregations but I dont think it will mix with the column level security

Currently I cant get the full fact data into a data flow or a data set.
oh well
E