Latest Enterprise DNA Initiatives

Sql Database Speed?

Hey gang,

Does an Sql data base drastically improve the speed of a power bi refresh over and above data in Excel?

Chris

@cms418,

I know very little about SLQ, but I recall reading an article from Chris Webb where refresh on a large flat file took nearly 7 times longer in his testing then refresh from an optimized dimensional model. I also know you can test this if you have SQL Profiler installed using the external tools capability in DAX Studio.

But I’ll send up the Bat Signal to EDNA Experts @Greg and @hafizsultan , both of whom know a ton about SQL.

  • Brian

Hi @cms418.

For small datasets, it’s likely that there wouldn’t be a noticeable difference in loading performance, unless the speed of the network connections differ and/or the loading of the database server itself is high. For large datasets, however, I would choose a SQL Server database data source over an Excel data source every time.

Another “feature” of Excel (at least back in the mid-2010s, not sure if it still applies…) is that it considers only the first 7 rows of a spreadsheet to determine the column data type, and given that Excel cells can contain anything, it is incumbent upon the the spreadsheet author to have only a single, consistent data type per column. A SQL Server data source will always have a defined data type for every column, so consistency is ensured.

At the end of the day, however, the typical consultant’s answer “… it depends …” is the best I can give. The specific scenario (e.g., what is the data set area [rows x columns], network access speed, database server loading, etc.) will need to be tested to discover the best solution for each application.
Greg

(P.S.: I’ll pencil-in a test for the near future where I make up and load both a small and large dataset from both Excel and SQL Server and see what the differences in loading performance are, and I’ll add my findings to this post when available.)

2 Likes

Thanks that would be great!

I’m your humble opinion, what’s the line of small vs large in a data set?