@dimbroane, one of the drawbacks with using SQL views is that they have to compile every record from the underlying data source before applying any externally-provided filters (i.e. the view can’t say whether to include or exclude a row unless it checks all the underlying data first). That being said, properly designed views hide enormous complexity in the underlying data architecture and are generally intended to surface data for consumption for tasks like yours.
Specifically addressing incremental refreshes versus direct query: do not direct query a mammoth view, for the very reason specified in my first paragraph. But do attempt an incremental refresh if possible. The exact mechanism for incremental refreshes depends heavily on the data fields available and the needs of your report. The following info applies whether you are sourcing from a view or from tables.
Ideally, every row being imported will have some kind of date you can use to determine the “age” of the data. That could be an order date, record created date, record last modified date, etc. Then you only import rows that are of a certain age (i.e. last 10 years, last 30 days, etc.). The more complex version is “give me everything that has changed since X date”, with “X date” typically be the last time you ran the import. It gets uglier if, for example, orders can be modified after the order date but you have no modified date indicating the record was changed.
Even if you don’t have the ability to use the incremental refresh capability in the Power BI service (which I do not have because I am on-prem), you can still import only the last X days/years of records into a staging table in Power BI and then merge or append as needed to the primary data model fact table (and dimensions, if they are derived from the big import).
Other options include creating or asking for a SQL view optimized to your needs. I get to “cheat” because I develop all my own SQL Server architecture , but even in a big company you could ask for your own view.
All the other advice about reducing the columns is of course critical. You should only import the exact columns you need, especially with massive datasets where even with modern tech you can hit bandwidth and server performance bottlenecks.
Hope this helps with your main question.
John C. Pratt