Best practice to get data from REST API that requires paging

Hi,

I have a REST API source that requires paging, i.e. load the same URL with parameters ?page=1, ?page=2, ?page=3, … until empty result is returned.
I can use web connector and build these requests programmatically from a static list of (page) numbers, but this has two disadvantages that I need a solution for:

  • The absolute killer for this solution is that you cannot schedule automatic refresh in Power BI service! It says, the source URL is built dynamically, so Power BI cannot determine for what data source to ask for authentication credentials (although not needed for my source) and security settings. The problem is desbribed here in section " Refresh and dynamic data sources": https://docs.microsoft.com/en-us/power-bi/connect-data/refresh-data
  • Even a long static list of prosepective page numbers is not a really automated state of the art solution, but that’s the minor problem.

So, how do you connect to REST APIs that require calling an undefined number of pages to load a full table, if you need scheduled refresh in Power BI service? Isn’t there a build-in solution for this common scenario in Power BI?

For now I’ve built a dozen of queries manually - one query per page - to cover my data growth for a year or so, but that’s not a survivable solution.

Thanks a lot in advance!

Best regards,
Martin

Hi @Martin,

Not sure but I hope these links will help you.

1 Like

@Martin,

I literally came across a solution to this exact issue a couple of days ago when I was working on a web scraping problem. I can’t find the article at the moment, and am headed off to a meeting but will dig it up for you and post later tonight.

  • Brian
1 Like

@Martin,

Here’s the path I went down the other night on this:

The original article that get cited a lot is this one:
https://www.mattmasson.com/2014/11/iterating-over-multiple-pages-of-web-data-using-power-query/

However, this takes a static, brute force approach to the multiple page problem.

These two sources improve on that approach by iterating until an empty table is hit


And Gil Raviv provides an approach for dealing with cursor-based pagination:

From the Datachant post, I then located two Chris Webb blog posts that address the problem of doing this in scheduled refresh:


I’m still very much in learning mode about some of this, but I hope at least a portion of what I’ve found is helpful to you.

  • Brian
1 Like

Thank you @BrianJ
The believe that there is a solution to the dynamic data source problem already helped me to find a solution:
https://m.youtube.com/watch?feature=youtu.be&v=_SWj249xAxQ
BR
Martin