Deneb Example - Regression

Deneb/Vega-Lite can be used to create regression analyses in Power BI. Vega-Lite has six regression algorithms built in: linear, logarithmic, exponential, power, quadratic, and polynomial. The example visual presented here is a scatter chart of movie ratings overlaid with a regression line and an R-squared statistic.

This example illustrates a number of Deneb/Vega-Lite features, including:

0 - General:

  • use of a “params” block to create a radio-button screen widget for selecting the regression algorithm
  • use of a “layer” block consisting of 3 sections: scatter chart, regression line, and R-squared statistic
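
As a rough sketch of the two General points above (not the exact spec in the attached PBIX), a radio-button parameter and a three-layer skeleton could look like the following. The parameter name `_regression_keyword` is taken from the transform snippet quoted later in this thread; the widget label and the default value are assumptions, and the three layers are mark-only placeholders. The option strings are the method keywords the Vega-Lite regression transform accepts, and `dataset` is the name Deneb gives to the data passed in from Power BI.

```json
{
  "description": "Sketch only: radio-button param plus a three-layer skeleton (label and default are assumptions)",
  "data": {"name": "dataset"},
  "params": [
    {
      "name": "_regression_keyword",
      "value": "linear",
      "bind": {
        "input": "radio",
        "options": ["linear", "log", "exp", "pow", "quad", "poly"],
        "name": "Regression method: "
      }
    }
  ],
  "layer": [
    {"description": "Placeholder: scatter chart", "mark": {"type": "point"}},
    {"description": "Placeholder: regression line", "mark": {"type": "line"}},
    {"description": "Placeholder: R-squared statistic", "mark": {"type": "text"}}
  ]
}
```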

1 - Scatter Chart:

  • use of a “point” mark to display IMDB rating vs. Rotten Tomatoes rating (the raw data)
    • both the X and Y axes use custom axis tick counts (10)
    • custom tooltip with title, release year, release date, IMDB rating, and Rotten Tomatoes rating
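
A minimal sketch of such a scatter layer is shown below. The `IMDB Rating` and `Rotten Tomatoes Rating` field names come from the transform snippet quoted later in this thread; the `Title` and `Release Date` field names are assumed from the movies JSON file and may differ in your own dataset, and deriving the release year with a `timeUnit` is just one possible approach.

```json
{
  "description": "Sketch of the scatter layer; tooltip field names are assumptions based on the movies JSON file",
  "data": {"name": "dataset"},
  "mark": {"type": "point"},
  "encoding": {
    "x": {
      "field": "Rotten Tomatoes Rating",
      "type": "quantitative",
      "axis": {"tickCount": 10}
    },
    "y": {
      "field": "IMDB Rating",
      "type": "quantitative",
      "axis": {"tickCount": 10}
    },
    "tooltip": [
      {"field": "Title", "type": "nominal"},
      {"field": "Release Date", "type": "temporal", "timeUnit": "year", "title": "Release Year"},
      {"field": "Release Date", "type": "temporal", "title": "Release Date"},
      {"field": "IMDB Rating", "type": "quantitative"},
      {"field": "Rotten Tomatoes Rating", "type": "quantitative"}
    ]
  }
}
```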

2 - Regression Line:

3 - Regression Statistic:

  • use of a “transform” block to apply the (selected) regression algorithm to the raw data and calculate the R-squared statistic
  • use of a “text” mark located at the center of the visual (width/2; height/2) to display the R-squared statistic
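
The statistic layer could be sketched roughly as follows. Setting `"params": true` on the regression transform makes it return the model parameters (including an `rSquared` field) instead of the fitted points. The method is fixed to `"linear"` here only to keep the fragment self-contained, and the `r2_label` field name and the label format are illustrative assumptions rather than what the PBIX uses.

```json
{
  "description": "Sketch of the statistic layer; method fixed to linear, r2_label is a hypothetical field name",
  "data": {"name": "dataset"},
  "transform": [
    {
      "regression": "IMDB Rating",
      "on": "Rotten Tomatoes Rating",
      "method": "linear",
      "params": true
    },
    {
      "calculate": "'R-squared: ' + format(datum.rSquared, '.2f')",
      "as": "r2_label"
    }
  ],
  "mark": {
    "type": "text",
    "x": {"expr": "width / 2"},
    "y": {"expr": "height / 2"}
  },
  "encoding": {
    "text": {"field": "r2_label", "type": "nominal"}
  }
}
```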

The intent of this example is not to provide a finished visual, but rather to serve as a starting point for further custom visual development.

Also included is the sample PBIX using the movies JSON file provided on the Vega-Lite examples website.

This example is provided as-is for information purposes only, and its use is solely at the discretion of the end user; no responsibility is assumed by the author.

Greg
Deneb Examples - Regression.pbix (1.5 MB)


marking as solved

Thanks a lot Greg, very interesting.
If I may ask two questions:
1/ Is the signal the key to forcing the data transformation to update after the user changes the selection? I am having some difficulty with this signal.
2/ I do not see the dataset that results from this transformation (nothing seems to be added?), so I do not understand which field the line is drawn on. Since you have set up a regression based on two fields, do you just need to reference those same fields to draw it?

Fabrice

Hi @fabrice.aunez.

1 - Yes, once the user has selected a regression method, the signal is used to reference it in the transform; AFAIK, you can’t do this in Vega-Lite, but Davide Bacci’s method (see above) of using Vega within Vega-Lite seems to work great:

  "transform": [
    {
      "regression": "IMDB Rating",
      "on": "Rotten Tomatoes Rating",
      "method": {
        "signal": "_regression_keyword"
      }
    }
  ],

2 - I’m not sure exactly, as Vega-Lite does the regression calculations internally, but I think it is the REGRESSION_LINE_marks dataset; I certainly didn’t add anything.
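
As a hedged illustration of why nothing needs to be added: per the Vega-Lite documentation, the regression transform replaces the layer's data with the fitted points, and by default the output fields reuse the names of the “on” and “regression” fields. The sketch below spells those defaults out with `as` purely for clarity, and fixes the method to `"linear"` to stay self-contained, so the line mark can encode the same two field names as the scatter layer.

```json
{
  "description": "Sketch of a regression line layer; 'as' only restates the default output field names",
  "data": {"name": "dataset"},
  "transform": [
    {
      "regression": "IMDB Rating",
      "on": "Rotten Tomatoes Rating",
      "method": "linear",
      "as": ["Rotten Tomatoes Rating", "IMDB Rating"]
    }
  ],
  "mark": {"type": "line"},
  "encoding": {
    "x": {"field": "Rotten Tomatoes Rating", "type": "quantitative"},
    "y": {"field": "IMDB Rating", "type": "quantitative"}
  }
}
```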

Hope this helps.
Greg

Hi Greg, this post has really helped me with my Power BI work, so big thanks for that. I was wondering if you’ve had any experience plotting the confidence interval (CI) for a regression line using Vega-Lite? I see in the documentation there is a mark type of “errorband” with extent “ci”, but I cannot get it to plot the CI for the regression points themselves. Any help much appreciated.

Thanks,
George

Hi @geehaf53. I’m glad you’ve found my thoughts useful. No, I haven’t used the confidence intervals outside of the boxplot mark before, but if you mock-up an example of what you’re looking for, I’ll take a stab at it, likely on Monday or Tuesday.
Greg

Hi Greg, I have attached a .pbix sample.
The blue marks are my actual data and the red regression curve is working as expected. I’ve also added the green marks which represent the same regression curve but plotted as points only. I’m trying to achieve two additional curves which represent the upper and lower confidence values of the green (and therefore the red curve) regression points. I hope this makes sense.
Regards,

George
regression_example.pbix (1.5 MB)