Deneb Example - Ridge Plot

Deneb/Vega-Lite can be used to generate a Ridge Plot (also known as a Joy Plot), which compares overlapping distributions of data across categories. The distributions are typically smoothed using kernel density estimation (KDE). This example displays a layered KDE plot of climate statistics for Ottawa, Canada, for the period 1950 to 2020, categorized by month. The Power BI dataset is converted into a probability distribution by the density transform built into Vega-Lite.
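
To make the KDE step concrete, here is a minimal Python sketch of a Gaussian kernel density estimate evaluated on a fixed grid, which is the kind of calculation the density transform performs. The sample temperatures, the bandwidth of 3.0 and the 0.5 °C grid step are assumptions made up for illustration; they are not taken from the Ottawa dataset or from Vega-Lite's defaults.

```python
import math

def gaussian_kde(samples, bandwidth, grid):
    """Evaluate a Gaussian kernel density estimate at each grid point."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]

# Hypothetical daily temperatures for one month (illustration only).
temps = [-12.0, -8.5, -7.0, -3.0, 1.5]

# Grid spanning the same -50..50 extent used in the spec, in 0.5 degree steps.
step = 0.5
grid = [-50 + i * step for i in range(201)]

density = gaussian_kde(temps, bandwidth=3.0, grid=grid)

# A density should integrate to roughly 1 over the extent.
area = sum(density) * step
```

Each (grid value, density) pair plays the role of the “_kde_value” and “_kde_density” fields that the density transform produces for a month.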

The first approach attempted was to use the “facet” operator in Vega-Lite. While this was easy to implement and needed only a single layer with 3 marks, I could not find the syntax to make the facets overlap, and the result had a lot of whitespace. The second approach was more brute-force, using 36 (12x3) “layer” operators. While certainly more verbose, it was also easy to implement with mostly cut-and-paste operations and, in my opinion, resulted in a more attractive and useful layout. The second approach is described below.

This example illustrates a number of Deneb/Vega-Lite features, including:
0 - General:

  • a “title” block with a 2-line subtitle array
  • a shared “transform” block with:
    • a “calculate” transform using the statistic selection parameter to determine the temperature statistic to use in the KDE density transform
    • a “calculate” transform to determine the zero-based month number (0 = January) from the date
  • a shared “params” block with:
    • a radio button screen widget to allow the user to select the desired temperature statistic
    • a parameter for the Y increment to use for each month
  • a shared “encoding” block to ensure all marks use the same X and Y axes
    • the X-axis configured with:
      • conditional label font size (increments of 10 larger)
      • conditional tick mark size and colour (increments of 10 larger)
      • conditional grid line width and colour (increments of 10 darker; zero wider)
      • a set scale to prevent X-axis resizing with data changes
    • the Y-axis configured with:
      • a set scale to prevent Y-axis resizing with data changes
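
The two shared “calculate” transforms listed above can be sketched in Python as follows. The field names match the spec, but the function name `shared_transform` and the sample row are hypothetical. One detail worth noting: the Vega expression function month() is zero-based, so January is 0, which is why the sketch subtracts 1 from Python's 1-based month.

```python
from datetime import date

# Maps the radio button choice to the source column; anything else falls
# back to the mean, mirroring the nested ternary in the spec.
FIELD_BY_STATISTIC = {
    "Maximum Temperature": "MAX_TEMPERATURE",
    "Minimum Temperature": "MIN_TEMPERATURE",
}

def shared_transform(row, statistic):
    """Add the _temperature and _month fields to one record."""
    field = FIELD_BY_STATISTIC.get(statistic, "MEAN_TEMPERATURE")
    return {
        **row,
        "_temperature": row[field],
        "_month": row["Date"].month - 1,  # zero-based, like Vega's month()
    }

# Hypothetical record (values invented for illustration).
row = {
    "Date": date(2020, 1, 15),
    "MAX_TEMPERATURE": -5.0,
    "MIN_TEMPERATURE": -15.0,
    "MEAN_TEMPERATURE": -10.0,
}

out = shared_transform(row, "Maximum Temperature")
```

Here `out["_month"]` is 0, which is the value the January layer filters on.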

1 - Month Outer Layer:

  • a “layer” block for each month (12 objects, one per month), each with:
    • a “filter” transform to restrict the dataset to only records for the current month
    • a “density” transform to calculate the KDE distribution for the selected statistic and construct a new dataset
    • 3x “calculate” transforms to determine the reverse month number, Y offset, and offset KDE density
    • a nested “layer” block for the marks for each month (see below)

2 - Month Inner Layer:

  • a “layer” block for the marks for each month (3 objects per month), each with:
    • an “area” mark using a “Y” value of the offset KDE density and “Y2” value of the Y offset
    • a “rule” mark using full-width “X” values and a “Y” value of the Y offset
    • a “text” mark complete with:
      • a “filter” transform to restrict the dataset to a single record
      • a “calculate” transform to determine the month label
      • a “Y” value of the Y offset
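
The offset arithmetic described above can be checked with a short Python sketch. The constants mirror the spec (a “_month_y_increment” of 0.03 and a Y domain of 0 to 0.45); the function and variable names are hypothetical.

```python
MONTH_Y_INCREMENT = 0.03  # value of the _month_y_increment parameter
Y_DOMAIN_MAX = 0.45       # upper bound of the fixed Y scale

def month_y_offset(month, increment=MONTH_Y_INCREMENT):
    """Baseline height for a zero-based month (0 = January)."""
    reverse_month = 11 - month  # January is drawn highest, December lowest
    return reverse_month * increment

# Baseline for every month, rounded to hide floating-point noise.
offsets = {m: round(month_y_offset(m), 2) for m in range(12)}

# Room left above the January baseline for its density curve.
headroom = round(Y_DOMAIN_MAX - offsets[0], 2)
```

January sits at 0.33 and December at 0, leaving 0.12 of the Y domain above the top baseline. A density value is added to its month's baseline, so any peak taller than the 0.03 increment overlaps the month above, which gives the ridge effect.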

Deneb/Vega-Lite Specification JSON Code:
{
  "title": {
    "anchor": "start",
    "align": "left",
    "offset": 20,
    "text": "Power BI Ridge Plot using Deneb (Layer)",
    "font": "Verdana",
    "fontSize": 24,
    "fontWeight": "bold",
    "fontStyle": "normal",
    "subtitle": [
      "Daily Temperature Distribution, 1950 to 2020, International Airport, Ottawa, Canada",
      "Data Source: Environment and Climate Change Canada (https://climate-change.canada.ca/climate-data/#/daily-climate-data)"
    ],
    "subtitleFont": "Verdana",
    "subtitleFontSize": 16,
    "subtitleFontWeight": "normal",
    "subtitleFontStyle": "italic"
  },
  "data": {
    "name": "dataset"
  },
  "width": 1140,
  "height": 490,
  "transform": [
    {
      "calculate": "_temperature_statistic == 'Maximum Temperature' ? datum['MAX_TEMPERATURE'] : _temperature_statistic == 'Minimum Temperature' ? datum['MIN_TEMPERATURE'] : datum['MEAN_TEMPERATURE']",
      "as": "_temperature"
    },
    {
      "calculate": "month( datum['Date'] )",
      "as": "_month"
    }
  ],
  "params": [
    {
      "name": "_temperature_statistic",
      "value": "Maximum Temperature",
      "bind": {
        "input": "radio",
        "options": [
          "Maximum Temperature",
          "Minimum Temperature",
          "Mean Temperature"
        ],
        "name": "Statistic: "
      }
    },
    {
      "name": "_month_y_increment",
      "value": 0.03
    }
  ],
  "encoding": {
    "x": {
      "type": "quantitative",
      "title": "Temperature (°C)",
      "axis": {
        "domain": false,
        "labelFontSize": {
          "expr": "datum.value % 10 == 0 ? 14 : 10"
        },
        "titleFontSize": 16,
        "gridColor": {
          "expr": "datum.value == 0 ? 'black' : datum.value % 10 == 0 ? '#969696' : '#E3E3E3'"
        },
        "gridWidth": {
          "expr": "datum.value == 0 ? 2 : 1"
        },
        "tickSize": {
          "expr": "datum.value % 10 == 0 ? 14 : 8"
        },
        "tickColor": {
          "expr": "datum.value == 0 ? 'black' : datum.value % 10 == 0 ? '#969696' : '#E3E3E3'"
        }
      },
      "scale": {
        "domain": [
          -50,
          50
        ]
      }
    },
    "y": {
      "type": "quantitative",
      "axis": null,
      "scale": {
        "domain": [
          0,
          0.45
        ]
      }
    }
  },
  "layer": [
    {
      "name": "JANUARY",
      "transform": [
        {
          "filter": "datum['_month'] == 0"
        },
        {
          "density": "_temperature",
          "groupby": [
            "_month"
          ],
          "extent": [
            -50,
            50
          ],
          "as": [
            "_kde_value",
            "_kde_density"
          ]
        },
        {
          "calculate": "11 - datum['_month']",
          "as": "_reverse_month"
        },
        {
          "calculate": "datum['_reverse_month'] * _month_y_increment",
          "as": "_month_y_offset"
        },
        {
          "calculate": "datum['_kde_density'] + datum['_month_y_offset']",
          "as": "_new_kde_density"
        }
      ],
      "layer": [
        {
          "name": "AREA_00",
          "mark": {
            "type": "area",
            "opacity": 0.7
          },
          "encoding": {
            "x": {
              "field": "_kde_value"
            },
            "y": {
              "field": "_new_kde_density"
            },
            "y2": {
              "field": "_month_y_offset"
            }
          }
        },
        {
          "name": "DOMAIN_00",
          "mark": {
            "type": "rule"
          },
          "encoding": {
            "x": {
              "datum": -50
            },
            "x2": {
              "datum": 50
            },
            "y": {
              "field": "_month_y_offset",
              "type": "quantitative"
            }
          }
        },
        {
          "name": "LABEL_00",
          "transform": [
            {
              "filter": "datum['_kde_value'] == -50"
            },
            {
              "calculate": "monthFormat( datum['_month'] )",
              "as": "_month_name"
            }
          ],
          "mark": {
            "type": "text",
            "yOffset": -10
          },
          "encoding": {
            "text": {
              "field": "_month_name",
              "type": "nominal"
            },
            "x": {
              "datum": -49.5
            },
            "y": {
              "field": "_month_y_offset",
              "type": "quantitative"
            }
          }
        }
      ]
    },
    {
      "name": "FEBRUARY",
      "transform": [
        {
          "filter": "datum['_month'] == 1"
        },
        {
          "density": "_temperature",
          "groupby": [
            "_month"
          ],
          "extent": [
            -50,
            50
          ],
          "as": [
            "_kde_value",
            "_kde_density"
          ]
        },
        {
          "calculate": "11 - datum['_month']",
          "as": "_reverse_month"
        },
        {
          "calculate": "datum['_reverse_month'] * _month_y_increment",
          "as": "_month_y_offset"
        },
        {
          "calculate": "datum['_kde_density'] + datum['_month_y_offset']",
          "as": "_new_kde_density"
        }
      ],
      "layer": [
        {
          "name": "AREA_01",
          "mark": {
            "type": "area",
            "opacity": 0.7
          },
          "encoding": {
            "x": {
              "field": "_kde_value"
            },
            "y": {
              "field": "_new_kde_density"
            },
            "y2": {
              "field": "_month_y_offset"
            }
          }
        },
        {
          "name": "DOMAIN_01",
          "mark": {
            "type": "rule"
          },
          "encoding": {
            "x": {
              "datum": -50
            },
            "x2": {
              "datum": 50
            },
            "y": {
              "field": "_month_y_offset",
              "type": "quantitative"
            }
          }
        },
        {
          "name": "LABEL_01",
          "transform": [
            {
              "filter": "datum['_kde_value'] == -50"
            },
            {
              "calculate": "monthFormat( datum['_month'] )",
              "as": "_month_name"
            }
          ],
          "mark": {
            "type": "text",
            "yOffset": -10
          },
          "encoding": {
            "text": {
              "field": "_month_name",
              "type": "nominal"
            },
            "x": {
              "datum": -49.5
            },
            "y": {
              "field": "_month_y_offset",
              "type": "quantitative"
            }
          }
        }
      ]
    },
    {
      "name": "MARCH",
      "transform": [
        {
          "filter": "datum['_month'] == 2"
        },
        {
          "density": "_temperature",
          "groupby": [
            "_month"
          ],
          "extent": [
            -50,
            50
          ],
          "as": [
            "_kde_value",
            "_kde_density"
          ]
        },
        {
          "calculate": "11 - datum['_month']",
          "as": "_reverse_month"
        },
        {
          "calculate": "datum['_reverse_month'] * _month_y_increment",
          "as": "_month_y_offset"
        },
        {
          "calculate": "datum['_kde_density'] + datum['_month_y_offset']",
          "as": "_new_kde_density"
        }
      ],
      "layer": [
        {
          "name": "AREA_02",
          "mark": {
            "type": "area",
            "opacity": 0.7
          },
          "encoding": {
            "x": {
              "field": "_kde_value"
            },
            "y": {
              "field": "_new_kde_density"
            },
            "y2": {
              "field": "_month_y_offset"
            }
          }
        },
        {
          "name": "DOMAIN_02",
          "mark": {
            "type": "rule"
          },
          "encoding": {
            "x": {
              "datum": -50
            },
            "x2": {
              "datum": 50
            },
            "y": {
              "field": "_month_y_offset",
              "type": "quantitative"
            }
          }
        },
        {
          "name": "LABEL_02",
          "transform": [
            {
              "filter": "datum['_kde_value'] == -50"
            },
            {
              "calculate": "monthFormat( datum['_month'] )",
              "as": "_month_name"
            }
          ],
          "mark": {
            "type": "text",
            "yOffset": -10
          },
          "encoding": {
            "text": {
              "field": "_month_name",
              "type": "nominal"
            },
            "x": {
              "datum": -49.5
            },
            "y": {
              "field": "_month_y_offset",
              "type": "quantitative"
            }
          }
        }
      ]
    },
// *** NOTE: April to November deleted for space reasons
// as each month is a simple copy-and-paste of the previous, with the naming adjusted and the filter incremented by 1
// see PBIX for full code ***
    {
      "name": "DECEMBER",
      "transform": [
        {
          "filter": "datum['_month'] == 11"
        },
        {
          "density": "_temperature",
          "groupby": [
            "_month"
          ],
          "extent": [
            -50,
            50
          ],
          "as": [
            "_kde_value",
            "_kde_density"
          ]
        },
        {
          "calculate": "11 - datum['_month']",
          "as": "_reverse_month"
        },
        {
          "calculate": "datum['_reverse_month'] * _month_y_increment",
          "as": "_month_y_offset"
        },
        {
          "calculate": "datum['_kde_density'] + datum['_month_y_offset']",
          "as": "_new_kde_density"
        }
      ],
      "layer": [
        {
          "name": "AREA_11",
          "mark": {
            "type": "area",
            "opacity": 0.7
          },
          "encoding": {
            "x": {
              "field": "_kde_value"
            },
            "y": {
              "field": "_new_kde_density"
            },
            "y2": {
              "field": "_month_y_offset"
            }
          }
        },
        {
          "name": "DOMAIN_11",
          "mark": {
            "type": "rule"
          },
          "encoding": {
            "x": {
              "datum": -50
            },
            "x2": {
              "datum": 50
            },
            "y": {
              "field": "_month_y_offset",
              "type": "quantitative"
            }
          }
        },
        {
          "name": "LABEL_11",
          "transform": [
            {
              "filter": "datum['_kde_value'] == -50"
            },
            {
              "calculate": "monthFormat( datum['_month'] )",
              "as": "_month_name"
            }
          ],
          "mark": {
            "type": "text",
            "yOffset": -10
          },
          "encoding": {
            "text": {
              "field": "_month_name",
              "type": "nominal"
            },
            "x": {
              "datum": -49.5
            },
            "y": {
              "field": "_month_y_offset",
              "type": "quantitative"
            }
          }
        }
      ]
    }
  ]
}
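
Since each month block is a copy of the previous one with only the names and the filter value changed, the 12 blocks could also be generated rather than pasted. The following Python sketch is one possible way to do that and is not part of the original solution; the helper name `month_layer` is hypothetical, and the dictionary contents are copied from the month blocks in the spec.

```python
import json

MONTH_NAMES = [
    "JANUARY", "FEBRUARY", "MARCH", "APRIL", "MAY", "JUNE",
    "JULY", "AUGUST", "SEPTEMBER", "OCTOBER", "NOVEMBER", "DECEMBER",
]

def month_layer(month):
    """Build the outer layer for one zero-based month (0 = January)."""
    suffix = f"{month:02d}"

    def y_offset():
        # Fresh dict per mark so no objects are shared between layers.
        return {"field": "_month_y_offset", "type": "quantitative"}

    return {
        "name": MONTH_NAMES[month],
        "transform": [
            {"filter": f"datum['_month'] == {month}"},
            {
                "density": "_temperature",
                "groupby": ["_month"],
                "extent": [-50, 50],
                "as": ["_kde_value", "_kde_density"],
            },
            {"calculate": "11 - datum['_month']", "as": "_reverse_month"},
            {
                "calculate": "datum['_reverse_month'] * _month_y_increment",
                "as": "_month_y_offset",
            },
            {
                "calculate": "datum['_kde_density'] + datum['_month_y_offset']",
                "as": "_new_kde_density",
            },
        ],
        "layer": [
            {
                "name": f"AREA_{suffix}",
                "mark": {"type": "area", "opacity": 0.7},
                "encoding": {
                    "x": {"field": "_kde_value"},
                    "y": {"field": "_new_kde_density"},
                    "y2": {"field": "_month_y_offset"},
                },
            },
            {
                "name": f"DOMAIN_{suffix}",
                "mark": {"type": "rule"},
                "encoding": {
                    "x": {"datum": -50},
                    "x2": {"datum": 50},
                    "y": y_offset(),
                },
            },
            {
                "name": f"LABEL_{suffix}",
                "transform": [
                    {"filter": "datum['_kde_value'] == -50"},
                    {
                        "calculate": "monthFormat( datum['_month'] )",
                        "as": "_month_name",
                    },
                ],
                "mark": {"type": "text", "yOffset": -10},
                "encoding": {
                    "text": {"field": "_month_name", "type": "nominal"},
                    "x": {"datum": -49.5},
                    "y": y_offset(),
                },
            },
        ],
    }

# All 12 month blocks, ready to paste as the top-level "layer" array.
layers = [month_layer(m) for m in range(12)]
layers_json = json.dumps(layers, indent=2)
```

Pasting the contents of `layers_json` in place of the top-level “layer” array should reproduce the 12 month blocks. Treat it as a convenience sketch for maintaining the spec outside of Deneb.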

Deneb/Vega-Lite Config JSON Code:
{
  "view": {
    "stroke": null
  },
  "area": {
    "color": "#0F4C81",
    "line": true
  },
  "rule": {
    "color": "#C9C9C9"
  },
  "text": {
    "align": "left",
    "color": "black",
    "fontSize": 16,
    "fontWeight": "normal"
  }
}

Also included is the development sample PBIX, which uses a data source of climate data for Ottawa, Ontario, Canada, for the period 1950 to 2020, obtained from Environment and Climate Change Canada.

The intent of these examples was not to provide finished visuals, but rather to explore the use of the Deneb custom visual and the Vega-Lite language within Power BI and to serve as a starting point for further development.

This example is provided as-is for information purposes only, and its use is solely at the discretion of the end user; no responsibility is assumed by the author.

Greg
Deneb Example - Ridge Plot - V3.pbix (3.2 MB)

marking as solved

So impressive. You’re really moving the needle in Power BI viz. Great to see

Thanks so much Sam … I really appreciate it and feel I’ve really accomplished something when you make such a comment. Thank you!