Power BI Challenge 15 - Formula 1 Analysis

Hi @JohnG. I can’t say for this dataset for sure (@haroonali1000 can comment), but for a normal F1 race there are 2 cars per row on the grid, and the left/right position is set depending on the circuit, so differs from race-to-race. Not what you wanted to hear, I’m sure, but it is what it is.
Greg

Thanks, Greg.

I had hoped to see if there was a correlation between Grid Position and finish position.

However, given that the drivers are only two wide, right and left hand turns, and the length of the race, it’s not an angle worth pursuing, unless I have additional insights.

Regards,

John Giles

1 Like

@JohnG
I like the way you are thinking !

1 Like

All,

If you’re on the fence about whether to participate in the Data Challenges or not, I would urge you to watch today’s YouTube video, where I had a great conversation with Challenge newcomers @DavieJoe and @FPastor. David and Federico spend some time walking through each of their entries, and then we talk about their experience of participating as a first-time entrant.

Really fun discussion, and thanks to both for taking the time to discuss their experience.

– Brian

2 Likes

15 minutes of fame….well……22 really :joy:

1 Like

Hi all. Here’s my submission for Data Challenge 15, Formula 1 Analysis.

eDNA Challenge 15 - Formula 1 - Greg Philps

https://app.powerbi.com/view?r=eyJrIjoiMTZiZjE3YTItYTcwNS00MWQ0LWEwODEtOTUxY2YxNzkxZDJlIiwidCI6IjBjM2ZiODk1LTllZjYtNDkzNy1hYjdkLTZmMjNlYWUxOGY3OSJ9

Here are the notes I kept during development:

Goals
  • to strictly adhere to the one-page criteria for this challenge and not use any sort of navigation at all; the visuals on the main page are thus very simple and are intended to server as windows into the tooltips
  • to attempt to put all useful information for each data subset into tooltips
  • to gain experience using the small multiples display options for line charts and bar charts
  • to gain experience displaying data-driven images in visuals
What I liked
  • I’m happy with the results of using small multiples and plan to incorporate them more frequently, when report space allows
  • I’m really a fan of using variable transparency
  • I’m set in my ways now with using the [General] section of the [Format] pane to set the X/Y/Width/Height to multiples of 10 pixels; makes alignment, spacing so much easier
  • I’m set in my ways now to use PowerPoint to add a “mask” to background images to dull the colours
What I didn't like/needs more work
  • I’m not happy with how simple the visuals are on the main page; having images be simply a means to get to a tooltip, while it served my learning purposes for this challenge, is not a good solution and could be vastly improved, I’m sure
  • I’m not happy with the brightness of the background image and thus how it impacts the overlaid visuals (with transparency), but I like the image so much I wanted to try to keep it and was loathe to cover it up; not a good solution, but I ran out of the time I allotted myself for this challenge before I could come up with a better solution
Custom Visuals
  • used free “Text Filter” by Microsoft to enable searching in text columns
  • used free “Simple Image” by Vincent Faight to enable the display of an image from a URL
Data Loading and Transformations
  • ensured options for “Import relationships from data sources on first load” and “Auto-detect new relationships after data is loaded” were unchecked (disabled)
    • (from File \ Options and settings \ Options \ Current File\ Data Load \ Relationships)
  • ensured option for “Change default visual interaction from cross highlighting to cross filtering” was checked (enabled)
    • (from File \ Options and settings \ Options \ Current File\ Report Settings \ Visual Options)
  • loaded all source CSV files as “RAW …” files to [Staging Queries] query group and disabled load
  • added query parameter for “parameterSourceFolder” and adjusted all staging queries to use the source folder parameter, including check for ending ""
    • *** NOTE: while setup with a local folder only, the use of parameters for source path and file would allow the use of online files (OneDrive for Business) and, if the dataset is deployed to DEV, TEST, and PROD workspaces, the dataset in each workspace could be easily configured in the Power BI Service to use the appropriate folder and file ***
  • [RAW constructor_standings]: changed data type for [positionText] column from decimal to text
  • [RAW pit_stops]: change data type for [duration] column from decimal to text
  • [RAW results]: replaced errors in [number] column with NULL
  • added the Enterprise DNA Extended Dates Table; marked as a date table
  • referenced “RAW …” staging queries as necessary for tables for the [Data Model] query group
  • adjusted column naming for all " …id" columns in data model tables to use “… Key” for linking columns

[Results]:

  • changed data type of [Position] column to whole number
  • replaced errors with NULL

[RAW Drivers]:

  • edited [forname] and [surname] columns to replace accented characters in driver names (a bigger job than I thought, so only partially complete)

[Nationality]:

[Drivers]:

  • merged with [Nationality] supporting query to get [Country] and [Country Code] columns

[Constructors]:

  • merged with [Nationality] supporting query to get [Country] and [Country Code] columns

[Circuits]:

  • I wanted to augment the available data on circuits, to make additional information like track length, race length (laps), lap records, etc. available, and did some quick research and found some circuit info at https://f1.fandom.com/wiki/Circuits and https://f1.fandom.com/wiki/lap_record;
  • additional data for select circuits for laps and track length sourced from the Wikipedia links provided in the Kaggle dataset and entered manually
  • I created a custom circuit Excel file to hold the new circuit data and manually added the same “circuitId” as the Kaggle export circuits.csv file to enable a proper merge; edited the dataset and replaced accented characters in the circuit names
  • I wanted to add circuit maps as well, and downloaded a small number (about 15) images from the wikipedia URLs provided in the Kaggle dataset, then uploaded these 15 images to imgbb.com to generate URLs, then updated my custom circuit CSV file with the URLs
Data Modelling
  • used standard waterfall layout with lookup table on top, multiple fact tables in middle (i.e., [Races], [Qualifying], [Results]), multiple measure tables in a column at the top-right, and multiple supporting tables in a row at the bottom-left
  • used 2 additional modelling pages, 1 each for [Drivers] and [Constructors] to more easily view the relationships for that data subset
DAX Calculations
  • used [Key Measures] group for general simple measures (e.g., [Total Races], [Total Wins], [Total Qualifyings], etc.)
  • used [Driver Measures] group for driver-specific measures (e.g., [Poles], [Podiums], [DNFs], etc.)
  • used [Constructor Measures] group for constructor-specific measures (e.g., [Constructors’ Championships], [Rank], etc.)
  • used [Circuit Measures] group for circuit-specific measures (e.g., [Events], [Turns], [Lap Record], etc.)

[Drivers]:


Q3 Appearances = 

CALCULATE( [Total Qualifyings],

    FILTER( Qualifying,

        NOT ISBLANK( Qualifying[Q3] ) )

) + 0

Circuits:

  • reproduced many values from the [Circuits] table as measures to ease/facilitate use in text boxes and cards, including

Lap Record Time = 
VAR _milliseconds = MAX( Circuits[Lap Record Milliseconds] ) 
VAR _minutes = INT( DIVIDE( _milliseconds, 60000, 0 ) ) 
VAR _seconds = INT( DIVIDE( _milliseconds - ( _minutes * 60000 ), 1000, 0 ) ) 
VAR _ms = _milliseconds - ( _minutes * 60000 ) - ( _seconds * 1000 ) 
VAR _Result = IF(
    ISBLANK( _milliseconds ), "n/a", 
    _minutes & ":" & FORMAT( _seconds, "00" ) & "." & FORMAT( _ms, "000" )
) 

RETURN
_Result

Ran out of time to properly develop a solution for calculating drivers’ and constructors’ points, points rank, championships, and championships rank so chose a quick-and-dirty solution consisting of 4 calculated tables.

Visualizations

Theme:

  • browsed internet searching for dark F1 car images; chose a night image of Lewis Hamilton’s black 2021 Mercedes
  • used PowerPoint to place a mask (rectangle) with gradient fill (dark top, transparent bottom) over the image to dull the intensity of bright spots on the top of the image

General:

  • used 50 px high rectangles for title backgrounds with 25 px radius borders to create appearance of fully rounded left-and-right edges
  • used 5 px radius borders on visuals to create appearance of rounded corners
  • used 2 px 0-degree custom shadows on visuals
  • used multiples of 10 px for sizing for X/Y/Width/Height for all visuals
  • used 10 px around and 10 px spacings between all visuals

Header:

  • browsed internet for a black, transparent Formula 1 logo and displayed at left of title bar
  • used standard text box for title text
  • used “Reset” button to activate “Reset Filters” bookmark
  • used the visual tooltip on an “Information” button to display report info (e.g., description, author, date, report ID, version, custom visuals, etc.)

Drivers:

  • used standard table visual to display drivers and race wins (with data bars); most of the specific driver data will be showcased in the custom tooltip
  • used “Text Filter” custom visual to allow for easy searching for a specific driver

[Driver Tooltip]:

  • used textbox with measures only to display driver name and country
  • used line chart with small multiples to show finishing position and starting position for the drivers’ last 4 career years
  • used donut chart to show wins, 2nd places, 3rd places, other point finishes (i.e., places 4-10), and non-point finishes (i.e., places 11+)
  • used multi-row card to show measures for drivers’ statistics (e.g., wins, rank, drivers championships, rank, etc.)

Constructors:

  • used 100% stacked column chart to display constructor wins and non-wins; most of the specific constructor data will be showcased in the custom tooltip
  • used “Text Filter” custom visual to allow for easy searching for a specific constructor

[Constructor Tooltip]:

  • used textbox with measures only to display constructor name and country
  • used bar chart with small multiples to show wins and non-wins (other) for 2014-2021
  • used multi-row card to show measures for constructors’ statistics (e.g., wins, rank, constructors championships, rank, etc.)

Circuits:

  • used a simple bar chart with gradient coloring to show circuits by number of events held; most of the specific circuit data will be showcased in the custom tooltip
  • used “Text Filter” custom visual to allow for easy searching for a specific circuit

[Circuit Tooltip]:

  • used textbox with measures only to display circuit name and location
  • used “Simple Image” custom visual to display maps for select circuits; for un-selected circuits, showed a “not available” image
  • used multi-row card to show measures for circuit statistics (e.g., turns, lap length, lap record, etc.)
17 Likes

Hello Haroon:

I have a question regarding the Status ID’s.

There is a distinct status for drivers who “finish” the race, Status = “1”.

There many other statuses that indicate “+1 Lap”, “+2 Laps”, and so on. Do these statuses indicate that the driver didn’t finish the race? Or that the driver finished the race, but was however many laps behind the winner?

Regards,

John Giles

1 Like

Hi John. The latter: the race is over when the winner crosses the finish line, and those that have been lapped once “+1”, those that have been lapped twice “+2”, etc.
Greg

1 Like

Thanks, Greg.

Just to confirm, for statistical purposes, those who have been “lapped” are considered to have finished the race?

1 Like

Yes … finished, but lapped.
Greg

1 Like

Thanks, again, Greg.

1 Like

Hi Greg

Excellent submission yet again!

I was going through your report and found that the Driver Championships (titles) numbers for Alain Prost and Ayrton Senna are not correct. Maybe you have used the points column in the ‘Results’ table which has not been corrected for different rules that existed for some years like points collected over best of 6 races to determine the title winner. I would suggest to use the points column in the ‘Driver Standings’ table instead which caters for these changes.

Great report overall.

Alvi

1 Like

Hi @Alvi. Thanks for the kind words. Mea culpa, I made the cardinal sin of the report developer: I trusted the data source. As I knew the subject well, I only checked a quick random sample of the data points within the last 10 years or so. I knew that F1 had changed its point scheme several times in its history (https://en.wikipedia.org/wiki/List_of_Formula_One_World_Championship_points_scoring_systems) but did not pursue it. If I had been responsible for the data source I never would have included multiple fields containing the same data for exactly that reason.
Greg

4 Likes

Here’s the entry from one of our non-member participants, Fred.

3 Likes

Here is my entry for Challenge 15. Full writeup to follow.

Thanks
Jarrett

14 Likes

Top stuff @JarrettM love it!

1 Like

Here is my submission for Challenge #15 - Formula 1

14 Likes

Awesome @alexbadiu!!

You look like a true Lewis Hamilton fan. Even he would be more than happy to see this one page masterpiece. Simply amazing!!

Alvi

3 Likes

jaw hits floor

Wow wow wow! Spectacular!

2 Likes

Jeez! When @FPastor said in his interview a couple of weeks ago that the Enterprise DNA Challenges were the chance to “play with the big boys”, looking at @Greg 's, @JarrettM 's and @Alex’s entries, it’s very clear he was not. kidding. around. Wow!

5 Likes