Power BI Challenge 11 – Covid 19 Reporting

@emercado777,

I’m not at the computer now so I can’t check that specific problem, but we’ve identified a lot of wonkiness in the Recovered data. While we’re reporting it minimally on the report for completeness, we are focusing our primary analyses on Confirmed and Deaths.

  • Brian
1 Like

@BrianJ
Thanks for looking into things Brian but the data does appear odd.
Assuming the data is cumulative, I get a maximum unfiltered total for recorded deaths of just over 376K which is incorrect based on the Joh Hopkins Dashboard. I have now downloaded the CSV’s and will take a look again later.
Regards,
Craig.

Problem Now Solved.

2 Likes

Thanks for the quick response. Yeah, the data seems to be missing some values there. Anyhow, no worries. I might need to make some assumptions when doing this.

Here is my Submission.

Attached PBIX file:Challenge 11 w ISO.pbix (1.4 MB)

Link to PowerBI service:

4 Likes

Hello to Everybody,

Please find here my 1 page solution to this Challenge. I am first participant.

Link to webreport

Let me give you a short description of my work:

In the summary of COVID situation I had 4 main objectives :
1: have a quick global vision of main indicators with international comparison
2: be able to dig into any country in detail with all main measures to set up a country diagnosis
3: show trends, actionable indicators that decision makers need
4: keep historical data for further analysis

You can track daily new cases and deaths, but I also calculated 14 day average measures (generally used in covid statistics as well) to visualise cleaner trends without having the daily fluctuation in data.

By using the time slicer, you can visualise cumulated data for any period, for instance to compare the First Wave results to the Second Wave results of the pandemic.

On the map visualisation it is easy to have a global view how the different countries are impacted at the moment (size of the bubble = average daily new cases taking into account last 14 days). Trend is shown vs 7 days before to show the evolution of this same 14 days average indicator. By changing the time slicer you can see which were the countries with worsening trend for instance a month ago, and how this changed for now.

Continent category is mainly used to have better visualisation of data by geographical zoom on the map but also by ignoring “extreme data” : ex: without selecting Europe, it is difficult to see relative size of the bubbles for European countries due to US extremely high value.

I voluntarily did not focus on the active cases information, as it seems the least reliable and the least accurately managed by authorities.

Regards,

7 Likes

Challenge Participants,

If the cumulative reporting data structure in the Johns Hopkins data is giving you as big a headache as it gave me, be sure to check out @markperrone’s excellent post here for a relatively simple solution:

– Brian

4 Likes

Hello All,
This is my submission for Challenge 11.

Data:
I used the data provided by EDNA. I continue the development using given PBIX.

Approach:
I wanted to show most important details in also as simple as possible. Therefore I used colours to differentiate the 3 categories . So user can understand the report even without reading the title. Also I used dark theme because I feel Covid 19 brings negative effect to World. I used most important metrics to show in this report. Also Added date filter to filter the result by date.

3 Likes

Here is my submission. I still have some things I want to add but it will have to be added in v2 later as I only had a few hours to work on this one.

I added two parameters. The first one is to estimate the number of people mourning for their loved ones. I think this is an important impact of COVID-19 deaths. The second one is a cases multiplier. I added this one after articles saying the number of people reported as having COVID can be much lower than actual, due to mild sickness/asymptomatic or when many people get it in the same residence. After the first person is positive they no longer need to test to realize they have it, so it won’t get reported in official numbers. This is important for the deaths % of confirmed, and also confirmed per 100k population views (I haven’t added population yet!).

Edit: fixed the counts formula, will take a few mins for the report link to update.

5 Likes

Thank you Haroon for setting this challenge. This is the first of the challenges that I have taken part in and I have found it really useful to get a bit more familiar with the presentation tools available in Power BI.

I am still getting to grips with some of the whizzier graphics that I have seen on other submissions - but this has been really helpful for me to explore the use of custom tool-tips to enable more detailed numerical information to be made available to a user while staying within the brief that it should be a 1 page report. I have taken a screen shot of the report - but hopefully the link to the app will work too.

https://app.powerbi.com/reportEmbed?reportId=ff99e155-0a6f-40f7-975d-62d1f1034b15&autoAuth=true&ctid=05ce1cbd-1d9d-44ba-8c1d-2ff97ee4b6ff&config=eyJjbHVzdGVyVXJsIjoiaHR0cHM6Ly93YWJpLW5vcnRoLWV1cm9wZS1yZWRpcmVjdC5hbmFseXNpcy53aW5kb3dzLm5ldC8ifQ%3D%3D

Have a great week.

Chris

7 Likes

Everyone,

Attached is my submission. For this one, I opted to keep it a bit simple as the data set was limited, but more importantly due to limited time. However, my approach on these challenges is more about learning new approaches, visuals, etc.

Here are two screen shots. I have a toggle button (the white button above location slicer) to flip between charts showing the information by “Confirmed Cases” versus “Deaths”. I ignored the “Recovered” data.

Here was my approach:

Data
Given that the count in the original data set is just the cumulative amounts for a given day, I created 4 additional calculated columns in the Covid Data table. I opted to go with calculated columns instead of measures as I just found it quicker this time and the data set was relatively small. The columns were:

“Previous Day Count” – this was used to get the value from the previous day which would then allow me to calculate what the daily count was. I added variables to take both the category (Confirmed, Death, Recovered) as well as location.

“Daily Count” – this would just be the amount in the original Value column (provided with the dataset) minus the previous day count above.

I then added two more columns to find out what the daily change was from the previous day as well as the percentage change. I was hoping to use this information to track the % change over time – but ended up running out of time. I’m hoping someone else did something around this.

Key Measures
There is nothing fancy here. Mainly the daily counts for each category and then 7 day moving averages. I also created Top 10 countries for both confirmed cases and deaths.

Visuals
Very standard visuals here. I did use the ArcGIS map visual but found that with this as well as the standard Microsoft map that the bubbles representing size aren’t very good at representing the actual amount. A small bubble could be 20 cases or 20,000 cases and they look to be the same size. So overall, I didn’t find this very effective. I am going to review others who used maps to see what you all did.

A new visual that I have never used before is the one for the top 10 countries for confirmed cases/deaths. I used the Infographic Designer visual (available from the custom visual location) – and then used people as the icon. I thought this would be the most relevant to display the number of cases.

Credit for the icon

Icons made by Freepik from www.flaticon.com
7 Likes

Hi,

I noticed that when I refresh the John Hopkins data set, PowerQuery does not fetch the newest dates. I think this is because the column dimension is hard coded into the applied steps of PowerQuery.

Is there a simple way to add a dynamic column to the applied steps, so that when refreshing the data, all the newest data will download from John Hopkins site?

https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv

1 Like

Hi Emercado777,

You can simply delete the Columns=… part by clicking on the Source Step in the Applied steps. I had the same issue before. Here is my example that left after deletion:

= Csv.Document(Web.Contents(“https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv"),[Delimiter=",”, Encoding=65001, QuoteStyle=QuoteStyle.None])

Regards,

3 Likes

@hajni157 is correct, you can replace that hard coded value by null

For example.


.

see the documentation here

2 Likes

Excellent, that worked! I added “null” to the column variable. I also added an additional step at the end of the transformation to change the date type of “Value” to Whole Number.

Thanks @Melissa and @hajni157

1 Like

Hello all,

Please find below my submission to Challenge n.11 !!

10 Likes

This is what I did too!

HI All,

This was quite a challenging assignment. It is my first time connecting to web data, but it was a good learning experiencing.

Here is a video of how the my dashboard works because I don’t know how to make my dashboard public. I think it is a permission thing.

And thanks Haroon for putting this challenge together. The eDNA videos came in big handy while doing this challenge.

Pictures

7 Likes

Good afternoon/evening everybody!

I was having some network issues yesterday so I didn’t get to post this on the forum. I was able to send my submission on time, but I wanted to wait till today to see if I could get the report to work using the direct connection to the source data. But anyways… below you’ll find my submission for Challenge 11. I decided to specialize my report to my home state of Minnesota in the United States to add a personal touch to the report and allow for a deeper dive into the data. This also allowed for me to take direct inspiration from the Tableau reports and one Power BI report the Minnesota Department of Health already use to report to the public. Combining these reports into a single comprehensive report, as well as incorporating the Minnesota COVID-19 model developed by an interdisciplinary team at the University of Minnesota School of Public Health and the Minnesota Department of Health, this allows for a single source for all COVID-19 data in the state as well as allowing for someone to split the report into multiple parts to improve performance.

Now, let’s dive into the report itself! The report has 5 distinct parts: Case Overview, Preparation and Response, Impact on Demographics, Vaccines, and Advanced Modeling. Below you will see a brief summary for each page and a screenshot of the complete page. Some of the pages are very tall, but as all the information is related I chose to keep them on one page and allow the user to scroll.


Home
Basic landing page with icons to direct the user to the appropriate report.

Summary


Case Overview
This is really the single quick view that higher ups and the general public would look at. Basic stats, clean, and minimalistic. This would most likely answer most peoples questions that aren’t looking for much more information.

Summary


Preparation and Response
This is the operation side of the pandemic (i.e. capacity, economics, public reactions, etc.). There’s five pages in this section including: Response Capacity, Public Health Risks, Economic & Food Security, Social Distancing, and Critical Care Supplies.

Summary

Response Capacity
Response Capacity focuses on if we currently have enough available beds, staff, or supplies to meet demand.

Public Health Risks
The Public Health Risks page focuses on tracking the spread of the virus and includes positive test rate, case growth rate, testing rate, hospitalization rate, and the rate of spread. Minnesota has established high risk and critical measures for each of these so we can see if we are not only flattening the curve but are getting to manageable levels.

Economic & Food Security
This is a high level look at how COVID has affected Minnesotans livelihoods by tracking new unemployment claims, total existing unemployment claims, and applications for the Supplemental Nutrition Assistance Program (SNAP).

Social Distancing
Yes, it’s not truly ‘social distancing’ since it tracking if people are staying 6 feet apart would be pretty invasive, but we can see how travel increases exposure, spread, and new COVID cases by county.

Critical Care Supplies
This page tracks the inventory, supply, and shipment of critical care supplies in the state. The state itself is working with the private and public sector to ensure that healthcare professionals have all the essential supplies they need to fight the virus even if demand for these supplies increases.


Vaccines
Now that the vaccines have been created and distribution is in full swing, we are no longer simply trying to hold off the virus, but bringing the fight to it. Demand for the vaccine is of course very high so this section of the report includes a high level vaccine summary, demographics on who is getting vaccinated, supply and distribution, and administration of the vaccine across the state.

Summary

Summary
The summary answers most questions people have around the administration and supply of the vaccine (how many people have been vaccinated, how much of each type do we have, how many doses have been administered).

Who’s Getting Vaccinated
Currently very basic since the data is just starting to come in, but breaks out vaccine dose administration by gender and age group.

Distribution
Tracks the available supply of the vaccine, supply shipped to Minnesota providers, and supply shipped for the CDC Long-Term Care Vaccination Program. They just released more information on this so I wish I could have included that in my submission but I felt like that would be cheating.

Vaccines Administered
Breaks out the administered doses of the vaccine by product type and provider type.


Demographics
This is probably the most fascinating part of the report in my opinion. This analyzes the impact COVID has had by demographics with a special focus on its effects by race/ethnicity in different facets of life. This section touches on the effect COVID has had on different demographics not only by the number of cases and deaths, but also by unemployment, worker characteristics, and homelessness.

Summary

Intro
This section starts with a basic overview of the purpose of this analysis and the need for such analysis when looking at the affect the virus has had.

COVID-19 Cases
This is a detailed breakout of the main overview page, but analyzes COVID-19 cases, hospitalizations, and deaths by race/ethnicity. This is one of the longest pages as it’s really three pages stacked on top of each other.

Unemployment
Takes a look at how COVID has affected employment by race/ethnicity. I would have liked to eventually add a correlation heatmap to this page to prevent any misinterpretation of the data, but the data does speak for itself here.

Worker Characteristics
The next three pages all reflect the impact COVID has had on worker characteristics (essential/non essential workers, ability to work from home, vulnerability to layoffs, etc.). The difference is in how you want to see the data: race/ethnicity, gender, income range. The top chart is a high level look for quick analysis and the bottom chart is industry specific.



Group Setting
This page focuses on the impact COVID has had on homelessness in regards to race/ethnicity. It also lays out the effects it has had on healthcare staff treating the homeless by race/ethnicity.


Advanced Modeling
I wish I could show you this visual, but I am having major network issues so this model isn’t able to connect to the source data, but it’s an incredible model. When my connectivity issues are resolved I’ll update this page.

A little bit about the model itself - the model used is a SEIR (susceptible-exposed-infectious-recovered) model and the interesting part about their model is that it contains a lot of flexibility around uncertainty which has really had a big impact on the spread and treatment of the virus. I was going to implement some additional what-if parameters to allow for a user to see directly how things like the availability of ICU beds, increases/decreases in travel, and funding impact the spread and treatment of the virus over time, but I since the code for this model is publicly available I am planning to incorporate that into some training videos for everyone. The model is rather easy to implement so I think it will be great for everybody to see the implementation of a model like that and how to add parameters to a machine learning model in Power BI.

Summary


I hope you like it!

12 Likes

Hi All, here is my submission.

Learnt a lot in these 3 days of putting up the dashboard. Thanks to the team.
I had the same issue to update the data, thanks to Melissa and hajni157 for answering that and thanks to Emercado777 for asking the question.

I have 2 pages and you can navigate via the button, the second page has some animation with the map visuals. Also have a custom tool tip , as you hover over the country, it gives you the statistics.

Regards,
Mustafa

5 Likes