Data Analysis - Case Study in Power BI - Covid Vaccination Process

A complete guided case study in Power BI based on the Kaggle data set.


Introduction

In this blog, we explore COVID vaccine data to see how the immunization process is progressing worldwide.

Source of Data

Kaggle is a good source of data for case studies. This time I am using the dataset below.

https://www.kaggle.com/gpreda/covid-world-vaccination-progress?select=country_vaccinations_by_manufacturer.csv

  1. The dataset has two files, but this case study uses only the file named “country_vaccinations.csv”.
  2. Most fields are self-explanatory, but the picture below describes each field in more detail.
Content from Kaggle

Our Objective

The objective of this case study is to follow the primary steps of a data analysis project.

  1. Data Understanding
  2. Data Import and Cleaning process
  3. Create Relationship 
  4. Choose recommended graphs for visualization and know the purpose of those graphs
  5. Publish to Power BI Service

Import Data

  • Let’s start with the Get Data option under the Home tab. As this is a CSV file, select the Text/CSV option from the drop-down list.
  • Select the file named country_vaccinations.csv.
  • After you select the file, a preview of the data is displayed in the format below.
Image by Author
  • Click on Load to bring the data into Power BI Desktop.
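If you want to peek at the same file outside Power BI, here is a minimal Python sketch of the import step. It is only an illustration, not part of the Power BI workflow. The column names other than country, date, daily_vaccinations_raw and daily_vaccinations are my assumption about the CSV header, and the sample values are made up.

```python
import csv
import io

# Tiny inline stand-in for country_vaccinations.csv; the values are made up
# and the header is an assumption about the real file.
SAMPLE_CSV = """country,date,total_vaccinations,people_vaccinated,people_fully_vaccinated,daily_vaccinations_raw,daily_vaccinations
India,2021-01-16,191181,191181,,,191181
India,2021-01-17,224301,224301,,33120,112150
England,2021-01-16,3514385,3090058,424327,,262134
"""


def load_rows(text_stream):
    """Read CSV text into a list of dicts, one dict per data row."""
    return list(csv.DictReader(text_stream))


rows = load_rows(io.StringIO(SAMPLE_CSV))
columns = list(rows[0].keys())

# Show a short preview, similar in spirit to the preview dialog in Power BI.
print(len(rows), "rows,", len(columns), "columns")
for row in rows[:2]:
    print(row["country"], row["date"], row["daily_vaccinations"])
```

To try it on the real download, open the file with open("country_vaccinations.csv", newline="", encoding="utf-8") and pass the file object to load_rows instead of the inline sample.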

Data Cleaning

After importing, the next step is data cleaning.

  1. Click on Transform Data under the Home tab to open Power Query Editor.
  2. In Power Query Editor, go to the View tab and enable Column Distribution, Column Quality and Column Profile.
  3. These options help you find missing values, data errors, data type mismatches, outliers etc.
  4. Based on these findings, you can take appropriate action.
  5. For example, in this data, daily_vaccinations_raw is about 90% empty, which means most of its values are missing, whereas daily_vaccinations is less than 1% empty. Both columns serve the same purpose, so we can remove daily_vaccinations_raw.
Data Cleaning, Power Query Editor, Power BI
Image by Author

6. Right-click on the Country column → click on Replace Values and replace “England” with “United Kingdom”.

Image by Author

7. Now click on the Close & Apply button and return to the main Power BI Desktop pane.
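The cleaning steps above can be sketched in a few lines of Python. This is only a rough illustration of the logic, not what Power Query runs internally. The helper names are hypothetical and the sample rows are made up, so the percentages here will not match the real file.

```python
def empty_percentage(rows, column):
    """Share of rows (0 to 100) where the given column is blank."""
    if not rows:
        return 0.0
    empty = sum(1 for row in rows if not (row.get(column) or "").strip())
    return 100.0 * empty / len(rows)


def drop_column(rows, column):
    """Return copies of the rows without the given column."""
    return [{k: v for k, v in row.items() if k != column} for row in rows]


def replace_value(rows, column, old, new):
    """Return copies of the rows with exact matches of old replaced by new."""
    return [
        {**row, column: new if row[column] == old else row[column]}
        for row in rows
    ]


# Made-up sample rows; blank strings stand for missing values.
rows = [
    {"country": "England", "daily_vaccinations_raw": "", "daily_vaccinations": "262134"},
    {"country": "India", "daily_vaccinations_raw": "", "daily_vaccinations": "191181"},
    {"country": "India", "daily_vaccinations_raw": "33120", "daily_vaccinations": "112150"},
    {"country": "India", "daily_vaccinations_raw": "", "daily_vaccinations": "105000"},
]

# Step 5: compare how empty the two columns are, then drop the sparse one.
raw_empty = empty_percentage(rows, "daily_vaccinations_raw")
kept_empty = empty_percentage(rows, "daily_vaccinations")
cleaned = drop_column(rows, "daily_vaccinations_raw")

# Step 6: replace "England" with "United Kingdom" in the country column.
cleaned = replace_value(cleaned, "country", "England", "United Kingdom")

print(raw_empty, kept_empty, cleaned[0]["country"])
```

Like the Column Quality view, the check only tells you how empty a column is. Deciding which column to keep is still your call.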

Create Date Table

  1. First, create a Date table before proceeding to any calculation.
  2. You will create it with a DAX function, and this Date table will help you do Time Intelligence analysis.
  3. Go to the Modelling tab → click on New table.
  4. Write “Date Table = CALENDARAUTO()” and the automatic date table is now in place.
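To get a feel for what CALENDARAUTO() produces, here is a small Python sketch. As far as I understand the function, it returns one contiguous Date column that covers whole years, from the start of the year of the earliest date in the model to the end of the year of the latest date. The sketch assumes the default fiscal year ending in December, and the function name calendar_auto is my own.

```python
from datetime import date, timedelta


def calendar_auto(dates):
    """Contiguous daily dates from 1 Jan of the earliest year to 31 Dec of the latest year."""
    start = date(min(dates).year, 1, 1)
    end = date(max(dates).year, 12, 31)
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


# Made-up fact dates spanning two calendar years.
fact_dates = [date(2020, 12, 14), date(2021, 1, 16), date(2021, 3, 1)]
date_table = calendar_auto(fact_dates)

print(date_table[0], date_table[-1], len(date_table))
```

Because every day appears exactly once with no gaps, a table like this can sit on the "one" side of a relationship and support month-on-month comparisons.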

Create Relationship

Now you have two tables, and it’s time to create a relationship between them.

  1. Click on “Model” in the left side navigation bar.
  2. Click on the date column of the country_vaccinations table, then drag and drop it onto the Date column of the Date Table.
  3. A many-to-one relationship is created: many vaccination rows can share one date, while each date appears only once in the Date Table.
Relationship in Power BI
Image by Author
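A many-to-one relationship only works cleanly when the key is unique on the "one" side and every key on the "many" side has a match. The following Python sketch shows that check under those assumptions. The function name is hypothetical and the dates are made up.

```python
from datetime import date, timedelta


def check_many_to_one(fact_keys, dimension_keys):
    """Return (dimension_is_unique, fact_keys_missing_from_dimension)."""
    dim = list(dimension_keys)
    unique = len(dim) == len(set(dim))
    unmatched = sorted(set(fact_keys) - set(dim))
    return unique, unmatched


# "One" side: a date table covering all of 2021, each date exactly once.
date_table = [date(2021, 1, 1) + timedelta(days=i) for i in range(365)]

# "Many" side: several countries report on the same date.
fact_dates = [date(2021, 1, 16), date(2021, 1, 16), date(2021, 1, 17)]

unique, unmatched = check_many_to_one(fact_dates, date_table)
print(unique, unmatched)

# A fact date outside the date table would show up as unmatched.
_, outside = check_many_to_one([date(2020, 12, 31)], date_table)
print(outside)
```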

Now you are ready to create visualizations. One thing to remember: depending on the visualization requirements, we will create different calculated measures or calculated columns.

Modify Column Name

Give all fields and tables proper names. For example, replace underscores with spaces and start each word with a capital letter.

Rename Field in Power BI
Image by Author
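The renaming rule is simple enough to express as a small Python helper. This is just an illustration of the naming convention, with a function name of my own choosing. In Power BI you still rename the fields by hand or in Power Query.

```python
def friendly_name(column):
    """Turn a snake_case column name into space-separated capitalised words."""
    return " ".join(word.capitalize() for word in column.split("_") if word)


for name in ["country", "daily_vaccinations", "people_fully_vaccinated"]:
    print(name, "->", friendly_name(name))
```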

Select Theme

Before building the report, you can select a theme for your project. It will help you choose a consistent colour combination. Each theme comes with some suggested colours, but you can still select any other colour.

Go to the View tab → under Themes, select Executive (or any other theme of your choice).

Create Snapshot View

For any summary report or dashboard, it is a good idea to have some snapshot views. 

At a glance, the user will get some idea about the current scenario of the business/data. 

  1. Click on the Card visual to add it to the canvas area.
  2. Select the field People Fully Vaccinated.
  3. Click on Format → go to Data label and Category label. Change the colour, font family and text size, and add a background colour.
Image by Author

4. Follow the same process for Total Vaccinations, People Vaccinated and Total Country.

5. To derive Total Country → select the Country column and change the aggregation to Count (Distinct) from the drop-down.

Image by Author

6. Using Format painter, copy the same format to all the Card visuals.

Image by Author
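Here is a rough Python sketch of what the cards compute, assuming a card keeps the default Sum aggregation for numeric fields and Count (Distinct) for Country. The helper names and sample rows are made up. One caveat: if people_fully_vaccinated is a running total per country (an assumption worth checking on the Kaggle page), a plain Sum adds the same people again on every date, so the sketch also shows the latest value per country as an alternative.

```python
def to_number(text):
    """Blank cells contribute 0 here, so they do not change a sum."""
    return float(text) if text.strip() else 0.0


def card_sum(rows, column):
    """Plain sum over all rows."""
    return sum(to_number(row[column]) for row in rows)


def card_distinct_count(rows, column):
    """Number of distinct non-blank values, like Count (Distinct)."""
    return len({row[column] for row in rows if row[column]})


def latest_per_country(rows, column):
    """Sum of each country's most recent non-blank value (alternative to a plain sum)."""
    latest = {}
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    for row in sorted(rows, key=lambda r: r["date"]):
        if row[column].strip():
            latest[row["country"]] = to_number(row[column])
    return sum(latest.values())


# Made-up sample rows with cumulative-looking values.
rows = [
    {"country": "India", "date": "2021-01-16", "people_fully_vaccinated": ""},
    {"country": "India", "date": "2021-02-16", "people_fully_vaccinated": "100"},
    {"country": "India", "date": "2021-02-17", "people_fully_vaccinated": "150"},
    {"country": "United Kingdom", "date": "2021-02-16", "people_fully_vaccinated": "400"},
    {"country": "United Kingdom", "date": "2021-02-17", "people_fully_vaccinated": "450"},
]

total_sum = card_sum(rows, "people_fully_vaccinated")
total_country = card_distinct_count(rows, "country")
total_latest = latest_per_country(rows, "people_fully_vaccinated")
print(total_sum, total_country, total_latest)
```

Comparing the plain sum with the latest-per-country figure is a quick way to see whether the number on a card is the one you actually want to present.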

Add Year, Month Slicer

  1. Add a Slicer visual beside the card visuals.
  2. Add the Date Hierarchy → keep only Year and Month.
  3. People are normally interested in how the vaccination process is going month on month, so Year and Month are enough here.
Image by Author
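Conceptually, the slicer just filters the rows that feed every other visual on the page. A minimal Python sketch of that idea, with a hypothetical function name and made-up rows, could look like this:

```python
from datetime import date


def slice_by_year_month(rows, year=None, month=None):
    """Keep rows whose date matches the selected year and/or month; None means 'all'."""
    selected = []
    for row in rows:
        d = date.fromisoformat(row["date"])
        if year is not None and d.year != year:
            continue
        if month is not None and d.month != month:
            continue
        selected.append(row)
    return selected


# Made-up sample rows.
rows = [
    {"country": "India", "date": "2020-12-30"},
    {"country": "India", "date": "2021-01-16"},
    {"country": "India", "date": "2021-02-01"},
    {"country": "United Kingdom", "date": "2021-02-15"},
]

february_2021 = slice_by_year_month(rows, year=2021, month=2)
all_2021 = slice_by_year_month(rows, year=2021)
print(len(february_2021), len(all_2021), len(slice_by_year_month(rows)))
```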

Create Line Graph “People Vaccinated by Date”

  1. Add a Line Chart to the canvas area.
  2. Add Date in Axis and People Vaccinated in Values. This is a trend analysis, meaning it shows how a value changes by date, so a line graph is the preferred way to show how the data varies over time.
  3. In the Format section, you can make the following changes:

a) choose one of the Data colors,

b) enable Data labels and set Display units so that the values are clearly visible and the variations are easy to follow. Try to keep the same display units across all reports, as it helps users compare the figures.

c) if you want, you can modify the title, the font size or the text style.

Image by Author
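Behind the line chart, the values are grouped by date and aggregated. Here is a small Python sketch of that grouping, assuming the default Sum aggregation. The function name and sample rows are my own.

```python
from collections import defaultdict


def sum_by_date(rows, column):
    """Sum a numeric column per date and return (date, total) pairs in date order."""
    totals = defaultdict(float)
    for row in rows:
        if row[column].strip():  # skip blank cells
            totals[row["date"]] += float(row[column])
    return sorted(totals.items())


# Made-up sample rows; one value is missing on the second date.
rows = [
    {"country": "India", "date": "2021-01-16", "people_vaccinated": "100"},
    {"country": "United Kingdom", "date": "2021-01-16", "people_vaccinated": "300"},
    {"country": "India", "date": "2021-01-17", "people_vaccinated": "150"},
    {"country": "United Kingdom", "date": "2021-01-17", "people_vaccinated": ""},
]

trend = sum_by_date(rows, "people_vaccinated")
for day, total in trend:
    print(day, total)
```

Notice that the second date drops only because one country has no value that day. A dip in the line can come from missing data rather than from a real slowdown, so it is worth checking before drawing a conclusion.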

Create Clustered Column Chart “Top 10 Country by People Fully Vaccinated”

  1. Add a Clustered Column Chart to the canvas area.
  2. Add Country in Axis and People Fully Vaccinated in Values.
  3. For a comparison analysis like this one, a clustered column chart is a good choice.
  4. In the Filters section, select the Top N filter type for Country, set Show items to Top 10 and By value to People Fully Vaccinated.
Image by Author

5. In the Format section, you can make the following changes:

a) choose one of the Data colors,

b) enable Data labels and set Display units so that the values are clearly visible and the variations are easy to follow.

c) if you want, you can modify the title, the font size or the text style.

Image by Author
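The Top N filter boils down to "aggregate per country, sort descending, keep the first N". A minimal Python sketch of that logic follows. The function name and data are made up, and ties are simply cut off at N here, which may differ from how Power BI treats ties.

```python
def top_n(rows, category, value, n=10):
    """Sum value per category and return the n largest (category, total) pairs."""
    totals = {}
    for row in rows:
        if row[value].strip():  # skip blank cells
            totals[row[category]] = totals.get(row[category], 0.0) + float(row[value])
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Made-up sample rows.
rows = [
    {"country": "A", "people_fully_vaccinated": "10"},
    {"country": "A", "people_fully_vaccinated": "5"},
    {"country": "B", "people_fully_vaccinated": "20"},
    {"country": "C", "people_fully_vaccinated": "1"},
    {"country": "D", "people_fully_vaccinated": ""},
]

top_two = top_n(rows, "country", "people_fully_vaccinated", n=2)
print(top_two)
```

The same helper works for the Daily Vaccinations ranking by passing a different value column.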

Create Clustered Bar Chart “Top 10 Country by Daily Vaccinations”

  1. Add a Clustered Bar Chart to the canvas area.
  2. Add Country in Axis and Daily Vaccinations in Values.
  3. This is also a comparison analysis, so a bar graph works well. This time I am using a Clustered Bar Chart.
  4. In the Filters section, select the Top N filter type for Country, set Show items to Top 10 and By value to Daily Vaccinations.
Image by Author

5. In the Format section, you can make the following changes:

a) choose one of the Data colors,

b) enable Data labels and set Display units so that the values are clearly visible and the variations are easy to follow.

c) if you want, you can modify the title, the font size or the text style.

Image by Author
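The Display units setting mentioned in the Format steps only changes how a number is shown, not the number itself. Here is a small Python sketch of the idea. The unit names and suffixes are illustrative assumptions on my part and may not match Power BI's labels exactly.

```python
# Divisor and suffix per display unit; the suffixes are illustrative assumptions.
UNITS = {
    "None": (1, ""),
    "Thousands": (1e3, "K"),
    "Millions": (1e6, "M"),
    "Billions": (1e9, "bn"),
}


def with_display_units(value, unit="None", decimals=2):
    """Format a number scaled to the chosen display unit, with a unit suffix."""
    divisor, suffix = UNITS[unit]
    return f"{value / divisor:,.{decimals}f}{suffix}"


print(with_display_units(262134, "Thousands"))
print(with_display_units(3514385, "Millions"))
print(with_display_units(1500, "None", decimals=0))
```

Using one unit for every visual on the page keeps bars and labels comparable at a glance.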

Create Map Visual “Total Vaccinations by Country”

  1. Add a Map visual to the canvas area.
  2. Add Country in Location and Total Vaccinations in Size.
  3. In this analysis, you want an overall picture of the vaccination process across all countries. A map visual is ideal for this.
  4. In the Format section, you can make the following changes:

a) choose one of the Data colors,

b) change the Map styles to Grayscale and update the Bubbles size,

c) if you want, you can modify the title, the font size or the text style.

Image by Author

Publish to Power BI Service

  1. Go to the File tab → Click on Publish
  2. Click on Publish to Power BI
  3. Select a destination workspace and click on the Select button 
  4. Now your report is published to Power BI Service.
Image by Author

Open Report from Power BI Service

  1. Type in browser → https://app.powerbi.com/
  2. Enter your credentials. 
  3. Go to the workspace where you have published. 
  4. Now your report is ready for presentation.
Image by Author

Try It Yourself

  1. Try other types of analysis using this data set.
  2. Play with different formatting options to make the report more presentable.
  3. Add a page background to make it look more professional.

Download

File Name: Covid Data analysis

Video

Conclusion

In this blog, we have completed a guided case study in Power BI based on the Kaggle data set.

If you have any questions related to this project, please feel free to post your comments.

Please visit my website for other technical resources.

Please like, comment and subscribe to my YouTube channel. 🙂 Keep Learning.
