A guided Retail Data Analysis case study in Power BI
To know any tool or technology, it is always better to go through a case study or self-derived project. This time I thought to do Retail Data analysis as well as explore the Forecasting and Create Date table feature in Power BI.
Table of Contents
1. Data Set
3. Learning Objective
4. Import Data
5. Transform Data
6. Create Date Table
7. Data Model Build up
8. Data Visualization
If you search for any domain-specific data in Kaggle, you will surely get multiple options. So choose wisely the data set for your project. This time I have taken Retail Data.
In this data set, there are three files — sales data-set, Features data set, and stores data-set.
After analyzing the data files, here are some noted points
· Data contains weekly sales information about 45 stores including size and type of stores.
· It contains department information, CPI (Consumer Price Index) values, markdown information, etc.
The primary objectives of this case study are following
1. Functional Objective: Build Retail Analytics Dashboard
a. Scorecard information: Total Stores, Total Sales, Department Count, Average CPI
b. Report Visual:
i. Weekly Sales Forecasting
ii. Weekly Sales Trend Analysis
iii. The More Profitable Weeks
iv. The More Profitable Stores
v. Sales by Store Type
vi. Average CPI Trend Analysis
c. Reports should be filtered based on the year.
2. Technical Objective:
a. How to create a Date table in Power BI
b. How to create a Forecasting report
- Let’s start with the Get Data option under the Home tab. As this is a CSV file, select the Text/CSV option from the drop-down list
- Select the file named Features data set.csv
- After selecting the file, data will be displayed in below format
- Click on Transform Data and Navigate to Power Query Editor.
- Now it’s time to add two more files. In Power Query Editor, under the Home tab, click on New Source, then Text/CSV. Select the next file sales data-set.csv and press ok to add in the editor.
- Follow the same steps for stores data-set
After loading data, it’s time for data profiling as well as data transformation. This is the most important step for any data analysis project.
Now we are in Power Query Editor.
1. First rename the table name as Features, Sales, Stores
2. Go to the Features table.
a. Observe data type for CPI is text.
b. We need to change to number type.
c. There are some rows that contain the value “NA”.
d. Select CPI -> Right-Click → Select Replace Values. Here replace “NA” values with 0
e. Change the data type of CPI to decimal number.
3. Click on Close&Apply under the Home tab and come out from Power Query Editor.
Create Date Table
Now we are in the main Power BI editor where model, data, and report sections are present. For this analysis, we need to create a common Date table.
1. Click on Data from the left-side navigation pane
2. Click on New Table under the Tables Tools tab.
3. To create the Date table, write Date = CALENDARAUTO(). A date column with a new date table will be created.
· We have already loaded tables where already date columns are present. So Power BI takes that data as a reference to create and populate this data table.
· In Feature and Sales tables, there have date columns but not with continuous information. For any Time Series or Forecasting analysis, we need continuous calendar information. The purpose of this data table is to populate continuous date information.
4. Now create other columns (using New column) in the Date table which we could be used for data visualization.
a. Year = YEAR(‘Date’[Date])
b. MonthNum = FORMAT(‘Date’[Date],”MM”)
c. Month = FORMAT(‘Date’[Date],”MMM”)
d. WeekNum = WEEKNUM(‘Date’[Date],2)
e. Week Start Date = ‘Date’[Date]-WEEKDAY(‘Date’[Date],2)
f. Week End Date = ‘Date’[Date]-WEEKDAY(‘Date’[Date],2)+7
5. Select Don’t summarize for all number fields.
6. Change Sort by column for Month to MonthNum
Data Model Build up
To build a proper data model with these 4 tables, click on Model from the left panel.
1. To create a relationship between two tables, it is better to drag and drop method.
2. Select Store column of Store table -> Drag and drop to Store column of the Sales table.
3. Observe 1 to many relationships built. That means for one record of the Store table, there are multiple records present in the Sales table.
4. Follow the same method and create the following relationships.
a. Store table → Feature Table: 1 to Many relationships (using Store column of both tables)
b. Date table -> Sales table: 1 to Many relationships (using Date column of both tables)
c. Date table -> Features table: 1 to Many relationships (using Date column of both tables)
5. Here is the final data model after creating all relationships
Now we are in the Report section.
First, we create the below measures using the New measure under the Calculations group of the Home tab.
1. Sales table
· Total Weekly Sales: Total Weekly Sales = SUM(‘Sales’[Weekly_Sales])
· No. of Dept: No.of Dept = DISTINCTCOUNT(‘Sales’[Dept])
2. Stores table
· Total Stores: Total Stores = DISTINCTCOUNT(‘Stores’[Store])
3. Features table
· Avg CPI (Average CPI): Avg CPI = AVERAGE(‘Features’[CPI])
Let’s time to explore some visualization.
1. First select the theme for this report. Go to the View tab → select any theme for your preference under Themes. I have selected the Tidal theme.
2. Rename the page to Summary. Create the following reports to achieve our functional objective.
a. Slicer: Year
i. Total Stores
ii. Total Weekly Sales
iii. No.of Dept
iv. Avg CPI
c. Table: The More Profitable Weeks (In Formatting section, I select Style Flashy rows)
d. Table: The More Profitable Stores (Select Style same)
e. Treemap: Sales by Store Type
f. Line chart: Avg CPI Trend
g. Area chart: Weekly Sales Trend Analysis
h. Line chart: Weekly Sales Forecasting
i. First create a Line chart with Date and Total Weekly Sales fields.
ii. Change Date Fields from Date Hierarchy to Date
iii. Go to the Analytics section of this visual.
iv. Click on Forecast → Add one Forecast. Add below details.
· Forecast Length: 6 Months
· Ignore last : 0
a. Let’s put 3 in ignore last 3 months. This way, we can compare the Power BI’s forecasting result with the actual data in the last 3 months of the dataset.
b. Revert back to 0 in Ignore last column.
· Confidence Interval: 95%
· Seasonality: 70 points.
Final Summary Report
Publish to Power BI service
To publish the report, go to the File tab, select Publish, and the My Workspace. Now login to Power BI Service URL (https://app.powerbi.com/) and go to My Workspace to view the Retail Data Analysis Report.
Please find the Power BI file in the below GitHub Directory.
In this guided project we learned about the following things.
1. How some sample retail data looks like
2. Prototype of Retail Analytics reports
3. Create Date Table feature in Power BI
4. How Forecasting report can be created in Power BI
If you have any questions related to this project, please feel free to post your comments.