Guided project on OLTP to OLAP data transformation and Decomposition Tree using US Super Store Sales data
Earlier, for any data transformation or data conversion project, we used different ETL tools. Yes, for large-scale enterprise projects, definitely we need ETL or data warehouse tools. But, using Power BI we can very well transform any transactional data to analytical data that means OLTP data model to OLAP data model.
In this blog, we are going to learn how we can create a Star-Schema data model from a transactional data file. We will explore Decomposition Tree AI visual.
Table of Contents
1. Data Set
3. Learning Objective
4. Import Data
5. Transform Data and Data Model Build up
6. Data Visualization
I have taken US Super Store data from Kaggle for this project.
There are 21 columns in this data set.
Data set contains the below information
- Order details
- Customer details
- Product details
- Sales, Quantity, Discount, Profit information
- Shipping details
- Geography details
The primary objectives of this case study are following
- Dimensiontional Data Model Build(OLTP to OLAP transformation)
- AI Visuals — Decomposition Tree
- Let’s start with the Get Data option under the Home tab. As this is a CSV file, select the Text/CSV option from the drop-down list
- Select the file named US Superstore data.csv
- After selecting the file, data will be displayed in the below format
- Click on Load and save data.
Transform Data and Data Model Build up
- Go to Power Query Editor and verify the data type of all columns whether it is in the required format or not. If not, then make those changes.
- Go to Left Panel → Right-click on the table under Queries and click on Reference.
- A new reference table will be created.
4. Rename this table with Dim — Product.
5. Under the Home tab → Go to the Manage Column group → Click on Choose Columns drop down → then click on Choose Columns and one dialog selection box will open with all column lists.
6. For the Dim — Product table, select only Product ID, Category, Sub-Category, Product Name columns. Unselect other columns and click on the Ok button.
7. Now if we select any Product ID from the drop-down, we can observe multiple rows for one Product ID. This is a dimension table, so we have to create unique for each product.
8. To create unique rows, go to the Reduce Rows group under the Home tab. Click on Remove Rows → Remove Duplicates and unique Product rows will be generated. Please note, during the above selection, make sure you have selected the right column which will be denoted as the primary key of this dimension table.
9. Product Dimension table is ready. Click on the Close & Apply button and come out from Power Query Editor.
10. Now in Desktop mode, click on Model under the left panel section and create the relationship between two tables.
11. Now in Desktop mode, go to the Fields section and select Dim — Product table to create a dimensional hierarchy.
12. We need to follow the same above steps to create Dim — Customer and Dim — Geography tables and rename the US Superstore Data table to the Fact — Sales table.
13. Create two date tables using the New Table feature under the Data section of Desktop mode. One date table is for the Order date and another for the Ship date. You can check my other blog to know about the Date table creation process. https://medium.com/analytics-vidhya/data-analysis-in-power-bi-how-to-build-forecasting-dashboard-with-retail-data-2f7918819dcc
14. To follow best practices, we will hide those fields in the fact table which have been already selected for dimension tables. For example, Customer Name, Product Name, Country, etc.
Data Visualization (AI Visual)
Power BI offers different AI visuals to leverage AI features in simple charts such as bar and column. Decomposition Tree is one of them.
It relates to a tree with branches. It allows a user to breakdown a measure across multiple dimensions in any order using the drill-down feature. For any type of ad-hoc analysis, it can be used.
This visual allows users to identify the split of the profit and sales based on product category, sub-category, and product name. This would help business teams to target those groups accordingly in their future campaigns in order to maximize the net profits.
Please download the code file from the below location
In this guided project we learned about the following things.
1. How to build a dimensional data model from transactional data.
2. Dimensional Hierarchy creation.
3. AI-Visuals — How to create Decomposition Tree visuals
If you have any questions related to this project, please feel free to post your comments.