You will learn how to build statistical visuals like decision trees in Power BI.
Visualizing the tree is an important part of working with the Decision Tree machine learning algorithm in a data science project. Power BI has limited built-in support for statistical visuals, but its custom visuals make it possible to build one.
In this blog, we are going to explore one statistical visualization: the decision tree.
For this case study, I use the US Superstore dataset from Kaggle.
- Let’s start with the Get Data option under the Home tab. As this is a CSV file, select the Text/CSV option from the drop-down list.
- Select the file named US Superstore data.csv.
- After you select the file, the data is displayed in the format shown below.
- Click on Load, then save the data.
What is a Decision Tree?
A decision tree resembles an upside-down tree.
A decision tree splits the data into multiple sets. Then, each of these sets is further split into subsets to arrive at a decision.
Decision trees make it very easy to determine the important attributes. Building one requires performing tests on attributes to split the data into multiple partitions.
Because every split is an explicit test, we can trace a decision tree back from any result and see the factors that led to that decision.
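To make the idea of splitting concrete, here is a minimal Python sketch of how a single split could be chosen for a numeric target. It assumes the split is scored by the sum of squared errors (SSE), a common criterion for regression trees. The rows and the helper names sse and best_split are made up for illustration and are not part of the Power BI visual.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, feature, target):
    """Return (threshold, sse_after) for the threshold that minimizes the
    combined SSE of the two partitions, or None if no split is possible."""
    xs = sorted({row[feature] for row in rows})
    best = None
    for lo, hi in zip(xs, xs[1:]):
        threshold = (lo + hi) / 2  # midpoint between neighbouring values
        left = [r[target] for r in rows if r[feature] < threshold]
        right = [r[target] for r in rows if r[feature] >= threshold]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best


# Made-up rows for illustration only; not taken from the US Superstore data.
rows = [
    {"Discount": 0.0, "Sales": 300.0},
    {"Discount": 0.0, "Sales": 320.0},
    {"Discount": 0.2, "Sales": 200.0},
    {"Discount": 0.2, "Sales": 180.0},
    {"Discount": 0.5, "Sales": 60.0},
    {"Discount": 0.5, "Sales": 40.0},
]

threshold, sse_after = best_split(rows, "Discount", "Sales")
sse_before = sse([r["Sales"] for r in rows])
print(f"Best test: Discount < {threshold:.2f}")
print(f"SSE before split: {sse_before:.1f}, after split: {sse_after:.1f}")
```

A full tree repeats this search inside each partition, which is why the tests can later be read back as the factors behind a prediction.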
If you want to know more about the decision tree, you can check my blog about this.
How to Create a Decision Tree?
In Power BI, many custom visuals are based on R packages. The Decision Tree Chart uses the R package rpart to build the model and rpart.plot to visualize the model as a tree.
Let’s create a Decision Tree step by step.
1. Go to the Visualization section → Click on Get more visuals.
2. The “Power BI Visuals” dialog box opens. Search for “decision tree”.
3. Click on the Add button beside Decision Tree Chart.
4. Select the Decision Tree Chart visual and add it to your current page.
5. This tree predicts Sales as the Target Variable, dependent on the Input Variable Discount. Add these variables to the visual to get the initial view.
6. The formatting section of the visual exposes the algorithm parameters, so changing a setting there can change the tree itself.
7. If you enable Tree parameters, you can see that Maximum depth is 15 and Minimum bucket size is 2.
Maximum depth is a value between 2 and 15 that limits the number of levels from trunk to leaf. Minimum bucket size is a value between 2 and 100; the higher this number, the lower the number of nodes.
8. When you enable Advanced parameters, three more parameters become available: Complexity, Cross-validation and Maximum attempts.
Complexity is a number between 0.5 and one trillion that controls whether a node is split further or not.
Cross-validation offers a set of options such as Auto, None, and 2-fold through 100-fold.
Maximum attempts is a number between 1 and 1000.
For both Cross-validation and Maximum attempts, the higher the value, the better the accuracy, but the longer the calculation takes.
I am keeping the default values.
9. Additional parameters control whether warnings are shown and whether Show info is enabled.
10. Now the Decision Tree is ready to display.
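The effect of the Tree parameters and of Complexity from the steps above can be sketched in a few lines of Python. This is not the code behind the visual. It is a toy regression tree that assumes the three settings behave like the maxdepth, minbucket and cp controls of rpart: a depth limit, a minimum number of rows per leaf, and a minimum relative improvement a split must bring. The function names and the (Discount, Sales) pairs are invented for illustration.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def grow(points, max_depth, min_bucket, cp, depth=0, root_sse=None):
    """Grow a toy regression tree on (x, y) pairs; returns nested dicts."""
    ys = [y for _, y in points]
    node_sse = sse(ys)
    if root_sse is None:
        root_sse = node_sse  # reference value for the complexity check
    leaf = {"n": len(points), "prediction": sum(ys) / len(ys)}

    # Maximum depth: stop once the allowed number of levels is reached.
    if depth >= max_depth:
        return leaf

    xs = sorted({x for x, _ in points})
    best = None
    for lo, hi in zip(xs, xs[1:]):
        threshold = (lo + hi) / 2
        left = [p for p in points if p[0] < threshold]
        right = [p for p in points if p[0] >= threshold]
        # Minimum bucket size: each child must keep at least min_bucket rows.
        if len(left) < min_bucket or len(right) < min_bucket:
            continue
        gain = node_sse - sse([y for _, y in left]) - sse([y for _, y in right])
        if best is None or gain > best[0]:
            best = (gain, threshold, left, right)

    # Complexity: drop splits whose gain is too small relative to the root SSE.
    if best is None or root_sse == 0 or best[0] / root_sse < cp:
        return leaf

    _, threshold, left, right = best
    return {
        "n": len(points),
        "threshold": threshold,
        "left": grow(left, max_depth, min_bucket, cp, depth + 1, root_sse),
        "right": grow(right, max_depth, min_bucket, cp, depth + 1, root_sse),
    }


def count_leaves(node):
    """Count the terminal nodes of a tree built by grow()."""
    if "prediction" in node:
        return 1
    return count_leaves(node["left"]) + count_leaves(node["right"])


# Made-up (Discount, Sales) pairs; not taken from the US Superstore data.
points = [
    (0.0, 300.0), (0.0, 320.0),
    (0.1, 260.0), (0.1, 250.0), (0.1, 270.0),
    (0.2, 200.0), (0.2, 180.0),
    (0.5, 60.0), (0.5, 40.0), (0.5, 50.0),
]

for min_bucket in (2, 5):
    tree = grow(points, max_depth=15, min_bucket=min_bucket, cp=0.01)
    print(f"min_bucket={min_bucket}: {count_leaves(tree)} leaves")
```

On this toy data, raising min_bucket, lowering max_depth or raising cp each produces a smaller tree, which matches the behaviour described for the visual's settings.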
In the Decision Tree algorithm, the Root Node, Decision Nodes and Terminal Nodes are the key concepts. The picture below gives you an idea of how they fit together.
- The root node is selected based on the results of the tests on the selected attributes.
- These tests are then repeated at each level until a node cannot be split into sub-nodes. Such a node is called a leaf node, or terminal node.
- For the outcome of a prediction with a decision tree, only the leaf-level nodes (plotted on the bottom) are used.
- Here at nodes 1 and 3, no decision can be made, so the data is distributed further to the next-level nodes.
- From the other nodes (2, 5, 6, 7 etc.), we get the predictive result based on the decisions taken along the way. This is how you read a decision tree.
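Reading a tree this way amounts to walking from the root to a leaf. The short Python sketch below illustrates that walk on a hand-written tree. The thresholds, predictions and the predict helper are hypothetical and do not correspond to the nodes in the picture.

```python
# Hypothetical tree: thresholds and predictions are invented for illustration.
tree = {
    "feature": "Discount",
    "threshold": 0.35,
    "yes": {  # rows where Discount < 0.35
        "feature": "Discount",
        "threshold": 0.15,
        "yes": {"prediction": 280.0},  # leaf node
        "no": {"prediction": 190.0},   # leaf node
    },
    "no": {"prediction": 50.0},        # leaf node
}


def predict(node, row):
    """Walk from the root to a leaf; return (prediction, tests passed on the way)."""
    path = []
    while "prediction" not in node:  # decision node: apply its test
        went_yes = row[node["feature"]] < node["threshold"]
        answer = "yes" if went_yes else "no"
        path.append(f"{node['feature']} < {node['threshold']}: {answer}")
        node = node["yes"] if went_yes else node["no"]
    return node["prediction"], path


prediction, path = predict(tree, {"Discount": 0.2})
print("Predicted Sales:", prediction)
for step in path:
    print("  ", step)
```

Only the leaf supplies the predicted value. The decision nodes above it contribute the list of tests that explains how the row got there.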
Please find the code in the below location
In this blog, we learned how to create and analyse decision trees in Power BI.
In my next blog, we will learn more about AI and Power BI.
If you have any questions related to this project, please feel free to post your comments.
Please visit my website for other technical resources.
Please like, comment and subscribe to my YouTube channel which you have already seen. 🙂 Keep Learning.