How to Use 27+ Statistical Functions in DAX - DAX in Power BI - Chapter 13

We are going to explore 28 key statistical functions in DAX using practical examples.

According to Microsoft documentation, there are different statistical and aggregation functions in DAX.

In this blog, you will explore functions related to the below items,

1. average,
2. maximum,
3. minimum,
4. count,
5. median,
6. standard deviation (sample, population),
7. variance (sample, population)
8. sample data extraction from population data.

Get Data

Data Set 1

At first import, the US Super Store data from Kaggle and save it to Power BI Desktop.

• Let’s start with the Get Data option under the Home tab. As this is a CSV file, select the Text/CSV option from the drop-down list
• Select the file named US Superstore data.csv
• After selecting the file, data will be displayed in the below format
• Click on Load and save data.

Data Set 2

1. Create another table using some data.
2. Make sure this data set should have some blank column values. It will help you to use different functions related to blank.

AVERAGE, AVERAGEA, AVERAGEX

First, we need to know what is the purpose of three different average functions. If you go into detail about the argument and return type of these functions, you can identify the difference.

AVERAGE: It takes a numeric column as an argument and returns a decimal number that represents the arithmetic mean of that column values.

AVERAGEA: It returns a decimal number, but it handles text and non-numeric values.

AVERAGEX: This function takes a table as its first argument and an expression as the second argument. First, it evaluates expressions for each row of a table, and then take the resulting set of values and calculate its arithmetic mean.

Example: Let’s find out some examples of these.

MAX, MAXA, MAXX

MAX: It returns the largest value in a column. It doesn’t support logical values.

MAXA: It also returns the largest value in a column, but it supports date type as well as logical values like TRUE/FALSE.

MAXX: It evaluates an expression of a table and returns the largest values.

See the examples below.

MIN, MINA, MINX

MIN: It returns the smallest value in a column. It doesn’t support logical values.

MINA: It also returns the smallest value in a column, but it supports date type as well as logical values like TRUE/FALSE.

MINX: It evaluates an expression of a table and returns the smallest values.

See the examples below.

MEDIAN, MEDIANX

MEDIAN: It returns the decimal value of numbers in a column.

MEDIANX: It takes a table as the first argument or an expression that returns a table. The second argument is a column that contains the numbers for which you want to compute the median or an expression that evaluates to a column.

See the examples below.

COUNT, COUNTA, COUNTAX

COUNT: It returns the count of rows that contain the following kinds of values: Numbers, Dates, Strings. TRUE/FALSE values are not supported. Blank values are skipped

COUNTA: It is similar to the COUNT function, but it supports TRUE/FALSE values.

COUNTAX: It counts nonblank results when evaluating the result of an expression over a table. Whenever the function finds no rows to aggregate, the function returns a blank. If you want to count logical values, use the COUNTAX function.

COUNTX, COUNTBLANK

COUNTX: It takes two arguments. The first argument must always be a table or any expression that returns a table. The second argument is the column or expression that is searched by COUNTX.

COUNTBLANK: It returns the number of blank cells in a column.

COUNTROWS

This function is used to count the number of rows in a base table, but more often is used to count the number of rows that result from filtering a table, or applying context to a table.

DISTINCTCOUNT, DISTINCTCOUNTNOBLANK

DISTINCTCOUNT: It takes a column as an argument. You can use columns containing any type of data. When the function finds no rows to count, it returns a BLANK, otherwise, it returns the count of distinct values. It includes the BLANK value.

DISTINCTCOUNTNOBLANK: Unlike the DISTINCTCOUNT function, DISTINCTCOUNTNOBLANK does not include the BLANK value.

See the examples below.

Extract Data using SAMPLE Function

This is a very interesting part of Power BI.

Using the Sample function you can create one table which contains the sample rows or sub data set of the main table.

In statistics, there are two types of data sets Population and Sample. Population means the complete data set, it could be a huge amount of data. The sample is a subset of that large data set.

For any statistical analysis, we are considering this sample data set.

The SAMPLE function returns a sample of N rows from the specified table.

STDEV.S, STDEVX.S, VAR.S, VARX.S

STDEV.S: It returns the standard deviation of a sample population.

STDEVX.S: It returns the expression for each row of the table and returns the standard deviation of expression assuming that table refers to a sample of the population.

VAR.S: It returns a number with the variance of a sample population.

VARX.S: It returns the expression for each row of the table and returns the variance of expression; on the assumption that the table refers to a sample of the population.

See the examples below.

STDEV.P, STDEVX.P, VAR.P, VARX.P

STDEV.P: It returns the standard deviation of a sample population.

STDEVX.P: It returns the expression for each row of the table and returns the standard deviation of expression assuming that table refers to a sample of the population.

VAR.P: It returns a number with the variance of a sample population.

VARX.P: It returns the expression for each row of the table and returns the variance of expression; on the assumption that the table refers to a sample of the population.

See the examples below.

Please find the code in the below location

DAX — Chapter 13

https://github.com/arpitag1/Power-BI

Conclusion

In this blog, we learn the 28 statistical functions of DAX with practical examples.