Finding outliers

Hi

I am trying to find outliers for some production orders.

Goal :
I want to find the production orders which have unusual high costs.

Approach :
I thought I get the median (or mean) +/- the standard deviation and figure out which orders are above/below.

Issue :
Median / average work as long as my table only includes Articles. As soon as I get the order in as well, the median is no longer the median for all orders of the same article.
With this I cannot add the standard deviation to identify which order is an outlier.

Question :

  1. How do I get the median to be the same for each order of the same article?
  2. Is there a better approach to find outliers?

I attach some simple data examples and really hope someone can push me in the right direction.
Median Calculation.pbix (37.0 KB)

Thanks in advance
Franz

Hi @franz.lurvink,

Give this a go.

Median v2 = 
MEDIANX( 
    ADDCOLUMNS(
        CALCULATETABLE(
            SUMMARIZE( Data, Data[Article], Data[Year],  Data[MaterialCost p.pc.] ), 
            REMOVEFILTERS( Data[Order] )
        ),
        "Median", [MaterialPrice] 
    ),  Data[MaterialCost p.pc.] 
) 

I hope this is helpful.

1 Like

Hi Melissa

Thanks again for helping me out.
I don’t quite understand the setup … not yet, but I will study it.
My current problem is that the REMOVEFILTERS is not recognised. Strange enough KEEPFILTERS exists but not the REMOVEFILTERS. Any idea why this is?
(btw: I don’t use BI for this exercise but PowerPivot in Excel. But I guess this is not causing the issue.)

Regards
Franz

Hi Melissa

Sorry correction: your solution works perfectly in PowerBI … many thanks!!

In PowerPivot (Excel) the REMOVEFILTERS does not exist. Is there an alternative you would recommend for PowerPivot?

Regards
Franz

Yeah there are definitely some differences between Power Pivot and Power BI :pensive:
Give this a go in Power Pivot.

Median v3 = 
MEDIANX( 
    ADDCOLUMNS(
        CALCULATETABLE(
            SUMMARIZE( Data, Data[Article], Data[Year],  Data[MaterialCost p.pc.] ), 
            ALL( Data[Order] )
        ),
        "Median", [MaterialPrice] 
    ),  Data[MaterialCost p.pc.] 
)
1 Like

I owe you!!
Thanks a million.

Regards
Franz