I have some data around Company Size (#Employees) and would like to group these companies based on the size. The issue I am seeing is the column for #Employees is not standard within the source data. Sales reps are allowed to enter whatever value the may see fit; so I have a healthy mix of actual employee counts and ranges. So far I have tried duplicating the column and replacing all Ranges with the max value of said range, I can then change the type to numeric and use lookup tables or one of many other ways Sam has outlined grouping. This is not sustainable though as I am looking at a table with 10K+ entries. Each time data is refreshed, another ad-hoc range may be added that will need to be addressed in the data model. Attached screenshot gives a sample of this data. Question is, does anyone know of a more efficient way to accomplish this? At the end of the day, the request is to be able to filter based on company size.
That’s a tough one Randy, as I don’t think there is. I think the only way is to fix this at source. So make sure the input is correct.
Having to contend with so many variations isn’t really that sustainable from a Power BI point of view (or any analytics tool for that matter)
I think the way you are attempting to fix it is likely the best way you can be doing it. Mixing text and number values within columns though is never a good idea in my view.
Is there any possibility of fixing this at the source / input?
Thanks Sam. There is the potential, just means I have to sell the business on locking this field in the app to a pre-defined drop down selection. Had to try in BI first though
I’m always hesitant to advise placing band-aids over data input like this. As you’ve seen it becomes unmanageable within Power BI, with too many potential variations.
With these sort of problems I would almost always default to fixing the input side, but I know it’s always not that easy.