Why does Sam use an index column in the data model?

rizzotony · July 16, 2021, 7:20pm

Hi all,

I’m learning PBI and am wondering why/if I need to use an index column?

For example, I have a fact table with a customer ID. I have a look-up table with customer ID. The relationship is one customer ID in the look-up table to many customer IDs in the fact table. Both fields are character fields.

Is there a resource I can be directed to learn more?

Thank you!

BrianJ · July 16, 2021, 8:02pm

@rizzotony,

I’ll give you the all-purpose Power BI answer that is correct in almost every circumstance - “it depends”.

I suspect you may be watching one of his videos on constructing financial statements. For some of those models, he uses an index field in certain tables to track/sort row position in assembling the visual. In other cases, for example where non-standard cumulative total patterns are needed, an index column can dramatically simplify the DAX required. Finally, in some cases where the filter conditions are very complex, I find an index column very helpful for debugging, since I can easily identify which rows are/are not getting pulled into the filter by looking at the index column, which can be helpful in sorting out why you’re getting unexpected results.

However, high cardinality columns (ones with lots of unique values) tend to be quite costly in terms of space and memory required, and index columns will have the maximum cardinality since by definition every value will be unique. So, despite the advantages in some cases of adding an index column, you shouldn’t do it by default, but evaluate whether the benefit in terms of analysis/viz/debugging is worth the cost of memory and space. So, it really does depend…

Great question. I hope this is helpful.

Brian

rizzotony · July 16, 2021, 8:28pm

Makes sense. Thank you!