When to create a dimension and not

Hi!

Are there any general recommendations for when to create a separate dimension table and when to simply keep the values in the fact table of as part of another dimension table?
Take the following example with the raw tables from a database (no transformation) of course the customer table could contain many more columns

Would it be better to keep this table structure or should you merge the Country and Gender values to the Customer table like this.

I guess that if I use the dimensions on other fact tables (other than sales) it makes sense to keep it as separate table but when (if ever) is it a good idea to keep the values in the dimension or fact table and not break it into its own table?

Thanks
Fredrik

@Frede,

Your question is probably the most debated issue in data modeling - snowflake (your top configuration) vs. star schema (your bottom configuration). Here’s a good summary table of the pros and cons of each:

My personal view is to stick with star schema when at all possible. It makes for a simpler data model, and makes DAX easier to write and debug. But I don’t often work with really large datasets, where the denormalization in a star schema would be a bigger issue.

I hope this is helpful.

  • Brian

P.S. Here’s another recent thread on this issue with a link to an excellent Microsoft article on star and snowflake schemas:

Thanks Brian!

I guess there is no definitive right or wrong but the usual “it depends”. One advantage of keeping to a star schema IMO is that for end-users that want to create their own reports from scratch it is much more intuitive to look for the gender or country fields in the customer table, than to have to open the gender and country tables to find them and add to their visuals.

@Frede,

Totally agree. Snowflake schemas are definitely valid, and have their place, but I think in most cases star schema makes everyone’s life a lot easier.

I heard Marco Russo talk on this topic this summer and he said something to this effect: A star schema will work for almost every model. Many of you are probably sitting there thinking “yes, but my model is special”. I’m here to tell you today that your model is not special – use a star schema.

It got a big laugh and probably was a bit of an exaggeration to make a point, but I think it’s pretty good advice…

  • Brian
2 Likes