Data model best practice for many columns

Hi all,

I extract my HubSpot data from Power BI using an API connector. HubSpot utilizes one large table to house most of their data around contacts, meaning roughly 376 columns? I don’t need all those columns obviously but in your estimation, what is the maximum # of columns a table should have before it’s best to duplicate. The maximum columns I would need is 68 out of the 376.

Your insight is appreciated.

Hi @ysherriff - I am not aware of the exact limit on the number of Columns but main Rule is to keep only necessary columns and remove the unnecessary ones.

Next thing is Cardinality, higher the Cardinality more space the Column consume, try to reduce Cardinality as much as possible.

Also, can use a Tool named Vertipaq analyser to check what are the columns consuming most space. Then either remove them or create a Lookup/Dimension table and keep Id’s instead of Actual values.

Can check below Tutorial, it has some good information on Storage and Vertipaq engine.

Thanks
Ankit J

2 Likes

I’ll 2nd what @ankit has said, if you don’t need a column then don’t pull it into your model. Only use what you need.

1 Like

Thank you very much both if you.
I will review the site and tool.

1 Like