From everything I read in the forum it is recommended for fact tables to be long rather than wide.
I have a power bi report that is showing telemetry data.
There are over 1000 sensors and data is aggregated every 15 minutes.
For every sensor – up to 4 values are calculated (based on area type). – utilisation, average, minimum and maximum (some values will be empty because of the area type)
Due to the large number of records created daily (1000244) = 96,000, am I better off keeping the table wide or long?
If I go long, some values will be empty and I could reduce the number of rows created but will still end up creating about 250,000 records daily with a narrow long table.
Using aggregation is not a solution at this stage due to customer requirements.
@Dana I don’t think you can transform columns into rows in this case, a single column with different aggregation doesn’t make sense to me for any further calculation.
if you are concerned with the number of rows that are being created, there are a few things to ask yourself.
You mention that data is aggregated every 15 minutes, is this by design, or something that is simply happening in your database? In other words, can you aggregate the data any further?
How much history are you bringing into the report, and do you need all of it?
If data refresh is a concern, have you considered using incremental refresh?
Hi @Dana, we aim to consistently improve the topics being posted on the forum to help you in getting a strong solution faster. While waiting for a response, here are some tips so you can get the most out of the forum and other Enterprise DNA resources.
Use the forum search to discover if your query has been asked before by another member.
When posting a topic with formula make sure that it is correctly formatted to preformatted text </>.
Use the proper category that best describes your topic
Provide as much context to a question as possible.
Include the masked demo pbix file, images of the entire scenario you are dealing with, screenshot of the data model, details of how you want to visualize a result, and any other supporting links and details.
@AntrikshSharma, I guess I was not clear in my previous description.
What I meant in regards to rows versus columns - is to have one column with the type of the attribute (utilisation, max, min and average) and a second column with the value.
This can be achieved in power query
However, in my case as we are still in the design phase - the data in the database can be stored either way (long or wide) - which is the reason for my question.
@Heather, thank you for taking the time to respond to my question.
The data is aggregated every 15 minutes as per customer requirements, and it can’t be aggregated any further.
Another requirements is to have up to 3 years of data available. Clearly, some of the older data will have to be stored in a data lake, but this requirement has not been detailed at this stage.
I have implemented incremental refresh - however that does not change the final size of the table.
@Dana Yes, I understood what you meant, answer is still the same, you would end up creating more complex DAX for trading off some additional model size, to be honest what you should focus first on is identifying how many unique values each column will have.
If you have Avg, Max, Min in a same column, think of what kind of DAX you would have to write, even your first base measure would be complex.