Changes in source data, new columns

Paul · January 14, 2018, 10:22am

Hi,
Regularly I come across the following problem. A model is based on Get Data from a folder, monthly files are added to the folder. After a while the customer decides to add a few columns to the source data, January and February were the same, then in March a new column has been added and the query breaks.

Today I solve this by adding a similarly named column in the first file used to create the model.

I am sure there must be a more sophisticated way of doing this.
Appreciate your comments.
Cheers,
Paul

sam.mckay · January 14, 2018, 7:33pm

Hi Paul,
Unfortunately there really is no way to solve this automatically. The data really needs to have the same ‘structure’ or the query will just break. It’s the same if any column name changes as well.

It is all due to ‘M’ code sitting in behind a query. The code list out every single column name, table name etc so it knows how to step through any transformation that occurs.

Have you dived into the query editor much? I recommend to stay away from just using ‘get data’, as you really want to dive into all the possibilities in the query editor. It also helps significantly in understanding how to setup you data to optimize it for Power BI and your data models.

Have you had a look through the advanced data transformation and modelling course? Certainly spend some time working through this if you can find the time, I run through virtually everything you need to know about setting up your models well.

GarryA · January 14, 2018, 11:42pm

There is a way to do this but it requires a higher level query editing using M code.

Have look at Colin Taylor’s method of dynamically managing header order and naming…

Good luck and I hope that helps out!

Cheers Garry

Paul · January 15, 2018, 8:28am

Hi,
Thanks. This looks like the solution.
I follow Ruth at Curbal, but must have overlooked this topic.
Cheers
Paul

Paul · January 15, 2018, 8:32am

Hi Sam,
Thanks.
I should have left out Get Data and just put from folder because I use the query editor just as in your course (I have now completed all of them)
In this case I have succeeded in adding columnheaders in the advanced query editor manually, but it all seemed rather laborious. The example from Paul seems a viable way of automating this.

GarryA · January 15, 2018, 8:37am

Great!..Grouping, Renaming and Ordering Dynamically are tasks which are very useful in reporting. There are only a few good resources on the internet around this.

Sam is going to hopefully do some more fun stuff on these topics!..

Paul · January 15, 2018, 8:42am

Let’s hope so, in fact this header stuff and added columns issue calls for a built in more automated function by Power BI, they might be working on this. I can only imagine many people to face similar problems.

GarryA · January 15, 2018, 8:55am

Yes Colin, mentions this in his video about that but I wouldn’t hold my breath…

I have got a working example where i can dynamically manage Column names and Column Name Orders in a table using a Power BI data model (no excel used)…It was difficult to learn at first and i am still working through all the logic…to replicate this with other accounting database connections.

sam.mckay · January 15, 2018, 8:56pm

This is actually an interesting solution and not something I’ve actually done before.

One thing I would just add here and is how I personally would approach things is that I don’t like to advise to placing bandaids over poor data entry with Power BI. Power BI does many things exceptionally well, but it just doesn’t fix bad data or poor data entry.

While in a lot of cases you can do it and create fixes, there are a few downsides of doing so.

The amount of possible entries could be significant so accounting for all variables is difficult
Sometimes building overly complex work-arounds can work against the success of your development as the continuity of your model for others becomes degraded
The more complex the fix-up the less people will understand it in the future or know how to audit it.

While this is very much all subjective, I personally would always first assess can an improvement to process take place (I know this is certainly not always that easy) then attempt to fix in Power BI after.

In saying that this is a common problem so maybe a nice pattern here to fix it will become more mainstream.

Chrs, Sam

GarryA · January 16, 2018, 12:16am

Hey Sam, its so true about poor data entry and poor data structuring too…I am sure you have seen a lot of badly formatted data and models…Sounds like a good idea for some new content…

How to deal with Poor data and Poor data structures.

In saying that, I also agree with you that its an interesting topic and common problem for the report author.

I see a fair bit of badly formatted data and in most cases when i build models I am reporting on consolidated data.

I normally need to group data (ie multiple general ledger accounts) , set column data types, filter & order data to make the PBI visuals more efficient, user friendly and have more impact in telling a data story.

Paul · March 22, 2018, 3:46pm

Hi,
I came across this video by Gil Raviv, explaining how to deal with mismatching column names,
https://www.youtube.com/watch?v=gKiQVY8uP4Y at 43:40

Paul

sam.mckay · March 22, 2018, 7:42pm

Nice, I’ll have to find the time to watch at some point.