I have payroll data files which I want to gather into BI. Originally, I was Combining files but additional columns appeared so Combine started giving me issues. Would Merge be a better fit since the files will not be the same. Probably 18 of 20 columns will be the same. The rows will be different with each file as well. Or, should I just manually create one file in Excel through copy / paste to create a file. I would rather automate thru BI Power Query. I want to be able to take the most recent payroll file each week and add to existing BI file.
I have attached a copy of some of the data in the files and highlighted in yellow some of the columns that get added at various times.
Hi,
There is not 1 recipe for this type of data pollution, ranging from bespoke scripts for one situation to conversion tables for another, but manageable.
May I recommend you get the book “Collect, Combine and Transform Data using PQ by Gil Raviv”
He describes, with examples that you can download, how to handle these anomalies.
Paul
I was watching these videos and thought if you would be able to go through some of these and other videos of this course, it can help you with your requirements.
Hi @RBDport, we’ve noticed that no response has been received from you since the 21st of March. We just want to check if you still need further help with this post? In case there won’t be any activity on it in the next few days, we’ll be tagging this post as Solved. If you have a follow question or concern related to this topic, please remove the Solution tag first by clicking the three dots beside Reply and then untick the checkbox. Thanks!
Thanks for the help. I will look at those videos today. This weekend I was reading the book that Gil Raviv wrote (Paul’s suggestion) and that holds some promise.
I will look at Append again. I thought the columns and layout had to be the same for both files. I never know which columns may be added and new columns may be added anywhere in the file. I will test this with some other ideas today.
Trying the APPEND and this may work. Not a problem if columns are added but if some not there I have to adjust the Power Query. No big deal.
Having trouble setting up in Power Query a Payroll History File (after APPEND two files) and want to add a third file say the next Payroll Input File to APPEND to History File automatically. It keeps removing the prior Input File data. Any ideas so I can use Input File that appends to a file to create History.
Payroll History File - Jan and Feb appended
Payroll Input File - with March data, and set up to append to History File it works okay and March is added w Jan and Feb
Payroll Input File - with April data (remove March data in file), and PQ is still set up to append Input File. The March data is removed and replaced with April so History File is incorrect.
Have you tried to append one query at a time and doing the same steps for all three (or more of the tables) again instead of appending all in one step?
Yes, I could do that but then I have to go into PQ every month to add each APPEND. I would prefer to have PQ automatically do it based on adding the most recent file to all the others like COMBINE would.
I was reading the Collect, Combine, and Transform ….book by Gil Raviv suggested by Paul and noted on pg 266 on pitfalls about the MissingField.UseNull argument within Table.RemoveColumns . It helps when my columns are missing but needed and will add nulls. This will help for one issue and I will just use the Combine (as before) instead of Merge or Append. Only issue still is when new columns are added but that will not happen very often and I should know about it. I will just have to add the field manually to Advanced Editor.
In the book there is a chapter, 4-7 that describes how to build a conversion table. This might be the best solution.
Also in cases like this, I create a separate query to show me only the header names. This way I can easily find out if someone plays a trick with me again.
(Table.ColumnNames(table)
Paul
I read about the conversion table with names as well, but the column names isn’t really my problem. All my files will have the same column names except for the one added at any time. I will try to see if helps with new columns added.
I saw in the book too about creating a query to check the names like you mention. No one else will be in the files I use but me. But, it was interesting what you can do with PQ and M language when you know more.
@RBDport - do you ever NEED the new columns that are added?
If not, have you considered selecting all the columns you do need (which I assume will always have the same names) and then deleting all other columns? You should be able to do this even with a single report that doesn’t have any of the additional columns.
If that step is built into your query, you can append the tables without needing to constantly check for new columns.
Yes, I may need the new columns as they are either new employees in new states that were added or payroll related deductions/earnings that had not been used previously.
I have the query set up now as you suggest to remove unneeded columns so I will have to add any new columns, if needed, to the group.
A simple ‘Combine Transform’ will do it. See the Screenshot attached.
The only caveat is that the Template file (used in the Parameter) must have all column headings that will ever be required. If you ever have a need for more column headings, just change the Parameter Template file to reflect all column headings required.