I have an old payroll system. I do not have access to the data files themselves but only to reports. Attached is a pdf report out of the sample company. When I Import into Power Query using Get From PDF I get several tables per page. Each Employee starts on a new page but can span a page. Each time the report is run the Page may have a different number of tables I on it eg. this time Employee 1002 doesn’t have Table A and next time they do – or the other way around.
BTW: Each user will store their version of the pdf in a different location with a different name.
I want the data in
- Personal Section (present on every page)
- Job Section (present on every page)
- A particular Table (call it Table A). Only some pages have this table. The table I want is the one like the last on Page 1 – the first heading is “Payment or Deduction” the second “Quantity” etc.
I want to
- Combine all the Personal Section for each Employee into one table with the employee as a column.
- The same for the Job Section
- The same for any instance of TABLE A in the File
My issue is how to identify which Tables to import from each page when they do not have meaningful names.
How would I go about getting the data I need as above?