Import from a Read folder an HTML file that has a .xls extension

Here is my problem.
I am downloading a file from my bank on a daily basis.

The bank file downloads with a .xls extension, but it is really an HTML file.

I want to drop that file in a read folder and have PQ read the file.

This is what my screen looks like…

But if I click on the double down arrows, I get the message…
image

If I use the import from a web page feature, I can see the table

But the problem with this is it looks like it is hard coded to the filename. I want it to read the file in the folder.

I think in essence I don’t know how to read an HTML file from a folder.

Hi @BillK,

Don’t have any data to play with but since you do have a way to access the data in the file, you could try something like this.

let
    Source = Folder.Files("M:\Finance Share\Lockbox Processing\Read"),
    AddedCustom = Table.AddColumn(Source, "Custom", each Web.Page(File.Contents([Folder Path] & "\" &[Name])))
in
    AddedCustom 

I hope this is helpful otherwise please share a dummy file…

1 Like

Melissa,

Thanks so much. That led me to the solution!

I did need to go through a bunch of interim steps to get to the end.

Here is what got me to the end…
let
Source = FileList,

AddedCustom = Table.AddColumn(Source, "Custom", each Web.Page(File.Contents([Folder Path] & "\" &[Name]))),
#"Filtered Rows" = Table.SelectRows(AddedCustom, each ([Extension] = ".xls")),
#"Expanded Attributes" = Table.ExpandRecordColumn(#"Filtered Rows", "Attributes", {"Content Type", "Kind", "Size", "ReadOnly", "Hidden", "System", "Directory", "Archive", "Device", "Normal", "Temporary", "SparseFile", "ReparsePoint", "Compressed", "Offline", "NotContentIndexed", "Encrypted"}, {"Attributes.Content Type", "Attributes.Kind", "Attributes.Size", "Attributes.ReadOnly", "Attributes.Hidden", "Attributes.System", "Attributes.Directory", "Attributes.Archive", "Attributes.Device", "Attributes.Normal", "Attributes.Temporary", "Attributes.SparseFile", "Attributes.ReparsePoint", "Attributes.Compressed", "Attributes.Offline", "Attributes.NotContentIndexed", "Attributes.Encrypted"}),
#"Removed Other Columns" = Table.SelectColumns(#"Expanded Attributes",{"Custom"}),
#"Expanded Custom" = Table.ExpandTableColumn(#"Removed Other Columns", "Custom", {"Caption", "Source", "ClassName", "Id", "Data"}, {"Custom.Caption", "Custom.Source", "Custom.ClassName", "Custom.Id", "Custom.Data"}),
#"Removed Other Columns1" = Table.SelectColumns(#"Expanded Custom",{"Custom.Data"}),
#"Kept First Rows" = Table.FirstN(#"Removed Other Columns1",1),
#"Expanded Custom.Data" = Table.ExpandTableColumn(#"Kept First Rows", "Custom.Data", {"Column1", "Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Column8", "Column9", "Column10", "Column11", "Column12", "Column13", "Column14", "Column15", "Column16", "Column17", "Column18", "Column19", "Column20", "Column21", "Column22", "Column23", "Column24", "Column25", "Column26", "Column27", "Column28", "Column29", "Column30"}, {"Custom.Data.Column1", "Custom.Data.Column2", "Custom.Data.Column3", "Custom.Data.Column4", "Custom.Data.Column5", "Custom.Data.Column6", "Custom.Data.Column7", "Custom.Data.Column8", "Custom.Data.Column9", "Custom.Data.Column10", "Custom.Data.Column11", "Custom.Data.Column12", "Custom.Data.Column13", "Custom.Data.Column14", "Custom.Data.Column15", "Custom.Data.Column16", "Custom.Data.Column17", "Custom.Data.Column18", "Custom.Data.Column19", "Custom.Data.Column20", "Custom.Data.Column21", "Custom.Data.Column22", "Custom.Data.Column23", "Custom.Data.Column24", "Custom.Data.Column25", "Custom.Data.Column26", "Custom.Data.Column27", "Custom.Data.Column28", "Custom.Data.Column29", "Custom.Data.Column30"}),
#"Promoted Headers" = Table.PromoteHeaders(#"Expanded Custom.Data", [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"SITE", type text}, {"WORKSOURCE", type number}, {"PROCESSING DATE", type date}, {"BATCH NUMBER", Int64.Type}, {"TRANSACTION NO", Int64.Type}, {"SEQUENCE NO", Int64.Type}, {"PAID AMOUNT", Currency.Type}, {"CHECK NO", Int64.Type}, {"RT NO", Int64.Type}, {"CHECK ACCOUNT NO", type number}, {"STUB ACCOUNT NO", Int64.Type}, {"REMITTERNAME", type text}, {"SUBSEQNO", Int64.Type}, {"IMAGESEQ1", Int64.Type}, {"IMAGESEQ2", Int64.Type}, {"ITEMTYPE", type text}, {"FORMATNO", Int64.Type}, {"BATCHMODE", type text}, {"FOREIGNITEM", type text}, {"SORTLABEL", Int64.Type}, {"SORTTYPEID", Int64.Type}, {"BATCHREJECTID", Int64.Type}, {"CUSTOMBATCHNO", Int64.Type}, {"SORTTYPEDESC", type any}, {"USERFLAG16", type any}, {"EXPRESSMAILFLAG", type text}, {"BATCHREJECTREASON", type any}, {"USERSTRING101", type any}, {"CHECKDATE", Int64.Type}, {"PAYMENTCHANNEL", type text}})

in
#“Changed Type”

If you have a suggestion on how to skip all the interim steps (from #“Filtered Rows” … to #“Expanded Custom.Data”) please let me know.

Thank you again so much for your help.

Bill

Melissa,

Actually, this was a little closer to what I was trying to do. Still had to expand and remove columns a couple of times, but I got there.

let
Source = FileList,
#“Filtered Rows1” = Table.SelectRows(Source, each ([Extension] = “.xls”)),
#“Added Custom” = Table.AddColumn(#“Filtered Rows1”, “Custom”, each Web.Page([Content])),
#“Removed Other Columns2” = Table.SelectColumns(#“Added Custom”,{“Custom”}),
#“Expanded Custom1” = Table.ExpandTableColumn(#“Removed Other Columns2”, “Custom”, {“Caption”, “Source”, “ClassName”, “Id”, “Data”}, {“Caption”, “Source”, “ClassName”, “Id”, “Data”}),
#“Filtered Rows2” = Table.SelectRows(#“Expanded Custom1”, each ([Source] = “Table”)),
#“Removed Other Columns3” = Table.SelectColumns(#“Filtered Rows2”,{“Data”}),
#“Expanded Data” = Table.ExpandTableColumn(#“Removed Other Columns3”, “Data”, {“Column1”, “Column2”, “Column3”, “Column4”, “Column5”, “Column6”, “Column7”, “Column8”, “Column9”, “Column10”, “Column11”, “Column12”, “Column13”, “Column14”, “Column15”, “Column16”, “Column17”, “Column18”, “Column19”, “Column20”, “Column21”, “Column22”, “Column23”, “Column24”, “Column25”, “Column26”, “Column27”, “Column28”, “Column29”, “Column30”}, {“Column1”, “Column2”, “Column3”, “Column4”, “Column5”, “Column6”, “Column7”, “Column8”, “Column9”, “Column10”, “Column11”, “Column12”, “Column13”, “Column14”, “Column15”, “Column16”, “Column17”, “Column18”, “Column19”, “Column20”, “Column21”, “Column22”, “Column23”, “Column24”, “Column25”, “Column26”, “Column27”, “Column28”, “Column29”, “Column30”}),
#“Promoted Headers1” = Table.PromoteHeaders(#“Expanded Data”, [PromoteAllScalars=true]),
#“Changed Type” = Table.TransformColumnTypes(#“Promoted Headers1”,{{“SITE”, type text}, {“WORKSOURCE”, type number}, {“PROCESSING DATE”, type date}, {“BATCH NUMBER”, Int64.Type}, {“TRANSACTION NO”, Int64.Type}, {“SEQUENCE NO”, Int64.Type}, {“PAID AMOUNT”, Currency.Type}, {“CHECK NO”, Int64.Type}, {“RT NO”, Int64.Type}, {“CHECK ACCOUNT NO”, type number}, {“STUB ACCOUNT NO”, Int64.Type}, {“REMITTERNAME”, type text}, {“SUBSEQNO”, Int64.Type}, {“IMAGESEQ1”, Int64.Type}, {“IMAGESEQ2”, Int64.Type}, {“ITEMTYPE”, type text}, {“FORMATNO”, Int64.Type}, {“BATCHMODE”, type text}, {“FOREIGNITEM”, type text}, {“SORTLABEL”, Int64.Type}, {“SORTTYPEID”, Int64.Type}, {“BATCHREJECTID”, Int64.Type}, {“CUSTOMBATCHNO”, Int64.Type}, {“SORTTYPEDESC”, type any}, {“USERFLAG16”, type any}, {“EXPRESSMAILFLAG”, type text}, {“BATCHREJECTREASON”, type any}, {“USERSTRING101”, type any}, {“CHECKDATE”, Int64.Type}, {“PAYMENTCHANNEL”, type text}})
in
#“Changed Type”

Hi @BillK,

Can you test this query?
It’s a bit harder to code without any actual data because I can’t see the results of the steps but I think this will work - fingers crossed.

let
    Source = FileList,
    #"Filtered Rows1" = Table.SelectRows(Source, each ([Extension] = ".xls")),
    #"Added Custom" = Table.AddColumn(#"Filtered Rows1", "Custom", each Table.SelectRows(Web.Page([Content])), each ([Source] = "Table"))[[Custom]],
    #"Expanded Data" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Column1", "Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Column8", "Column9", "Column10", "Column11", "Column12", "Column13", "Column14", "Column15", "Column16", "Column17", "Column18", "Column19", "Column20", "Column21", "Column22", "Column23", "Column24", "Column25", "Column26", "Column27", "Column28", "Column29", "Column30"}, {"Column1", "Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Column8", "Column9", "Column10", "Column11", "Column12", "Column13", "Column14", "Column15", "Column16", "Column17", "Column18", "Column19", "Column20", "Column21", "Column22", "Column23", "Column24", "Column25", "Column26", "Column27", "Column28", "Column29", "Column30"}),
    #"Promoted Headers1" = Table.PromoteHeaders(#"Expanded Data", [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers1",{{"SITE", type text}, {"WORKSOURCE", type number}, {"PROCESSING DATE", type date}, {"BATCH NUMBER", Int64.Type}, {"TRANSACTION NO", Int64.Type}, {"SEQUENCE NO", Int64.Type}, {"PAID AMOUNT", Currency.Type}, {"CHECK NO", Int64.Type}, {"RT NO", Int64.Type}, {"CHECK ACCOUNT NO", type number}, {"STUB ACCOUNT NO", Int64.Type}, {"REMITTERNAME", type text}, {"SUBSEQNO", Int64.Type}, {"IMAGESEQ1", Int64.Type}, {"IMAGESEQ2", Int64.Type}, {"ITEMTYPE", type text}, {"FORMATNO", Int64.Type}, {"BATCHMODE", type text}, {"FOREIGNITEM", type text}, {"SORTLABEL", Int64.Type}, {"SORTTYPEID", Int64.Type}, {"BATCHREJECTID", Int64.Type}, {"CUSTOMBATCHNO", Int64.Type}, {"SORTTYPEDESC", type any}, {"USERFLAG16", type any}, {"EXPRESSMAILFLAG", type text}, {"BATCHREJECTREASON", type any}, {"USERSTRING101", type any}, {"CHECKDATE", Int64.Type}, {"PAYMENTCHANNEL", type text}})
in
    #"Changed Type"

Test CCB.Bill.82568_100220_1162987 - Copy.xls (9.8 KB) Test Import HTML file improvement.xlsx (27.3 KB)

Your adjustment didn’t seem to work.
I got this message at the #“Added Custom” step
Expression.Error: We cannot convert a value of type Function to type Type.
Details:
Value=Function
Type=Type

I tried to create a test file to work with, but my query spins forever :frowning_face: (in the real file it works)

Hi @BillK,

The format in the provided sample file is different, there is no [Source] = “Table” for example…
However should I assume that is the correct record containing the table with data?

Yes, the table with data should be good. I’m sorry I the test file didn’t work properly.

Okay made some minor adjustments but again can’t test the full query.
See if this works for you.

let
    Source = FileList,
    FilterExtension = Table.SelectRows(Source, each ([Extension] = ".xls")),
    KeepCustom = Table.AddColumn(FilterExtension, "Custom", each Web.Page([Content]))[[Custom]],
    FilterTable = Table.AddColumn(KeepCustom, "Filter", each Table.SelectRows([Custom], each [Source] = "Table" ))[[Filter]],
    ExpandFilter = Table.ExpandTableColumn(FilterTable, "Filter", {"Data"}, {"Data"})[[Data]],
    ExpandData = Table.ExpandTableColumn(ExpandFilter, "Data", {"Column1", "Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Column8", "Column9", "Column10", "Column11", "Column12", "Column13", "Column14", "Column15", "Column16", "Column17", "Column18", "Column19", "Column20", "Column21", "Column22", "Column23", "Column24", "Column25", "Column26", "Column27", "Column28", "Column29", "Column30"}, {"Column1", "Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Column8", "Column9", "Column10", "Column11", "Column12", "Column13", "Column14", "Column15", "Column16", "Column17", "Column18", "Column19", "Column20", "Column21", "Column22", "Column23", "Column24", "Column25", "Column26", "Column27", "Column28", "Column29", "Column30"}),
    PromoteHeaders = Table.PromoteHeaders(ExpandData, [PromoteAllScalars=true]),
    ChangeType = Table.TransformColumnTypes(PromoteHeaders,{{"SITE", type text}, {"WORKSOURCE", type number}, {"PROCESSING DATE", type date}, {"BATCH NUMBER", Int64.Type}, {"TRANSACTION NO", Int64.Type}, {"SEQUENCE NO", Int64.Type}, {"PAID AMOUNT", Currency.Type}, {"CHECK NO", Int64.Type}, {"RT NO", Int64.Type}, {"CHECK ACCOUNT NO", type number}, {"STUB ACCOUNT NO", Int64.Type}, {"REMITTERNAME", type text}, {"SUBSEQNO", Int64.Type}, {"IMAGESEQ1", Int64.Type}, {"IMAGESEQ2", Int64.Type}, {"ITEMTYPE", type text}, {"FORMATNO", Int64.Type}, {"BATCHMODE", type text}, {"FOREIGNITEM", type text}, {"SORTLABEL", Int64.Type}, {"SORTTYPEID", Int64.Type}, {"BATCHREJECTID", Int64.Type}, {"CUSTOMBATCHNO", Int64.Type}, {"SORTTYPEDESC", type any}, {"USERFLAG16", type any}, {"EXPRESSMAILFLAG", type text}, {"BATCHREJECTREASON", type any}, {"USERSTRING101", type any}, {"CHECKDATE", Int64.Type}, {"PAYMENTCHANNEL", type text}})
in
    ChangeType

Hi @BillK, we’ve noticed that no response has been received from you since the 6th of October. We just want to check if you still need further help with this post? In case there won’t be any activity on it in the next few days, we’ll be tagging this post as Solved. If you have a follow question or concern related to this topic, please remove the Solution tag first by clicking the three dots beside Reply and then untick the checkbox. Thanks!

Melissa,

This was perfect! You are the best!

Bill