Creating a custom function to extract data from folder based on extension

Hi! It’s been a while since I’ve been here :sweat_smile:

I’m very close to building a custom function with a formula that I’m using almost on a daily basis to extract data from .txt and .csv files stored in a folder. I’d like to save this function to use it across any excel file.

The missing link is referring to the query that is being selected in the parameter. Could somebody offer me the final piece of the puzzle (or improve my current idea)

Here’s my current code, and I’ve added a dummy file for testing. The only thing required is adding a .csv or .txt to the test folder of this dummy file:


> = (SelectContent as list) =>
> let
>     Source = "SELECTED QUERY from SelectContent parameter",
>     Extract = Table.AddColumn(Source, "Tables", each if [Extension] = ".csv" then Table.PromoteHeaders(Csv.Document([Content])) else if [Extension] = ".txt" then Table.PromoteHeaders(Csv.Document([Content],[Delimiter="#(tab)"])) else null)
> in
>     Extract

eDNA_ExtractContent custom function - Maikel.xlsx (15.9 KB)

Hi @Maikelshuvit,

Could you please elaborate on your thinking? You’ve specified one function parameter, SelectContent as a list. What kind of list will you be supplying? Will it contain file names, row numbers, or something else? If it is intended solely for selection purposes, then you also need a parameter to input a table.

To illustrate, this takes a table and a list with zero-based row numbers:

(t as table, SelectContent as list) as table =>
let
    r = Table.ToRows(t),
    Source = Table.FromRows( List.Transform( SelectContent, each r{_} ), Table.ColumnNames(t)),
    Extract = Table.AddColumn( Source, "Tables", each 
        if [Extension] = ".csv" 
        then Table.PromoteHeaders(Csv.Document([Content])) 
        else if [Extension] = ".txt" 
        then Table.PromoteHeaders(Csv.Document([Content],[Delimiter="#(tab)"])) 
        else null
    )
in
    Extract

A function invokation can look like this:
ExtractContent(Sourcefolder, {0,6})

To select rows 1 and 7 from the Sourcefolder and apply the Extract operation.
I hope this is helpful.

Hey @Melissa, thanks for checking in on this.

I haven’t really explored writing functions yet so setting the parameters is where I’m lacking I think. In my work I’m extracting a lot of data from .csv, .txt and .xlsx files that are stored in folders. I don’t really like the combination function that generates all of these helper queries, so I’ve been using the conditional formula instead as a custom column to extract table content.

Lazy that I am, I’d like to automate this extraction formula from folder files to not have to write the formula every time I need it (daily).

So basically, what I need to select with the function parameters is:

  1. Refer to an existing query with the selected list of files
  2. Select the [Content] column from this query to apply the conditional formula as described, based on the file extension.

The code you provided doesn’t do it yet but I’m certain you have some kind of trick up your sleeve to make this work :pray: