Excel BI Challenge Workout 119

EnterpriseDNA · April 21, 2023, 5:48am

— CAN YOU SOLVE THIS - EXCEL CHALLENGE 119—
(Solutions in any language are also welcome for Excel Challenges)

Remove the trailing and leading underscore characters. In between, underscore character will not be removed.
Hence if String =“E_xce__l__”, then answer would be “E_xce__l”

(Your formula need not be a single formula. You can write multiple formulas to arrive at a solution. Also your formula need not be different from others as long as you have worked out your formula independently)

Download Practice File - https://lnkd.in/dGRTkmM5

#excel, #excelchallenge, #powerquerychallenge, #daxchallenge, #m, #dax, #powerbi, #powerquery, #powerpivot, #sheets, #googlesheets, #data, #analytics, #businessintelligence, #finance, #office365

Excel BI’s LinkedIn Post

BrianJ · May 8, 2023, 4:53am

I’ve recently started doing much of my data cleaning in Power Query by calling an R script to apply a regex to the data using the stringr package within the tidyverse.

The following regex basically makes this a one-line solution:

^(?:+)|(?:+)$

Click for M Code

let
  Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("PYrBCkBAFEW/xWBnIVY+wEaRlc2kkzRFCT3zFv4eTdmdc8+11gwOdr0wY2YNcT2JX4I0eq7eSQiQJC1OZlC509fDqaCk39Qf/NMHHY4Tj7LzYgh5RTvJFX06Pg==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [String = _t]),

  Dupe = Table.DuplicateColumn(Source, "String", "Answer"),
  RunRegexR = R.Execute("library(stringr)#(lf)df <- dataset#(lf)df$Answer <- str_replace_all(df$Answer, ""^(?:_+)|(?:_+)$"", """")",[dataset=Dupe]),
  Result = RunRegexR{[Name="df"]}[Value]
in
  Result

QuadriAtharuOlayinka · May 10, 2023, 1:59pm

FORMULA

=DROP(REDUCE("",A2:A8,
LAMBDA(x,y,
LET(_t,CONCAT(TEXTSPLIT(y,"_")),
_f,LEFT(_t), _l,RIGHT(_t),
_s,FIND(_f,y),
_inst,SUM(N(MID(y,SEQUENCE(LEN(y)),1)=_l)),
_lp,FIND("|",SUBSTITUTE(y,_l,"|",_inst)),
_r,MID(y,_s,(_lp -_s)+1),
VSTACK(x,_r)))),1)

VARIABLE DESCRIPTION
_t: Result of splitting a cell value into an array of substrings using the “_” (underscore) character as a delimiter.
_s: Starting position of the first substring in the original cell value within the entire string.
_lp: Position of the last occurrence of the last substring in the original cell value within the entire string.
x: Accumulated dropped substrings as the formula iterates over each cell value in the range.
y: Current cell value being processed by the formula.
_l: Last substring in the array _t.
_f: First substring in the array _t.

TRIM- Quadri Atharu.xlsx (34.5 KB)

QuadriAtharuOlayinka · May 11, 2023, 4:02am

FORMULA 2

=MAP(A2:A8,LAMBDA(_str,
LET(_sp,MID(_str,SEQUENCE(LEN(_str)),1),
_nu,N(_sp<>"_"),
_fp,XMATCH(1,_nu),
_lp,XMATCH(1,_nu,,-1),
MID(_str,_fp,_lp-_fp+1))))

I thought to myself after the first solution, this should be way easier than I solved it. Then I realised , all that is needed is the first position of the non-underscores and last position of the non-underscores. Then a MID of that.
In the formula above,
_str - The whole array of strings,
_sp - Splitted strings to rows
_nu - Non Underscores Test, 1 rep non-underscores, and 0 represent underscores,
_fp- First Position of Non-Underscores
_lp- Last Position of Non-Underscores

TRIM- Quadri Atharu Solution 2.xlsx (34.9 KB)