Hi everyone,
Wow this has been amazing and so much fun! Thank you eDNA community!
You guys are turning this new initiative into a huge success.
In the YouTube Solution video I discussed the general approach I have taken in solving this problem. All specific details on my submission you can find in this post.
I hope you enjoy reading . Get ready… here I go.
What to do when you get stuck?
Start “Rubberducking” - verbally articulating your general strategy before diving into coding.
Search within the Documentation. M is a functional language and the MS Docs can be a big help in finding functions, examining their syntax and review examples by copy and pasting them into a new blank Query.
I’ve got it bookmarked and second to the Forum it’s the site I visit most.
.
However if you rather have direct access via your External Tools menu, see the post below.
.
Personally I’ve learned a lot by just Googling… one of the major parts in this challenge revolved around Pivoting Stacked data. Now there are a couple of ways to do this of course but what if you’ve never seen this ‘pattern’ before? Knowing (or guessing ) that this format is known as Stacked can help you with some research.
.
See how many hits you get when you ask Google for anything on: “stacked data in power query”
.
Review my full PQ solution
On to my solution.
I tent to design my queries using UI as much as I can and when the query kind of does what it needs to do then I’ll go into the Advanced Editor examine the code and see if I can modify/change it.
In my view, this data offered a lot of opportunities to experiment with some actual M coding. You can find all of my notes on that in the Countries (full UI) with Comments query.
And when I was done, this was the end result.
.
Starting from line 4 where I invoked a custom Text Cleaning function (for anyone interested in that a YouTube video will become available some time next week). This is the M code for the fxCleanText function.
(myText as text) as text =>
let
ToRemove = {""""} & Text.ToList(":[]}"),
CleanText = Text.Remove(myText, ToRemove),
TrimText = Text.TrimEnd(Text.Trim(CleanText), ","),
FixLink = Text.Replace(TrimText, "//", "://"),
ReplaceValue = Text.Replace(FixLink, "_", " ")
in
ReplaceValue
.
Which can then be Invoked to Transform both columns in a single step.
.
Filtering Rows
.
Creating segments to Pivot the data (aka identifying each separate block of stacked data)
Nested 2 functions here and only returned the 3 Columns mentioned in the square brackets.
And the Pivot
.
Renamed the Columns
Table.RenameColumns( PivotCol1,
let
nNames = List.ReplaceMatchingItems(
List.Transform(
Table.FirstN(UpdateIndex, 20)[Column1], Text.Proper),
{
{"Km2", "Area Sq Km"},
{"Mi2", "Area Sq Mi"},
{"Alpha 2", "ISO 2 Code"},
{"Alpha 3", "ISO 3 Code"},
{"Tld", "TLD"}
}
),
cNames = Table.ColumnNames(PivotCol1)
in
List.Zip( {cNames, nNames} )
)
.
Format and assign Type to the Columns
Table.TransformColumns(
RenameColumns,
{
{"Country", Text.Proper, type text},
{"Capital", Text.Proper, type text},
{"Currency", Text.Proper, type text},
{"Native Language", Text.Proper, type text},
{"Famous For", Text.Proper, type text},
{"Alcohol Prohibition", Text.Proper, type text},
{"Drive Direction", Text.Proper, type text},
{"Constitutional Form", Text.Proper, type text},
{"Continent", Text.Upper, type text},
{"ISO 2 Code", Text.Upper, type text},
{"ISO 3 Code", Text.Upper, type text},
{"Neighbors", Text.Upper, type text},
{"Area Sq Km", Number.From, type number},
{"Area Sq Mi", Number.From, type number},
{"Is Landlocked", Logical.From, type logical}
}
)
.
To do some more additional clean up on the “Famous for” column, created another function.
(myText as text) as text =>
let
ToList = Text.Split(myText, " "),
Replacements = List.ReplaceMatchingItems( ToList,
{ {"And", "and"}, {"The", "then"}, {"Of", "of"}, {"To", "to"}, {"In", "in"} }
),
CombineText = Text.Combine(List.Transform(Replacements, Text.From), " "),
NewText = Text.Replace(CombineText, "'S", "'s")
in
NewText
.
and finally Invoked that - All done (in Power Query )
.
Nobody seems to have had any issues displaying the Flag in the Table, I remember being amazed the first time I saw that little trick.
.
Here’s my file: Problem of the Week #2 – Dimension Table from Text Solution (Melissa).pbix (2.9 MB)
If you have any questions in regards to my solution please reach out and @mention me.
.
Things I noticed or want to call out after reviewing your submissions.
- For both @chrish and @logicalbi participating here was their first Forum contribution
- On the number of query steps… @BrianJ required the most 45 in total, @Rajesh 's second entry the least with just 15 steps (note I’m incl. let and in clause) followed by @chrish with 17 and @alexbadiu with 19 steps.
- I thought the change @chrish made to the Source step was pretty slick.
Both Hossein Satour and @amira.bedhiafi.pro also used a combination with Lines.FromBinary in the Source step - Can you guys comment on how/why you decided to do that?
Thanks! - Several methods were used for segmenting and setting up the Pivot step:
- if-then-else constructs like: if [Column1] = “Country” then [Index] else null
- Divisions like: Number.IntegerDivide([Index], 22) or [Index]/10 or each _ / 22
- Modulo like: Number.Mod([Index], 20)
- Not everyone chose to do a Pivot to get to the tabular format other methods used:
.
If you have more to share on your own entry I would enjoy reading AND appreciate that
As you know to encourage participation, we do a random drawing of five participants who will receive a cool Enterprise DNA vinyl laptop sticker. This round’s winners, in alphabetical order, are:
To claim your price please make sure to send your details to this email address: brian.julius@enterprisedna.co
That about sums it up for me… If you went exploring too and feel I missed any of the hidden gems, please feel free to leave your thoughts in a post below. Thanks!
Hope to see you soon in new POTW challenge!
All the best.
Melissa