Problem of the Week #2 (PQ) Create a Dimension Table from a Text file

Hi everyone,

Wow this has been amazing and so much fun! Thank you eDNA community!
You guys are turning this new initiative into a huge success.

In the YouTube Solution video I discussed the general approach I have taken in solving this problem. All specific details on my submission you can find in this post.

I hope you enjoy reading :wink:. Get ready… here I go.

What to do when you get stuck?
Start “Rubberducking” - verbally articulating your general strategy before diving into coding.

Search within the Documentation. M is a functional language and the MS Docs can be a big help in finding functions, examining their syntax and review examples by copy and pasting them into a new blank Query.
I’ve got it bookmarked and second to the Forum it’s the site I visit most.


.

However if you rather have direct access via your External Tools menu, see the post below.

.
Personally I’ve learned a lot by just Googling… one of the major parts in this challenge revolved around Pivoting Stacked data. Now there are a couple of ways to do this of course but what if you’ve never seen this ‘pattern’ before? Knowing (or guessing :wink:) that this format is known as Stacked can help you with some research.
.
See how many hits you get when you ask Google for anything on: “stacked data in power query”


.

Review my full PQ solution

On to my solution.
I tent to design my queries using UI as much as I can and when the query kind of does what it needs to do then I’ll go into the Advanced Editor examine the code and see if I can modify/change it.

In my view, this data offered a lot of opportunities to experiment with some actual M coding. You can find all of my notes on that in the Countries (full UI) with Comments query.

And when I was done, this was the end result.


.

Starting from line 4 where I invoked a custom Text Cleaning function (for anyone interested in that a YouTube video will become available some time next week). This is the M code for the fxCleanText function.

(myText as text) as text =>
  let
    ToRemove     = {""""} & Text.ToList(":[]}"),
    CleanText    = Text.Remove(myText, ToRemove),
    TrimText     = Text.TrimEnd(Text.Trim(CleanText), ","),
    FixLink      = Text.Replace(TrimText, "//", "://"),
    ReplaceValue = Text.Replace(FixLink, "_", " ")
  in
    ReplaceValue

.
Which can then be Invoked to Transform both columns in a single step.
image
.

Filtering Rows


.

Creating segments to Pivot the data (aka identifying each separate block of stacked data)
Nested 2 functions here and only returned the 3 Columns mentioned in the square brackets.
image

And the Pivot


.

Renamed the Columns

Table.RenameColumns( PivotCol1,
  let
    nNames = List.ReplaceMatchingItems(
      List.Transform( 
        Table.FirstN(UpdateIndex, 20)[Column1], Text.Proper),
        {
          {"Km2", "Area Sq Km"},
          {"Mi2", "Area Sq Mi"},
          {"Alpha 2", "ISO 2 Code"},
          {"Alpha 3", "ISO 3 Code"},
          {"Tld", "TLD"}
        }
      ),
    cNames = Table.ColumnNames(PivotCol1)
  in
    List.Zip( {cNames, nNames} )
)

.

Format and assign Type to the Columns

Table.TransformColumns(
  RenameColumns,
  {
    {"Country", Text.Proper, type text},
    {"Capital", Text.Proper, type text},
    {"Currency", Text.Proper, type text},
    {"Native Language", Text.Proper, type text},
    {"Famous For", Text.Proper, type text},
    {"Alcohol Prohibition", Text.Proper, type text},
    {"Drive Direction", Text.Proper, type text},
    {"Constitutional Form", Text.Proper, type text},
    {"Continent", Text.Upper, type text},
    {"ISO 2 Code", Text.Upper, type text},
    {"ISO 3 Code", Text.Upper, type text},
    {"Neighbors", Text.Upper, type text},
    {"Area Sq Km", Number.From, type number},
    {"Area Sq Mi", Number.From, type number},
    {"Is Landlocked", Logical.From, type logical}
  }
)

.

To do some more additional clean up on the “Famous for” column, created another function.

(myText as text) as text =>
  let
    ToList = Text.Split(myText, " "),
    Replacements = List.ReplaceMatchingItems( ToList,
        { {"And", "and"}, {"The", "then"}, {"Of", "of"}, {"To", "to"}, {"In", "in"} }
      ),
    CombineText = Text.Combine(List.Transform(Replacements, Text.From), " "),
    NewText = Text.Replace(CombineText, "'S", "'s")
  in
    NewText

.
and finally Invoked that - All done (in Power Query :wink:)


.

Nobody seems to have had any issues displaying the Flag in the Table, I remember being amazed the first time I saw that little trick. :smiley:

.
Here’s my file: Problem of the Week #2 – Dimension Table from Text Solution (Melissa).pbix (2.9 MB)

If you have any questions in regards to my solution please reach out and @mention me.
.

Things I noticed or want to call out after reviewing your submissions.
  • For both @chrish and @logicalbi participating here was their first Forum contribution :tada:
  • On the number of query steps… @BrianJ required the most 45 in total, @Rajesh 's second entry the least with just 15 steps (note I’m incl. let and in clause) followed by @chrish with 17 and @alexbadiu with 19 steps.
  • I thought the change @chrish made to the Source step was pretty slick. :+1:
    image
    Both Hossein Satour and @amira.bedhiafi.pro also used a combination with Lines.FromBinary in the Source step - Can you guys comment on how/why you decided to do that?
    Thanks!
  • Several methods were used for segmenting and setting up the Pivot step:
  1. if-then-else constructs like: if [Column1] = “Country” then [Index] else null
  2. Divisions like: Number.IntegerDivide([Index], 22) or [Index]/10 or each _ / 22
  3. Modulo like: Number.Mod([Index], 20)
  • Not everyone chose to do a Pivot to get to the tabular format other methods used:
  1. Nested table Transpose by @MK3010 and Hossein Satour
  2. #table construct by @Rajesh
  • @Rajesh created an epic Text Cleaning function :clap: :clap: :clap:
    image
  • and last but not least
    @MK3010 did an excellent write up so I recommend reviewing that, if you haven’t already.

.
If you have more to share on your own entry I would enjoy reading AND appreciate that :+1:

As you know to encourage participation, we do a random drawing of five participants who will receive a cool Enterprise DNA vinyl laptop sticker. This round’s winners, in alphabetical order, are:

  1. @BrianJ
  2. @chrish
  3. @MK3010
  4. @Rens
  5. @Sue

To claim your price please make sure to send your details to this email address: brian.julius@enterprisedna.co

That about sums it up for me… If you went exploring too and feel I missed any of the hidden gems, please feel free to leave your thoughts in a post below. Thanks!

Hope to see you soon in new POTW challenge!

All the best.
Melissa

6 Likes