My approach to these challenges balances four different perspectives:
- Consultant - I try to role play these from the vantage point of a consultant, focused on answering the questions at hand, while also trying to find ways to provide some extra value to the client. I also pretend not to know info that would be unknown to the consultant (e.g., that the data for these are generated randomly)
- Client - having spent years as a manager overseeing teams of analysts, I have strong opinions about how I personally like to see data presented. This is why most of my entries tend to focus on one primary question per page and don’t have a ton of interactive features.
- Forum Contributor - I try to employ techniques that I think others on the forum might find interesting/instructive. For this submission, those techniques included: R script analyses and visuals, direct page navigation with buttons/toggles, calculation of distances from lat-long coordinates (Haversine formula) in Power Query, composite ranking approach, web scraping, and use of custom visuals (Flow Map, Smartfilter Pro, and Cards with States)
- Student - per the “Deliberate Practice” model, I try to stretch myself each challenge to learn some new techniques. For this one, that focus was spatial data techniques in PBI, not having really explored that aspect of the program prior outside of some basic mapping.
I don’t enjoy the graphical design aspect of this work and am not very good at it, so I typically borrow liberally from people who are (thankfully, including many on this forum) and just to stick to the maxim of “don’t be ugly” so the aesthetics don’t get in the way of the analysis. For this one, I used the Acterys Planning Theme from the Microsoft PBI Themes Gallery
DATA RELIABILITY AND VALIDITY:
In the real-world, before diving into any analysis, I always ask the question “are these data any good?” (i.e., do they pass basic threshold tests for reliability and validity, and based on the outcome of that analysis, to what extent can/should they be used for management decision-making?). While not comprehensive, I tried to provide some examples of the type of testing I would do in such an analysis, and chose some different approaches than I’ve showcased in previous challenges. Aesthetically, I really don’t like the way this page turned out (suggestions for improvement welcomed), plus I couldn’t find room for this important component, which helps clarify why the uniform distribution of minutes elapsed makes no sense:
COMPOSITE PERFORMANCE METRIC:
In thinking about a way to address the client’s request for insight into store-warehouse pairs that are/aren’t working well, while also providing sliceable data on the individual performance metrics, I thought back to a composite ranking approach that I worked on this Spring with @Mark. I thought that would work well here too, since it allowed me to collapse five separate performance metrics into a single score, and then allow the client to filter the data by warehouse and/or store (or groups thereof). To create the composite score, I summed the ranks of each pair (1-500) across each metric (5). In this summation, low score would represent better performance, which I thought counterintuitive, so I subtracted the sum from 2500 (5 metrics * 500 pairs/metric) so that higher score would represent better performance. I set the filter conditions to ALL, so that even when sliced, each store-WH pair would maintain a score calculated on the 2500 base, rather than on the filtered subset (this has no effect on the resulting order within the filtered subset, but allows the viewer to see even when filtered, where the pair would rank overall). I used SmartFilter Pro for my slicers, which I think provides both great additional functionality and aesthetics, and I use it now on all my reports.
Visually, I paired the table version of this analysis with a bar chart version, using these button templates, direct page navigation and synched slicers. I like the visual effect of the viewer being able to toggle back and forth, rather than trying to cram both visuals on the same page.
ROUTE OPTIMIZATION AND COST SAVINGS:
This section was what I did primarily for my own learning, but I felt it fit within the challenge scope in terms of “help management visualize what is happening so that we can decide what the next stage of the transformation is”. Since no locations were given for the warehouses, I made what I thought was a reasonable assumption and located them at the lat-long of the geographic center of the state, web-scraping that data from inkplant.com. I then downloaded a large city database from simplemaps.com including lat-long. The advantage of this dataset is that it also included population, which allowed me to address the problem of multiple cities/towns with the same name, by grouping the city data in Power Query by name and then filtering on the name with the max population. I think this worked well for all but one ambiguous store location (“Aurora”, which could be IL or CO) where the decision rule selected the latter but based on other store locations I think the former is correct (unlikely to have two stores in the Denver area and none in Chicago area), but I treated this data as valid, since it would be instantly verifiable with the client.
The next step was to calculate the distance between each store-warehouse pair in a dimension table I called Distance. The ideal way to do this would be to calculate the actual driving distance between the two points using a Google or Bing Maps API call. However, that proved too involved for the time I had available, so I used the Haversine formula to calculate the “as the crow flies” distance between two sets of lat-long coordinates. @Paul is our resident geospatial data guru, and his posts were essential in figuring out how to do this in PQ.
Once I had distance calculated for each pair, I grouped by store name in PQ and calculated the closest warehouse to each store. This calculation became the basis for my “Route Optimization” mapping, using the MS Flow Map custom visual. It’s a simplified analysis, since it assumes that the closest WH can supply all of the needs of the stores with which they are associated. If we’d been given data on products provided by each WH, we could have relaxed this assumption and done a full-scale linear optimization, which would have been cool. I do like the way the visualization for this component turned out.
For the final page, I just used the basic PBI map, with a measure to change the size of the dot based on whether the facility was a store or a WH. The most interesting technique on this page I think is the use of CONCATENATEX with UNICHAR( 10 ) to create the table on the right. This is one of my favorite DAX tricks and I was going to do a video on it until Alberto Ferrari beat me to it earlier today.
Anyway, that’s probably waaaaayyyyy more than enough. As always, huge thanks to Haroon and the Enterprise DNA team for the fantastic job they do with these challenges, and thanks to everyone who participates - I think the exchange of ideas that occurs in the course of these challenges is phenomenal, and I would encourage anyone who hasn’t submitted an entry yet to give it a try soon. It’s a ton of fun, and I guarantee at the end of the process your skills will have improved regardless of what level you started at.
P.S. Here’s my PBIX file:
(Note: I had to pull this down temporarily because it inadvertently revealed some confidential licensing info. Working on masking that, and will repost as soon as I figure out how to do that. Huge thanks to @Heather for letting me know. I learn new stuff from these challenges all the time… )
OK, it’s back up. I contacted OKViz (the SQLBI folks) and they told me that even if the license numbers for their custom visuals are shared, no one else can use them. Good info to know if you’re considering using them.
eDNA Data Challenge #4 – Brian Julius v2a.pbix (3.7 MB)
You made a brilliant report (as always)! I was particullary impressed by the “Route Optimization” map. It is dynamic, clear and very visual. I would really like to learn more in detail about the steps needed to arrive to such a result.
It would be really awesome if you could do a video of how you prepared the data before visualizing inside MS Flow Map custom visual.
Or about techniques using R… In many reports in the professionnal world we have to think about data quality, correlations etc…
I am motivated to learn a lot of the techniques you use ! They are inspiring! Thank you for sharing with us all this knowledge!
As a newbie to Power BI Desktop and still in the early days of learning through this fantastic channel and forum.
I am really impressed with your methodology and explanation detail how you set this up from the ground up, a real inspiration for myself.
My job is in logistics and basic analysis around transport, delivery times etc. so this challenge was perfect for my learning although I am not brave enough yet to submit a model yet to match anything of this calibre, maybe soon though
Thank you for generously providing the PBIX file, I’m sure lots of folks will learn from this example.
I have also appreciated the succinct ‘about this report’ area that you have provided at the end of each of your challenge samples.
I started my PowerBI journey at the same time as my journey into data analysis, and with no real formal training in the analytics side. I quickly realized the need to have such a reference, but mine is a separate page, with notes regarding version updates (accessible via a button ). I had not considered documenting the custom visuals, or the data sources. Thank you for some great ideas there.
You should still do the video, If someday I see both of us came up with the same solution I am still going to post mine, not letting the hard work and my thoughts go in vain
Hi @BrianJ I loved you report
Thanks very much for the kind words and positive feedback. With regard to the requested videos:
- I’ve got one in the hopper almost completed called “R for non-R Users”, which provides some simple strategies for Power BI users with a basic statistical foundation to leverage the immense power of R and R visuals without actually having to know a lot about the program.
- @Paul and I are going to be working on an eDNA content module (probably too much to squeeze into a YouTube video, even a long one) on Data Review strategies, addressing the various approaches for determining data quality, reliability, validity, etc.
- I will do one on the Route Optimization approach I used. What I’d really like to do is take the time to work through the map API call approach for calculating driving distances between two lat-long points and compare that to the Haversine approach.
- @Melissa and I are in the early stages of sketching out a series on DAX virtual tables that we’ve wanted to do for a while.
Having a bit of a challenge balancing the videos, the data challenge entries, and forum support (not to mention day job, family, friends and sleep…), but these topics above are at the top of a long list.
Thanks again for the input. We all really strive to make the videos as useful to you and as responsive to your priority issues as possible, so this type of feedback is extremely helpful and greatly appreciated.
By the sounds of it, you want to try to please everyone. We all appreciate your expertise and helping everyone.
Please don’t get yourself burnt out. We all understand you have a life outside of the forum.
Please stay safe.
Again thanks for your helping us with solutions.
Incredible write up Brian. So much value in here.
There is literally so much value embedded into your report that anyone could learn from, I don’t know where to start…
To be honest even I have learned so much from reviewing your write up and also having a look at your report. You’ve really thought outside the box and derived some insights that I would never have thought of myself and so I feel like I’m learning so much from reviewing some of the techniques and ideas that you have embedded into your development and design of your report.
I also like many aspects of the navigation and also calculations and summary insights you’ve placed into certain pages.
I really like the enhanced card visualizations where you’ve placed a key metric but then also added some color to it where you have created some text summaries of what the metric actually means.
I also think the color scheme overall, works really well. The navigation features on each different page to reset or go to the home page also superb.
I also loved them map visualization where you have completed your supply chain optimization analysis. This is something I’ve never ventured into but think could enable huge value extraction for any business who deals with this sort of challenge.
I also love the filtering mechanism and custom visual that you’ve utilized, I think I might challenge myself to use something more creative like this in a future challenge.
Overall incredible design and report submission @BrianJ. We’re lucky to have you as part of the community and there is so much to learn from so much of what you’ve contributed here.
All I can think of is how can we get more of the Enterprise DNA community to gain exposure to a lot of the amazing work and ideas you’re placing into these challenge submissions and also into the forum!