After Data Challenge 8 there was significant interest in machine learning applications within Power BI, and I’ve been asked to put together some resources to help people implement machine learning in their Power BI reports, as well as some tools to help them learn machine learning. I have a passion for machine learning, and especially for bringing these types of advanced analytics to a larger audience. Power BI is one of the best tools I have found for doing this: it lets the average person leverage these models in an easy-to-understand tool without requiring an advanced understanding of data science, machine learning, and statistics, while still allowing someone with that understanding to apply that knowledge, optimize those models, and get the same degree of quality they are used to in other analytics platforms.
Machine Learning within Power BI
There are two main ways to integrate a machine learning model into Power BI: as a query within Power Query or as a visual. (You can also connect to a machine learning model through a dataflow, direct connection, or live connection, but not everyone has the resources to do so, so I’m not going to go over these at the moment.) Within Power BI, the two most common programming languages used for machine learning are R and Python (I tend to favor R, but that is purely a personal choice). Other programming languages can be used for machine learning, but since these are the most popular ones I am going to focus on these two for now. Additionally, I am going to break this post out into two parts. The first part will provide resources for learning R and Python, resources for integrating R and Python into Power BI, and resources for learning machine learning. In the second part I will walk through some examples of using machine learning in Power BI.
Updated: I’ve decided to add a section at the bottom for recommendations from the EDNA Community.
Learn R & Python
While it is possible to deploy some machine learning models without any R or Python programming, you will find that understanding these languages helps significantly when you create and tweak your models. I highly suggest using Datacamp, Coursera, and edX for learning either programming language. There are a number of excellent books as well that I’ll add later on, but if you want to get started right now, these are three of the best resources for learning. If you have any resources that have helped you, just let me know.
DataFlair: This recommendation comes from @GuyJohnson. I did not personally use DataFlair in my data science journey, but after looking through their catalog and trying a few of their courses I wholeheartedly agree that this is another amazing resource.
Integrating Python and R into Power BI
There are two main ways to integrate a machine learning model into Power BI: as a query within Power Query or as a visual. (You can also connect to a machine learning model through a dataflow, direct connection, or live connection, but not everyone has the resources to do so, so I’m not going to go over these at the moment.)
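To make the "query within Power Query" route a little more concrete, here is a minimal sketch in Python of what a scoring step might look like. It assumes the usual behavior of a Run Python script step, where Power BI hands the current table to the script as a pandas DataFrame named `dataset`. The column names (`ad_spend`, `sales`), the helper `add_linear_prediction`, and the sample values are all made up for illustration, and a simple least-squares line stands in for whatever model you would actually use.

```python
import numpy as np
import pandas as pd

# Stand-in for the table Power BI passes to a "Run Python script" step.
# Inside Power Query this DataFrame is expected to already exist as `dataset`;
# it is built by hand here (with invented columns) so the sketch runs anywhere.
dataset = pd.DataFrame({
    "ad_spend": [1.0, 2.0, 3.0, 4.0, 5.0],
    "sales": [3.0, 5.0, 7.0, 9.0, 11.0],
})


def add_linear_prediction(df, x_col, y_col, out_col="predicted"):
    """Fit y = slope * x + intercept by least squares and append the fitted values."""
    slope, intercept = np.polyfit(df[x_col], df[y_col], deg=1)
    result = df.copy()  # leave the incoming table untouched
    result[out_col] = slope * result[x_col] + intercept
    return result


# The scored table is left in a DataFrame variable so it can be picked up
# as the output of the step.
scored = add_linear_prediction(dataset, "ad_spend", "sales")
print(scored)
```

The same pattern should carry over to a real model: train or load it inside the script, write the predictions into a new column, and leave the result in a DataFrame. A visual works along similar lines, except the script ends by drawing a plot instead of producing a table.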
Learn Machine Learning
Part 2 is out now. I will be updating this post to include more resources in the future.
This is a fantastic resource list. I was not familiar with the edX courses, but those are great to know about.
Thanks so much for taking the time to write this up. Already looking forward to part two.
I have also found that http://Data-Flair.training has some good training.
Under the Blog at the top of the page are some excellent tutorials on many subjects (I posted about the Power BI one before). These seem to all be free.
There is also paid training there.
@GuyJohnson I didn’t know they had machine learning, I’ll definitely take a look! Thank you!
@bradsmith @BrianJ There is also some good stuff at Udemy. It can be pricey, but if you wait it out the prices come down to around $10 or so every so often.
And you get lifetime access to what you buy.
@BrianJ honestly, edX and Datacamp were probably the two biggest influences on my early ML & analytics education. My undergrad degrees weren’t in computer science or any technical field, and I originally started my career in sales for a protein manufacturer, so I had zero experience with anything beyond Excel. Datacamp gave me the base understanding and confidence in my coding ability to take those courses, and edX brought it to a whole other level.
For more advanced users who have an intermediate-to-advanced understanding of R or Python, or who are looking for a Professional Certificate, MicroMasters(R) Program, or just a learning path, I’d highly recommend:
And for anybody who doesn’t need a certificate, can’t afford to pay for the courses, or simply wants to try one of the courses without paying, edX does allow you to audit a course for free. Every course also allows you to upgrade to the verified certificate track later, up to a specified deadline, so you can take part of a course and make sure it’s what you want before deciding.
Oh wow, amazing Brad.
Let’s set up a call and get this into a video! The YT channel (and the world’s Power BI users) need to learn this!
Sounds great! I love talking about both machine learning and Power BI!
Let’s do it. Connect with me on Skype
@bradsmith what do you think is better to start with, R or Python? In your personal experience what integrates well into Power BI?
That’s honestly a really hard question to answer. R was specifically built to perform statistical and numerical analysis on large datasets and display the results, whereas Python was originally created as a more general-purpose programming language, so I personally lean towards R.
If you have some experience with other programming languages, Python’s syntax is more similar to theirs than R’s is. If not, I’d say go with R. Troubleshooting issues early on felt easier to me in R than it did in Python. It gets much easier once you have a basic understanding of Python, but for me it felt much easier to do in R.
@MudassirAli I would say start with Python. The syntax is easy to read and understand, and anything you can think of has already been asked on forums, so you will always land on a solution. Once you get how functions, classes, OOP, methods, properties, and loops work, you can start with the Pandas library. Pandas is an amazing library for data cleaning.
Personally, I chose R because my primary focus was statistics/econometrics and R is definitely stronger in that area.
I think R is much easier to learn than DAX. As long as you have a good statistical foundation, once you learn the basic structure of R scripts and functions, much of it becomes a repetitive fill-in-the-blanks exercise.
“Everything is easier to learn than DAX” - Antriksh Sharma, circa 2020
@bradsmith Thanks for the useful resources. I have always wanted to learn how to do Machine Learning.
@AfroLatino happy to help! Can’t wait to hopefully see some of your work in the future.
Found out about this book, which is said to be an absolute essential for machine learning with R. The author has made a PDF version available for free:
https://faculty.marshall.usc.edu/gareth-james/ISL/ISLR Seventh Printing.pdf
Introduction to Statistical Learning (usc.edu)
@bradsmith Will do my best!
That’s a great book for R! I was going to add some books to the list of resources and that’s a great place to start!