Automated machine learning in Microsoft Power BI
kypexin
New Altair Community Member
Hey-hey,
This is a bit off-topic compared to the usual forum questions, but I still wanted to discuss it with the community.
Microsoft has announced Automated Machine Learning (AutoML) for Dataflows in Power BI: https://powerbi.microsoft.com/en-us/blog/creating-machine-learning-models-in-power-bi/
Some deja vu, right?
What's your opinion, has anyone tried it so far? How do you look at the future of automated ML in general? It seems that more and more products and data science platforms are going to introduce this sort of feature in the near future. Do you think automated ML hasn't yet uncovered its full potential and is going to be a promising technology in a matter of just a few years (yes, I personally do think it's the future... but how close is it?).
Answers
Hey,
Here are a couple of thoughts on the general Auto ML trend and how it relates to the BI / data viz / prep market:
- Auto ML will become the standard approach for the vast majority of data science projects requiring ML. It is the only way to overcome the data scientist bottleneck. Very soon, I would expect that more than 90% of models in production will be created with some Auto ML flavor. The remaining (small) percentage will be very specific use cases which would require individualized solutions. Workflows will cover a lot of these, but there will always be the need for coding for truly new use cases as well. The good news is that more models will be created by non-traditional data scientists, which frees data scientists to focus more on the really hard problems.
- Auto ML makes more sense as part of prep / ETL processes than it does in visualizations / BI reports. The problem is that current data architectures for viz-based approaches often provide the data to the user in some aggregated form, while the biggest value of ML is on an event level. This granularity is available as part of the data prep workflows though. So we can expect that pretty much all data prep / ETL vendors will either build an Auto ML solution or partner with one very soon.
- Most data viz / BI vendors will also incorporate Auto ML into their visualizations, but it will have less impact. Those vendors will realize (for the reasons above) that the connection to Auto ML makes most sense within their prep tool (if they have one). Embedding Auto ML directly as part of the visualizations is definitely a nice gimmick and may be the first touch point with ML for many users. And I see situations where ML models can help with explanations for what you are currently seeing. But neither area is what most people would refer to as an ML model in production use. And it is definitely far away from automated decision making.
- Every data science platform will have Auto ML as part of its solution. We are almost there already, to be honest. I have seen dozens of Auto ML solutions at this point, and all of them come with pretty much the same set of functions.
- Because of points 1-4 above, Auto ML will be commoditized and will no longer be a stand-alone category in a couple of years. While the hype is high right now, the perceived value is often more wishful thinking than reality. There will be some reality checks soon when people realize that the hypertuned model is not very robust, or that the expected error rates cannot be held because of wrong validations (due to a separation of data prep and ML), or because of concept drift. This is all part of data science, of course, but many new users of Auto ML are simply not aware of this. As a follow-up, I would expect price drops for Auto ML because of a) the commoditization and b) reality kicking in that an ML model is an ML model, no matter who or what has generated it.
- Auto ML solutions will not differentiate through accuracy of the models. Or through speed of model building. Those are KPIs important for data scientists, but LOB owners do not really care. The main differentiators will be around covered use cases, the model deployment and management, how well the Auto ML solution is integrated with the remaining platform for data science, and - potentially most important - how well the model can be integrated with end user applications including data viz apps.
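To make the "wrong validations (due to a separation of data prep and ML)" point above a bit more concrete, here is a minimal sketch in plain Python. It is not how any particular Auto ML product works; the data is pure random noise and all function names are made up for illustration. The only thing it is meant to show is that doing a data prep step (here: feature selection) once on all rows before cross-validation tends to report an accuracy that cannot be held later, while repeating the same prep step inside each fold does not.

```python
import random

def make_noise_data(n_rows=40, n_cols=1000, seed=0):
    """Pure-noise features with balanced labels: there is no real signal."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n_cols)] for _ in range(n_rows)]
    y = [i % 2 for i in range(n_rows)]
    return X, y

def top_features(X, y, rows, k):
    """Data prep step: keep the k columns whose class means differ most,
    computed only from the given rows."""
    ones = [r for r in rows if y[r] == 1]
    zeros = [r for r in rows if y[r] == 0]

    def score(c):
        m1 = sum(X[r][c] for r in ones) / len(ones)
        m0 = sum(X[r][c] for r in zeros) / len(zeros)
        return abs(m1 - m0)

    return sorted(range(len(X[0])), key=score, reverse=True)[:k]

def centroid(X, rows, cols):
    """Mean of the selected columns over the given rows."""
    return [sum(X[r][c] for r in rows) / len(rows) for c in cols]

def predict(X, row, cols, c0, c1):
    """Nearest-centroid classifier on the selected columns."""
    d0 = sum((X[row][c] - m) ** 2 for c, m in zip(cols, c0))
    d1 = sum((X[row][c] - m) ** 2 for c, m in zip(cols, c1))
    return 1 if d1 < d0 else 0

def cv_accuracy(X, y, k_features, n_folds, select_inside_fold):
    """Cross-validated accuracy with feature selection done either once on
    all rows (leaky) or separately on each fold's training rows (honest)."""
    n = len(X)
    all_rows = list(range(n))
    fold_size = n // n_folds
    # Leaky variant: this selection has already seen every test row.
    global_cols = top_features(X, y, all_rows, k_features)
    correct = 0
    for f in range(n_folds):
        test = all_rows[f * fold_size:(f + 1) * fold_size]
        train = [r for r in all_rows if r not in test]
        if select_inside_fold:
            cols = top_features(X, y, train, k_features)
        else:
            cols = global_cols
        c0 = centroid(X, [r for r in train if y[r] == 0], cols)
        c1 = centroid(X, [r for r in train if y[r] == 1], cols)
        correct += sum(predict(X, r, cols, c0, c1) == y[r] for r in test)
    return correct / (fold_size * n_folds)

if __name__ == "__main__":
    X, y = make_noise_data()
    leaky = cv_accuracy(X, y, 10, 5, select_inside_fold=False)
    honest = cv_accuracy(X, y, 10, 5, select_inside_fold=True)
    print("prep outside validation:", leaky)
    print("prep inside validation: ", honest)
```

Since the labels are unrelated to the features, anything much above chance in the first number is purely an artifact of the validation setup, which is exactly the kind of error rate that falls apart once the model meets new data.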
This is all very exciting! I personally use RapidMiner Auto Model almost exclusively whenever I need a model now. I do not even see enough need to try and optimize the model even further. It is fun and saves me tons of time. But at the same time it is good to know that I could use the generated process as a starting point if I wanted to.
The one thing which surprised me most in the past 6 to 12 months was how quickly people have come to accept automatically generated models and trust them as much as those generated by humans. I talk a lot about transparency and trust in my presentations, mainly because this used to be one of the biggest hurdles to overcome before you put a model into production. But it seems that the market is maturing rapidly and that this becomes less and less of an issue now. Which is great as well, I just did not expect it to happen that quickly.
Anyway, those are my thoughts on Auto ML, BI, data prep, and all the rest.
Cheers,
Ingo