Every now and then, we hear data professionals talking about how automation is coming for their jobs. Let’s debunk this myth once and for all!
We often see claims in various WhatsApp and Telegram discussions that AutoML is going to replace data scientists. A common perception is that when a product like protonAutoML can clean data, engineer features, build models, tune parameters, explain results and deploy, there is no need for a data scientist.
This is very far from the truth. It is the same fear people had back in the 90s, when Excel became commonplace and many thought bookkeepers would be out of their jobs. The opposite happened: record keepers became more efficient with Excel and jobs grew.
Don’t worry! The use of AutoML will not hamper the data scientist’s opportunities for work or creativity.
There is no need to fear any AI apocalypse (yet), because technologies like AutoML are here to make things easier and increase productivity, so that we can free our brains from repetitive, manual tasks and focus on business problems. Thinking about business problems, finding areas where revenue can grow, deciding which projects to implement and creating one row per customer for a given project are what make data professionals stand out. After all, models alone won’t bring new revenue opportunities if the company doesn’t know how to use them.
We know that machine learning is a very effective technique, and it is continuously improving. We also know that it covers several families of methods, such as supervised, unsupervised and reinforcement learning. However, many of the tasks data scientists perform day to day can be automated. This reduces the effort, and sometimes the results are even better than what one can achieve manually. We can only try a few permutations and combinations by hand, but pre-written logic and algorithms backed by vast computing power can try far more combinations than any person could.
Think about it: you want to tune the parameters of a machine learning model. You can try Bayesian optimization, random search, grid search and so on, but you are limited by time and speed, so the tuning will most likely take a while. This is where protonAutoML comes in: the computing happens 24/7 to find the best parameters for you, freeing you from many hours of boring, repetitive work.
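To make the idea of automated tuning concrete, here is a minimal sketch of random search in plain Python. It is not how protonAutoML works internally; the toy data, the one-dimensional ridge regression and every function name are assumptions made up for illustration. The loop simply samples a regularization strength at random, fits the model, and keeps whichever value scores best on held-out data.

```python
import math
import random


def fit_ridge_1d(xs, ys, lam):
    """Closed-form slope of a 1-D ridge regression without intercept."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def validation_mse(w, xs, ys):
    """Mean squared error of the prediction w * x on held-out data."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def random_search(train, valid, n_trials=50, seed=0):
    """Try n_trials random values of lam and return the best (lam, score)."""
    rng = random.Random(seed)
    best_lam, best_score = None, math.inf
    for _ in range(n_trials):
        # Sample lam log-uniformly between 1e-3 and 1e3 (an assumed range).
        lam = 10 ** rng.uniform(-3, 3)
        w = fit_ridge_1d(train[0], train[1], lam)
        score = validation_mse(w, valid[0], valid[1])
        if score < best_score:
            best_lam, best_score = lam, score
    return best_lam, best_score


# Hypothetical toy data: y is roughly 2 * x plus noise.
data_rng = random.Random(42)
xs = [data_rng.uniform(-1, 1) for _ in range(40)]
ys = [2.0 * x + data_rng.gauss(0, 0.3) for x in xs]
train = (xs[:30], ys[:30])
valid = (xs[30:], ys[30:])

best_lam, best_score = random_search(train, valid)
print(f"best lam: {best_lam:.4f}, validation MSE: {best_score:.4f}")
```

Raising n_trials costs only compute time, not your time, which is exactly the trade an AutoML tool makes on your behalf.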
The same goes for data cleaning and feature engineering. Say there is a date column that contains many different formats. You would have to manually bring it into a single format in pandas or SQL, then extract the day, month, year and so on, and perhaps even subtract each date from today to get an interval in days. Now imagine this kind of complexity across many different variables. Why not just automate these tasks with AutoML? It makes sense to save the time.
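As a rough illustration of the manual work described above, here is a small sketch using only Python’s standard library. The list of candidate formats, the sample values and the function names are assumptions for the example, not something taken from any AutoML product. Note that a value like 03/04/2021 is ambiguous; this sketch assumes day-first.

```python
from datetime import date, datetime

# Assumed set of formats found in the column; a real column may need more.
# %b and %B match English month names under the default C locale.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d %b %Y", "%B %d, %Y")


def parse_mixed_date(text, formats=CANDIDATE_FORMATS):
    """Return a date for the first format that matches, or None if none do."""
    cleaned = text.strip()
    for fmt in formats:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue
    return None


def date_features(text, today):
    """Extract day, month, year and the number of days elapsed up to `today`."""
    parsed = parse_mixed_date(text)
    if parsed is None:
        return None
    return {
        "day": parsed.day,
        "month": parsed.month,
        "year": parsed.year,
        "days_since": (today - parsed).days,
    }


# Hypothetical column with one date written four ways, plus a bad value.
raw_column = ["2021-03-15", "15/03/2021", "15 Mar 2021", "March 15, 2021", "not a date"]
today = date(2021, 4, 1)  # fixed so the example is reproducible
features = [date_features(value, today) for value in raw_column]

for value, feats in zip(raw_column, features):
    print(value, "->", feats)
```

Even this tiny case needs a format list, a fallback for bad values and a decision about ambiguous dates, which gives a sense of how quickly the effort grows across many columns.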
Another popular example is recommender systems. In a recommender system, we have to analyse massive amounts of data and then put the results in a form that can be presented for consumption. This process requires a lot of effort and resources, so AutoML is used here to reduce the amount of work performed by humans and give them more time to concentrate on other important things.
Why is the data scientist still a vital and needed part of a team?
It is because they have to deal with the messy details of the business and be able to make the case to the business on data quality and quantity, rather than only on the details in the data. Understanding the requirements and needs behind a business problem is not something AutoML can do. Data governance is often an issue that only data scientists fully understand. Convincing stakeholders is another challenge altogether, one that we face every day.
There is so much data, and so many different ways to work with it, that it is impossible to find one approach which is always superior.
Most of the time the data is complex and has almost infinite variety; it is often impossible to take it all into account.
Critical responsibilities of data scientists (even 20 years from now!)
1. Formulate problems and think of multiple approaches
2. Prepare a good amount of data for analysis and store it in a secure and controlled environment.
4. Interpret models and their outcomes, and explore various what-if scenarios.
5. Present the business outcomes after model implementation. Show the bigger picture to stakeholders.
6. Productionize the AI results and measure the ROI and dollar value.
“Fear not; AutoML is not taking over.”
Look at the positive side of adopting automation in the business. The more automation is applied successfully, the more other departments will want a piece of the action. Not everyone knows how this technology works behind the scenes, yet everyone wants to experience its benefits, which will increase the demand for data scientists.