As artificial intelligence and machine learning become ubiquitous in our society, more and more companies have started to invest in these technologies to support their business decisions, reduce costs, and drive production efficiency. Promising as they may sound, AI and machine learning are not magic. A 2019 article published by Pactera and Nimdzi Insights reports that 85% of AI initiatives fail. So why are companies with sufficient technology, funding, and personnel not seeing results from their data science initiatives?
To answer this question, we first have to understand the process of an AI initiative from start to deployment. In operationalizing AI, there are two main phases: the training phase and the inference phase.
Training phase:
- Data scientists decide which models to use and how to implement these models
- Models are trained with data and outputs are evaluated for prediction accuracy
- The trained ML models are made ready to be applied to “real world” data
Inference phase:
- The ML model is applied to its particular use case
- The company determines whether the model generalizes properly, and with sufficient accuracy, to the company’s real data
- Adjustments to the model are made and more training data may be added as required
- The company can utilize insights from the models to drive business decisions
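The two phases can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy data, the one-variable least-squares model, the JSON artifact, and the 0.25 error threshold are hypothetical stand-ins, not a description of any particular company's pipeline.

```python
import json
import statistics


def train(xs, ys):
    """Training phase: fit y = slope * x + intercept by least squares."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def predict(model, x):
    """Apply a fitted model to a single input value."""
    return model["slope"] * x + model["intercept"]


def mean_abs_error(model, xs, ys):
    """Average absolute gap between predictions and observed values."""
    errors = [abs(predict(model, x) - y) for x, y in zip(xs, ys)]
    return statistics.fmean(errors)


# Training phase: fit on historical data and evaluate accuracy.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.0, 5.0, 7.0, 9.0]  # hypothetical data following y = 2x + 1
model = train(train_x, train_y)
training_error = mean_abs_error(model, train_x, train_y)

# Handoff: the trained model becomes an artifact that operations can load.
artifact = json.dumps(model)

# Inference phase: load the artifact and check it against real data.
deployed = json.loads(artifact)
live_x = [5.0, 6.0]
live_y = [11.5, 12.5]  # hypothetical observations from production
live_error = mean_abs_error(deployed, live_x, live_y)

# Assumed tolerance; a real project would agree on this with the business.
needs_retraining = live_error > 0.25
```

The point of the sketch is the boundary between the phases: the model performs well on its training data, yet the check against live data is what tells the company whether adjustments or more training data are needed.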
With so many companies reporting numerous failed machine learning projects, there are clearly many challenges that come with operationalizing artificial intelligence. Companies without the proper direction and planning in terms of AI will struggle to keep pace with competitors and fall behind in their respective industries. Successfully implementing machine learning algorithms into business processes is essential, and companies face increasing pressure to invest carefully in data science in order to see results.
Here are two main reasons why ML and AI initiatives fail to make it into the business process, along with ways to overcome these challenges.
1. Transitioning from the Training Phase to the Inference Phase
A major difficulty companies run into is bridging the gap between the data science team and the operations team. It is already difficult for companies to find data scientists who combine industry experience with strong leadership and communication skills. Even with a strong data science team, though, companies often run into problems when machine learning models are ready to be passed on to the operations team and incorporated into the business process (moving from training to inference).
A lack of communication about the purpose and design of the model, along with differences between the tools used by data scientists and those used by operations professionals, is among the main issues with this process. As explained in a Forbes article on operationalizing AI, data scientists and machine learning engineers often build their models in notebooks and other tools tailored specifically to data science, whereas the operations side uses its own tools. This leads to difficulties and delays in deploying these models, and it reinforces the lack of communication between company teams.
Strategies for Success:
Ensuring that all teams collaborate effectively and properly understand these data science initiatives is essential to getting more value out of AI. Data scientists and operations professionals must be on the same page when it comes to moving from the training stage to the inference stage. To do this, companies should define a formal handoff process that describes how data science projects are handed over to the operations project management team. When data scientists and project managers in other departments are aligned, AI initiatives are better managed, which leads to a smoother integration of data science into the business process. Still, finding the right data scientists, engineers, and MLOps (machine learning operations) professionals is difficult for many companies, and it can be costly or require a lengthy onboarding process. As a result, many companies will be better off looking beyond internal resources for success.
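One way to picture a formalized handoff is as a record that must be complete before a model moves to operations. The sketch below is only an illustration under stated assumptions: the `HandoffRecord` class, its field names, and the example values are hypothetical, and a real handoff process would be defined by the teams involved.

```python
from dataclasses import dataclass, field, fields


@dataclass
class HandoffRecord:
    """Hypothetical checklist of what operations needs from data science."""
    model_name: str = ""
    business_purpose: str = ""
    owner_data_science: str = ""
    owner_operations: str = ""
    input_fields: list = field(default_factory=list)
    evaluation_metric: str = ""
    runtime_dependencies: list = field(default_factory=list)


def missing_items(record):
    """Return the names of handoff fields that are still empty."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Example: a record filled in by the data science team only.
record = HandoffRecord(
    model_name="churn-model",
    business_purpose="Flag accounts likely to cancel next quarter",
    owner_data_science="data science lead",
    input_fields=["tenure_months", "support_tickets"],
    evaluation_metric="recall on held-out accounts",
)

gaps = missing_items(record)
ready_for_operations = not gaps
```

Here the record is not ready because no operations owner or runtime dependencies have been named, which is exactly the kind of gap a defined handoff is meant to surface before deployment is delayed.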
2. AI Interpretability
Another challenge that companies run into is that data scientists and engineers, who are hired solely to build ML models, are often unable to explain their algorithms and findings effectively to business leaders. From a business perspective, leaders want to be certain that an AI initiative will deliver value, and oftentimes a lack of understanding between business leaders and their data scientists leads to the failure of the initiative. In addition, certain models that utilize unsupervised learning can produce seemingly strong results while offering very little explainability. This means that company decision-makers have to trust that their data is reliable and that their model was trained correctly before deciding to go through with an AI initiative. More often than not, the combination of these challenges leads business leaders to turn down AI initiatives and continue with traditional business processes.
Strategies for Success:
Companies must ensure that these initiatives are led by skilled project managers with knowledge of both the technical and business sides. A strong connection between the data science team and upper management is crucial. Data science teams need leaders who understand both sides so that they can communicate effectively with decision-makers and get these models implemented. Having influential communicators who can translate the work of the data scientists for business leaders can greatly improve those leaders’ trust in, and approval of, data science initiatives. It is also important for data scientists not to rely solely on unsupervised learning. Including human data evaluation and supervised or semi-supervised learning can make it easier to assess the quality of algorithms and to interpret them for the business side.
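As a rough illustration of pairing unsupervised output with human evaluation, the sketch below scores cluster assignments against a small set of human-assigned labels using cluster purity, and names each cluster by its majority label. The article does not prescribe this metric; purity is swapped in here as one simple option, and the cluster IDs and labels are hypothetical.

```python
from collections import Counter, defaultdict


def label_counts_by_cluster(cluster_ids, human_labels):
    """Count how often each human label appears inside each cluster."""
    if len(cluster_ids) != len(human_labels) or not human_labels:
        raise ValueError("need equally sized, non-empty inputs")
    counts = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, human_labels):
        counts[cluster][label] += 1
    return counts


def cluster_purity(cluster_ids, human_labels):
    """Share of items whose human label matches their cluster's majority."""
    counts = label_counts_by_cluster(cluster_ids, human_labels)
    majority_total = sum(c.most_common(1)[0][1] for c in counts.values())
    return majority_total / len(human_labels)


def majority_labels(cluster_ids, human_labels):
    """Name each cluster by its most common human label."""
    counts = label_counts_by_cluster(cluster_ids, human_labels)
    return {cluster: c.most_common(1)[0][0] for cluster, c in counts.items()}


# Hypothetical output of an unsupervised model for eight customers.
cluster_ids = [0, 0, 0, 1, 1, 1, 1, 2]
# Hypothetical labels assigned by a human reviewer to the same customers.
human_labels = ["loyal", "loyal", "at_risk", "at_risk",
                "at_risk", "at_risk", "loyal", "new"]

purity = cluster_purity(cluster_ids, human_labels)
summary = majority_labels(cluster_ids, human_labels)
```

A single agreement score and a plain-language name per cluster give business leaders something concrete to evaluate, rather than asking them to trust unlabeled groupings.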
Pandio, one of The Data Standard’s top sponsors, is a distributed messaging service that has been designed with the idea of helping companies connect their data more effectively to AI/ML models in the cloud. Leveraging the new and incredibly powerful open source technology of Apache Pulsar, Pandio has developed a one-of-a-kind distributed messaging system that delivers high throughput (up to billions of events per day), low latency, and zero data loss. As such, Pandio is a worthwhile consideration for companies looking to gain value out of AI initiatives, which can in turn yield excellent returns. Feel free to reach out if you have any questions or are interested in learning more about Pandio’s services.
For more of our blog posts and content, check out our website at The Data Standard, the premier community of data scientists, enthusiasts, and thought leaders. We aim to foster conversations and share insight among the leading professionals in data science through empathy and mindfulness during the pandemic.
About the Author: Koosha Jadbabaei is a Data Scientist and Technical Writer working with The Data Standard. Koosha is currently a student at the University of California, San Diego, majoring in Data Science and minoring in Entrepreneurship/Innovation. Along with his work at The Data Standard, Koosha is an undergraduate researcher at UCSD’s Data Science Department, working to detect political bias and misinformation in tweets through sentiment analysis and clustering. He is interested in data analysis, machine learning, and data visualization, and is passionate about using data to tackle difficult problems and make a positive impact on the lives of others.