Avoiding Scary Outcomes from Your AI Initiatives

Glen Ford
November 5, 2018 AI & Machine Learning


In the spirit of Halloween, let’s focus on something really scary: over half of enterprise AI projects fail. They’re strategic, often board-driven, expensive and highly visible, and yet most of them flop. AI initiatives that never go into production cost people their jobs and reputations. When something goes wrong after the model is deployed, sometimes there are nasty headlines and the need for crisis management.

AI projects fail for many reasons, but avoiding or correcting the following common training-data mistakes can significantly improve a project's odds of success.

Don’t ask your data scientists to prepare your training data

It’s not unusual for data scientists to collect and annotate the relatively small data sample required to prove an algorithm’s fundamental concept. At this early stage the data set is manageable. The team can control quality and data-based bias issues are easy to detect. But when the same team of expensive, highly skilled data scientists is expected to produce a full-blown training data set, AI initiatives can go down in flames.

With a difference in scale of two or three orders of magnitude, the task overwhelms a data science team. A data preparation exercise of this size requires tools for managing data items, labeling and annotation tasks, and the people to whom those tasks are assigned. It requires hours of work that data scientists don't have. And it requires specific skills for managing large projects, designing annotation and labeling tasks, auditing work output, and creating processes for verifying label and annotation accuracy: people and skills that small AI teams don't have.

Don’t buy pre-labeled training data

After piloting their algorithm, some AI teams opt to acquire pre-labeled training data from commercial or open sources. For the fortunate few, pre-labeled data offers a representative sample of their problem universe, is annotated in ways that are appropriate to the use case, and is available in sufficient volume and accuracy to train the algorithm. For everyone else, the pre-labeled data is a mirage. It may introduce sample bias into the training data. It may be affected by measurement bias from whatever instrument generated the data. And it may reflect prejudicial or societal bias on the part of the people who labeled the data. It’s not uncommon for an enterprise data science team to get an algorithm to, say, an 80% confidence level using pre-labeled data, and no higher.
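One cheap sanity check before committing to a pre-labeled corpus is to compare its label distribution against a small sample your own team has labeled; a large divergence is an early signal of sample bias. A minimal sketch, with purely hypothetical data and a hypothetical 10-point threshold:

```python
from collections import Counter

def label_distribution(labels):
    """Return each label's share of the dataset."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

def max_share_gap(purchased_labels, in_house_labels):
    """Largest absolute difference in label share between two samples."""
    a = label_distribution(purchased_labels)
    b = label_distribution(in_house_labels)
    return max(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in set(a) | set(b))

# Hypothetical example: the purchased set over-represents "cat".
purchased = ["cat"] * 80 + ["dog"] * 20
in_house = ["cat"] * 50 + ["dog"] * 50
gap = max_share_gap(purchased, in_house)
if gap > 0.10:  # flag anything more than 10 points apart
    print(f"Possible sample bias: max label-share gap is {gap:.0%}")
```

A check like this won't catch measurement or prejudicial bias, which live inside individual labels rather than their proportions, but it costs minutes and can disqualify an obviously unrepresentative data set before money changes hands.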

Don’t go over the waterfall

Most software development teams learned a long time ago that the traditional waterfall methodology is a scary ride. Waterfall adherents regard applications as monoliths moving down an assembly line. First, the entire application is mapped out. Then it is architected, then coded. Finally, the entire system enters testing and, ultimately, deployment.

The waterfall method is vulnerable to complexity. Complexity makes applications bigger, with more interdependencies. In waterfall, these interdependencies aren’t fully explored until the testing stage, at which point any oversights or miscalculations that are discovered drive the entire project back to the architecture or coding stage.

Complexity also makes project lifecycles longer. And with long lifecycles development projects are at the mercy of business and technology changes, which can force an application do-over or result in a deployed system that is irrelevant.

No one writes software this way anymore. But this is how most enterprises still manage AI projects.

Software developers long ago migrated to a more iterative, agile style of development. Data science teams should do the same.

AI initiatives that evolve in an agile fashion get broken up into manageable bite-sized pieces. This technique allows teams to learn continuously and re-prioritize where necessary, which ultimately leads to delivering value quicker and building the right model more efficiently. An agile approach gives teams flexibility and emphasizes constant development and testing.

An agile data science team would train an algorithm on a specific part of its larger role, get that part to an acceptable level of confidence, and then deploy that part, even if other parts of the algorithm aren’t production ready. In doing this, the team is getting value from their AI investment faster.
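The "deploy the parts that are ready" idea can be expressed as a simple gate in the evaluation loop. This is an illustrative sketch, not a prescribed workflow: the sub-task names, scores, and the 0.90 acceptance bar are all assumptions:

```python
# Illustrative sketch: release each sub-task of a larger model as soon
# as it clears an acceptance threshold, instead of waiting for all of them.
CONFIDENCE_THRESHOLD = 0.90  # acceptance bar; tune per use case

def split_by_readiness(evaluations, threshold=CONFIDENCE_THRESHOLD):
    """Split sub-tasks into those ready to ship and those needing more work.

    `evaluations` maps a sub-task name to its held-out confidence score.
    """
    ready = {t: s for t, s in evaluations.items() if s >= threshold}
    pending = {t: s for t, s in evaluations.items() if s < threshold}
    return ready, pending

# Hypothetical scores after one training iteration.
scores = {
    "invoice_classification": 0.93,
    "address_extraction": 0.87,
    "duplicate_detection": 0.95,
}
ready, pending = split_by_readiness(scores)
# Ship the ready parts now; keep iterating on the rest.
```

The point of the gate is that value starts flowing from the shipped parts while the below-threshold parts continue through further labeling and training cycles.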

It can be scary to think about the risk surrounding AI initiatives today. But data science teams can reduce their fright if their data is trained right and their development process is sound.

Originally posted in insideBIGDATA
