{"id":1984,"date":"2019-10-01T02:43:36","date_gmt":"2019-10-01T02:43:36","guid":{"rendered":"http:\/\/kusuaks7\/?p=1589"},"modified":"2024-03-19T09:48:21","modified_gmt":"2024-03-19T09:48:21","slug":"why-data-annotation-is-the-secret-to-hacking-ai","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/ai-ml\/why-data-annotation-is-the-secret-to-hacking-ai\/","title":{"rendered":"Why Data Annotation is the Secret to Hacking AI"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"1984\" class=\"elementor elementor-1984\" data-elementor-post-type=\"post\">\n\t\t\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-f7c09d2 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"f7c09d2\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-4abd2f7\" data-id=\"4abd2f7\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-41e0f7c elementor-widget elementor-widget-text-editor\" data-id=\"41e0f7c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tIn case you\u2019ve been living under a rock, artificial intelligence (AI) is everywhere. It\u2019s infiltrated almost every aspect of our private and professional lives. From healthcare to transportation, AI aims to redefine how information is collected, integrated, and analyzed; ultimately leading to more informed insights and delivering better outcomes. 
But for all its hype, the full promise of AI rarely comes to fruition because of one four-letter word: \u201cdata.\u201d\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-9cb3ece elementor-widget elementor-widget-text-editor\" data-id=\"9cb3ece\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tWhile the AI story is all the rage, the data narrative is not as prominently discussed. Sure, data may not be as sexy as the automated systems that can learn and process information quicker than a human, but it is equally important. And don\u2019t get me wrong: we all know that AI requires vast amounts of data to continually learn and identify patterns that humans can\u2019t. After all, it\u2019s the ability to process this information and make instant decisions that has led to AI being such a game changer for industries that rely on massive volumes of data.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-dd92fe0 elementor-widget elementor-widget-text-editor\" data-id=\"dd92fe0\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tBut the real story is not about the algorithms powering the AI revolution; it\u2019s about the quality of data powering these systems. 
What enterprises really need as they develop their AI strategy is to integrate, clean, link, and supplement their data so they have an accurate foundation on which to build\u00a0and train their machine learning algorithms.\u00a0 For many organizations, this makes AI difficult if not impossible.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-8862b79 elementor-widget elementor-widget-text-editor\" data-id=\"8862b79\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\u201cData-related challenges are a top reason (our) clients have halted or canceled artificial-intelligence projects,\u201d said IBM\u2019s senior vice president of cloud and cognitive software, Arvind Krishna, speaking at\u00a0<em><a href=\"https:\/\/www.wsj.com\/articles\/data-challenges-are-halting-ai-projects-ibm-executive-says-11559035800\" target=\"_blank\" rel=\"noreferrer noopener\" aria-label=\" (opens in a new tab)\">The Wall Street Journal\u2019s<\/a><\/em>\u00a0Future of Everything Festival. He\u2019s certainly not alone in his assessment.\u00a0 According to a report by\u00a0<em>MIT Technology Review<\/em>, insufficient data quality was one of\u00a0<a href=\"https:\/\/insights.techreview.com\/live-ai-poll-key-stats\/\" target=\"_blank\" rel=\"noreferrer noopener\" label=\" (opens in a new tab)\" class=\"broken_link\">the biggest challenges<\/a>\u00a0to employing AI. 
What\u2019s more, 85% of AI projects will \u201cnot deliver\u201d for organizations, according to research and advisory company\u00a0<em>Gartner<\/em>.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-1d6313e elementor-widget elementor-widget-text-editor\" data-id=\"1d6313e\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tCompanies need to think of AI and machine learning as the engines that will drive the amazing things they want to accomplish. But like every engine, it needs the right fuel to run well.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-3314795 elementor-widget elementor-widget-heading\" data-id=\"3314795\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\"><h2><strong>Enter Data Annotation<\/strong><\/h2><\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-9cfad95 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"9cfad95\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-03a6dd6\" data-id=\"03a6dd6\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-f5db911 elementor-widget elementor-widget-text-editor\" 
data-id=\"f5db911\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tData annotation (also referred to as data labeling) is critical to ensuring your AI and machine learning projects can scale. It provides the initial setup for training: it shows a machine learning model what it needs to understand and how to discriminate between various inputs to come up with accurate outputs.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-00d8367 elementor-widget elementor-widget-text-editor\" data-id=\"00d8367\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tThere are many data annotation modalities, depending on the form the data takes. They range from image and video annotation to text categorization, semantic annotation, and content categorization. Humans are needed to identify and annotate specific data so machines can learn to identify and classify information. Without these labels, the machine learning algorithm will have a difficult time computing the necessary attributes.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-95f0fcc elementor-widget elementor-widget-text-editor\" data-id=\"95f0fcc\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tThe unfortunate reality about all of this is that it\u2019s still a very manual, labor-intensive process. While tools for annotation are getting better, the difference between an ill-designed tool and an intuitive one makes a significant difference in annotation productivity. 
According to\u00a0<a href=\"https:\/\/www.cognilytica.com\/2019\/04\/19\/infographic-data-prep-and-labeling\/\" target=\"_blank\" rel=\"noreferrer noopener\" label=\" (opens in a new tab)\" class=\"broken_link\">some estimates<\/a>, 80% of AI project time is currently spent on data preparation. But even small errors in the data could prove to be disastrous. In this area, humans actually have a leg up on machines. We\u2019re simply better than computers at managing subjectivity, understanding intent, and coping with ambiguity \u2013 all of which are important factors in data annotation.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-1ab3891 elementor-widget elementor-widget-text-editor\" data-id=\"1ab3891\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tRegardless of modality, the vast majority of problems that AI models are built to address fit into\u00a0one (or more) of the annotation tasks below:\n<ul>\n \t<li>Sequencing: text or\u00a0time series in which there\u2019s a start (left boundary), an end (right boundary), and a label\u00a0(e.g.,\u00a0recognize the name of a person in a text, identify a paragraph discussing penalties in a contract)<\/li>\n<\/ul>\n<ul>\n \t<li>Categorization: binary classes, multiple classes, one label, multi-labels, flat or\u00a0hierarchic, ontologic (e.g.,\u00a0categorize a book according to the BISAC ontology, categorize an image as offensive or not offensive)<\/li>\n<\/ul>\n<ul>\n \t<li>Segmentation: find paragraph\u00a0splits, find an object in an image, find transitions between speakers, between topics, etc. 
(e.g., spot objects and people in a picture, find the transition between topics in a news broadcast)<\/li>\n<\/ul>\n<ul>\n \t<li>Mapping: language-to-language, full text to summary, question to answer, raw data to normalized data (e.g., translate from French to English, normalize a date from free text to standard format)<\/li>\n<\/ul>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-045af51 elementor-widget elementor-widget-text-editor\" data-id=\"045af51\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tUsually, complex problems can be solved as a sequence or a combination of tasks. For example, when you unlock your phone with face identification, machine learning is used to spot your nose and eyes (segmentation) and categorize the face as you or not-you (categorization). When you talk to Alexa or Siri, machine learning is used to map your voice to words (mapping), recognize sequences such as an\u00a0instruction or the\u00a0name of a song (sequencing),\u00a0and decide whether to play music, tell the weather, etc. (categorization).\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-5c577c8 elementor-widget elementor-widget-text-editor\" data-id=\"5c577c8\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tAt the end of the day, even the most technically advanced algorithm cannot address or solve a problem without the right data. We know having access to data is valuable, but having access to data with a learnable \u2018signal\u2019 consistently added at a massive scale is the biggest competitive advantage nowadays. 
That\u2019s the power of data annotation.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>In case you\u2019ve been living under a rock, artificial intelligence (AI) is everywhere. It\u2019s infiltrated almost every aspect of our private and professional lives. From healthcare to transportation, AI aims to redefine how information is collected, integrated, and analyzed; ultimately leading to more informed insights and delivering better outcomes. But for all its hype, the<\/p>\n","protected":false},"author":648,"featured_media":4106,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[183],"tags":[97],"ppma_author":[3396],"class_list":["post-1984","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-ml","tag-artificial-intelligence"],"authors":[{"term_id":3396,"user_id":648,"is_guest":0,"slug":"michael-goldberg","display_name":"Michael Goldberg","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g","user_url":"","last_name":"Goldberg","first_name":"Michael","job_title":"","description":"Michael Goldberg is Vice President, Marketing and Communications at&nbsp;<a href=\"https:\/\/innodata.com\/\" target=\"_blank\" rel=\"noopener\">Innodata<\/a>,&nbsp;which bridges human expertise with machine learning.&nbsp; He has spoken at various industry conferences, and has shared his&nbsp;thoughts in a variety of publications about technology, advertising, and emerging media trends. 
&nbsp;"}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1984","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/648"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=1984"}],"version-history":[{"count":7,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1984\/revisions"}],"predecessor-version":[{"id":36478,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1984\/revisions\/36478"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/4106"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=1984"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=1984"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=1984"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=1984"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}