{"id":9464,"date":"2020-08-27T09:59:51","date_gmt":"2020-08-27T09:59:51","guid":{"rendered":"https:\/\/www.experfy.com\/blog\/?p=9464"},"modified":"2023-11-15T10:42:54","modified_gmt":"2023-11-15T10:42:54","slug":"automated-inspiration","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/ai-ml\/automated-inspiration\/","title":{"rendered":"Automated Inspiration"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"9464\" class=\"elementor elementor-9464\" data-elementor-post-type=\"post\">\n\t\t\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-2502510c elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"2502510c\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-2774e781\" data-id=\"2774e781\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-407b9b8d elementor-widget elementor-widget-text-editor\" data-id=\"407b9b8d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p><sup>I<\/sup>n the 19th century, doctors might have prescribed mercury for mood swings and arsenic for asthma. It might not have occurred to them to wash their hands before your surgery. 
They weren\u2019t\u00a0<em>trying<\/em>\u00a0to kill you, of course\u2014they just didn\u2019t know any better.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p>These early doctors had valuable data scribbled in their notebooks, but each held only one piece in a grand jigsaw puzzle. Without modern tools for sharing and analyzing information\u2014as well as a science for making sense of that data\u2014there wasn\u2019t much to stop superstition from influencing what could be seen through a keyhole of observable facts.<\/p>\n<!-- \/wp:paragraph -->\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-6855c37 elementor-widget elementor-widget-text-editor\" data-id=\"6855c37\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<!-- wp:paragraph -->\n<p>Humans have come a long way with technology since then, but today\u2019s boom in machine learning (ML) and artificial intelligence (AI) isn\u2019t really a break with the past. It\u2019s the continuation of the basic human instinct to make sense of the world around us so that we can make smarter decisions. We simply have dramatically better technology than we\u2019ve ever had before.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph {\"align\":\"center\",\"backgroundColor\":\"vivid-cyan-blue\",\"fontSize\":\"medium\",\"style\":{\"color\":{\"text\":\"#f2ecec\"}}} -->\n<p class=\"has-text-align-center has-vivid-cyan-blue-background-color has-text-color has-background has-medium-font-size\" style=\"color: #f2ecec;\">&#8220;Today\u2019s boom in machine learning and artificial intelligence isn\u2019t really a break with the past. It\u2019s the continuation of the basic human instinct to make sense of the world around us so that we can make smarter decisions. 
We simply have dramatically better technology than we\u2019ve ever had before.&#8221;<\/p>\n<!-- \/wp:paragraph -->\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-d38a882 elementor-widget elementor-widget-text-editor\" data-id=\"d38a882\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<!-- wp:paragraph -->\n<p>One way to think of this pattern through the ages is as a revolution of data sets, not data points. The difference isn\u2019t trivial. Data sets helped shape the modern world. Consider the scribes of Sumer (modern day Iraq), who pressed their styluses to tablets of clay more than 5,000 years ago. When they did so, they invented not just the first system of writing, but the first data storage and sharing technology.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p>If you\u2019re inspired by the promise of AI\u2019s better-than-human abilities, consider that stationery gives us superhuman memory. Though it\u2019s easy to take writing for granted today, the ability to store data sets reliably represents a ground-breaking first step on the path to higher intelligence.<\/p>\n<!-- \/wp:paragraph -->\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-552083d elementor-widget elementor-widget-text-editor\" data-id=\"552083d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Unfortunately, retrieving information from clay tablets and their pre-electronic cousins is a pain. You can\u2019t snap your fingers at a book to get its word count. Instead, you\u2019d have to upload every word into your brain to process it. This made early data analysis time-consuming, so initial forays into it stuck to the essentials. 
While a kingdom might analyze how much gold it raised in taxes, only an intrepid soul would try the same line of effortful reasoning on an application like, say, medicine, where millennia of tradition encouraged just winging it.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p>Luckily, our species produced some incredible pioneers. For example, John Snow\u2019s map of deaths during the 1854 cholera outbreak in London inspired the medical profession to reconsider the superstition that the disease was caused by miasma (toxic air) and to start taking a closer look at the drinking water.<\/p>\n<!-- \/wp:paragraph -->\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-eafdbee elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"eafdbee\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-bd348c5\" data-id=\"bd348c5\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-22c0899 elementor-widget elementor-widget-text-editor\" data-id=\"22c0899\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>If you know \u201cThe Lady With The Lamp,\u201d Florence Nightingale, for her heroic compassion as a nurse, you might be surprised to learn that she was also an analytics pioneer. 
Her inventive infographics during the Crimean War saved many lives by identifying poor hygiene as a leading cause of hospital deaths and inspiring her government to take sanitation seriously.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p>The one-data set era took off as the value of information began to assert itself in a growing number of fields, leading to the invention of the computer. No, not the electronic buddy you\u2019re used to today. \u201cComputer\u201d started out as a human profession, with its practitioners performing computations and processing data manually to extract its value.<\/p>\n<!-- \/wp:paragraph -->\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-1e8a97d elementor-widget elementor-widget-text-editor\" data-id=\"1e8a97d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<!-- wp:paragraph -->\n<p>The beauty of data is that it allows you to form an opinion out of something better than thin air. By taking a look at information, you\u2019re inspired to ask new questions, following in the footsteps of Florence Nightingale and John Snow. 
That\u2019s what the discipline of analytics is all about: inspiring models and hypotheses through exploration.<\/p>\n<!-- \/wp:paragraph -->\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-22688a3 elementor-widget elementor-widget-heading\" data-id=\"22688a3\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\"><strong>From Data Sets To Data Splitting<\/strong><\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-10f36e5 elementor-widget elementor-widget-text-editor\" data-id=\"10f36e5\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<!-- wp:paragraph -->\n<p>In the early 20th century, a desire to make better decisions under uncertainty led to the birth of a parallel profession: statistics. Statisticians help you test whether it\u2019s sensible to behave as though the phenomenon an analyst found in the current data set also applies beyond it.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p>A famous example comes from Ronald A. Fisher, who developed the world\u2019s first statistics textbook. Fisher describes performing a hypothesis test in response to his friend\u2019s claim that she could taste whether milk was added to tea before or after the water. 
Hoping to prove her wrong, he was instead forced by the data to conclude that she could.<\/p>\n<!-- \/wp:paragraph -->\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-3a36fc1 elementor-widget elementor-widget-text-editor\" data-id=\"3a36fc1\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Analytics and statistics have a major Achilles\u2019 heel: If you use the same data point for hypothesis generation and for hypothesis testing, you\u2019re cheating. Statistical rigor requires you to call your shots before you take them; analytics is more a game of advanced hindsight. They were almost tragicomically incompatible, until the next major revolution\u2014data splitting\u2014changed everything.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p>Data splitting is a simple idea, but to a data scientist like myself, it\u2019s one of the most profound. If you have only one data set, you must choose between analytics (untestable inspiration) and statistics (rigorous conclusions). The hack? Split your data set into two pieces, then have your cake and eat it too!<\/p>\n<!-- \/wp:paragraph -->\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-a70e3d1 elementor-widget elementor-widget-text-editor\" data-id=\"a70e3d1\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>The two-data set era replaces the analytics-statistics tension with coordinated teamwork between two different breeds of data specialist. 
Analysts use one data set to help you frame your questions, then statisticians use the other data set to bring you rigorous answers.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p>Such luxury comes with a hefty price tag: quantity. Splitting is easier said than done if you\u2019ve struggled to scrape together enough information for even one respectable data set. The two-data set era is a fairly new development that goes hand-in-hand with better processing hardware, lower storage costs and the ability to share collected information over the internet.<\/p>\n<!-- \/wp:paragraph -->\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-fb2e9cc elementor-widget elementor-widget-text-editor\" data-id=\"fb2e9cc\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>In fact, the technological innovations that led to the two-data set era rapidly ushered in the next phase, a three-data set era of automated inspiration. There\u2019s a more familiar word for it: machine learning.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p>Using a data set destroys its purity as a source of statistical rigor. You only get one shot, so how do you know which \u201cinsight\u201d from analytics is most worthy of testing? Well, if you had a third data set, you could use it to take your inspiration for a test drive. This screening process is called validation; it\u2019s at the heart of what makes machine learning tick.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p>Once you\u2019re free to throw everything at the validation wall and see what sticks, you can safely let everyone have a go at coming up with a solution: seasoned analyst, intern, tea leaves and even algorithms with no context about your business problem. 
Whichever solution works best in validation becomes a candidate for the proper statistical test. You\u2019ve just empowered yourself to automate inspiration!<\/p>\n<!-- \/wp:paragraph -->\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ec7a8b6 elementor-widget elementor-widget-heading\" data-id=\"ec7a8b6\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\"><strong>Automated Inspiration<\/strong><\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-93201c6 elementor-widget elementor-widget-text-editor\" data-id=\"93201c6\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\n<!-- wp:paragraph -->\n<p>This is why <a href=\"https:\/\/www.experfy.com\/blog\/what-machine-learning-data-poisoning\/\" target=\"_blank\" rel=\"noreferrer noopener\">machine learning<\/a> is a revolution of data sets, not just data. It depends on the luxury of having enough data for a three-way split.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p>Where does AI fit into the picture? Machine learning with deep neural networks is technically called deep learning, but it got another nickname that stuck: AI. Although AI once had a\u00a0<a href=\"http:\/\/bit.ly\/quaesita_ai\" target=\"_blank\" rel=\"noreferrer noopener\" class=\"broken_link\"><u>different meaning<\/u><\/a>, today you\u2019re most likely to find it used as a synonym for deep learning.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p>Deep neural networks earned their hype by virtue of outclassing less sophisticated ML algorithms on many complex tasks. 
But they require much more data to train, along with processing power beyond that of a typical laptop. That\u2019s why the rise of modern AI is a cloud story; the cloud allows you to rent someone else\u2019s data center instead of committing to building your own deep learning rig, making AI a try-before-you-buy proposition.<\/p>\n<!-- \/wp:paragraph -->\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-71ca990 elementor-widget elementor-widget-text-editor\" data-id=\"71ca990\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>With this puzzle piece in place, we have the full complement of professions: ML\/AI, analytics and statistics. The umbrella term that encompasses all of them is\u00a0<a href=\"http:\/\/bit.ly\/quaesita_datasci\" target=\"_blank\" rel=\"noreferrer noopener\" class=\"broken_link\"><u>data science<\/u><\/a>, the discipline of making data useful.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p>Modern data science is the product of our three-data set era, but many industries routinely generate more than enough data. So is there a case for four data sets?<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p>Well, what\u2019s your next move if the model you just trained gets a low validation score? If you\u2019re like most people, you\u2019ll immediately demand to know why! Unfortunately, there\u2019s no data set you can ask. You might be tempted to go sleuthing in your validation data set, but debugging breaks its ability to screen your models effectively.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p>By subjecting your validation data set to analytics, you\u2019re effectively turning your three data sets back into two. 
Instead of finding help, you\u2019ve unwittingly gone back an era!<\/p>\n<!-- \/wp:paragraph -->\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-59ae7b7 elementor-widget elementor-widget-text-editor\" data-id=\"59ae7b7\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>The solution lies outside the three data sets you\u2019re already using. To unlock smarter training iteration and hyperparameter tuning, you\u2019ll want to join the cutting edge: an era of four data sets.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p>If you think of the other three data sets as giving you inspiration, iteration and rigorous testing, then the fourth fuels acceleration, shortening your AI development cycle through advanced analytics techniques geared toward providing clues about which approaches to try in each round. By embracing four-way data splitting, you\u2019ll be in the best position to take advantage of data abundance! Welcome to the future.<\/p>\n<!-- \/wp:paragraph -->\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>The technological innovations that led to the two-data set era rapidly ushered in the next phase, a three-data set era of automated inspiration. 
There\u2019s a more familiar word for it: machine learning.<\/p>\n","protected":false},"author":335,"featured_media":9465,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[183],"tags":[226,583,225],"ppma_author":[2050],"class_list":["post-9464","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-ml","tag-ai","tag-data-set","tag-ml"],"authors":[{"term_id":2050,"user_id":335,"is_guest":0,"slug":"cassie-kozyrkov","display_name":"Cassie Kozyrkov","avatar_url":"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2020\/04\/medium_df35f80d-2bff-4fe3-b741-a94d51320e00-150x150.jpg","user_url":"https:\/\/careers.google.com\/?src=Online\/LinkedIn\/linkedin_profilepage&amp;utm_source","last_name":"Kozyrkov","first_name":"Cassie","job_title":"","description":"Cassie Kozyrkov is Chief Decision Scientist at Google, Inc. With a unique combination of deep technical expertise, and world-class public-speaking skills, she has provided guidance on more than 100 projects and designed Google's analytics program, personally training over 15000 Googlers in statistics, decision-making, and machine 
learning.\u00a0"}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/9464","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/335"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=9464"}],"version-history":[{"count":8,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/9464\/revisions"}],"predecessor-version":[{"id":34092,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/9464\/revisions\/34092"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/9465"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=9464"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=9464"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=9464"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=9464"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}