{"id":10777,"date":"2020-10-29T09:59:37","date_gmt":"2020-10-29T09:59:37","guid":{"rendered":"https:\/\/www.experfy.com\/blog\/?p=10777"},"modified":"2023-10-16T11:49:56","modified_gmt":"2023-10-16T11:49:56","slug":"next-frontier-machine-learning-anyone-master","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/ai-ml\/next-frontier-machine-learning-anyone-master\/","title":{"rendered":"The Next Frontier In Machine Learning Is Something Anyone Can Master"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"10777\" class=\"elementor elementor-10777\" data-elementor-post-type=\"post\">\n\t\t\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-1a591194 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"1a591194\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-1198568c\" data-id=\"1198568c\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-ba01cfb elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"ba01cfb\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-0433a80\" data-id=\"0433a80\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div 
class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-39e5208 elementor-widget elementor-widget-text-editor\" data-id=\"39e5208\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p><p id=\"b35f\"><mark>It is frustrating trying to learn about machine learning. Do I use YOLO, Keras, TensorFlow, PyTorch, or all of them together somehow? And even if you figure out the PhD stuff, you still have to master about three other disciplines to get it to work in production: DevOps, programming, and counting disciplines.<\/mark><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-24d9e57 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"24d9e57\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-8ad03d0\" data-id=\"8ad03d0\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-eb19374 elementor-widget elementor-widget-text-editor\" data-id=\"eb19374\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Well, I am here for you, brothers and sisters in computers: I have good news for your peace of mind.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-4137177 
elementor-widget elementor-widget-text-editor\" data-id=\"4137177\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>The most important thing in getting machine learning to work properly is training data.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-069d49e elementor-widget elementor-widget-text-editor\" data-id=\"069d49e\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>And even more importantly, the skills required to make a good training set have nothing to do with math, computers, or engineering.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-f9daca5 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"f9daca5\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-6fedfd2\" data-id=\"6fedfd2\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-5a01bee elementor-widget elementor-widget-heading\" data-id=\"5a01bee\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">What is training 
data?<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-cca027f elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"cca027f\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-fa8403b\" data-id=\"fa8403b\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-6f79902 elementor-widget elementor-widget-image\" data-id=\"6f79902\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img fetchpriority=\"high\" decoding=\"async\" width=\"666\" height=\"593\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2020\/10\/1_1WiF2zrRAUvmHXxfUcRP0w.jpeg\" class=\"attachment-large size-large wp-image-33445\" alt=\"\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2020\/10\/1_1WiF2zrRAUvmHXxfUcRP0w.jpeg 666w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2020\/10\/1_1WiF2zrRAUvmHXxfUcRP0w-300x267.jpeg 300w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2020\/10\/1_1WiF2zrRAUvmHXxfUcRP0w-610x543.jpeg 610w\" sizes=\"(max-width: 666px) 100vw, 666px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-f5ffb9f elementor-section-boxed elementor-section-height-default 
elementor-section-height-default\" data-id=\"f5ffb9f\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-1293617\" data-id=\"1293617\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-facc311 elementor-widget elementor-widget-text-editor\" data-id=\"facc311\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>The tools to take that data and turn it into a <a href=\"https:\/\/www.experfy.com\/blog\/how-to-build-a-machine-learning-model\/\" target=\"_blank\" rel=\"noreferrer noopener\"> machine learning model<\/a> that works <a href=\"https:\/\/blog.machinebox.io\/detect-fake-news-by-building-your-own-classifier-31e516418b1d\" target=\"_blank\" rel=\"noreferrer noopener\">exist today<\/a> and are easy to use. 
You don\u2019t have to go to Berkeley to use them; you just have to be a little familiar with computers.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-cb84347 elementor-widget elementor-widget-text-editor\" data-id=\"cb84347\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Is it obvious where the magic is yet?<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-87408e9 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"87408e9\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-69406fd\" data-id=\"69406fd\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-c4e71c3 elementor-widget elementor-widget-heading\" data-id=\"c4e71c3\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">The Magic.<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-0bc1c95 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"0bc1c95\" data-element_type=\"section\" 
data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-a80728d\" data-id=\"a80728d\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-13a9f88 elementor-widget elementor-widget-image\" data-id=\"13a9f88\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" width=\"275\" height=\"252\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2020\/10\/1_mgIabNcs-qtTnzsf72SO2Q.gif\" class=\"attachment-large size-large wp-image-33446\" alt=\"\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-7b370fb elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"7b370fb\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-9bf3ce6\" data-id=\"9bf3ce6\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-4a1057c elementor-widget elementor-widget-text-editor\" data-id=\"4a1057c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div 
class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Where did the 2000 example e-mails come from? YOU! That\u2019s right, anyone can create a dataset with just a bit of elbow grease, some focus, and the ability to follow a few simple conventions that I\u2019ll outline below after I finish making this awesome point.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-dba643f elementor-widget elementor-widget-text-editor\" data-id=\"dba643f\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Sure, you could go online and find an existing dataset of spam, but if you\u2019re actually trying to solve a problem at your company, you ought to create your own dataset that comes from the problem you are trying to solve.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-d1c8f24 elementor-widget elementor-widget-text-editor\" data-id=\"d1c8f24\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>For example, perhaps your company gets inundated with e-mails that you\u2019d consider spam but that don\u2019t fall under the traditional definition of spam that a service like Gmail uses today.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-173b0cf elementor-widget elementor-widget-text-editor\" data-id=\"173b0cf\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Well, all you need to do is go through your company\u2019s e-mails and put examples of spam into one folder, and examples of not spam into another.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div 
class=\"elementor-element elementor-element-56b4f4d elementor-widget elementor-widget-text-editor\" data-id=\"56b4f4d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>I agree, it is a bit tedious, but it is worth it in the end because what you are doing is GOLD! YOU ARE BASICALLY MAKING GOLD!!<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-e9e578b elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"e9e578b\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-a773d15\" data-id=\"a773d15\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-a31fc9c elementor-widget elementor-widget-image\" data-id=\"a31fc9c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" width=\"1024\" height=\"768\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2020\/10\/1_QXk-5v4IQq06P_kFtiI2qw-1024x768.jpeg\" class=\"attachment-large size-large wp-image-33447\" alt=\"\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2020\/10\/1_QXk-5v4IQq06P_kFtiI2qw-1024x768.jpeg 1024w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2020\/10\/1_QXk-5v4IQq06P_kFtiI2qw-300x225.jpeg 300w, 
https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2020\/10\/1_QXk-5v4IQq06P_kFtiI2qw-768x576.jpeg 768w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2020\/10\/1_QXk-5v4IQq06P_kFtiI2qw-1536x1152.jpeg 1536w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2020\/10\/1_QXk-5v4IQq06P_kFtiI2qw-610x458.jpeg 610w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2020\/10\/1_QXk-5v4IQq06P_kFtiI2qw-750x563.jpeg 750w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2020\/10\/1_QXk-5v4IQq06P_kFtiI2qw-1140x855.jpeg 1140w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2020\/10\/1_QXk-5v4IQq06P_kFtiI2qw.jpeg 1920w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-97985b1 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"97985b1\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-d04162c\" data-id=\"d04162c\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-6470b37 elementor-widget elementor-widget-text-editor\" data-id=\"6470b37\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>It is practically a miracle what having a human-curated, problem-specific data set will do for your machine learning model\u2019s accuracy. 
Companies would kill* for such a dataset.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-58724ff elementor-widget elementor-widget-text-editor\" data-id=\"58724ff\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>To summarize, the reasons for this are twofold:<br \/><br \/><\/p>\n<p><br \/><ol class=\"wp-block-list\"><br \/><li>Machine learning models perform best when they\u2019re trained for a specific problem rather than a general one. Using an overly general model is like trying to use a hammer on a screw instead of a screwdriver. A human thinking about the problem they\u2019re trying to solve, and building a curated training set around it, is the best way to have success with your model.<\/li><br \/><li><mark>There is less and less room for generalized machine learning models to make a difference for businesses, and so they need to build models based on their own datasets, which some companies might not have or don\u2019t know how to obtain.<\/mark><\/li><br \/><\/ol><br \/><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-3f982cf elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"3f982cf\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-70cc4b3\" data-id=\"70cc4b3\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-7ae3e6e 
elementor-widget elementor-widget-heading\" data-id=\"7ae3e6e\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">What about visual data?<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-bb11cb1 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"bb11cb1\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-0c9537b\" data-id=\"0c9537b\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-361e8c6 elementor-widget elementor-widget-text-editor\" data-id=\"361e8c6\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>It is the same thing. There is <a href=\"https:\/\/techcrunch.com\/2019\/08\/05\/scale-ai-and-its-22-year-old-ceo-lock-down-100-million-to-help-label-silicon-valleys-data\/\" target=\"_blank\" rel=\"noreferrer noopener\">an explosion in data labeling startups<\/a> for visual imagery because of all of what I excellently mentioned above. 
It really requires the same work as finding spam e-mails; go through your images and label them.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-2b89a27 elementor-widget elementor-widget-text-editor\" data-id=\"2b89a27\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Now I get it \u2014 it sounds incredibly tedious, and it is tedious, which is why you can hire third-party companies to do the labeling for you, and I know plenty of <a href=\"https:\/\/www.basic.ai\/\" target=\"_blank\" rel=\"noreferrer noopener\">good ones<\/a> out there that I\u2019ve worked with in the past.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-848cce9 elementor-widget elementor-widget-text-editor\" data-id=\"848cce9\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>But on smaller scales, you can do this yourself, and become a machine learning master! 
Here are some simple guidelines to get started:<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-a184a91 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"a184a91\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-520d675\" data-id=\"520d675\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-3e3392e elementor-widget elementor-widget-heading\" data-id=\"3e3392e\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Guidelines<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-71a2157 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"71a2157\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-e250811\" data-id=\"e250811\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element 
elementor-element-1dcf1ba elementor-widget elementor-widget-text-editor\" data-id=\"1dcf1ba\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p><\/p>\n<p><ol class=\"wp-block-list\"><\/p>\n<p><li>For classification, you must have a balanced dataset. That means, if you have 1000 spam e-mails, you also need 1000 not-spam e-mails. The tolerance for imbalance is minimal to none, meaning don\u2019t try to get away with 1300 spam and 700 not-spam \u2014 I\u2019ll know.<\/li><\/p>\n<p><li>You must have a very clean dataset. That means, don\u2019t have spam e-mails labeled as not-spam and vice versa. Incorrect labels will throw your model off by a surprising amount. I\u2019ve seen accuracy increase by 20% or more after a thorough dataset cleaning.<\/li><\/p>\n<p><li>Be random. If you can, pick a random selection of spam and regular e-mails. Don\u2019t just start at the beginning of your e-mail history and work your way backward. If you do this, you could be training a model on what spam USED to look like, not what it looks like today.<\/li><\/p>\n<p><li>Context matters. If you\u2019re training a model to detect people in your office lobby from your security camera, use example images from the same camera, some with people and others without. You must train the model on what is <strong>NOT<\/strong> people as much as you are training it on what is people. Does that make sense? If not\u2026 well\u2026 I dunno, this is a blog post so I can\u2019t really answer your questions directly. Moving on\u2026<\/li><\/p>\n<p><li>Consistency matters. 
When I was building <a href=\"https:\/\/towardsdatascience.com\/i-trained-fake-news-detection-ai-with-95-accuracy-and-almost-went-crazy-d10589aa57c\" target=\"_blank\" rel=\"noreferrer noopener\" class=\"broken_link\">a fake news dataset<\/a>, I kept having to start over because I kept changing my mind about what constitutes fake news; satire? blog posts? opinion pieces? Draw the line in the sand somewhere and stick to it throughout the whole process.<\/li><\/p>\n<p><li>Validation. Set aside 20% of your training dataset and don\u2019t train your model with it. Use the 80% to train the model, and the 20% to then test your model to see how well it is performing. And make sure you choose that 20% randomly.<\/li><\/p>\n<p><\/ol><\/p>\n<p><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-3c9b9db elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"3c9b9db\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-ea8453b\" data-id=\"ea8453b\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-a7f9e31 elementor-widget elementor-widget-heading\" data-id=\"a7f9e31\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Conclusions<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section 
class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-1eb5987 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"1eb5987\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-518cf7a\" data-id=\"518cf7a\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-59db56c elementor-widget elementor-widget-text-editor\" data-id=\"59db56c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Good datasets matter. Practice making them, play around with <a href=\"https:\/\/machinebox.io\/\" target=\"_blank\" rel=\"noreferrer noopener\">the tools<\/a> to see what kind of accuracy you get, or just keep this all in mind when you call a data labeling company to label your data for you.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>The most important thing in getting machine learning to work properly is training data. 
And even more importantly, the skills required to make a good training set have nothing to do with math, computers, or engineering.<\/p>\n","protected":false},"author":227,"featured_media":10778,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[183],"tags":[205,94,92,852],"ppma_author":[1909],"class_list":["post-10777","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-ml","tag-data","tag-data-science","tag-machine-learning","tag-training"],"authors":[{"term_id":1909,"user_id":227,"is_guest":0,"slug":"aaron-edell","display_name":"Aaron Edell","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g","user_url":"","last_name":"Edell","first_name":"Aaron","job_title":"","description":"Aaron Edell is the co-founder and CEO of Machine Box, Inc., an award-winning startup that builds production-ready machine learning models that anyone can integrate, deploy and scale. 
A veteran speaker and writer on the topics of machine learning, metadata, and content management, he has published papers on metadata and machine learning and consulted major media and entertainment companies on content management since 2005."}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/10777","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/227"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=10777"}],"version-history":[{"count":5,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/10777\/revisions"}],"predecessor-version":[{"id":33450,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/10777\/revisions\/33450"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/10778"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=10777"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=10777"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=10777"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=10777"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}