{"id":22509,"date":"2020-12-17T10:17:12","date_gmt":"2020-12-17T10:17:12","guid":{"rendered":"https:\/\/www.experfy.com\/blog\/limitations-of-machine-learning\/"},"modified":"2023-09-21T16:31:02","modified_gmt":"2023-09-21T16:31:02","slug":"limitations-of-machine-learning","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/ai-ml\/limitations-of-machine-learning\/","title":{"rendered":"The Limitations of Machine Learning"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"22509\" class=\"elementor elementor-22509\" data-elementor-post-type=\"post\">\n\t\t\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-158ee60 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-eae-slider=\"78477\" data-id=\"158ee60\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-24cd2bd\" data-eae-slider=\"67708\" data-id=\"24cd2bd\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-dbe9489 elementor-widget elementor-widget-text-editor\" data-id=\"dbe9489\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p class=\"has-medium-font-size\"><em>Machine learning is now seen as a silver bullet for solving all problems, but sometimes it is not the answer.<\/em><\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-6e6033c elementor-widget elementor-widget-text-editor\" data-id=\"6e6033c\" data-element_type=\"widget\" 
data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<blockquote class=\"wp-block-quote\"><p>\u201cIf a typical person can do a mental task with less than one second of thought, we can probably automate it using AI either now or in the near future.\u201d<\/p><p><strong><em>\u2014 Andrew Ng<\/em><\/strong><\/p><\/blockquote>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-432650f elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-eae-slider=\"43545\" data-id=\"432650f\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-8b42902\" data-eae-slider=\"66163\" data-id=\"8b42902\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-4ac140b elementor-widget elementor-widget-text-editor\" data-id=\"4ac140b\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"b184\">Most people reading this are likely familiar with machine learning and the relevant algorithms used to classify or predict outcomes based on data. However, it is important to understand that machine learning is not the answer to all problems. 
Given the usefulness of machine learning, it can be hard to accept that sometimes it is not the best solution to a problem.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ad1d9f4 elementor-widget elementor-widget-heading\" data-id=\"ad1d9f4\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">In this article, I aim to convince the reader that there are times when machine learning is the right solution, and times when it is the wrong solution.<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-330c99a elementor-widget elementor-widget-text-editor\" data-id=\"330c99a\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"7587\">Machine learning, a subset of artificial intelligence, has revolutionized the world as we know it over the past decade. The information explosion has resulted in the collection of massive amounts of data, especially by large companies such as Facebook and Google. This volume of data, coupled with the rapid development of processor power and computer parallelization, has made it possible to obtain and study huge amounts of data with relative ease.<\/p>\n<p id=\"8fd7\">Nowadays, hyperbole about machine learning and artificial intelligence is ubiquitous. This is perhaps rightly so, given that the potential of this field is massive. 
The number of AI consulting agencies has soared in the past few years, and, according to a report from\u00a0<a href=\"https:\/\/www.techrepublic.com\/article\/the-10-highest-paying-ai-jobs-and-the-massive-salaries-they-command\/\" target=\"_blank\" rel=\"noreferrer noopener\">Indeed<\/a>, the number of jobs related to AI ballooned by 100% between 2015 and 2018.<\/p>\n<p id=\"fcc1\">As of December 2018,\u00a0<a href=\"https:\/\/www.forbes.com\/sites\/gilpress\/2018\/12\/15\/ai-in-2019-according-to-recent-surveys-and-analysts-predictions\/#c39f59714c30\" target=\"_blank\" rel=\"noreferrer noopener\">Forbes<\/a>\u00a0found that 47% of businesses had at least one AI capability in their business processes, and a report by\u00a0<a href=\"http:\/\/deloitte\/\" target=\"_blank\" rel=\"noreferrer noopener\">Deloitte<\/a>\u00a0projects that the penetration rates of enterprise software with built-in AI and of cloud-based AI development services will reach an estimated 87 and 83 percent, respectively. These numbers are impressive \u2014 if you are planning to change careers anytime soon, AI seems like a pretty good bet.<\/p>\n\n<p id=\"26fb\">So it all seems great, right? Companies are happy and, presumably, consumers are also happy \u2014 otherwise, the companies would not be using AI.<\/p>\n\n<p id=\"70e8\">It is great, and I am a huge fan of machine learning and AI. 
However, there are times when using machine learning is unnecessary or does not make sense, and other times when its implementation can get you into difficulties.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-b4e5165 elementor-widget elementor-widget-heading\" data-id=\"b4e5165\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">Limitation 1 \u2014 Ethics<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-510c8da elementor-widget elementor-widget-text-editor\" data-id=\"510c8da\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"b9f8\">It is easy to understand why machine learning has had such a profound impact on the world; what is less clear is exactly what its capabilities are and, perhaps more importantly, what its limitations are. 
Yuval Noah Harari popularized the term \u2018dataism\u2019, which refers to a putative new stage of civilization we are entering in which we trust algorithms and data more than our own judgment and logic.<\/p>\n\n<p id=\"1dc0\">Whilst you may find this idea laughable, remember the last time you went on vacation and followed the instructions of a GPS rather than your own judgment on a map \u2014 do you question the judgment of the GPS? People have literally driven into lakes because they blindly followed the instructions from their GPS.<\/p>\n\n<p id=\"a0d5\">The idea of trusting data and algorithms more than our own judgment has its pros and cons. Obviously, we benefit from these algorithms; otherwise, we wouldn\u2019t be using them in the first place. These algorithms allow us to automate processes by making informed judgments using available data. Sometimes, however, this means replacing someone\u2019s job with an algorithm, which comes with ethical ramifications. Additionally, who do we blame if something goes wrong?<\/p>\n\n<p id=\"7ada\">The most commonly discussed case currently is self-driving cars \u2014 how do we choose how the vehicle should react in the event of a fatal collision? 
In the future will we have to select which ethical framework we want our self-driving car to follow when we are purchasing the vehicle?<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-bd9860b elementor-widget elementor-widget-heading\" data-id=\"bd9860b\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\">If my self-driving car kills someone on the road, whose fault is it?<\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-4bc8c14 elementor-widget elementor-widget-text-editor\" data-id=\"4bc8c14\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"f649\">Whilst these are all fascinating questions, they are not the main purpose of this article. Clearly, however, machine learning cannot tell us anything about what normative values we should accept, i.e. how we should act in the world in a given situation. 
As David Hume famously said, one cannot \u2018derive an ought from an is\u2019.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-d721a04 elementor-widget elementor-widget-heading\" data-id=\"d721a04\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">Limitation 2 \u2014 Deterministic Problems<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-c6ba68b elementor-widget elementor-widget-text-editor\" data-id=\"c6ba68b\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"40cf\">This is a limitation I personally have had to deal with. My field of expertise is environmental science, which relies heavily on computational modeling and using sensors\/IoT devices.<\/p>\n\n<p id=\"a247\">Machine learning is incredibly powerful for sensors and can be used to help calibrate and correct sensors when connected to other sensors measuring environmental variables such as temperature, pressure, and humidity. The correlations between the signals from these sensors can be used to develop self-calibration procedures and this is a hot research topic in my research field of atmospheric chemistry.<\/p>\n\n<p id=\"e298\">However, things get a bit more interesting when it comes to computational modeling.<\/p>\n\n<p id=\"0c74\">Running computer models that simulate global weather, emissions from the planet, and transport of these emissions is very computationally expensive. 
In fact, it is so computationally expensive, that a research-level simulation can take weeks even when running on a supercomputer.<\/p>\n\n<p id=\"8e4d\">Good examples of this are MM5 and WRF, which are numerical weather prediction models that are used for climate research and for giving you weather forecasts on the morning news. Wonder what weather forecasters do all day? Run and study these models.<\/p>\n\n<p id=\"ea6e\">Running weather models is fine, but now that we have machine learning, can we just use this instead to obtain our weather forecasts? Can we leverage data from satellites, weather stations, and use an elementary predictive algorithm to discern whether it is going to rain tomorrow?<\/p>\n\n<p id=\"bb02\">The answer is, surprisingly, yes. If we have knowledge of the air pressures around a certain region, the levels of moisture in the air, wind speeds, and information about neighboring points and their own variables, it becomes possible to train, for example, a neural network. But at what cost?<\/p>\n\n<p id=\"a52c\">Using a neural network with a thousand inputs to determine whether it will rain tomorrow in Boston is possible. 
However, utilizing a neural network misses the entire physics of the weather system.<\/p>\n\n<p id=\"e390\"><mark><strong>Machine learning is stochastic, not deterministic.<\/strong><\/mark><\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-614af2e elementor-widget elementor-widget-heading\" data-id=\"614af2e\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\">A neural network does not understand Newton\u2019s second law, or that density cannot be negative \u2014 there are no physical constraints.<\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-7c27cfb elementor-widget elementor-widget-text-editor\" data-id=\"7c27cfb\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"74cd\">However, this may not be a limitation for long. 
There are multiple researchers looking at adding physical constraints to neural networks and other algorithms so that they can be used for purposes such as this.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-9ecf410 elementor-widget elementor-widget-heading\" data-id=\"9ecf410\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">Limitation 3 \u2014 Data<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-1af4dbf elementor-widget elementor-widget-text-editor\" data-id=\"1af4dbf\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"a31f\">This is the most obvious limitation. If you feed a model poorly, then it will only give you poor results. This can manifest itself in two ways: lack of data, and lack of&nbsp;<strong>good<\/strong>&nbsp;data.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-bd702e7 elementor-widget elementor-widget-heading\" data-id=\"bd702e7\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\">Lack of Data<\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-c5bceac elementor-widget elementor-widget-text-editor\" data-id=\"c5bceac\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"0636\">Many machine learning algorithms require large amounts of data before they begin to give useful results. 
A good example of this is a neural network. Neural networks are data-eating machines that require copious amounts of training data. The larger the architecture, the more data is needed to produce viable results. Reusing data is a bad idea, and data augmentation is useful to some extent, but having more data is always the preferred solution.<\/p>\n\n<p id=\"497a\">If you can get the data, then use it.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-804327f elementor-widget elementor-widget-heading\" data-id=\"804327f\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\">Lack of Good Data<\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-4334a20 elementor-widget elementor-widget-text-editor\" data-id=\"4334a20\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"d66a\">Despite the appearance, this is not the same as the above comment. Let\u2019s imagine you think you can cheat by generating ten thousand fake data points to put in your neural network. What happens when you put it in?<\/p>\n\n<p id=\"8146\">It will train itself, and then when you come to test it on an unseen data set, it will not perform well. You had the data but the quality of the data was not up to scratch.<\/p>\n\n<p id=\"e2bb\">In the same way that having a lack of good features can cause your algorithm to perform poorly, having a lack of good ground truth data can also limit the capabilities of your model. 
No company is going to implement a machine learning model that performs worse than human-level error.<\/p>\n\n<p id=\"aa57\">Similarly, applying a model that was trained on a set of data in one situation may not necessarily apply as well to a second situation. The best example of this I have found so far is in breast cancer prediction.<\/p>\n\n<p id=\"fbc3\">Mammography databases have a lot of images in them, but they suffer from one problem that has caused significant issues in recent years \u2014 almost all of the x-rays are from white women. This may not sound like a big deal, but actually, black women have been shown to be&nbsp;<a href=\"https:\/\/www.acr.org\/Media-Center\/ACR-News-Releases\/2018\/New-ACR-and-SBI-Breast-Cancer-Screening-Guidelines-Call-for-Significant-Changes-to-Screening-Process\" rel=\"noopener\">42 percent more likely to die from breast cancer<\/a>&nbsp;due to a wide range of factors that may include differences in detection and access to health care. Thus, training an algorithm primarily on white women adversely impacts black women in this case.<\/p>\n\n<p id=\"52be\">What is needed in this specific case is a larger number of x-rays of black patients in the training database, more features relevant to the cause of this 42 percent increased likelihood, and for the algorithm to be more equitable by stratifying the dataset along the relevant axes.<\/p>\n\n<p id=\"c61b\">If you are skeptical of this or would like to know more, I recommend you look at this&nbsp;<a href=\"https:\/\/news.mit.edu\/2019\/using-ai-predict-breast-cancer-and-personalize-care-0507\" rel=\"noopener\">article<\/a>.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-454bf70 elementor-widget elementor-widget-heading\" data-id=\"454bf70\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 
class=\"elementor-heading-title elementor-size-default\">Limitation 4 \u2014 Misapplication<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ddb1fdb elementor-widget elementor-widget-text-editor\" data-id=\"ddb1fdb\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"ded5\">Related to the second limitation discussed previously, there is purported to be a \u201c<a href=\"https:\/\/futurism.com\/machine-learning-crisis-science\" rel=\"noopener\"><em>crisis of machine learning in academic research<\/em><\/a>\u201d whereby people blindly use machine learning to try and analyze systems that are either deterministic or stochastic in nature.<\/p>\n\n<p id=\"5b1d\">For reasons discussed in limitation two, applying machine learning to deterministic systems can succeed, but the algorithm will not be learning the underlying relationship between the variables, and will not know when it is violating physical laws. We simply give some inputs and outputs to the system and tell it to learn the relationship. Like someone translating word for word out of a dictionary, the algorithm will have only a superficial grasp of the underlying physics.<\/p>\n\n<p id=\"ea1d\">For stochastic (random) systems, things are a little less obvious. 
The crisis of machine learning for random systems manifests itself in two ways:<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-6fba7de elementor-widget elementor-widget-text-editor\" data-id=\"6fba7de\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ul><li>P-hacking<\/li><li>Scope of the analysis<\/li><\/ul>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-6f5fd7b elementor-widget elementor-widget-heading\" data-id=\"6f5fd7b\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\">P-hacking<\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-31c4ea6 elementor-widget elementor-widget-text-editor\" data-id=\"31c4ea6\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"6272\">When one has access to large data, which may have hundreds, thousands, or even millions of variables, it is not too difficult to find a statistically significant result (given that the level of statistical significance needed for most scientific research is&nbsp;<em>p &lt; 0.05<\/em>). This often leads to spurious correlations being found that are usually obtained by p-hacking (looking through mountains of data until a correlation showing statistically significant results is found). These are not true correlations and are just responding to the noise in the measurements.<\/p>\n\n<p id=\"b052\">This has resulted in individuals \u2018fishing\u2019 for statistically significant correlations through large data sets, and masquerading these as true correlations. 
Sometimes, this is an innocent mistake (in which case the scientist should be better trained), but other times, it is done to increase the number of papers a researcher has published \u2014 even in the world of academia, competition is strong and some people will go to great lengths to improve their metrics.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-c704a73 elementor-widget elementor-widget-heading\" data-id=\"c704a73\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\">Scope of the Analysis<\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-6bfa973 elementor-widget elementor-widget-text-editor\" data-id=\"6bfa973\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"6be9\">There are inherent differences in the scope of the analysis for machine learning as compared with statistical modeling \u2014 statistical modeling is inherently confirmatory, and machine learning is inherently exploratory.<\/p>\n\n<p id=\"b9e3\">We can consider confirmatory analysis and models to be the kind of thing that someone does in a Ph.D. program or in a research field. Imagine you are working with an advisor and trying to develop a theoretical framework to study some real-world system. This system has a set of pre-defined features that it is influenced by, and, after carefully designing experiments and developing hypotheses, you are able to run tests to determine the validity of your hypotheses.<\/p>\n\n<p id=\"dc3e\">Exploratory analysis, on the other hand, lacks a number of the qualities associated with confirmatory analysis. 
In fact, in the case of truly massive amounts of data and information, the confirmatory approaches completely break down due to the sheer volume of data. In other words, it simply is not possible to carefully lay out a finite set of testable hypotheses in the presence of hundreds, let alone thousands or millions, of features.<\/p>\n\n<p id=\"a11f\">Therefore, broadly speaking, machine learning algorithms and approaches are best suited for exploratory predictive modeling and classification with massive amounts of data and computationally complex features. Some will contend that they can be used on \u201csmall\u201d data, but why would one do so when classic, multivariate statistical methods are so much more informative?<\/p>\n\n<p id=\"b246\">ML is a field which, in large part, addresses issues derived from information technology, computer science, and so on; these can be both theoretical and applied problems. As such, it is related to fields such as physics, mathematics, probability, and statistics, but ML is really a field unto itself, a field which is unencumbered by the concerns raised in the other disciplines. 
Many of the solutions ML experts and practitioners come up with are painfully mistaken\u2026but they get the job done.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-31923b3 elementor-widget elementor-widget-heading\" data-id=\"31923b3\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">Limitation 5 \u2014 Interpretability<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-52fa741 elementor-widget elementor-widget-text-editor\" data-id=\"52fa741\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"f6e3\"><a href=\"https:\/\/www.experfy.com\/blog\/ai-ml\/introduction-to-the-white-box-ai-the-concept-of-interpretability\/\" target=\"_blank\" rel=\"noreferrer noopener\">Interpretability <\/a>is one of the primary problems with machine learning. An AI consultancy firm trying to pitch to a firm that only uses traditional statistical methods can be stopped dead if they do not see the model as interpretable. If you cannot convince your client that you understand how the algorithm came to the decision it did, how likely are they to trust you and your expertise?<\/p>\n\n<p id=\"3216\">As bluntly stated in \u201c<em>Business Data Mining \u2014 a machine learning perspective<\/em>\u201d:<\/p>\n\n<p id=\"053c\"><em>\u201cA business manager is more likely to accept the [machine learning method] recommendations if the results are explained in business terms\u201d<\/em><\/p>\n\n<p id=\"c565\">These models as such can be rendered powerless unless they can be interpreted, and the process of human interpretation follows rules that go well beyond technical prowess. 
For this reason, interpretability is a paramount quality that machine learning methods should aim to achieve if they are to be applied in practice.<\/p>\n\n<p id=\"228b\">The blossoming -omics sciences (genomics, proteomics, metabolomics and the like), in particular, have become the main target for machine learning researchers precisely because of their dependence on large and non-trivial databases. However, they suffer from the lack of interpretability of their methods, despite their apparent success.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-37acfaa elementor-widget elementor-widget-heading\" data-id=\"37acfaa\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">Summary and Peter Voss\u2019 List<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-865455c elementor-widget elementor-widget-text-editor\" data-id=\"865455c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"c727\">While it is undeniable that AI has opened up a\u00a0<a href=\"https:\/\/www.technologyreview.com\/s\/545416\/could-ai-solve-the-worlds-biggest-problems\/\" target=\"_blank\" rel=\"noreferrer noopener\">wealth of promising opportunities<\/a>, it has also led to the emergence of a mindset that can be best described as \u201c<a href=\"https:\/\/www.sciencealert.com\/ai-machine-learning-solve-all-humanity-problems-unrealistic\" target=\"_blank\" rel=\"noreferrer noopener\">AI solutionism<\/a>\u201d. 
This is the philosophy that, given enough data, machine learning algorithms can\u00a0<a href=\"https:\/\/www.technologyreview.com\/s\/545416\/could-ai-solve-the-worlds-biggest-problems\/\" target=\"_blank\" rel=\"noreferrer noopener\">solve all of humanity\u2019s problems<\/a>.<\/p>\n\n<p id=\"0f35\">As I hope I have made clear in this article, there are limitations that, at least for the time being, prevent that from being the case. A neural network can never tell us how to be a good person, and, at least for now, does not understand Newton\u2019s laws of motion or Einstein\u2019s theory of relativity. There are also fundamental limitations grounded in the underlying theory of machine learning, called computational learning theory, which are primarily statistical limitations. We have also discussed issues associated with the scope of the analysis and the dangers of p-hacking, which can lead to spurious conclusions. There are also issues with the interpretability of results, which can negatively impact businesses that are unable to convince clients and investors that their methods are accurate and reliable.<\/p>\n\n<p id=\"7e86\">Whilst in this article I have covered very broadly some of the most important limitations of AI, to finish, I will reproduce a more comprehensive list of the limitations of AI, published in an\u00a0<a href=\"https:\/\/medium.com\/@petervoss\/why-machine-learning-wont-cut-it-f523dd2b20e3#.wifeugkuq\" target=\"_blank\" rel=\"noreferrer noopener\">article<\/a>\u00a0by Peter Voss in October 2016. 
Whilst current mainstream techniques can be very powerful in narrow domains, they will\u00a0<em>typically\u00a0<\/em>have some or all of the constraints that he sets out, which I\u2019ll quote in full here:<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-859fbc6 elementor-widget elementor-widget-text-editor\" data-id=\"859fbc6\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ul><li>Each narrow application needs to be specially trained<\/li><li>Require large amounts of&nbsp;<em>hand-crafted, structured<\/em>&nbsp;training data<\/li><li>Learning must generally be supervised: Training data must be tagged<\/li><li>Require lengthy offline\/ batch training<\/li><li>Do not learn incrementally or interactively, in real-time<\/li><li>Poor transfer learning ability, reusability of modules, and integration<\/li><li>Systems are opaque, making them very hard to debug<\/li><li>Performance cannot be audited or guaranteed at the \u2018long tail\u2019<\/li><li>They encode correlation, not causation or ontological relationships<\/li><li>Do not encode entities or spatial relationships between entities<\/li><li>Only handle very narrow aspects of natural language<\/li><li>Not well suited for high-level, symbolic reasoning or planning<\/li><\/ul>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ba20e77 elementor-widget elementor-widget-text-editor\" data-id=\"ba20e77\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"351b\">All that being said, machine learning and artificial intelligence will continue to revolutionize industry and will only become more prevalent in the coming years. 
Whilst I recommend you utilize machine learning and AI to their fullest extent, I also recommend that you remember the limitations of the tools you use \u2014 after all, nothing is perfect.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>Machine learning is now seen as a silver bullet for solving all problems, but sometimes it is not the answer. There are times when machine learning is the right solution, and times when it is the wrong solution.<\/p>\n","protected":false},"author":682,"featured_media":18195,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[183],"tags":[97,1135,92],"ppma_author":[3471],"class_list":["post-22509","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-ml","tag-artificial-intelligence","tag-ethics","tag-machine-learning"],"authors":[{"term_id":3471,"user_id":682,"is_guest":0,"slug":"matthew-stewart","display_name":"Matthew Stewart","avatar_url":"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2020\/04\/medium_c57055f3-5301-4262-af65-4cc7d40cbf3d-150x150.jpg","author_category":"","user_url":"https:\/\/criticalfutureglobal.com\/","last_name":"Stewart","first_name":"Matthew","job_title":"","description":"Matthew Stewart is a Machine Learning consultant on AI for\u00a0<a href=\"https:\/\/www.criticalfutureglobal.com\/\" target=\"_blank\" rel=\"noopener\">Critical Future<\/a>, and machine learning engineer at Scalable Magic, an AI-based digital media startup. He is also a Graduate Teaching Assistant and a Ph.D. 
Candidate at Harvard University."}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/22509","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/682"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=22509"}],"version-history":[{"count":0,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/22509\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/18195"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=22509"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=22509"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=22509"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=22509"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}