Domain Expertise vs Machine Learning Debate

When I’m getting ready to reason with a man, I spend one-third of my time thinking about myself and what I am going to say and two-thirds thinking about him and what he is going to say. —Abraham Lincoln (1809-1865)

If human beings had to reason with a machine or, more specifically, had to teach machines to reason, would Lincoln’s formula still be relevant?

Framing the debate on the superiority of machine learning vs. domain expertise

This debate has gone on for quite some time. It was even rehearsed by several luminaries of the data science world at a Strata conference in 2012, in a session that was captured on video.

With the rapid progress in the field of artificial intelligence, many people are apprehensive about machines overtaking human thinking. If one extends that argument to the expert systems of data science, one can easily see why some people expect machine learning to supplant domain knowledge.

But the real question is not whether data scientists require domain knowledge to build expert systems, but whether the data representation phase can be carried out accurately without involving domain experts. Domain experts are presumed to be far more capable of identifying, articulating, and demonstrating day-to-day process problems in business. Since these experts are the ones who can properly explain a research problem to their peers, it is probably absurd to even consider constructing an expert system without their involvement or guidance. The same should hold true for the superior algorithms required to create such systems.

Let us take the example of creating a machine learning system that can distinguish spam messages from non-spam messages using algorithmic filters. In this case, the email users, including the data scientists themselves, are domain experts, since they use the email system every day. To build an expert anti-spam application, developers would be expected to work closely with the user community to understand their spam-related problems. Here the developers themselves may be sub-domain experts within the broad category of mail-domain experts (users).

Thus, we realize that the fundamental strategy behind designing an expert system lies in identifying patterns in usage data and then deriving general rules or principles from those patterns. This principle of generalization refers to the ability of an expert system to perform new tasks after having learned from a training data set.

So here we establish the first criterion for performance: performing tasks based on prior experience with data. We may even conclude that machine learning operates from the standpoint of prediction, based on known properties discovered from the training data.
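The idea of learning patterns from labelled messages and then handling unseen ones can be illustrated with a minimal sketch. It assumes a simple word-count (naive Bayes style) filter, which is only one of many possible approaches and is not a method the discussion above prescribes. The training messages, the labels "spam" and "ham", and the function names `train` and `classify` are all invented for illustration.

```python
import math
from collections import Counter

# Hypothetical, hand-labelled messages (invented for this illustration).
TRAINING = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("limited offer win cash now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
    ("lunch on monday with the team", "ham"),
]


def train(examples):
    """Count words per label; these counts are the patterns learned from data."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the label with the higher log-probability score for the text."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        # Start from the prior: how common this label was in training.
        score = math.log(label_counts[label] / total)
        n_words = sum(word_counts[label].values())
        for word in text.split():
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAINING)
# Neither message below appears in the training data.
print(classify("claim your free cash prize", model))  # prints: spam
print(classify("project meeting on monday", model))   # prints: ham
```

Both test messages are new to the model, so the predictions rest entirely on word patterns generalized from the training set. What the sketch cannot supply is the labelling itself: deciding which messages count as spam is exactly where the users, as domain experts, come in.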

Domain experts usually have an in-depth knowledge of operational processes or tasks and also understand the rules of thumb that control the domain. Domain knowledge is gained from actual, practical experience. In reality, data scientists attempt to convert this practical knowledge into meaningful algorithms to automate processing tasks.

Thus data scientists and domain experts are the two complementary sides of a complete system development project. In developing expert systems, it is not enough for data scientists to ask questions or find patterns; they also need to understand the results.

An example from the field of Molecular Biology

Machine learning may be necessary to research the human cell, but it cannot be the starting point. Data scientists must therefore collaborate with molecular biologists to understand the complexity of cell behavior under different conditions before they analyze or process the findings of their research. The biologist is equipped with a priori knowledge, known as domain expertise, that provides accurate insights for monitoring and analyzing the results of an experiment or a series of experiments.

In ideal situations, machine learning will be used not just to predict the behavior of complex processes or organisms but also to help intellectual communities understand the reasons behind that behavior.

The flip side of the argument: Is domain knowledge really necessary?

An example of a competition in a crowd-sourced environment

In a Kaggle competition, a panel of space agencies invited participants to write algorithms for studying the effect of dark matter on images of space. The winner happened to be a student of glaciology, which suggests that domain knowledge was not essential to winning this competition.

Results like this routinely raise the question: if data scientists can develop such fine algorithms in any discipline or business field, then who needs domain experts?

If we now compare the two opposing viewpoints presented above, the general consensus is likely to favor a compromise: a middle ground between machine learning and domain expertise.

Domain experts still play a critical role in helping data scientists understand and articulate the business or process problem and interpret the results. As a case in point, an economist may create the best algorithm to automatically grade SAT answer papers, but education experts must be involved in designing the grading criteria and the sample questions and answers. These educators are the only people proficient in interpreting the output.

Let’s compare the relative strengths and weaknesses of domain experts and data scientists.

Machine Learning Experts/Data Scientists: Pros

  1. Can ask questions without understanding processes or tasks
  2. Can study data to discover repetitive patterns
  3. Can reconstruct process knowledge by studying data
  4. Can use data patterns to predict results

Machine Learning Experts/Data Scientists: Cons

  1. Cannot analyze the existing models of business processes accurately
  2. Have the potential to misuse models
  3. Lack depth of understanding of business functions

Domain Experts: Pros

  1. Can provide practical insights from past experience
  2. Can help refine a question with practical knowledge
  3. Can accurately shape or model tasks for analysis
  4. Can guide analytics in the right direction
  5. Can evaluate the effectiveness of a result

A domain expert’s strength lies in close observation of day-to-day process problems, while a data scientist’s strength is building generalized solutions in the form of algorithms by studying specific data patterns.

After reviewing the previous arguments, we can take the more balanced view that data scientists and domain experts need to work in harmony to solve business process problems accurately. A final example helps show the need for collaboration.

An example: Learning Management Systems (LMS)

A critical first step in developing a learning system is gathering information about potential learners’ competency in specific skills or tasks. The objective behind collecting this information is to estimate the scope of the learning system in terms of topic or task coverage. Quite often, the approach is back-to-front: the learning outcomes and performance tests are defined before the learning content is created.

While trying to create an automated system that claims to help the learner achieve the desired performance objectives, machine learning experts have to collaborate with education experts, subject-area experts, and learning and development experts to design an effective, automated learning system. This example helps us see the value of cross-domain experts in developing learning systems.
