Using Machine Learning to Detect Tax Fraud

Deborah Pianko
May 2, 2019 AI & Machine Learning

The terms “artificial intelligence” and “machine learning” immediately bring to mind movies like “The Matrix,” where machines become self-aware and want to end the world. While this may make for an exciting plot in Hollywood, it is not a reality outside of the theater.

In real life, however, machine learning, which gives computers the ability to see hidden patterns in existing data and progressively improve performance (“learn”) without being explicitly programmed, serves as a practical tool for data analysts. The job is not to turn robots into people but to efficiently find recurring themes that would otherwise remain obscured inside large amounts of data, so that end users get actionable information.

These technologies have played a pivotal role in reducing fraud, waste, and abuse in organizations of all types and sizes, including departments of revenue that collect taxes. As tax season ends, tax agencies can especially benefit from the use of several different machine learning techniques to improve upon their current levels of fraud detection. Yet the question remains, “Which technique is best?” 

Supervised Learning

At SAS, we say the best technique is multiple techniques. And this is even more true in the world of fraud detection, where traditional business rules and various types of machine learning are layered upon one another so that the whole is greater than the sum of its parts. One of the most popular types of machine learning is called supervised learning. Also referred to as predictive modeling, this enables tax agencies to use all the fraud and audit cases they’ve worked in the past to figure out which attributes of these cases are most highly correlated with a successful case. They may then use this “FBI’s Most Wanted” sketch to automatically search for similar cases in the future.

Supervised learning techniques like predictive modeling are used by tax agencies that want to find more fish in holes they have fished previously. For this reason, they are used heavily for identity theft detection and for audit selection in previously audited taxpayer segments. The machine must therefore have prior cases from which to learn. That case data must also include failures (e.g., no-change audits) so the machine learns which characteristics were in play when a lead turned out to be bad.
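To make the idea concrete, here is a minimal sketch of learning from past audit cases, including no-change audits. Everything in it is an assumption for illustration: the two attributes, the sample cases, and the choice of a small logistic regression are hypothetical, not a description of any agency's or vendor's actual model.

```python
import math

# Hypothetical past audit cases: (attributes, outcome).
# Attributes: [deduction-to-income ratio, 1.0 if amounts are round numbers]
# Outcome: 1 = audit produced an adjustment, 0 = no-change audit.
PAST_CASES = [
    ([0.60, 1.0], 1),
    ([0.55, 1.0], 1),
    ([0.50, 0.0], 1),
    ([0.45, 1.0], 1),
    ([0.10, 0.0], 0),
    ([0.15, 0.0], 0),
    ([0.20, 1.0], 0),
    ([0.05, 0.0], 0),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(cases, epochs=2000, learning_rate=0.5):
    """Fit a logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(cases[0][0])
    bias = 0.0
    for _ in range(epochs):
        for attributes, outcome in cases:
            z = bias + sum(w * x for w, x in zip(weights, attributes))
            error = sigmoid(z) - outcome
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, attributes)]
            bias -= learning_rate * error
    return weights, bias


def score(model, attributes):
    """Return an estimated probability that an audit would find an adjustment."""
    weights, bias = model
    return sigmoid(bias + sum(w * x for w, x in zip(weights, attributes)))


model = train(PAST_CASES)
risky = score(model, [0.58, 1.0])    # resembles past successful cases
routine = score(model, [0.08, 0.0])  # resembles past no-change audits
```

Without the no-change audits in `PAST_CASES`, the model would have nothing to contrast successful cases against, which is why the failures matter as much as the wins.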

Machine learning not only saves time on building a fraud detection routine but, if done properly, can also remove bias against certain taxpayers. Auditors traditionally rely on gut instincts about fraud to build a set of “if/then” rules to find taxpayers to audit. While this method effectively harnesses the critical eye of an experienced auditor, it typically misses subtle clues hidden in the data and can lead to overfishing in traditionally lucrative parts of the pond.

The Value of Unsupervised Learning

If supervised learning is fishing where people have fished before, then another type of machine learning—unsupervised learning—is fishing where no one has fished before. While supervised learning improves the number of fish caught in known areas, taxpayer segments with high rates of non-compliance remain undiscovered. 

Unsupervised machine learning is used when prior case data is not available and the tax agency doesn’t necessarily set out knowing what they’re looking for (hence the term “unsupervised”). The question they’re trying to answer is: What don’t I know about yet? This technique allows the machine to go on an unsupervised walkabout without having been previously exposed to your data and bring your attention to anything that seems out of the ordinary. 

One of these techniques is called “clustering.” The machine automatically puts all tax returns into groups of similar returns, or clusters, and then identifies returns falling outside these clusters as outliers that require additional investigation.
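The clustering idea can be sketched in a few lines. This is an illustrative example under stated assumptions: the returns, the two starting centroids, and the "three times the median distance" cutoff are all hypothetical choices, and k-means is only one of many clustering methods an agency might use.

```python
import math
from statistics import median

# Hypothetical returns as (income, deductions), both in thousands.
RETURNS = [
    (40, 4), (90, 15), (42, 5), (38, 3), (41, 4),
    (88, 14), (92, 16), (91, 15), (45, 30),
]


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign points to centroids, then recompute the means."""
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        centroids = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else old
            for g, old in zip(groups, centroids)
        ]
    return centroids


# Assumption: start from two returns that look like different segments.
centroids = kmeans(RETURNS, [RETURNS[0], RETURNS[1]])

# Distance from each return to its own cluster centre.
distances = [math.dist(p, centroids[nearest(p, centroids)]) for p in RETURNS]

# Assumed cutoff: flag anything more than 3x the median distance away.
cutoff = 3 * median(distances)
outliers = [p for p, d in zip(RETURNS, distances) if d > cutoff]
print(outliers)  # [(45, 30)]
```

In practice the attributes would be scaled to comparable ranges first, so that one large-valued field such as income does not dominate the distance calculation.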

Both supervised and unsupervised approaches provide tremendous value for government tax authorities, especially when applied to complex data sets like tax returns, financial transactions, taxpayer contacts, accounts receivable, network traffic, and even employee activities.

Changing with the Times

Machine learning can provide departments of revenue with immediate benefits in reducing fraud and abuse; however, there is still room to improve the models. Governments want to improve tax models by creating a “feedback loop,” in which the machine learning environment uses new incoming data to constantly adjust the attributes and weights of the “FBI’s Most Wanted” sketch in real time.
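One way to picture such a feedback loop is a model whose weights are nudged every time a new case outcome arrives, rather than being retrained from scratch. The sketch below assumes a simple online logistic model; the class name, attributes, and learning rate are hypothetical and chosen only for illustration.

```python
import math


class OnlineRiskModel:
    """Logistic risk score whose weights are updated one case at a time."""

    def __init__(self, n_attributes, learning_rate=0.1):
        self.weights = [0.0] * n_attributes
        self.bias = 0.0
        self.learning_rate = learning_rate

    def score(self, attributes):
        z = self.bias + sum(w * x for w, x in zip(self.weights, attributes))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, attributes, outcome):
        """Feed back one closed case: outcome is 1 (fraud) or 0 (no change)."""
        error = self.score(attributes) - outcome
        self.weights = [w - self.learning_rate * error * x
                        for w, x in zip(self.weights, attributes)]
        self.bias -= self.learning_rate * error


model = OnlineRiskModel(n_attributes=2)
pattern = [1.0, 0.0]  # hypothetical attribute pattern seen in new cases

before = model.score(pattern)
for _ in range(20):
    model.update(pattern, outcome=1)  # new cases confirm this pattern as fraud
after = model.score(pattern)
```

Each confirmed case shifts the weights slightly, so the score for that pattern rises as evidence accumulates instead of waiting for a periodic rebuild.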

Machines can also be configured to automatically alert users that their current predictive models have degraded in accuracy, meaning that different parts may need to be reconfigured to get the most accurate answers.
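A degradation alert of this kind can be as simple as tracking how often recent predictions matched the eventual case outcome. The following is a minimal sketch; the window size, baseline accuracy, and tolerance are assumed values that a real deployment would have to choose for itself.

```python
from collections import deque


class AccuracyMonitor:
    """Tracks recent prediction accuracy and flags a drop below baseline."""

    def __init__(self, window=50, baseline=0.80, tolerance=0.10):
        self.results = deque(maxlen=window)  # keeps only the latest outcomes
        self.baseline = baseline
        self.tolerance = tolerance

    def record(self, predicted, actual):
        self.results.append(predicted == actual)

    def accuracy(self):
        if not self.results:
            return None
        return sum(self.results) / len(self.results)

    def degraded(self):
        """True once the window is full and accuracy has fallen too far."""
        if len(self.results) < self.results.maxlen:
            return False
        return self.accuracy() < self.baseline - self.tolerance


monitor = AccuracyMonitor(window=4, baseline=0.75, tolerance=0.10)
for _ in range(4):
    monitor.record(predicted=1, actual=1)  # model agrees with audit results
healthy = monitor.degraded()

for _ in range(2):
    monitor.record(predicted=1, actual=0)  # leads start turning out bad
alert = monitor.degraded()
```

Here `alert` becomes true once half of the last four predictions are wrong, which would be the cue to revisit the model.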

Machine learning can be a difficult concept for some to understand. Yet the tax system has remained rife with fraud and abuse despite the best efforts of tax auditors. With each change of the tax code, there will be new fraudsters looking for loopholes and blind spots to exploit. While machine learning will not catch every criminal, these capabilities provide auditors with a valuable and powerful tool to reduce the amount of money stolen.

That money belongs to the people and should be spent on improving government programs. It does not belong to thieves who found a hole in the system. Machine learning can help close these gaps.
