Is Python Becoming the King of the Data Science Forest?

Cameron Turner
October 21, 2016 Future of Work
R has long served as the de facto tool for big data analytics. According to RedMonk’s bi-annual rankings of the top 20 programming languages, as measured by activity on StackOverflow and in GitHub repositories, R is ranked #15 among all programming languages. This ranking is both surprising and impressive for a domain-specific language. Interestingly, Python sits near the top of the list, among heavyweights such as Java, JavaScript and PHP that are used for general-purpose web programming. Lesser-known languages such as Julia are also represented in the rankings, although not in the top-20 list. The plot for the first-quarter 2014 rankings is shown here.

Despite R’s apparent success, MongoDB’s Matt Asay has argued that while R was once the language of choice for data scientists, it is quickly ceding ground to Python. One reason offered for the perceived decline in R’s popularity is its complex programming environment, which requires special training. According to Robert Muenchen at the University of Tennessee, R remains a tough language to master even for data scientists who possess expertise in statistical tools such as SAS, SPSS and Stata. This is largely because R uses misleading function and parameter names. SAS, SPSS and Stata use the sort command to sort data sets; R has a command of the same name, but it sorts individual variables rather than data sets. To sort a data set in R, one must use the order function, and even that works in a rather convoluted manner. In addition, R suffers from sparse, non-standard output, and it has too many commands to master. R also provides sloppy control over variables, and naming or renaming variables is an overly complex exercise, at least for the novice.
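For comparison, the same distinction between sorting one variable and sorting a whole data set can be sketched in Python. This is only an illustration using the standard library; the toy rows, names and values below are made up and do not come from Muenchen’s examples. In both cases a single built-in, sorted, does the work.

```python
from operator import itemgetter

# Hypothetical toy data set: each dict is one row (names and ages are made up).
rows = [
    {"name": "Ana", "age": 34},
    {"name": "Ben", "age": 27},
    {"name": "Cy", "age": 41},
]

# Sorting a single variable (roughly what R's sort() does to a vector):
# the ages come back in order, but detached from the rows they belong to.
ages_sorted = sorted(row["age"] for row in rows)

# Sorting the whole data set by one variable (the job R hands to order()):
# each row moves as a unit, so every name stays attached to its age.
rows_by_age = sorted(rows, key=itemgetter("age"))

print(ages_sorted)                            # [27, 34, 41]
print([row["name"] for row in rows_by_age])   # ['Ben', 'Ana', 'Cy']
```

In R, the second step is usually written by indexing the data frame with the result of order, which is the extra hop that newcomers from SAS, SPSS or Stata tend to find unintuitive.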

Python, on the other hand, is much easier to master, even though it may still be harder than other programming languages used to develop web applications. The fact that Python is already used to develop web applications is part of what makes it an attractive choice for data science. If you are struggling to find qualified data scientists, why not train your existing Python developers to work in your data science teams? Furthermore, given the wide applicability of the language, we are witnessing what Tal Yarkoni of UT Austin calls the Pythonification of tools that are appropriate for data science:

“The increasing homogenization (Pythonification?) of the tools I use on a regular basis primarily reflects the spectacular recent growth of the Python ecosystem. A few years ago, you couldn’t really do statistics in Python unless you wanted to spend most of your time pulling your hair out and wishing Python were more like R (which is a pretty remarkable confession considering what R is like). Neuroimaging data could be analyzed in SPM (MATLAB-based), FSL, or a variety of other packages, but there was no viable full-featured, free, open-source Python alternative. Packages for machine learning, natural language processing, and web application development were only just starting to emerge.

“These days, tools for almost every aspect of scientific computing are readily available in Python. And in a growing number of cases, they’re eating the competition’s lunch.”

While there is little doubt that Python is going to become a dominant language for data scientists, how is it faring against other languages of the web?

The growing popularity of Python is not surprising given its versatility. To be sure, R is still far more powerful when it comes to data analytics. Python is catching up, but does this really mean that its large following is going to supplant R? The chart above needs to be read with nuance because it compares apples and oranges. Charts like these are often used to make misguided arguments about R’s impending demise. So, how does demand for R compare with other statistical tools such as SAS?


This comparison adds nuance to the picture: while Python has significant traction, owing in part to its use in domains other than data science, demand for R is also on the rise, and R is not going to become obsolete anytime soon. R continues to enjoy popularity among academics.
We would love to hear how you are staffing your current teams and what role R and Python play in your environment.
See a follow-up post on this topic: Can Python Replace R for Developing Predictive Models?