A Short History Of Big Data

Mark Rijmenam
June 7, 2019 · Big Data, Cloud & DevOps

Ninety percent of all available data has been created in the last two years, and the term Big Data has been around since 2005, when it was launched by O’Reilly Media. However, the use of data, and the need to understand all of it, goes back much further.


In fact, the earliest records of using data to track and control businesses date back some 7,000 years, to when accounting was introduced in Mesopotamia to record the growth of crops and herds. Accounting principles continued to improve, and in 1663 John Graunt recorded and examined all available information about mortality rolls in London. He wanted to build an understanding of, and a warning system for, the ongoing bubonic plague. In what is considered the first recorded statistical analysis of data, he gathered his findings in the book Natural and Political Observations Made upon the Bills of Mortality, which provides great insight into the causes of death in the seventeenth century. Because of this work, Graunt can be considered the father of statistics. From then on, accounting principles kept improving, but nothing spectacular happened until the information age began in the 20th century. The earliest milestone of modern data processing dates from 1887, when Herman Hollerith invented a tabulating machine that read holes punched into paper cards in order to organize census data.

The 20th Century

The first major data project was launched in 1937, commissioned by Franklin D. Roosevelt’s administration in the USA. After the Social Security Act became law in 1935, the government had to keep track of contributions from 26 million Americans and more than 3 million employers. IBM won the contract to develop a punch-card-reading machine for this massive bookkeeping project.

The first data-processing machine appeared in 1943 and was developed by the British to decipher Nazi codes during World War II. This device, named Colossus, searched for patterns in intercepted messages at a rate of 5,000 characters per second, reducing a task that had taken weeks to merely hours.

In 1952 the National Security Agency (NSA) was created, and within ten years it had contracted more than 12,000 cryptologists. During the Cold War they were confronted with information overload as they began collecting and processing intelligence signals automatically.

In 1965 the United States government decided to build the first data centre, to store over 742 million tax returns and 175 million sets of fingerprints by transferring all those records onto magnetic computer tape held in a single location. The project was later dropped out of fear of ‘Big Brother’, but it is generally accepted as the beginning of the electronic data storage era.

In 1989 British computer scientist Tim Berners-Lee invented what would eventually become the World Wide Web. He wanted to facilitate the sharing of information via a ‘hypertext’ system. Little did he know at the time what impact his invention would have.

From the 1990s onward, the creation of data accelerated as more and more devices were connected to the internet. In 1995 a new supercomputer was built that could do as much work in a second as a calculator operated by a single person could do in 30,000 years.

The 21st Century

In 2005 Roger Mougalas of O’Reilly Media coined the term Big Data, only a year after the company popularized the term Web 2.0. It refers to sets of data that are almost impossible to manage and process using traditional business intelligence tools.

2005 is also the year that Hadoop was created at Yahoo!, built on Google’s MapReduce programming model. Its original goal was to index the entire World Wide Web; nowadays the open-source Hadoop framework is used by many organizations to crunch through huge amounts of data.
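For readers unfamiliar with the model, here is a minimal, single-process sketch of the map-shuffle-reduce pattern in plain Python. It is a toy illustration of the idea only, not Hadoop’s actual API (which is Java-based and distributes these phases across a cluster), and the function names are illustrative:

```python
from collections import defaultdict

# Toy illustration of the MapReduce pattern: count words across "documents".
# Hadoop distributes these same three phases over many machines; here
# everything runs in one process.

def map_phase(doc):
    # Emit a (word, 1) pair for every word in the document.
    for word in doc.lower().split():
        yield word, 1

def shuffle(pairs):
    # Group all emitted values by key, as the framework does between phases.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    # Combine all counts for one word into a total.
    return key, sum(values)

docs = ["big data is big", "data beats opinions"]
pairs = [pair for doc in docs for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)  # {'big': 2, 'data': 2, 'is': 1, 'beats': 1, 'opinions': 1}
```

The appeal of the pattern is that the map and reduce steps are independent per document and per key, which is what lets a framework like Hadoop spread the work across thousands of machines.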

As more and more social networks appeared and Web 2.0 took flight, ever more data was created on a daily basis. Innovative startups slowly began to dig into this massive amount of data, and governments started working on Big Data projects as well. In 2009 the Indian government decided to take an iris scan, fingerprints, and a photograph of every one of its 1.2 billion inhabitants, storing all this data in the largest biometric database in the world.

In 2010 Eric Schmidt spoke at the Techonomy conference in Lake Tahoe, California, stating that “there were 5 exabytes of information created by the entire world between the dawn of civilization and 2003. Now that same amount is created every two days.”

In 2011 McKinsey’s report Big Data: The next frontier for innovation, competition, and productivity stated that by 2018 the USA alone would face a shortage of 140,000 to 190,000 data scientists, as well as 1.5 million data managers.

In the past few years there has been a massive increase in Big Data startups, all trying to help organizations understand and act on Big Data, and more and more companies are slowly adopting it. However, while it may look as though Big Data has been around for a long time already, it is really only as far along as the internet was in 1993. The great Big Data revolution is still ahead of us, so a lot will change in the coming years. Let the Big Data era begin!

