Data Basics and Data Architecture

Ankit Rathi
June 25, 2018 Big Data, Cloud & DevOps


What is Data?

Data is a collection of facts (numbers, words, measurements, observations, etc.) that has been translated into a form that computers can process.

Data is a set of values of qualitative or quantitative variables. Pieces of data are individual pieces of information. While the concept of data is commonly associated with scientific research, data is collected by a huge range of organizations and institutions. ~ Wikipedia

Every object has attributes that can be described qualitatively or quantitatively. For example, if you go to a grocery store and look at the chocolates, each one has a type, brand, color, shape, weight and packaging; if we collect these attributes for all the chocolates in the store, that set of items with their attributes is data.

Data is mostly represented in tabular form, but it can take other structures, and some data has no fixed structure at all. We will discuss this in detail in coming posts.

Why is Data important?

Data has become important to everyone like never before because it lets us make informed decisions and improve operations. We can only improve the things and activities that we can measure, and whatever we measure is described in the form of data. Data about things and activities can be collected, processed and analysed to generate new insights, which can give us a competitive advantage. That is why data is becoming a game changer in almost every business.

How to exploit data?

Variables are measurements, characteristics or attributes of an item. Some are quantitative: we can measure the height of a person, or the amount of time a person stays on a website. Others are more qualitative: the places the person looks at on the website, or whether we think the visitor is a man or a woman.
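To make the distinction concrete, here is a minimal Python sketch that labels each variable in a tiny data set as quantitative or qualitative. The visit records, the field names and the `variable_kind` helper are all made up for illustration; the rule it applies (numbers are quantitative, everything else is qualitative) is a deliberate simplification.

```python
# Hypothetical website-visit records; the field names are illustrative only.
visits = [
    {"height_cm": 172.0, "seconds_on_site": 95, "section_viewed": "pricing", "guessed_gender": "woman"},
    {"height_cm": 181.5, "seconds_on_site": 40, "section_viewed": "home", "guessed_gender": "man"},
]


def variable_kind(values):
    """Return 'quantitative' if every value is a number, otherwise 'qualitative'."""
    all_numbers = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "quantitative" if all_numbers else "qualitative"


# Classify every variable (column) by looking at all of its values.
kinds = {
    name: variable_kind([row[name] for row in visits])
    for name in visits[0]
}

for name, kind in kinds.items():
    print(f"{name}: {kind}")
```

In practice the boundary is less clean (a postal code is stored as a number but behaves like a category), so treat this as a starting point rather than a rule.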

Data can be exploited by using appropriate mechanisms to collect relevant data, process it and analyze it to generate insights. Thanks to ever-evolving storage and computing techniques, it is now possible to use data to visualize, analyze and predict outcomes. These activities can give an immense competitive advantage in today's business world.

Case Study: Rathi Pizza Inc

So let's take a hypothetical case study to understand data basics throughout this series. Rathi Pizza Inc is a big restaurant chain known worldwide for making and selling delicious pizzas. What data is involved in a pizza? A pizza has attributes like pizza type, base type, toppings, size, price, etc.

pizza id  pizza type  base type     toppings   size    price
1         Pan         Cheese Burst  Pepperoni  Small   10
2         Greek       Hand Tossed   Mushrooms  Medium  20
3         Italian     Thin Crust    Onions     Large   30
4         Veg         Cheese Burst  Bacon      Small   10

In the above data set, we can see the values of different attributes for 4 pizzas. We can analyse this data to understand the distribution of these attributes and the relationships between them, which can help us generate further insights.
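As a rough sketch of what such an analysis could look like, the Python snippet below loads the four pizzas from the table, counts how many pizzas there are of each size (a distribution) and computes the average price per size (a relationship between two attributes). The variable names are my own, and the post does not state a currency for the price, so it is treated as a plain number.

```python
from collections import Counter, defaultdict
from statistics import mean

# The four pizzas from the table above, one dictionary per row.
pizzas = [
    {"pizza_id": 1, "pizza_type": "Pan", "base_type": "Cheese Burst", "toppings": "Pepperoni", "size": "Small", "price": 10},
    {"pizza_id": 2, "pizza_type": "Greek", "base_type": "Hand Tossed", "toppings": "Mushrooms", "size": "Medium", "price": 20},
    {"pizza_id": 3, "pizza_type": "Italian", "base_type": "Thin Crust", "toppings": "Onions", "size": "Large", "price": 30},
    {"pizza_id": 4, "pizza_type": "Veg", "base_type": "Cheese Burst", "toppings": "Bacon", "size": "Small", "price": 10},
]

# Distribution of one attribute: how many pizzas of each size?
size_counts = Counter(p["size"] for p in pizzas)

# Relationship between two attributes: average price for each size.
prices_by_size = defaultdict(list)
for p in pizzas:
    prices_by_size[p["size"]].append(p["price"])
avg_price_by_size = {size: mean(prices) for size, prices in prices_by_size.items()}

print(size_counts)
print(avg_price_by_size)
```

With only four rows the result is obvious by eye, but the same two steps (count one attribute, then compare it against another) scale to the data of a whole restaurant chain.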

What is Data Architecture?

Data architecture is a set of rules, policies, standards and models that govern and define the type of data collected and how it is used, stored, managed and integrated within an organization and its database systems.

Data Architecture (the thing) is the way in which information flows around the organisation. What is plumbed from where to where. The picture of pipes at the top of the page is there for a reason.

Data Architecture (the discipline) is the effort to control it – the design, the models, policies, rules, standards, etc. Anything that designs the pipework and tries to get the contents (the data) to the right place at the right time.

Data Architecture is as much a business decision as it is a technical one, as new business models and entirely new ways of working are driven by data and information.

The Data Architect is the person who practices the discipline in order to control the thing.

Why is Data Architecture required?

If you want to leverage and operationalize data proactively, you need to invest in your underlying data architecture and compile the information map for your organization. Data quality is more important now than ever before, and data should be categorized and correlated to validate that it is meaningful to the business.

A solid information architecture will also set up your foundation for a data governance program. You have to know what the data is and assign business meaning to it, with the proper terminology. You can define what information is considered sensitive, and run audits against it.
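One way to picture this is a small data catalog that records the business meaning of each column and whether it is sensitive, plus an audit that checks a proposed data export against it. This is only a sketch under my own assumptions: the `CatalogEntry` class, the `audit_export` function and the column names are hypothetical and do not come from any particular governance tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """One documented column: what it means to the business and whether it is sensitive."""
    column: str
    business_meaning: str
    sensitive: bool


# Hypothetical catalog for a pizza business.
catalog = [
    CatalogEntry("customer_email", "Contact address for order receipts", True),
    CatalogEntry("pizza_type", "Style of pizza ordered", False),
    CatalogEntry("price", "Menu price of the pizza", False),
]


def audit_export(exported_columns, catalog):
    """Report sensitive columns in an export and columns that nobody has documented."""
    known = {entry.column: entry for entry in catalog}
    sensitive = sorted(c for c in exported_columns if c in known and known[c].sensitive)
    undocumented = sorted(c for c in exported_columns if c not in known)
    return {"sensitive": sensitive, "undocumented": undocumented}


findings = audit_export(["pizza_type", "customer_email", "delivery_notes"], catalog)
print(findings)
```

The audit can only flag what the catalog knows about, which is why assigning business meaning to the data has to come first.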

In the age of Big Data, the ability to visually model and map out all of the data from these sources, and track data lineage between them, can help you understand the information in the organization and build quality into the data process. To effectively assemble and utilize the information, you need a business-driven data architecture design.
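Data lineage can be thought of as a graph that records, for each data set, which data sets it was derived from. The sketch below walks such a graph to find everything a report ultimately depends on. The data set names and the `upstream_sources` helper are invented for illustration, and real lineage tools track far more detail (columns, transformations, timestamps).

```python
# Hypothetical lineage map: each data set lists the data sets it is built from.
lineage = {
    "sales_dashboard": ["daily_sales"],
    "daily_sales": ["orders", "menu"],
    "orders": ["pos_system"],
    "menu": [],
    "pos_system": [],
}


def upstream_sources(dataset, lineage):
    """Return every data set that `dataset` depends on, directly or indirectly."""
    seen = set()
    stack = list(lineage.get(dataset, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited; also protects against cycles
        seen.add(current)
        stack.extend(lineage.get(current, []))
    return seen


print(sorted(upstream_sources("sales_dashboard", lineage)))
```

If a number on the dashboard looks wrong, this list is where you would start looking for the cause.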

How to define & build Data Architecture?

To create a data architecture, one first has to define the business's information needs and then build the following views.

Building Contextual View: The contextual view graphically describes the interaction of the system with the various entities in its environment. The interactions consist of data flows from and to such entities. The contextual view clarifies the boundary of the system.

Building Conceptual View: Conceptual view is a high-level description of a business's informational needs. It typically includes only the main concepts and the main relationships among them.

Building Logical View: The logical view is a model of a specific problem domain expressed independently of a particular technology or product, but in terms of data structures such as relational tables and columns, object-oriented classes, or XML tags.

Building Physical View: Physical view is how and where the information resides. The physical view is a technical description of the implementation of the logical view.
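The difference between the logical and the physical view is easier to see side by side. In the sketch below, a Python dataclass stands in for the logical view of a pizza (its attributes, with no storage technology implied), and a SQLite table stands in for one possible physical view. Choosing SQLite, the column types and the `Pizza` class name are my assumptions for illustration; the same logical view could be implemented on any other database.

```python
import sqlite3
from dataclasses import astuple, dataclass


# Logical view: the entity and its attributes, independent of any product.
@dataclass
class Pizza:
    pizza_id: int
    pizza_type: str
    base_type: str
    toppings: str
    size: str
    price: int


# Physical view: one concrete implementation, here a SQLite table.
PIZZA_DDL = """
CREATE TABLE pizza (
    pizza_id   INTEGER PRIMARY KEY,
    pizza_type TEXT    NOT NULL,
    base_type  TEXT    NOT NULL,
    toppings   TEXT    NOT NULL,
    size       TEXT    NOT NULL,
    price      INTEGER NOT NULL
)
"""

conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
conn.execute(PIZZA_DDL)

# Store one logical object in the physical table and read part of it back.
pizza = Pizza(1, "Pan", "Cheese Burst", "Pepperoni", "Small", 10)
conn.execute("INSERT INTO pizza VALUES (?, ?, ?, ?, ?, ?)", astuple(pizza))
stored = conn.execute("SELECT pizza_type, size, price FROM pizza").fetchall()
print(stored)
```

Swapping SQLite for another store would change the second half of the snippet but leave the `Pizza` class untouched, which is exactly the separation the two views are meant to give you.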

Case Study: Rathi Pizza Inc

Let's get back to our case study and apply what we have learnt. We need to manage our company's data, and data architecture is an integral part of data management. First, we will build the contextual view to identify our architecture's boundary and the external systems it will interact with. Then we will build the conceptual view of our data to identify the major entities and their relationships. Once this is done, we will move to the logical view to identify the attributes of the entities and decide how to group or regroup these attributes into entities to serve our purpose. The last step is to build the physical view, where we look at the technology and infrastructure we need to support our data architecture.
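The first two steps can be sketched in a few lines of Python before any diagramming tool is involved. Everything named below is an assumption made for illustration: the post does not say which external parties or entities Rathi Pizza Inc actually has, so the customer, payment provider, supplier, order and store are placeholders.

```python
SYSTEM = "Rathi Pizza system"

# Contextual view: data flows between the system and the outside world
# as (source, destination, what flows). All parties here are hypothetical.
data_flows = [
    ("Customer", SYSTEM, "order request"),
    (SYSTEM, "Payment provider", "payment request"),
    ("Supplier", SYSTEM, "ingredient delivery notice"),
]

# Everything that appears in a flow but is not the system itself lies outside the boundary.
external_entities = sorted(
    {end for source, destination, _ in data_flows for end in (source, destination)} - {SYSTEM}
)

# Conceptual view: only the main entities and how they relate,
# as (entity, relationship, entity). No attributes yet.
relationships = [
    ("Customer", "places", "Order"),
    ("Order", "contains", "Pizza"),
    ("Store", "prepares", "Order"),
]
entities = sorted({e for subject, _, obj in relationships for e in (subject, obj)})

print("Outside the boundary:", external_entities)
print("Main entities:", entities)
```

The attributes from the pizza table (base type, toppings, size, price) deliberately do not appear here; they belong to the logical view, which comes next.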
