$99.00
Certification

An industry-recognized certification lets you add this credential to your resume upon completion of all courses

Instructor
Ellie Ordway

Ellie Ordway-West is a physicist and professional data scientist. She holds a Master of Science degree in Physics from the University of Missouri-St. Louis. She has deep expertise in Apache Pig and has built several data science pipelines in which all data preprocessing and aggregation is done in Apache Pig. Currently, she uses machine learning algorithms to inform data-driven decisions at one of the largest telecom companies in the world.

An In-depth Training for Apache Pig

  • Unlimited access to online self-paced videos
  • Taught by a data scientist at one of the largest telecom companies in the world
  • Coding exercises, with over 40 quizzes and programming exercises

Duration: 2h 15m

Course Description

This course is a general overview of the Apache Pig framework. It introduces the structure and methodologies of Apache Pig and gives an overview of Pig Latin, the language of Apache Pig. No prior knowledge of Pig or Pig Latin is assumed, but familiarity with another programming language, such as Python, may be helpful. The course includes interactive tutorials for processing and aggregating data with Apache Pig. It covers much of the functionality built into the language, as well as how to incorporate user-defined functions into Pig scripts to extend them further. By the end, you should be able to read and understand Pig code and write your own scripts, which you can run in the interactive Grunt shell or directly from the command line.

What am I going to get from this course?

  • Process and aggregate data with Apache Pig

Prerequisites and Target Audience

What will students need to know or do before starting this course?

  • Some familiarity with Hadoop and MapReduce is helpful, but not necessary.

Who should take this course? Who should not?

  • Anyone who will be working with large data sets, including data engineers, data scientists, and developers

Curriculum

Module 1: Introduction

02:28
Lecture 1 Intro and Overview
01:37
Lecture 2 About the Instructor
00:51

Module 2: What is Pig?

01:52
Lecture 3 So what is Pig anyway?
01:08
Quiz 1
Lecture 4 Why It's Called Pig
00:44
Quiz 2

Module 3: Data Types

03:20
Lecture 5 Basic Types 1
00:27
Quiz 3 Basic Data Types Q1
Lecture 6 Basic Types 2
00:27
Quiz 4 Basic Data Types Q2
Lecture 7 Non Basic Types
01:51
Quiz 5 Non Basic Types Quiz
Lecture 8 Nulls vs Empty
00:35

Module 4: Getting Started with Pig

12:49
Lecture 9 Introduction to the Data
01:29
Lecture 10 Getting Hadoop
01:16
Quiz 6 Setting up Hadoop

If you don't have access to a Hadoop environment, download and set up a sandbox now.

Lecture 11 Starting Hadoop and moving data
07:15
Quiz 7 Start Hadoop and Move Data

Lecture 12 Three Ways to Run Pig Commands
00:47
Lecture 13 Utility Commands: Help and Quit
00:52

Quiz 8 Try it out: Help and Quit
Lecture 14 Common Development Environments
01:10

Module 5: Basic Elements of a Pig Script

14:55
Lecture 15 Pig Latin Statements
01:39
Lecture 16 Load Data
01:17
Quiz 9 Load Data Quiz
Lecture 17 Store/dump Data
02:12
Quiz 10 Store/Dump quiz
Lecture 18 Setting up Sublime Text
00:47
Quiz 11 Set up Sublime Text Exercise
Lecture 19 Load Data Example
03:29
Quiz 12 Load Data Exercise
Lecture 20 Store/dump Example
04:41
Lecture 21 Quick Note about Pig Logs
00:50

Module 6: Relational Operators

58:49
Lecture 22 Describe
00:58
Quiz 13 Describe Exercise

Lecture 23 Limit and Sample
03:28
Lecture 24 Group
00:44
Lecture 25 Foreach
03:33
Quiz 14 Group Exercise
Lecture 26 Flatten
01:45
Lecture 27 Join
08:27
Quiz 15 Join Exercise
Lecture 28 Disambiguation
06:14
Quiz 16 Disambiguation Exercise
Lecture 29 Union
03:52
Lecture 30 Cogroup
02:48
Lecture 31 Distinct
02:37
Lecture 32 Cross
01:28
Lecture 33 Filter
03:32
Quiz 17 Filter Exercise
Lecture 34 Split
02:47
Quiz 18 Split Exercise
Lecture 35 Conditional Statements
02:33
Lecture 36 Order
03:35
Quiz 19 Order By Exercise
Lecture 37 Rank
06:57
Lecture 38 Nested Foreach
03:31
Quiz 20 Nested ForEach Exercise

Module 7: Built In Functions

33:56
Lecture 39 Intro
01:17
Lecture 40 Eval Functions
06:31
Lecture 41 Eval Functions 2
01:37
Quiz 21 Eval Functions Exercise
Lecture 42 Arithmetic Functions
04:04
Quiz 22 Arithmetic Functions Exercise
Lecture 43 Datetime Functions
10:18
Lecture 44 String Functions
06:01
Quiz 23 String Functions Exercise
Lecture 45 Tuple/map/bag
01:22
Lecture 46 User Defined Functions
02:46

Module 8: Configuring Pig

07:01
Lecture 47 Part 3 Intro
00:19
Lecture 48 Parametrization
04:31
Lecture 49 Utility Commands
02:11

Reviews

8 Reviews

Abel C

December, 2016

What a kick-ass intro to Apache Pig and its language Pig Latin! If you are working with large data sets, then this course will be very useful. Lots of quizzes and programming tasks make this course very approachable. I was able to learn a lot in a short time. If you are looking to get started with Pig, take this course. Great course overall. Good number of examples to help you master the subject matter. Four and a half-stars!

Kevin L

May, 2017

As a big data scientist, I find this is an excellent course to learn from, especially the unlimited access to online self-paced videos. Equally, it is a collaborative learning experience with coding exercises and many quizzes and programming exercises. It helped me to understand writing and coding Pig scripts and increasing their functionality. As a big data scientist, this course helped me as my knowledge was a decade old, and helped me understand current developments. It helped me work very well with large data sets.

Anna D

May, 2017

This is really an in-depth training program for learning Apache Pig. Indeed, as promised by the instructor in the tutorial, I was able to read and understand Pig code and write my own scripts.

Jake H

May, 2017

I find the interactive tutorials an excellent idea. They helped me grasp the material very well, including incorporating user-defined functions into Pig scripts. It is an invaluable experience.

Murali K

July, 2017

This is undoubtedly an intelligent instructional curriculum for training in Apache Pig. As a data research associate, I’m so glad I ran into this remarkable course. It encouraged me to work quite effectively with large data sets.

Errol T

July, 2017

Awesome course. Easy to figure out. Thank you for making this course! The subject was well organized, the demonstrations were great and there was a good balance between theory and real problems that you had to deal with yourself.

Bruno M

July, 2017

Indeed, as vouched for by the teacher, I was capable of understanding and interpreting Pig code and formulating my personal scripts. I picked up the tutorials very well, including incorporating user-defined functions into Pig scripts. It’s valuable knowledge.

Naveen K

July, 2017

Tutorials were designed well. As a data research worker, this course of study helped as my knowledge was old, and I learned about contemporary developments. Correspondingly, it is collaborative training with coding practices and many questions and programming activities. It pushed me to learn to write and code Pig scripts and develop their functionality.