We oversee the transformation of candidates from training, through projects, to placement in full-time employment. The first eight weeks will be a blend of theory, practical labs, and keynote speakers. The final four weeks will consist of hands-on project work, during which students will have access to exclusive Experfy projects from real companies.
What am I going to get from this course?
Apply Deep Learning techniques, primarily using TensorFlow, to solve real-world business problems and extract insights and value from business data sets. Having completed eight weeks of coursework and four weeks of project work, candidates will be highly employable as Data Scientists.
Prerequisites and Target Audience
What will students need to know or do before starting this course?
Students should know basic Python programming, elementary statistics, and basic calculus and linear algebra. Any business experience is a bonus. A background in the physical sciences, computer science, or mathematics is ideal.
Who should take this course? Who should not?
Candidates should take this course if they have a good grounding in the physical sciences, or business experience as a software engineer, and are comfortable with statistics and/or mathematics. Candidates without a strong mathematics and programming background will need to spend extra time covering the fundamentals or taking preliminary programming and statistics courses.
Module 1: Week 1 - Data Science Overview
We will cover the preliminaries such as what it is to be a Data Scientist, some applications and data sets. We will also cover issues of privacy in big data. Introduction, What a Data Scientist does, Main tools, Overview of “big data”, Current landscape, Data Privacy, A look at some data sets.
We continue our review of Data Science covering several important topics including cloud computing, databases and software engineering. Data cleaning, Cloud computing, SQL, NoSQL.
We will learn the theory and practice of basic statistics relevant to data science. Statistics, Frequentist vs Bayesian, Distributions, Examples.
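As a small taste of the kind of exercise this day involves, the link between a distribution's parameters and sample statistics can be sketched with Python's standard library (an illustrative sketch only; the actual course exercises and tools may differ):

```python
import random
import statistics

random.seed(42)

# Draw 10,000 samples from a normal distribution with mean 5 and
# standard deviation 2, then check that the sample statistics
# approximately recover the true parameters.
samples = [random.gauss(5, 2) for _ in range(10_000)]

sample_mean = statistics.mean(samples)
sample_stdev = statistics.stdev(samples)

print(f"sample mean  ~ {sample_mean:.2f}")   # close to 5
print(f"sample stdev ~ {sample_stdev:.2f}")  # close to 2
```

With more samples the estimates tighten, which is the frequentist intuition the day builds on.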
To gain familiarity with the statistical package R – a widely used and useful open source tool for analyzing data. An overview of the R language, Libraries, Exercises.
To learn about the various classes of algorithms, and some specific algorithms within each class. We learn when and how to use the right algorithm on the right data set. Data Structures, Algorithms.
Module 2: Week 2 – Hadoop, Spark and Flink
To gain familiarity with the Apache Hadoop ecosystem, an open source framework for analyzing massive data sets, including MapReduce, YARN and Ambari. What is Hadoop? The Hadoop ecosystem, MapReduce, YARN, Hive, Batch processing, Vendors, Use cases.
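The map/shuffle/reduce pattern at the heart of Hadoop can be previewed in a few lines of plain Python (a toy illustration of the idea, not Hadoop's actual API):

```python
from collections import defaultdict

def map_phase(lines):
    # Map: emit a (word, 1) pair for every word in every input line.
    for line in lines:
        for word in line.lower().split():
            yield word, 1

def shuffle(pairs):
    # Shuffle: group all values by key, as Hadoop does between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: sum the counts for each word.
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["big data big ideas", "big data tools"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts)  # {'big': 3, 'data': 2, 'ideas': 1, 'tools': 1}
```

Hadoop's value is running exactly this pattern across thousands of machines, with the shuffle happening over the network.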
To gain an understanding of the various open source data streaming tools being developed and used in data science today, including Apache Spark, Beam and Flink. We will also learn about the lambda architecture and how it is used in Data Science. Data streams, Real time stream processing, Spark, Dataframes, RDDs, Lambda architecture.
We take an in-depth look at the newly released Apache Spark 2.0, including new features and changes in functionality since version 1.6. Spark 2.0 – what’s new? Hands-on labs.
On Day 4 we continue our examination of Spark 2.0, finishing with more hands-on labs to really get to grips with this popular streaming technology. Then we embark on our investigation of the new kid on the block, Apache Flink. Spark labs continued, Apache Flink.
A full day dedicated to exploring the mysteries of Apache Flink, discovering why it is fast becoming a popular data streaming technology. We increase and cement our knowledge through further hands-on lab work. Apache Flink continued, Flink Labs.
Module 3: Week 3 – Python & Julia
To gain some familiarity with and insight into the programming language Python. We will also look at other programming languages used in data science, including Ruby, Go and Java. Python overview, Python labs, Other languages.
We take a deep dive into some of the more popular Python data science libraries, including SciPy, NumPy and Pandas, while getting hands-on experience with labs working on real data sets. SciPy, NumPy, Pandas, Matplotlib, Labs.
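The shift these libraries enable, from explicit Python loops to vectorized array operations, can be previewed with a small NumPy sketch (an illustration with made-up numbers; the lab data sets are not shown here):

```python
import numpy as np

# A small "data set": daily sales for two products over five days.
sales = np.array([[10, 12, 9, 14, 11],
                  [ 3,  5, 4,  6,  2]])

# Vectorized operations replace explicit Python loops.
totals = sales.sum(axis=1)    # per-product totals across the week
daily = sales.sum(axis=0)     # per-day totals across products
scaled = sales / sales.max()  # normalize everything to [0, 1]

print(totals)  # [56 20]
print(daily)   # [13 17 13 20 13]
```

Pandas builds labeled rows and columns on top of exactly these arrays, which is why the two libraries are taught together.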
We gain exposure to scikit-learn, a popular machine learning toolkit written in Python. We will test our knowledge and gain further familiarity through labs. Scikit-learn, Labs.
We will learn about the relatively new language Julia and understand the reasons why it is fast becoming a de facto language of data science. Julia overview, Libraries, Labs.
We will further explore Julia's features and capabilities through a day of immersive hands-on labs using Julia and some of its libraries. Hands-on labs with Julia.
Module 4: Week 4 – Machine Learning
To gain some familiarity with machine learning theory, algorithms and applications as they are being used today in data science. We will also take a look at common open source machine learning tools and libraries. Introduction to ML, Supervised, Unsupervised and Reinforcement learning, ML libraries, Regression.
We will take a deep dive into supervised learning where we are working with labelled data sets. Supervised learning, Decision Trees, Ensembles, Naïve Bayesian Classifier, Labs.
We will continue our look at supervised learning algorithms using them on real data sets. k-Nearest Neighbor, Support Vector Machines, Lab.
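scikit-learn provides k-Nearest Neighbor out of the box, but the core idea is simple enough to sketch from scratch (a toy illustration with hand-made data, not scikit-learn's API):

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (features, label) pairs; distance is Euclidean.
    """
    by_distance = sorted(train, key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Toy 2-D data set: two well-separated clusters labeled "a" and "b".
train = [((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"),
         ((8, 8), "b"), ((8, 9), "b"), ((9, 8), "b")]

print(knn_predict(train, (2, 2)))  # a
print(knn_predict(train, (8, 7)))  # b
```

The choice of k and the distance metric are the main tuning knobs, which the labs explore on real data.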
We will continue our examination of machine learning algorithms with a look at unsupervised learning, where the data sets are unlabelled. Unsupervised learning, Clustering, PCA, Lab.
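Clustering's core loop, assign points to the nearest centroid, then move each centroid to the mean of its cluster, can be sketched in plain Python on 1-D data (a toy version of Lloyd's k-means algorithm, not a production implementation):

```python
import random
from statistics import mean

def kmeans(points, k, iters=20, seed=0):
    """Lloyd's algorithm on 1-D data: alternate assignment and update steps."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid's cluster.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [mean(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

points = [1.0, 1.2, 0.8, 9.9, 10.1, 10.0]
print(kmeans(points, k=2))  # centroids settle near 1.0 and 10.0
```

No labels are used anywhere, which is exactly what makes this unsupervised.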
We will finish our overview of machine learning with a look at reinforcement learning whereby rewards are given for successful goal achievement. We will also briefly examine genetic algorithms. Reinforcement learning, Genetic Algorithms, Labs.
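The reward-driven learning loop can be previewed with tabular Q-learning on a tiny made-up environment (an illustrative sketch; the course's own labs and environments are not shown here):

```python
import random

# Tabular Q-learning on a tiny corridor: states 0..4, actions left/right,
# reward 1.0 only for reaching state 4. Q-learning is off-policy, so even
# a purely random behavior policy lets it learn the optimal greedy policy.
N_STATES, ACTIONS = 5, (-1, +1)   # -1 = step left, +1 = step right
alpha, gamma = 0.5, 0.9           # learning rate and discount factor
Q = {(s, a): 0.0 for s in range(N_STATES - 1) for a in ACTIONS}

rng = random.Random(0)
for _ in range(500):                                  # episodes
    s = 0
    while s != N_STATES - 1:
        a = rng.choice(ACTIONS)                       # explore at random
        s2 = min(max(s + a, 0), N_STATES - 1)
        reward = 1.0 if s2 == N_STATES - 1 else 0.0
        # Update: nudge Q toward reward + discounted best future value.
        best_next = (0.0 if s2 == N_STATES - 1
                     else max(Q[(s2, b)] for b in ACTIONS))
        Q[(s, a)] += alpha * (reward + gamma * best_next - Q[(s, a)])
        s = s2

# The learned greedy policy: go right in every non-terminal state.
policy = [max(ACTIONS, key=lambda act: Q[(st, act)]) for st in range(N_STATES - 1)]
print(policy)  # [1, 1, 1, 1]
```

The discount factor gamma is what propagates the end-of-corridor reward back to the earlier states.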
Module 5: Week 5 – Deep Learning I
To examine and become familiar with the various types of deep learning methods being applied in the field of data science (and beyond) today. To learn about computer vision, the various approaches, algorithms, current consumer and industry applications. Computer vision overview, CNN theory and practice, CV Frameworks.
To gain an understanding of GPU processing, in terms of hardware, architecture, software, programming and implementation of big data jobs on a GPU cluster. GPU computing, Theory, Frameworks, Labs.
We will look at the various ways that deep learning is being used for image classification, including theory and real world examples of what is being achieved within the industry today. Image classification, Use cases, Labs.
We will examine the fundamentals of self-driving cars, what is needed and how this is being deployed in companies today and in real world testing and driving situations. Autonomous vehicles, Labs.
A full day of computer vision labs to get candidates fully comfortable with this technology. CV Labs.
Module 6: Week 6 – Deep Learning II RNN, LSTM & NLP
To understand the principles and practices underlying natural language processing, including text, speech, reading and writing. RNN overview, LSTM, NLP, Frameworks.
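Frameworks implement RNNs and LSTMs for us, but the single recurrence at their core, computing a new hidden state from the current input and the previous hidden state, can be sketched in plain Python (a toy illustration with hand-picked weights, not a framework API):

```python
import math

def rnn_step(x, h_prev, w_x, w_h, b):
    """One step of a vanilla RNN: h_t = tanh(W_x @ x_t + W_h @ h_prev + b)."""
    return [
        math.tanh(
            sum(w_x[i][j] * x[j] for j in range(len(x)))
            + sum(w_h[i][j] * h_prev[j] for j in range(len(h_prev)))
            + b[i]
        )
        for i in range(len(b))
    ]

# Tiny example: 2-dim input, 2-dim hidden state, made-up weights.
w_x = [[0.5, -0.2], [0.1, 0.4]]
w_h = [[0.3, 0.0], [0.0, 0.3]]
b = [0.0, 0.1]

h = [0.0, 0.0]                      # initial hidden state
for x in [[1.0, 0.0], [0.0, 1.0]]:  # a length-2 input sequence
    h = rnn_step(x, h, w_x, w_h, b)
print(h)  # the final hidden state summarizes the whole sequence
```

Because each step feeds the previous hidden state back in, the network carries context along the sequence, which is what makes RNNs a fit for language and speech; LSTMs add gating to preserve that context over longer spans.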
To understand the theory and technologies behind speech recognition including acoustic wave analysis. Text, Speech, Labs.
Here we will look at generalized time series data and how recurrent neural networks can be used to process this data. Time series data, Use cases from industry, Labs.
We will specifically look at the processing of market data using RNNs. This has application in algorithmic trading and risk management. Market data, Algorithms, Labs.
We will look at emerging technologies in the deep learning space that are starting to have an impact on business and scientific applications including biological and quantum computing. Emerging paradigms, Neural Turing Machine, Probabilistic programming, Biocomputing, Quantum Computing.
Module 7: Week 7 – Deep Learning III
We will look at the latest trends and advancements in deep learning, starting with deep reinforcement learning, a technique with which DeepMind has achieved superhuman performance in Atari games and Go. Deep reinforcement learning, Case studies, Labs.
Gaussian processes have achieved some impressive results in machine learning of late. We will look at some of these results and the theory behind them. Gaussian Processes, Theory, Practice, Use cases.
Bayesian inference is another technology that is yielding some breakthrough results in the current research and technology space – we will take a look at how and why. Bayesian Inference, Theory, Practice, Use cases.
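The "how" can be previewed with the simplest possible example, inferring a coin's bias with a conjugate Beta prior (a minimal illustration of the idea; the course's own treatment and tooling may go well beyond this):

```python
# Bayesian inference on a coin's bias with a Beta prior: because the Beta
# distribution is conjugate to the binomial likelihood, updating on data
# is just adding the observed counts to the prior's parameters.
def update(alpha, beta, heads, tails):
    """Posterior Beta parameters after observing `heads` and `tails`."""
    return alpha + heads, beta + tails

alpha, beta = 1, 1                 # Beta(1, 1): a uniform prior over the bias
alpha, beta = update(alpha, beta, heads=7, tails=3)

posterior_mean = alpha / (alpha + beta)
print(posterior_mean)  # 0.666..., pulled from the raw 0.7 toward the prior's 0.5
```

That pull toward the prior, strongest when data is scarce, is the regularizing behavior that makes Bayesian methods attractive in practice.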
Cognitive computing is the combination of many machine learning technologies in an attempt to build a more general intelligence, similar to how biological brains can achieve many different kinds of goals. We will investigate the latest progress in this field. Cognitive computing, IBM Watson, Hints from neuroscience, Human Brain Project.
Neuromorphic computing is an attempt to mimic the human brain in hardware. We will look at the state of the art and recent advances in this field. Neuromorphic computing, Theory, Practice, Implementations.
Module 8: Week 8 – IoT, Visualization & Reporting
To understand the Internet of Things (IoT), its development, the technologies that are enabling it, where it’s going and how fast it will get us there. IoT overview, Sensor data, Data streams, Hardware, Protocols.
We will take some real IoT data and do some analysis on it using the latest tools and technologies. IoT Lab.
To learn about the various visualization practices, tools and frameworks, both open source and proprietary. Visualization Tools, D3.js, Shiny, ggplot2.
We will look at the vital skill of communicating well in a business environment. We also look at how businesses are run, including organizational structure, strategic planning, marketing, and current global trends in AI technologies. Business Reporting, What’s in a report, Global trends, Case studies.
We will look back on what we have achieved over the past eight weeks, and look forward to starting our four week projects with real companies. Wrap up day, Look back on what was covered/achieved, Onwards to projects.
Module 9: Weeks 9-12 - Projects