Data Science Digest
Data science is an amalgamation of several fields, such as mathematics, technology & domain knowledge; it has its own concepts, processes & tools. It's really tough to know everything about the subject unless you have worked on complex data science problems in the industry for a couple of years.
In this post, I have tried to aggregate & organize data science topics from Quora (generic definitions), Medium (in-depth working) & GitHub (code). The post is organized into the following sections:
Data Science Introduction
This section introduces the world of data science. What is data science? Why is it important? What is the difference between Artificial Intelligence, Data Science, Machine Learning & Deep Learning?
Data Science Prerequisites
Before diving deep into data science, you need to cover a fair amount of ground: a decent understanding of linear algebra, statistics, probability & data engineering.
Data Science Concepts
In this section, you can learn data science concepts such as the types of learning and when to use which kind of learning algorithm.
Data Science Algorithms
This section covers the most commonly used data science algorithms in detail. Which kinds of problems do these algorithms solve, & what are the pros & cons of using them?
- Classification (k-Nearest Neighbors, Logistic Regression, Decision Trees, Naive Bayes)
- Regression (Linear, Polynomial, Ridge, Lasso, ElasticNet)
- Support Vector Machines
- Neural Nets
- Random Forests
- Clustering (K-Means, Mean-Shift, DBSCAN, EM-GMM, Agglomerative Hierarchical)
- Deep Learning (CNNs, RNNs, LSTMs)
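To give a flavor of what these algorithms look like in code, here is a minimal sketch of k-Nearest Neighbors, the first classifier in the list above. It uses only the Python standard library; the function name `knn_predict` and the toy data are my own illustrative assumptions, not taken from any of the linked resources, and a real project would more likely use a library implementation.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep only the k closest neighbours
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # The most frequent label among those neighbours wins
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups of 2-D points
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
                (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_points, train_labels, (1.1, 1.0)))
```

The trade-off this illustrates is typical for k-NN: there is no training step at all, but every prediction has to scan the whole training set.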
Data Science Process
In this section, you will get to know data science as a process: once you have a problem, what approach will you take? How will you collect & clean data? Which evaluation & tuning techniques will you use to optimize your data science algorithm?
- Data Science Process (Data Collection, Data Cleaning, Modeling, Model Evaluation, Model Tuning, Prediction)
- Exploratory Data Analysis
- Feature Engineering
- Ensembling (Bagging, Boosting & Stacking)
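The process steps above can be sketched end to end on a toy problem. This is only an illustration under stated assumptions: the data is synthetic (generated from y = 2x + 1 with two deliberately broken rows), the model is a plain least-squares line, all function names are hypothetical, and the model tuning step is left out because this model has nothing to tune.

```python
import random


def clean(rows):
    """Data cleaning: drop rows that have a missing value."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle reproducibly and hold out a fraction of rows for evaluation."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    n_test = max(1, int(len(rows) * test_fraction))
    return rows[n_test:], rows[:n_test]


def fit_line(rows):
    """Modeling: ordinary least squares for y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_squared_error(rows, slope, intercept):
    """Model evaluation: average squared prediction error."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in rows) / len(rows)


# Data collection (synthetic): y = 2x + 1, plus two rows with missing values
raw = [(float(x), 2.0 * x + 1.0) for x in range(12)] + [(None, 3.0), (4.0, None)]

data = clean(raw)                                   # data cleaning
train, test = train_test_split(data)                # hold out a test set
slope, intercept = fit_line(train)                  # modeling
mse = mean_squared_error(test, slope, intercept)    # model evaluation
prediction = slope * 20.0 + intercept               # prediction for unseen x = 20

print(slope, intercept, mse, prediction)
```

The point of evaluating on held-out rows rather than on the training rows is that it estimates how the model behaves on data it has not seen, which is what the evaluation and tuning steps are meant to optimize.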
Data Science Tools
This section covers the tools used in the data science field, such as R, Python, SQL & the machine learning platforms provided by Azure & Amazon.