{"id":491,"date":"2015-08-28T07:28:43","date_gmt":"2015-08-28T04:28:43","guid":{"rendered":"http:\/\/kusuaks7\/?p=96"},"modified":"2025-02-10T14:25:44","modified_gmt":"2025-02-10T14:25:44","slug":"become-data-scientist","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/bigdata-cloud\/become-data-scientist\/","title":{"rendered":"Ingredients in the making of a Data Scientist"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"491\" class=\"elementor elementor-491\" data-elementor-post-type=\"post\">\n\t\t\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-effa61d elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"effa61d\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-4d8d74d\" data-id=\"4d8d74d\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-5f2ff0c2 elementor-widget elementor-widget-text-editor\" data-id=\"5f2ff0c2\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tHow does one prepare for a career in data science? \u00a0What credentials enable you to become a data scientist? \u00a0These are frequently asked questions. \u00a0Swami Chandrasekaran, the\u00a0Executive Architect at IBM Watson, <a href=\"http:\/\/nirvacana.com\/thoughts\/becoming-a-data-scientist\/\" target=\"_blank\" rel=\"noopener noreferrer\">offers a roadmap<\/a>. 
Chandrasekaran\u2019s suggested curriculum is compelling, and his analogy of a metro map is a useful one. \u00a0He presents ten metro lines:\n<ol>\n \t<li>Fundamentals<\/li>\n \t<li>Statistics<\/li>\n \t<li>Programming<\/li>\n \t<li>Machine Learning<\/li>\n \t<li>Text Mining \/ Natural Language Processing<\/li>\n \t<li>Data Visualization<\/li>\n \t<li>Big Data<\/li>\n \t<li>Data Ingestion<\/li>\n \t<li>Data Munging<\/li>\n \t<li>Toolbox<span id=\"more-359\"><\/span><\/li>\n<\/ol>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-78a31aa elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"78a31aa\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-aeccb05\" data-id=\"aeccb05\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-502a3e1 elementor-widget elementor-widget-heading\" data-id=\"502a3e1\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\"><h4>Fundamentals<\/h4><\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-3025669 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"3025669\" 
data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-484c222\" data-id=\"484c222\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-32430c4 elementor-widget elementor-widget-text-editor\" data-id=\"32430c4\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ol>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Metric_(mathematics)\" rel=\"noopener\">Metrics<\/a>\u00a0&amp;\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Linear_algebra\" rel=\"noopener\">Linear Algebra<\/a>\u00a0Fundamentals<\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Hash_Function\" rel=\"noopener\">Hash Functions<\/a>,\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Binary_Tree\" rel=\"noopener\">Binary Tree<\/a>,\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Big_O_notation\" rel=\"noopener\">O(n)<\/a><\/li>\n \t<li><a href=\"https:\/\/en.wikipedia.org\/wiki\/Relational_Algebra\" rel=\"noopener\">Relational Algebra<\/a>,\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Database\" rel=\"noopener\">DB Basics<\/a><\/li>\n \t<li><a href=\"https:\/\/en.wikipedia.org\/wiki\/Inner_join#Inner_join\" rel=\"noopener\">Inner<\/a>,\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Join_(SQL)#Outer_join\" rel=\"noopener\">Outer<\/a>,\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Cross_join#Cross_join\" rel=\"noopener\">Cross<\/a>,\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Theta_join#.CE.B8-join_and_equijoin\" rel=\"noopener\">Theta Join<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/CAP_Theorem\" 
rel=\"noopener\">CAP Theorem<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Table_(information)\" rel=\"noopener\">Tabular Data<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Entropy\" rel=\"noopener\">Entropy<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Data_frame\" rel=\"noopener\">Data Frames<\/a>\u00a0&amp;\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Time_series\" rel=\"noopener\">Series<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Sharding\" rel=\"noopener\">Sharding<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/OLAP\" rel=\"noopener\">OLAP<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Dimensional_modeling\" rel=\"noopener\">Multidimensional Data Model<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Extract,_transform,_load\" rel=\"noopener\">Extract\/Transform\/Load (ETL)<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Business_reporting\" rel=\"noopener\">Reporting<\/a>\u00a0vs\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Business_intelligence\" rel=\"noopener\">BI<\/a>\u00a0vs\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Analytics\" rel=\"noopener\">Analytics<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Json\" rel=\"noopener\">JSON<\/a>\u00a0&amp;\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Xml\" rel=\"noopener\">XML<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/NoSQL\" rel=\"noopener\">NoSQL<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Regular_expression\" rel=\"noopener\">Regex<\/a><\/li>\n \t<li>Vendor Landscape<\/li>\n \t<li>Env Setup<\/li>\n<\/ol>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-92870e4 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" 
data-id=\"92870e4\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-ec59b54\" data-id=\"ec59b54\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-9e9fc6f elementor-widget elementor-widget-heading\" data-id=\"9e9fc6f\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\"><h4>Statistics<\/h4><\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-9c3de97 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"9c3de97\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-24d6159\" data-id=\"24d6159\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-5b4a402 elementor-widget elementor-widget-text-editor\" data-id=\"5b4a402\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ol>\n \t<li>Pick a\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Data_set\" rel=\"noopener\">Dataset<\/a>\u00a0(<a 
href=\"http:\/\/archive.ics.uci.edu\/ml\/\" rel=\"noopener\">UCI Repo<\/a>)<\/li>\n \t<li><a href=\"https:\/\/en.wikipedia.org\/wiki\/Descriptive_statistics\" rel=\"noopener\">Descriptive Statistics<\/a>(<a href=\"https:\/\/en.wikipedia.org\/wiki\/Mean\" rel=\"noopener\">mean<\/a>,\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Median\" rel=\"noopener\">median<\/a>,\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Range_(statistics)\" rel=\"noopener\">range<\/a>,\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Standard_deviation\" rel=\"noopener\">SD<\/a>,\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Variance\" rel=\"noopener\">Var<\/a>)<\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Exploratory_data_analysis\" rel=\"noopener\">Exploratory Data Analysis<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Histograms\" rel=\"noopener\">Histograms<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Percentiles\" rel=\"noopener\">Percentiles<\/a>\u00a0&amp;\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Outliers\" rel=\"noopener\">Outliers<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Probability_Theory\" rel=\"noopener\">Probability Theory<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Bayes_Theorem\" rel=\"noopener\">Bayes Theorem<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Random_variables\" rel=\"noopener\">Random Variables<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Cumulative_distribution_function\" rel=\"noopener\">Cumulative Distribution Function (CDF)<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Probability_distribution#Continuous_probability_distribution\" rel=\"noopener\">Continuous Distributions<\/a>\u00a0(<a href=\"https:\/\/en.wikipedia.org\/wiki\/Normal_distribution\" rel=\"noopener\">Normal<\/a>,\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Poisson_distribution\" rel=\"noopener\">Poisson<\/a>,\u00a0<a 
href=\"https:\/\/en.wikipedia.org\/wiki\/Gaussian_distribution\" rel=\"noopener\">Gaussian<\/a>)<\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Skewness\" rel=\"noopener\">Skewness<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/ANOVA\" rel=\"noopener\">Analysis of Variance (ANOVA)<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Probability_density_function\" rel=\"noopener\">Probability Density Function (PDF)<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Central_Limit_Theorem\" rel=\"noopener\">Central Limit Theorem<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Monte_Carlo_Method\" rel=\"noopener\">Monte Carlo Method<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Hypothesis_Testing\" rel=\"noopener\">Hypothesis Testing<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/P-Value\" rel=\"noopener\">p-Value<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Chi-squared_test\" rel=\"noopener\">Chi-square Test<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Estimation_theory\" rel=\"noopener\">Estimation<\/a><\/li>\n \t<li><a href=\"https:\/\/en.wikipedia.org\/wiki\/Confidence_interval\" rel=\"noopener\">Confidence Interval (CI)<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Maximum_likelihood\" rel=\"noopener\">Maximum Likelihood Estimation (MLE)<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Kernel_density_estimate\" rel=\"noopener\">Kernel Density Estimate<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Regression_analysis\" rel=\"noopener\">Regression<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Covariance\" rel=\"noopener\">Covariance<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Correlation\" rel=\"noopener\">Correlation<\/a><\/li>\n \t<li><a href=\"https:\/\/en.wikipedia.org\/wiki\/Pearson_product-moment_correlation_coefficient\" rel=\"noopener\">Pearson 
Coeff<\/a><\/li>\n \t<li><a href=\"http:\/\/www.abs.gov.au\/websitedbs\/a3121120.nsf\/home\/statistical+language+-+correlation+and+causation\" rel=\"noopener\">Causation<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Least_squares\" rel=\"noopener\">Least Squares Fit<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Euclidean_Distance\" rel=\"noopener\">Euclidean Distance<\/a><\/li>\n<\/ol>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-d1a14cb elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"d1a14cb\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-8a08823\" data-id=\"8a08823\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-5a682ec elementor-widget elementor-widget-heading\" data-id=\"5a682ec\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\"><h4>Programming<\/h4><\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-f42b7b4 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"f42b7b4\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container 
elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-6c7cbe2\" data-id=\"6c7cbe2\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-c0d7a13 elementor-widget elementor-widget-text-editor\" data-id=\"c0d7a13\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ol>\n \t<li><a href=\"http:\/\/docs.python.org\/2\/tutorial\/\" rel=\"noopener\">Python Basics<\/a><\/li>\n \t<li><a href=\"http:\/\/sunburst.usd.edu\/~bwjames\/tut\/excel\/\" rel=\"noopener\">Working in Excel<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/R_(programming_language)\" rel=\"noopener\">R<\/a>\u00a0Setup,\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/RStudio\" rel=\"noopener\">R Studio<\/a><\/li>\n \t<li><a href=\"http:\/\/math.illinoisstate.edu\/dhkim\/rstuff\/rtutor.html\" class=\"broken_link\" rel=\"noopener\">R Basics<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Expression_(programming)\" rel=\"noopener\">Expressions<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Variable_(computer_science)\" rel=\"noopener\">Variables<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/SPSS\" rel=\"noopener\">IBM SPSS<\/a>,\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/RapidMiner\" rel=\"noopener\">Rapid Miner<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Vector_(mathematics_and_physics)\" rel=\"noopener\">Vectors<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Matrix_(mathematics)\" rel=\"noopener\">Matrices<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Array_data_structure\" rel=\"noopener\">Arrays<\/a><\/li>\n \t<li>Factors<\/li>\n \t<li><a 
href=\"http:\/\/en.wikipedia.org\/wiki\/List_(abstract_data_type)\" rel=\"noopener\">Lists<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Data_frame\" rel=\"noopener\">Data Frames<\/a><\/li>\n \t<li>Reading\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Comma-separated_values\" rel=\"noopener\">CSV<\/a>\u00a0Data<\/li>\n \t<li>Reading\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Raw_data\" rel=\"noopener\">RAW Data<\/a><\/li>\n \t<li>Subsetting Data<\/li>\n \t<li><a href=\"http:\/\/www.r-tutor.com\/r-introduction\/data-frame\" rel=\"noopener\">Manipulate Data Frames<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Subroutine\" rel=\"noopener\">Functions<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Factor_analysis\" rel=\"noopener\">Factor Analysis<\/a><\/li>\n \t<li>Install Pkgs<\/li>\n<\/ol>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-fa8392f elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"fa8392f\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-58f2198\" data-id=\"58f2198\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-aa024d8 elementor-widget elementor-widget-heading\" data-id=\"aa024d8\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\"><h4>Machine 
Learning<\/h4><\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-1a61f38 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"1a61f38\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-03da4fe\" data-id=\"03da4fe\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-ad68fc6 elementor-widget elementor-widget-text-editor\" data-id=\"ad68fc6\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ol>\n \t<li>What is\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Machine_learning\" rel=\"noopener\">ML<\/a>?<\/li>\n \t<li>Numerical Var<\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Categorical_variable\" rel=\"noopener\">Categorical Variable<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Supervised_learning\" rel=\"noopener\">Supervised Learning<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Unsupervised_learning\" rel=\"noopener\">Unsupervised Learning<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Concept_learning\" rel=\"noopener\">Concepts<\/a>,\u00a0<a href=\"http:\/\/www.cse.unsw.edu.au\/~billw\/mldict.html#inputunit\" class=\"broken_link\" rel=\"noopener\">Inputs<\/a>\u00a0&amp;\u00a0<a href=\"http:\/\/www.cse.unsw.edu.au\/~billw\/mldict.html#attribute\" class=\"broken_link\" rel=\"noopener\">Attributes<\/a><\/li>\n \t<li><a 
href=\"http:\/\/en.wikipedia.org\/wiki\/Training_set\" rel=\"noopener\">Training &amp; Test Data<\/a><\/li>\n \t<li>Classifier<\/li>\n \t<li>Prediction<\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Lift_(data_mining)\" rel=\"noopener\">Lift<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Overfitting\" rel=\"noopener\">Overfitting<\/a><\/li>\n \t<li><a href=\"http:\/\/www.montefiore.ulg.ac.be\/~lwh\/AIA\/aia-21-11-05.pdf\" rel=\"noopener\">Bias &amp; Variance<\/a><\/li>\n \t<li>Trees &amp; Classification<\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Statistical_classification\" rel=\"noopener\">Classification<\/a>, Classification Rate<\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Decision_tree_learning\" rel=\"noopener\">Decision Trees<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Boosting_(machine_learning)\" rel=\"noopener\">Boosting<\/a><\/li>\n \t<li><a href=\"https:\/\/en.wikipedia.org\/wiki\/Naive_Bayes_classifier\" rel=\"noopener\">Na\u00efve Bayes Classifiers<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/K-nearest_neighbors_algorithm\" rel=\"noopener\">K-Nearest Neighbor<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Logistic_regression\" rel=\"noopener\">Logistic Regression<\/a><\/li>\n \t<li>Regression, Ranking<\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Linear_Regression\" rel=\"noopener\">Linear Regression<\/a><\/li>\n \t<li><a href=\"https:\/\/en.wikipedia.org\/wiki\/Perceptron\" rel=\"noopener\">Perceptron<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Cluster_analysis\" rel=\"noopener\">Clustering<\/a>,\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Hierarchical_clustering\" rel=\"noopener\">Hierarchical Clustering<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/K-means_clustering\" rel=\"noopener\">K-means Clustering<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Neural_network\" 
rel=\"noopener\">Neural Networks<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Sentiment_analysis\" rel=\"noopener\">Sentiment Analysis<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Collaborative_filtering\" rel=\"noopener\">Collaborative Filtering<\/a><\/li>\n \t<li>Tagging<\/li>\n<\/ol>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-142877f elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"142877f\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-e18a7c7\" data-id=\"e18a7c7\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-f71f7d8 elementor-widget elementor-widget-heading\" data-id=\"f71f7d8\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\"><h4>Text Mining\/Natural Language Processing<\/h4><\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-4321da1 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"4321da1\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div 
class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-10f1e49\" data-id=\"10f1e49\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-dc392a4 elementor-widget elementor-widget-text-editor\" data-id=\"dc392a4\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ol>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Text_corpus\" rel=\"noopener\">Corpus<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Named-entity_recognition\" rel=\"noopener\">Named Entity Recognition<\/a><\/li>\n \t<li>Text Analysis<\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/UIMA\" rel=\"noopener\">UIMA<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Document-term_matrix\" rel=\"noopener\">Term Document Matrix<\/a><\/li>\n \t<li><a href=\"http:\/\/nlp.stanford.edu\/IR-book\/html\/htmledition\/term-frequency-and-weighting-1.html\" rel=\"noopener\">Term Frequency &amp; Weight<\/a><\/li>\n \t<li><a href=\"https:\/\/en.wikipedia.org\/wiki\/Support_vector_machine\" rel=\"noopener\">Support Vector Machines<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Association_rule_learning\" rel=\"noopener\">Association Rules<\/a><\/li>\n \t<li>Market Based Analysis ( Market Basket Analysis ? 
)<\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Feature_extraction\" rel=\"noopener\">Feature Extraction<\/a><\/li>\n \t<li>Using\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Apache_Mahout\" rel=\"noopener\">Mahout<\/a><\/li>\n \t<li>Using\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Weka_(machine_learning)\" rel=\"noopener\">Weka<\/a><\/li>\n \t<li>Using\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Natural_Language_Toolkit\" rel=\"noopener\">Natural Language Toolkit (NLTK)<\/a><\/li>\n \t<li>Classify Text (\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Document_classification\" rel=\"noopener\">Document Classification<\/a>? )<\/li>\n \t<li>Vocabulary Mapping<\/li>\n<\/ol>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-b73770f elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"b73770f\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-5e70c0b\" data-id=\"5e70c0b\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-d7257f9 elementor-widget elementor-widget-heading\" data-id=\"d7257f9\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\"><h4>Data Visualization<\/h4><\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider 
elementor-section elementor-top-section elementor-element elementor-element-d41caf7 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"d41caf7\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-c71d753\" data-id=\"c71d753\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-51309e5 elementor-widget elementor-widget-text-editor\" data-id=\"51309e5\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ol>\n \t<li>Data Exploration in R (<a href=\"http:\/\/stat.ethz.ch\/R-manual\/R-devel\/library\/graphics\/html\/hist.html\" rel=\"noopener\">Hist<\/a>,\u00a0<a href=\"http:\/\/www.r-tutor.com\/elementary-statistics\/numerical-measures\/box-plot\" rel=\"noopener\">Boxplot<\/a>\u00a0etc)<\/li>\n \t<li>Uni, Bi &amp; Multivariate Viz<\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Ggplot2\" rel=\"noopener\">ggplot2<\/a><\/li>\n \t<li>Histogram &amp; Pie (Uni)<\/li>\n \t<li>Tree &amp;\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Treemapping\" rel=\"noopener\">Tree Map<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Scatter_plot\" rel=\"noopener\">Scatter Plot<\/a>\u00a0(Bi)<\/li>\n \t<li><a href=\"https:\/\/en.wikipedia.org\/wiki\/Line_chart\" rel=\"noopener\">Line Charts<\/a>\u00a0(Bi)<\/li>\n \t<li>Spatial Charts<\/li>\n \t<li>Survey Plot<\/li>\n \t<li>Timeline<\/li>\n \t<li><a href=\"https:\/\/en.wikipedia.org\/wiki\/Decision_tree\" rel=\"noopener\">Decision Tree<\/a><\/li>\n \t<li><a 
href=\"http:\/\/en.wikipedia.org\/wiki\/Data-Driven_Documents\" rel=\"noopener\">D3.js<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/JavaScript_InfoVis_Toolkit\" class=\"broken_link\" rel=\"noopener\">InfoVis<\/a><\/li>\n \t<li><a href=\"http:\/\/www-958.ibm.com\/software\/data\/cognos\/manyeyes\/\" rel=\"noopener\">IBM ManyEyes<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Tableau_Software\" rel=\"noopener\">Tableau<\/a><\/li>\n<\/ol>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-dd9521b elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"dd9521b\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-9a52d3d\" data-id=\"9a52d3d\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-f300b10 elementor-widget elementor-widget-heading\" data-id=\"f300b10\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\"><h4>Big Data<\/h4><\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-dfb25a8 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"dfb25a8\" data-element_type=\"section\" 
data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-61186df\" data-id=\"61186df\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-ef999e8 elementor-widget elementor-widget-text-editor\" data-id=\"ef999e8\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ol>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/MapReduce\" rel=\"noopener\">Map Reduce<\/a>\u00a0Framework<\/li>\n \t<li><a href=\"http:\/\/docs.hortonworks.com\/HDPDocuments\/HDP1\/HDP-1.2.3\/bk_getting-started-guide\/content\/ch_hdp1_getting_started_chp2_1.html\" rel=\"noopener\">Hadoop Components<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/HDFS#Hadoop_Distributed_File_System\" rel=\"noopener\">HDFS<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Replication_(computing)\" rel=\"noopener\">Data Replication<\/a>\u00a0Principles<\/li>\n \t<li>Setup\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Apache_Hadoop\" rel=\"noopener\">Hadoop<\/a>\u00a0( IBM \/ Cloudera \/ HortonWorks )<\/li>\n \t<li><a href=\"http:\/\/hadoop.apache.org\/docs\/stable\/hdfs_design.html#NameNode+and+DataNodes\" class=\"broken_link\" rel=\"noopener\">Name &amp; Data Nodes<\/a><\/li>\n \t<li><a href=\"http:\/\/wiki.apache.org\/hadoop\/JobTracker\" class=\"broken_link\" rel=\"noopener\">Job<\/a>\u00a0&amp;\u00a0<a href=\"http:\/\/wiki.apache.org\/hadoop\/TaskTracker\" class=\"broken_link\" rel=\"noopener\">Task Tracker<\/a><\/li>\n \t<li><a href=\"http:\/\/hadoop.apache.org\/docs\/stable\/mapred_tutorial.html\" class=\"broken_link\" rel=\"noopener\">M\/R 
Programming<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Sqoop\" rel=\"noopener\">Sqoop<\/a>\u00a0: Loading Data in HDFS<\/li>\n \t<li><a href=\"http:\/\/flume.apache.org\/\" rel=\"noopener\">Flume<\/a>,\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Scribe_(log_server)\" rel=\"noopener\">Scribe<\/a>\u00a0: For Unstructured Data<\/li>\n \t<li>SQL with\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Pig_(programming_tool)\" rel=\"noopener\">Pig<\/a><\/li>\n \t<li>DWH with\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Apache_Hive\" rel=\"noopener\">Hive<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Scribe_(log_server)\" rel=\"noopener\">Scribe<\/a>,\u00a0<a href=\"http:\/\/wiki.apache.org\/hadoop\/Chukwa\" class=\"broken_link\" rel=\"noopener\">Chukwa<\/a>\u00a0For Weblog<\/li>\n \t<li><a href=\"https:\/\/cwiki.apache.org\/confluence\/display\/MAHOUT\/Quickstart\" rel=\"noopener\">Using Mahout<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Apache_ZooKeeper\" rel=\"noopener\">ZooKeeper<\/a>,\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Apache_Avro\" rel=\"noopener\">Avro<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Storm_(event_processor)\" rel=\"noopener\">Storm<\/a>\u00a0: Hadoop Realtime<\/li>\n \t<li><a href=\"https:\/\/github.com\/RevolutionAnalytics\/RHadoop\/wiki\" rel=\"noopener\">RHadoop<\/a>,\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Rhipe\" rel=\"noopener\">RHIPE<\/a><\/li>\n \t<li><a href=\"https:\/\/github.com\/RevolutionAnalytics\/RHadoop\/wiki\/rmr\" rel=\"noopener\">rmr<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Cassandra\" rel=\"noopener\">Cassandra<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/MongoDB\" rel=\"noopener\">MongoDB<\/a>,\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Neo4j\" 
rel=\"noopener\">Neo4j<\/a><\/li>\n<\/ol>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-99dc272 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"99dc272\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-2b52e2f\" data-id=\"2b52e2f\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-b4ce0d7 elementor-widget elementor-widget-heading\" data-id=\"b4ce0d7\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\"><h4>Data Ingestion<\/h4><\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-98d00e1 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"98d00e1\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-a686ad1\" data-id=\"a686ad1\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element 
elementor-element-aa984cf elementor-widget elementor-widget-text-editor\" data-id=\"aa984cf\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ol>\n \t<li>Summary of Data Formats<\/li>\n \t<li>Data Discovery<\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Data_acquisition\" rel=\"noopener\">Data Sources &amp; Acquisition<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Data_Integration\" rel=\"noopener\">Data Integration<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Data_fusion\" rel=\"noopener\">Data Fusion<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Data_transformation\" rel=\"noopener\">Transformation<\/a>, Enrichment<\/li>\n \t<li>Data Survey<\/li>\n \t<li>Google\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/OpenRefine\" rel=\"noopener\">OpenRefine<\/a><\/li>\n \t<li>How much Data?<\/li>\n \t<li>Using ETL<\/li>\n<\/ol>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-eca95de elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"eca95de\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-fa6e24b\" data-id=\"fa6e24b\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-a1a5a33 elementor-widget elementor-widget-heading\" data-id=\"a1a5a33\" data-element_type=\"widget\" data-e-type=\"widget\" 
data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\"><h4><a href=\"http:\/\/en.wikipedia.org\/wiki\/Data_wrangling\" rel=\"noopener\">Data Munging<\/a><\/h4><\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-e68e656 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"e68e656\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-c6c3229\" data-id=\"c6c3229\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-2da62dc elementor-widget elementor-widget-text-editor\" data-id=\"2da62dc\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ol>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Dimensionality_reduction\" rel=\"noopener\">Dimensionality<\/a>\u00a0&amp; Numerosity Reduction<\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Data_normalization\" rel=\"noopener\">Normalization<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Data_scrubbing\" rel=\"noopener\">Data Scrubbing<\/a><\/li>\n \t<li>Handling Missing Values<\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Bias_of_an_estimator\" rel=\"noopener\">Unbiased Estimators<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Data_binning\" rel=\"noopener\">Binning<\/a>\u00a0Sparse Values<\/li>\n 
\t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Feature_extraction\" rel=\"noopener\">Feature Extraction<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Noise_reduction\" rel=\"noopener\">Denoising<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Sampling_(statistics)\" rel=\"noopener\">Sampling<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Stratified_sampling\" rel=\"noopener\">Stratified Sampling<\/a><\/li>\n \t<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Principal_component_analysis\" rel=\"noopener\">Principal Component Analysis<\/a><\/li>\n<\/ol>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-2050ea6 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"2050ea6\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-2321ff5\" data-id=\"2321ff5\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-9e1e848 elementor-widget elementor-widget-heading\" data-id=\"9e1e848\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\"><h4>Toolbox<\/h4><\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-7c464ad elementor-section-boxed 
elementor-section-height-default elementor-section-height-default\" data-id=\"7c464ad\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-25b48ca\" data-id=\"25b48ca\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-bbeac8e elementor-widget elementor-widget-text-editor\" data-id=\"bbeac8e\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ol>\n \t<li>MS Excel w\/\u00a0<a href=\"http:\/\/www.excel-easy.com\/data-analysis\/analysis-toolpak.html\" rel=\"noopener\">Analysis ToolPak<\/a><\/li>\n \t<li><a href=\"http:\/\/www.java.com\/en\/\" rel=\"noopener\">Java<\/a>,\u00a0<a href=\"http:\/\/www.python.org\/\" rel=\"noopener\">Python<\/a><\/li>\n \t<li><a href=\"http:\/\/www.r-project.org\/\" rel=\"noopener\">R<\/a>,\u00a0<a href=\"http:\/\/www.rstudio.com\/\" rel=\"noopener\">RStudio<\/a>,\u00a0<a href=\"http:\/\/cran.r-project.org\/web\/packages\/rattle\/index.html\" rel=\"noopener\">Rattle<\/a><\/li>\n \t<li><a href=\"http:\/\/www.cs.waikato.ac.nz\/ml\/weka\/\" rel=\"noopener\">Weka<\/a>,\u00a0<a href=\"http:\/\/www.knime.org\/\" rel=\"noopener\">KNIME<\/a>,\u00a0<a href=\"http:\/\/rapid-i.com\/content\/view\/181\/\" rel=\"noopener\">RapidMiner<\/a><\/li>\n \t<li><a href=\"http:\/\/hadoop.apache.org\/\" rel=\"noopener\">Hadoop<\/a>\u00a0Dist of Choice<\/li>\n \t<li><a href=\"http:\/\/spark-project.org\/\" rel=\"noopener\">Spark<\/a>,\u00a0<a href=\"http:\/\/storm-project.net\/\" rel=\"noopener\">Storm<\/a><\/li>\n \t<li><a href=\"http:\/\/flume.apache.org\/\" rel=\"noopener\">Flume<\/a>,\u00a0<a 
href=\"https:\/\/github.com\/facebook\/scribe\" rel=\"noopener\">Scribe<\/a>,\u00a0<a href=\"http:\/\/incubator.apache.org\/chukwa\/\" rel=\"noopener\">Chukwa<\/a><\/li>\n \t<li><a href=\"http:\/\/nutch.apache.org\/\" rel=\"noopener\">Nutch<\/a>,\u00a0<a href=\"http:\/\/www.talend.com\/\" rel=\"noopener\">Talend<\/a>,\u00a0<a href=\"https:\/\/scraperwiki.com\/\" rel=\"noopener\">Scraperwiki<\/a><\/li>\n \t<li><a href=\"http:\/\/search.cpan.org\/~miyagawa\/Web-Scraper-0.37\/lib\/Web\/Scraper.pm\" rel=\"noopener\">Webscraper<\/a>,\u00a0<a href=\"http:\/\/flume.apache.org\/\" rel=\"noopener\">Flume<\/a>,\u00a0<a href=\"http:\/\/sqoop.apache.org\/\" rel=\"noopener\">Sqoop<\/a>\u00a0(Flume Dup?)<\/li>\n \t<li><a href=\"http:\/\/cran.r-project.org\/web\/packages\/tm\/index.html\" rel=\"noopener\">tm<\/a>,\u00a0<a href=\"http:\/\/cran.r-project.org\/web\/packages\/RWeka\/index.html\" rel=\"noopener\">RWeka<\/a>,\u00a0<a href=\"http:\/\/nltk.org\/\" rel=\"noopener\">NLTK<\/a><\/li>\n \t<li><a href=\"http:\/\/www.datadr.org\/\" rel=\"noopener\">RHIPE<\/a><\/li>\n \t<li><a href=\"http:\/\/d3js.org\/\" rel=\"noopener\">D3.js<\/a>,\u00a0<a href=\"http:\/\/ggplot2.org\/\" class=\"broken_link\" rel=\"noopener\">ggplot2<\/a>,\u00a0<a href=\"http:\/\/www.rstudio.com\/shiny\/\" rel=\"noopener\">Shiny<\/a><\/li>\n \t<li><a href=\"http:\/\/www-01.ibm.com\/software\/globalization\/topics\/languageware\/\" rel=\"noopener\">IBM Languageware<\/a><\/li>\n \t<li><a href=\"http:\/\/cassandra.apache.org\/\" rel=\"noopener\">Cassandra<\/a>,\u00a0<a href=\"http:\/\/www.mongodb.org\/\" rel=\"noopener\">MongoDB<\/a><\/li>\n<\/ol>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-a87e978 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"a87e978\" 
data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-ad23d2f\" data-id=\"ad23d2f\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-4806d99 elementor-widget elementor-widget-text-editor\" data-id=\"4806d99\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tThe only thing that we would add to this extensive framework is, of course, domain expertise within a specific industry, without which one may not be able to ask the right questions.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>How does one prepare for a career in data science? 
\u00a0What credentials enable you to become a data scientist?<\/p>\n","protected":false},"author":11,"featured_media":2785,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[187],"tags":[94],"ppma_author":[1606],"class_list":["post-491","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-bigdata-cloud","tag-data-science"],"authors":[{"term_id":1606,"user_id":11,"is_guest":0,"slug":"cameron-turner","display_name":"Cameron Turner","avatar_url":{"url":"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2024\/09\/cameron.jpeg","url2x":"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2024\/09\/cameron.jpeg"},"user_url":"","last_name":"Turner","first_name":"Cameron","job_title":"","description":""}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/491","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/11"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=491"}],"version-history":[{"count":4,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/491\/revisions"}],"predecessor-version":[{"id":37276,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/491\/revisions\/37276"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/2785"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=491"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=491"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.c
om\/blog\/wp-json\/wp\/v2\/tags?post=491"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=491"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}