{"id":526,"date":"2017-08-31T10:23:25","date_gmt":"2017-08-31T07:23:25","guid":{"rendered":"http:\/\/kusuaks7\/?p=131"},"modified":"2025-03-27T13:33:24","modified_gmt":"2025-03-27T13:33:24","slug":"how-to-become-a-data-scientist-part-1-3","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/bigdata-cloud\/how-to-become-a-data-scientist-part-1-3\/","title":{"rendered":"How to Become a Data Scientist (Part 1\/3)"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"526\" class=\"elementor elementor-526\" data-elementor-post-type=\"post\">\n\t\t\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-6c45d264 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"6c45d264\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-1fd6196e\" data-id=\"1fd6196e\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-6fd408cd elementor-widget elementor-widget-text-editor\" data-id=\"6fd408cd\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<div>\n<p style=\"text-align: center;\"><span style=\"display: none;\">\u00a0<\/span><span style=\"display: none;\">\u00a0<\/span><\/p>\n<p style=\"text-align: center;\"><em><strong>Need Data Science Training\u00a0? 
<a href=\"https:\/\/www.experfy.com\/training\/tracks\/data-science-training-certification\">Browse courses<\/a> developed by industry thought leaders and Experfy in Harvard Innovation Lab.<\/strong><\/em><\/p>\n<p style=\"text-align: center;\"><span style=\"font-size: 11px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><em>This is Part One in\u00a0a Three-Part Series<\/em><\/span><\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-size: 11px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><em>Part One: What is Data Science? | <a href=\"https:\/\/www.experfy.com\/blog\/how-to-become-a-data-scientist-part-2-3\">Part Two: Learning<\/a> | <a href=\"https:\/\/www.experfy.com\/blog\/how-to-become-a-data-scientist-part-3-3\">Part Three: The Job Market<\/a><\/em><\/span><\/span><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-1119c13 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"1119c13\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-14be986\" data-id=\"14be986\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-6284a22 elementor-widget elementor-widget-text-editor\" data-id=\"6284a22\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<hr \/>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: 
arial,helvetica,sans-serif;\">I am a recruiter specialised in the field of data science. The idea for this project arose because one of the most common questions I am asked is: \u201c<em>how do I obtain a position as a data scientist?<\/em>\u201d It is not just the regularity of this question that got my attention, but also the diverse backgrounds of the people asking it. To name a few, I have had this conversation with: software engineers, database developers, data architects, actuaries, mathematicians, academics (of various disciplines), biologists, astronomers, theoretical physicists \u2013 I could go on. And through these conversations, it has become apparent that there is a huge amount of misinformation out there, which has left people confused about what they need to do in order to break into this field.<\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">I decided, therefore, that I would investigate this subject to cut through the BS and provide a useful resource\u00a0for anyone looking to move into commercial data science \u2013 whether you are just starting out, or already possess all the necessary skills but have no industry experience. 
And so I set out with the aim of answering two very broad questions:<\/span><\/span><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-3e4e61a elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"3e4e61a\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-842bb0b\" data-id=\"842bb0b\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-03fee09 elementor-widget elementor-widget-text-editor\" data-id=\"03fee09\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ul>\n \t<li style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">What skills are required for data science, and how should you go about picking these up? (Chapters One, Two and Three)<\/span><\/span><\/li>\n \t<li style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">From a job market perspective, what steps can you take to maximise your chances of gaining employment in data science? 
(Chapter Four)<\/span><\/span><\/li>\n<\/ul>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-ffd4020 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"ffd4020\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-991077c\" data-id=\"991077c\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-258d5e0 elementor-widget elementor-widget-text-editor\" data-id=\"258d5e0\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">Why am I qualified to write this? Well, I speak with data scientists every day and to be an effective recruiter, I need to understand career paths, what makes a good data scientist, and what employers look for when hiring. So I already possess some knowledge on the matter. But I also wanted to find out directly from those who have trodden this path, so I began speaking with data scientists of different backgrounds to see what I could unearth. 
And this took me on a journey through ex-software engineers, an ex-astrophysicist and even an ex-particle physicist, who \u2013 to my excitement \u2013 had been involved in one of the biggest scientific breakthroughs of the 21st century.<\/span><\/span><\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-49a3c0c elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"49a3c0c\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-0e8f5e5\" data-id=\"0e8f5e5\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-30f3015 elementor-widget elementor-widget-text-editor\" data-id=\"30f3015\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<hr \/>\n<p style=\"text-align: center;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><strong>CHAPTER ONE: \u00a0WHAT IS DATA SCIENCE?<\/strong><\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><strong>Different Types of Data Science<\/strong><\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">So you have made the decision to become a data scientist. Great, you are on your way. 
But now you have another choice, which is: <em>what kind of data scientist do you want to become<\/em>? Because \u2013 it is important to acknowledge \u2013 while data science as a profession has been recognised for a number of years now, there still isn\u2019t a commonly accepted definition of <em>what it actually is<\/em>.<\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">In reality, the term \u2018data scientist\u2019 is regarded as a broad job title and so it comes in many forms, with the specific demands dependent on the industry, the business, and the purpose\/output of the role in question. As a result, certain skillsets suit certain positions better than others, and this is why the path to data science is not uniform and can be via a diverse range of fields such as statistics, computer science and other scientific disciplines.<\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">The purpose is the biggest factor that dictates what form data science takes, and this is related to the Type A-Type B classification that has emerged (see here, by Michael Hochster: <a href=\"https:\/\/www.quora.com\/What-is-data-science\/answer\/Michael-Hochster?srid=2sK8&amp;share=98226ca3\" class=\"broken_link\" rel=\"noopener\">What is Data Science?<\/a>). 
Broadly speaking, the categorisation can be summarised as:<\/span><\/span><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-123feab elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"123feab\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-6a529bf\" data-id=\"6a529bf\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-793fdd9 elementor-widget elementor-widget-text-editor\" data-id=\"793fdd9\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\n<ul>\n \t<li style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">Data science for people (Type A), i.e. 
analytics to support evidence-based decision making<\/span><\/span><\/li>\n \t<li style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">Data science for software (Type B), for example: recommender systems as we see in Netflix and Spotify<\/span><\/span><\/li>\n<\/ul>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-2c63f97 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"2c63f97\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-cd274b8\" data-id=\"cd274b8\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-167a27a elementor-widget elementor-widget-text-editor\" data-id=\"167a27a\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">We may\u00a0see further evolution of these definitions as the field matures, and this is where we will introduce our first expert into the mix: Yanir Seroussi <em>(remember the names, as we will be returning to them throughout). 
<\/em>Yanir is currently Head of Data Science at Car Next Door (a start-up enabling car sharing), and he wrote about this very topic in his blog: <a href=\"https:\/\/yanirseroussi.com\/2016\/08\/04\/is-data-scientist-a-useless-job-title\/\" rel=\"noopener\">Is Data Scientist a Useless Job Title?<\/a><\/span><span style=\"font-family: arial,helvetica,sans-serif;\">\u00a0If you enjoy this, check out Yanir\u2019s other posts \u2013 he is a regular and eloquent writer on a variety of topics around data science.<\/span><\/span><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-c02782d elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"c02782d\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-ac787c4\" data-id=\"ac787c4\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-be57848 elementor-widget elementor-widget-text-editor\" data-id=\"be57848\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><strong>Owning Up To The Title<\/strong><\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">Before we delve any deeper, it is worth taking a moment to reflect on the 
\u2018science\u2019 in \u2018data science\u2019,\u00a0because \u2013 in a sense \u2013 all scientists are data scientists, as they all work with data in one form or another. But to take what is generally considered to be data science in industry, what actually makes it a science? Great question! The answer should be: <em>\u2018the scientific method\u2019<\/em>. Given the multi-disciplinary nature of science, the scientific method is the one thing that binds the fields together. If you got this right, full marks to you.<\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">However, job titles tend to be applied very loosely in industry and so not all data scientists are true scientists. Ask yourself though: can you justify calling yourself a scientist if your role does not involve actual science? Personally, I do not see what is wrong with alternatives like \u2018analyst\u2019, or whatever best fits the position in question. 
But maybe this is just me, and perhaps I would be better off calling myself a recruitment scientist.<\/span><\/span><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-8aa0f47 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"8aa0f47\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-c30033e\" data-id=\"c30033e\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-fa28b61 elementor-widget elementor-widget-text-editor\" data-id=\"fa28b61\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">For an excellent discussion on this, I thoroughly recommend this post by Sean McClure: <a href=\"http:\/\/www.linkedin.com\/pulse\/20141202183759-103457178-data-scientist-owning-up-to-the-title?trk=prof-post\" rel=\"noopener\">Data Scientist: Owning Up To The Title<\/a> (<em>yes, I admit it \u2013 I plagiarised the heading)<\/em>.<\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">And with that out the way, we will continue this exploration by considering what areas of expertise you will need to master (if you haven\u2019t already).<\/span><\/span><\/p>\n<p style=\"text-align: 
justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><strong>1. \u00a0Problem Solving<\/strong><\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">If this is not top of your list, amend that list. Immediately. At the core of all scientific disciplines is problem solving: a great data scientist is a great problem solver; it is as simple as that. Need further proof? How about every single person I met for this project, irrespective of background or current working situation, mentioned this as <strong>THE<\/strong> most important factor in data science.<\/span><\/span><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-291f0b7 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"291f0b7\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-e2cda32\" data-id=\"e2cda32\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-ded6c6f elementor-widget elementor-widget-text-editor\" data-id=\"ded6c6f\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">Clearly, you need to possess the tools to solve the problems, but they are just that: 
<em>tools<\/em>. In this sense, even the statistical\/machine learning techniques can be thought of as the tools by which you solve problems. New techniques arise, technology evolves; <em>the one constant is problem solving<\/em>.<\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">To an extent, your ability as a problem solver is dictated by your nature, but at the same time, there is only one way to improve: <strong>experience, experience, experience.<\/strong> We will revisit this in Chapter Three, so at this point, just remember this important lesson: you can only master something through doing.<\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">Before we move on, I would like to direct you to another great post from Sean McClure: <a href=\"https:\/\/www.linkedin.com\/pulse\/20141113191054-103457178-the-only-skill-you-should-be-concerned-with?trk=hp-feed-article-title-like\" rel=\"noopener\">The Only Skill You Should Be Concerned With<\/a> <em>(just to be clear, I am not receiving any payment for these pointers, but I am totally open to it. Sean \u2013 if you are reading this, you can send me money anytime).<\/em><\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><strong>2. 
\u00a0Statistics \/ Machine Learning<\/strong><\/span><\/span><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-aa263bc elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"aa263bc\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-403cd06\" data-id=\"403cd06\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-6ce2b57 elementor-widget elementor-widget-text-editor\" data-id=\"6ce2b57\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">Ok, having read the above, it might seem like I have trivialised statistics and machine learning. But we are not talking about a power tool here; these are complex \u2013 and to an extent \u2013 esoteric fields, and if you do not possess expert knowledge, you will not be solving data science problems any time soon.<\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">To provide some much-needed clarification on these terms, machine learning can be viewed as a multi-disciplinary field\u00a0that grew out of <em>both<\/em> artificial intelligence\/computer science <em>and<\/em> statistics. 
It is often seen as a subfield of AI, and while this is true, it is important to recognise that there is no machine learning without statistics (ML is heavily dependent on statistical algorithms in order to work). For a long time statisticians were unconvinced by machine learning, with collaboration between the two fields being a relatively recent development (see statistical learning theory), and it is interesting to note that high dimensional statistical learning only happened when statisticians embraced ML results (<em>thanks to Bhavani Rascutti, Advanced Analytics Domain Lead at Teradata for this input<\/em>).<\/span><\/span><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-67dc30a elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"67dc30a\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-7056a25\" data-id=\"7056a25\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-f8bb1d5 elementor-widget elementor-widget-text-editor\" data-id=\"f8bb1d5\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">For the technical readers who are interested in a more detailed account, check out this classic paper published in 2001 by Leo Breiman: Statistical Modelling: The 
Two Cultures.<\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><strong>3. \u00a0Computing <\/strong><\/span><\/span><\/p>\n<p style=\"text-align: justify; margin-left: 40px;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><strong>a. \u00a0Programming<\/strong><\/span><\/span><\/p>\n<p style=\"text-align: justify; margin-left: 40px;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">We only need to briefly touch on programming because it should be obvious: this is an absolute must. How can you apply the theory\u00a0if you cannot code a unique algorithm or build a statistical model?<\/span><\/span><\/p>\n<p style=\"text-align: justify; margin-left: 40px;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><strong>b. \u00a0Distributed Computing<\/strong><\/span><\/span><\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-cb6593e elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"cb6593e\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-843fc76\" data-id=\"843fc76\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-b5bdd86 elementor-widget elementor-widget-text-editor\" data-id=\"b5bdd86\" data-element_type=\"widget\" data-e-type=\"widget\" 
data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p style=\"text-align: justify; margin-left: 40px;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">Not all businesses have massive datasets, but considering the modern world, it is advisable to develop the ability to work with BIG DATA (!). In short: the main memory of a single computer is not going to cut it, and if you want to simultaneously train models across hundreds of virtual machines, you need to get to grips with distributed computation and parallel algorithms.<\/span><\/span><\/p>\n<p style=\"text-align: justify; margin-left: 40px;\"><span style=\"font-size: 11px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><em>Why the exclamation mark? Personally, I find the misnomer that is \u201cbig data\u201d farcical. The term is continually confused and often used as an umbrella term for all analytics. Furthermore, massive data volumes (and the technologies to store and manage these quantities) are no longer as new as they once were, so it is only a matter of time before the term expires from our lexicon. For an expanded discussion on this, there is yet another sensible post from Sean McClure: <\/em><a href=\"http:\/\/www.linkedin.com\/pulse\/data-science-big-two-very-different-beasts-sean-mcclure-ph-d-?trk=prof-post\" rel=\"noopener\"><em>Data Science and Big Data: Two Very Different Beasts<\/em><\/a><em> (this is getting ridiculous now \u2013 I swear I have never even talked to the guy). <\/em><\/span><\/span><\/p>\n<p style=\"text-align: justify; margin-left: 40px;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><strong>c. 
\u00a0Software Engineering<\/strong><\/span><\/span><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-6a5a8e6 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"6a5a8e6\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-c2eebd1\" data-id=\"c2eebd1\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-b9d6ac2 elementor-widget elementor-widget-text-editor\" data-id=\"b9d6ac2\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p style=\"text-align: justify; margin-left: 40px;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">For Type A data science, let me be clear: engineering is a separate discipline. So if this is the type of data scientist you want to become, you do not need to be an engineer. However, if you want to put machine learning algorithms into production (i.e. Type B), you will need a strong foundation in software engineering.<\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><strong>4. 
\u00a0<\/strong><\/span><\/span><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><strong>Data Wrangling<\/strong><\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">Data cleaning\/preparation is a crucial and intrinsic part of data science. <em>And this will take up the majority of your time<\/em>. If you fail to remove the noise from your dataset (e.g. wrong\/missing values, non-standardised categories, etc.), then the accuracy of the model will be affected and will ultimately lead to incorrect conclusions. Therefore, if you are not prepared to spend the time and attention on this step, it renders your advanced technical know-how irrelevant.<\/span><\/span><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-a5c0f57 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"a5c0f57\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-ba52a67\" data-id=\"ba52a67\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-9914774 elementor-widget elementor-widget-text-editor\" data-id=\"9914774\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: 
arial,helvetica,sans-serif;\">It is also important to note that data quality is a persistent issue in commercial organisations and many businesses have complicated infrastructures when it comes to data storage. So if you are not prepared for this environment and you want to work with nice clean datasets, unfortunately commercial data science is not for you.<\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><strong>5. \u00a0Tools and Technology<\/strong><\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">As you should have realised by now, developing your ability as a problem solving data scientist should take precedence over everything else: technologies constantly change and can ultimately be learnt in a relatively short timeframe. But we shouldn\u2019t ignore them altogether, so it is useful to be aware of the most widespread tools in use today.<\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">Starting with programming languages, R and Python are the most common; so if you have a choice, perhaps use one of these when you are experimenting.<\/span><\/span><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-969028e elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"969028e\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-e6905ac\" 
data-id=\"e6905ac\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-84a809d elementor-widget elementor-widget-text-editor\" data-id=\"84a809d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">Particularly in Type A data science, having the ability to visualise data in intuitive dashboards is very powerful for communicating with non-technical business stakeholders. You might have the best model and the best insights, but if you cannot present\/explain the findings effectively, what use is it? It really doesn\u2019t matter what tool you use for visualisation \u2013 it could be R, or Tableau (which seems to be the most prevalent at the moment), but honestly \u2013 the tool is unimportant.<\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">Finally,\u00a0SQL is significant, as it is the most common language used to interact with databases in industry;\u00a0whether we are talking about relational databases, or derivatives of SQL used with big data technologies. And it is the bread and butter of data wrangling \u2013 at least when working at larger scales (i.e. not in memory).\u00a0In summary: it really is worth investing your time into.<\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><strong>6. 
\u00a0Communication \/ Business Acumen<\/strong><\/span><\/span><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-54b4d6c elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"54b4d6c\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-496c689\" data-id=\"496c689\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-fd1c623 elementor-widget elementor-widget-text-editor\" data-id=\"fd1c623\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">This should not be understated. Unless you are going into something very specific, perhaps pure research (although let\u2019s face it, there aren\u2019t many of these positions around in industry), the vast majority of data science positions involve business interaction, often with individuals who are not analytically literate.<\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">Having the ability to conceptualise business problems and the environment in which they occur is critical. 
And translating statistical insights into recommended actions and implications to a lay audience is absolutely crucial, particularly for Type A data science. I was chatting to Yanir about this, and this is how he put it:<\/span><\/span><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-c317762 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"c317762\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-b76e673\" data-id=\"b76e673\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-ee3e2b9 elementor-widget elementor-widget-text-editor\" data-id=\"ee3e2b9\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<blockquote>\n<p style=\"text-align: justify;\"><span style=\"font-size: 11px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><em>\u201cI find it weird how some technical people don&#8217;t pay attention to how non-technical people&#8217;s eyes glaze over when they start using jargon. 
It&#8217;s really important to put yourself in the listener&#8217;s\/reader&#8217;s shoes\u201d<\/em><\/span><\/span><\/p>\n<\/blockquote>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><strong>Rock Stars<\/strong><\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">In case it isn\u2019t clear: I have used this heading ironically. No \u2013 data scientists are not rock stars, ninjas, unicorns or any other mythical creature. If you are planning on referring to yourself like this, perhaps take a long look in the mirror. But I digress. The point I want to make here is this: there are some data scientists who possess expert-level ability in all of the above, and perhaps more. They are rare and extremely valuable. If you have the natural ability and desire to become one of these, then great \u2013 you are going to be hot property. But if not, remember: you can specialise in certain areas of data science, and quite often, good teams are made up of data scientists with different specialities. 
Deciding what to focus on goes back to your interests and capability, and this leads us nicely to the next chapter in our journey.<\/span><\/span><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-d892c67 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"d892c67\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-8096b33\" data-id=\"8096b33\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-c7dc057 elementor-widget elementor-widget-text-editor\" data-id=\"c7dc057\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<hr \/>\n<p style=\"text-align: center;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><strong>CHAPTER TWO:\u00a0 LOOKING INWARDS<\/strong><\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">Now we are making progress.\u00a0Having successfully digested the information in Chapter One, you are nearly ready to begin formulating your personal goals and objectives. 
But first \u2013 some introspection is required \u2013 so grab a coffee, find a quiet spot, and have a deep think about:<\/span><\/span><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-2306e88 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"2306e88\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-4859810\" data-id=\"4859810\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-803985f elementor-widget elementor-widget-text-editor\" data-id=\"803985f\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ol>\n \t<li style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><em>Why do you want to be a data scientist?<\/em><\/span><\/span><\/li>\n \t<li style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><em>What type of data science interests you?<\/em><\/span><\/span><\/li>\n \t<li style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><em>What natural capabilities or relevant skills do you already possess?<\/em><\/span><\/span><\/li>\n<\/ol>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section 
class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-009a933 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"009a933\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-d256abb\" data-id=\"d256abb\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-771cb71 elementor-widget elementor-widget-text-editor\" data-id=\"771cb71\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">Why is this important? Simply put: data science is an expert field, so unless you have already mastered a lot of what we covered in Chapter One, it is not an easy (or quick) journey. 
There is an important message here, which addresses questions one and two: <strong>you need to have the right reasons <\/strong>for going down this path, otherwise \u2013 chances are \u2013 you will give up when the going gets tough (and it will).<\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">To elaborate on this message, enter Dylan Hogg.\u00a0Dylan was previously a software engineer and is now Head of Data Science at The Search Party, a start-up that has built\u00a0a platform that utilises machine learning (NLP) to link employers with relevant candidates (<em>the future of recruitment!).<\/em> Considering he has made the transition from software engineering to data science (a journey he is still on), we discussed what it takes, and he said:<\/span><\/span><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-1334f7c elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"1334f7c\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-a9d9749\" data-id=\"a9d9749\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-8e2484a elementor-widget elementor-widget-text-editor\" data-id=\"8e2484a\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<blockquote>\n<p style=\"text-align: justify;\"><span 
style=\"font-size: 11px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><em>\u201cRegardless of education or experience, there\u2019s something more fundamental, which is your nature of curiosity, determination and tenacity. There are so many times when you hit a problem: perhaps the algorithm isn\u2019t performing in the way it needs to, or perhaps the technology is being a pain. Either way, you can study machine learning algorithms or software engineering best practice, but if you\u2019re not really determined, you&#8217;re going to give up and not get through it\u201d<\/em><\/span><\/span><\/p>\n<\/blockquote>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-ecb1b07 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"ecb1b07\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-2f1dbda\" data-id=\"2f1dbda\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-1afaab7 elementor-widget elementor-widget-text-editor\" data-id=\"1afaab7\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">There you go: you won\u2019t just face problems when you are learning; you will face them continually in your working life, so you better make sure you are motivated for the 
right reasons, and not just because you think having \u2018scientist\u2019 in your title is cool.<\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">But what about question three? Why do your relevant skills matter? Well, where you are starting from affects what type of data science you are most suited to, and what you need to learn for the area that interests you. So to\u00a0answer this question sufficiently, it is necessary to explore the typical paths to data science, beginning with the wider scientific field.<\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 11px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><em>Note: There are many quantitative disciplines where you will find people with the ability to transition into data science. I won\u2019t cover them all here, but the point is this: if you take the time to really understand the different nuances of data science, you should be able to figure out how relevant your current skillset is, whatever your background. 
<\/em><\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><strong>Other Scientific Disciplines<\/strong><\/span><\/span><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-e152f02 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"e152f02\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-d0b93b7\" data-id=\"d0b93b7\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-44d7464 elementor-widget elementor-widget-text-editor\" data-id=\"44d7464\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">This is not the most common route to data science; statistics and computer science are, as we will consider next. But with scientists from many fields having highly relevant skillsets (especially in the world of physics), many have made this jump.<\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">For an explanation on why, allow me to introduce\u00a0the individual I alluded to within the introduction: Will Hanninger, a Data Scientist with Commonwealth Bank of Australia. 
In a previous life, Will was a particle physicist with CERN where he worked on the discovery of the Higgs boson (<em>very cool<\/em>), and this is what he had to say:<\/span><\/span><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-7c8752c elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"7c8752c\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-4eab833\" data-id=\"4eab833\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-6c0b83c elementor-widget elementor-widget-text-editor\" data-id=\"6c0b83c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<blockquote>\n<p style=\"text-align: justify;\"><span style=\"font-size: 11px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><em>\u201cIn physics,\u00a0you naturally learn a lot of what you need in data science: programming, manipulating data, getting the raw data and transforming it in a useful way. You learn statistics, which is important. And crucially: you learn how to solve problems. These are the basic skills needed for a data scientist\u201d<\/em><\/span><\/span><\/p>\n<\/blockquote>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">So the skillset is highly transferable, with the main box ticked: problem solving. 
The differences tend to arise in the tools and techniques; for example, while machine learning is synonymous with data science, it is less common in wider science. In any case, we are talking about very smart people here; they have the ability to learn tools and techniques in a short timeframe.<\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">I also met Sean Farrell for this project; Sean\u2019s background is in astrophysics and he moved into commercial data science with Teradata Australia, where he wrote an excellent blog post on this topic: <a href=\"http:\/\/blogs.teradata.com\/international\/sciences-loss-gain-data-science\/\" class=\"broken_link\" rel=\"noopener\">Why Science\u2019s Loss is a Gain for Data Science<\/a>. The following passage is particularly pertinent:<\/span><\/span><\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-2251e71 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"2251e71\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-f8fd93f\" data-id=\"f8fd93f\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-1334310 elementor-widget elementor-widget-text-editor\" data-id=\"1334310\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<blockquote>\n<p 
style=\"text-align: justify;\"><span style=\"font-size: 11px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><em>\u201cUntil recently there haven\u2019t been any formal training pathways to become a Data Scientist. Most Data Scientists come from backgrounds in statistics or computer science. However, while these other career paths develop some of the skills listed above, they typically don\u2019t cover all of them. Statisticians are very strong on the maths and stats side, but generally have weaker programming skills. Computer scientists are very strong in the programming arena, but typically don\u2019t have as strong a comprehension of statistics. Both have good (yet different) data analysis skill sets but can struggle with creative problem solving, which is arguably the hardest skill to teach\u201d<\/em><\/span><\/span><\/p>\n<\/blockquote>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">To avoid misunderstanding, remember the context here. Sean isn\u2019t saying that all data scientists from statistics or computer science lack creative problem solving; the argument he is making is that science filters extremely effectively for problem solving, arguably more so than statistics\/computer science.<\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><strong>Statistics<\/strong><\/span><\/span><\/p>\n<p dir=\"ltr\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">Depending on your perspective, statistics can be viewed as a mathematical tool that facilitates the scientific process, or alternatively: a science in itself. Given this ambiguity then, if you are coming from a statistics background, are you ready-made for data science? 
Semantics aside, it depends on a few factors:<\/span><\/span><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-2c4a085 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"2c4a085\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-269fe5c\" data-id=\"269fe5c\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-d922256 elementor-widget elementor-widget-text-editor\" data-id=\"d922256\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\n<ul>\n \t<li dir=\"ltr\">\n<p dir=\"ltr\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">Firstly, do you have experience with machine learning techniques? As we learnt in Chapter One, statistical modelling and machine learning are related, and they overlap in many ways. However, the latter possesses significant advantages when applied to massive datasets, and with the adoption of machine learning continuing to rise in all areas of industry, it has become a core part of all types of data science<\/span><\/span><\/p>\n<\/li>\n \t<li dir=\"ltr\">\n<p dir=\"ltr\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">Secondly, at the risk of repeating myself: what area of data science interests you? 
Clearly a statistics background is better suited to Type A positions, so if your goal is Type B work, you will have some learning to do<\/span><\/span><\/p>\n<\/li>\n \t<li dir=\"ltr\">\n<p dir=\"ltr\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">Finally, do you have practical experience working with data? Data wrangling is often a comparative weakness of those coming from statistics, and as we know: it is a crucial component of commercial data science<\/span><\/span><\/p>\n<\/li>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-7a580fb elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"7a580fb\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-b45a93d\" data-id=\"b45a93d\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-dc606f0 elementor-widget elementor-widget-text-editor\" data-id=\"dc606f0\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<blockquote>\n<p style=\"text-align: justify;\"><span style=\"font-size: 11px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><em>\u201cA lot of data science work is software engineering. Not always in the sense of designing robust systems, but simply writing software. 
A lot of tasks you can automate and if you want to run experiments, you have to write code, and if you can do it fast, it makes a huge difference. When I did my PhD, I had to run tens of thousands of experiments every day, and at this scale, it wasn\u2019t possible to do them manually. Having an engineering background meant I could do this with speed, whereas a lot of the students from other backgrounds struggled with basic software issues: they were really good at mathematics but implementing their ideas would take a long time\u201d<\/em><\/span><\/span><\/p>\n<\/blockquote>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-84782c9 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"84782c9\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-6f18048\" data-id=\"6f18048\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-14d0efc elementor-widget elementor-widget-text-editor\" data-id=\"14d0efc\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<blockquote>\n<p style=\"text-align: justify;\"><span style=\"font-size: 11px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><em>\u201cGood software engineering practices are so valuable when you want to create a robust implementation of a machine learning algorithm in a production environment. 
It\u2019s all sorts of things \u2013 like maintainable code, a shared code base so multiple people can work on it, things like logging, being able to debug problems in production, scalability \u2013 to know that once things ramp up, you\u2019ve architected it in such a way so that you can parallelise it, or add more CPU, if needed. So if you\u2019re looking for the type of roles where you need to get these things into a platform, as opposed to doing exploratory research or answering ad-hoc business questions, software engineering is so valuable\u201d<\/em><\/span><\/span><\/p>\n<\/blockquote>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-0e2be9e elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"0e2be9e\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-c1846f0\" data-id=\"c1846f0\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-e1feadf elementor-widget elementor-widget-text-editor\" data-id=\"e1feadf\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">I think that says it all, but to summarise: if you are a software engineer with a good disposition for mathematics, you are in a great position to become a (Type B) data scientist, providing you are prepared 
to put in the work to master statistics\/machine learning.<\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><strong>Mathematics<\/strong><\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">To make an obvious statement: mathematics underpins all areas of data science. Therefore, it seems reasonable to expect that many mathematicians are now plying their trade as data scientists. However, there are relatively few coming directly from mathematics, and this peculiarity piqued my interest.<\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">One explanation is that there are fewer graduates from mathematics (both pure and applied) compared to the other relevant fields of study, but this fails to tell the whole story. So, to dig deeper, I turned to Boris Savkovic, Lead Data Scientist at BuildingIQ (a start-up that uses advanced algorithms to optimise energy use in commercial buildings). 
Boris has a background in Electrical Engineering and Applied Mathematics and having worked with many mathematicians in his time, he provided the following insights:<\/span><\/span><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-ebf2894 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"ebf2894\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-846ef62\" data-id=\"846ef62\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-6b130a2 elementor-widget elementor-widget-text-editor\" data-id=\"6b130a2\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<blockquote>\n<p style=\"text-align: justify;\"><span style=\"font-size: 11px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><em>\u201cMany mathematicians have a love of theoretical problems, beautiful equations and seeing deep meaning in theorems, whereas commercial data science is empirical, messy and dirty. While some mathematicians love this, many hate it. The real world is complex, you cannot sandbox everything, you have to prioritise, appreciate the incentives of others, compromise the math and technology for short-term vs. medium-term vs. long-term, worry about diminishing returns (80\/20 rule) and deal with both deep theory and deep practice, and everything in-between. 
In short: you have to be flexible and adaptable to deal with the real world. And this is ultimately what commercial data science is about: finding faster and better practical solutions that make money. For those with heavy mathematical\/theory backgrounds who want to understand everything to the last degree, this can be very difficult, and I have seen a number of mathematics PhDs struggle badly when transitioning from research\/academia to commercial data science\u201d<\/em><\/span><\/span><\/p>\n<\/blockquote>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-a2168f9 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"a2168f9\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-caa9181\" data-id=\"caa9181\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-6915bb9 elementor-widget elementor-widget-text-editor\" data-id=\"6915bb9\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">It is important to note that Boris was referring more to pure mathematicians, and he added that he has also worked with many excellent applied mathematicians in his career. 
This seems logical because pure mathematics is likely to attract those with a love for the theory, as opposed to real world problems. And theoretical work won\u2019t involve much interaction with data, which is \u2013 you know \u2013 quite important for data science.<\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">There are exceptions of course and it ultimately comes down to individual character, not purely what someone has studied. And clearly: a lot of what mathematics graduates learn is highly transferable, so picking up the specific statistical\/machine learning techniques shouldn\u2019t be too difficult (if not already known).<\/span><\/span><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-8ae7ae5 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"8ae7ae5\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-2f9bb7f\" data-id=\"2f9bb7f\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-4c65439 elementor-widget elementor-widget-text-editor\" data-id=\"4c65439\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">In terms of suitability, most mathematicians 
are probably best equipped to learn the tools and theory for Type A data science. However, there are mathematicians who study computer science (theoretical computer science is essentially a branch of mathematics) and so people with this background may be more suited to Type B data science.<\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">There is an important lesson to take from all this, and it comes down to understanding the reality of what commercial data science involves. If you truly understand the challenges and that is what you are seeking, then go for it. But if you have a love for the theory more than the practical application, you might want to reassess your thinking.<\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><strong>The Blank Canvas<\/strong><\/span><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">If you are just starting out, perhaps you are in school, you enjoy maths, science and computing, and you like the sound of this thing called data science, well, good news: you can choose your path without being constrained by a pre-existing background. And there are now a number of specific data science-related courses, which cover both computer science and mathematics\/statistics. Just be prepared for the long haul; you will not become a data scientist overnight, as we will see in Part Two, where we will be examining: <strong><em>how to learn.<\/em><\/strong><\/span><\/span><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>This is Part One in a three-part series examining how to become a data scientist. 
Supported by extensive research and expert opinions, it aims to provide a comprehensive guide to anyone looking to move into this field, irrespective of background and experience. The topic of Part One is: &#8220;What is Data Science?&#8221;.<\/p>\n","protected":false},"author":794,"featured_media":24243,"comment_status":"open","ping_status":"open","sticky":false,"template":"single-post-2.php","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[187],"tags":[95],"ppma_author":[1614],"class_list":["post-526","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-bigdata-cloud","tag-big-data-amp-technology"],"authors":[{"term_id":1614,"user_id":794,"is_guest":0,"slug":"alec-smith","display_name":"Alec Smith","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g","user_url":"","last_name":"Smith","first_name":"Alec","job_title":"","description":"Alec is a specialist recruiter within the field of data science and engineering. The position of an agency recruiter offers a unique, cross-sector perspective of commercial analytics and he leverages this viewpoint to write about various topics within data science, technology and hiring. Originally from the UK, he is currently plying his trade in Sydney, Australia. 
Follow Alec on Twitter&nbsp;@dataramblings."}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/526","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/794"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=526"}],"version-history":[{"count":9,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/526\/revisions"}],"predecessor-version":[{"id":37485,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/526\/revisions\/37485"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/24243"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=526"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=526"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=526"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=526"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}