Tirthajyoti Sarkar

About Me

Dr. Tirthajyoti Sarkar, Principal Engineer at ON Semiconductor, conducts research on and designs advanced semiconductor technology and products, which power everything from smartphones to electric cars, with data centers and washing machines in between. He also moonlights by learning and practicing data science, machine learning, and Python/R programming. He writes for multiple publications focused on data science and artificial intelligence and loves to experiment with advanced machine learning techniques for application to semiconductor designs.

Why you should forget loops and embrace vectorization for Data Science

Wherever you have a long list of data and need to perform some mathematical transformation on it, strongly consider turning those Python data structures (lists, tuples, or dictionaries) into numpy.ndarray objects and using NumPy's inherent vectorization capabilities.
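As a minimal sketch of the idea, the snippet below applies the same transformation (a base-10 logarithm, chosen only for illustration) first with an explicit Python loop and then with a single vectorized NumPy call. The sample values and variable names are assumptions, not taken from the article.

```python
import math

import numpy as np

# Illustrative sample data (hypothetical values).
values = [1.0, 10.0, 100.0, 1000.0]

# Loop-based version: one Python-level function call per element.
loop_result = [math.log10(v) for v in values]

# Vectorized version: convert once, then apply the operation to the whole array.
arr = np.array(values)
vec_result = np.log10(arr)

# Both approaches produce the same numbers.
print(np.allclose(loop_result, vec_result))
```

On long arrays the vectorized call usually runs much faster because the per-element work happens in compiled code rather than in the Python interpreter.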

Data Analytics with Python by Web scraping: Illustration with CIA World Factbook

This article walks through a demo Python notebook that shows how to crawl web pages and download raw information by parsing HTML with BeautifulSoup. It then illustrates using the regular expression module to search for and extract the pieces of information the user wants. Above all, it demonstrates why there can be no simple, universal rule or program structure when mining messy parsed HTML text: one has to examine the text structure and put appropriate error-handling checks in place to handle every situation gracefully and keep the program flowing.
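The error-handling point can be sketched with the standard-library re module alone. The helper name extract_number and the sample strings are hypothetical, and the HTML download and BeautifulSoup parsing steps are deliberately left out, so this assumes the text has already been extracted from the page.

```python
import re


def extract_number(text):
    """Return the first number found in text as a float, or None if there is none."""
    # Match an optional minus sign, digits with optional thousands separators,
    # and an optional decimal part.
    match = re.search(r"-?\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        # Messy pages often have missing fields; report that instead of crashing.
        return None
    try:
        return float(match.group(0).replace(",", ""))
    except ValueError:
        # Defensive check for oddly formatted matches.
        return None


# Hypothetical snippets of parsed text.
print(extract_number("GDP: $1,234.5 billion (2017 est.)"))
print(extract_number("GDP: NA"))
```

Returning None for a missing value lets the calling loop skip or log that record and continue with the next page.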

Step-by-step guide to build your own ‘mini IMDB’ database

How to use simple Python libraries and built-in capabilities to scrape the web for movie information and store it in a local SQLite database. This article walks through a demo Python notebook that shows how to retrieve basic information about movies using a free API service and how to save the movie posters and the downloaded information in a lightweight SQLite database. Above all, it demonstrates simple use of Python libraries such as urllib, json, and sqlite3, which are extremely useful (and powerful) tools for data analytics and web data mining tasks.
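A minimal sketch of the json-to-sqlite3 part of that workflow follows. The JSON field names, the table layout, and the save_movie helper are assumptions made for illustration, and the network call is omitted: in practice the JSON text would come from the API (for example via urllib.request.urlopen) rather than from a hard-coded string.

```python
import json
import sqlite3

# Hypothetical API response; the field names are assumed, not taken from a real service.
sample_response = '{"Title": "Example Movie", "Year": "1999", "imdbRating": "8.1"}'


def save_movie(conn, raw_json):
    """Parse one JSON movie record and store it in a 'movies' table."""
    movie = json.loads(raw_json)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS movies "
        "(title TEXT PRIMARY KEY, year INTEGER, rating REAL)"
    )
    # Parameterized query: values are passed separately from the SQL text.
    conn.execute(
        "INSERT OR REPLACE INTO movies VALUES (?, ?, ?)",
        (movie["Title"], int(movie["Year"]), float(movie["imdbRating"])),
    )
    conn.commit()


# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
save_movie(conn, sample_response)
rows = conn.execute("SELECT title, year, rating FROM movies").fetchall()
print(rows)
```

Replacing ":memory:" with a file name would persist the database on disk between runs.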

Some Essential Hacks and Tricks for Machine Learning with Python

It’s never been easier to get started with machine learning. Familiarity with, and moderate expertise in, at least one high-level programming language is useful for beginners in machine learning. You are expected mostly to use existing machine learning algorithms and apply them to novel problems, which requires you to put on a programming hat. It’s widely believed that Python helps developers be more productive from development through deployment and maintenance. This article focuses on some essential Python hacks and tricks for machine learning.

How much mathematics does an IT engineer need to learn to get into data science/machine learning?

A great many traditional IT engineers are enthusiastic about learning and contributing to the exciting field of data science and machine learning/artificial intelligence. However, your preparation for a solid grasp of machine learning or data science techniques will be incomplete without a refresher in some essential mathematics. The question then is: what are the essential topics and subtopics of mathematics that an average IT engineer must study or refresh to enter the field of business analytics, data science, or data mining? How much mathematics does an IT engineer need to learn to get into machine learning?

Eight ways to perform linear regression analysis in Python and how they scale with data set size

For a myriad of data scientists, linear regression is the starting point of many statistical modeling and predictive analysis projects. The importance of fitting (accurately and quickly) a linear model to a large data set cannot be overstated. The goal of this article is primarily to discuss the relative speed/computational complexity of these methods.
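As one hedged illustration of fitting and timing a linear model, the sketch below uses NumPy's least-squares solver on noise-free synthetic data at two data set sizes. It is not claimed to be one of the article's eight methods; the fit_line helper, the sizes, and the data are assumptions.

```python
import time

import numpy as np


def fit_line(x, y):
    """Fit y = slope * x + intercept by ordinary least squares."""
    # Design matrix with a column of ones for the intercept term.
    design = np.column_stack([x, np.ones_like(x)])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs  # array of [slope, intercept]


# Time the fit at two illustrative data set sizes.
for n in (1_000, 100_000):
    x = np.linspace(0.0, 10.0, n)
    y = 2.0 * x + 1.0  # synthetic data with known slope and intercept
    start = time.perf_counter()
    slope, intercept = fit_line(x, y)
    elapsed = time.perf_counter() - start
    print(f"n={n}: slope={slope:.3f}, intercept={intercept:.3f}, {elapsed:.6f} s")
```

Repeating the measurement for several methods and sizes is one way to compare how each approach scales.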

Essential beginners’ Q/A for machine learning/data science: Part I

Here is some useful advice, along with questions and answers, for machine learning/data science ‘starters’. We cover key books, foundation knowledge, mathematics, and programming tools needed to kickstart the journey. A curiosity to learn new things and a passion to work hard for it are necessary. You have to acquire knowledge, practice, and internalize concepts as you go. Do your own reading, understand what the field is and what it is not, where it might go, and what possibilities it can open up. Then sit back and think about how you can apply machine learning or bring data science principles into your daily work.

How the good old sorting algorithm helps a great machine learning technique

In this article, we show how the simple sorting algorithm is at the heart of solving an important problem in computational geometry and how that relates to a widely used machine learning technique. Although there are many discrete-optimization-based algorithms for solving the SVM problem, this approach demonstrates the importance of using fundamentally efficient algorithms at the core when building complex learning models for AI. A dizzying array of clever algorithms is continuously being developed to solve ML problems, learn patterns from streams of data, and build AI infrastructure.
