Being a Data Scientist does not make you a Software Engineer!

Unless you have come into Data Science and Machine Learning (ML) from an IT background and have tangible experience in building solid, distributed enterprise systems, your Jupyter notebook does not qualify as a great piece of software and sadly does not make you a Software Engineer! This blog shows you how to build a scalable architecture around your witty Data Science solution! It covers the basics of software engineering with regard to architecture and design, and how to apply them at each step of the Machine Learning pipeline.

Five things you can do today to start earning customer trust

Most organizations do not recognize how their day-to-day behavior worsens the trust problem. Nearly 6 in 10 customers don’t believe companies have their best interests in mind. Sales and marketing jargon fuels buyers' fears. It reinforces their belief that vendors do not understand their business challenges. Without a focused process for improving customer experience, trust diminishes. Lack of trust impacts a company's top- and bottom-line results. Repercussions alter sales dynamics. Your acquisition costs rise. Sales cycles grow longer, and close rates fall. Average selling price also falls victim.

A Brief Summary of Apache Hadoop: A Solution of Big Data Problem and Hint comes from Google

Welcome to this introduction to big data and Hadoop, where we talk about Apache Hadoop and the problems that big data brings with it. We look at how Apache Hadoop helps solve these problems, and then at the Apache Hadoop framework and how it works. You will learn the basics of the Hadoop framework and can build on those skills to become an expert data engineer.

The Melding of Minds: How AI and Humans are Changing the Workforce

AI is giving us a unique opportunity to rethink how work is done, and how people use their skills and talents to complete critical and differentiating tasks. AI works by applying pattern recognition to categorize structured and unstructured data, to flag anomalies and make recommendations. It can take care of the repetitive tasks so that all employees — from the back office to the CEO — can focus on higher-value projects that help them stay competitive.

Everything You Need to Know About Sales Management

Sales is an integral function of any tech company, especially an enterprise software company. If your company is building a consumer product, strong marketing strategies will help you gain customers. But if your company is SaaS-based, then strong salespeople are a must to get clients, and learning about sales is the perfect way to know more about the enterprise industry. The two most important skills that a salesperson needs to learn are deliberate practice and teamwork plus accountability.

The future of the CDO: Chief Data Officers need to sit near the top

Chief data officers are central to the success of organisations. The way that data is managed, and indeed tended and nurtured in much the same way a gardener tends and nurtures a garden, is vital. That means the chief data officer has been catapulted to somewhere near the top. The future role of the CDO is one of increasing importance. It’s about people, process and technology, but the CDO sits at the top of that pyramid.

Five Blockchain Trends for You to Consider this Year

For blockchain technology to achieve wide-scale adoption, a decentralised ecosystem has to be developed. To achieve a decentralised society, many more components need to be built, requiring global standards and large investments.  Apart from the different industry layers that need to be developed, we will also continue to see new distributed ledger technologies and more exciting decentralised applications. With the blockchain ecosystem evolving, new applications and technologies come into play as well. Therefore, here are five blockchain trends that you should consider in the coming year.

Overcoming the Pitfalls of Maintenance in Continuous Testing

The need to release software at a rapid pace requires continuous integration (CI), and continuous deployment (CD) is key to driving the frequency with which code is pushed to production. It is important for testing to start right from the requirements phase and continue all the way through production deployment and monitoring. This is what we call continuous testing (CT). There have been various challenges related to test automation, but the biggest one that still haunts teams is maintenance.

Data Scientists Are Thinkers

Unlike engineers, designers, and project managers, data scientists are exploration-first, rather than execution-first. Data science roles are a little different. They vary greatly depending on the team structure and size, but generally speaking, execution isn’t where they are at their best. Their most valuable work often comes from exploration. When it comes to complex questions and hypotheses, execution isn’t the answer. Someone has to dive in and figure things out on a deeper level. They have to thoroughly analyze and explore the problem. Data scientists are the perfect candidates to take this on.

Verifiable AI Data: Why It’s Critical for the Automation Revolution

Organizations are using data and algorithms based on that data to drive critical and automated decision-making at unprecedented scale.  But what if the data entering the AI algorithms has been compromised along the way or the algorithms themselves altered? Companies need to know that they are using pristine data in their AI systems. They must be able to stand by the integrity of the data and algorithms used by AI. This might be called Verifiable AI — when an organization can provide immutable proof that the data used by their AI systems is unaltered.
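One way to picture the "unaltered data" check is a content fingerprint. The sketch below is only an illustration: it assumes a plain SHA-256 digest recorded at ingestion time, and the names `fingerprint`, `is_unaltered`, and the sample rows are hypothetical. The summary above does not say which proof mechanism the article has in mind, and a digest is only as trustworthy as the place it is stored.

```python
import hashlib


def fingerprint(records):
    """Return a SHA-256 hex digest over an ordered sequence of text records."""
    digest = hashlib.sha256()
    for record in records:
        encoded = record.encode("utf-8")
        # Length-prefix each record so ["ab", "c"] and ["a", "bc"] hash differently.
        digest.update(len(encoded).to_bytes(8, "big"))
        digest.update(encoded)
    return digest.hexdigest()


def is_unaltered(records, expected_digest):
    """Check the records against a digest recorded earlier."""
    return fingerprint(records) == expected_digest


# Hypothetical training rows; the digest would be stored somewhere
# tamper-resistant (an assumption, not something the source specifies).
training_rows = ["id=1,amount=120.50", "id=2,amount=87.00"]
recorded = fingerprint(training_rows)

tampered_rows = ["id=1,amount=120.50", "id=2,amount=870.00"]

print(is_unaltered(training_rows, recorded))  # True
print(is_unaltered(tampered_rows, recorded))  # False
```

Any change to a record, or to the order of records, produces a different digest, so the check fails before the altered data reaches a model.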

By Jupyter-Is This the Future of Open Science?

People working in science potentially can benefit from every piece of free software code—the operating systems and apps, and the tools and libraries—so the better those become, the more useful they are for scientists. But there's one open-source project in particular that already has had a significant impact on how scientists work—Project Jupyter. Project Jupyter is a set of open-source software projects that form the building blocks for interactive and exploratory computing that is reproducible and multi-language.

Top five worst practices for BI and analytics

Achieving BI success by making it accessible to everyone starts with generating insights, then operationalising those insights and placing a monetary value on the benefits gained. The goal is to turn data into actionable insights with real business outcomes. However, there are several common mistakes organisations make when rolling out BI and analytics projects that result in their investments ending up as shelfware: unused, forgotten and representing missed opportunities.

Can EdTech broaden access to future jobs?

As Artificial Intelligence continues to automate aspects and functions of various jobs, education is changing faster than ever, with new ideas, technologies, and demands constantly emerging. At its heart, education is about opportunity, and online learning can potentially make vital opportunities available to those who would not normally have access to them. This is why it’s crucial that educators see technology not as a threat, but as a tool for enhancing their own pedagogical capacity.

Open Science Means Open Source-Or, at Least, It Should

The open science revolution can be said to have begun with open access—the idea that academic papers should be freely available as digital documents. It takes the original idea to the next level, by making that information freely accessible to all. The internet can potentially give everyone with an online connection cost-free access to every article posted online. The same can be said of another important aspect of open science: open data. Before the internet, handling data was a tedious and time-consuming process. But once digitized, even the most capacious databases can be transmitted, combined, compared and analyzed very rapidly. 

Key Kubernetes Commands

Kubernetes is the premier technology for deploying and managing large apps. In this article, we’ll get up and running with K8s on your local machine. You will learn how to set up K8s and run your first K8s app. You will also learn how to inspect, create, and delete your K8s resources with common commands. Then you’ll deploy your first app. Finally, you’ll see the top K8s commands to know.

The Machine Learning Race Is Really a Data Race

Machine learning is already becoming a commodity. Companies racing to simultaneously define and implement machine learning are finding, to their surprise, that implementing the algorithms used to make machines intelligent about a data set or problem is the easy part. There is a robust cohort of plug-and-play solutions to painlessly accomplish the heavy programmatic lifting, from the open-source machine learning framework. What’s not becoming commoditized, though, is data. Instead, data is emerging as the key differentiator in the machine learning race. This is because good data is uncommon.

How Do You Know You Have Enough Training Data?

A crucial issue in machine learning projects is determining how much training data is needed to achieve a specific performance goal (e.g., classifier accuracy). In this post, we will do a quick but broad review of empirical and research literature results on training data size, in areas ranging from regression analysis to deep learning. The training data size issue is also known in the literature as sample complexity. Specifically, we will present empirical training data size limits for regression and computer vision tasks.

Seven Guidelines to Ensure Ethical AI

As attention to AI grows, so does the call for ethical AI. This is not surprising given the many problems we have encountered already. These problems can arise when we rely too much on unaccountable AI. They exist due to biased algorithms that are trained using biased data and developed by biased developers. High-quality, unbiased data combined with the right processes to ensure ethical behaviour within a digital environment could significantly contribute to AI that can behave ethically. Since this is difficult to achieve, the European Union published a set of guidelines on how to develop ethical AI.

High Level Overview of Apache Spark

With the scale of data growing at a rapid and ominous pace, we needed a way to process potentially petabytes of data quickly, and we simply couldn’t make a single computer process that amount of data at a reasonable pace. This problem is solved by creating a cluster of machines to perform the work for you, but how do those machines work together to solve the common problem? Spark is a cluster computing framework for large-scale data processing.
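The "machines working together" idea can be pictured as splitting the data into partitions, letting each worker process its own partition, and combining the partial results. The sketch below is a toy illustration in plain Python, not Spark's API: threads on one machine stand in for cluster nodes, and `split_into_partitions`, `count_words`, and `distributed_word_count` are hypothetical names chosen for this example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(items, num_partitions):
    """Deal items round-robin into num_partitions lists."""
    return [items[i::num_partitions] for i in range(num_partitions)]


def count_words(lines):
    """Work done independently by one worker on its own partition."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def distributed_word_count(lines, num_workers=3):
    """Partition the input, count per partition, then merge the results."""
    partitions = split_into_partitions(lines, num_workers)
    # Threads stand in for separate machines (an assumption of this sketch).
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_counts = list(pool.map(count_words, partitions))
    total = Counter()
    for partial in partial_counts:
        total.update(partial)  # Counter.update adds counts together
    return total


lines = ["spark splits work", "workers share work", "spark combines results"]
totals = distributed_word_count(lines)
print(totals["spark"], totals["work"])  # 2 2
```

The result does not depend on how many workers are used, which is what lets a framework scale the same job from one machine to many.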

Why AI assistants can’t be robots (for now)

Steady advances in artificial intelligence and natural language processing have made digital assistants increasingly capable of performing complicated voice commands under different circumstances. But does it mean that our digital assistants are ready to escape the confines of smartphones, smart speakers, computers and a bunch of weird gadgets? The only way to make smart assistants really smart is to give them eyes and let them explore the world. While the idea of putting a face on the voices of digital assistants sounds appealing, the truth is that with today’s AI technology, such an idea is doomed to fail.

Why We Need Apache Spark

We have a lot of data, and we aren’t getting rid of any of it. We need a way to store increasing amounts of data at scale, with protections against data-loss stemming from hardware failure. Finally, we need a way to digest all this data with a quick feedback loop. Thank the cosmos we have Hadoop and Spark. Apache Spark is a wonderfully powerful tool for data analysis and transformation.

Opening supercomputing to all agencies

The future is in supercomputers, but until recently, only a handful of agencies have been able to tap into that kind of power. Traditionally, high performance computing (HPC, or supercomputing) has required significant capital investment for large-scale supercomputing infrastructures and operating expenses, and scientists and engineers skilled in HPC application development. Previously, few agencies had these resources and technical expertise. But times have changed, and now the software can be ported out and mainstreamed, and it’s a lot easier to make use of supercomputing in other places.

Cyber Security in the “When-Not-If” Era

Many large organisations now assume that breaches are simply inevitable, due to the inherent complexity of their business models and the multiplication of attack surfaces and attack vectors that comes with it. This realisation fundamentally changes the dynamics around cyber security. Historically, cyber security has always been seen as an equation between risk appetite, compliance requirements and costs. Compliance and costs were always the harder factors. Risk was always some form of adjustment variable.

Eight factors shaping the future of big data, machine learning and AI

The world of big data, machine learning (ML) and AI has developed rapidly over the last five years, with new technologies, processes and applications changing the way organisations manage their data. There is a good barometer of what the state of the art is in big data manipulation, as well as the concerns of developers and users. AI and machine learning combined with ever-increasing amounts of data are changing our commercial and social landscapes. A number of themes and issues are emerging within these sectors that CIOs need to be aware of.

Made in Boston @ The Harvard Innovation Lab