Deep Reinforcement Learning

Ankit Rathi
July 17, 2018

Neural networks are responsible for recent breakthroughs in problems like computer vision, machine translation and time series prediction. They can also be combined with reinforcement learning algorithms to create something as astounding as AlphaGo.

What is Deep Reinforcement Learning?

To understand deep reinforcement learning, let’s first look at some definitions from Wikipedia:

Reinforcement learning (RL) is an area of machine learning inspired by behaviourist psychology, concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward.

Deep learning is loosely related to information processing and communication patterns in a biological nervous system, such as neural coding that attempts to define a relationship between various stimuli and associated neuronal responses in the brain.

Deep reinforcement learning (DRL) is a machine learning method that extends the reinforcement learning approach using deep learning techniques.

From these definitions we can infer that traditional reinforcement learning addresses how agents can learn to take the best actions in an environment to obtain the maximum cumulative reward over time. A major part of that process has been carefully engineering feature representations. Advances in deep learning algorithms have brought a new wave of successful applications in reinforcement learning, because deep learning makes it possible to work efficiently with high-dimensional input data such as images. In this context, training a deep neural network can be seen as a deep reinforcement learning approach in which the agent learns a state abstraction and a policy approximation directly from its input data.

Why is Deep Reinforcement Learning required?

In situations where you use supervised and unsupervised learning, you already have a pretty good idea of the data you have, what’s going on and how to solve the problem. You use machine learning to find interesting patterns in that data and to reach a better solution faster. But what about situations or problem spaces where you have partial data or no data, and an agent can only learn by trial and error? This is where reinforcement learning comes in handy: domain experts and organizations typically know what they want a system to do, but they want to automate or optimize a specific process. Recent advances in deep learning have also fueled reinforcement learning, because hand-engineered features are no longer needed. After enough rounds of backpropagation, a deep neural network learns which information is important for the task.

How to use Deep Reinforcement Learning?

Reinforcement learning is inspired by behavioral psychology.

Instead of providing the model with ‘correct’ actions, we provide it with rewards and punishments. The model receives information about the current state of the environment (e.g. the computer game screen). It then outputs an action, like a joystick movement. The environment reacts to this action and provides the next state, along with any reward.

The model then learns to find actions that lead to maximum rewards.
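The interaction loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `CorridorEnv`, its `reset`/`step` methods and `run_episode` are hypothetical names invented for this example, not part of any library, and the "game" is a five-cell corridor rather than a real video game.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: a corridor of `length` cells.

    The agent starts in cell 0 and receives a reward of +1 when it
    reaches the last cell, which also ends the episode.
    """

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        """Start a new episode and return the initial state."""
        self.position = 0
        return self.position

    def step(self, action):
        """Apply an action (0 = left, 1 = right).

        Returns the next state, the reward and a flag that says
        whether the episode is over.
        """
        move = 1 if action == 1 else -1
        self.position = min(max(self.position + move, 0), self.length - 1)
        done = self.position == self.length - 1
        reward = 1.0 if done else 0.0
        return self.position, reward, done


def run_episode(env, policy, max_steps=100):
    """Let `policy` (a function state -> action) act until the episode ends."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)                   # the model outputs an action
        state, reward, done = env.step(action)   # the environment reacts
        total_reward += reward
        if done:
            break
    return total_reward


random.seed(0)
random_policy = lambda state: random.choice([0, 1])
print(run_episode(CorridorEnv(), random_policy))
```

A policy that always moves right collects the reward in four steps. A learning agent would have to discover this from the rewards alone.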

Q-learning intuition:

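Many modern RL algorithms are adaptations of Q-learning. A good way to understand Q-learning is to compare it with playing chess: the value of a move depends not only on what it gains immediately, but also on the positions it makes reachable afterwards.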

Q(S,A) = R + γ * max Q(S’,A’)

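The expected future reward Q(S,A) for a given state S and action A is calculated as the immediate reward R, plus the expected future reward thereafter, Q(S’,A’), discounted by a factor γ between 0 and 1. The max is taken over the actions A’ available in the next state S’; in other words, we assume the next action A’ is optimal.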
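As a worked example of the formula, the sketch below computes the right-hand side for one transition and then nudges a small table of Q-values toward it. The function names, the table layout, the learning rate `alpha` and the treatment of terminal states are assumptions made for this illustration; the formula in the text specifies only R, γ and the max over the next Q-values.

```python
def q_target(reward, next_q_values, gamma=0.9, done=False):
    """Right-hand side of the formula: R + gamma * max Q(S', A')."""
    if done:
        # Assumption: after a terminal state there is no future reward.
        return reward
    return reward + gamma * max(next_q_values)


def q_update(q_table, state, action, target, alpha=0.5):
    """Move Q(S, A) part of the way toward the target.

    `alpha` is a learning rate. It is an assumption of this sketch and
    does not appear in the formula above.
    """
    old_value = q_table[state][action]
    q_table[state][action] = old_value + alpha * (target - old_value)
    return q_table[state][action]


# Hypothetical table: two states, two actions each.
q_table = {"s0": [0.0, 0.0], "s1": [0.5, 2.0]}

# Taking action 1 in s0 gave reward 1.0 and led to s1.
target = q_target(reward=1.0, next_q_values=q_table["s1"])  # 1.0 + 0.9 * 2.0
new_value = q_update(q_table, "s0", action=1, target=target)
print(target, new_value)
```

With R = 1.0, γ = 0.9 and a best next value of 2.0, the target is approximately 2.8, and with `alpha = 0.5` the stored estimate moves halfway from 0.0 to that target.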

As a regression problem:

When playing a game, we generate lots of experiences. These experiences are our training data. We can frame the problem of estimating Q(S,A) as a regression problem. To solve this, we can use a neural network.
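One way to picture the regression framing is to turn each stored experience into an (input, target) pair. The sketch below assumes experiences are stored as `(state, action, reward, next_state, done)` tuples and that `q_values` is some callable returning the current Q estimates for a state; both are hypothetical choices for this example rather than details given in the article.

```python
from collections import namedtuple

# Hypothetical record for a single step of game play.
Experience = namedtuple(
    "Experience", ["state", "action", "reward", "next_state", "done"]
)


def build_regression_data(experiences, q_values, gamma=0.9):
    """Turn experiences into regression inputs and targets.

    `q_values(state)` is assumed to return a list with one Q estimate
    per action. The target follows Q(S,A) = R + gamma * max Q(S',A').
    """
    inputs, targets = [], []
    for exp in experiences:
        target = exp.reward
        if not exp.done:
            target += gamma * max(q_values(exp.next_state))
        inputs.append((exp.state, exp.action))
        targets.append(target)
    return inputs, targets


experiences = [
    Experience(state=0, action=1, reward=0.0, next_state=1, done=False),
    Experience(state=1, action=1, reward=1.0, next_state=2, done=True),
]
# Stand-in for the current model: the same estimates for every state.
current_estimates = lambda state: [0.0, 1.0]

inputs, targets = build_regression_data(experiences, current_estimates)
print(inputs, targets)
```

The model is then fitted so that its prediction for each input moves toward the matching target, as in any other regression task.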

Training the experiences:

During training, a batch of experiences is fed to the neural network, and a loss function measures how far the predicted outcome is from the actual outcome.
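A minimal sketch of such a training step, assuming mean squared error as the loss (the article does not name a specific loss function) and using a plain linear model as a stand-in for the neural network to keep the example short. The data is synthetic and all names are illustrative.

```python
import numpy as np


def mse_loss(predicted, target):
    """Mean squared error between predictions and targets."""
    return float(np.mean((predicted - target) ** 2))


def train_step(weights, features, targets, learning_rate=0.1):
    """One gradient-descent step on the MSE loss of a linear model."""
    predictions = features @ weights
    # Gradient of mean((X w - y)^2) with respect to w.
    gradient = 2.0 * features.T @ (predictions - targets) / len(targets)
    return weights - learning_rate * gradient


# Synthetic batch: 32 experiences with 4 features each (made-up data).
rng = np.random.default_rng(0)
features = rng.normal(size=(32, 4))
targets = features @ np.array([1.0, -2.0, 0.5, 0.0])

weights = np.zeros(4)
loss_before = mse_loss(features @ weights, targets)
for _ in range(100):
    weights = train_step(weights, features, targets)
loss_after = mse_loss(features @ weights, targets)
print(loss_before, loss_after)
```

The loss shrinks as the predictions move closer to the targets. A deep learning framework performs the same kind of update, with backpropagation computing the gradient through every layer.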

Building the model:

In the next step, we build a model that will learn a Q-function for the game.
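The article does not specify an architecture, so the following is only a sketch of what such a model could look like: a small network with one hidden layer, written in NumPy, that maps a state vector to one Q-value per action. The class name, layer sizes and initialization are assumptions made for illustration, and only the forward pass is shown.

```python
import numpy as np


class QNetwork:
    """Hypothetical one-hidden-layer network: state -> Q-value per action."""

    def __init__(self, state_size, num_actions, hidden_size=16, seed=0):
        rng = np.random.default_rng(seed)
        # Small random weights and zero biases (an arbitrary choice).
        self.w1 = rng.normal(scale=0.1, size=(state_size, hidden_size))
        self.b1 = np.zeros(hidden_size)
        self.w2 = rng.normal(scale=0.1, size=(hidden_size, num_actions))
        self.b2 = np.zeros(num_actions)

    def predict(self, states):
        """Forward pass for a batch of states with shape (batch, state_size)."""
        hidden = np.maximum(0.0, states @ self.w1 + self.b1)  # ReLU
        return hidden @ self.w2 + self.b2


network = QNetwork(state_size=4, num_actions=3)
q_values = network.predict(np.zeros((2, 4)))
print(q_values.shape)
```

For games with image input, the dense hidden layer would typically be replaced by convolutional layers, but the output stays the same: one estimated Q-value per possible action.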

Exploration:

The final ingredient of Q-learning is exploration: the agent sometimes chooses a random action, which is not necessarily the best one, so that it can discover outcomes it would never see by always following its current estimates.
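One common way to implement this is the epsilon-greedy rule: with probability `epsilon` the agent picks a random action, otherwise it picks the action with the highest estimated Q-value. The article does not name a specific scheme, so treat the sketch below as one possible choice; the function name and parameters are illustrative.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick an action index from a list of Q-values.

    With probability `epsilon` a random action is chosen (exploration);
    otherwise the action with the highest Q-value is chosen (exploitation).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda action: q_values[action])


q_values = [0.1, 0.9, 0.3]
greedy_action = epsilon_greedy(q_values, epsilon=0.0)    # never explores
explored_action = epsilon_greedy(q_values, epsilon=1.0)  # always explores
print(greedy_action, explored_action)
```

In practice `epsilon` is often started high and lowered over time, so the agent explores widely at first and relies more on what it has learned later on.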
