{"id":25381,"date":"2021-07-07T06:54:06","date_gmt":"2021-07-07T06:54:06","guid":{"rendered":"https:\/\/www.experfy.com\/blog\/?p=25381"},"modified":"2023-08-17T11:34:21","modified_gmt":"2023-08-17T11:34:21","slug":"reinforcement-learning-pump-optimization-water-distribution-plant","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/ai-ml\/reinforcement-learning-pump-optimization-water-distribution-plant\/","title":{"rendered":"Reinforcement Learning for Pump Optimization in a Water Distribution Plant"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"25381\" class=\"elementor elementor-25381\" data-elementor-post-type=\"post\">\n\t\t\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-ad61d17 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-eae-slider=\"82088\" data-id=\"ad61d17\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-756e1c7\" data-eae-slider=\"21554\" data-id=\"756e1c7\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-3ef6545 elementor-widget elementor-widget-heading\" data-id=\"3ef6545\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\">Water Distribution System Overview:<\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element 
elementor-element-cb4f90f elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-eae-slider=\"55007\" data-id=\"cb4f90f\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-298d28a\" data-eae-slider=\"78289\" data-id=\"298d28a\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-b96776d elementor-widget elementor-widget-text-editor\" data-id=\"b96776d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>A water distribution system is a water supply network, which has components such as pumps and valves to carry water from storage tanks to water consumers in order to satisfy their requirements.\u00a0<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-88de9ff elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-eae-slider=\"85328\" data-id=\"88de9ff\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-6a66564\" data-eae-slider=\"18104\" data-id=\"6a66564\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element 
elementor-element-6ca354b elementor-widget elementor-widget-heading\" data-id=\"6ca354b\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\">Need for Pump \/ Valve optimization:<\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-1948cf2 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-eae-slider=\"9688\" data-id=\"1948cf2\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-948f228\" data-eae-slider=\"28999\" data-id=\"948f228\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-6478996 elementor-widget elementor-widget-text-editor\" data-id=\"6478996\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Inefficient pump operation can cause serious problems: high energy usage, high cost, storage tank overflow, and unmet demand. Computing pump controls in real time by direct calculation is resource-intensive and often infeasible. 
Efficient pump control can lead to low energy, low cost, tank levels under upper-bound and lower-bound constraints, and meeting demand requirements.\u00a0<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-b83ed2f elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-eae-slider=\"60966\" data-id=\"b83ed2f\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-da9186f\" data-eae-slider=\"73581\" data-id=\"da9186f\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-1b406ea elementor-widget elementor-widget-heading\" data-id=\"1b406ea\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\">Reinforcement Learning as a pump optimization solution:<\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-13280d0 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-eae-slider=\"70070\" data-id=\"13280d0\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element 
elementor-element-2258e48\" data-eae-slider=\"36696\" data-id=\"2258e48\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-3047c4d elementor-widget elementor-widget-text-editor\" data-id=\"3047c4d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Optimization does no harm in any field. Having said that, it is important to know what can be achieved by using optimization and whether it can solve the requirements. Optimizing pumps in water plants lead to low energy usage and low operation cost. Moreover, optimization should also solve existing requirements, such as fulfilling demand and keeping reservoir levels under constraint.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-bf4c728 elementor-widget elementor-widget-text-editor\" data-id=\"bf4c728\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Reinforcement Learning is a subfield of <a href=\"http:\/\/www.experfy.com\/blog\/ai-ml\/coding-deep-learning-for-beginners-types-of-machine-learning\/\" target=\"_blank\" rel=\"noopener\">machine learning<\/a>, whose goal is to find a policy with which the agent can govern the environment to the most advantageous state from any initial state in order to maximize the reward over the time.\u00a0<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-b746c50 elementor-widget elementor-widget-text-editor\" data-id=\"b746c50\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>There are 
two main components in the reinforcement learning solution: Agent and Environment. An RL Agent is an AI Algorithm, and an RL Environment is a task\/simulation that needs to be solved by an RL agent. The environment interacts with the agent by sending its state and a reward. Thus, the environment should be constructed by using simulation, a reward system, and state vectors, which represent the internal state of the simulation.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-77ecde9 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-eae-slider=\"55046\" data-id=\"77ecde9\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-f73947b\" data-eae-slider=\"26808\" data-id=\"f73947b\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-f46490a elementor-widget elementor-widget-heading\" data-id=\"f46490a\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\">1. 
Agent:<\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-502b594 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-eae-slider=\"98928\" data-id=\"502b594\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-ccf0a5a\" data-eae-slider=\"16548\" data-id=\"ccf0a5a\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-1e0114b elementor-widget elementor-widget-text-editor\" data-id=\"1e0114b\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>The Agent is nothing but an <a href=\"https:\/\/www.experfy.com\/jobs\/ai-machine-learning\">AI algorithm<\/a>. The goal\/solution is to have a \u201cReinforcement Learning Agent\u201d, which can provide an optimized pump schedule that solves the above-listed requirements and saves cost and energy. The RL Agent must communicate to the environment, and it should be reusable.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-e940e8c elementor-widget elementor-widget-text-editor\" data-id=\"e940e8c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Reinforcement Learning Agents (AI algorithms) can be model-based and model-free. 
Dynamic programming (DP) is a popular model-based approach, and popular model-free algorithms include Monte Carlo, Sarsa, and Q Learning. In 2015, DeepMind combined a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Q-learning\" target=\"_blank\" rel=\"noopener\">Q Learning<\/a> algorithm with deep neural networks and produced the Deep Q-Network (DQN), which can solve a wide variety of problems and is one of the most powerful algorithms. DQN approximates the Q-value function with a deep neural network, and it also belongs to the model-free category.\u00a0<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-7021181 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-eae-slider=\"64011\" data-id=\"7021181\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-38c07f1\" data-eae-slider=\"65009\" data-id=\"38c07f1\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-acff5fe elementor-widget elementor-widget-heading\" data-id=\"acff5fe\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\">2. 
Environment:<\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-c142d8e elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-eae-slider=\"39043\" data-id=\"c142d8e\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-7478977\" data-eae-slider=\"73557\" data-id=\"7478977\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-bc84a1e elementor-widget elementor-widget-text-editor\" data-id=\"bc84a1e\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>An environment consists of two main components: a simulation and a reward mechanism. The simulation takes the current state and the action taken by the RL agent, estimates the next state, and then updates its internal state accordingly. Simulation can be carried out by using any method. Generally, hydraulic models and ML models are the common practice for simulation. The reward mechanism is a feedback loop to the agent. It calculates the reward from the action taken by the RL agent and the outcome (simulation results). Because it drives learning for the RL agent, the reward mechanism is one of the most crucial components. 
Here are some examples:<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-cbfbf95 elementor-widget elementor-widget-text-editor\" data-id=\"cbfbf95\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ul>\n<li>Penalty on breaking reservoir constraints\u00a0<\/li>\n<li>Reward on keeping the level under constraints<\/li>\n<li>Small penalty on turning the pump ON during expensive tariff hours\u00a0<\/li>\n<li>Small reward on turning the pump ON during cheaper tariff hours<\/li>\n<li>Reward on meeting the demand\u00a0<\/li>\n<li>Penalty on not meeting the demand<\/li>\n<\/ul>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-462172a elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-eae-slider=\"18235\" data-id=\"462172a\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-513b753\" data-eae-slider=\"95981\" data-id=\"513b753\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-97166cb elementor-widget elementor-widget-heading\" data-id=\"97166cb\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\">3. 
Action:<\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-7c68b53 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-eae-slider=\"52642\" data-id=\"7c68b53\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-653ea76\" data-eae-slider=\"99546\" data-id=\"653ea76\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-72a2b24 elementor-widget elementor-widget-text-editor\" data-id=\"72a2b24\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>The Reinforcement Learning Agent takes an action to maximize the reward for the episode. In a water distribution system, the control variables are pumps (binary \/ variable speed) and valves. Based on the current state and reward received from the previous action, the agent decides the next action. 
For example, in the case of a binary pump, the agent will decide when to turn the pump ON and OFF.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-823584d elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-eae-slider=\"65690\" data-id=\"823584d\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-6239ff3\" data-eae-slider=\"11822\" data-id=\"6239ff3\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-a580d53 elementor-widget elementor-widget-heading\" data-id=\"a580d53\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\">4. 
Input\/Output:<\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-841122a elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-eae-slider=\"67238\" data-id=\"841122a\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-c67c467\" data-eae-slider=\"8634\" data-id=\"c67c467\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-58cd65b elementor-widget elementor-widget-text-editor\" data-id=\"58cd65b\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Once an RL agent is trained, it requires some parameters as inputs, and it recommends schedules in real-time. In the case of a water distribution system, an RL agent needs to know water demand and state variables (reservoir levels). Based on the current states (reservoir levels) and the predicted demand, the RL agent will take an action for the next timestamp. 
Since the RL agent is trained on a reward function, it will take actions that aim to meet all requirements:<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-a1af2b4 elementor-widget elementor-widget-text-editor\" data-id=\"a1af2b4\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ul>\n<li>Keep the reservoir level within constraints\u00a0<\/li>\n<li>Meet the demand<\/li>\n<li>Keep energy usage low<\/li>\n<li>Meet the toggle count constraint for pumps<\/li>\n<\/ul>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-bb73508 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-eae-slider=\"64442\" data-id=\"bb73508\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-18cbc62\" data-eae-slider=\"75786\" data-id=\"18cbc62\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-d365655 elementor-widget elementor-widget-heading\" data-id=\"d365655\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h5 class=\"elementor-heading-title elementor-size-default\">5. 
Terminal State:<\/h5>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-7670ad7 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-eae-slider=\"97904\" data-id=\"7670ad7\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-1e4767f\" data-eae-slider=\"36976\" data-id=\"1e4767f\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-de2cb2c elementor-widget elementor-widget-text-editor\" data-id=\"de2cb2c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>An episode in RL is a simulation run that ends in a terminal state. In other words, an RL agent can take actions or recommend schedules until it reaches a terminal state. The choice of terminal state depends heavily on the business solution. Here are some examples of terminal states. The RL agent can have multiple rules for terminal states.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-aec3c3f elementor-widget elementor-widget-text-editor\" data-id=\"aec3c3f\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ul>\n<li>The agent\u00a0is allowed to\u00a0run for a fixed number of steps to reach the most advantageous state. 
The environment keeps track of the number of steps taken, and it terminates the episode when the agent reaches the limit.\u00a0\u00a0<\/li>\n<li>The agent can decide to do nothing instead of changing one of the\u00a0control variables\/speed ratios.\u00a0\u00a0<\/li>\n<li>If the agent adheres to a state for a predefined number of consecutive steps, the episode is terminated as well.<\/li>\n<\/ul>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-3fbe663 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-eae-slider=\"37284\" data-id=\"3fbe663\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-6cff8f6\" data-eae-slider=\"79124\" data-id=\"6cff8f6\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-81f0f96 elementor-widget elementor-widget-heading\" data-id=\"81f0f96\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\">Platforms:<\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-df8e4f8 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-eae-slider=\"31162\" data-id=\"df8e4f8\" data-element_type=\"section\" 
data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-5309ccb\" data-eae-slider=\"76104\" data-id=\"5309ccb\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-dcb3811 elementor-widget elementor-widget-text-editor\" data-id=\"dcb3811\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>There are many RL platforms that are currently available that allow users to develop and compare multiple RL algorithms. Having said that, a few of them are the more popular ones such as:<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-0d81497 elementor-widget elementor-widget-text-editor\" data-id=\"0d81497\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ul>\n<li>Amazon SageMaker RL<\/li>\n<li>Google\u2019s Dopamine<\/li>\n<li>Facebook\u2019s ReAgent<\/li>\n<li>Huskarl<\/li>\n<li>Deepmind\u2019s bSuite<\/li>\n<li>OpenAI Gym<\/li>\n<\/ul>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-2c7e78e elementor-widget elementor-widget-text-editor\" data-id=\"2c7e78e\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Among all of these, OpenAI Gym is a good starting point for beginners. 
It already provides a wide variety of simulated Reinforcement Learning environments, and it has rich documentation.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>Water Distribution System Overview: A water distribution system is a water supply network, which has components such as pumps and valves to carry water from storage tanks to water consumers in order to satisfy their requirements.\u00a0 Need for Pump \/ Valve optimization: Inefficient usage of pumps can lead to enormous problems such as high energy<\/p>\n","protected":false},"author":1176,"featured_media":28847,"comment_status":"open","ping_status":"open","sticky":false,"template":"single-post-2.php","format":"standard","meta":{"footnotes":""},"categories":[183],"tags":[111,92,695],"ppma_author":[3974],"class_list":["post-25381","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-ml","tag-ai-amp-machine-learning","tag-machine-learning","tag-reinforcement-learning"],"authors":[{"term_id":3974,"user_id":1176,"is_guest":0,"slug":"harshpatel","display_name":"Harsh Patel","avatar_url":"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/06\/Harsh-Patel-1-150x150.jpeg","author_category":"","user_url":"","last_name":"Patel","first_name":"Harsh","job_title":"","description":"Harsh Patel is an Artificial Intelligence (AI) enthusiast passionate about cutting-edge technology and solving real-world problems. 
Harsh Patel is passionate about reinforcement learning (RL), computer vision and natural language processing (NLP)."}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/25381","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/1176"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=25381"}],"version-history":[{"count":0,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/25381\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/28847"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=25381"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=25381"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=25381"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=25381"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}