{"id":31460,"date":"2017-06-02T19:29:10","date_gmt":"2017-06-02T16:29:10","guid":{"rendered":"https:\/\/www.altoros.com\/blog\/?p=31460"},"modified":"2019-05-07T02:52:42","modified_gmt":"2019-05-06T23:52:42","slug":"mastering-game-development-with-deep-reinforcement-learning-and-gpus","status":"publish","type":"post","link":"https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/","title":{"rendered":"Mastering Game Development with Deep Reinforcement Learning and GPUs"},"content":{"rendered":"<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_79_2 counter-hierarchy ez-toc-counter ez-toc-transparent ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Table of Contents<\/p>\n<span class=\"ez-toc-title-toggle\"><a href=\"#\" class=\"ez-toc-pull-right ez-toc-btn ez-toc-btn-xs ez-toc-btn-default ez-toc-toggle\" aria-label=\"Toggle Table of Content\"><span class=\"ez-toc-js-icon-con\"><span class=\"\"><span class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #999;color:#999\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewBox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #999;color:#999\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewBox=\"0 0 24 24\" version=\"1.2\" baseProfile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/span><\/a><\/span><\/div>\n<nav><ul class='ez-toc-list ez-toc-list-level-1 ' ><li 
class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/#Deep_Q-learning_on_GPU\" >Deep Q-learning on GPU<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/#How_does_it_all_happen_to_work\" >How does it all happen to work?<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/#Accelerating_further_with_the_A3C_algorithm\" >Accelerating further with the A3C algorithm<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/#Want_details_Watch_the_video\" >Want details? 
Watch the video!<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-5\" href=\"https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/#Related_slides\" >Related slides<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-6\" href=\"https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/#Further_reading\" >Further reading<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-7\" href=\"https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/#About_the_expert\" >About the expert<\/a><\/li><\/ul><\/nav><\/div>\n<h3><span class=\"ez-toc-section\" id=\"Deep_Q-learning_on_GPU\"><\/span>Deep Q-learning on GPU<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>Deep learning solutions have penetrated multiple industries with a view to improving our everyday experiences. 
Deep learning is employed in <a href=\"https:\/\/www.altoros.com\/blog\/the-magic-behind-google-translate-sequence-to-sequence-models-and-tensorflow\/\">translation<\/a>, <a href=\"https:\/\/www.altoros.com\/blog\/cross-modal-machine-learning-as-a-way-to-prevent-improper-pathology-diagnostics\/\">medicine<\/a>, <a href=\"https:\/\/www.altoros.com\/blog\/using-long-short-term-memory-networks-and-tensorflow-for-image-captioning\/\">media and entertainment<\/a>, <a href=\"https:\/\/www.altoros.com\/blog\/deep-learning-for-cybersecurity-identifying-anomalies-and-malicious-traffic\/\">cybersecurity<\/a>, the <a href=\"https:\/\/www.altoros.com\/blog\/using-machine-learning-and-tensorflow-to-recognize-traffic-signs\/\">automotive industry<\/a>, etc.<\/p>\n<p>At a recent <a href=\"https:\/\/www.meetup.com\/TensorFlow-Denver\/events\/238464734\/\" target=\"_blank\" rel=\"noopener noreferrer\">TensorFlow meetup<\/a> in Denver, <a href=\"https:\/\/www.linkedin.com\/in\/ericharper451\/\" target=\"_blank\" rel=\"noopener noreferrer\">Eric Harper<\/a> of NVIDIA explored the world of deep reinforcement learning, highlighting the areas where it is applied and diving into the underlying algorithms and approaches. He also outlined how a GPU-based architecture may help to improve model training.<\/p>\n<p>With the Big Bang in artificial intelligence\u2014initiated through combining the powers of deep neural networks, big data, and GPUs\u2014the modern AI scene counts dozens of companies behind revolutionary products, as well as 1,000 startups with $5 billion in funding (according to Venture Scanner).<\/p>\n<p>The notion of <a href=\"https:\/\/www.altoros.com\/blog\/what-is-behind-deep-reinforcement-learning-and-transfer-learning-with-tensorflow\/\">deep reinforcement learning<\/a> came into the spotlight when Google\u2019s DeepMind used the approach to master Atari games. 
They went beyond classification and taught the model to take actions in an environment and learn from observing the results of those actions.<\/p>\n<p><center><a href=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/06\/image-net-accuracy-results-v1.jpg\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/06\/image-net-accuracy-results-v1.jpg\" alt=\"\" width=\"590\" height=\"352\" class=\"aligncenter size-full wp-image-31463\" \/><\/a><small>The accuracy rate of ImageNet (<a href=\"https:\/\/www.slideshare.net\/secret\/dNQDdao7n6uY2A\" rel=\"noopener noreferrer\" target=\"_blank\">Image credit<\/a>)<\/small><\/center><\/p>\n<p>In 2016, DeepMind\u2019s AlphaGo\u2014a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Weak_AI\" target=\"_blank\" rel=\"noopener noreferrer\">narrow AI<\/a> program designed to play <a href=\"https:\/\/en.wikipedia.org\/wiki\/Go_(game)\" target=\"_blank\" rel=\"noopener noreferrer\">the Go board game<\/a>\u2014outplayed world champion Lee Sedol. So, what was behind AlphaGo?<\/p>\n<ul>\n<li>40 search threads<\/li>\n<li>1,202 CPUs<\/li>\n<li>176 GPUs for inference only<\/li>\n<\/ul>\n<p>The training took around three weeks on 50 GPUs for the policy network and another three weeks for the value network. Eric mentioned rumors that AlphaGo had become efficient enough to run inference on just a single GPU.<\/p>\n<p>&nbsp;<\/p>\n<h3><span class=\"ez-toc-section\" id=\"How_does_it_all_happen_to_work\"><\/span>How does it all happen to work?<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>In classical reinforcement learning, an agent acts in an environment, which it can influence by taking actions. A policy determines the actions, while the environment returns a reward and a new state. 
To find a policy that maximizes the reward, one can employ <a href=\"https:\/\/www.altoros.com\/blog\/learning-game-control-strategies-with-deep-q-networks-and-tensorflow\/\">Q-learning<\/a>.<\/p>\n<p>However, reinforcement learning has its own drawbacks:<\/p>\n<ul>\n<li>state spaces can be too large<\/li>\n<li>its capabilities are limited by the need for \u201chand-crafted, task-specific\u201d features<\/li>\n<li>finally, algorithms to learn and generate data are required<\/li>\n<\/ul>\n<p><center><a href=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/06\/schematic-convolitional-neural-network-v1.jpg\"><img decoding=\"async\" src=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/06\/schematic-convolitional-neural-network-v1.jpg\" alt=\"\" width=\"640\" class=\"aligncenter size-full wp-image-31464\" \/><\/a><small>Schematic illustration of a convolutional neural network (<a href=\"https:\/\/www.nature.com\/articles\/nature14236\" target=\"_blank\" rel=\"noopener noreferrer\">Image credit<\/a>)<\/small><\/center><\/p>\n<p>To cope with having too many states and\/or actions to store in memory, and to speed up learning the value of each state, one can use value function approximation. It also makes it possible to make predictions for previously unseen states.<\/p>\n<p>When dealing with Q-learning, one may face instability due to correlated trajectories. This is where deep Q-networks come in. 
What one has to do is:<\/p>\n<ul>\n<li>Take an action according to the epsilon-greedy policy<\/li>\n<li>Store the transition in replay memory<\/li>\n<li>Sample a mini-batch from memory<\/li>\n<li>Compute Q-learning targets with respect to old, fixed parameters<\/li>\n<li>Optimize the Q-network against the Q-learning targets using stochastic gradient descent<\/li>\n<\/ul>\n<p>Eric also touched upon policy-based reinforcement learning, which can make computing a policy simpler than with the value-based approach. 
For example, in the Pong game, \u201cif the ball is coming toward you, hit it\u201d is a rule for choosing an action (policy-based), whereas \u201cif you hit the ball coming toward you, you\u2019ll get a one-point reward later\u201d is an estimate of future reward (value-based). The policy-based approach can have better convergence properties and is effective in high-dimensional or continuous action spaces, as you don\u2019t have to compute a maximum over actions.<\/p>\n<p>&nbsp;<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Accelerating_further_with_the_A3C_algorithm\"><\/span>Accelerating further with the A3C algorithm<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>In June 2016, DeepMind released the <a href=\"https:\/\/www.altoros.com\/blog\/deep-q-networks-and-practical-reinforcement-learning-with-tensorflow\/\">Asynchronous Advantage Actor-Critic<\/a> (A3C) algorithm, which can achieve better scores on a range of deep reinforcement learning tasks faster and more easily.<\/p>\n<p>Within the A3C algorithm, each worker is a separate process. It starts by synchronizing its network parameters with the global network. Then, the worker interacts with an environment, storing a list of its experiences. 
Once it has collected enough experiences, the worker calculates the value and policy losses and their gradients, which are then used to update the global network.<\/p>\n<p><center><a href=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/06\/architecture-of-asynchronous-advantage-actor-critic-algoritm-v1.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/06\/architecture-of-asynchronous-advantage-actor-critic-algoritm-v1.png\" alt=\"\" width=\"640\" height=\"714\" class=\"aligncenter size-full wp-image-31465\" \/><\/a><small>A high-level architecture of A3C (<a href=\"https:\/\/medium.com\/emergent-future\/simple-reinforcement-learning-with-tensorflow-part-8-asynchronous-actor-critic-agents-a3c-c88f72a5e9f2\" target=\"_blank\" rel=\"noopener noreferrer\">Image credit<\/a>)<\/small><\/center><\/p>\n<p>What happens when this algorithm runs on a GPU?<\/p>\n<ul>\n<li style=\"margin-bottom: 6px;\">A predictor submits predictions in batches to the model on the GPU and returns the requested policy.<\/li>\n<li style=\"margin-bottom: 6px;\">A trainer submits training batches to the GPU for model updates.<\/li>\n<li style=\"margin-bottom: 6px;\">Performance on the GPU is optimized when the application has large amounts of parallel computations that can hide the latency of fetching data from memory.<\/li>\n<li style=\"margin-bottom: 6px;\">Increasing the number of predictors allows prediction queries to be fetched faster. However, smaller prediction batches result in more data transfers and lower GPU utilization.<\/li>\n<li style=\"margin-bottom: 6px;\">A larger number of trainers leads to more frequent model updates. 
Note: too many trainers can occupy the GPU and prevent predictors from accessing it.<\/li>\n<li style=\"margin-bottom: 6px;\">More agents generate more training experiences while hiding prediction latency.<\/li>\n<li style=\"margin-bottom: 6px;\">The number of training updates per second is roughly proportional to the overall learning speed.<\/li>\n<\/ul>\n<p><center><a href=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/06\/gpu-based-architecture-vs-simple-a3c-algorithm-v1.png\"><img decoding=\"async\" src=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/06\/gpu-based-architecture-vs-simple-a3c-algorithm-v1.png\" alt=\"\" width=\"640\" class=\"aligncenter size-full wp-image-31466\" \/><\/a><small>The architecture comparison of the A3C and GA3C algorithms (<a href=\"https:\/\/openreview.net\/pdf?id=r1VGvBcxl\" rel=\"noopener noreferrer\" target=\"_blank\">Image credit<\/a>) <\/small><\/center><\/p>\n<p>Being one of the major forces driving artificial intelligence forward, Google has recently released a tensor processing unit (TPU) designed specifically to boost machine learning performance. With all those options at hand, we actually see the future happening here and now.<\/p>\n<p>&nbsp;<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Want_details_Watch_the_video\"><\/span>Want details? 
Watch the video!<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p><center><script src=\"https:\/\/fast.wistia.com\/embed\/medias\/6e9k4ln3z0.jsonp\" async=\"\"><\/script><script src=\"https:\/\/fast.wistia.com\/assets\/external\/E-v1.js\" async=\"\"><\/script><\/p>\n<div class=\"wistia_embed wistia_async_6e9k4ln3z0\" style=\"height: 360px; width: 640px;\"><\/div>\n<p><\/center><\/p>\n<p>&nbsp;<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Related_slides\"><\/span>Related slides<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p><center><iframe loading=\"lazy\" style=\"border: 1px solid #CCC; border-width: 1px; margin-bottom: 5px; max-width: 100%;\" src=\"https:\/\/www.slideshare.net\/slideshow\/embed_code\/key\/dNQDdao7n6uY2A\" width=\"427\" height=\"356\" frameborder=\"0\" marginwidth=\"0\" marginheight=\"0\" scrolling=\"no\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/center><\/p>\n<p>&nbsp;<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Further_reading\"><\/span>Further reading<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<ul>\n<li><a href=\"https:\/\/www.altoros.com\/blog\/deep-q-networks-and-practical-reinforcement-learning-with-tensorflow\/\">Deep Q-Networks and Practical Reinforcement Learning with TensorFlow<\/a><\/li>\n<li><a href=\"https:\/\/www.altoros.com\/blog\/what-is-behind-deep-reinforcement-learning-and-transfer-learning-with-tensorflow\/\">What Is Behind Deep Reinforcement Learning and Transfer Learning with TensorFlow?<\/a><\/li>\n<li><a href=\"https:\/\/www.altoros.com\/blog\/learning-game-control-strategies-with-deep-q-networks-and-tensorflow\/\">Learning Game Control Strategies with Deep Q-Networks and TensorFlow<\/a><\/li>\n<li><a href=\"https:\/\/www.altoros.com\/research-papers\/performance-of-distributed-tensorflow-a-multi-node-and-multi-gpu-configuration\/\">Performance of Distributed TensorFlow: A Multi-Node and Multi-GPU Configuration<\/a><\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<h3><span class=\"ez-toc-section\" 
id=\"About_the_expert\"><\/span>About the expert<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p><small><a href=\"https:\/\/www.linkedin.com\/in\/ericharper451\/\" target=\"_blank\" rel=\"noopener noreferrer\">Eric Harper<\/a> is a Data Scientist and Solutions Architect for NVIDIA with a focus on deep learning for enterprise. Prior to NVIDIA, he was a Lead Data Scientist for DISH Media Sales and, before entering the industry, Eric also served as a Postdoctoral Researcher in Low-Dimensional Topology at McMaster University and the University of Quebec at Montreal.<\/small><\/p>\n<hr \/>\n<p><center><small>This blog post was written by <a href=\"https:\/\/www.altoros.com\/blog\/author\/sophie.turol\/\">Sophia Turol<\/a> and <a href=\"https:\/\/www.altoros.com\/blog\/author\/alex\/\">Alex Khizhniak<\/a>.<\/small><\/center><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Deep Q-learning on GPU<\/p>\n<p>Deep learning solutions have penetrated multiple industries with a view to improving our everyday experiences. 
Deep learning is employed in translation, medicine, media and entertainment, cybersecurity, automobile industry, etc.<\/p>\n<p>At a recent TensorFlow meetup in Denver, Eric Harper of NVIDIA explored the world of deep reinforcement learning, highlighting [&#8230;]<\/p>\n","protected":false},"author":3,"featured_media":31472,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"content-type":"","footnotes":"","_links_to":"","_links_to_target":""},"categories":[214],"tags":[748,749],"class_list":["post-31460","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-tutorials","tag-machine-learning","tag-tensorflow"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.6 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Mastering Game Development with Deep Reinforcement Learning and GPUs | Altoros<\/title>\n<meta name=\"description\" content=\"This blog post explores the approaches and algorithms driving deep reinforcement learning forward, the related pitfalls, and the perks of GPU-based training.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Mastering Game Development with Deep Reinforcement Learning and GPUs | Altoros\" \/>\n<meta property=\"og:description\" content=\"Deep Q-learning on GPU Deep learning solutions has penetrated multiple industries with a view to improve our everyday experiences. Deep learning is employed in translation, medicine, media and entertainment, cybersecurity, automobile industry, etc. 
At a recent TensorFlow meetup in Denver, Eric Harper of NVIDIA explored the world of deep reinforcement learning, highlighting [...]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/\" \/>\n<meta property=\"og:site_name\" content=\"Altoros\" \/>\n<meta property=\"article:published_time\" content=\"2017-06-02T16:29:10+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2019-05-06T23:52:42+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/06\/mastering-game-dev-with-deep-reinforcement-learning-and-gpus-tensorflow.gif\" \/>\n\t<meta property=\"og:image:width\" content=\"640\" \/>\n\t<meta property=\"og:image:height\" content=\"366\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/gif\" \/>\n<meta name=\"author\" content=\"Sophia Turol\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Sophia Turol\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"5 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/\",\"url\":\"https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/\",\"name\":\"Mastering Game Development with Deep Reinforcement Learning and GPUs | Altoros\",\"isPartOf\":{\"@id\":\"https:\/\/www.altoros.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/06\/mastering-game-dev-with-deep-reinforcement-learning-and-gpus-tensorflow.gif\",\"datePublished\":\"2017-06-02T16:29:10+00:00\",\"dateModified\":\"2019-05-06T23:52:42+00:00\",\"author\":{\"@id\":\"https:\/\/www.altoros.com\/blog\/#\/schema\/person\/58194952af19fe7b2b830846e077a58e\"},\"breadcrumb\":{\"@id\":\"https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/#primaryimage\",\"url\":\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/06\/mastering-game-dev-with-deep-reinforcement-learning-and-gpus-tensorflow.gif\",\"contentUrl\":\"https:\/\/www.altoros.com\/blog\/wp-c
ontent\/uploads\/2017\/06\/mastering-game-dev-with-deep-reinforcement-learning-and-gpus-tensorflow.gif\",\"width\":640,\"height\":366},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.altoros.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Mastering Game Development with Deep Reinforcement Learning and GPUs\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.altoros.com\/blog\/#website\",\"url\":\"https:\/\/www.altoros.com\/blog\/\",\"name\":\"Altoros\",\"description\":\"Insight\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.altoros.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.altoros.com\/blog\/#\/schema\/person\/58194952af19fe7b2b830846e077a58e\",\"name\":\"Sophia Turol\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.altoros.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2019\/05\/trello_card-96x96.jpg\",\"contentUrl\":\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2019\/05\/trello_card-96x96.jpg\",\"caption\":\"Sophia Turol\"},\"description\":\"Sophia Turol is passionate about delivering well-structured articles that cater for picky technical audience. With 3+ years in technical writing and 5+ years in editorship, she enjoys collaboration with developers to create insightful, yet intelligible technical tutorials, overviews, and case studies. 
Sophie is enthusiastic about deep learning solutions\u2014TensorFlow in particular\u2014and PaaS systems, such as Cloud Foundry.\",\"url\":\"https:\/\/www.altoros.com\/blog\/author\/sophie-turol\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Mastering Game Development with Deep Reinforcement Learning and GPUs | Altoros","description":"This blog post explores the approaches and algorithms driving deep reinforcement learning forward, the related pitfalls, and the perks of GPU-based training.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/","og_locale":"en_US","og_type":"article","og_title":"Mastering Game Development with Deep Reinforcement Learning and GPUs | Altoros","og_description":"Deep Q-learning on GPU Deep learning solutions has penetrated multiple industries with a view to improve our everyday experiences. Deep learning is employed in translation, medicine, media and entertainment, cybersecurity, automobile industry, etc. At a recent TensorFlow meetup in Denver, Eric Harper of NVIDIA explored the world of deep reinforcement learning, highlighting [...]","og_url":"https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/","og_site_name":"Altoros","article_published_time":"2017-06-02T16:29:10+00:00","article_modified_time":"2019-05-06T23:52:42+00:00","og_image":[{"width":640,"height":366,"url":"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/06\/mastering-game-dev-with-deep-reinforcement-learning-and-gpus-tensorflow.gif","type":"image\/gif"}],"author":"Sophia Turol","twitter_misc":{"Written by":"Sophia Turol","Est. 
reading time":"5 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/","url":"https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/","name":"Mastering Game Development with Deep Reinforcement Learning and GPUs | Altoros","isPartOf":{"@id":"https:\/\/www.altoros.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/#primaryimage"},"image":{"@id":"https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/#primaryimage"},"thumbnailUrl":"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/06\/mastering-game-dev-with-deep-reinforcement-learning-and-gpus-tensorflow.gif","datePublished":"2017-06-02T16:29:10+00:00","dateModified":"2019-05-06T23:52:42+00:00","author":{"@id":"https:\/\/www.altoros.com\/blog\/#\/schema\/person\/58194952af19fe7b2b830846e077a58e"},"breadcrumb":{"@id":"https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.altoros.com\/blog\/mastering-game-development-with-deep-reinforcement-learning-and-gpus\/#primaryimage","url":"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/06\/mastering-game-dev-with-deep-reinforcement-learning-and-gpus-tensorflow.gif","contentUrl":"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/06\/mastering-game-dev-with-deep-reinforcement-learning-and-gpus-tensorflow.gif","width":640,"height":366},{"@type":"BreadcrumbList","@id":"https:\/\/www.altoros.com\/blog\/mastering-
game-development-with-deep-reinforcement-learning-and-gpus\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.altoros.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Mastering Game Development with Deep Reinforcement Learning and GPUs"}]},{"@type":"WebSite","@id":"https:\/\/www.altoros.com\/blog\/#website","url":"https:\/\/www.altoros.com\/blog\/","name":"Altoros","description":"Insight","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.altoros.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/www.altoros.com\/blog\/#\/schema\/person\/58194952af19fe7b2b830846e077a58e","name":"Sophia Turol","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.altoros.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2019\/05\/trello_card-96x96.jpg","contentUrl":"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2019\/05\/trello_card-96x96.jpg","caption":"Sophia Turol"},"description":"Sophia Turol is passionate about delivering well-structured articles that cater for picky technical audience. With 3+ years in technical writing and 5+ years in editorship, she enjoys collaboration with developers to create insightful, yet intelligible technical tutorials, overviews, and case studies. 
Sophie is enthusiastic about deep learning solutions\u2014TensorFlow in particular\u2014and PaaS systems, such as Cloud Foundry.","url":"https:\/\/www.altoros.com\/blog\/author\/sophie-turol\/"}]}},"_links":{"self":[{"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/posts\/31460","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/comments?post=31460"}],"version-history":[{"count":13,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/posts\/31460\/revisions"}],"predecessor-version":[{"id":42854,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/posts\/31460\/revisions\/42854"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/media\/31472"}],"wp:attachment":[{"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/media?parent=31460"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/categories?post=31460"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/tags?post=31460"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}