{"id":22789,"date":"2021-05-06T08:04:00","date_gmt":"2021-05-06T08:04:00","guid":{"rendered":"https:\/\/www.experfy.com\/blog\/in-situ-machine-learning\/"},"modified":"2023-08-21T10:32:42","modified_gmt":"2023-08-21T10:32:42","slug":"in-situ-machine-learning","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/ai-ml\/in-situ-machine-learning\/","title":{"rendered":"In Situ Machine Learning"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"22789\" class=\"elementor elementor-22789\" data-elementor-post-type=\"post\">\n\t\t\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-b322ff7 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"b322ff7\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-e5c298e\" data-id=\"e5c298e\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-ae5aee3 elementor-widget elementor-widget-image\" data-id=\"ae5aee3\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img fetchpriority=\"high\" decoding=\"async\" width=\"762\" height=\"248\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1M5l4Gu7ERNAg5i7_nj3rng-1.png\" class=\"attachment-large size-large wp-image-30904\" alt=\"In Situ Machine Learning\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1M5l4Gu7ERNAg5i7_nj3rng-1.png 762w, 
https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1M5l4Gu7ERNAg5i7_nj3rng-1-300x98.png 300w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1M5l4Gu7ERNAg5i7_nj3rng-1-610x199.png 610w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1M5l4Gu7ERNAg5i7_nj3rng-1-750x244.png 750w\" sizes=\"(max-width: 762px) 100vw, 762px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-11395f2 elementor-widget elementor-widget-text-editor\" data-id=\"11395f2\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"64de\">It is complex and expensive to extract data from a database, send it to a GPU, train a model, and then use this model to enrich a database. What if we could leave our data \u201cin place\u201d and continuously run algorithms that would automatically enrich our database with new insights? This is the vision behind a new generation of systems called \u201cIn Situ\u201d machine learning systems. They reflect a new trend to integrate machine learning directly into our enterprise knowledge graphs.<\/p>\n\n<p id=\"dfc3\">The term\u00a0<em>In Situ<\/em>\u00a0means \u201cin the original place\u201d. In this context, it implies that we will keep data in place in our enterprise graph, and we are going to design our systems to minimize the need to move data around. If we look at the problem from the Systems Thinking perspective, we realize that the reason that we started moving data around is that older databases were incredibly inefficient at traversing relationships and binding specific datatypes to computational resources. This is because older relational databases were designed to run on a\u00a0<strong>single<\/strong>\u00a0server. 
Before the arrival of Massively Parallel Processing (MPP) enterprise knowledge graphs, the design of the older relational databases made it difficult to do analysis over a cluster of hundreds of nodes. With the arrival of Graphcore and the\u00a0<a href=\"https:\/\/pheedloop.com\/graphaiworld\/site\/sessions\/?event=graphaiworld&amp;section=33510&amp;id=SESTOTWYA57ZVHEQR\" target=\"_blank\" rel=\"noreferrer noopener\" class=\"broken_link\">new Intel PIUMA architectures<\/a>, we need to start thinking of our databases as integrated data-compute resources distributed over many servers and sometimes even across different data centers. It is now our task to rethink every process in our enterprise knowledge graph that requires unnecessary data movement.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-aa26c3e elementor-widget elementor-widget-heading\" data-id=\"aa26c3e\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">The Benefits of In Situ Machine Learning<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-23c3d11 elementor-widget elementor-widget-text-editor\" data-id=\"23c3d11\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"0ecf\">I was introduced to the concept of In Situ Machine Learning by Dr. Changran Liu of TigerGraph. I had seen Dr. Liu do in-database machine learning as a \u201ctrick\u201d in the past using GSQL functions within TigerGraph. But at the time, my concern was protecting our production systems from slow responses for our 25,000 concurrent users. 
I didn\u2019t think carefully about it as a realistic long-term goal for a time when we have 1,000x the compute that we have now. I didn\u2019t appreciate the deep architectural tradeoffs of doing In Situ Machine Learning. Dr. Liu was the first person to carefully articulate some of the benefits of In Situ Machine Learning. Here are a few of the benefits of In Situ Machine Learning from Dr. Liu and some that I have added myself:<\/p>\n\n<ol>\n<li>Avoid slow and costly processes of moving data in and out of your database<\/li>\n<li>Avoid the security and audit problems of possible data leakage by excessive data movement<\/li>\n<li>Better support for the new generation of \u201cEyes Off\u201d machine learning where our developers don\u2019t have to see sensitive information such as Personal Healthcare Information<\/li>\n<li>Better support for continuous model evolution over rapidly changing data sets<\/li>\n<li>Fewer limitations in model size<\/li>\n<li>Better utilization of the existing compute resources that are closer to the actual data<\/li>\n<\/ol>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-5dcb963 elementor-widget elementor-widget-heading\" data-id=\"5dcb963\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Why In Situ Machine Learning is More Natural<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-094e2f2 elementor-widget elementor-widget-text-editor\" data-id=\"094e2f2\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"1f1e\">And if we think carefully about In Situ Machine Learning, we realize that it is a much more natural process. 
From research on\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Neurogenesis\" target=\"_blank\" rel=\"noreferrer noopener\">neurogenesis<\/a>, we have learned that our brains learn continuously. Throughout our lives, new neurons and synapses are continuously being generated and removed in our brains. And we do this\u00a0<strong>without<\/strong>\u00a0having to dump the data in our brains to external systems! So why can\u2019t our enterprise knowledge graphs do the same thing?<\/p>\n\n<h2 id=\"2faf\">Understanding the Legacy of Video Games and GPUs<\/h2>\n\n<p id=\"8cf8\">The answer is that building deep learning neural networks is fundamentally a massively parallel processing task. The\u00a0<strong>only<\/strong>\u00a0hardware we had sitting around at the time was some cool hardware originally designed to speed up rendering in video games: Graphics Processing Units (GPUs). But this historical hack should not be confused with good architecture, sound design, and Systems Thinking. Our design principle of \u201c<em>don\u2019t move data if you don\u2019t have to<\/em>\u201d\u00a0should always be weighed against adherence to old ways of doing things \u201c<em>because that is the way we have always done things<\/em>.\u201d<\/p>\n\n<p id=\"a843\">You can see a video of Dr. Liu demonstrating In Situ Machine Learning at the Graph + AI World conference session:\u00a0<strong>Hands-on Workshop: Accelerating Machine Learning with Graph Algorithms<\/strong>\u00a0<a href=\"https:\/\/pheedloop.com\/graphaiworld\/site\/sessions\/?id=gQWetS\" target=\"_blank\" rel=\"noreferrer noopener\" class=\"broken_link\">here<\/a>. Dr. Liu\u2019s In Situ Machine Learning demo starts around 1:15 in the workshop.<\/p>\n\n<p id=\"8359\">I close with a statement of gratitude for the people who put on the Graph + AI World conference. 
It was really eye-opening for me to get exposed to the diverse set of speakers from a vast number of companies and industries. A big shout-out to my former colleague Jonathan Herke and the rest of the team from TigerGraph. I know you worked hard to make this conference happen, and it was totally worth it for my coworkers and me.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>What if we could leave our data \u201cin place\u201d and continuously run algorithms that would automatically enrich our database with new insights? This is the vision behind a new generation of systems called \u201cIn Situ\u201d machine learning systems.<\/p>\n","protected":false},"author":993,"featured_media":19323,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[183],"tags":[97,1564,92],"ppma_author":[3677],"class_list":["post-22789","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-ml","tag-artificial-intelligence","tag-in-situ-machine-learning","tag-machine-learning"],"authors":[{"term_id":3677,"user_id":993,"is_guest":0,"slug":"dan-mccreary","display_name":"Dan McCreary","avatar_url":"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/Dan-McCreary.jpeg","user_url":"https:\/\/www.optum.com\/","last_name":"McCreary","first_name":"Dan","job_title":"","description":"Dan McCreary is a Distinguished Engineer in AI and Graph at Optum, a health services and innovation company. 
He is the co-author of the highly rated book \"Making Sense of NoSQL\" and co-founder of the \"NoSQL Now!\" conference."}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/22789","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/993"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=22789"}],"version-history":[{"count":6,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/22789\/revisions"}],"predecessor-version":[{"id":30908,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/22789\/revisions\/30908"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/19323"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=22789"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=22789"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=22789"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=22789"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}