{"id":22633,"date":"2021-02-18T11:12:07","date_gmt":"2021-02-18T11:12:07","guid":{"rendered":"https:\/\/www.experfy.com\/blog\/definition-enterprise-in-ekgs\/"},"modified":"2023-09-05T05:13:45","modified_gmt":"2023-09-05T05:13:45","slug":"definition-enterprise-in-ekgs","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/ai-ml\/definition-enterprise-in-ekgs\/","title":{"rendered":"A Definition of \u201cEnterprise\u201d in EKGs"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"22633\" class=\"elementor elementor-22633\" data-elementor-post-type=\"post\">\n\t\t\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-1d541e8 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"1d541e8\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-bc79a50\" data-id=\"bc79a50\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-6b2da15 elementor-widget elementor-widget-text-editor\" data-id=\"6b2da15\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"a245\">Last week, I was discussing the key features of an Enterprise Knowledge Graph (EKG) with some colleagues, and I realized that although we were using the same words, we were talking about different things. 
We had a problem with the&nbsp;<em>semantics<\/em>&nbsp;of the word \u201c<em>Enterprise<\/em>\u201d.&nbsp;This is a bit ironic, since many of the people I was talking to had a strong background in semantics.<\/p>\n<p id=\"05cc\">Many people co-mingle terms from the\u00a0open linked data world and the\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Semantic_Web_Stack\" target=\"_blank\" rel=\"noreferrer noopener\">semantic web\u00a0<\/a>stack&#8217;s role with the concepts related to the sustainability and scalability of <a href=\"https:\/\/www.experfy.com\/blog\/ai-ml\/enterprise-knowledge-graph-trends\/\" target=\"_blank\" rel=\"noreferrer noopener\">enterprise knowledge graphs<\/a>. My assertion is that these are independent, orthogonal concepts. Both share the goal of highly connected and easy-to-query data. But in my definition of an EKG, there is\u00a0<strong>no<\/strong>\u00a0requirement to use any component of the semantic web stack to qualify as an enterprise knowledge graph.<\/p>\n<p id=\"aa4f\">How you connect your data within your graph is an implementation detail, not a defining characteristic of an EKG. In my experience, every project has different requirements for linking data and uses a different set of tools. 
For example, entire companies exist to help you design and build master data management systems that perform entity resolution of customer data.<\/p>\n<p id=\"8936\">To me, what defines a true EKG is an answer to the question:&nbsp;<em>Does your system support the demanding requirements of sustainable, scale-out, highly connected, and queryable datasets in&nbsp;<\/em><strong><em>large,<\/em><\/strong> <strong><em>diverse<\/em><\/strong><em>&nbsp;organizations?<\/em><\/p>\n<p id=\"d7c9\">Let&#8217;s list the attributes of graph projects that we have found to be the most relevant to predicting whether small department-level graph projects will ever evolve from a pilot phase to a true enterprise-scale solution.<\/p>\n<p id=\"41b1\">1.&nbsp;<strong>Scale-out data size<\/strong>&nbsp;\u2014 adding more RAM, SSD, and spinning disk should not interrupt services. Operations staff should be able to add new nodes to a cluster and have the data automatically migrate to the new nodes without service interruption. If you have to shut down critical services to do this, you don\u2019t really have an enterprise-class system.<\/p>\n<p id=\"d5a1\">2.&nbsp;<strong>Scale-out compute<\/strong>&nbsp;\u2014 adding CPUs should be possible without service interruption. That even includes adding new hardware to your EKG, such as an FPGA that performs real-time similarity calculations in parallel, without service interruption.<\/p>\n<p id=\"1060\">3.&nbsp;<strong>Scale-out security<\/strong>&nbsp;\u2014 adding more projects with more roles and more users should not impact system performance. The preferred method of implementing this is to use role-based access control (RBAC) at the vertex level. 
Because RBAC removes the requirement of associating a user\u2019s ID with each vertex, it is a much more scalable authorization process.<\/p>\n<p id=\"634e\">4.&nbsp;<strong>Scale-out manageability<\/strong>&nbsp;\u2014 monitoring the continual performance of hundreds of applications executing thousands of graph queries is a complex process. EKGs must have ways to integrate detailed query performance logging and quickly alert operations staff when service levels fall below expected values.<\/p>\n<p id=\"0858\">5.\u00a0<strong>Scale-out data quality<\/strong>\u00a0\u2014 EKG software must make it easy to validate data as it enters the EKG and as it evolves within the EKG when new relationships are inferred. Creating rich and maintainable data quality rules in declarative languages like\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/XML_schema\" target=\"_blank\" rel=\"noreferrer noopener\">XML Schema<\/a>\u00a0with GUI editors, and writing link-quality rules in languages such as\u00a0<a href=\"https:\/\/www.w3.org\/TR\/shacl\/\" target=\"_blank\" rel=\"noreferrer noopener\">SHACL<\/a>,\u00a0will need to be an integral component of future EKGs.<\/p>\n<p id=\"1b53\">6.\u00a0<strong>Scale-out algorithms<\/strong>\u00a0\u2014 EKGs need to run an extensive library of standard graph algorithms and a new generation of machine-learning algorithms that create\u00a0<a href=\"https:\/\/dmccreary.medium.com\/understanding-graph-embeddings-79342921a97f\" target=\"_blank\" rel=\"noreferrer noopener\" class=\"broken_link\">graph embeddings<\/a>. Complex CPU-intensive queries must be able to run on EKGs without impacting application service levels.<\/p>\n<p id=\"62a2\">7.\u00a0<strong>Scale-out query<\/strong>\u00a0\u2014 EKGs need query software that allows developers to express distributed queries in high-level query languages. 
We learned from the MapReduce days that forcing developers to write 10-page Java programs that take days to run on unindexed raw files will not cut it in the future. Features such as\u00a0<a href=\"https:\/\/www.tigergraph.com\/blog\/accumulator-101\/\" target=\"_blank\" rel=\"noreferrer noopener\">Accumulators<\/a>\u00a0in GSQL make it easy for graph query developers to express distributed queries in just a few lines of code.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ebe9aa8 elementor-widget elementor-widget-heading\" data-id=\"ebe9aa8\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Staying Small is Still OK<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-bbb3ad7 elementor-widget elementor-widget-text-editor\" data-id=\"bbb3ad7\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"6a17\">Not every graph project needs to evolve into an enterprise-scale project. Small departmental projects that focus on a few tasks can still be cost-effective and useful to the people in that department. The challenge I see is that many graph pilot teams hope that their solution will scale up but don\u2019t have a concrete plan to make that happen.<\/p>\n<p id=\"ecb5\">My advice is to begin a project with the end in mind. 
Ask around if a vendor has references that have scaled to meet the demanding needs of thousands of concurrent users using hundreds of applications with zero downtime.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-7e6d5a8 elementor-widget elementor-widget-heading\" data-id=\"7e6d5a8\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">On-the-wire vs. In-the-Can Thinking<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-6ce4582 elementor-widget elementor-widget-image\" data-id=\"6ce4582\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t<figure class=\"wp-caption\">\n\t\t\t\t\t\t\t\t\t\t<img fetchpriority=\"high\" decoding=\"async\" width=\"960\" height=\"431\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/15G6EPfIzvE2nM33NXEZl-g.png\" class=\"attachment-large size-large wp-image-18751\" alt=\"A Definition of \u201cEnterprise\u201d in EKGs\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/15G6EPfIzvE2nM33NXEZl-g.png 960w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/15G6EPfIzvE2nM33NXEZl-g-300x135.png 300w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/15G6EPfIzvE2nM33NXEZl-g-768x345.png 768w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/15G6EPfIzvE2nM33NXEZl-g-610x274.png 610w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/15G6EPfIzvE2nM33NXEZl-g-750x337.png 750w\" sizes=\"(max-width: 960px) 100vw, 960px\" \/>\t\t\t\t\t\t\t\t\t\t\t<figcaption class=\"widget-image-caption wp-caption-text\">On-the-wire thinking and in-the-can thinking are different viewpoints to approach 
the challenges of building and analyzing highly connected data. Drawing by the author.<\/figcaption>\n\t\t\t\t\t\t\t\t\t\t<\/figure>\n\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-d95ce9b elementor-widget elementor-widget-text-editor\" data-id=\"d95ce9b\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"05ca\">One of the key issues that many solution architects struggle with is the difference between the requirements for representing connected data on-the-wire and those for representing connected data in a database (in-the-can). As articulated by Tim Berners-Lee and his co-authors in their landmark\u00a0<a href=\"https:\/\/www.scientificamerican.com\/article\/the-semantic-web\/\" target=\"_blank\" rel=\"noreferrer noopener\">Scientific American article in May of 2001<\/a>, the Semantic Web&#8217;s original vision was one of\u00a0<strong>publishing<\/strong>\u00a0data, not of scalable enterprise database query technology.<\/p>\n<p id=\"2a13\">Although it was a laudable goal to use the same stack for both purposes, it didn\u2019t seem to work out. The problems with schema evolution and RDF reification (<a href=\"https:\/\/dmccreary.medium.com\/graph-databases-data-modeling-and-the-jenga-tower-metaphor-24631d492c4b\" target=\"_blank\" rel=\"noreferrer noopener\" class=\"broken_link\">The Jenga Tower Problem<\/a>) prevented large-scale RDF database systems from being cost-effective for any team of more than a dozen developers. RDF* (RDF \u201cstar\u201d) is a laudable effort, but it might be too little, too late. 
Labeled property graph (LPG) database products have now come to dominate the enterprise graph market and have shown that they can meet the enterprise&#8217;s demanding requirements.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ba36fdd elementor-widget elementor-widget-heading\" data-id=\"ba36fdd\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Conclusion<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-723af1a elementor-widget elementor-widget-text-editor\" data-id=\"723af1a\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"bf65\">Words are important. My advice is to ask your colleagues and your vendor precise questions about what they mean by the word \u201cEnterprise\u201d when describing their graph products and services. 
If they insist that you must use the semantic web stack to be considered enterprise-worthy, then seek clarification on what their requirements are for success.<\/p>\n<p id=\"800d\">If colleagues talk about on-the-wire issues related to precise data publishing, then suggest they use the word \u201csemantic knowledge graph.\u201d If they try to address the issues of a specific department, then try using the word \u201cproject knowledge graph\u201d or \u201cdepartment knowledge graph\u201d.<\/p>\n<p id=\"fde0\">If colleagues clearly articulate the need for sustainable scale-out graph databases that support large organizations&#8217; diverse needs, you might have a start on gaining consensus on what the word \u201c<em>Enterprise<\/em>\u201d really means in the phrase \u201cEnterprise Knowledge Graph\u201d.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>Ask your colleagues and your vendor precise questions about what they mean when they use the word \u201cEnterprise\u201d when describing their graph products and services.<\/p>\n","protected":false},"author":993,"featured_media":18752,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[183],"tags":[97,1346,1119,92,1347],"ppma_author":[3677],"class_list":["post-22633","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-ml","tag-artificial-intelligence","tag-enterprise","tag-knowledge-graph","tag-machine-learning","tag-semantics"],"authors":[{"term_id":3677,"user_id":993,"is_guest":0,"slug":"dan-mccreary","display_name":"Dan McCreary","avatar_url":"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/Dan-McCreary.jpeg","user_url":"https:\/\/www.optum.com\/","last_name":"McCreary","first_name":"Dan","job_title":"","description":"Dan 
McCreary is a distinguished Engineer in AI and Graph at Optum, a health services and innovation company. He is the co-author of the highly rated book \"Making Sense of NoSQL\" and co-founder of the \"NoSQL Now!\" conference."}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/22633","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/993"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=22633"}],"version-history":[{"count":4,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/22633\/revisions"}],"predecessor-version":[{"id":32179,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/22633\/revisions\/32179"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/18752"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=22633"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=22633"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=22633"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=22633"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}