{"id":22654,"date":"2021-03-01T08:12:12","date_gmt":"2021-03-01T08:12:12","guid":{"rendered":"https:\/\/www.experfy.com\/blog\/density-based-and-graph-based-clustering\/"},"modified":"2023-09-04T13:11:37","modified_gmt":"2023-09-04T13:11:37","slug":"density-based-and-graph-based-clustering","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/ai-ml\/density-based-and-graph-based-clustering\/","title":{"rendered":"Density-Based And Graph-Based Clustering"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"22654\" class=\"elementor elementor-22654\" data-elementor-post-type=\"post\">\n\t\t\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-c9af24c elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"c9af24c\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-af0bc03\" data-id=\"af0bc03\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-93842c7 elementor-widget elementor-widget-text-editor\" data-id=\"93842c7\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p class=\"has-medium-font-size\"><em>What they are. When they are useful. 
How they relate to each other.<\/em><\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-e00de48 elementor-widget elementor-widget-text-editor\" data-id=\"e00de48\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"e06b\">Consider the data set<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-d34a42e elementor-widget elementor-widget-image\" data-id=\"d34a42e\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img fetchpriority=\"high\" decoding=\"async\" width=\"409\" height=\"196\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0cFA3DFk_AmSiAcz9.png\" class=\"attachment-large size-large wp-image-18826\" alt=\"Density-Based And Graph-Based Clustering\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0cFA3DFk_AmSiAcz9.png 409w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0cFA3DFk_AmSiAcz9-300x144.png 300w\" sizes=\"(max-width: 409px) 100vw, 409px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-435ea59 elementor-widget elementor-widget-text-editor\" data-id=\"435ea59\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"84e3\">What do you see? I see three definite clusters and four outliers. What makes the clusters \u2018definite\u2019? Each is dense, with many points in it. What makes the remaining points outliers? Each is a single point, far from any of the dense clusters.<\/p>\n<p id=\"c51d\">Density-based clustering is an approach that fits this&nbsp;example. 
Its aim is to find maximal clusters, each sufficiently dense. Maximal just means a cluster cannot be grown without decreasing the density significantly. For example, growing the rightmost dense cluster in the figure above by adding the point near the top right corner would decrease the density significantly, so we leave it out, preferring to think of it as an outlier instead.<\/p>\n<p id=\"4fdc\">Once the clusters have been found, clusters that have very few points in them can be deemed outliers.<\/p>\n<p id=\"9975\">Before diving into density-based clustering, let\u2019s ask the question: why not use&nbsp;<em>K<\/em>-means clustering? We\u2019ll assume the reader is familiar with this algorithm. If not, check out [1].<\/p>\n<p id=\"a407\"><em>K<\/em>-means clustering is ill-suited to this problem. For one, what&nbsp;<em>K<\/em>&nbsp;would you use? Think of it this way. Imagine that as we sampled from the data\u2019s population, the three dense clusters remained largely the same but the outliers changed, including the number of them. This is a reasonable model of a population that inherently forms a small fixed number of clusters but is prone to large random errors, i.e. has an unknown number of outliers.<\/p>\n<p id=\"3a08\">Using&nbsp;<em>K<\/em>-means clustering in the above setting would be very sensitive to the choice of K. This is because the total number of clusters can vary significantly from sample to sample. However, the number and characteristics of the dense clusters largely remain the same. So in principle, a density-based approach would be more robust.<\/p>\n<p id=\"502b\">Okay, at an intuitive level, the notion of dense clusters is simple and clear. But how do we actually find the dense clusters?<\/p>\n<p id=\"7e64\">Let\u2019s view the situation as follows. For a given distance&nbsp;<em>d<\/em>&nbsp;we can define a graph whose edges capture pairs of points that are at most distance&nbsp;<em>d<\/em>&nbsp;apart. 
That is, the points are the nodes of this graph. Two points are connected by an edge if they are within distance&nbsp;<em>d<\/em>.<\/p>\n<p id=\"06a1\">How does this help us? If we can somehow just choose the right&nbsp;<em>d<\/em>, the graph\u2019s connectivity structure will capture the cluster structure. This is depicted in the example below.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-abafd04 elementor-widget elementor-widget-image\" data-id=\"abafd04\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t<figure class=\"wp-caption\">\n\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" width=\"1024\" height=\"374\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0Y9PSJIXuEbl4Fxaa-1024x374.png\" class=\"attachment-large size-large wp-image-18827\" alt=\"Density-Based And Graph-Based Clustering\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0Y9PSJIXuEbl4Fxaa-1024x374.png 1024w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0Y9PSJIXuEbl4Fxaa-300x109.png 300w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0Y9PSJIXuEbl4Fxaa-768x280.png 768w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0Y9PSJIXuEbl4Fxaa-610x223.png 610w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0Y9PSJIXuEbl4Fxaa-750x274.png 750w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0Y9PSJIXuEbl4Fxaa-1140x416.png 1140w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0Y9PSJIXuEbl4Fxaa.png 1444w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/>\t\t\t\t\t\t\t\t\t\t\t<figcaption class=\"widget-image-caption wp-caption-text\">(a): data. (b): graph for a good d. 
(c): graph is fully-connected for very large d.<\/figcaption>\n\t\t\t\t\t\t\t\t\t\t<\/figure>\n\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-5afe7cb elementor-widget elementor-widget-text-editor\" data-id=\"5afe7cb\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"377d\">For just the right choice of&nbsp;<em>d<\/em>, we get a good graph. That is, a graph that captures the clusters. When&nbsp;<em>d<\/em>&nbsp;is too large, the graph is fully-connected (i.e., a clique). This graph is useless for clustering. When&nbsp;<em>d<\/em>&nbsp;is too small, there may be no edges in the graph. Again, useless for clustering.<\/p>\n<p id=\"8f3f\">Okay, so how do we choose a good&nbsp;<em>d<\/em>? Not an easy question to answer. Still, we\u2019ve made progress. We have transformed the problem into one involving graph clustering. So now we just have to choose a good&nbsp;<em>d<\/em>. Well, sometimes we can, using domain knowledge or exploratory analysis on the data set. What if neither helps? We\u2019ll defer that discussion until later.<\/p>\n<p id=\"cbfa\">In this particular example, for a good choice of\u00a0<em>d<\/em>, the clusters are just the connected components of the graph. (A connected component is a maximal connected piece of the graph.) There are <a href=\"https:\/\/www.experfy.com\/blog\/ai-ml\/markov-clustering-algorithm\/\" target=\"_blank\" rel=\"noreferrer noopener\">easy algorithms<\/a> to find the connected components in a graph.<\/p>\n<p id=\"ed2f\">Below is another example on which the connected components algorithm works quite well. This data set has two clusters. Their shapes are such that 2-means clustering wouldn\u2019t work well. (2-means clustering would work better with elliptical clusters.) 
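The whole recipe so far, build the graph for a given d and then take its connected components, is short enough to sketch in full. This is a minimal sketch, not a library implementation; the helper names are our own, and d is assumed to be chosen already.

```python
from collections import defaultdict, deque
from math import dist  # Euclidean distance (Python 3.8+)

def threshold_graph(points, d):
    """Adjacency lists: an edge joins every pair of points at most d apart."""
    adj = defaultdict(set)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if dist(points[i], points[j]) <= d:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def connected_components(n, adj):
    """BFS over the graph; each connected component becomes one cluster."""
    seen, clusters = set(), []
    for start in range(n):
        if start in seen:
            continue
        comp, queue = [], deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        clusters.append(comp)
    return clusters

# Toy data: two dense groups plus one far-away point.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (9, 9)]
clusters = connected_components(len(points), threshold_graph(points, 1.5))
```

Singleton components, such as the point (9, 9) here, are the natural outlier candidates.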
For the right&nbsp;<em>d<\/em>, connected components will find these two clusters.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-e51f016 elementor-widget elementor-widget-image\" data-id=\"e51f016\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t<figure class=\"wp-caption\">\n\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" width=\"264\" height=\"239\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0Dr1ULMHZ78O3wmyU.png\" class=\"attachment-large size-large wp-image-18828\" alt=\"Two Clusters\" \/>\t\t\t\t\t\t\t\t\t\t\t<figcaption class=\"widget-image-caption wp-caption-text\">Two clusters, with non-elliptical shapes.<\/figcaption>\n\t\t\t\t\t\t\t\t\t\t<\/figure>\n\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-4c4a767 elementor-widget elementor-widget-heading\" data-id=\"4c4a767\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Recap<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-f68db7e elementor-widget elementor-widget-text-editor\" data-id=\"f68db7e\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"9629\">Okay, so this is what we have so far. If we can somehow choose a good distance threshold&nbsp;<em>d<\/em>, we can transform the clustering problem into one of graph clustering. 
Even the simplest graph clustering algorithm, connected components clustering, often works well for the right choice of&nbsp;<em>d<\/em>.<\/p>\n<p id=\"b245\">Viewed this way,<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-61034ab elementor-widget elementor-widget-text-editor\" data-id=\"61034ab\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<pre class=\"wp-block-preformatted\">density-based clustering = find good <em>d<\/em> + graph-based clustering<\/pre>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-6deb653 elementor-widget elementor-widget-heading\" data-id=\"6deb653\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Beyond Connected Components Clustering<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-886f772 elementor-widget elementor-widget-text-editor\" data-id=\"886f772\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"3c02\">With the right choice of&nbsp;<em>d<\/em>, connected components clustering can often work quite well. 
But not always.<\/p>\n<p id=\"d562\">Consider the following image.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-02a8913 elementor-widget elementor-widget-image\" data-id=\"02a8913\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t<figure class=\"wp-caption\">\n\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"218\" height=\"216\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0RFr25add61xVgNP_.png\" class=\"attachment-large size-large wp-image-18829\" alt=\"\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0RFr25add61xVgNP_.png 218w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0RFr25add61xVgNP_-150x150.png 150w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0RFr25add61xVgNP_-75x75.png 75w\" sizes=\"(max-width: 218px) 100vw, 218px\" \/>\t\t\t\t\t\t\t\t\t\t\t<figcaption class=\"widget-image-caption wp-caption-text\">Two dense clusters plus an outlier cluster<\/figcaption>\n\t\t\t\t\t\t\t\t\t\t<\/figure>\n\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-9094ec2 elementor-widget elementor-widget-text-editor\" data-id=\"9094ec2\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"249f\">I see two dense clusters plus a singleton cluster (the top right point). (The singleton cluster may be interpreted as an outlier.) With just the right&nbsp;<em>d<\/em>, the graph\u2019s connected components will mirror these three clusters. However, the range of \u201cright\u201d&nbsp;<em>d<\/em>\u2019s is quite narrow. 
With a&nbsp;<em>d&nbsp;<\/em>even slightly larger, the outlier would be absorbed into its neighboring dense cluster.<\/p>\n<p id=\"7123\">Now comes the key point. Even for this&nbsp;<em>d<\/em>, there is information in the graph that allows us to keep the outlier out of the dense cluster. Specifically, the outlier has degree 0. (The degree of a node is the number of edges in the graph that touch it.) By contrast, each of the nodes in the dense cluster has a degree of at least three.<\/p>\n<p id=\"b24c\">Okay, so why don\u2019t we replace the \u201cconnected component\u201d property with a more stringent \u201cdense subgraph\u201d property? There are different ways we could define \u201cdense subgraph\u201d. One is to require that the number of edges in the subgraph is a certain minimum percentage of the number of edge slots in the subgraph. (An edge slot is a pair of nodes that an edge could connect.) A different way to define \u201cdense subgraph\u201d is to require that every node in the subgraph has a certain minimum degree within the subgraph. The latter is a stricter definition, as it requires every individual node to meet a minimum degree. The former only requires the subgraph as a whole to be sufficiently dense.<\/p>\n<p id=\"fd01\">In principle, we can partition the graph into maximal subgraphs each of which is sufficiently dense according to whatever definition of \u201cdense\u201d we have chosen. In practice, these types of partitioning problems are generally intractable. 
That is,&nbsp;<em>very<\/em>&nbsp;time consuming to solve for large graphs.<\/p>\n<p id=\"1a5d\">This is where heuristic algorithms come in.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-2ea5464 elementor-widget elementor-widget-heading\" data-id=\"2ea5464\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">DBScan<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ec6aaa6 elementor-widget elementor-widget-text-editor\" data-id=\"ec6aaa6\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"db10\">This is a widely-used density-based clustering method. It heuristically partitions the graph into subgraphs that are dense in a particular way.<\/p>\n<p id=\"d90b\">It works as follows. It takes as input the graph derived using a suitable distance threshold&nbsp;<em>d<\/em>, chosen somehow. The algorithm takes a second parameter&nbsp;<em>D<\/em>. Any node in the graph whose degree is at least&nbsp;<em>D<\/em>&nbsp;is termed a core node. A core node is called such because it will form the core of a cluster.<\/p>\n<p id=\"ef24\">For a given choice of&nbsp;<em>D<\/em>, the approach works as follows. First, we find the highest-degree node&nbsp;<em>u&nbsp;<\/em>in the graph. We start a new cluster at this node. Next, we expand this cluster to contain the neighbors of&nbsp;<em>u<\/em>. Then, if any neighbor&nbsp;<em>v&nbsp;<\/em>of&nbsp;<em>u<\/em>&nbsp;is a core node (this is where&nbsp;<em>D<\/em>&nbsp;comes in), we add the neighbors of&nbsp;<em>v&nbsp;<\/em>to this same cluster. We keep going until the cluster stops growing. 
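This growth loop might be sketched as follows. It is a simplified sketch over an adjacency-list graph, not DBScan's exact procedure; adj, the degree threshold D, and the set of still-unassigned nodes are assumed given, and the names are our own.

```python
from collections import deque

def grow_cluster(start, adj, D, unassigned):
    """Grow one cluster outward from a starting node, expanding through cores.

    A node is a core if its degree is at least D; only cores pull their
    neighbors into the cluster."""
    cluster = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if len(adj[u]) >= D:  # u is a core node: bring in all its neighbors
            for v in adj[u]:
                if v in unassigned and v not in cluster:
                    cluster.add(v)
                    queue.append(v)
    unassigned -= cluster  # freeze: these nodes are now claimed
    return cluster

# Toy graph: two cores (nodes 0 and 3) joined by an edge, with D = 3.
adj = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0, 4, 5}, 4: {3}, 5: {3}}
unassigned = set(adj)
start = max(unassigned, key=lambda u: len(adj[u]))  # highest-degree node
cluster = grow_cluster(start, adj, 3, unassigned)
```

Here the second core is reached as a neighbor of the first, so the whole toy graph ends up in a single cluster, mirroring the figure below.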
We then freeze the cluster, and repeat this process on the remainder of the graph.<\/p>\n<p id=\"b262\">This process is depicted below. In step 1, we find the first core which becomes a new cluster. This core\u2019s degree is 4, which is greater than our setting D = 3. In step 2, we expand this core to include its neighbors into the cluster. This is shown by the dashed circle. In step 3, we discover that one of these neighbors is itself a core. In step 4, we expand out the cluster further to include the neighbors of the second core. So the entire graph shown is now in a single cluster.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-2af850a elementor-widget elementor-widget-image\" data-id=\"2af850a\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t<figure class=\"wp-caption\">\n\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"968\" height=\"418\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0LyGTVdeew1KOiwoc.png\" class=\"attachment-large size-large wp-image-18830\" alt=\"Density-Based And Graph-Based Clustering\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0LyGTVdeew1KOiwoc.png 968w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0LyGTVdeew1KOiwoc-300x130.png 300w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0LyGTVdeew1KOiwoc-768x332.png 768w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0LyGTVdeew1KOiwoc-610x263.png 610w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0LyGTVdeew1KOiwoc-750x324.png 750w\" sizes=\"(max-width: 968px) 100vw, 968px\" \/>\t\t\t\t\t\t\t\t\t\t\t<figcaption class=\"widget-image-caption wp-caption-text\">(1). First core found. (2). First core\u2019s neighbors found. (3) Second core detected in first core\u2019s neighbors. 
(4) Cluster expanded to include second core\u2019s neighbors.<\/figcaption>\n\t\t\t\t\t\t\t\t\t\t<\/figure>\n\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-8ae4cbc elementor-widget elementor-widget-text-editor\" data-id=\"8ae4cbc\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"df2d\">Once we have found all the clusters, we deem the ones with sufficiently-many points in them to be dense and the rest to be outliers.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-994a6fb elementor-widget elementor-widget-heading\" data-id=\"994a6fb\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">An Alternative Visualization<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-170a7c2 elementor-widget elementor-widget-text-editor\" data-id=\"170a7c2\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"9a70\">One way to visualize this process is to imagine that from the first graph (which was derived from the data for a given&nbsp;<em>d<\/em>) we derive a second graph. The nodes of the second graph are the core nodes in the first graph. Two nodes in the second graph are connected by an edge if there is an edge between the same two nodes in the first graph.<\/p>\n<p id=\"2439\">Now we find the connected components in the second graph. Each of the resulting clusters contains one or more (core) nodes. We then expand out each of these clusters further using the edges in the original graph. 
Specifically for every core node in a cluster, we also put its neighbors in the same cluster.<\/p>\n<p id=\"aaad\">This process may be summarized as follows.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-e3f88bd elementor-widget elementor-widget-text-editor\" data-id=\"e3f88bd\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ol><li>Build G2 from G.<\/li><li>Find connected components in G2.<\/li><li>Expand each connected component using the edges in G.<\/li><li>If some nodes in G are still unassigned, find connected components in the subgraph of G induced by these nodes. These connected components become additional clusters. Moreover, we also know that these clusters are \u2018secondary\u2019 clusters as they don\u2019t have any core points in them.<\/li><\/ol>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-873aa66 elementor-widget elementor-widget-text-editor\" data-id=\"873aa66\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"9e80\">This view opens the door to using connected component algorithms or implementations.<\/p>\n<p id=\"912d\">We illustrate this process in the figure below.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-5c4a63d elementor-widget elementor-widget-image\" data-id=\"5c4a63d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t<figure class=\"wp-caption\">\n\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"661\" 
src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0bEhsYScpvkkZLnQD-1024x661.png\" class=\"attachment-large size-large wp-image-18831\" alt=\"Component Algorithms\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0bEhsYScpvkkZLnQD-1024x661.png 1024w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0bEhsYScpvkkZLnQD-300x194.png 300w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0bEhsYScpvkkZLnQD-768x496.png 768w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0bEhsYScpvkkZLnQD-610x394.png 610w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0bEhsYScpvkkZLnQD-750x484.png 750w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0bEhsYScpvkkZLnQD-1140x736.png 1140w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0bEhsYScpvkkZLnQD.png 1255w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/>\t\t\t\t\t\t\t\t\t\t\t<figcaption class=\"widget-image-caption wp-caption-text\">Exhibit (a) shows the data set. (b) shows the graph G derived from it for a certain&nbsp;<em>d<\/em>&nbsp;(not shown). (c) shows the graph G2 derived from G for D = 3. (d) shows the connected components in G2 expanded using the additional edges in G. 
(e) depicts the secondary cluster composed of nodes in G that remained unassigned after (d).<\/figcaption>\n\t\t\t\t\t\t\t\t\t\t<\/figure>\n\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-32b6805 elementor-widget elementor-widget-heading\" data-id=\"32b6805\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Choosing the Hyperparameters<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-00f355b elementor-widget elementor-widget-text-editor\" data-id=\"00f355b\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"9d5c\">So graph-based clustering has one hyperparameter:&nbsp;<em>d<\/em>. DBScan\u2019s particular algorithm has a second:&nbsp;<em>D<\/em>. How might we choose these when we have no domain knowledge to guide us?<\/p>\n<p id=\"d4ce\">There is no magical algorithm to find the best values. That said, there are some heuristic ways that ease the burden of finding good candidates. We will focus on&nbsp;<em>d<\/em>, which is the more important of the two, as it defines the graph.<\/p>\n<p id=\"5b14\">The number of edges in the graph is a monotone function of&nbsp;<em>d<\/em>. When&nbsp;<em>d<\/em>&nbsp;is smaller than the minimum distance between two points in the data set, there are no edges. As we increase&nbsp;<em>d<\/em>, we start creating edges. Eventually, when&nbsp;<em>d<\/em>&nbsp;is large enough, we get a fully-connected graph.<\/p>\n<p id=\"ca6f\">Consider this approach. Track the number of edges as a function of&nbsp;<em>d<\/em>, and pick the first&nbsp;<em>d&nbsp;<\/em>where there is a significant jump. 
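A toy sketch of this heuristic in Python. What counts as a significant jump is a judgment call; here we use a simple ratio test of our own choosing, and the function names are ours.

```python
from math import dist
from itertools import combinations

def edge_counts(points, thresholds):
    """Number of edges in the threshold graph at each candidate d."""
    pair_dists = sorted(dist(p, q) for p, q in combinations(points, 2))
    return [sum(x <= d for x in pair_dists) for d in thresholds]

def first_jump(thresholds, counts, factor=3):
    """First d at which the edge count jumps by more than `factor` times."""
    for i in range(1, len(counts)):
        if counts[i] > factor * max(counts[i - 1], 1):
            return thresholds[i]
    return None

# Two tight groups far apart: the edge count jumps once intra-cluster
# distances are reached, and again only when d spans the whole data set.
points = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1), (5, 5), (5.1, 5)]
ds = [0.05, 0.2, 1.0, 10.0]
counts = edge_counts(points, ds)
d_star = first_jump(ds, counts)
```

On this data the candidate chosen is the first threshold that captures the intra-cluster distances, well before the clusters merge.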
This is depicted below.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-8b2a5a7 elementor-widget elementor-widget-image\" data-id=\"8b2a5a7\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"505\" height=\"390\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0L2hlJBBO1beJJMd8.png\" class=\"attachment-large size-large wp-image-18832\" alt=\"Number of edges\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0L2hlJBBO1beJJMd8.png 505w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0L2hlJBBO1beJJMd8-300x232.png 300w\" sizes=\"(max-width: 505px) 100vw, 505px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-cb4319d elementor-widget elementor-widget-text-editor\" data-id=\"cb4319d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"39c2\">The intuition behind this is that if there are dense clusters in the data, the graph will suddenly get a lot of edges as&nbsp;<em>d<\/em>&nbsp;approaches the right value.<\/p>\n<p id=\"0fca\">Consider the example below. 
It\u2019s not hard to visualize the&nbsp;<em>d<\/em>&nbsp;at which suddenly lots of edges emerge.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-42dfa01 elementor-widget elementor-widget-image\" data-id=\"42dfa01\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"264\" height=\"239\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0I-C-21q2HWmaRvGJ.png\" class=\"attachment-large size-large wp-image-18833\" alt=\"edges emerge\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-b892217 elementor-widget elementor-widget-text-editor\" data-id=\"b892217\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"4d84\">In fact, the graph of the number of edges as a function of&nbsp;<em>d&nbsp;<\/em>for this picture might look like this.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-5e80b43 elementor-widget elementor-widget-image\" data-id=\"5e80b43\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"596\" height=\"425\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/05JtUEuqSBMjgFVHG.png\" class=\"attachment-large size-large wp-image-18834\" alt=\"graph of the number\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/05JtUEuqSBMjgFVHG.png 596w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/05JtUEuqSBMjgFVHG-300x214.png 300w, 
https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/05JtUEuqSBMjgFVHG-120x86.png 120w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/05JtUEuqSBMjgFVHG-350x250.png 350w\" sizes=\"(max-width: 596px) 100vw, 596px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-9435f35 elementor-widget elementor-widget-text-editor\" data-id=\"9435f35\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"69bb\">This is because the first cluster is a bit denser than the second one.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-57c201d elementor-widget elementor-widget-heading\" data-id=\"57c201d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">HCS<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-e76b3f0 elementor-widget elementor-widget-text-editor\" data-id=\"e76b3f0\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"c332\">This may be viewed as a conservative variant of connected components clustering (CCC). To see this let\u2019s rename CCC to&nbsp;<em>connected subgraphs clustering<\/em>. (This is just semantics.) HCS stands for&nbsp;<strong><em>highly<\/em><\/strong><em>&nbsp;connected subgraphs clustering<\/em>.<\/p>\n<p id=\"9e1f\">Naming aside, how does HCS work? (Previously we said this type of clustering is generally intractable.) It uses the notion of min-cuts. 
There are many heuristic algorithms for finding min-cuts or approximations to them.<\/p>\n<p id=\"5feb\">HCS starts with the entire graph and checks whether it is highly connected. We can define \u201chighly connected\u201d however we like, so long as we can test for the condition efficiently. If the graph is not highly connected, HCS partitions it into two non-empty subgraphs in such a way that the number of edges crossing the partition is minimized; such a minimizing set of edges is called a min-cut. We might add further constraints on the two subgraphs, such as requiring that neither is too small, i.e. that the partition is not too imbalanced. There are algorithms that find min-cuts under such additional constraints; we won\u2019t discuss them here.<\/p>\n<p id=\"76ae\">We then recurse on the two subgraphs.<\/p>\n<p id=\"cf50\">CCC may be viewed as the special case in which the highly connected condition is just&nbsp;<em>connected<\/em>&nbsp;and the min-cut must be of size 0.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-4b476f5 elementor-widget elementor-widget-heading\" data-id=\"4b476f5\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">Example<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-c6771f9 elementor-widget elementor-widget-text-editor\" data-id=\"c6771f9\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"3d6a\">Consider the graph depicted below. For an appropriate definition of \u201chighly connected\u201d, HCS should find the two clusters below. There is only one edge crossing the clusters.
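<\/p>\n<p>The claim about the cut can be checked programmatically. Below is a minimal sketch, assuming <em>networkx<\/em> and a hypothetical stand-in for the pictured graph (two 4-cliques joined by one bridging edge; the figure\u2019s actual layout may differ):<\/p>

```python
import networkx as nx

# Hypothetical stand-in for the pictured graph: two dense
# clusters (modeled here as 4-cliques) joined by a single bridge.
G = nx.complete_graph(4)                                # nodes 0..3
G.add_edges_from(nx.complete_graph(range(4, 8)).edges)  # nodes 4..7
G.add_edge(3, 4)                                        # the lone crossing edge

# A global min-cut: the fewest edges whose removal disconnects G.
cut = nx.minimum_edge_cut(G)
print(len(cut))  # 1 -- the bridge edge alone
```

<p>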
That is, the cut is small.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-27128c9 elementor-widget elementor-widget-image\" data-id=\"27128c9\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t<figure class=\"wp-caption\">\n\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"396\" height=\"225\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0xEZZ7PO4LfzaKyQX.png\" class=\"attachment-large size-large wp-image-18835\" alt=\"Density-Based And Graph-Based Clustering\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0xEZZ7PO4LfzaKyQX.png 396w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/0xEZZ7PO4LfzaKyQX-300x170.png 300w\" sizes=\"(max-width: 396px) 100vw, 396px\" \/>\t\t\t\t\t\t\t\t\t\t\t<figcaption class=\"widget-image-caption wp-caption-text\">Two highly-connected subgraphs connected by a small cut<\/figcaption>\n\t\t\t\t\t\t\t\t\t\t<\/figure>\n\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-de96f8c elementor-widget elementor-widget-text-editor\" data-id=\"de96f8c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"3c43\">One could reasonably make the case that the top node in the first cluster \u2014 the one whose degree is 1 \u2014 should go into its own (singleton) cluster. 
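<\/p>\n<p>The full HCS recursion can be sketched compactly. The sketch below is illustrative, not a reference implementation: it relies on <em>networkx<\/em>\u2019s exact <em>minimum_edge_cut<\/em> (practical HCS implementations often substitute heuristics) and adopts the common \u201cedge connectivity greater than n\/2\u201d test for highly connected:<\/p>

```python
import networkx as nx

def hcs(G):
    """Highly Connected Subgraphs clustering (sketch).
    G: an undirected nx.Graph. Returns a list of node sets."""
    if G.number_of_nodes() == 0:
        return []
    if not nx.is_connected(G):
        # Cluster each connected component independently.
        return [c for comp in nx.connected_components(G)
                for c in hcs(G.subgraph(comp).copy())]
    n = G.number_of_nodes()
    if n < 3:
        return [set(G.nodes)]     # treat tiny subgraphs as clusters (a simplification)
    cut = nx.minimum_edge_cut(G)  # a global min-cut of G
    if len(cut) > n / 2:          # "highly connected": edge connectivity > n/2
        return [set(G.nodes)]
    H = G.copy()
    H.remove_edges_from(cut)      # split along the min-cut...
    return hcs(H)                 # ...and recurse on the resulting parts
```

<p>On a graph like the example above (two dense clusters joined by one edge), this returns the two clusters; attach a pendant, degree-1 node and it ends up in a singleton cluster under this test.<\/p>\n<p>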
For an appropriate definition of \u201chighly connected,\u201d this can be made to happen.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-34af087 elementor-widget elementor-widget-heading\" data-id=\"34af087\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Summary<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-0741841 elementor-widget elementor-widget-text-editor\" data-id=\"0741841\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"a88c\">In this post, we covered density-based clustering and explained that it is a better fit than K-means clustering to scenarios in which (i) the number of clusters is not known in advance, (ii) there are outliers, and (iii) the clusters are not elliptical.<\/p>\n<p id=\"f94c\">During the process, we also revealed that more generally graph-based clustering has these attractive properties. 
In fact, the most popular algorithm for density-based clustering, DBSCAN, is graph-based.<\/p>\n<p id=\"8fe0\">Finally, we covered DBSCAN and a few other graph-based clustering algorithms (CCC, HCS) in some detail.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-4d7116e elementor-widget elementor-widget-heading\" data-id=\"4d7116e\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">Further Reading<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-573f9b8 elementor-widget elementor-widget-text-editor\" data-id=\"573f9b8\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ol><li><a href=\"https:\/\/towardsdatascience.com\/k-means-clustering-and-variants-703f0a09ac36\" target=\"_blank\" rel=\"noreferrer noopener\" class=\"broken_link\">K-means Clustering and Variants. 
The clustering problem is to group a\u2026 | by Arun Jagota | Dec, 2020<\/a><\/li><li><a href=\"https:\/\/en.wikipedia.org\/wiki\/DBSCAN\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/en.wikipedia.org\/wiki\/DBSCAN<\/a><\/li><li><a href=\"https:\/\/en.wikipedia.org\/wiki\/HCS_clustering_algorithm\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/en.wikipedia.org\/wiki\/HCS_clustering_algorithm<\/a><\/li><\/ol>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>This post covered density-based clustering and explained that it is a better fit than K-means clustering to scenarios in which (i) the number of clusters is not known in advance, (ii) there are outliers, and (iii) the clusters are not elliptical.<\/p>\n","protected":false},"author":1044,"featured_media":18836,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[183],"tags":[1369,1370,1371,1372],"ppma_author":[3691],"class_list":["post-22654","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-ml","tag-clustering","tag-connected-components","tag-dbscan","tag-dense-subgraphs"],"authors":[{"term_id":3691,"user_id":1044,"is_guest":0,"slug":"arun-jagota","display_name":"Arun Jagota","avatar_url":"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/Arun-Jagota-150x150.jpeg","user_url":"https:\/\/www.salesforce.com\/in\/?ir=1","last_name":"Jagota","first_name":"Arun","job_title":"","description":"Arun Jagota is Director of Data Science at Salesforce.com. A PhD in computer science, he has taught undergraduate, graduate, and continuing education courses in Computer Science at many US Universities from 1992 through 2006. 
He has written a number of books, most available at Amazon.com, 50 academic publications and has 17+ patents issued."}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/22654","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/1044"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=22654"}],"version-history":[{"count":4,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/22654\/revisions"}],"predecessor-version":[{"id":32091,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/22654\/revisions\/32091"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/18836"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=22654"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=22654"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=22654"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=22654"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}