{"id":743,"date":"2018-06-20T02:04:54","date_gmt":"2018-06-19T23:04:54","guid":{"rendered":"http:\/\/kusuaks7\/?p=348"},"modified":"2026-02-26T11:28:31","modified_gmt":"2026-02-26T11:28:31","slug":"co-clustering-can-provide-industrial-data-pattern-discovery","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/iot\/co-clustering-can-provide-industrial-data-pattern-discovery\/","title":{"rendered":"Co-Clustering Can Provide Industrial Data Pattern Discovery"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"743\" class=\"elementor elementor-743\" data-elementor-post-type=\"post\">\n\t\t\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-268fc565 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"268fc565\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-7ae30ff6\" data-id=\"7ae30ff6\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-3df50c7a elementor-widget elementor-widget-text-editor\" data-id=\"3df50c7a\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<strong><em>Ready to learn Internet of Things? 
<a href=\"https:\/\/www.experfy.com\/training\/courses\">Browse courses<\/a>\u00a0like\u00a0<a href=\"https:\/\/www.experfy.com\/training\/tracks\/internet-of-things-training-certification\">Internet of Things (IoT) Training<\/a> developed by industry thought leaders and Experfy in Harvard Innovation Lab.<\/em><\/strong>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-393b082 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"393b082\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-bc5d9cd\" data-id=\"bc5d9cd\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-5b41c30 elementor-widget elementor-widget-text-editor\" data-id=\"5b41c30\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tDespite rapid advances in data acquisition technology and the resulting explosion in collected datasets, techniques for organizing, classifying, manipulating, and analyzing very large, diverse, heterogeneous datasets have evolved only modestly. This has hindered the effective use and understanding of large-scale acquired data for knowledge discovery. 
In an industrial setting, an interesting visual from McKinsey illustrates that despite collecting data from tens of thousands of sensors, less than 1% is actually utilized.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-a17da87 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"a17da87\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-297629e\" data-id=\"297629e\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-1a93950 elementor-widget elementor-widget-text-editor\" data-id=\"1a93950\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tData clustering is the grouping of data objects into clusters such that objects in one cluster are similar to one another and dissimilar from objects in other clusters. Typically, homogeneous data objects, i.e. data objects of the same data type, are grouped using one of the well-known clustering algorithms. However, many real-world clustering problems arising in data mining applications are pair-wise heterogeneous in nature. Clustering problems of this kind involve two data types that need to be clustered together. 
For example, in a customer relationship management (CRM) application, it is desirable to co-cluster\u00a0<em>customers\u00a0<\/em>and\u00a0<em>items purchased\u00a0<\/em>to identify the items of interest to a particular category of customers. Customized product promotion campaigns can then be targeted at the appropriate prospective customers. Collaborative information filtering applications such as movie recommender systems co-cluster the accumulated movie ratings provided by viewers and the movies they have watched. A new viewer submits a rating for a movie they liked. Using this information, the viewer is recommended other movies by assigning that rating to a\u00a0<em>viewer ratings-movies watched\u00a0<\/em>cluster. In some biomedical applications, co-clustering is performed on\u00a0<em>patient symptoms\u00a0<\/em>and\u00a0<em>medical diagnoses\u00a0<\/em>for patients in the database. Computer-aided diagnosis is then achieved for a new patient based on the symptoms provided. From the above discussion, it is clear that the two paired data types go \u201chand-in-hand\u201d. In other words, clustering one data type induces a clustering of the other, and vice versa. 
Hence, applying conventional clustering algorithms separately to each of the data types cannot produce meaningful co-clustering results.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-bf51371 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"bf51371\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-9f823e3\" data-id=\"9f823e3\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-77714ac elementor-widget elementor-widget-text-editor\" data-id=\"77714ac\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tTypically, the data is stored in a contingency or co-occurrence matrix C where rows and columns of the matrix represent the data types to be co-clustered. An entry\u00a0<em>Cij\u00a0<\/em>of the matrix signifies the relation between the data type represented by row\u00a0<em>i\u00a0<\/em>and column\u00a0<em>j<\/em>. Co-clustering is the problem of deriving sub-matrices from the larger data matrix by simultaneously clustering rows and columns of the data matrix. 
Names such as bi-clustering, bi-dimensional clustering, and block clustering, among others, are often used in the literature to refer to the same problem formulation.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-1a3c1d1 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"1a3c1d1\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-4550145\" data-id=\"4550145\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-661c9ce elementor-widget elementor-widget-text-editor\" data-id=\"661c9ce\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tOne technique for achieving co-clustering is to approach the problem from a graph theoretic point of view. That is, we model the relationship between the two data types in the co-clustering problem using a weighted bipartite graph model. The two data types represent the two kinds of vertices in the bipartite graph. 
Data co-clustering is achieved by partitioning the bipartite graph.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-fde5ba4 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"fde5ba4\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-abbbcd6\" data-id=\"abbbcd6\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-bb06bf3 elementor-widget elementor-widget-text-editor\" data-id=\"bb06bf3\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tThe square and circular vertices (<em>m\u00a0<\/em>and\u00a0<em>r<\/em>, respectively) denote the two data types in the co-clustering problem that are represented by the bipartite graph. 
Partitioning this bipartite graph leads to co-clustering of the two data types.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-686d245 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"686d245\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-e7d0529\" data-id=\"e7d0529\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-19a313c elementor-widget elementor-widget-text-editor\" data-id=\"19a313c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tI would welcome any conversation on application development to provide stronger insights for a variety of industries. We can move rapidly into Industry 4.0 by combining subject matter expertise, data collection methods and next-generation data science tools, beyond many of the &#8220;me too&#8221; products.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>Data clustering is the classification of data objects into different groups (clusters) such that data objects in one group are similar together and dissimilar from another group.&nbsp;Many of the real world data clustering problems arising in data mining applications are pair-wise heterogeneous in nature. 
Clustering problems of these kinds have two data types that need to be clustered together.&nbsp;In an industrial setting, despite collecting data from tens of thousands of sensors, less than 1% is actually utilized. We can move rapidly into Industry 4.0 by combining subject matter expertise, data collection methods and next-generation data science tools, beyond many of the &#8220;me too&#8221; products.<\/p>\n","protected":false},"author":220,"featured_media":3966,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[195],"tags":[93],"ppma_author":[1890],"class_list":["post-743","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-iot","tag-internet-of-things"],"authors":[{"term_id":1890,"user_id":220,"is_guest":0,"slug":"dan-yarmoulk","display_name":"Dan Yarmoulk","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g","user_url":"","last_name":"Yarmoulk","first_name":"Dan","job_title":"","description":"Dan Yarmoulk is Director of Business Development (IoT, Data Science) at ATEK Access Technologies, LLC. 
He is a leader in IoT and Data Science, Digital Transformation, Machine Learning, Artificial Intelligence, and Business Models"}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/743","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/220"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=743"}],"version-history":[{"count":4,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/743\/revisions"}],"predecessor-version":[{"id":38288,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/743\/revisions\/38288"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/3966"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=743"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=743"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=743"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=743"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}