{"id":1106,"date":"2019-02-15T10:31:58","date_gmt":"2019-02-15T10:31:58","guid":{"rendered":"http:\/\/kusuaks7\/?p=711"},"modified":"2023-08-08T13:49:24","modified_gmt":"2023-08-08T13:49:24","slug":"suggestion-and-opinion-mining-from-qualitative-surveys","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/bigdata-cloud\/suggestion-and-opinion-mining-from-qualitative-surveys\/","title":{"rendered":"Suggestion and Opinion Mining from Qualitative Surveys"},"content":{"rendered":"<div>\n<h3>Problem Definition and Motivation<\/h3>\n<p>For a long time, open-ended qualitative surveys were usually conducted within small focus groups and the survey responses were typically manually coded and interpreted. Today with the explosion of social media conversations and online reviews, the notion of an <em>Always On<\/em> focus group is real, and brands have started using automated tools based on machine learning and NLP to understand customer sentiments, opinions, suggestions, and needs from the continuous stream of conversations.<\/p>\n<p>Recently, we were involved in analyzing traveler feedback on a newly opened airport in Asia. The airport had been operational for about 8 months, generating close to 300,000 qualitative comments in social media and email surveys. The dataset contained suggestions and opinions on various aspects of the airport, ranging from boarding procedures to airport amenities. Our task was to understand what aspects of the airport the travelers were most satisfied or dissatisfied with, and which ones needed improvement. 
The results of our analysis were then fed into an action plan to help improve the airport\u2019s operations.<\/p>\n<h3>Our Approach<\/h3>\n<p>Essentially, we extracted a combination of Sentiments, Intent, Interest, and Suggestions from the surveys and grouped them by airport-specific attributes to obtain a deeper understanding of the issues.<\/p>\n<p>Two issues that we needed to consider while approaching this problem were:<\/p>\n<ol>\n<li>Customers who provide suggestions are generally unhappy about the aspect being discussed. For example, in the sentence <strong><em>I hope the security is good next time I come here<\/em><\/strong>, the customer is actually suggesting an improvement in security, which indicates <em>dissatisfaction<\/em>. Sentiment extraction that relies on keyword-based features such as positive and negative adjectives would not identify such suggestions properly and would probably mistake them for positive expressions.<\/li>\n<li>It is also essential to determine which airport feature the customer is talking about, and accordingly to identify the sentences that express an opinion about that particular aspect.<\/li>\n<\/ol>\n<p>In our case, opinionated statements fell into one of the following types:<\/p>\n<ol>\n<li>Positive Expressions \u2013 sentences talking about a feature in a positive context<\/li>\n<li>Negative Expressions \u2013 sentences talking about a feature in a negative context<\/li>\n<li>Suggestive Expressions \u2013 sentences expressing a desire for improvement \/ wishful expressions<\/li>\n<\/ol>\n<p>So in our approach, we proceeded to identify the three types of expressions that customers used to express 
opinions about the airport. In our case, we assumed that both negative and suggestive expressions indicate a negative opinion. Another challenge in our task was to identify the aspects about which customers were expressing opinions. The task objective was explicitly to capture opinions about a certain set of features (<strong>food and beverage, shopping, restrooms, etc.<\/strong>). Manually building a terminology for these aspects of the airport is a time-consuming process, and the resulting terminology may not be complete. Therefore, we used a topic modeling-based approach to build our terminology (described in the Features Extraction section below).<\/p>\n<p>Using our extracted terminology, we then proceeded to identify suggestions and sentiment-expressing sentences for the gathered features, using the following approaches:<\/p>\n<h3>1.\u00a0Features Extraction<\/h3>\n<p>We used a topic modeling algorithm to extract the themes that occur across the reviews. After the Gibbs Sampling iterations converged, we obtained the topics associated with each review. A topic is a multinomial distribution over words. For example, the topic (<strong>airport, restroom, wait, queue<\/strong>) signifies that the theme of a review is related to \u201cwaiting for the restroom.\u201d From the topics obtained, we manually created a terminology of features. For an in-depth understanding of how topic models work, please refer to the excellent paper <a href=\"https:\/\/www.cs.princeton.edu\/~blei\/papers\/BleiNgJordan2003.pdf\" rel=\"noopener\"><strong><em>Latent Dirichlet Allocation<\/em><\/strong><\/a> by <strong>David Blei, Andrew Ng, and Michael Jordan.<\/strong><\/p>\n<h3>2.\u00a0Suggestion Extraction<\/h3>\n<p>Suggestion extraction is a complex task in itself, and previous research on it is not conclusive. 
To extract suggestions, we used some of the approaches followed by <strong>Ramanand et al.<\/strong> in <a href=\"http:\/\/www.aclweb.org\/anthology\/W10-0207\" rel=\"noopener\"><strong><em>Wishful Thinking: Finding suggestions and \u2018buy\u2019 wishes from product reviews<\/em><\/strong><\/a>. They describe rule-based approaches for extracting suggestive text from corpora. By manually browsing through random reviews, we observed that modal verbs are an important marker of suggestive sentences. For example, modal verbs like <strong><em>would<\/em><\/strong> and <strong><em>should<\/em><\/strong> need to be considered when building rules for suggestion classification. Some of the rules we used when designing our suggestion extractor are given below:<\/p>\n<p>1.\u00a0<strong>&lt;modal_verb&gt;&lt;auxiliary_verb&gt;&lt;optional_window_size_of_3&gt;&lt;positive_sentiment_words&gt;<\/strong><\/p>\n<p>Some examples matching this rule are <strong>\u201cwould be great\u201d<\/strong> and <strong>\u201ccould be really good\u201d<\/strong>.<\/p>\n<p>2.\u00a0<strong>&lt;modal_verb&gt;&lt;optional_window_size_of_3&gt;&lt;preference_verb&gt;<\/strong><\/p>\n<p>Some examples matching this rule are <strong>\u201cwould like\u201d<\/strong> and <strong>\u201cwill really love\u201d<\/strong>.<\/p>\n<p>Also, some general patterns found through manual inspection were used, such as <strong>\u201cshould come with\u201d<\/strong> and <strong>\u201creally needs to\u201d<\/strong>.<\/p>\n<p>Our suggestion extractor, trained on these rules, was used to extract suggestions from the reviews. 
Our vocabulary was then used to identify all the suggestions that relate to a particular feature of the airport.<\/p>\n<h3>3.\u00a0Sentiment Extraction<\/h3>\n<p>After filtering out suggestive snippets, we used our vocabulary to extract expressions conveying a positive or negative opinion from the review corpus. For sentiment extraction, we used our in-house sentiment engine, which has been trained with a wide range of rules. Some design features that were taken into account when analyzing sentiments are:<\/p>\n<ul>\n<li>A comprehensive vocabulary of sentiment words related to the domain concerned<\/li>\n<li>Sentence-level sentiment analysis to capture multiple sentiments across a post\/review<\/li>\n<li>Feature-based sentiment analysis to capture sentiments across multiple features<\/li>\n<li>Negation support to improve sentiment accuracy (e.g., phrases like \u201cnot good\u201d and \u201cno improvement\u201d could be disambiguated by our engine)<\/li>\n<\/ul>\n<h3>Technologies Used:<\/h3>\n<ul>\n<li>Serendio&#8217;s <a href=\"https:\/\/github.com\/serendio-labs\/diskoveror-ta\" rel=\"noopener\">DisKoveror framework<\/a><\/li>\n<li>R programming<\/li>\n<\/ul>\n<h3>Analysis of Surveys<\/h3>\n<h4>1.\u00a0Suggestions vs. Opinions<\/h4>\n<p>The following pie chart shows the distribution of suggestions vs. opinions extracted by our algorithm from the surveys.<\/p>\n<h4>2.\u00a0Suggestions for Improvement<\/h4>\n<p>The pie chart above shows the features on which people had suggestions to give. Some sample suggestions for the most popular features are given below.<\/p>\n<p><b>1. 
Airport Transportation<\/b> remained one of the most negatively perceived features of the airport. Many people had complaints about transportation to and from the airport. Some sample suggestions that we picked up are:<\/p>\n<ul>\n<li><em>Please bring back the city buses that were cheap and convenient<\/em><\/li>\n<li><em>Covered long term car park with adequate security is a must. For regular users it should be made cheaper<\/em><\/li>\n<\/ul>\n<p><strong>2.\u00a0General Airport Amenities<\/strong> \u2013 there was also broad consensus among customers on amenities such as information kiosks and charging areas for laptops, mobile phones, etc. Some sample suggestions for this category are:<\/p>\n<ul>\n<li><em>Need good secure wifi connectivity<\/em><\/li>\n<li><em>Quiet zones without noisy announcements where people can catch up on sleep<\/em><\/li>\n<li><em>More mobile charger points &amp; power points<\/em><\/li>\n<\/ul>\n<p><strong>3.\u00a0Waiting Time<\/strong> \u2013 here are some suggestions regarding the passenger waiting facilities offered in the airport:<\/p>\n<div>\n<ul>\n<li><em>The chairs are terrible, please get slightly high-back chairs so that people sitting and waiting can rest their heads properly\u2026this is sorely lacking at departure gate areas<\/em><\/li>\n<li><em>I would like a few more recliner chairs\u2026bean bags are also a great idea<\/em><\/li>\n<\/ul>\n<h4>3.\u00a0Positive Opinions and their respective features:<\/h4>\n<p>The pie chart above shows the distribution of positive opinions with respect to various features. 
Sample positive expressions for the top positive features include:<\/p>\n<p><strong>1.\u00a0General Amenities:<\/strong><\/p>\n<ul>\n<li><em>It is good that they have a paid lounge with unrestricted access for a reasonable fee<\/em><\/li>\n<li><em>Nothing like landing at 12-midnight and having a strong coffee in the airport<\/em><\/li>\n<\/ul>\n<p><strong>2.\u00a0Waiting Time:<\/strong><\/p>\n<ul>\n<li><em>The waiting area was in general comfortable<\/em><\/li>\n<li><em>Comfortable waiting area and the check-in process was fast<\/em><\/li>\n<\/ul>\n<h4>4.\u00a0Negative Opinions and corresponding features:<\/h4>\n<p>The pie chart above shows the distribution of negative opinions with respect to various features. 
Some sample negative opinions about the airport features are given below:<\/p>\n<p><strong>1.\u00a0Airport Transportation:<\/strong><\/p>\n<ul>\n<li><em>Currently, the vehicles move around and cause a serious problem for the passengers crossing the road<\/em><\/li>\n<li><em>The arrival point has a lot of taxi drivers who always harass the passengers<\/em><\/li>\n<li><em>Remove the nuisance of private taxi owners who harass passengers most<\/em><\/li>\n<\/ul>\n<p><strong>2.\u00a0Security:<\/strong><\/p>\n<ul>\n<li><em>Too much hassle from airport security<\/em><\/li>\n<li><em>Make sure security check is not going to be bottleneck which is currently the bottleneck<\/em><\/li>\n<li><em>The security counter not only looks ugly, it is also not comfortable for the poor security staff<\/em><\/li>\n<\/ul>\n<h4>5.\u00a0Tag Cloud of features used in our suggestion extraction<\/h4>\n<h3>Conclusion<\/h3>\n<p>This article highlighted the text analytics techniques we used in analyzing qualitative surveys. 
These techniques were used in conjunction with customer segmentation techniques such as cohort analysis to give us a more fine-grained understanding of the needs and wants of airport users, and to help introduce new amenities and services that improve the overall travel experience.<\/p>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Analyzing traveler feedback on a newly opened airport in Asia, Ravi gives us a walkthrough of his approach to a specific sentiment analysis problem.<\/p>\n","protected":false},"author":27,"featured_media":4077,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[187],"tags":[122],"ppma_author":[2453],"class_list":["post-1106","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-bigdata-cloud","tag-big-data"],"authors":[{"term_id":2453,"user_id":27,"is_guest":0,"slug":"ravi-condamoor","display_name":"Ravi Condamoor","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g","user_url":"","last_name":"Condamoor","first_name":"Ravi","job_title":"","description":"After working at some of the industry&nbsp;leaders such as IBM and Oracle, Ravi has co-founded multiple 
successful companies in the Big Data &amp; Analytics industry. He has experience in building scalable products in domains including Healthcare, Ad Tech, Media and Industrial Internet."}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1106","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/27"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=1106"}],"version-history":[{"count":3,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1106\/revisions"}],"predecessor-version":[{"id":30052,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1106\/revisions\/30052"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/4077"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=1106"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=1106"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=1106"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=1106"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}