Suggestion and Opinion Mining from Qualitative Surveys

Ravi Condamoor
February 15, 2019 Big Data, Cloud & DevOps

Problem Definition and Motivation

For a long time, open-ended qualitative surveys were conducted within small focus groups, and the responses were typically coded and interpreted by hand. Today, with the explosion of social media conversations and online reviews, the notion of an "Always On" focus group is real, and brands have started using automated tools based on machine learning and NLP to understand customer sentiments, opinions, suggestions, and needs from the continuous stream of conversations.

Recently, we were involved in analyzing traveler feedback on a newly opened airport in Asia. The airport had been operational for about 8 months, generating close to 300,000 qualitative comments in social media and email surveys. The dataset contained suggestions and opinions on various aspects of the airport, ranging from boarding procedures to airport amenities. Our task was to understand what aspects of the airport the travelers were most satisfied or dissatisfied with, and which ones needed improvement. The results of our analysis were then fed into an action plan to help improve the airport’s operations.

Our Approach

Essentially, we extracted a combination of Sentiments, Intent, Interest, and Suggestions from the surveys and grouped them across various airport specific attributes to obtain a deeper understanding of the issues.

Some issues that we needed to consider while approaching this problem were:

  1. Customers who provide suggestions are generally unhappy about the aspect being discussed. For example, in the sentence "I hope the security is good next time I come here," the customer is actually suggesting an improvement in security, which indicates dissatisfaction. Sentiment extraction that relies on keyword-based features such as positive and negative adjectives would not identify suggestions properly and would probably mistake suggestion sentences for positive expressions.
  2. It is also essential to figure out the airport feature that the customer is talking about and to accordingly identify sentences expressing opinion about a particular aspect of the subject being discussed.

In our case, opinionated statements could be among the following types:

  1. Positive Expressions – sentences talking about a feature in a positive context
  2. Negative Expressions – sentences talking about a feature in a negative context
  3. Suggestive Expressions – sentences expressing a desire for improvement / wishful expressions

In our approach, we therefore identified the three types of expressions that customers used to express opinions about the airport, and we assumed that both negative and suggestive expressions indicate a negative opinion. Another challenge was to identify the aspects about which customers were expressing opinions. The task objective was to capture opinions about a specific set of features (food and beverage, shopping, restrooms, etc.). Manually building a terminology for these aspects of the airport is time-consuming, and the resulting terminology may not be complete. We therefore used a topic modeling-based approach to build our terminology (discussed under Features Extraction below).

With this terminology in hand, we identified suggestions and sentiment-expressing sentences for each feature. The overall pipeline had the following steps:

1. Features Extraction

We used a topic modeling algorithm to extract the themes running through the reviews. Once the Gibbs sampling iterations converged, we obtained the topics associated with each review. A topic is a multinomial distribution over words. For example, a topic whose top words are (airport, restroom, wait, queue) signifies that the theme of a review is related to “waiting for the restroom.” From the topics obtained, we manually created a terminology of features. For an in-depth understanding of how topic models work, please refer to the excellent paper Latent Dirichlet Allocation by David Blei, Andrew Ng, and Michael Jordan.
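To make the Gibbs sampling step concrete, here is a minimal sketch of collapsed Gibbs sampling for LDA in plain Python. It is an illustration only, not the implementation used in the project: the function names, the hyperparameters `alpha` and `beta`, the iteration count, and the toy reviews are all assumptions.

```python
import random
from collections import Counter

def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized documents (sketch)."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    # Count tables: per-document topic counts, per-topic word counts, topic totals.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # p(z = k | everything else) is proportional to
                # (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta)
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word

def top_words(topic_word, n=4):
    """Most frequent words per topic, skipping words whose count fell to zero."""
    return [[w for w, c in tw.most_common(n) if c > 0] for tw in topic_word]

# Hypothetical, already-tokenized reviews (not taken from the survey data).
reviews = [
    "airport restroom wait queue restroom".split(),
    "queue wait restroom long".split(),
    "taxi bus transport airport taxi".split(),
    "bus transport cheap taxi".split(),
]
doc_topic, topic_word = lda_gibbs(reviews, num_topics=2)
print(top_words(topic_word))
```

The top words printed for each topic are what an analyst would read through when building the feature terminology by hand. On a real corpus, a maintained topic modeling library is a safer choice than a hand-rolled sampler.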

2. Suggestion Extraction

Suggestion extraction is a complex task in itself, and previous research on it is not conclusive. For extracting suggestions, we used some of the approaches followed by Ramanand et al. in "Wishful Thinking: Finding suggestions and ‘buy’ wishes from product reviews." They describe rule-based approaches to extracting suggestive text from corpora. By manually browsing through random reviews, we observed that modal verbs are an important marker of suggestive sentences. For example, modal verbs such as "would" and "should" need to be considered when building rules for suggestion classification. Some of the rules we used when designing our suggestion extractor are given below:

1. <modal_verb><auxiliary_verb><optional_window_size_of_3><positive_sentiment_words>

Examples matching this rule are “would be great” and “could be really good”.

2. <modal_verb><optional_window_size_of_3><preference_verb>

Examples matching this rule are “would like” and “will really love”.

We also added some general phrase rules identified through manual inspection, such as “should come with” and “really needs to”.

Our suggestion extractor, built on these rules, was used to extract suggestions from the reviews. The feature terminology was then used to identify all the suggestions for a particular feature of the airport.
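Rules of this kind can be approximated with regular expressions. The sketch below is a hypothetical illustration rather than the actual extractor: the word lists, the `is_suggestion` and `suggestions_by_feature` helpers, and the small terminology are assumptions chosen to cover the example phrases quoted above.

```python
import re

# Small illustrative word lists; a real extractor would use much larger ones.
MODAL = r"(?:would|could|should|will|might|must)"
AUXILIARY = r"(?:be|have been)"
PREFERENCE = r"(?:like|love|prefer|appreciate|want)"
POSITIVE = r"(?:great|good|nice|better|helpful)"
WINDOW = r"(?:\w+\s+){0,3}"  # optional window of up to three intervening words

SUGGESTION_PATTERNS = [
    # Matches phrases such as "would be great" and "could be really good".
    re.compile(rf"\b{MODAL}\s+{AUXILIARY}\s+{WINDOW}{POSITIVE}\b", re.I),
    # Matches phrases such as "would like" and "will really love".
    re.compile(rf"\b{MODAL}\s+{WINDOW}{PREFERENCE}\b", re.I),
    # General phrases collected by manual inspection.
    re.compile(r"\b(?:should come with|really needs to)\b", re.I),
]

def is_suggestion(sentence):
    """True if any suggestion rule matches the sentence."""
    return any(p.search(sentence) for p in SUGGESTION_PATTERNS)

def suggestions_by_feature(sentences, terminology):
    """Group suggestion sentences under each feature whose terms they mention."""
    grouped = {feature: [] for feature in terminology}
    for sentence in sentences:
        if not is_suggestion(sentence):
            continue
        tokens = set(re.findall(r"[a-z']+", sentence.lower()))
        for feature, terms in terminology.items():
            if terms & tokens:
                grouped[feature].append(sentence)
    return grouped

# Hypothetical terminology and sentences for demonstration.
terminology = {
    "waiting": {"chairs", "recliner", "waiting"},
    "amenities": {"wifi", "charging", "lounge"},
}
sentences = [
    "I would like a few more recliner chairs",
    "It would be really great to have more charging points",
    "The waiting area was in general comfortable",
]
grouped = suggestions_by_feature(sentences, terminology)
print(grouped)
```

The third sentence mentions a feature but matches no rule, so it is left for the sentiment step rather than counted as a suggestion.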

3. Sentiment Extraction

After filtering out suggestive snippets, we used our feature terminology to extract expressions conveying a positive or negative opinion from the review corpus. For sentiment extraction, we used our in-house sentiment engine, which is built on a wide range of rules. Some design features that were taken into account when analyzing sentiments are:

  • A comprehensive vocabulary of sentiment words related to our concerned domain
  • Sentence-level sentiment analysis to capture multiple sentiments across a post/review
  • Feature-based sentiment analysis to capture sentiments across multiple features
  • Negation support to improve sentiment accuracy (e.g., phrases like “not good”, “no improvement” could be disambiguated by our engine)
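The design features listed above can be illustrated with a toy lexicon-based scorer. This is a sketch under stated assumptions and not the in-house engine: the lexicons, the two-token negation window, the `FEATURES` terminology, and the function names are all hypothetical.

```python
import re

# Tiny illustrative lexicons; a production engine would use a domain vocabulary.
POSITIVE = {"good", "comfortable", "fast", "clean", "improvement"}
NEGATIVE = {"terrible", "hassle", "ugly", "slow", "dirty"}
NEGATORS = {"not", "no", "never"}
FEATURES = {
    "security": {"security"},
    "waiting": {"waiting", "chairs", "queue"},
}

def sentence_polarity(tokens):
    """Sum word polarities, flipping a word preceded by a negator within two tokens."""
    score = 0
    for i, tok in enumerate(tokens):
        polarity = (tok in POSITIVE) - (tok in NEGATIVE)
        if polarity and any(t in NEGATORS for t in tokens[max(0, i - 2):i]):
            polarity = -polarity
        score += polarity
    return score

def feature_sentiments(review):
    """Return (feature, label) pairs, scoring each sentence of a review separately."""
    results = []
    for sentence in re.split(r"[.!?]+", review):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        score = sentence_polarity(tokens)
        if score == 0:
            continue  # neutral or empty sentence
        label = "positive" if score > 0 else "negative"
        for feature, terms in FEATURES.items():
            if terms & set(tokens):
                results.append((feature, label))
    return results

print(feature_sentiments("The waiting area was comfortable. Security was not good."))
```

Scoring sentence by sentence lets a single review contribute a positive opinion on one feature and a negative opinion on another, which a review-level score would blur together.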

Technologies Used:

  • Serendio’s DisKoveror framework
  • R programming

Analysis of Surveys

1. Suggestions vs. Opinions

The following pie chart expresses the distribution of suggestions vs. opinions extracted by our algorithm from the surveys.

2. Suggestions for Improvement

The pie chart above shows the features on which people had suggestions to give. Some sample suggestions for the popular features we have extracted are given below.

1. Airport Transportation remained one of the most negative features about the airport. Many people had a lot of complaints regarding the transportation to and from the airport. Some sample suggestions that we picked up are:

  • Please bring back the city buses that were cheap and convenient?
  • Covered long term car park with adequate security is a must. For regular users it should be made cheaper

2. General Airport Amenities – there was also a lot of general consensus among customers relating to information kiosks, charging areas for laptops, mobile phones, etc. Some sample suggestions for this category are:

  • Need good secure wifi connectivity 
  • Quiet zones without noisy announcements where people can catch up on sleep
  • More mobile charger points & power points 

3. Waiting Time – here are some suggestions regarding passenger waiting facilities offered in the airport:

  • The chairs are terrible, please get slightly high-back chairs so that people sitting and waiting can rest their heads properly…this is sorely lacking at departure gate areas
  • I would like a few more recliner chairs…bean bags are also a great idea

3. Positive Opinions and their respective features:

The pie chart above shows the distribution of positive opinions with respect to various features. Sample positive expressions for the top positive features include:

1. General Amenities:

  • It is good that they have a paid lounge with unrestricted access for a reasonable fee
  • Nothing like landing at 12-midnight and having a strong coffee in the airport

2. Waiting Time:

  • The waiting area was in general comfortable
  • Comfortable waiting area and the check-in process was fast

4. Negative Opinions and corresponding features:

The pie chart above shows the distribution of negative opinions with respect to various features. Some sample negative opinions about the airport features are given below:

1. Airport Transportation:

  • Currently, the vehicles move around and cause a serious problem for the passengers crossing the road
  • The arrival point has a lot of taxi drivers who always harass the passangers
  • Remove the nuisance of private taxi owners who harass passangers most

2. Security:

  • Too much hassle from airport security
  • Make sure security check in not going to be bottleneck which is currently the bottleneck
  • The security counter not only looks ugly, it is also not comfortable for the poort security staff

5. Tag Cloud of features used in our suggestion extraction

Conclusion

This article highlighted the text analytics techniques we used to analyze qualitative surveys. We used them in conjunction with customer segmentation techniques such as cohort analysis to gain a more fine-grained understanding of the needs and wants of airport users, and to inform the introduction of new amenities and services that improve the overall travel experience.
