Sentiment analysis helps companies understand their customers by categorising consumers’ attitudes and feelings towards an organisation, product or brand as positive, negative or neutral, based on the language they use.
Language is a powerful tool that consumers use to make their voices heard, whether on social media, review and comparison sites, blogs or business sites, and sentiment analysis aims to shed light on the latest consumer opinions and habits. Companies can then use this data to better understand and communicate with their customers, highlighting the positives by tailoring PR and marketing strategy while identifying issues before they become greater concerns.
As for the best way of analysing the data, there is little agreement: some favour computer software (algorithms that assign positive or negative sentiment to individual words), while others opt for more traditional human analysis. But which approach ultimately leads to the most accurate and efficient results?
An important trade-off when deciding which approach to adopt is the insight human interpretation provides versus the time it costs compared with software. While computers can process vast amounts of information almost instantaneously, humans are often needed to interpret it.
Software, for example, may classify a negated statement such as “the food wasn’t great” as positive, simply because “great” carries positive connotations. Computers also struggle with slang (emerging terms that only humans are familiar with) and with statements that are both positive and negative, such as “the food was nice but I had to wait a while before they served us.” How would software categorise this sentiment? It’s hardly neutral.
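To make these failure modes concrete, here is a minimal sketch of the word-level approach described above. The tiny lexicon and the scoring rule are illustrative assumptions, not any particular product’s method:

```python
# A minimal sketch of word-level sentiment scoring. The lexicon below is
# an illustrative assumption; real tools use far larger, weighted lexicons.
LEXICON = {"great": 1, "nice": 1, "good": 1, "wait": -1, "slow": -1}

def naive_score(text: str) -> str:
    """Sum per-word polarities and map the total to a label."""
    score = sum(LEXICON.get(word.strip(".,!?").lower(), 0)
                for word in text.split())
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

# Negation is invisible to a per-word score: "great" still counts as +1.
print(naive_score("the food wasn't great"))  # -> positive (wrong)

# Mixed statements cancel out to a misleading "neutral".
print(naive_score("the food was nice but I had to wait a while"))  # -> neutral
```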
Real problems arise if negative statements are wrongly interpreted as positive, meaning customers’ negative feedback fails to reach customer care teams and problems remain unresolved. That said, there’s no denying that human analysis is slow, requiring considerable effort to recruit analysts and to work manually through the mass of data.
Another issue that underlies the accuracy rates of each approach is its ability to interpret both literal and intended meanings within language. The suggested accuracy rates of software vary widely, from as low as 30% to as high as 80%, which may be explained by the nature of the data being analysed.
Software appears able to detect literal meanings with a high degree of accuracy, generating sentiment figures at speed. However, its ability to understand metaphorical and intended meanings is more questionable. Meaning is generated by far more than the words themselves; intonation and context play their part too. Software’s struggle to detect intended meanings such as irony and sarcasm, where we deliberately say the opposite of what we mean for effect, may prove human analysis to be the more accurate method.
“I was stuck in a queue for an hour. Great!” While software may wrongly tag this comment as positive, humans can understand from personal experience the negative emotion expressed and empathise more with the customer.
Humans are also better at identifying the subtleties of language, which software struggles to match. A customer could describe a delivery service as “good”, which, although positive, may carry negative undertones: a genuinely happy customer might be more likely to use an intensifier (“very good”) or a stronger adjective such as “brilliant.”
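One way a tool can approximate this subtlety is to weight adjectives by the intensifiers that precede them. The sketch below does this with assumed, illustrative weights; the cues it captures are exactly those a human reads instinctively:

```python
# A sketch of intensifier-aware scoring. Weights are illustrative assumptions.
BASE = {"good": 1.0, "brilliant": 2.0}
INTENSIFIERS = {"very": 1.5, "really": 1.5}

def score_phrase(words: list[str]) -> float:
    """Multiply an adjective's polarity by any intensifier just before it."""
    total = 0.0
    multiplier = 1.0
    for word in words:
        if word in INTENSIFIERS:
            multiplier = INTENSIFIERS[word]
        elif word in BASE:
            total += BASE[word] * multiplier
            multiplier = 1.0
    return total

print(score_phrase(["good"]))          # 1.0 - faint praise
print(score_phrase(["very", "good"]))  # 1.5 - stronger signal
print(score_phrase(["brilliant"]))     # 2.0 - strongest
```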
Computers struggle to keep up with the idiosyncrasies of language and the subtle intended meanings of a writer. Context matters too: some words lend themselves to different sentiments depending on the situation. “Long”, for example, may be used positively in smartphone reviews describing battery life but carries more negative connotations in retail, such as the length of queues. Human analysts are more likely to weigh these contextual factors, while automated sentiment analysis benefits from tools trained to be domain-specific (e.g. retail- or restaurant-focused), which can take time to develop.
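A domain-specific tool is, at its simplest, a separate polarity lexicon per domain. The sketch below shows the idea using the “long” example above; the domains and weights are assumptions for illustration:

```python
# A sketch of domain-specific lexicons: the same word can flip polarity
# depending on context. Domains and weights are illustrative assumptions.
DOMAIN_LEXICONS = {
    "smartphones": {"long": 1},   # "long battery life" reads as praise
    "retail":      {"long": -1},  # "long queue" reads as a complaint
}

def score(text: str, domain: str) -> int:
    lexicon = DOMAIN_LEXICONS[domain]
    return sum(lexicon.get(w.strip(".,!?").lower(), 0) for w in text.split())

print(score("Long battery life", "smartphones"))  # -> +1
print(score("Long wait at the till", "retail"))   # -> -1
```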
At first glance, software may appear the cheaper route, given the high cost of employing a large number of human analysts over a long period. Then again, software isn’t necessarily cheap to run: a tool that is cheap to buy may also be low in accuracy.
Therein lies the decision: opt for cheaper (but inaccurate) software, or develop tools that generate more accurate, domain- and time-specific results, which becomes expensive once you hire the software engineers to build them. Either way, if high accuracy is what you’re aiming for, it’s going to cost.
Both approaches have their advantages and disadvantages: software is arguably the more efficient, while human analysis is ultimately the more accurate. Sentiment analysis software ought to complement human analysis, not replace it entirely. Software has the edge in speed and efficiency, but its output benefits from being checked by human analysts to ensure all aspects of language are carefully interpreted.
If using software alone, tools should be time- and domain-specific, enabling the computer to keep up with emerging terms and to understand the impact of context on meaning. If opting for human analysis alone, reduce the data sample to focus only on the most important and influential sources, limiting time and effort where possible.