Linguistic, Behavioral and Temporal Signals Provide Clues
Thinking about trying that new restaurant that opened just down the street? Eye-balling that new phone that costs $1,000?
From picking a restaurant to buying the latest gadget to choosing a new doctor, people make a large number of their decisions based on reading reviews. Sometimes these reviews are an accurate reflection of what you’ll get. Other times, these reviews fall short.
Consumer Economy Driven by Opinion
“The consumer economy is driven by opinions,” said Arjun Mukherjee, assistant professor of computer science at the University of Houston, whose research is focused on detecting deceptive opinion spam, or fake reviews. “Veracity of opinions is of paramount importance.”
Spotting deceptive opinions, in the absence of contextual information about the reviewer’s background, can be quite tricky, a point that Mukherjee makes to his students every semester. One experiment that he often repeats is showing a group of students a pair of reviews, one that is real, one that is fake.
When asked which one is fake, their accuracy inevitably hovers around 50 percent, little better than sheer guesswork. Even after years of working in the field, Mukherjee's success rate at spotting an individual fake review isn't much better.
“Just by looking at a review, it is very hard to tell which one is a fake,” Mukherjee said.
Filtering Fake Reviews Dependent on Contextual Clues
Mukherjee’s research group constructs models that use contextual signals to spot fake reviews.
“There are a lot of signals you have to take into account,” Mukherjee said.
These patterns are a little bit like a poker player’s tells: subtle signals that can help differentiate truth from deception but are never fully foolproof. With spotting fake reviews, this comes down to analyzing linguistic, behavioral and temporal signals.
Use of Language and Posting Patterns to Spot Deception
One signal is temporal. If one reviewer posts a whole bunch of reviews all at once, or a group of people reviews the same products in a short time frame, that's a sign of deceptive opinions. A related signal, known as 'buffering,' is an influx of positive reviews when the overall approval rate of a product is dropping, a strategy often used by review spammers to maintain a product's rating.
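A minimal sketch of the burst idea, using only the standard library. The function name, window size, and threshold are all illustrative choices, not details of Mukherjee's models:

```python
from datetime import datetime, timedelta

def burst_score(timestamps, window=timedelta(days=1), threshold=5):
    """Flag a bursty posting pattern: return True if any sliding window
    of length `window` contains at least `threshold` reviews."""
    times = sorted(timestamps)
    for i, start in enumerate(times):
        # Count reviews posted within `window` of this one.
        count = sum(1 for t in times[i:] if t - start <= window)
        if count >= threshold:
            return True
    return False

base = datetime(2020, 1, 1)
bursty = [base + timedelta(hours=h) for h in range(6)]      # six reviews in six hours
spread = [base + timedelta(days=30 * d) for d in range(6)]  # one review a month
print(burst_score(bursty))  # → True
print(burst_score(spread))  # → False
```

A real system would likely compare such counts against a baseline rate for the product rather than a fixed threshold.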
Another signal is behavioral. If a review deviates from the norm, giving positive feedback when the majority is negative, or if an author reuses duplicated content or only gives extreme ratings, these are also indications of potentially deceptive opinions.
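The deviation-from-the-norm check can be sketched in a few lines; the two-star threshold here is an arbitrary placeholder, not a value from the research:

```python
from statistics import mean

def deviates_from_norm(rating, other_ratings, threshold=2.0):
    """Flag a rating that sits far from the product's average --
    e.g. a 5-star review on a product averaging under 2 stars."""
    return abs(rating - mean(other_ratings)) >= threshold

print(deviates_from_norm(5, [1, 2, 2, 1, 2]))  # → True
print(deviates_from_norm(2, [1, 2, 2, 1, 2]))  # → False
```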
Another clue is language. If a review’s descriptions are generic, without offering many specific details, that’s yet another sign of deception.
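One crude proxy for genericness is the share of words drawn from a stock of vague superlatives. The word list and scoring below are purely illustrative; published approaches use far richer linguistic features:

```python
GENERIC_WORDS = {"great", "amazing", "best", "awesome", "nice",
                 "good", "wonderful", "perfect", "excellent", "love"}

def genericness(review: str) -> float:
    """Fraction of words that are generic superlatives, as a rough
    stand-in for a lack of specific detail. Higher is more suspicious."""
    words = [w.strip(".,!?").lower() for w in review.split()]
    if not words:
        return 0.0
    return sum(w in GENERIC_WORDS for w in words) / len(words)

vague = "Amazing place, great food, best service, love it!"
detailed = "The brisket was smoked for twelve hours and served with pickled onions."
print(genericness(vague))     # → 0.5
print(genericness(detailed))  # → 0.0
```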
With Mukherjee’s models, all of these signals are taken into account to spot patterns that suggest deceptive opinions.
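In the simplest possible form, "holistic" means combining the temporal, behavioral, and linguistic signals into one score. The weights and field names below are hypothetical placeholders; the actual models learn such combinations from data:

```python
def suspicion_score(review, weights=None):
    """Weighted combination of three boolean signals into a single
    suspicion score in [0, 1]. Weights are arbitrary, not learned."""
    weights = weights or {"bursty": 0.4, "deviates": 0.3, "generic": 0.3}
    signals = {
        "bursty":   review["posted_in_burst"],        # temporal
        "deviates": review["rating_far_from_mean"],   # behavioral
        "generic":  review["lacks_specific_detail"],  # linguistic
    }
    return sum(weights[k] * float(v) for k, v in signals.items())

review = {"posted_in_burst": True,
          "rating_far_from_mean": True,
          "lacks_specific_detail": False}
print(round(suspicion_score(review), 2))  # → 0.7
```

The appeal of a combined score is exactly what the next section cautions about: any single signal can misfire, so no one signal is treated as decisive on its own.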
“It’s a holistic model,” Mukherjee said.
Suspicious Patterns Do Not Always Indicate Deception
But, as with any problem this complex, there are no guarantees. Although the analysis can be sophisticated enough to filter out fake reviews that, at first glance, seem legitimate, there will always be genuine reviews that get flagged as fake, and there will always be fake reviews that escape detection.
“All suspicious patterns may not indicate deception,” Mukherjee said. “After all, we are individuals, we have different personalities. How one person evaluates a particular entity might look fishy or suspicious to me, but might not look that way to you.”
This research is funded by the National Science Foundation. Results from this research have been presented at the Association for the Advancement of Artificial Intelligence Conference on Web and Social Media, the Association for Computing Machinery World Wide Web Conference, the Institute of Electrical and Electronics Engineers’ International Conference on Data Mining, as well as the International Conference on Intelligent Text Processing and Computational Linguistics.
- Rachel Fairbank, College of Natural Sciences and Mathematics