The European Journal of Humour Research

Vol 2, No 3 (2014)

A statistical analysis of satirical Amazon.com product reviews

Stephen Skalicky, Scott Crossley

Abstract

A corpus of 750 Amazon.com product reviews was analyzed for specific lexical, grammatical, and semantic features in order to identify statistical differences between satirical and non-satirical reviews. The corpus contained 375 reviews identified as satirical and 375 identified as non-satirical. Fourteen linguistic indices were used to measure features related to lexical sophistication, grammatical functions, and the semantic properties of words. A one-way multivariate analysis of variance (MANOVA) found a significant difference between the two review types. The MANOVA was followed by a discriminant function analysis (DFA), which used seven variables to correctly classify 71.7 per cent of the reviews as satirical or non-satirical. Those seven variables suggest that, linguistically, satirical texts are more specific, less lexically sophisticated, and contain more words associated with negative emotions and certainty than non-satirical texts. These results demonstrate that satire shares some, but not all, of the previously identified semantic features of sarcasm (Campbell & Katz 2012), supporting Simpson’s (2003) claim that satire should be considered separately from other forms of irony. Ultimately, this study argues that a statistical analysis of the lexical, semantic, and grammatical properties of satirical texts can shed descriptive light on this relatively understudied linguistic phenomenon, while also providing suggestions for future analysis.
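
For readers unfamiliar with discriminant function analysis, the following minimal sketch illustrates the general idea of a two-group DFA: a linear combination of features is fitted so that the two groups are separated as well as possible, and each text is then assigned to the group whose side of a cutoff its score falls on. It is not the authors' procedure or data. The two feature names, the synthetic values, and all function names are assumptions made purely for illustration, and the sketch uses only two features rather than the seven variables reported above.

```python
import random


def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def pooled_covariance(group_a, group_b):
    """Pooled within-group covariance matrix of two groups."""
    dims = len(group_a[0])
    cov = [[0.0] * dims for _ in range(dims)]
    for group in (group_a, group_b):
        mu = mean_vector(group)
        for row in group:
            d = [row[j] - mu[j] for j in range(dims)]
            for i in range(dims):
                for j in range(dims):
                    cov[i][j] += d[i] * d[j]
    dof = len(group_a) + len(group_b) - 2
    return [[v / dof for v in r] for r in cov]


def fit_two_group_dfa(group_a, group_b):
    """Fit a two-feature linear discriminant; return (weights, cutoff).

    Assumes exactly two features and equal prior probabilities, so the
    cutoff is the midpoint between the two projected group means.
    """
    mu_a, mu_b = mean_vector(group_a), mean_vector(group_b)
    (a, b), (c, d) = pooled_covariance(group_a, group_b)
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    diff = [mu_a[0] - mu_b[0], mu_a[1] - mu_b[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    score_a = w[0] * mu_a[0] + w[1] * mu_a[1]
    score_b = w[0] * mu_b[0] + w[1] * mu_b[1]
    return w, (score_a + score_b) / 2


def classify(row, w, cutoff):
    """Return 'a' if the discriminant score exceeds the cutoff, else 'b'."""
    return "a" if w[0] * row[0] + w[1] * row[1] > cutoff else "b"


def accuracy(group_a, group_b, w, cutoff):
    """Proportion of rows assigned to their own group."""
    hits = sum(classify(r, w, cutoff) == "a" for r in group_a)
    hits += sum(classify(r, w, cutoff) == "b" for r in group_b)
    return hits / (len(group_a) + len(group_b))


# Synthetic stand-ins for two standardized indices per review
# (hypothetical: negative-emotion word rate, word frequency).
rng = random.Random(0)
satirical = [[rng.gauss(0.6, 1), rng.gauss(-0.5, 1)] for _ in range(375)]
non_satirical = [[rng.gauss(0.0, 1), rng.gauss(0.0, 1)] for _ in range(375)]

weights, cutoff = fit_two_group_dfa(satirical, non_satirical)
print(f"Classification accuracy: {accuracy(satirical, non_satirical, weights, cutoff):.3f}")
```

The accuracy printed here is computed on the same synthetic reviews used to fit the function, so it says nothing about the study's reported 71.7 per cent. In practice a statistics package would normally be used, together with some form of cross-validation.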

References

Amazon. (2013). Funny Reviews: Dynamic List. http://www.amazon.com/gp/feature.html?ie=UTF8&docId=1001250201 (accessed 18 September 2013).

Attardo, S. (2000). ‘Irony as relevant inappropriateness’. Journal of Pragmatics 32, pp. 793-826.

Burfoot, C. & Baldwin, T. (2009). ‘Automatic satire detection: Are you having a laugh?’, in Proceedings of the Association for Computational Linguistics International Joint Conference on Natural Language Processing 2009 Conference: Short Papers (Singapore, 2-7 August 2009), pp. 161-164.

Brysbaert, M. & New, B. (2009). ‘Moving beyond Kučera and Francis: A critical evaluation of current word frequency norms and the introduction of a new and improved word frequency measure for American English’. Behavior Research Methods 41 (4), pp. 977-990.

Brysbaert, M., Warriner, A. & Kuperman, V. (2013). ‘Concreteness ratings for 40 thousand generally known English word lemmas’. Behavior Research Methods 46 (3), pp. 904-911.

Campbell, J. & Katz, A. (2012). ‘Are there necessary conditions for inducing a sense of sarcastic irony?’. Discourse Processes 49 (6), pp. 459-480.

Carvalho, P., Sarmento, L., Silva, M. & de Oliveira, E. (2009). ‘Clues for detecting irony in user-generated contents: Oh …!! It’s “so easy” ; – )’, in TSA ’09: 1st International CIKM Workshop on Topic-Sentiment Analysis for Mass Opinion (Hong Kong, 6 November 2009), New York: Association for Computing Machinery, pp. 53-56.

Caucci, G. & Kreuz, R. (2012). ‘Social and paralinguistic cues to sarcasm’. Humor: International Journal of Humor Research 25 (1), pp. 1-22.

Colston, H. & Gibbs, R. (2007). ‘A brief history of irony’, in Gibbs, R. & Colston, H. (eds.), Irony in Language and Thought: A Cognitive Science Reader, New York: Lawrence Erlbaum Associates, pp. 3-21.

Coltheart, M. (1981). ‘The MRC psycholinguistic database’. Quarterly Journal of Experimental Psychology 33 (4), pp. 497-505.

Condren, C. (2012). ‘Satire and definition’. Humor: International Journal of Humor Research 25 (4), pp. 375-399.

Crossley, S. A., Salsbury, T., McNamara, D. S. & Jarvis, S. (2010). ‘Predicting lexical proficiency in language learner texts using computational indices’. Language Testing 28 (4), pp. 561-580.

Gibbs, R. (2000). ‘Irony in talk among friends’. Metaphor and Symbol 15 (1-2), pp. 5-27.

González-Ibáñez, R., Muresan, S. & Wacholder, N. (2011). ‘Identifying sarcasm in Twitter: A closer look’, in Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (Portland, Oregon, 19-24 June 2011): Short Papers, Stroudsburg, PA: Association for Computational Linguistics (ACL), pp. 581-586.

Hancock, J. (2004). ‘Verbal irony use in face-to-face and computer-mediated conversations’. Journal of Language and Social Psychology 23 (4), pp. 447-463.

Jorgensen, J. (1996). ‘The functions of sarcastic irony in speech’. Journal of Pragmatics 26 (5), pp. 613-634.

Kreuz, R., Long, D. & Church, M. (1991). ‘On being ironic: Pragmatic and mnemonic implications’. Metaphor and Symbolic Activity 6 (3), pp. 149-162.

Kreuz, R. & Caucci, G. (2007). ‘Lexical influences on the perception of sarcasm’, in FigLanguages ’07: Proceedings of the Workshop on Computational Approaches to Figurative Language, Stroudsburg, PA: Association for Computational Linguistics (ACL), pp. 1-4.

Kreuz, R. & Caucci, G. (2008). ‘Do lexical factors affect the perception of sarcasm?’. Paper presented at the 18th Annual Meeting of the Society for Text and Discourse. University of Memphis, Memphis, TN, 12-15 July.

Kuperman, V., Stadthagen-Gonzalez, H. & Brysbaert, M. (2012). ‘Age-of-acquisition ratings for 30 thousand English words’. Behavior Research Methods 44 (4), pp. 978-990.

Kyle, K. & Crossley, S. A. (2014). ‘Automatically assessing lexical sophistication: Indices, tools, findings, and application’. TESOL Quarterly.

LIWC, Inc. (n.d.). Linguistic inquiry and word count: Table 1: LIWC2007 output variable information. http://www.liwc.net/descriptiontable1.php (accessed 1 November 2013).

Mihalcea, R. & Strapparava, C. (2006). ‘Learning to laugh (automatically): Computational models for humor recognition’. Computational Intelligence 22 (2), pp. 126-142.

Newman, M., Groom, C., Handelman, L. & Pennebaker, J. (2008). ‘Gender differences in language use: An analysis of 14,000 text samples’. Discourse Processes 45, pp. 211-236.

Nilsen, A. & Nilsen, D. (2008). ‘Literature and humor’, in Raskin, V. (ed.), The Primer of Humor Research, New York: Mouton de Gruyter, pp. 243-280.

Pennebaker, J., Booth, R. & Francis, M. (2007). Operator’s Manual: Linguistic Inquiry and Word Count: LIWC2007. Austin, Texas: LIWC.net http://homepage.psy.utexas.edu/HomePage/Faculty/Pennebaker/Reprints/LIWC2007_OperatorManual.pdf (accessed 1 October 2013).

Popova, M. (n.d.). Modern Masterpieces of Comedic Genius: The Art of the Humorous Amazon Review. http://www.brainpickings.org/index.php/2013/07/08/humorous-amazon-reviews/ (accessed 1 September 2013).

Reyes, A. & Rosso, P. (2011). ‘Mining subjective knowledge from customer reviews: A specific case of irony detection’, in Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (Portland, Oregon, 24 June 2011), Stroudsburg, PA: Association for Computational Linguistics (ACL), pp. 118-124.

Simpson, P. (2003). On the Discourse of Satire: Towards a Stylistic Model of Satirical Humor. Amsterdam & Philadelphia: John Benjamins Publishing Company.

Skalicky, S. (2013). ‘Was this analysis helpful?: A genre analysis of the Amazon.com discourse community and its “most helpful” product reviews’. Discourse, Context & Media 2 (2), pp. 84-93.

Tausczik, Y. & Pennebaker, J. (2009). ‘The psychological meaning of words: LIWC and computerized text analysis methods’. Journal of Language and Social Psychology 29 (1), pp. 24-54.

Whalen, J., Pexman, P. & Gill, A. (2009). ‘“Should be fun – Not!”: Incidence and marking of nonliteral language in e-mail’. Journal of Language and Social Psychology 28 (3), pp. 263-280.