Document Type

Discussion Paper

Publication Date

5-1-2019

CFDP Number

2176

CFDP Pages

55

Journal of Economic Literature (JEL) Code(s)

M1, M3, C8, C5

Abstract

The authors address two novel and significant challenges in using online text reviews to obtain attribute-level ratings. First, they introduce the problem of inferring attribute-level sentiment from text data to the marketing literature and develop a deep learning model to address it. While extant bag-of-words topic models are fairly good at attribute discovery based on the frequency of word or phrase occurrences, associating sentiments with attributes requires exploiting the spatial and sequential structure of language. Second, they illustrate how to correct for attribute self-selection (reviewers choose the subset of attributes to write about) in metrics of attribute-level restaurant performance. Using Yelp.com reviews for empirical illustration, they find that a hybrid deep learning (CNN-LSTM) model, in which the CNN and LSTM exploit the spatial and sequential structure of language, respectively, provides the best performance in accuracy, training speed, and training data size requirements. The model does particularly well on the “hard” sentiment classification problems. Further, accounting for attribute self-selection significantly impacts sentiment scores, especially on attributes that are frequently missing.

Included in

Economics Commons
