Economics Webinar - On the Testability of the Anchor Words Assumption in Topic Models

9:00am - 10:30am
Online via Zoom

Topic models are a simple and popular tool for the statistical analysis of textual data. Their identification and estimation is typically enabled by assuming the existence of \emph{anchor words}; that is, words that are exclusive to specific topics. In this paper we show that the existence of anchor words is statistically testable: there exists a test with correct size that has nontrivial power. This means that, in general, the anchor word assumption cannot be viewed simply as a convenient normalization. At the core of our result lies a simple characterization of when a column-stochastic matrix with known nonnegative rank admits a \emph{separable} factorization. We use a simulation study to analyze the power of a bootstrapped version of our suggested procedure and to discuss its computational limitations.

Event Format
Speakers / Performers:
Prof. Jose Luis Montiel Olea
Cornell University

Recommended For
Faculty and staff
PG students
Department of Economics

Julie Wong by email:

Post an event
Campus organizations are invited to add their events to the calendar.