Economics Webinar - On the Testability of the Anchor Words Assumption in Topic Models

9:00am - 10:30am
Online via Zoom

Topic models are a simple and popular tool for the statistical analysis of textual data. Their identification and estimation are typically enabled by assuming the existence of anchor words, that is, words that are exclusive to specific topics. In this paper we show that the existence of anchor words is statistically testable: there exists a test with correct size that has nontrivial power. This means that, in general, the anchor word assumption cannot be viewed simply as a convenient normalization. At the core of our result lies a simple characterization of when a column-stochastic matrix with known nonnegative rank admits a separable factorization. We use a simulation study to analyze the power of a bootstrapped version of our suggested procedure and to discuss its computational limitations.

Speaker/Performer:
Prof. Jose Luis Montiel Olea
Cornell University

http://www.joseluismontielolea.com/

Language
English
Recommended for
Alumni
Faculty and staff
Postgraduate students
Organizer
Department of Economics
Contact

Julie Wong by email: ecseminar@ust.hk
