Time-dependent Poisson reduced rank models for political text data analysis

Bibliographic details
Main authors: Jentsch, Carsten (author), Lee, Eun Ryung (author), Mammen, Enno (author)
Document type: Article (Journal)
Language: English
Published: 2020
In: Computational statistics & data analysis
Year: 2020, Volume: 142
DOI: 10.1016/j.csda.2019.106813
Online access: Publisher, full text: https://doi.org/10.1016/j.csda.2019.106813
Publisher, full text: http://www.sciencedirect.com/science/article/pii/S0167947319301586
Statement of responsibility: Carsten Jentsch, Eun Ryung Lee, Enno Mammen
Description
Summary: We consider Poisson reduced rank models where parameters depend on time. Our main motivation comes from studies in comparative politics where one wants to locate party positions in a certain political space. For this purpose, several empirical methods have been proposed using text as data sources. As the data structure of texts is quite complex, its analysis to extract information is generally a difficult task. Furthermore, political texts usually contain a large number of words such that a simultaneous analysis of word counts becomes challenging. In this paper, we consider Poisson models for each word count simultaneously and provide a statistical method suitable to analyze political text data. We consider a novel model which allows the political lexicon to change over time and develop an estimation procedure based on LASSO and fused LASSO penalization techniques to address high-dimensionality via significant dimension reduction. This model gives insights into the potentially changing use of words by left- and right-wing parties over time. The procedure makes it possible to identify automatically those words having a discriminating effect between party positions. To address the dependence structure of word counts over time, we propose integer-valued time series processes to implement a suitable bootstrap method for constructing confidence intervals for the model parameters. We apply our approach to party manifesto data from German parties over seven federal elections after German reunification. The approach requires neither a priori information nor expert knowledge to process the data.
Description: Available online 22 July 2019
Viewed on 28 November 2019
Description: Online Resource
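
Illustrative sketch: The estimation idea described in the summary (Poisson models for all word counts at once, with LASSO and fused LASSO penalties) can be illustrated with a short example. The parameterization used here, log rate = alpha + psi + beta * omega in the style of Wordfish-type scaling models, is an assumption made for illustration, since the summary does not state the model equation. The function name, the array shapes, and the tuning parameters lam_lasso and lam_fused are likewise hypothetical. The sketch only evaluates a penalized objective; it does not reproduce the authors' estimation or bootstrap procedure.

```python
import numpy as np


def penalized_neg_loglik(counts, alpha, psi, beta, omega, lam_lasso, lam_fused):
    """Penalized Poisson negative log-likelihood (illustrative, assumed model).

    Assumed shapes, with T time points, I parties, J words:
      counts: (T, I, J) word counts
      alpha:  (T, I) party-by-time fixed effects
      psi:    (T, J) word-by-time fixed effects
      beta:   (T, J) word weights (discriminating effect of each word)
      omega:  (T, I) party positions
    """
    counts = np.asarray(counts, dtype=float)
    alpha, psi = np.asarray(alpha, dtype=float), np.asarray(psi, dtype=float)
    beta, omega = np.asarray(beta, dtype=float), np.asarray(omega, dtype=float)

    # Assumed log rate: alpha_{it} + psi_{jt} + beta_{jt} * omega_{it}
    log_rate = (alpha[:, :, None] + psi[:, None, :]
                + omega[:, :, None] * beta[:, None, :])

    # Poisson negative log-likelihood, dropping the constant log(y!) term
    nll = np.sum(np.exp(log_rate) - counts * log_rate)

    # LASSO penalty: shrinks word weights toward zero (word selection)
    lasso = lam_lasso * np.sum(np.abs(beta))

    # Fused LASSO penalty: shrinks changes in word weights between
    # consecutive time points (keeps the lexicon stable unless data disagree)
    fused = lam_fused * np.sum(np.abs(np.diff(beta, axis=0)))

    return float(nll + lasso + fused)


if __name__ == "__main__":
    # Simulated toy data; T = 7 mirrors the seven elections in the summary,
    # while I = 5 parties and J = 50 words are arbitrary choices.
    rng = np.random.default_rng(0)
    T, I, J = 7, 5, 50
    alpha = rng.normal(scale=0.1, size=(T, I))
    psi = rng.normal(scale=0.1, size=(T, J))
    beta = rng.normal(scale=0.5, size=(T, J))
    omega = rng.normal(scale=1.0, size=(T, I))
    log_rate = (alpha[:, :, None] + psi[:, None, :]
                + omega[:, :, None] * beta[:, None, :])
    counts = rng.poisson(np.exp(log_rate))
    print(penalized_neg_loglik(counts, alpha, psi, beta, omega, 1.0, 1.0))
```

In an actual fit, an objective of this kind would be minimized over the parameters subject to identification constraints, with the penalty weights chosen by a data-driven criterion; those steps are omitted here.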