Contextual Predictability of Texts for Texts Processing and Understanding
EasyChair Preprint 2398
16 pages • Date: January 17, 2020

Abstract
This paper is the first part of an investigation of contextual predictability models for Russian. It focuses on the linguistic and psychological interpretation of models, features, metrics, and feature sets. The aim is to identify how the implementation of contextual predictability procedures depends on the genre characteristics of the text (or text collection): scientific vs. fictional. We construct a model that predicts text elements and specify its features for texts of different genres and domains. We analyze various methods for studying contextual predictability, carry out a computational experiment on scientific and fictional texts, and verify its results with an experiment with informants (cloze tests) and with word embeddings (the word2vec CBOW model). As a result, a text processing model is built. It is evaluated using the selected contextual predictability features and the experiments with informants.

Keyphrases: Contextual Predictability, Dice, Fiction texts, Informational Entropy, cloze test, computational linguistic, conditional probability, language model, scientific corpora, surprisal
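
For orientation, the metrics named in the keyphrases (conditional probability, surprisal, informational entropy, Dice) can be estimated from simple bigram counts. The sketch below is an illustration under stated assumptions, not the paper's implementation: the function names, the maximum-likelihood bigram estimate, and the toy English sentence are all hypothetical choices made here for brevity.

```python
import math
from collections import Counter


def bigram_counts(tokens):
    """Count unigrams, left contexts, and adjacent word pairs."""
    unigrams = Counter(tokens)
    contexts = Counter(tokens[:-1])          # words that have a successor
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, contexts, bigrams


def conditional_probability(prev, word, contexts, bigrams):
    """Maximum-likelihood estimate of P(word | prev); 0.0 for unseen contexts."""
    if contexts[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / contexts[prev]


def surprisal(p):
    """Surprisal in bits: -log2 P(word | context). Infinite for p == 0."""
    return math.inf if p == 0 else -math.log2(p)


def next_word_entropy(prev, contexts, bigrams):
    """Entropy (bits) of the next-word distribution after `prev`."""
    probs = [c / contexts[prev] for (a, _), c in bigrams.items() if a == prev]
    return -sum(p * math.log2(p) for p in probs)


def dice(prev, word, unigrams, bigrams):
    """Dice coefficient for the pair: 2 * f(prev, word) / (f(prev) + f(word))."""
    return 2 * bigrams[(prev, word)] / (unigrams[prev] + unigrams[word])


# Toy data (assumption): a real study would use a tokenized corpus.
tokens = "the cat sat on the mat".split()
unigrams, contexts, bigrams = bigram_counts(tokens)
p = conditional_probability("the", "cat", contexts, bigrams)
print(p, surprisal(p), next_word_entropy("the", contexts, bigrams),
      dice("the", "cat", unigrams, bigrams))
```

A word that is highly predictable from its left context has a conditional probability near 1 and a surprisal near 0 bits. In a cloze test, the analogous quantity is the share of informants who supply that word.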