Institut für Dokumentologie und Editorik

Genre Analysis and Corpus Design: Nineteenth-Century Spanish-American Novels (1830–1910)

 

Index of Tables

Table 1: Generic logics according to Schaeffer.

Table 2: Types of summarizing subgenre labels.

Table 3: Top most frequent explicit subgenre labels in the bibliography.

Table 4: Top most frequent subgenres in the bibliography.

Table 5: Set of subgenres used as a basis for the interpretation of implicit signals.

Table 6: Top most frequent thematic subgenre labels in the bibliography.

Table 7: Frequencies of subgenre labels related to literary currents in the bibliography.

Table 8: Set of subgenres used as a basis for the interpretation of literary-historical subgenre labels.

Table 9: Literary-historical sources for the assignment of subgenres.

Table 10: Set of subgenres occurring explicitly or implicitly in the bibliography.

Table 11: Steps for the preparation of structured full text.

Table 12: Error words mapped with general lists of proper nouns.

Table 13: Regular expressions for verb forms with pronoun suffixes.

Table 14: Error words mapped with word patterns.

Table 15: Error words mapped with manually edited exception lists.

Table 16: Values for the time period covered by a novel.

Table 17: Encoding of textual phenomena in the main body of the novels.

Table 18: Encoding of types of written texts represented in the novels.

Table 19: Additional keyword terms for subgenre signals in the text corpus.

Table 20: Elements of the corpus published on GitHub.

Table 21: Authors with most novels in Bib-ACMé and Conha19.

Table 22: Authors with most editions in Bib-ACMé and Conha19.

Table 23: Ranks of discursive levels of subgenre labels, explicit versus literary-historical (Bib-ACMé).

Table 24: Top combinations of thematic subgenre labels.

Table 25: Top combinations of subgenre labels related to the mode of representation.

Table 26: Parameters for general feature sets.

Table 27: Definition and examples of character n-gram subtypes.

Table 28: Most frequent tokens.

Table 29: Word count matrix for the first sentence of the novel “Amalia” (1855, AR) by José Mármol.

Table 30: Word count matrix with word lemmas.

Table 31: Top 15 words of two example topics.

Table 32: Example document output from a topic model.

Table 33: Parameters for topic feature sets.

Table 34: Parameters for classifiers.

Table 35: Experiments for parameter evaluation.

Table 36: Classification results for primary thematic subgenres (topics).

Table 37: Classification results for primary thematic subgenres (SVM, 90 topics, optimization interval of 250).

Table 38: Classification results for primary thematic subgenres (MFW).

Table 39: Classification results for primary thematic subgenres (RF, 3,000 MFW, tf-idf).

Table 40: Classification results for primary literary currents (topics).

Table 41: Classification results for primary literary currents (SVM, 90 topics, optimization interval of 2,500).

Table 42: Classification results for primary literary currents (MFW).

Table 43: Classification results for primary literary currents (SVM, 3,000 MFW, tf-idf).

Table 44: Overview of the family resemblance networks produced.

Table 45: Nearest neighbors in cluster 3 of the network HIST.

Table 46: Sources of the novels in the corpus.