81
Data for: Psycholinguistic dataset on language use in 1145 novels published in English and Dutch ...
Abstract:
LIWC and n-gram counts of English and Dutch novels
==================================================

This dataset consists of CSV files with word counts in several corpora:

- 694 English language novels from different genders and orientations
- 401 bestselling Dutch language novels
- 50 novels nominated for Dutch literary prizes

Each corpus comes with:

- LIWC counts; this file also includes the available metadata for each novel.
  The English data was created with LIWC 2015.
  The Dutch data was created with the validated translation of LIWC 2001.
- Word counts (unigrams) and bigram counts per novel.
  All text has been converted to lowercase.
  Contractions are tokenized into separate tokens, e.g., can't => ca n't
  Two restrictions are applied:
  - only unigrams or bigrams that occur in at least 10 texts are retained
  - only the 100k most frequent are retained
- Overall word counts and bigram counts; i.e., the sum across all novels.

All files are encoded in UTF-8. ...
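The two retention restrictions named in the abstract (a minimum number of texts and a cap on the most frequent items) can be illustrated with a minimal sketch. It is not the dataset authors' code: the function name `filter_ngrams`, its parameters, and the order in which the two restrictions are applied (text-count filter first, frequency cap second) are assumptions made for illustration, and the input is assumed to be already lowercased and tokenized.

```python
from collections import Counter


def filter_ngrams(docs, n=1, min_texts=10, max_types=100_000):
    """Count n-grams per text, then apply the two restrictions.

    docs: list of token lists (hypothetical input format).
    Returns (per-text counts restricted to kept n-grams, overall counts).
    """
    # Per-text n-gram counts; each n-gram is a tuple of n tokens.
    per_doc = [
        Counter(zip(*(tokens[i:] for i in range(n))))
        for tokens in docs
    ]

    doc_freq = Counter()  # number of texts each n-gram occurs in
    total = Counter()     # summed frequency across all texts
    for counts in per_doc:
        doc_freq.update(counts.keys())
        total.update(counts)

    # Restriction 1 (assumed to come first): occurs in at least min_texts texts.
    eligible = Counter(
        {g: c for g, c in total.items() if doc_freq[g] >= min_texts}
    )
    # Restriction 2: keep only the max_types most frequent n-grams.
    kept = dict(eligible.most_common(max_types))

    per_doc_kept = [
        {g: c for g, c in counts.items() if g in kept}
        for counts in per_doc
    ]
    return per_doc_kept, kept
```

Calling `filter_ngrams(docs, n=2)` gives the bigram variant; the returned overall counts correspond to the "sum across all novels" described in the abstract.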
Keyword:
Arts and Humanities; Computational Linguistics
URL: https://data.mendeley.com/datasets/x3m2gjkhx5 https://dx.doi.org/10.17632/x3m2gjkhx5
BASE
82
Graph-to-Graph Translations To Augment Abstract Meaning Representation Tense And Aspect ...
BASE
83
A Journey in Linguistic Computing from Father Busa to Linguistic Linked Data ...
BASE
84
A Journey in Linguistic Computing from Father Busa to Linguistic Linked Data ...
BASE
86
Including Signed Languages in Natural Language Processing ...
BASE
90
Rule-based Morphological Inflection Improves Neural Terminology Translation ...
BASE
91
When is Char Better Than Subword: A Systematic Study of Segmentation Algorithms for Neural Machine Translation ...
BASE
92
The Reading Machine: a Versatile Framework for Studying Incremental Parsing Strategies ...
BASE
93
To POS Tag or Not to POS Tag: The Impact of POS Tags on Morphological Learning in Low-Resource Settings ...
BASE
94
Translating Headers of Tabular Data: A Pilot Study of Schema Translation ...
BASE
95
A Prototype Free/Open-Source Morphological Analyser and Generator for Sakha ...
BASE
97
Developing Conversational Data and Detection of Conversational Humor in Telugu ...
BASE
98
Superbizarre Is Not Superb: Derivational Morphology Improves BERT's Interpretation of Complex Words ...
BASE
99
HIT - A Hierarchically Fused Deep Attention Network for Robust Code-mixed Language Representation ...
BASE
100
Minimally-Supervised Morphological Segmentation using Adaptor Grammars with Linguistic Priors ...
BASE