1. Finding Variants for Construction-Based Dialectometry: A Corpus-Based Approach to Regional CxGs ...
Abstract:
This paper develops a construction-based dialectometry capable of identifying previously unknown constructions and measuring the degree to which a given construction is subject to regional variation. The central idea is to learn a grammar of constructions (a CxG) using construction grammar induction and then to use these constructions as features for dialectometry. This offers a method for measuring the aggregate similarity between regional CxGs without limiting in advance the set of constructions subject to variation. The learned CxG is evaluated on how well it describes held-out test corpora, while dialectometry is evaluated on how well it can model regional varieties of English. The method is tested using two distinct datasets: first, the International Corpus of English representing eight outer circle varieties; second, a web-crawled corpus representing five inner circle varieties. Results show that the method (1) produces a grammar with stable quality across sub-sets of a single corpus that is (2) capable ...
Keywords:
Computation and Language cs.CL; FOS Computer and information sciences
URL: https://arxiv.org/abs/2104.01299 https://dx.doi.org/10.48550/arxiv.2104.01299
BASE
2. Representations of Language Varieties Are Reliable Given Corpus Similarity Measures ...
BASE
4. Learned Construction Grammars Converge Across Registers Given Increased Exposure ...
BASE
5. Production vs Perception: The Role of Individuality in Usage-Based Grammar Induction ...
BASE
6. Global Syntactic Variation in Seven Languages: Towards a Computational Dialectology ...
BASE
8. Learned Construction Grammars Converge Across Registers Given Increased Exposure
BASE
9. Production vs Perception: The Role of Individuality in Usage-Based Grammar Induction
BASE
10. Representations of Language Varieties Are Reliable Given Corpus Similarity Measures
BASE
12. Modeling Global Syntactic Variation in English Using Dialect Classification ...
BASE
13. Mapping Languages and Demographics with Georeferenced Corpora
BASE
14. Global Syntactic Variation in Seven Languages: Toward a Computational Dialectology
In: Front Artif Intell (2019)
BASE
17. Modeling the Complexity and Descriptive Adequacy of Construction Grammars
In: Proceedings of the Society for Computation in Linguistics (2018)
BASE
18. Learnability and falsifiability of Construction Grammars
In: Proceedings of the Linguistic Society of America; Vol 2 (2017): Proceedings of the Linguistic Society of America; 1:1–15; 2473-8689 (2017)
BASE
19. The Linguistic Status of Predictions and Feature Ranks from SVM Text Classifiers
In: LSA Annual Meeting Extended Abstracts; Vol 6: LSA Annual Meeting Extended Abstracts 2015; 5:1-5; 2377-3367 (2015)
BASE