2.
CorCenCC: Corpws Cenedlaethol Cymraeg Cyfoes – the National Corpus of Contemporary Welsh ...
Knight, Dawn; Morris, Steve; Fitzpatrick, Tess; Rayson, Paul; Spasić, Irena; Thomas, Enlli Môn; Lovell, Alex; Morris, Jonathan; Evas, Jeremy; Stonelake, Mark; Arman, Laura; Davies, Josh; Ezeani, Ignatius; Neale, Steven; Needs, Jennifer; Piao, Scott; Rees, Mair; Watkins, Gareth; Williams, Lowri; Muralidaran, Vignesh; Tovey-Walsh, Bethan; Anthony, Laurence; Cobb, Thomas M; Deuchar, Margaret; Donnelly, Kevin; McCarthy, Michael; Scannell, Kevin. Cardiff University, 2020
Abstract:
The CorCenCC corpus contains over 11 million words (circa 14.4m tokens) from written, spoken and electronic (online, digital) Welsh language sources, taken from a range of genres, language varieties (regional and social) and contexts. The contributors to CorCenCC are representative of the more than half a million Welsh speakers in the country. The creation of CorCenCC was a community-driven project, which offered users of Welsh an opportunity to contribute proactively to a Welsh language resource that reflects how Welsh is currently used. To make CorCenCC as representative of contemporary Welsh as possible, the project team designed a bespoke sampling framework. Extracts were collected from sources such as journals, emails, sermons, road signs, TV programmes, meetings, magazines and books. Conversations were recorded by the research team, and a specially designed crowdsourcing app (see: https://www.corcencc.org/app/) enabled Welsh speakers in the community to record and upload samples ...
Keywords:
Computational Linguistics; Computational/Corpus Linguistics; Language Corpora for ICT; Linguistics General
URL: https://dx.doi.org/10.17035/d.2020.0119878310
URL: https://research.cardiff.ac.uk/converis/portal/detail/Dataset/119878310?auxfun=&lang=en_GB
BASE
3.
The National Corpus of Contemporary Welsh: Project Report | Y Corpws Cenedlaethol Cymraeg Cyfoes: Adroddiad y Prosiect ...
4.
Yr Amliadur: Frequency Lists for Contemporary Welsh (Version 1.0.0) ...
5.
The engagement of BAAL - and applied linguistics - with policy and practice
7.
Making sense of learner performance on tests of productive vocabulary knowledge
9.
Establishing the Reliability of Word Association Data for Investigating Individual and Group Differences
10.
Establishing the reliability of word association data for investigating individual and group differences
20.
Establishing the reliability of word association data for investigating individual and group differences