Search in the Catalogues and Directories

Hits 21 – 40 of 168

21
Bayesian phylogenetics, sequence alignment and the genetic structure of the Kainji languages. ...
Bacon, Geoff; Bird, Steven. - : Monash University, 2016
BASE
22
Learning Crosslingual Word Embeddings without Bilingual Corpora ...
BASE
23
Language Preservation 2.0: Crowdsourcing oral language documentation using mobile devices
Bird, Steven. - 2015
Abstract: In crude quantitative terms, Zipf’s law tells us that documentation of something as simple as word usage requires several million words of text or several hundred hours of speech, in a wide variety of genres and styles. The only way to achieve this goal for the majority of the world’s languages is to collect speech. Speech has the added advantage of providing information about phonetics, phonology, and prosody. Speech is also the primary register for dialogue, the most common form of language use. We argue that a combination of community outreach, crowdsourcing techniques, and mobile/web technologies make it relatively easy to collect hundreds or thousands of hours of speech (Callison-Burch and Dredze, 2010; Hughes et al., 2010; Anon 2010).

On its own, this would leave us with a large archive of uninterpreted audio recordings and – once the languages are no longer spoken – an onerous and unverifiable decipherment problem. To avoid this problem and to ensure interpretability, there must be a documentary record that includes translation into a major language. We take as our guide the current typical practice in documentary linguistics, which is to record and report data as interlinear glossed text. To this end, we add two layers of audio annotation to the primary recordings. The first layer is careful respeaking, or “audio transcription,” in which native speakers listen to the recordings phrase by phrase, and respeak each phrase slowly and carefully. The second layer is oral translation, in which bilingual speakers produce phrase-by-phrase interpretation of the original recordings into a widely-spoken contact language such as English.

This combination of respeaking and interpreting constitutes an “acoustic Rosetta stone” which, over time, will grow to a sufficient size to allow open-ended analysis of the language even when it is no longer spoken, including new methods for developing automatic phonetic recognizers and automatic translation systems (Liberman et al., 2013; Lee et al., 2013; Anon 2013). We will demonstrate a novel way to work with the speakers of endangered languages to collect these spoken language annotations and interlinear glossed texts on a large scale. Our approach addresses key issues in such areas as informed consent, quality control, workflow management, and the diverse technological situations of linguistic fieldwork. Our work promises to speed up the process of preserving the world’s languages and enable future study of these languages and access to knowledge that is captured in archived speech recordings.

References
Chris Callison-Burch and Mark Dredze. Creating speech and language data with Amazon’s Mechanical Turk. In Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon’s Mechanical Turk, pages 1–12. Association for Computational Linguistics, 2010. URL http://www.aclweb.org/anthology/W10-0701.
Thad Hughes, Kaisuke Nakajima, Linne Ha, Atul Vasu, Pedro J. Moreno, and Mike LeBeau. Building transcribed speech corpora quickly and cheaply for many languages. In INTERSPEECH 2010, 11th Annual Conference of the International Speech Communication Association, pages 1914–1917. ISCA, 2010.
Mark Liberman, Jiahong Yuan, Andreas Stolcke, Wen Wang, and Vikramjit Mitra. Using multiple versions of speech input in phone recognition. In ICASSP, 2013.
Chia-ying Lee, Yu Zhang, and James Glass. Joint learning of phonetic units and word pronunciations for ASR. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 182–192. Association for Computational Linguistics, 2013.

Files: 25309.mp3; 25309.pdf
URL: http://hdl.handle.net/10125/25309
BASE
24
Documentary Linguistics and Computational Linguistics: A response to Brooks
Bird, Steven; Chiang, David; Frowein, Friedel. - : University of Hawaii Press, 2015
BASE
25
Practical Natural Language Processing for Low-Resource Languages.
BASE
26
Documentary Linguistics and Computational Linguistics: A response to Brooks
Bird, Steven; Chiang, David; Frowein, Friedel. - : University of Hawaii Press, 2015
BASE
27
Language Preservation 2.0: Crowdsourcing oral language documentation using mobile devices
Bird, Steven. - 2015
BASE
28
Computational support for early elicitation and classification of tone
Bird, Steven; Lee, Haejoong. - : University of Hawai'i Press, 2014
BASE
29
Computational support for early elicitation and classification of tone
Bird, Steven; Lee, Haejoong. - : University of Hawai'i Press, 2014
BASE
30
Collecting bilingual audio in remote indigenous villages
BASE
31
The International Workshop on Language Preservation: An Experiment in Text Collection and Language Technology
Bird, Steven; Chiang, David; Frowein, Friedel. - : University of Hawaii Press, 2013
BASE
32
The International Workshop on Language Preservation: An Experiment in Text Collection and Language Technology
Bird, Steven; Chiang, David; Frowein, Friedel. - : University of Hawaii Press, 2013
BASE
33
Effects of distributed practice on the acquisition of second language English syntax
In: Applied psycholinguistics. - Cambridge [u.a.] : Cambridge Univ. Press 32 (2011) 2, 435-452
BLLDB
OLC Linguistik
34
Tone in Usarufa: Field Recordings
Bird, Steven. - 2011
BASE
35
Equipping university students to document their ancestral languages
BASE
36
Book Review
In: Natural language engineering. - Cambridge : Cambridge University Press 17 (2010) 3, 419-424
OLC Linguistik
37
Effects of distributed practice on the acquisition of second language English syntax
In: Applied psycholinguistics. - Cambridge [u.a.] : Cambridge Univ. Press 31 (2010) 4, 635-650
BLLDB
OLC Linguistik
38
The human language project: building a universal corpus of the world's languages
In: Association for Computational Linguistics. Proceedings of the conference. - Stroudsburg, Penn. : ACL 48 (2010) 1, 88-97
BLLDB
39
The Big Australian Speech Corpus (The Big ASC)
Chetty, Girija; Cassidy, Stephen; Butcher, Andrew Richard. - : Causal Productions, 2010
BASE
40
Fast query for large treebanks
Ghodke, Sumukh; Bird, Steven. - : Association for Computational Linguistics, 2010
BASE
