Hits 1 – 20 of 25

1	Chinese computational linguistics and natural language processing based on naturally annotated big data : 13th China national conference, CCL 2014 and second international symposium, NLP-NABD 2014, Wuhan, China, October 18 - 19, 2014 : proceedings
	Zhao, Jun (editor); Sun, Maosong (editor); Liu, Yang (editor). - Cham, Heidelberg, New York, Dordrecht, London : Springer, 2014
	BLLDB
	UB Frankfurt Linguistik

2	Methods in Latin computational linguistics
	McGillivray, Barbara. - Leiden [u.a.] : Brill, 2014
	BLLDB
	UB Frankfurt Linguistik

3	MarsaTag, a tagger for French written texts and speech transcriptions
	Rauzy, Stéphane; Montcheuil, Grégoire; Blache, Philippe
	In: Second Asian Pacific Corpus linguistics Conference ; https://hal.archives-ouvertes.fr/hal-01500736 ; Second Asian Pacific Corpus linguistics Conference, Mar 2014, Hong Kong, China. pp.220-220 (2014)
	Abstract: We present in this paper a new system, MarsaTag, aimed at segmenting, tagging and chunking French input. The originality of the tool, on top of its efficiency, is its ability to process written texts as well as speech transcriptions. The tagger performs the following three operations. First, a rule-based tokenizer splits the raw textual input into a sequence of tokens. Second, using a broad-coverage morphosyntactic lexicon, each token form is associated with a tag distribution. The last step disambiguates the tagging by selecting the POS tag sequence with the highest probability. The probability of a sequence of tags is computed with a stochastic model based on the Hidden Markov Model machinery. The states or patterns of our model are extracted from the GraceLPL resource (700,000 tokens with morphosyntactic annotation). The tagger reaches an F-measure of 0.974 on written material. The tagger has been adapted to handle spontaneous speech transcriptions. The system has been trained on a large spoken French corpus (CID, see Bertrand et al. 2008). Phenomena specific to speech (filled pauses, disfluencies, truncation, etc.) were identified and included in a model specific to speech transcription inputs. On speech, the tagger reaches an F-measure of 0.948 when evaluated against the manually corrected tags of the CID corpus. MarsaTag is distributed with a software interface allowing the choice of various input and output formats (see hdl:11041/sldr000841). Because the technique is generic, extension to other languages for which annotated treebanks are available (e.g. Chinese Penn Treebank) is currently in progress.
	Keyword: [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing; [SHS.LANGUE]Humanities and Social Sciences/Linguistics; resource; syntax; tagging; treebank
	URL: https://hal.archives-ouvertes.fr/hal-01500736
	BASE
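The three steps described in the MarsaTag abstract (tokenize, look up a tag distribution per token, pick the most probable tag sequence) can be illustrated with a minimal Viterbi sketch. This is not MarsaTag's implementation: the toy lexicon, the transition table, the whitespace tokenizer, the `UNK` fallback tag, and the use of P(tag | form) in place of a true emission probability are all assumptions made for illustration.

```python
import math

# Hypothetical toy lexicon: each token form maps to a tag distribution P(tag | form).
LEXICON = {
    "la": {"DET": 0.7, "PRON": 0.2, "NOUN": 0.1},
    "porte": {"NOUN": 0.6, "VERB": 0.4},
    "ferme": {"VERB": 0.5, "ADJ": 0.3, "NOUN": 0.2},
}

# Hypothetical transition probabilities P(tag | previous tag); "<s>" marks sentence start.
TRANSITIONS = {
    ("<s>", "DET"): 0.6, ("<s>", "PRON"): 0.3, ("<s>", "NOUN"): 0.1,
    ("DET", "NOUN"): 0.8, ("DET", "ADJ"): 0.15,
    ("PRON", "VERB"): 0.8,
    ("NOUN", "VERB"): 0.5, ("NOUN", "ADJ"): 0.3, ("NOUN", "NOUN"): 0.1,
    ("VERB", "ADJ"): 0.2, ("VERB", "NOUN"): 0.2, ("VERB", "VERB"): 0.05,
}


def tokenize(text):
    """Crude stand-in for a rule-based tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def viterbi(tokens, lexicon, transitions, floor=1e-6):
    """Return the most probable tag sequence for tokens (computed in log space)."""
    if not tokens:
        return []

    def candidates(token):
        # Unknown forms get a single placeholder tag (an assumption of this sketch).
        return lexicon.get(token, {"UNK": 1.0})

    # best[i][tag] = (log-probability of the best path ending in tag, previous tag)
    best = [{}]
    for tag, p in candidates(tokens[0]).items():
        start = transitions.get(("<s>", tag), floor)
        best[0][tag] = (math.log(start) + math.log(p), None)

    for i in range(1, len(tokens)):
        column = {}
        for tag, p in candidates(tokens[i]).items():
            # Choose the predecessor tag that maximizes the path score.
            prev_tag, score = max(
                (
                    (pt, ps + math.log(transitions.get((pt, tag), floor)))
                    for pt, (ps, _) in best[i - 1].items()
                ),
                key=lambda item: item[1],
            )
            column[tag] = (score + math.log(p), prev_tag)
        best.append(column)

    # Backtrace from the best final tag.
    tag = max(best[-1], key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(tokens) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return list(reversed(path))


print(viterbi(tokenize("La porte ferme"), LEXICON, TRANSITIONS))
# ['DET', 'NOUN', 'VERB']
```

Working in log space avoids numerical underflow on long sentences, and the small `floor` value stands in for proper smoothing of unseen tag transitions.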

4	Phrase extraction and rescoring in statistical machine translation
	Srivastava, Ankit Kumar. - : Dublin City University. Centre for Next Generation Localisation (CNGL), 2014. : Dublin City University. School of Computing, 2014
	In: Srivastava, Ankit Kumar (2014) Phrase extraction and rescoring in statistical machine translation. PhD thesis, Dublin City University. (2014)
	BASE

5	Deep Syntax Annotation of the Sequoia French Treebank
	Candito, Marie; Perrier, Guy; Guillaume, Bruno...
	In: International Conference on Language Resources and Evaluation (LREC) ; https://hal.inria.fr/hal-00969191 ; International Conference on Language Resources and Evaluation (LREC), May 2014, Reykjavik, Iceland (2014)
	BASE

6	Rhapsodie: a Prosodic-Syntactic Treebank for Spoken French
	Lacheret, Anne; Kahane, Sylvain; Beliao, Julie...
	In: Language Resources and Evaluation Conference ; https://hal.sorbonne-universite.fr/hal-00968959 ; Language Resources and Evaluation Conference, May 2014, Reykjavik, Iceland (2014)
	BASE

7	Correcting and Validating Syntactic Dependency in the Spoken French Treebank Rhapsodie
	Bawden, Rachel; Bottala, Marie-Amélie; Gerdes, Kim...
	In: Proceedings of the 9th Language Resources and Evaluation Conference (LREC) ; https://halshs.archives-ouvertes.fr/halshs-01011059 ; Proceedings of the 9th Language Resources and Evaluation Conference (LREC), 2014, Iceland. pp.1-6 (2014)
	BASE

8	Exploring the prosody of stance : variation in the realization of stance adverbials
	Biber, Douglas; Staples, Shelley
	In: Spoken corpora and linguistic studies. - Amsterdam [u.a.] : Benjamins (2014), 271-294
	BLLDB

9	The grammatical annotation of speech corpora : techniques and perspectives
	Bick, Eckhard
	In: Spoken corpora and linguistic studies. - Amsterdam [u.a.] : Benjamins (2014), 105-128
	BLLDB

10	The notion of sentence and other discourse units in corpus annotation
	Pietrandrea, Paola; Kahane, Sylvain; Lacheret-Dujour, Anne...
	In: Spoken corpora and linguistic studies. - Amsterdam [u.a.] : Benjamins (2014), 331-364
	BLLDB

11	Methodological issues for spontaneous speech corpora compilation : the case of C-ORAL-BRASIL
	Mello, Heliana
	In: Spoken corpora and linguistic studies. - Amsterdam [u.a.] : Benjamins (2014), 27-68
	BLLDB

12	The IPIC resource and a cross-linguistic analysis of information structure in Italian and Brazilian Portuguese
	Panunzi, Alessandro; Mittmann, Maryualê Malvessi
	In: Spoken corpora and linguistic studies. - Amsterdam [u.a.] : Benjamins (2014), 129-151
	BLLDB

13	A multilingual speech corpus of North-Germanic languages
	Johannessen, Janne Bondi; Vangsnes, Øystein A.; Priestley, Joel...
	In: Spoken corpora and linguistic studies. - Amsterdam [u.a.] : Benjamins (2014), 69-83
	BLLDB

14	The theoretical foundations of givenness annotation
	Haug, Dag; Eckhoff, Hanne M.; Welo, Eirik
	In: Information structure and syntactic change in Germanic and Romance languages. - Amsterdam [u.a.] : Benjamins (2014), 17-52
	BLLDB

15	Accessing phonetic variation in spoken language corpora through non-standard orthography
	Schalley, Andrea C.; Musgrave, Simon; Haugh, Michael
	In: Australian journal of linguistics. - Basingstoke, Hampshire : Taylor & Francis 34 (2014) 1, 139-170
	BLLDB

16	Tamil Dependency Treebank v0.1
	Ramasamy, Loganathan; Žabokrtský, Zdeněk. - : Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL), 2014
	BASE

17	Copenhagen Dependency Treebanks versions 1-3
	Buch-Kromann, Matthias. - : Copenhagen Business School, 2014
	BASE

18	Czech-English Parallel Corpus 1.0 (CzEng 1.0)
	Bojar, Ondřej; Žabokrtský, Zdeněk; Dušek, Ondřej. - : Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL), 2014
	BASE

19	Prague Dependency Treebank 3.0
	Bejček, Eduard; Hajičová, Eva; Hajič, Jan. - : Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL), 2014
	BASE

20	HamleDT 2.0
	Zeman, Daniel; Mareček, David; Mašek, Jan. - : Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL), 2014
	BASE
