Hits 1 – 20 of 32

1	Towards Arabic Sentence Simplification via Classification and Generative Approaches ...
	Khallaf, Nouran; Sharoff, Serge. - : arXiv, 2022
	BASE

2	Overview of the Fourth BUCC Shared Task: Bilingual Dictionary Induction from Comparable Corpora
	Rapp, Reinhard; Zweigenbaum, Pierre; Sharoff, Serge
	In: 13th Workshop on Building and Using Comparable Corpora (BUCC) ; https://hal.archives-ouvertes.fr/hal-03100822 ; 13th Workshop on Building and Using Comparable Corpora (BUCC), May 2020, Marseille, France. pp.6-13 (2020)
	BASE

3	Know thy corpus! Robust methods for digital curation of Web corpora ...
	Sharoff, Serge. - : arXiv, 2020
	BASE

4	Recognizing semantic relations by combining transformers and fully connected models
	Roussinov, Dmitri; Sharoff, Serge; Puchnina, Nadezhda. - : European Language Resources Association (ELRA), 2020
	BASE

5	A Multilingual Dataset for Evaluating Parallel Sentence Extraction from Comparable Corpora
	Zweigenbaum, Pierre; Sharoff, Serge; Rapp, Reinhard
	In: International Conference on Language Resources and Evaluation ; https://hal.archives-ouvertes.fr/hal-01898362 ; International Conference on Language Resources and Evaluation, May 2018, Miyazaki, Japan (2018)
	Abstract: Comparable corpora can be seen as a reservoir for parallel sentences and phrases to overcome limitations in variety and quantity encountered in existing parallel corpora. This has motivated the design of methods to extract parallel sentences from comparable corpora. Despite this interest and work, no shared dataset had been made available for this task until the 2017 BUCC Shared Task. We present the challenges faced in building such a dataset and the solutions adopted to design and create the 2017 BUCC Shared Task dataset, emphasizing issues we had to cope with to include Chinese as one of the languages. The resulting corpus contains a total of about 3.5 million distinct sentences in English, French, German, Russian, and Chinese, mostly from Wikipedia. We illustrate the use of this dataset in the shared task and summarize the main results obtained by its participants. We finally outline remaining issues.
	Keyword: [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [INFO]Computer Science [cs]; Comparable Corpora; Cross-language similarity; Natural Language Processing; Parallel corpora; Parallel Sentences
	URL: https://hal.archives-ouvertes.fr/hal-01898362
	BASE

6	Functional text dimensions for the annotation of web corpora
	Sharoff, Serge
	In: Corpora. - Edinburgh : Univ. Press 13 (2018) 1, 65-95
	BLLDB

7	Crowdsourcing for web genre annotation
	Asheghi, Noushin Rezapour [author]; Sharoff, Serge [author]; Markert, Katja [author]. - Hannover : Gottfried Wilhelm Leibniz Universität Hannover, 2016
	DNB Subject Category Language

8	Language Adaptation for Extending Post-Editing Estimates for Closely Related Languages
	Rios, Miguel; Sharoff, Serge
	In: Prague Bulletin of Mathematical Linguistics, Vol 106, Iss 1, Pp 181-192 (2016)
	BASE

9	MULTEXT-East non-commercial lexicons 4.0
	Erjavec, Tomaž; Derzhanski, Ivan; Divjak, Dagmar. - : Jožef Stefan Institute, 2015
	BASE

10	Document dissimilarity within and across languages: A benchmarking study
	Forsyth, Richard S.; Sharoff, Serge
	In: LLC. - Oxford : Oxford Univ. Press 29 (2014) 1, 6
	OLC Linguistik

11	Languages for Specific Purposes in the Digital Era
	Sevilla Pavón, Ana; Rodríguez Arancón, Pilar; Albarrán Martín, Reyes. - 2014
	BLLDB
	UB Frankfurt Linguistik

12	Building and using comparable corpora
	Sharoff, Serge [editor]; Fung, Pascale [editor]; Rapp, Reinhard [editor]. - 2013
	DNB Subject Category Language

13	Corpus-based vocabulary lists for language learners for nine languages [Journal]
	Kilgarriff, Adam [author]; Charalabopoulou, Frieda [author]; Gavrilidou, Maria [author].
	DNB Subject Category Language

14	Building and Using Comparable Corpora
	Sharoff, Serge; Rapp, Reinhard; Zweigenbaum, Pierre. - Berlin, Heidelberg : Springer Berlin Heidelberg, 2013
	UB Frankfurt Linguistik

15	Building and using comparable corpora
	Sharoff, Serge (ed.). - Berlin [et al.] : Springer, 2013
	BLLDB
	UB Frankfurt Linguistik

16	Building and using comparable corpora
	Sharoff, Serge; Rapp, Reinhard; Zweigenbaum, Pierre. - : Springer, 2013
	BASE

17	Terminology Extraction, Translation Tools and Comparable Corpora: TTC concept, midterm progress and achieved results
	Gornostay, Tatiana; Gojun, Anita; Weller, Marion...
	In: LREC 2012 Workshop on Creating Cross-language Resources for Disconnected Languages and Styles (CREDISLAS) ; https://hal.archives-ouvertes.fr/hal-00819909 ; LREC 2012 Workshop on Creating Cross-language Resources for Disconnected Languages and Styles (CREDISLAS), May 2012, Istanbul, Turkey. 4 p (2012)
	BASE

18	Genres on the Web : Computational Models and Empirical Studies
	Mehler, Alexander; Sharoff, Serge; Santini, Marina. - Dordrecht : Springer Netherlands, 2011
	UB Frankfurt Linguistik

19	User-centred Views on Terminology Extraction Tools: Usage Scenarios and Integration into MT and CAT Tools.
	Blancafort, Helena; Heid, Ulrich; Gornostay, Tatiana...
	In: Actes du colloque Tralogy : Anticiper les technologies pour la traduction ; Tralogy I. Métiers et technologies de la traduction : quelles convergences pour l'avenir ? ; https://hal.archives-ouvertes.fr/hal-00818657 ; Tralogy I. Métiers et technologies de la traduction : quelles convergences pour l'avenir ?, Mar 2011, Paris, France. 10 p (2011)
	BASE

20	Balancing form and function in corpus research
	Sharoff, Serge
	In: International journal of corpus linguistics. - Amsterdam [et al.] : Benjamins 15 (2010) 3, 419-424
	BLLDB
	OLC Linguistik
