Search in the Catalogues and Directories

Page: 1 2 3 4 5 6
Hits 41 – 60 of 101

41
Coreference and anaphoric annotations for spontaneous speech corpora in French.
In: 8th Discourse Anaphora and Anaphor Resolution Colloquium ; https://halshs.archives-ouvertes.fr/halshs-00764786 ; 8th Discourse Anaphora and Anaphor Resolution Colloquium, Oct 2011, Faro, Portugal. pp.182-190 (2011)
BASE
42
An Analysis of the Performances of the CasEN Named Entities Recognition System in the Ester2 Evaluation Campaign
In: https://hal.archives-ouvertes.fr/hal-00502370 ; 2010 (2010)
BASE
43
Dénomination et anaphore lexicale - Le réseau sémantique de Prolexbase
In: Construction d'identité et processus d'identification ; https://hal.archives-ouvertes.fr/hal-01067232 ; S.N. Osu, G. Col, N. Garric et F. Toupin. Construction d'identité et processus d'identification, Peter Lang Editions, pp.151-163, 2010 (2010)
BASE
44
Reconnaissance d'entités nommées : enrichissement d'un système à base de connaissances à partir de techniques de fouille de textes
In: Traitement Automatique des Langues Naturelles ; https://hal.archives-ouvertes.fr/hal-00568758 ; Traitement Automatique des Langues Naturelles, Jul 2010, Montréal, Canada (2010)
BASE
45
Who are you, you who speak? Transducer cascades for information retrieval
In: 4th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics ; https://hal.archives-ouvertes.fr/hal-01174643 ; 4th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, Nov 2009, Poznań, Poland (2009)
Abstract: This paper deals with a survey corpus. We present the retrieval of information about the speaker, using finite-state transducer cascades, and give detailed results with an evaluation. This work is part of a French project to enhance the ESLO corpus (a sociolinguistic survey conducted in the city of Orléans). The survey was carried out in 1968, and the project aims to save the recordings in digital format, to transcribe them, and to enrich the transcriptions with annotations in XML format. This work was supported by a French ANR contract (ANR-06-CORP-023) and by European funding from Région Centre (FEDER). The corpus is a collection of 200 interviews with questions about life in the city of Orléans (How long have you lived in Orléans?, What led you to live in Orléans?, Do you like living in Orléans?, etc.) and questions about the occupation or the family of the speaker, complemented by recordings made in professional or private contexts. The recording situations vary: interviews, discussions between friends, hidden-microphone recordings, interviews with political, academic and religious figures, and conversations between a social worker and parents at the Psycho Medical Center of Orléans. In total, there are 300 hours of speech, estimated at 4,500,000 words. More precisely, we worked on almost 120 transcribed hours, representing 112 Transcriber XML files and 32 577 Kb. We worked on 105 files (31 004 Kb) and evaluated the results on 7 files (1 573 Kb, 5.1%). The transcription files have no punctuation marks, but the first letter of proper names is capitalized and acronyms are fully capitalized. We used the CasSys system (Friburger, Maurel, 2004), which processes texts with transducer cascades (Abney, 1996). The cascades we used are hand-built: each transducer describes a local grammar for the recognition of some entities. Sometimes this recognition requires two or more transducers applied in succession, in a specific order.
More precisely, we used two cascades. The first one, for named entity recognition, was built some years ago for a newspaper corpus, and we adapted it to the oral corpus during the project. The second one aims at discovering information about the speaker in three domains: origin (is he/she a native of Orléans, or where does he/she come from?), family (is he/she married, with children or not?) and occupation (what is his/her occupation? where does he/she work?). We call this information "designating entities". This second cascade was built specifically for the project. CasSys applies transducers with the Unitex software (Paumier, 2003), which needs the text to be segmented during preprocessing. For written text, this segmentation usually relies on sentence boundary detection (Friburger et al., 2000). Our corpus has no punctuation, so we chose to use the Transcriber XML tags to perform the segmentation, and also to hide the tag contents during the named entity task, as these are sometimes ambiguous with entities in the context (Dister, 2007).
Keyword: [INFO]Computer Science [cs]; [SHS.LANGUE]Humanities and Social Sciences/Linguistics; Information retrieval; Named entity task; Survey; Transducer cascades
URL: https://hal.archives-ouvertes.fr/hal-01174643
https://hal.archives-ouvertes.fr/hal-01174643/file/ltc-2009-Maurel.pdf
https://hal.archives-ouvertes.fr/hal-01174643/document
BASE
46
Temporal Expressions: Comparisons in a Multilingual Corpus
In: Human Language Technologies as a Challenge for Computer Science and Linguistics ; 4th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics ; https://hal.archives-ouvertes.fr/hal-01024150 ; 4th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, 2009, Poznan, Poland. pp.531-535 (2009)
BASE
47
Explorer des corpus à l'aide de CasSys. Application au Corpus d'Orléans
In: G. Willems (ed.), Texte et corpus n°4, Actes des 6es Journées Internationales de Linguistique de Corpus (JLC). ; Journées de Linguistique de Corpus ; https://hal.archives-ouvertes.fr/hal-01174606 ; Journées de Linguistique de Corpus, Sep 2009, Lorient, France. pp.189-196 (2009)
BASE
48
Les noms propres d'association et d'organisation: traduction et traitement automatique
In: Nouveaux cahiers d'allemand. - Nancy : Univ. 26 (2008) 2, 161-174
BLLDB
OLC Linguistik
49
Prolexbase. A multilingual relational lexical database of proper names
In: LREC ; Sixth language resources and evaluation conference ; https://hal.archives-ouvertes.fr/hal-01024056 ; Sixth language resources and evaluation conference, 2008, Marrakech, Morocco (2008)
BASE
50
Prolexbase : Une base de données lexicale de noms propres pour le Tal
In: Colloque Lexicographie et informatique : bilan et perspectives ; https://hal.archives-ouvertes.fr/hal-01030489 ; Colloque Lexicographie et informatique : bilan et perspectives, Jan 2008, Nancy, France. pp.137-144 (2008)
BASE
51
Prolexbase et LMF: vers un standard pour les ressources lexicales sur les noms propres
In: ISSN: 1248-9433 ; EISSN: 1965-0906 ; Revue TAL ; https://hal.archives-ouvertes.fr/hal-01021179 ; Revue TAL, ATALA (Association pour le Traitement Automatique des Langues), 2008, 49 (1), pp.61-88 ; https://www.atala.org/content/prolexbase-et-lmf-vers-un-standard-pour-les-ressources-lexicales-sur-les-noms-propres (2008)
BASE
52
Compression method for natural language automata
In: Finite-State Methods and Natural Language Processing ; https://hal.archives-ouvertes.fr/hal-01024076 ; Finite-State Methods and Natural Language Processing, 2008, Ispra, Italy. pp.146-157 (2008)
BASE
53
Compression de dictionnaires électroniques
In: Neuvièmes journées internationales d'analyse statistique des données textuelles ; https://hal.archives-ouvertes.fr/hal-01030743 ; Neuvièmes journées internationales d'analyse statistique des données textuelles, 2008, Lyon, France. pp.1103-1114 (2008)
BASE
54
Automates et morphologie. Autour des noms propres, quelques réflexions sur la flexion en français
In: Linguistics, Computer Science and Language Processing ; https://hal.archives-ouvertes.fr/hal-01067215 ; Gaston Gross, Klaus U. Schulz. Linguistics, Computer Science and Language Processing, College Publications, pp.189-203, 2008 (2008)
BASE
55
Balisage XML des entités nommées et dénommantes du corpus Eslo
In: 1st Cataloguing and Encoding of Spoken Language Data (CatCod) ; https://hal.archives-ouvertes.fr/hal-01048597 ; 1st Cataloguing and Encoding of Spoken Language Data (CatCod), Dec 2008, Orléans, France (2008)
BASE
56
PROLEX: a lexical model for translation of proper names: application to French, Serbian and Bulgarian
In: BULAG. - Besançon : Presses Univ. de Franche-Comté 32 (2007), 55-72
BLLDB
57
Formaliser les langues avec l'ordinateur: de INTEX à NooJ. Actes des sixièmes (Sofia 2003) et septièmes (Tours 2004) Journées INTEX/Nooj
Koeva, Svetla; Maurel, Denis; Silberztein, Max. - Besançon : Presses Univ. de Franche-Comté, 2007
IDS Bibliografie zur deutschen Grammatik
58
Un dictionnaire INTEX de noms de professions: quels féminins possibles?
In: Formaliser les langues avec l'ordinateur. - Besançon : Presses Univ. de Franche-Comté (2007), 109-118
BLLDB
59
Formaliser les langues avec l'ordinateur : de INTEX à NooJ ; actes des sixièmes (Sofia 2003) et septièmes (Tours 2004) Journées INTEX/Nooj
Silberztein, Max (ed.); Maurel, Denis (ed.); Koeva, Svetla (ed.). - Besançon : Presses Univ. de Franche-Comté, 2007
BLLDB
UB Frankfurt Linguistik
60
A note on the semantic and morphological properties of proper names in the Prolex project
In: Lingvisticae investigationes. - Amsterdam : Benjamins 30 (2007) 1, 115-133
BLLDB
