4 | Lexico-grammaire et textométrie : identification et visualisation de schémas lexico-grammaticaux caractéristiques dans deux corpus juridiques comparables en français
In: ISSN: 1638-9808 ; EISSN: 1765-3126 ; Corpus ; https://hal-univ-paris.archives-ouvertes.fr/hal-02615941 ; Corpus, Bases, Corpus, Langage - UMR 7320, 2017 (2017)
BASE
5 | Creation of a multilingual aligned corpus with Ukrainian as the target language and its exploitation
In: Computational Linguistics and Intelligent Systems ; https://hal.archives-ouvertes.fr/hal-01736363 ; Computational Linguistics and Intelligent Systems, Apr 2017, Kharkiv, Ukraine (2017)
BASE
6 | Early gestures and signs in French Sign Language acquisition
In: Language as a form of Action ; https://hal.archives-ouvertes.fr/hal-01705105 ; Language as a form of Action, Jun 2017, Rome, Italy. 2017 ; http://www.dcomm.eu/events/conference-rome-june-2017/ (2017)
BASE
7 | The Acquisition of Nominal and Verbal Liaisons: From Lexicalized to Abstract Constructions ; Acquisition des liaisons nominales et verbales : de la lexicalisation à l’abstraction des constructions
In: ISSN: 0023-8368 ; EISSN: 1957-7982 ; Langue française ; https://hal.archives-ouvertes.fr/hal-01615857 ; Langue française, Armand Colin, 2017, Les constructions comme unités de la langue : illustrations, évaluation, critique, 194 (2), pp.125 - 146. ⟨10.3917/lf.194.0125⟩ ; https://www.cairn.info/revue-langue-francaise-2017-2.htm (2017)
BASE
8 | Interdisciplinary and interlinguistic perspectives on Academic Discourse: the mode variable
In: ISSN: 2386-2629 ; Chimera: Romance Corpora and Linguistic Studies ; https://hal.archives-ouvertes.fr/hal-01570885 ; Chimera: Romance Corpora and Linguistic Studies, Universidad Autónoma de Madrid, 2017, 4 (1), pp.1-11 ; https://revistas.uam.es/index.php/chimera/article/view/7810 (2017)
BASE
9 | An experiment with three genre-specific corpora for teaching EAP to French speakers.
In: BAAL Corpus SIG symposium: Using Corpora in EAP. ; https://halshs.archives-ouvertes.fr/halshs-03100562 ; BAAL Corpus SIG symposium: Using Corpora in EAP., Mar 2017, Durham, United Kingdom (2017)
BASE
10 | Le Thesaurus occitan dans tous ses états
In: ISSN: 1386-1204 ; EISSN: 1875-368X ; Revue Française de Linguistique Appliquée ; https://halshs.archives-ouvertes.fr/halshs-01633047 ; Revue Française de Linguistique Appliquée, Paris : Publications linguistiques, 2017, XXII-1, pp.89-102 (2017)
BASE
11 | Segmentation of oral corpora: First findings from a cross-language study
In: 15th International Pragmatics Conference - IPRA 2017 ; https://hal.archives-ouvertes.fr/hal-01773630 ; 15th International Pragmatics Conference - IPRA 2017, Jul 2017, Belfast, United Kingdom (2017)
BASE
12 | Mining a Multimodal Corpus of Doctor's Training for Virtual Patient's Feedbacks
In: 19th International Conference on Multimodal Interaction (ICMI) ; https://hal.archives-ouvertes.fr/hal-01654812 ; 19th International Conference on Multimodal Interaction (ICMI), Nov 2017, Glasgow, United Kingdom. ⟨10.1145/3136755.3136816⟩ (2017)
BASE
13 | Language Learning as Language Use: Statistically-based Chunking in Development
BASE
14 | Schwa Realization in French: Using Automatic Speech Processing to Study Phonological and Socio-linguistic Factors in Large Corpora
In: Annual Conference of the International Speech Communication Association ; https://hal.archives-ouvertes.fr/hal-01837179 ; Annual Conference of the International Speech Communication Association, ISCA, Aug 2017, Stockholm, Sweden (2017)
BASE
15 | Le projet SegCor : Quelles unités pour la segmentation d’un corpus d’interactions en français et en allemand ?
In: Colloque FLORAL – Accessibilité, représentations et analyses des données ; https://hal.archives-ouvertes.fr/hal-01773621 ; Colloque FLORAL – Accessibilité, représentations et analyses des données, Mar 2017, Orléans, France (2017)
BASE
16 | Ce que les corpus pourraient apporter aux grammaires et/ou aux dictionnaires : l’exemple de contre et même
In: ISSN: 2610-3745 ; Dossiers d'HEL ; https://hal.archives-ouvertes.fr/hal-01511249 ; Dossiers d'HEL, SHESL, 2017, Analyse et exploitation des données de corpus linguistiques, pp.41-51 ; http://shesl.org/index.php/dossier11-analyse-exploitation-corpus/ (2017)
BASE
17 | SegCor : vers une segmentation multiniveaux pour le français parlé
In: Colloque Syntaxe et discours III – Types d’unités et procédures de segmentation ; https://hal.archives-ouvertes.fr/hal-01773625 ; Colloque Syntaxe et discours III – Types d’unités et procédures de segmentation, Florence Lefeuvre; Marie-José Béguelin; Gilles Corminboeuf, Jun 2017, Paris, France (2017)
BASE
18 | Data-Driven Identification of German Phrasal Compounds
In: Text, Speech, and Dialogue ; https://hal.archives-ouvertes.fr/hal-01575651 ; Kamil Ekštein; Václav Matoušek. Text, Speech, and Dialogue, 10415, Springer International Publishing, pp.192-200, 2017, Lecture Notes in Computer Science, 978-3-319-64205-5. ⟨10.1007/978-3-319-64206-2_22⟩ ; https://link.springer.com/bookseries/558 (2017)
BASE
19 | An empirical study of the Algerian dialect of Social network
In: ICNLSSP 2017 - International Conference on Natural Language, Signal and Speech Processing ; https://hal.inria.fr/hal-01659997 ; ICNLSSP 2017 - International Conference on Natural Language, Signal and Speech Processing, Dec 2017, Casablanca, Morocco ; http://icnlssp.isga.ma (2017)
Abstract: In this paper, we present an analysis of the use of the Algerian dialect on YouTube. To do so, we harvested a corpus of 17M words. This corpus was used to extract a comparable Algerian corpus, named CALYOU, by aligning pairs of sentences written in Latin and Arabic script. CALYOU was built using a multilingual word embeddings approach. Several experiments were conducted to fix the parameters of the Continuous Bag of Words approach, which are discussed in this article. The proposed method achieved a recall of 41%. We then present several figures on the collected data that led to several unexpected results: 51% of the vocabulary words are written in Latin script, and 82% of all comments are subject to code-switching.
Keyword: [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL]; Algerian dialect; Code-switching; comparable corpora; Word embedding
URL: https://hal.inria.fr/hal-01659997/file/ICNLSSP2017_paper_16.pdf https://hal.inria.fr/hal-01659997 https://hal.inria.fr/hal-01659997/document
BASE
20 | Connecting Resources: Which Issues Have to be Solved to Integrate CMC Corpora from Heterogeneous Sources and for Different Languages?
In: 5th Conference on CMC and Social Media Corpora for the Humanities (cmccorpora17) ; https://hal.archives-ouvertes.fr/hal-01918880 ; 5th Conference on CMC and Social Media Corpora for the Humanities (cmccorpora17), Oct 2017, Bolzano, Italy. pp.52-55 ; https://doi.org/10.5281/zenodo.1040713 (2017)
BASE