61 |
Morphology in the Corsican Language Database (BDLC) : assessment and perspectives ; La morphologie dans la Banque de Données Langue Corse : bilan et perspectives
In: ISSN: 1638-9808 ; EISSN: 1765-3126 ; Corpus ; https://hal.archives-ouvertes.fr/hal-03591866 ; Corpus, Bases, Corpus, Langage - UMR 7320, 2022, Corpus et données en morphologie, ⟨10.4000/corpus.7115⟩ ; https://journals.openedition.org/corpus/7115 (2022)
BASE
62 |
Towards Unsupervised Content Disentanglement in Sentence Representations via Syntactic Roles
In: CtrlGen: Controllable Generative Modeling in Language and Vision ; https://hal.inria.fr/hal-03540084 ; CtrlGen: Controllable Generative Modeling in Language and Vision, Jan 2022, virtual, France (2022)
Abstract:
International audience ; Linking neural representations to linguistic factors is crucial in order to build and analyze NLP models interpretable by humans. Among these factors, syntactic roles (e.g. subjects, direct objects, ...) and their realizations are essential markers since they can be understood as a decomposition of predicative structures and thus the meaning of sentences. Starting from a deep probabilistic generative model with attention, we measure the interaction between latent variables and realizations of syntactic roles, and show that it is possible to obtain, without supervision, representations of sentences where different syntactic roles correspond to clearly identified different latent variables. The probabilistic model we propose is an Attention-Driven Variational Autoencoder (ADVAE). Drawing inspiration from Transformer-based machine translation models, ADVAEs enable the analysis of the interactions between latent variables and input tokens through attention. We also develop an evaluation protocol to measure disentanglement with regard to the realizations of syntactic roles. This protocol is based on attention maxima for the encoder and on disturbing individual latent variables for the decoder. Our experiments on raw English text from the SNLI dataset show that i) disentanglement of syntactic roles can be induced without supervision, ii) ADVAE separates more syntactic roles than classical sequence VAEs, iii) realizations of syntactic roles can be separately modified in sentences by mere intervention on the associated latent variables. Our work constitutes a first step towards unsupervised controllable content generation. The code for our work is publicly available.
Keyword:
[INFO.INFO-TT]Computer Science [cs]/Document and Text Processing
URL: https://hal.inria.fr/hal-03540084 https://hal.inria.fr/hal-03540084/document https://hal.inria.fr/hal-03540084/file/lqVcfT-CTRLGEN%20%287%29.pdf
63 |
Corpus-based Language Universals Analysis using Universal Dependencies ; Analyse orientée corpus d'universaux linguistiques sur Universal Dependencies
In: SyntaxFest Quasy 2021 - Quantitative Syntax ; https://hal.inria.fr/hal-03501774 ; SyntaxFest Quasy 2021 - Quantitative Syntax, Mar 2022, Sofia, Bulgaria (2022)
64 |
Corpus-based Language Universals Analysis using Universal Dependencies ; Analyse orientée corpus d'universaux linguistiques sur Universal Dependencies
In: Quasy (Quantitative Syntax), SyntaxFest 2021 ; https://hal.inria.fr/hal-03501774 ; Quasy (Quantitative Syntax), SyntaxFest 2021, Mar 2022, Sofia, Bulgaria (2022)
65 |
Simplification of literary and scientific texts to improve reading fluency and comprehension in beginning readers of French
In: ISSN: 0142-7164 ; EISSN: 1469-1817 ; Applied Psycholinguistics ; https://hal-amu.archives-ouvertes.fr/hal-03549026 ; Applied Psycholinguistics, Cambridge University Press (CUP), 2022, pp.1-28. ⟨10.1017/S014271642100062X⟩ (2022)
66 |
Language Contact
In: UCLA Encyclopedia of Egyptology, vol 1, iss 1 (2022)
67 |
Psychological Well-Being of Left-Behind Children in China: Text Mining of the Social Media Website Zhihu.
In: International journal of environmental research and public health, vol 19, iss 4 (2022)
68 |
VEREINDEUTIGUNG ZUR KLASSIFIZIERUNG LEXIKALISCHER OBJEKTE ; DISAMBIGUATION FOR THE CLASSIFICATION OF LEXICAL ITEMS ; DÉSAMBIGUÏSATION POUR LA CLASSIFICATION DE LEXÈMES
In: https://hal.archives-ouvertes.fr/hal-03598242 ; France, Patent n° : EP3937059A1. 2022 (2022)
69 |
Assessing the impact of OCR noise on multilingual event detection over digitised documents
In: ISSN: 1432-5012 ; EISSN: 1432-1300 ; International Journal on Digital Libraries ; https://hal.archives-ouvertes.fr/hal-03635985 ; International Journal on Digital Libraries, Springer Verlag, 2022, ⟨10.1007/s00799-022-00325-2⟩ (2022)
70 |
Introducing the HIPE 2022 Shared Task: Named Entity Recognition and Linking in Multilingual Historical Documents
In: Advances in Information Retrieval. 44th European Conference on IR Research, ECIR 2022, Stavanger, Norway, April 10–14, 2022, Proceedings, Part II ; https://hal.archives-ouvertes.fr/hal-03635971 ; Matthias Hagen; Suzan Verberne; Craig Macdonald; Christin Seifert; Krisztian Balog; Kjetil Nørvåg; Vinay Setty. Advances in Information Retrieval. 44th European Conference on IR Research, ECIR 2022, Stavanger, Norway, April 10–14, 2022, Proceedings, Part II, 13186, Springer International Publishing, pp.347-354, 2022, Lecture Notes in Computer Science, 978-3-030-99738-0. ⟨10.1007/978-3-030-99739-7_44⟩ (2022)
71 |
French CrowS-Pairs: Extending a challenge dataset for measuring social bias in masked language models to a language other than English
In: ACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics ; https://hal.inria.fr/hal-03629677 ; ACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics, May 2022, Dublin, Ireland (2022)
72 |
Can Character-based Language Models Improve Downstream Task Performance in Low-Resource and Noisy Language Scenarios?
In: Seventh Workshop on Noisy User-generated Text (W-NUT 2021, colocated with EMNLP 2021) ; https://hal.inria.fr/hal-03527328 ; Seventh Workshop on Noisy User-generated Text (W-NUT 2021, colocated with EMNLP 2021), Jan 2022, Punta Cana, Dominican Republic ; https://aclanthology.org/2021.wnut-1.47/ (2022)
73 |
European Language Equality - Report on the French Language
In: https://hal.archives-ouvertes.fr/hal-03637776 ; [Research Report] CNRS - LISN. 2022 (2022)
74 |
Identifier l’ironie ?
In: ISSN: 1774-7988 ; EISSN: 2261-3455 ; Synergies Pologne ; https://halshs.archives-ouvertes.fr/halshs-03552205 ; Synergies Pologne, 2022 (2022)
75 |
PROTECT: A Pipeline for Propaganda Detection and Classification
In: CLiC-it 2021- Italian Conference on Computational Linguistics ; https://hal.archives-ouvertes.fr/hal-03417019 ; CLiC-it 2021- Italian Conference on Computational Linguistics, Jan 2022, Milan, Italy (2022)
76 |
Le modèle Transformer: un « couteau suisse » pour le traitement automatique des langues
In: Techniques de l'Ingenieur ; https://hal.archives-ouvertes.fr/hal-03619077 ; Techniques de l'Ingenieur, Techniques de l'ingénieur, 2022, ⟨10.51257/a-v1-in195⟩ ; https://www.techniques-ingenieur.fr/base-documentaire/innovation-th10/innovations-en-electronique-et-tic-42257210/transformer-des-reseaux-de-neurones-pour-le-traitement-automatique-des-langues-in195/ (2022)
77 |
Language identification, a tool for Corsican and for the evaluation of linguistic resources ; L'identification de langue, un outil au service du corse et de l'évaluation des ressources linguistiques
In: Traitement Automatique des Langues ; https://hal.archives-ouvertes.fr/hal-03633290 ; Traitement Automatique des Langues, 2022, Diversité Linguistique, 62 (3), pp.13-37 ; https://www.atala.org/content/diversité-linguistique-linguistic-diversity-natural-language-processing (2022)
78 |
Between History and Natural Language Processing: Study, Enrichment and Online Publication of French Parliamentary Debates of the Early Third Republic (1881-1899)
In: ParlaCLARIN III at LREC2022 - Workshop on Creating, Enriching and Using Parliamentary Corpora ; https://hal.archives-ouvertes.fr/hal-03623351 ; ParlaCLARIN III at LREC2022 - Workshop on Creating, Enriching and Using Parliamentary Corpora, Jun 2022, Marseille, France ; https://www.clarin.eu/ParlaCLARIN-III (2022)
79 |
Utiliser TinySegmenter avec Python
In: ISSN: 2729-465X ; Tekipaki ; https://hal.archives-ouvertes.fr/hal-03523195 ; 2022, https://tekipaki.hypotheses.org/2015 (2022)
80 |
Bislama: An Introduction to the National Language of Vanuatu