1 |
Obvie: interface web pour la fouille et la comparaison de textes [Obvie: a web interface for text mining and comparison]
|
|
|
|
In: Workshop DigitAl Humanities and cuLtural herItAge: data and knowledge management and analysis, held at the francophone conference Extraction et Gestion des Connaissances (egc2022) ; https://hal.archives-ouvertes.fr/hal-03543362 ; Workshop DigitAl Humanities and cuLtural herItAge: data and knowledge management and analysis, held at the francophone conference Extraction et Gestion des Connaissances (egc2022), Jan 2022, Blois, France ; https://egc2022.univ-tours.fr/ateliers/ (2022)
|
|
BASE
|
|
2 |
Preprint Citation Praxis in PLOS
|
|
|
|
In: ISSN: 0138-9130 ; EISSN: 1588-2861 ; Scientometrics ; https://hal.archives-ouvertes.fr/hal-03506094 ; In press (2022)
|
|
BASE
|
|
3 |
Islands and Bridges of Language: Bio-Inspired Structural Analysis of Language Embedding Data
|
|
|
|
Abstract:
In this thesis, I propose a method of applying an agent-based model named Monte Carlo Physarum Machine (MCPM) to language embedding data. This method has previously been applied in astronomy to infer the quasi-fractal structure of the cosmic web. I show that this model can provide a distinct lens for understanding, analyzing, and extracting information from language embedding data. I assess the novelty of the algorithm first by identifying the characteristics of the revealed structure through visualization, and then by generating word similarity metrics and comparing them with other status quo similarity metrics. In addition, I propose a visualization tool to further help explore the language embedding space in 3D. As a result, I argue that both the MCPM method and the visualization tool can assist in examining the structure of language embeddings in the reduced 3D space.
|
|
Keyword:
Computer science; Data Science; Data Visualization; Digital Media; Information Retrieval; Linguistics; Natural Language Processing
|
|
URL: https://escholarship.org/uc/item/6zj1r9ch
|
|
BASE
|
|
4 |
Assessing the impact of OCR noise on multilingual event detection over digitised documents
|
|
|
|
In: ISSN: 1432-5012 ; EISSN: 1432-1300 ; International Journal on Digital Libraries ; https://hal.archives-ouvertes.fr/hal-03635985 ; International Journal on Digital Libraries, Springer Verlag, 2022, ⟨10.1007/s00799-022-00325-2⟩ (2022)
|
|
BASE
|
|
5 |
Introducing the HIPE 2022 Shared Task: Named Entity Recognition and Linking in Multilingual Historical Documents
|
|
|
|
In: Advances in Information Retrieval. 44th European Conference on IR Research, ECIR 2022, Stavanger, Norway, April 10–14, 2022, Proceedings, Part II ; https://hal.archives-ouvertes.fr/hal-03635971 ; Matthias Hagen; Suzan Verberne; Craig Macdonald; Christin Seifert; Krisztian Balog; Kjetil Nørvåg; Vinay Setty. Advances in Information Retrieval. 44th European Conference on IR Research, ECIR 2022, Stavanger, Norway, April 10–14, 2022, Proceedings, Part II, 13186, Springer International Publishing, pp.347-354, 2022, Lecture Notes in Computer Science, 978-3-030-99738-0. ⟨10.1007/978-3-030-99739-7_44⟩ (2022)
|
|
BASE
|
|
6 |
Can Character-based Language Models Improve Downstream Task Performance in Low-Resource and Noisy Language Scenarios?
|
|
|
|
In: Seventh Workshop on Noisy User-generated Text (W-NUT 2021, colocated with EMNLP 2021) ; https://hal.inria.fr/hal-03527328 ; Seventh Workshop on Noisy User-generated Text (W-NUT 2021, colocated with EMNLP 2021), Jan 2022, Punta Cana, Dominican Republic ; https://aclanthology.org/2021.wnut-1.47/ (2022)
|
|
BASE
|
|
7 |
Between History and Natural Language Processing: Study, Enrichment and Online Publication of French Parliamentary Debates of the Early Third Republic (1881-1899)
|
|
|
|
In: ParlaCLARIN III at LREC2022 - Workshop on Creating, Enriching and Using Parliamentary Corpora ; https://hal.archives-ouvertes.fr/hal-03623351 ; ParlaCLARIN III at LREC2022 - Workshop on Creating, Enriching and Using Parliamentary Corpora, Jun 2022, Marseille, France ; https://www.clarin.eu/ParlaCLARIN-III (2022)
|
|
BASE
|
|
8 |
A comparative study of several parameterizations for speaker recognition ...
|
|
|
|
BASE
|
|
9 |
Speaker verification in mismatch training and testing conditions ...
|
|
|
|
BASE
|
|
10 |
Speech Segmentation Optimization using Segmented Bilingual Speech Corpus for End-to-end Speech Translation ...
|
|
|
|
BASE
|
|
11 |
A New Amharic Speech Emotion Dataset and Classification Benchmark ...
|
|
|
|
BASE
|
|
12 |
Lahjoita puhetta -- a large-scale corpus of spoken Finnish with some benchmarks ...
|
|
|
|
BASE
|
|
14 |
Subspace-based Representation and Learning for Phonotactic Spoken Language Recognition ...
|
|
|
|
BASE
|
|
15 |
LPC Augment: An LPC-Based ASR Data Augmentation Algorithm for Low and Zero-Resource Children's Dialects ...
|
|
|
|
BASE
|
|
16 |
Automatic Dialect Density Estimation for African American English ...
|
|
|
|
BASE
|
|
17 |
End-to-end contextual ASR based on posterior distribution adaptation for hybrid CTC/attention system ...
|
|
|
|
BASE
|
|
18 |
Towards Contextual Spelling Correction for Customization of End-to-end Speech Recognition Systems ...
|
|
|
|
BASE
|
|
19 |
SHAS: Approaching optimal Segmentation for End-to-End Speech Translation ...
|
|
|
|
BASE
|
|
20 |
Automatic Detection of Speech Sound Disorder in Child Speech Using Posterior-based Speaker Representations ...
|
|
|
|
BASE