1 |
ViQuAE, a Dataset for Knowledge-based Visual Question Answering about Named Entities
|
|
|
|
In: ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’22) ; https://hal-universite-paris-saclay.archives-ouvertes.fr/hal-03650618 ; 2022 (2022)
|
|
BASE
|
|
|
|
2 |
Towards combined semantic and lexical scores based on a new representation of textual data to extract experimental data from scientific publications
|
|
|
|
In: ISSN: 1751-5858 ; EISSN: 1751-5866 ; International Journal of Intelligent Information and Database Systems ; https://hal.inrae.fr/hal-03616243 ; International Journal of Intelligent Information and Database Systems, Inderscience, 2022, 15 (1), pp.78. ⟨10.1504/IJIIDS.2022.120146⟩ (2022)
|
|
BASE
|
|
|
|
3 |
Obvie: interface web pour la fouille et la comparaison de textes [Obvie: a web interface for text mining and comparison]
|
|
|
|
In: Atelier DigitAl Humanities and cuLtural herItAge: data and knowledge management and analysis durant la conférence francophone sur l'Extraction et la Gestion des Connaissances (egc2022) ; https://hal.archives-ouvertes.fr/hal-03543362 ; Atelier DigitAl Humanities and cuLtural herItAge: data and knowledge management and analysis durant la conférence francophone sur l'Extraction et la Gestion des Connaissances (egc2022), Jan 2022, Blois, France ; https://egc2022.univ-tours.fr/ateliers/ (2022)
|
|
BASE
|
|
|
|
4 |
Preprint Citation Praxis in PLOS
|
|
|
|
In: ISSN: 0138-9130 ; EISSN: 1588-2861 ; Scientometrics ; https://hal.archives-ouvertes.fr/hal-03506094 ; In press (2022)
|
|
BASE
|
|
|
|
5 |
Islands and Bridges of Language: Bio-Inspired Structural Analysis of Language Embedding Data
|
|
|
|
Abstract:
In this thesis, I propose a method of applying an agent-based model named Monte Carlo Physarum Machine (MCPM) to language embedding data. This method has been previously applied in astronomy for inferring the quasi-fractal structure of the cosmic web. In this thesis, I show that this model can provide a distinct scope to understand, analyze and extract information from language embedding data. I assess the novelty of the algorithm first by identifying the characteristics of the revealed structure through visualization, and generate word similarity metrics in comparison with other status quo similarity metrics. In addition, I propose a visualization tool to further help explore the language embedding space in 3D. As a result, I argue that both the MCPM method and the visualization tool can assist in examining the structure of language embedding in the reduced 3D space.
|
|
Keyword:
Computer science; Data Science; Data Visualization; Digital Media; Information Retrieval; Linguistics; Natural Language Processing
|
|
URL: https://escholarship.org/uc/item/6zj1r9ch
|
|
BASE
|
|
|
|
6 |
Assessing the impact of OCR noise on multilingual event detection over digitised documents
|
|
|
|
In: ISSN: 1432-5012 ; EISSN: 1432-1300 ; International Journal on Digital Libraries ; https://hal.archives-ouvertes.fr/hal-03635985 ; International Journal on Digital Libraries, Springer Verlag, 2022, ⟨10.1007/s00799-022-00325-2⟩ (2022)
|
|
BASE
|
|
|
|
7 |
Introducing the HIPE 2022 Shared Task: Named Entity Recognition and Linking in Multilingual Historical Documents
|
|
|
|
In: Advances in Information Retrieval. 44th European Conference on IR Research, ECIR 2022, Stavanger, Norway, April 10–14, 2022, Proceedings, Part II ; https://hal.archives-ouvertes.fr/hal-03635971 ; Matthias Hagen; Suzan Verberne; Craig Macdonald; Christin Seifert; Krisztian Balog; Kjetil Nørvåg; Vinay Setty. Advances in Information Retrieval. 44th European Conference on IR Research, ECIR 2022, Stavanger, Norway, April 10–14, 2022, Proceedings, Part II, 13186, Springer International Publishing, pp.347-354, 2022, Lecture Notes in Computer Science, 978-3-030-99738-0. ⟨10.1007/978-3-030-99739-7_44⟩ (2022)
|
|
BASE
|
|
|
|
8 |
Can Character-based Language Models Improve Downstream Task Performance in Low-Resource and Noisy Language Scenarios?
|
|
|
|
In: Seventh Workshop on Noisy User-generated Text (W-NUT 2021, colocated with EMNLP 2021) ; https://hal.inria.fr/hal-03527328 ; Seventh Workshop on Noisy User-generated Text (W-NUT 2021, colocated with EMNLP 2021), Jan 2022, Punta Cana, Dominican Republic ; https://aclanthology.org/2021.wnut-1.47/ (2022)
|
|
BASE
|
|
|
|
9 |
Between History and Natural Language Processing: Study, Enrichment and Online Publication of French Parliamentary Debates of the Early Third Republic (1881-1899)
|
|
|
|
In: ParlaCLARIN III at LREC2022 - Workshop on Creating, Enriching and Using Parliamentary Corpora ; https://hal.archives-ouvertes.fr/hal-03623351 ; ParlaCLARIN III at LREC2022 - Workshop on Creating, Enriching and Using Parliamentary Corpora, Jun 2022, Marseille, France ; https://www.clarin.eu/ParlaCLARIN-III (2022)
|
|
BASE
|
|
|
|
10 |
A Dataset for Toponym Resolution in Nineteenth-Century English Newspapers
|
|
|
|
In: Journal of Open Humanities Data; Vol 8 (2022); 3 ; 2059-481X (2022)
|
|
BASE
|
|
|
|
11 |
Cross-media Scientific Research Achievements Query based on Ranking Learning ...
|
|
|
|
BASE
|
|
|
|
12 |
Exploring Sub-skeleton Trajectories for Interpretable Recognition of Sign Language ...
|
|
|
|
BASE
|
|
|
|
13 |
Cross-Lingual Query-Based Summarization of Crisis-Related Social Media: An Abstractive Approach Using Transformers ...
|
|
|
|
BASE
|
|
|
|
14 |
Simplifying Multilingual News Clustering Through Projection From a Shared Space ...
|
|
|
|
BASE
|
|
|
|
15 |
Towards Best Practices for Training Multilingual Dense Retrieval Models ...
|
|
|
|
BASE
|
|
|
|
16 |
Addressing Issues of Cross-Linguality in Open-Retrieval Question Answering Systems For Emergent Domains ...
|
|
|
|
BASE
|
|
|
|
17 |
C3: Continued Pretraining with Contrastive Weak Supervision for Cross Language Ad-Hoc Retrieval ...
|
|
|
|
BASE
|
|
|
|
18 |
Parameter-Efficient Neural Reranking for Cross-Lingual and Multilingual Retrieval ...
|
|
|
|
BASE
|
|
|
|
19 |
QALD-9-plus: A Multilingual Dataset for Question Answering over DBpedia and Wikidata Translated by Native Speakers ...
|
|
|
|
BASE
|
|
|
|
20 |
MuMiN: A Large-Scale Multilingual Multimodal Fact-Checked Misinformation Social Network Dataset ...
|
|
|
|
BASE
|
|
|
|
|
|