1 |
ViQuAE, a Dataset for Knowledge-based Visual Question Answering about Named Entities
|
|
|
|
In: ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’22) ; https://hal-universite-paris-saclay.archives-ouvertes.fr/hal-03650618 ; 2022 (2022)
|
|
BASE
|
|
2 |
Obvie: interface web pour la fouille et la comparaison de textes [Obvie: a web interface for text mining and comparison]
|
|
|
|
In: Atelier DigitAl Humanities and cuLtural herItAge: data and knowledge management and analysis durant la conférence francophone sur l'Extraction et la Gestion des Connaissances (egc2022) ; https://hal.archives-ouvertes.fr/hal-03543362 ; Atelier DigitAl Humanities and cuLtural herItAge: data and knowledge management and analysis durant la conférence francophone sur l'Extraction et la Gestion des Connaissances (egc2022), Jan 2022, Blois, France ; https://egc2022.univ-tours.fr/ateliers/ (2022)
|
|
BASE
|
|
3 |
Preprint Citation Praxis in PLOS
|
|
|
|
In: ISSN: 0138-9130 ; EISSN: 1588-2861 ; Scientometrics ; https://hal.archives-ouvertes.fr/hal-03506094 ; In press (2022)
|
|
BASE
|
|
4 |
Assessing the impact of OCR noise on multilingual event detection over digitised documents
|
|
|
|
In: ISSN: 1432-5012 ; EISSN: 1432-1300 ; International Journal on Digital Libraries ; https://hal.archives-ouvertes.fr/hal-03635985 ; International Journal on Digital Libraries, Springer Verlag, 2022, ⟨10.1007/s00799-022-00325-2⟩ (2022)
|
|
BASE
|
|
5 |
Introducing the HIPE 2022 Shared Task: Named Entity Recognition and Linking in Multilingual Historical Documents
|
|
|
|
In: Advances in Information Retrieval. 44th European Conference on IR Research, ECIR 2022, Stavanger, Norway, April 10–14, 2022, Proceedings, Part II ; https://hal.archives-ouvertes.fr/hal-03635971 ; Matthias Hagen; Suzan Verberne; Craig Macdonald; Christin Seifert; Krisztian Balog; Kjetil Nørvåg; Vinay Setty. Advances in Information Retrieval. 44th European Conference on IR Research, ECIR 2022, Stavanger, Norway, April 10–14, 2022, Proceedings, Part II, 13186, Springer International Publishing, pp.347-354, 2022, Lecture Notes in Computer Science, 978-3-030-99738-0. ⟨10.1007/978-3-030-99739-7_44⟩ (2022)
|
|
BASE
|
|
6 |
Can Character-based Language Models Improve Downstream Task Performance in Low-Resource and Noisy Language Scenarios?
|
|
|
|
In: Seventh Workshop on Noisy User-generated Text (W-NUT 2021, colocated with EMNLP 2021) ; https://hal.inria.fr/hal-03527328 ; Seventh Workshop on Noisy User-generated Text (W-NUT 2021, colocated with EMNLP 2021), Jan 2022, Punta Cana, Dominican Republic ; https://aclanthology.org/2021.wnut-1.47/ (2022)
|
|
BASE
|
|
7 |
Between History and Natural Language Processing: Study, Enrichment and Online Publication of French Parliamentary Debates of the Early Third Republic (1881-1899)
|
|
|
|
In: ParlaCLARIN III at LREC2022 - Workshop on Creating, Enriching and Using Parliamentary Corpora ; https://hal.archives-ouvertes.fr/hal-03623351 ; ParlaCLARIN III at LREC2022 - Workshop on Creating, Enriching and Using Parliamentary Corpora, Jun 2022, Marseille, France ; https://www.clarin.eu/ParlaCLARIN-III (2022)
|
|
Abstract:
We present the AGODA (Analyse sémantique et Graphes relationnels pour l'Ouverture des Débats à l'Assemblée nationale) project, which aims to create a platform for consulting and exploring digitised French parliamentary debates (1881-1940) available in the digital library of the National Library of France. This project brings together historians and NLP specialists: parliamentary debates are indeed an essential source for French history of the contemporary period, but also for linguistics. This project therefore aims to produce a corpus of texts that can be easily exploited with computational methods, and that respect the TEI standard. Ancient parliamentary debates are also an excellent case study for the development and application of tools for publishing and exploring large historical corpora. In this paper, we present the steps necessary to produce such a corpus. We detail the processing and publication chain of these documents, in particular by mentioning the problems linked to the extraction of texts from digitised images. We also introduce the first analyses that we have carried out on this corpus with "bag-of-words" techniques not too sensitive to OCR quality (namely topic modelling and word embedding).
|
|
Keyword:
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [INFO.INFO-CY]Computer Science [cs]/Computers and Society [cs.CY]; [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR]; [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing; [SHS.HIST]Humanities and Social Sciences/History; France; OCR; Parliamentary debates; Third Republic; Topic modelling; Word embedding; XML-TEI
|
|
URL: https://hal.archives-ouvertes.fr/hal-03623351/document https://hal.archives-ouvertes.fr/hal-03623351 https://hal.archives-ouvertes.fr/hal-03623351/file/puren_bourgeois_pellet_vernus_agoda2022.pdf
|
|
BASE
|
|
8 |
ISSumSet: a tweet summarization dataset hidden in a TREC track
|
|
|
|
In: SAC '21: Proceedings of the 36th Annual ACM Symposium on Applied Computing ; ISBN: 978-1-4503-8104-8 ; 36th ACM/SIGAPP Symposium on Applied Computing (SAC 2021) ; https://hal-univ-tlse3.archives-ouvertes.fr/hal-03244354 ; 36th ACM/SIGAPP Symposium on Applied Computing (SAC 2021), Association for Computing Machinery - Special Interest Group on Applied Computing (SIGAPP), Mar 2021, Republic of Korea (virtual event), South Korea. pp.665-671, ⟨10.1145/3412841.3441946⟩ ; https://dl.acm.org/doi/10.1145/3412841.3441946 (2021)
|
|
BASE
|
|
9 |
High-resolution speaker counting in reverberant rooms using CRNN with Ambisonics features
|
|
|
|
In: EUSIPCO 2020 - 28th European Signal Processing Conference (EUSIPCO) ; https://hal.archives-ouvertes.fr/hal-03537323 ; EUSIPCO 2020 - 28th European Signal Processing Conference (EUSIPCO), Jan 2021, Amsterdam, Netherlands. pp.71-75, ⟨10.23919/Eusipco47968.2020.9287637⟩ (2021)
|
|
BASE
|
|
10 |
État de l'art du changement sémantique à partir de plongements contextualisés [State of the art in semantic change using contextualized embeddings]
|
|
|
|
In: COnférence en Recherche d'Informations et Applications - CORIA 2021, French Information Retrieval Conference ; https://hal.archives-ouvertes.fr/hal-03320337 ; COnférence en Recherche d'Informations et Applications - CORIA 2021, French Information Retrieval Conference, Apr 2021, Grenoble (virtuel), France (2021)
|
|
BASE
|
|
11 |
Sentiment Analysis of Arabic Documents
|
|
|
|
In: Natural Language Processing for Global and Local Business ; https://hal.archives-ouvertes.fr/hal-03124729 ; Fatih Pinarbasi; M. Nurdan Taskiran. Natural Language Processing for Global and Local Business, pp.307-331, 2021, 9781799842408. ⟨10.4018/978-1-7998-4240-8.ch013⟩ ; https://www.igi-global.com/ (2021)
|
|
BASE
|
|
12 |
i-Dataquest: A heterogeneous information retrieval tool using data graph for the manufacturing industry
|
|
|
|
In: ISSN: 0166-3615 ; Computers in Industry ; https://hal.archives-ouvertes.fr/hal-03330584 ; Computers in Industry, Elsevier, 2021, 132, pp.103527. ⟨10.1016/j.compind.2021.103527⟩ (2021)
|
|
BASE
|
|
13 |
Indirectly Named Entity Recognition ; Reconnaissance d'entités indirectement nommées
|
|
|
|
In: ISSN: 2530-9455 ; Journal of Computer-Assisted Linguistic Research (JCLR) ; https://hal.archives-ouvertes.fr/hal-03476411 ; Journal of Computer-Assisted Linguistic Research (JCLR), Universitat Politècnica de València, 2021, 5 (1), pp.27-46. ⟨10.4995/JCLR.2021.15922⟩ ; https://polipapers.upv.es/index.php/jclr/index (2021)
|
|
BASE
|
|
14 |
Atténuer les erreurs de numérisation dans la reconnaissance d'entités nommées pour les documents historiques [Mitigating digitization errors in named entity recognition for historical documents]
|
|
|
|
In: Conférence en Recherche d'Informations et Applications (CORIA 2021) ; https://hal.archives-ouvertes.fr/hal-03320332 ; Conférence en Recherche d'Informations et Applications (CORIA 2021), ARIA : Association Francophone de Recherche d’Information (RI) et Applications, Apr 2021, Grenoble (virtuel), France. pp.1 - 7 ; http://coria.asso-aria.org/2021/articles/mini_24/main.pdf (2021)
|
|
BASE
|
|
15 |
Knowledge engineering in the sourcing domain for the recommendation of providers ; Ingénierie des connaissances dans le domaine du sourcing pour la recommandation de prestataires
|
|
|
|
In: https://tel.archives-ouvertes.fr/tel-03336353 ; Information Retrieval [cs.IR]. Université Côte d'Azur, 2021. English. ⟨NNT : 2021COAZ4024⟩ (2021)
|
|
BASE
|
|
16 |
Place names in Spanish Republican Life Stories: spatial patterns in locations and perceptions
|
|
|
|
In: Proceedings of the ICA ; International Cartographic Conference ; https://hal.archives-ouvertes.fr/hal-03485595 ; Proceedings of the ICA, 2021, 4, pp.1-9. ⟨10.5194/ica-proc-4-49-2021⟩ ; https://www.icc2021.net/ (2021)
|
|
BASE
|
|
17 |
Experimental IR Meets Multilinguality, Multimodality, and Interaction
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-03626028 ; Springer International Publishing, 12880, 2021, Lecture Notes in Computer Science, ⟨10.1007/978-3-030-85251-1⟩ (2021)
|
|
BASE
|
|
18 |
Towards the Evaluation of Information Retrieval Systems on Evolving Datasets with Pivot Systems
|
|
|
|
In: Experimental IR Meets Multilinguality, Multimodality, and Interaction ; https://hal.archives-ouvertes.fr/hal-03369898 ; Experimental IR Meets Multilinguality, Multimodality, and Interaction, 12880, Springer International Publishing, pp.91-102, 2021, Lecture Notes in Computer Science, ⟨10.1007/978-3-030-85251-1_8⟩ (2021)
|
|
BASE
|
|
19 |
Multilingual Epidemic Event Extraction
|
|
|
|
In: Towards Open and Trustworthy Digital Societies. 23rd International Conference on Asia-Pacific Digital Libraries, ICADL 2021, Virtual Event, December 1–3, 2021, Proceedings ; https://hal.archives-ouvertes.fr/hal-03480551 ; Hao-Ren Ke; Chei Sian Lee; Kazunari Sugiyama. Towards Open and Trustworthy Digital Societies. 23rd International Conference on Asia-Pacific Digital Libraries, ICADL 2021, Virtual Event, December 1–3, 2021, Proceedings, 13133, Springer, pp.139-156, 2021, Lecture Notes in Computer Science, 978-3-030-91668-8. ⟨10.1007/978-3-030-91669-5_12⟩ (2021)
|
|
BASE
|
|
20 |
Ein Überblick über die neuesten abstrakten Zusammenfassungstechniken ; A Survey of Recent Abstract Summarization Techniques ; Un aperçu des techniques récentes de résumé abstrait
|
|
|
|
In: Proceedings of Sixth International Congress on Information and Communication Technology ICICT 2021, London, Volume 4, Series: Lecture Notes in Networks and Systems, Vol. 217. Yang, X.-S., Sherratt, S., Dey, N., Joshi, A. (Eds.) 2021 ; Proceedings of Sixth International Congress on Information and Communication Technology ICICT 2021, London, Volume 4, Series: Lecture Notes in Networks and Systems, Vol. 217. Springer Singapore, 2021 ; https://hal.archives-ouvertes.fr/hal-03216381 ; Proceedings of Sixth International Congress on Information and Communication Technology ICICT 2021, London, Volume 4, Series: Lecture Notes in Networks and Systems, Vol. 217. Springer Singapore, 2021, ICICT 2021, Feb 2021, London, United Kingdom ; https://www.waterstones.com/book/proceedings-of-sixth-international-congress-on-information-and-communication-technology/xin-she-yang/simon-sherratt/9789811621017 (2021)
|
|
BASE
|
|
|
|