Hits 1 – 20 of 401

1	ISSumSet: a tweet summarization dataset hidden in a TREC track
	Dusart, Alexis; Pinel-Sauvagnat, Karen; Hubert, Gilles
	In: SAC '21: Proceedings of the 36th Annual ACM Symposium on Applied Computing ; ISBN: 978-1-4503-8104-8 ; 36th ACM/SIGAPP Symposium on Applied Computing (SAC 2021) ; https://hal-univ-tlse3.archives-ouvertes.fr/hal-03244354 ; 36th ACM/SIGAPP Symposium on Applied Computing (SAC 2021), Association for Computing Machinery - Special Interest Group on Applied Computing (SIGAPP), Mar 2021, Republic of Korea (virtual event), South Korea. pp.665-671, ⟨10.1145/3412841.3441946⟩ ; https://dl.acm.org/doi/10.1145/3412841.3441946 (2021)
	BASE

2	Proactive information retrieval
	Sen, Procheta. - : Dublin City University. School of Computing, 2021. : Dublin City University. ADAPT, 2021
	In: Sen, Procheta (2021) Proactive information retrieval. PhD thesis, Dublin City University. (2021)
	BASE

3	A Survey of Recent Abstract Summarization Techniques
	Puspitaningrum, Diyah
	In: Proceedings of Sixth International Congress on Information and Communication Technology: ICICT 2021, London, Volume 4. Series: Lecture Notes in Networks and Systems, Vol. 217. Yang, X.-S., Sherratt, S., Dey, N., Joshi, A. (Eds.). Springer Singapore, 2021 ; https://hal.archives-ouvertes.fr/hal-03216381 ; ICICT 2021, Feb 2021, London, United Kingdom ; https://www.waterstones.com/book/proceedings-of-sixth-international-congress-on-information-and-communication-technology/xin-she-yang/simon-sherratt/9789811621017 (2021)
	Abstract: International audience ; This paper surveys several recent abstract summarization methods: T5, Pegasus, and ProphetNet. We implement the systems in two languages: English and Indonesian. We investigate the impact of pre-training models (one T5, three Pegasus, and three ProphetNet models) on several Wikipedia datasets in English and Indonesian and compare the results to the Wikipedia systems' summaries. T5-Large, Pegasus-XSum, and ProphetNet-CNNDM provide the best summarization. The most significant factors that influence ROUGE performance are coverage, density, and compression. The higher the scores, the better the summary. Other factors that influence the ROUGE scores are the pre-training goal, the dataset's characteristics, the dataset used for testing the pre-trained model, and the cross-lingual function. Several suggestions for addressing this paper's limitations are: 1) ensure that the dataset used for the pre-training model is sufficiently large and contains adequate instances for handling cross-lingual purposes; 2) the advanced process (fine-tuning) should be reasonable. We recommend using a large dataset with comprehensive coverage of topics from many languages before implementing advanced processes, such as the train-infer-train procedure for zero-shot translation, in the training stage of the pre-training model.
	Keyword: [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR]; abstract summarization; ACM: H.: Information Systems/H.3: INFORMATION STORAGE AND RETRIEVAL; ACM: H.: Information Systems/H.3: INFORMATION STORAGE AND RETRIEVAL/H.3.1: Content Analysis and Indexing/H.3.1.0: Abstracting methods; cross-lingual system; Pegasus; ProphetNet; T5; train-infer-train; Transformers
	URL: https://hal.archives-ouvertes.fr/hal-03216381/document https://hal.archives-ouvertes.fr/hal-03216381/file/2105.00824_DiyahPuspitaningrum_arXiv.pdf https://hal.archives-ouvertes.fr/hal-03216381
	BASE

4	The Impact of Specialized Corpora for Word Embeddings in Natural Langage Understanding.
	Neuraz, Antoine; Rance, Bastien; Garcelon, Nicolas...
	In: ISSN: 0926-9630 ; EISSN: 1879-8365 ; Studies in Health Technology and Informatics ; https://hal.inria.fr/hal-03476839 ; Studies in Health Technology and Informatics, IOS Press, 2020, 270, pp.432-436. ⟨10.3233/SHTI200197⟩ (2020)
	BASE

5	Automatic Keyphrase Extraction From Russian-Language Scholarly Papers in Computational Linguistics
	Wienecke, Yves
	In: University Honors Theses (2020)
	BASE

6	Romedi: An Open Data Source About French Drugs on the Semantic Web
	Cossin, Sébastien; Lebrun, Luc; Lobre, Grégory...
	In: ISSN: 0926-9630 ; EISSN: 1879-8365 ; Studies in Health Technology and Informatics ; https://hal.archives-ouvertes.fr/hal-02987843 ; Studies in Health Technology and Informatics, IOS Press, 2019, 264, pp.79-82. ⟨10.3233/SHTI190187⟩ (2019)
	BASE

7	LIFER 2.0: discovering personal lifelog insights using an interactive lifelog retrieval system
	Ninh, Van-Tu; Le, Tu-Khiem; Zhou, Liting...
	In: Ninh, Van-Tu orcid:0000-0003-0641-8806 , Le, Tu-Khiem orcid:0000-0003-3013-9380 , Zhou, Liting orcid:0000-0002-7778-8743 , Piras, Luca, Riegler, Michael, Lux, Mathias, Tran, Minh-Triet orcid:0000-0003-3046-3041 , Gurrin, Cathal orcid:0000-0003-2903-3968 and Dang-Nguyen, Duc-Tien orcid:0000-0002-2761-2213 (2019) LIFER 2.0: discovering personal lifelog insights using an interactive lifelog retrieval system. In: CLEF 2019 Conference and Labs of The Evaluation Forum - Information Access Evaluation meets Multilinguality, Multimodality, and Visualization, 9-12 Sept 2019, Lugano Switzerland. (2019)
	BASE

8	CLEF ehealth 2019 evaluation lab
	Kelly, Liadh; Goeuriot, Lorraine; Suominen, Hanna. - : Springer, 2019
	BASE

9	Evaluation of the Terminology Coverage in the French Corpus LiSSa.
	Cabot, Chloé; Soualmia, Lina F.; Grosjean, Julien...
	In: ISSN: 0926-9630 ; EISSN: 1879-8365 ; Studies in Health Technology and Informatics ; https://hal.archives-ouvertes.fr/hal-01843039 ; Studies in Health Technology and Informatics, IOS Press, 2018, pp.126-130 (2018)
	BASE

10	Spoken content retrieval beyond pipeline integration of automatic speech recognition and information retrieval
	Racca, David. - : Dublin City University. School of Computing, 2018. : Dublin City University. ADAPT, 2018
	In: Racca, David (2018) Spoken content retrieval beyond pipeline integration of automatic speech recognition and information retrieval. PhD thesis, Dublin City University. (2018)
	BASE

11	Promoting user engagement and learning in search tasks by effective document representation
	Arora, Piyush. - : Dublin City University. School of Computing, 2018. : Dublin City University. ADAPT, 2018
	In: Arora, Piyush orcid:0000-0002-4261-2860 (2018) Promoting user engagement and learning in search tasks by effective document representation. PhD thesis, Dublin City University. (2018)
	BASE

12	ANCOR-AS: Enriching the ANCOR Corpus with Syntactic Annotations
	Grobol, Loïc; Tellier, Isabelle; Villemonte de La Clergerie, Éric...
	In: LREC 2018 - 11th edition of the Language Resources and Evaluation Conference ; https://hal.inria.fr/hal-01744572 ; LREC 2018 - 11th edition of the Language Resources and Evaluation Conference, May 2018, Miyazaki, Japan ; http://lrec2018.lrec-conf.org/en/ (2018)
	BASE

13	Extracting Absolute Spatial Entities from SMS : Comparing a Supervised and an Unsupervised Approach
	Lopez, Cédric; Zenasni, Sarah; Kergosien, Eric...
	In: CMC and Language, special issue in the Cahiers du Cental (UCL) ; https://hal.archives-ouvertes.fr/hal-01728717 ; CMC and Language, special issue in the Cahiers du Cental (UCL), 9, Presses universitaires de Louvain, pp.15-22, 2018, 978-2-87558-697-1. ⟨10.18167/DVN1/0ZGJRC⟩ (2018)
	BASE

14	Computational Approaches for Analyzing Social Support in Online Health Communities
	Khan Pour, Hamed. - : University of North Texas, 2018
	BASE

15	Data-Driven Identification of German Phrasal Compounds
	Barbaresi, Adrien; Hein, Katrin
	In: Text, Speech, and Dialogue ; https://hal.archives-ouvertes.fr/hal-01575651 ; Kamil Ekštein; Václav Matoušek. Text, Speech, and Dialogue, 10415, Springer International Publishing, pp.192-200, 2017, Lecture Notes in Computer Science, 978-3-319-64205-5. ⟨10.1007/978-3-319-64206-2_22⟩ ; https://link.springer.com/bookseries/558 (2017)
	BASE

16	Die Korpusplattform des „Digitalen Wörterbuchs der deutschen Sprache“ (DWDS)
	Geyken, Alexander; Barbaresi, Adrien; Didakowski, Jörg...
	In: ISSN: 0301-3294 ; EISSN: 1613-0626 ; Zeitschrift für Germanistische Linguistik ; https://hal.archives-ouvertes.fr/hal-01575661 ; Zeitschrift für Germanistische Linguistik, De Gruyter, 2017, Zeitschrift für Germanistische Linguistik, 45 (2), pp.327-344. ⟨10.1515/zgl-2017-0017⟩ ; https://www.degruyter.com/view/j/zfgl.2017.45.issue-2/zgl-2017-0017/zgl-2017-0017.xml (2017)
	BASE

17	Discriminating between Similar Languages using Weighted Subword Features
	Barbaresi, Adrien
	In: Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2017) ; https://hal.archives-ouvertes.fr/hal-01575656 ; Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2017), Association for Computational Linguistics (ACL), Apr 2017, Valence, Spain. pp.184-189, ⟨10.18653/v1/W17-1223⟩ ; http://ttg.uni-saarland.de/vardial2017/ (2017)
	BASE

18	Interoperable annotation of (co)references in the Democrat project
	Grobol, Loïc; Landragin, Frédéric; Heiden, Serge
	In: ISA-13 2017 - Proceedings of the 13th Joint ISO - ACL Workshop on Interoperable Semantic Annotation ; Thirteenth Joint ISO-ACL Workshop on Interoperable Semantic Annotation ; https://hal.archives-ouvertes.fr/hal-01583527 ; Thirteenth Joint ISO-ACL Workshop on Interoperable Semantic Annotation, ACL Special Interest Group on Computational Semantics (SIGSEM); ISO TC 37/SC 4 (Language Resources) WG 2, Sep 2017, Montpellier, France ; https://sigsem.uvt.nl/isa13/ (2017)
	BASE

19	Entity Recognition and Language Identification with FELTS
	Jourlin, Pierre
	In: Working Notes of CLEF 2017 - Conference and Labs of the Evaluation Forum ; CLEF 2017 ; https://hal-univ-avignon.archives-ouvertes.fr/hal-02133508 ; CLEF 2017, Sep 2017, Dublin, Ireland (2017)
	BASE

20	Towards a toolbox to map historical text collections
	Barbaresi, Adrien
	In: 11th Workshop on Geographic Information Retrieval (GIR'17) ; https://hal.archives-ouvertes.fr/hal-01654526 ; 11th Workshop on Geographic Information Retrieval (GIR'17), Nov 2017, Heidelberg, Germany. ⟨10.1145/3155902.3155905⟩ (2017)
	BASE
