1 |
English machine reading comprehension: new approaches to answering multiple-choice questions
|
|
Dzendzik, Daria. Dublin City University, School of Computing, 2021; Dublin City University, ADAPT, 2021
|
|
In: Dzendzik, Daria (2021) English machine reading comprehension: new approaches to answering multiple-choice questions. PhD thesis, Dublin City University. (2021)
|
|
Abstract:
Reading comprehension is often tested by measuring a person's or system’s ability to answer questions about a given text. Machine reading comprehension datasets have proliferated in recent years, particularly for the English language. The aim of this thesis is to investigate and improve data-driven approaches to automatic reading comprehension. Firstly, I provide a full classification of question and answer types for the reading comprehension task. I also present a systematic overview of more than 50 English reading comprehension datasets. I observe that the majority of questions were created using crowdsourcing and that the most popular data source is Wikipedia. There is also a lack of why, when, and where questions. Additionally, I address the question “What makes a dataset difficult?” and highlight the difference between datasets created for people and datasets created for machine reading comprehension. Secondly, focusing on multiple-choice question answering, I propose a computationally light method for answer selection based on string similarities and logistic regression. At the time (December 2017), the proposed approach showed the best performance on two datasets (MovieQA and MCQA: IJCNLP 2017 Shared Task 5 Multi-choice Question Answering in Examinations), outperforming some CNN-based methods. Thirdly, I investigate methods for Boolean Reading Comprehension tasks, including the use of Knowledge Graph (KG) information for answering questions. I provide an error analysis of a transformer model’s performance on the BoolQ dataset. This analysis reveals several important issues, such as unstable model behaviour and problems with the dataset itself. Experiments with incorporating knowledge graph information into a baseline transformer model do not show a clear improvement, owing to a combination of the model’s ability to capture new information, inaccuracies in the knowledge graph, and imprecision in entity linking.
Finally, I develop a Boolean Reading Comprehension dataset based on spontaneous user-generated questions and reviews, which is extremely close to a real-life question-answering scenario. I provide a classification of question difficulty and establish a transformer-based baseline for the newly proposed dataset.
|
|
Keywords:
Artificial intelligence; Computational linguistics; Information retrieval; Machine learning; machine reading comprehension; question answering; transformer language models
|
|
URL: http://doras.dcu.ie/26534/
|
|
BASE
|
|
2 |
Quantum semantics of text perception
|
|
|
|
In: ISSN: 2045-2322 ; EISSN: 2045-2322 ; Scientific Reports ; https://hal-centralesupelec.archives-ouvertes.fr/hal-03147747 ; Scientific Reports, Nature Publishing Group, 2021, 11 (1), ⟨10.1038/s41598-021-83490-9⟩ (2021)
|
|
BASE
|
|
3 |
Analyzing Non-Textual Content Elements to Detect Academic Plagiarism
|
|
|
|
BASE
|
|
4 |
Learning to scale multilingual representations for vision-language tasks
|
|
|
|
BASE
|
|
5 |
Neural Methods Towards Concept Discovery from Text via Knowledge Transfer
|
|
|
|
In: http://rave.ohiolink.edu/etdc/view?acc_num=osu1572387318988274 (2019)
|
|
BASE
|
|
6 |
Multilingual Information Access (MLIA) Tools on Google and WorldCat: Bi/Multilingual University Students’ Experience and Perceptions
|
|
|
|
In: FIMS Publications (2019)
|
|
BASE
|
|
7 |
ImproteK: introducing scenarios into human-computer music improvisation
|
|
|
|
In: ACM Computers in Entertainment ; https://hal.archives-ouvertes.fr/hal-01380163 ; ACM Computers in Entertainment, 2017, ⟨10.1145/3022635⟩ (2017)
|
|
BASE
|
|
8 |
Knowledge-Based Probabilistic Modeling For Tracking Lyrics In Music Audio Signals ...
|
|
|
|
BASE
|
|
9 |
Knowledge-Based Probabilistic Modeling For Tracking Lyrics In Music Audio Signals ...
|
|
|
|
BASE
|
|
10 |
Knowledge-based probabilistic modeling for tracking lyrics in music audio signals
|
|
|
|
In: TDX (Tesis Doctorals en Xarxa) (2017)
|
|
BASE
|
|
11 |
Distributional Thesauri for Information Retrieval and vice versa
|
|
|
|
In: Proceedings of Language and Resource Conference, LREC ; Language and Resource Conference, LREC ; https://hal.archives-ouvertes.fr/hal-01394770 ; Language and Resource Conference, LREC, May 2016, Portoroz, Slovenia (2016)
|
|
BASE
|
|
12 |
Evaluating topic model interpretability from a primary care physician perspective.
|
|
|
|
BASE
|
|
13 |
Supervised Topic Models for Diagnosis Code Assignment to Discharge Summaries
|
|
|
|
In: 17th International Conference on Intelligent Text Processing and Computational Linguistics ; https://hal.archives-ouvertes.fr/hal-02052345 ; 17th International Conference on Intelligent Text Processing and Computational Linguistics, Apr 2016, Konya, Turkey (2016)
|
|
BASE
|
|
14 |
Proceedings of the 7th Workshop on Computational Models of Narrative
|
|
|
|
In: 7th Workshop on Computational Models of Narrative (CMN 2016) ; https://hal.inria.fr/hal-01427217 ; 7th Workshop on Computational Models of Narrative (CMN 2016), Jul 2016, Cracovie, Poland. 53, 2016, OASICS, 978-3-95977-020-0 ; http://drops.dagstuhl.de/portals/oasics/index.php?semnr=16021 (2016)
|
|
BASE
|
|
15 |
Special Session on Emotion and Sentiment in Intelligent Systems and Big Social Data Analysis (SentISData 2016)
|
|
|
|
In: 3rd IEEE International Conference on Data Science and Advanced Analytics (DSAA 2016) ; https://hal.archives-ouvertes.fr/hal-03176429 ; Benamara, Farah; Bosco, Cristina; Fersini, Elisabetta; Patti, Viviana; Viviancos, Emilio. 3rd IEEE International Conference on Data Science and Advanced Analytics (DSAA 2016), Oct 2016, Montréal, Canada. 2016 ; https://sites.ualberta.ca/~dsaa16/specialsessions.html (2016)
|
|
BASE
|
|
17 |
Efficient Inference, Search and Evaluation for Latent Variable Models of Text with Applications to Information Retrieval and Machine Translation
|
|
|
|
In: Doctoral Dissertations (2016)
|
|
BASE
|
|
18 |
Thésaurus distributionnels pour la recherche d'information et vice-versa
|
|
|
|
In: Conférence en Recherche d’Information et Applications ; https://hal.archives-ouvertes.fr/hal-01226532 ; Conférence en Recherche d’Information et Applications, Mar 2015, Paris, France (2015)
|
|
BASE
|
|
19 |
Thésaurus distributionnels pour la recherche d'information et vice-versa
|
|
|
|
In: ISSN: 1279-5127 ; EISSN: 1963-1014 ; Document Numérique ; https://hal.archives-ouvertes.fr/hal-01226551 ; Document Numérique, Lavoisier, 2015, 18 (2-3), ⟨10.3166/DN.18.2-3.101-121⟩ (2015)
|
|
BASE
|
|
20 |
Single-word predictions of upcoming language during comprehension: Evidence from the cumulative semantic interference task.
|
|
|
|
In: Cognitive psychology, vol 79 (2015)
|
|
BASE
|
|