1 |
Can Character-based Language Models Improve Downstream Task Performance in Low-Resource and Noisy Language Scenarios?
In: Seventh Workshop on Noisy User-generated Text (W-NUT 2021, co-located with EMNLP 2021), Jan 2022, Punta Cana, Dominican Republic ; https://hal.inria.fr/hal-03527328 ; https://aclanthology.org/2021.wnut-1.47/ (2022)
2 |
Cross-lingual few-shot hate speech and offensive language detection using meta learning
In: IEEE Access, IEEE, 2022, 10, pp. 14880-14896 ; ISSN: 2169-3536 ; https://hal.archives-ouvertes.fr/hal-03559484 ; ⟨10.1109/ACCESS.2022.3147588⟩ (2022)
3 |
Ensemble of Opinion Dynamics Models to Understand the Role of the Undecided in the Vaccination Debate ...
4 |
The Online Behaviour of the Algerian Abusers in Social Media Networks ...
5 |
Discussion Networks and Resilience of College Students: Explicating Tie Strength in Communicative Interaction
In: International Journal of Communication; Vol 16 (2022); 25 ; 1932-8036 (2022)
6 |
“Thou Shalt Not Take the Lord’s Name in Vain”: A Methodological Proposal to Identify Religious Hate Content on Digital Social Networks
In: International Journal of Communication; Vol 16 (2022); 22 ; 1932-8036 (2022)
7 |
Conceptual structure and the growth of scientific knowledge ...
8 |
INNOVATIVE APPROACHES AND METHODS IN TEACHING FOREIGN LANGUAGES ...
9 |
INNOVATIVE APPROACHES AND METHODS IN TEACHING FOREIGN LANGUAGES ...
10 |
Multilingual Abusiveness Identification on Code-Mixed Social Media Text ...
11 |
MuMiN: A Large-Scale Multilingual Multimodal Fact-Checked Misinformation Social Network Dataset ...
13 |
Discovering Affinity Relationships between Personality Types ...
14 |
Networks and Identity Drive Geographic Properties of the Diffusion of Linguistic Innovation ...
16 |
Cyberbullying Classifiers are Sensitive to Model-Agnostic Perturbations ...
17 |
Feature-rich multiplex lexical networks reveal mental strategies of early language learning ...
18 |
It Takes a Village: Using Network Science to Identify the Effect of Individual Differences in Bilingual Experience for Theory of Mind
In: Brain Sciences; Volume 12; Issue 4; Pages: 487 (2022)
19 |
Analysis of the Full-Size Russian Corpus of Internet Drug Reviews with Complex NER Labeling Using Deep Learning Neural Networks and Language Models
In: Applied Sciences; Volume 12; Issue 1; Pages: 491 (2022)
Abstract:
The paper presents the full-size Russian corpus of Internet users’ reviews on medicines with complex named entity recognition (NER) labeling of pharmaceutically relevant entities. We evaluate the accuracy levels reached on this corpus by a set of advanced deep learning neural networks for extracting mentions of these entities. The corpus markup includes mentions of the following entities: medication (33,005 mentions), adverse drug reaction (1778), disease (17,403), and note (4490). Two of them, medication and disease, include a set of attributes. A part of the corpus has a coreference annotation with 1560 coreference chains in 300 documents. A multi-label model based on a language model and a set of features has been developed for recognizing entities of the presented corpus. We analyze how the choice of different model components affects the entity recognition accuracy. Those components include methods for vector representation of words, types of language models pre-trained for the Russian language, ways of text normalization, and other pre-processing methods. The sufficient size of our corpus allows us to study the effects of particularities of annotation and entity balancing. We compare our corpus to existing ones by the occurrences of entities of different types and show that balancing the corpus by the number of texts with and without adverse drug reaction (ADR) mentions improves the ADR recognition accuracy with no notable decline in the accuracy of detecting entities of other types. As a result, the state of the art for the pharmacological entity extraction task for the Russian language is established on a full-size labeled corpus. For the ADR entity type, the accuracy achieved is 61.1% by the F1-exact metric, which is on par with the accuracy level for other language corpora with similar characteristics and ADR representativeness. The accuracy of the coreference relation extraction evaluated on our corpus is 71%, which is higher than the results achieved on the other Russian-language corpora.
Keywords:
adverse drug events; annotated corpus; coreference relation extraction; deep learning; information extraction; language models; machine learning; MESHRUS; named entity recognition; neural networks; pharmacovigilance; social media; UMLS
URL: https://doi.org/10.3390/app12010491
20 |
Transformer-Based Abstractive Summarization for Reddit and Twitter: Single Posts vs. Comment Pools in Three Languages
In: Future Internet; Volume 14; Issue 3; Pages: 69 (2022)