81 |
EnvEdit: Environment Editing for Vision-and-Language Navigation ...
82 |
Homepage2Vec: Language-Agnostic Website Embedding and Classification ...
Abstract:
Currently, publicly available models for website classification do not offer an embedding method and have limited support for languages beyond English. We release a dataset of more than two million category-labeled websites in 92 languages collected from Curlie, the largest multilingual human-edited Web directory. The dataset contains 14 website categories aligned across languages. Alongside it, we introduce Homepage2Vec, a machine-learned pre-trained model for classifying and embedding websites based on their homepage in a language-agnostic way. Homepage2Vec, thanks to its feature set (textual content, metadata tags, and visual attributes) and recent progress in natural language representation, is language-independent by design and generates embedding-based representations. We show that Homepage2Vec correctly classifies websites with a macro-averaged F1-score of 0.90, with stable performance across low- as well as high-resource languages. Feature analysis shows that a small subset of efficiently computable ...
Published in Proc. of ICWSM 2022 ...
Keywords:
Artificial Intelligence cs.AI; Computation and Language cs.CL; FOS Computer and information sciences
URL: https://dx.doi.org/10.48550/arxiv.2201.03677 https://arxiv.org/abs/2201.03677
83 |
Multilinguals at SemEval-2022 Task 11: Transformer Based Architecture for Complex NER ...
84 |
A new approach to calculating BERTScore for automatic assessment of translation quality ...
85 |
A New Generation of Perspective API: Efficient Multilingual Character-level Transformers ...
86 |
ViWOZ: A Multi-Domain Task-Oriented Dialogue Systems Dataset For Low-resource Language ...
87 |
EAG: Extract and Generate Multi-way Aligned Corpus for Complete Multi-lingual Neural Machine Translation ...
89 |
Learning Bidirectional Translation between Descriptions and Actions with Small Paired Data ...
90 |
A Feasibility Study of Answer-Agnostic Question Generation for Education ...
91 |
Language Generation for Broad-Coverage, Explainable Cognitive Systems ...
95 |
Regional Negative Bias in Word Embeddings Predicts Racial Animus--but only via Name Frequency ...
98 |
Grounding Hindsight Instructions in Multi-Goal Reinforcement Learning for Robotics ...
99 |
Formal Language Recognition by Hard Attention Transformers: Perspectives from Circuit Complexity ...
100 |
Curriculum: A Broad-Coverage Benchmark for Linguistic Phenomena in Natural Language Understanding ...