1. A Distant Learning Approach for Extracting Hypernym Relations from Wikipedia Disambiguation Pages
In: Procedia Computer Science, Vol. 112, 2017; 21st International Conference on Knowledge-Based and Intelligent Information & Engineering Systems (KES 2017), Sep 2017, Marseille, France, pp. 1764-1773. https://hal.archives-ouvertes.fr/hal-01919073

2. ИСКАЖЕНИЕ ЗНАЧЕНИЙ И СМЫСЛОВ ПОЛИТИКО-ИСТОРИЧЕСКИХ СОБЫТИЙ В РАЗНОЯЗЫЧНЫХ ВЕРСИЯХ СТАТЕЙ ВИКИПЕДИИ [Distortion of the Meanings and Senses of Political and Historical Events in Different-Language Versions of Wikipedia Articles]

3. Variations de la confiance et de la réputation de Wikipédia chez les jeunes (11-25 ans) [Variations in Trust in and Reputation of Wikipedia among Young People (Aged 11-25)]
In: https://hal-univ-tlse3.archives-ouvertes.fr/hal-03636787 ; 2017

4. ASRextractor: A Tool Extracting Semantic Relations between Arabic Named Entities
In: 3rd International Conference on Arabic Computational Linguistics (ACLing 2017), Nov 2017, Dubai, United Arab Emirates. https://hal-univ-tours.archives-ouvertes.fr/hal-01632858 ; http://acling2017.org/

5. Developing an Annotator for Latin Texts Using Wikipedia
In: Journal of Data Mining and Digital Humanities (eISSN 2416-5999), Episciences.org, in press, 2017. https://hal.archives-ouvertes.fr/hal-01279853

6. Wikiconflits : un corpus de discussions éditoriales conflictuelles du Wikipédia francophone [Wikiconflits: A Corpus of Conflict-Laden Editorial Discussions from French-Language Wikipedia]
In: Ciara R. Wigham & Gudrun Ledegen (eds.), Corpus de communication médiée par les réseaux : construction, structuration, analyse, L'Harmattan, 2017, ISBN 978-2-343-11212-1. https://hal.archives-ouvertes.fr/hal-01485427

7. Extraction of Semantic Relation between Arabic Named Entities Using Different Kinds of Transducer Cascades
In: 18th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing 2017), Apr 2017, Budapest, Hungary. https://hal.archives-ouvertes.fr/hal-01491290 ; http://www.cicling.org/2017/

11. Toktrack: A Complete Token Provenance And Change Tracking Dataset For The English Wikipedia ...

Abstract:
Please cite 10.5281/zenodo.789289 for all versions of this dataset; this DOI will always resolve to the latest version.

This dataset contains every instance of all tokens (≈ words) ever written in undeleted, non-redirect English Wikipedia articles up to October 2016, 13,545,349,787 instances in total. Each token is annotated with (i) the article revision in which it was originally created and (ii) lists of all revisions in which the token was deleted and (potentially) re-added and re-deleted in its article, enabling complete and straightforward tracking of its history. This data would be exceedingly hard for an average user to create, as (i) it is very expensive to compute and (ii) accurately tracking the history of each token in revisioned documents is a non-trivial task. By adapting a state-of-the-art algorithm, we have produced a dataset that allows a range of analyses and metrics, already popular in research and beyond, to be generated at complete-Wikipedia scale. ...

Attention: in the current version we spotted an inconsistency in the 'str' column (token values) of the token CSV files. Some tokens that contain regular quotes ('"') were written to the CSV files without treating '"' as a quoting character; this will be fixed in an upcoming version. For example, the token '"press' should be written to CSV as '"""press"', but it is sometimes written as '"press'. To work around this while parsing the CSV files: 1. iterate over the file line by line; 2. split each line on ','; 3. if the str value (the 4th item after the split) starts and ends with '"', remove those quotes and replace '""' with '"'. Example Python function: https://gist.github.com/faflo/19d3cf1768fbd7939f76ce3e9ee3b087 ...

Keywords:
Authorship; Collaborative Writing; Computational Linguistics; Conflict; Content Persistence; Content Survival; Controversy; Dataset; Provenance; Reverts; Wikipedia

URL: https://zenodo.org/record/789289 ; https://dx.doi.org/10.5281/zenodo.789289
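The three-step quoting workaround described in the abstract can be sketched in Python. This is a minimal sketch, not the dataset authors' gist: the function name `clean_token_str` is hypothetical, and it assumes, as the abstract states, that the 'str' value is the 4th comma-separated field and that token values themselves contain no commas.

```python
def clean_token_str(line: str) -> str:
    """Recover the raw token value ('str' column) from one line of a
    Toktrack token CSV, tolerating the inconsistent quoting described
    in the dataset's abstract.

    Assumption (from the abstract): the 'str' value is the 4th field
    after splitting on ',' and tokens contain no commas themselves.
    """
    fields = line.rstrip("\n").split(",")
    value = fields[3]
    # Properly quoted field: wrapped in '"' with inner quotes doubled.
    # Unwrap it and undo the CSV-style doubling ("" -> ").
    if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
        value = value[1:-1].replace('""', '"')
    # Improperly written fields (e.g. a bare '"press') fall through
    # unchanged, which is exactly the raw token value.
    return value
```

For example, a correctly quoted line `1,2,3,"""press"` yields the token `"press`, while the inconsistently written `1,2,3,"press` yields the same token because the field does not both start and end with a quote.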
14. Comparison of Wikipedia Articles in Different Languages ...

15. Wikipedia. Macht. Archäologie ... [Wikipedia. Power. Archaeology ...]

16. Wikipedia Access and Contribution: Language Choice in Multilingual Communities. A Case Study

18. The Third Man: Hierarchy Formation in Wikipedia
In: Applied Network Science, 2 (2017), article 24, eISSN 2364-8228

19. Mexican World Heritage Information on the Web: Institutional Presence and Visibility

20. Wikipedia as a Source of Monolingual and Multilingual Information about the Spanish Heritage [Wikipédia como fonte de informação monolíngue e multilíngue sobre o patrimônio histórico da Espanha]