
Search in the Catalogues and Directories

Hits 1 – 20 of 810

1
Abstracts from the KAS corpus KAS-Abs 2.0
Žagar, Aleš; Kavaš, Matic; Robnik-Šikonja, Marko. - : Faculty of Electrical Engineering and Computer Science, University of Maribor, 2022. : Faculty of Computer and Information Science, University of Ljubljana, 2022
BASE
2
Corpus of academic Slovene KAS 2.0
Žagar, Aleš; Kavaš, Matic; Robnik-Šikonja, Marko; Erjavec, Tomaž; Fišer, Darja; Ljubešić, Nikola; Ferme, Marko; Borovič, Mladen; Boškovič, Borko; Ojsteršek, Milan; Hrovat, Goran. - : Faculty of Electrical Engineering and Computer Science, University of Maribor, 2022. : Faculty of Computer and Information Science, University of Ljubljana, 2022
Abstract: The KAS corpus of Slovene academic writing consists of almost 65,000 BSc/BA, 16,000 MSc/MA and 1,600 PhD theses (82 thousand texts, 5 million pages or 1.5 billion tokens) written between 2000 and 2018 and gathered from the digital libraries of Slovene higher education institutions via the Slovene Open Science portal (http://openscience.si/). The theses have significant metadata associated with them, and each thesis in the corpus contains only its textual body, i.e. without its front and back matter. The body is divided into chapters, then into pages, these into paragraphs, and then into sentences. The sentence tokens are tagged with morphosyntactic descriptions (detailed part-of-speech tags) and the words are lemmatised. As opposed to the previous version 1.0, the KAS corpus of Slovene academic writing 2.0 is cleaner and contains segmentation into chapters. The metadata also contains more information about the research fields of each work. Both versions consist of the same number of BSc/BA, MSc/MA, and PhD theses; however, the processing was done from scratch for 2.0, so the number of e.g. pages and tokens is different. Note also that the new version does not contain links to the PNG pictures of individual pages, nor does it contain annotated terms, both present in version 1.0. It is, unlike 1.0, also not mounted on the CLARIN.SI concordancers. The new version is distributed in the canonical TEI encoding, JSON, and as plain text files. In the TEI format, chapter names are denoted with the tag. Each entry in the JSON files has a string ID and a list containing the names of chapters as its first element and texts as its second element. Chapters without text are represented as an empty string. The plain text files contain only text bodies without segmentation information. References: Žagar, A., Kavaš, M., & Robnik Šikonja, M. (2021). Corpus KAS 2.0: cleaner and with new datasets. In Information Society - IS 2021: Proceedings of the 24th International Multiconference.
https://doi.org/10.5281/zenodo.5562228
Keyword: academic writing; BSc/BA theses; MSc/MA theses; PhD theses; TEI
URL: http://hdl.handle.net/11356/1448
BASE
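The JSON layout described in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes each file is a JSON object mapping a string ID to a two-element list (chapter names first, chapter texts second), which is one plausible reading of the abstract and not a confirmed schema. The ID "kas-0001", the chapter names, and the helper names iter_chapters and non_empty_chapters are hypothetical.

```python
import json
from typing import Dict, Iterator, List, Tuple

# Hypothetical sample mimicking the layout described in the abstract.
# The ID, chapter names, and texts are invented for illustration only.
SAMPLE = """
{
  "kas-0001": [
    ["Uvod", "Metode", "Priloge"],
    ["Besedilo uvoda.", "Opis metod.", ""]
  ]
}
"""

Corpus = Dict[str, List[List[str]]]


def iter_chapters(corpus: Corpus) -> Iterator[Tuple[str, str, str]]:
    """Yield (thesis_id, chapter_name, chapter_text) triples.

    Assumes entry[0] holds chapter names and entry[1] the matching texts.
    """
    for thesis_id, (names, texts) in corpus.items():
        for name, text in zip(names, texts):
            yield thesis_id, name, text


def non_empty_chapters(corpus: Corpus) -> List[Tuple[str, str]]:
    """Return (thesis_id, chapter_name) for chapters that have text.

    Chapters without text are stored as an empty string, so they are skipped.
    """
    return [(tid, name) for tid, name, text in iter_chapters(corpus) if text != ""]


corpus = json.loads(SAMPLE)
rows = list(iter_chapters(corpus))
for thesis_id, name, text in rows:
    print(thesis_id, name, repr(text))
```

Pairing names and texts with zip keeps empty chapters visible in iter_chapters, while non_empty_chapters filters them out when only chapters with content are needed.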
3
Summarization datasets from the KAS corpus KAS-Sum 1.0
Žagar, Aleš; Kavaš, Matic; Robnik-Šikonja, Marko. - : Faculty of Electrical Engineering and Computer Science, University of Maribor, 2022. : Faculty of Computer and Information Science, University of Ljubljana, 2022
BASE
4
Machine Translation datasets from the KAS corpus KAS-MT 1.0
Žagar, Aleš; Kavaš, Matic; Robnik-Šikonja, Marko. - : Faculty of Electrical Engineering and Computer Science, University of Maribor, 2022. : Faculty of Computer and Information Science, University of Ljubljana, 2022
BASE
5
Surveys in toponymy in Brazil: works produced in postgraduate stricto sensu ; Pesquisas em toponímia no Brasil: trabalhos produzidos na pós-graduação stricto sensu
In: Acta Scientiarum. Language and Culture; Vol 44 No 1 (2022): Jan.-June; e53282 ; 1983-4683 ; 1983-4675 (2022)
BASE
6
Abstracts from the KAS corpus KAS-Abs 1.0
Erjavec, Tomaž; Fišer, Darja; Ljubešić, Nikola. - : Jožef Stefan Institute, 2021. : Faculty of Electrical Engineering and Computer Science, University of Maribor, 2021
BASE
7
In search of safety: A qualitative study on how LGBT+ college students find safe spaces on college campuses
BASE
8
College Students’ Attitudes Toward Immigration within the United States
Lee, Walker V. - 2021
BASE
9
Improving the Accessibility of Arabic Electronic Theses and Dissertations (ETDs) with Metadata and Classification
Abdelrahman, Eman. - : Virginia Tech, 2021
BASE
10
Le complotisme « transnational » et le discours de haine : le cas de Chypre et de l’Italie
In: Mots. Les langages du politique, n 125, 1, 2021-02-15, pp.15-34 (2021)
BASE
11
Immigrant Assimilation Through Theatre ...
Anastasiadis, Grace. - : Maryland Shared Open Access Repository, 2020
BASE
12
English-Slovene term candidates KAS-biterm 1.0
Erjavec, Tomaž; Ljubešić, Nikola; Fišer, Darja. - : Jožef Stefan Institute, 2020
BASE
13
The Vocal Pedagogy of the Behnke Family: The Behnke Method
Stapleton, Megan. - : University of North Texas, 2020
BASE
14
An overview of studies within applied linguistics in Brazilian graduation programs between 2017 and 2020 ; Fotografias da pesquisa em linguística aplicada na pós-graduação brasileira entre 2017 e 2020
In: Entrepalavras; v. 10, n. 3 (10) (2020)
BASE
15
A Brave Space for Community: Bolstering K-12 Theatre Education for Diversity, Equity, and Inclusion ...
Loest, Tylor. - : Maryland Shared Open Access Repository, 2019
BASE
16
Monitoring Academic Studies of Turkish Lexicography: A Bibliometric Study of 84 Years
In: Lexikos; Vol. 29 (2019) ; 2224-0039 (2019)
BASE
17
Corpus of academic Slovene KAS 1.0
Erjavec, Tomaž; Fišer, Darja; Ljubešić, Nikola. - : Jožef Stefan Institute, 2019. : Faculty of Electrical Engineering and Computer Science, University of Maribor, 2019
BASE
18
Corpus of Academic Slovene (PhD theses) KAS-dr 1.0
Erjavec, Tomaž; Fišer, Darja; Ljubešić, Nikola. - : Jožef Stefan Institute, 2019. : Faculty of Electrical Engineering and Computer Science, University of Maribor, 2019
BASE
19
Corpus of Academic Slovene (MSc/MA theses) KAS-mag 1.0
Erjavec, Tomaž; Fišer, Darja; Ljubešić, Nikola. - : Jožef Stefan Institute, 2019. : Faculty of Electrical Engineering and Computer Science, University of Maribor, 2019
BASE
20
Corpus of Academic Slovene (BSc/BA theses) KAS-dipl 1.0
Erjavec, Tomaž; Fišer, Darja; Ljubešić, Nikola. - : Jožef Stefan Institute, 2019. : Faculty of Electrical Engineering and Computer Science, University of Maribor, 2019
BASE

