Catalogue search

Hits 1 – 20 of 264

1	Language identification, a tool for Corsican and for the evaluation of linguistic resources ; L'identification de langue, un outil au service du corse et de l'évaluation des ressources linguistiques
	Kevers, Laurent
	In: Traitement Automatique des Langues ; https://hal.archives-ouvertes.fr/hal-03633290 ; Traitement Automatique des Langues, 2022, Diversité Linguistique, 62 (3), pp.13-37 ; https://www.atala.org/content/diversité-linguistique-linguistic-diversity-natural-language-processing (2022)
	BASE

2	Machine Translation and Gender biases in video game localisation: a corpus-based analysis
	Rivas Ginel, María; Theroine, Sarah
	In: https://hal.archives-ouvertes.fr/hal-03540605 ; 2022 (2022)
	BASE

3	Lothian Diaries Dataset 1 (May-September 2020) ...
	Hall-Lew, Lauren. - : Edinburgh DataVault, 2022
	BASE

4	Frequency, Informativity and Word Length: Insights from Typologically Diverse Corpora
	Natalia Levshina
	In: Entropy; Volume 24; Issue 2; Pages: 280 (2022)
	Abstract: Zipf’s law of abbreviation, which posits a negative correlation between word frequency and length, is one of the most famous and robust cross-linguistic generalizations. At the same time, it has been shown that contextual informativity (average surprisal given previous context) is more strongly correlated with word length, although this tendency is not observed consistently, depending on several methodological choices. The present study examines a more diverse sample of languages than the previous studies (Arabic, Finnish, Hungarian, Indonesian, Russian, Spanish and Turkish). I use large web-based corpora from the Leipzig Corpora Collection to estimate word lengths in UTF-8 characters and in phonemes (for some of the languages), as well as word frequency, informativity given previous word and informativity given next word, applying different methods of bigrams processing. The results show different correlations between word length and the corpus-based measure for different languages. I argue that these differences can be explained by the properties of noun phrases in a language, most importantly, by the order of heads and modifiers and their relative morphological complexity, as well as by orthographic conventions.
	Keyword: corpora; frequency; informativity; linguistic typology; n-grams; Zipf’s law of abbreviation
	URL: https://doi.org/10.3390/e24020280
	BASE

5	Text+: Language- and text-based Research Data Infrastructure ...
	Hinrichs, Erhard; Leinen, Peter; Geyken, Alexander. - : Zenodo, 2022
	BASE

6	Text+: Language- and text-based Research Data Infrastructure ...
	Hinrichs, Erhard; Leinen, Peter; Geyken, Alexander. - : Zenodo, 2022
	BASE

7	Text+: Language- and text-based Research Data Infrastructure ...
	Hinrichs, Erhard; Leinen, Peter; Geyken, Alexander. - : Zenodo, 2022
	BASE

8	ANLIzing the Adversarial Natural Language Inference Dataset
	Williams, Adina; Thrush, Tristan; Kiela, Douwe
	In: Proceedings of the Society for Computation in Linguistics (2022)
	BASE

9	Control in free adjuncts: the 'dangling modifier' in English ...
	Donaldson, James. - : The University of Edinburgh, 2021
	BASE

10	Loose and tight languages: A typology based on associations between constructions and lexemes ...
	Levshina, Natalia; Hawkins, John A.. - : Zenodo, 2021
	BASE

11	Loose and tight languages: A typology based on associations between constructions and lexemes ...
	Levshina, Natalia; Hawkins, John A.. - : Zenodo, 2021
	BASE

12	Community Involvement in Research Infrastructures: The User Story Call for Text+ ...
	Rißler-Pipka, Nanette; Barthauer, Raisa; Buddenbohm, Stefan. - : Zenodo, 2021
	BASE

13	Community Involvement in Research Infrastructures: The User Story Call for Text+ ...
	Rißler-Pipka, Nanette; Barthauer, Raisa; Buddenbohm, Stefan. - : Zenodo, 2021
	BASE

14	You’re a bitch, the stallion said: estudio contrastivo inglés-español sobre el uso sexista del lenguaje.
	Alonso González, María; Baliña Ben, Lucía. - 2021
	BASE

15	Control in free adjuncts: the 'dangling modifier' in English
	Donaldson, James. - : The University of Edinburgh, 2021
	BASE

16	Corpora in the Classroom - the Case of the Serbian Language for Italian Speakers
	Perisic Olja. - : URSS, Moscow, Russia, 2021
	BASE

17	Clausal Complementation in Nepal Bhasa
	Zhang, Borui. - 2021
	BASE

18	Overview of AMALGUM – Large Silver Quality Annotations across English Genres
	Gessler, Luke D; Peng, Siyao; Liu, Yang...
	In: Proceedings of the Society for Computation in Linguistics (2021)
	BASE

19	Boosting English Vocabulary Knowledge through Corpus-Aided Word Formation Practice
	González Martínez, Ana; Gandón-Chapela, Evelyn
	In: RAEL: revista electrónica de lingüística aplicada, ISSN 1885-9089, Vol. 20, No. 1, 2021, pp. 49-70 (2021)
	BASE

20	Semantic prosody and collocation: A corpus study of the near-synonyms persist and persevere
	Supakorn Phoocharoensil
	In: Eurasian Journal of Applied Linguistics, Vol 7, Iss 1, Pp 240-258 (2021)
	BASE

© 2013 - 2024 Lin|gu|is|tik