Home
Catalogue search
Refine your search:
Keyword:
Artificial intelligence (1)
Computational linguistics (1)
Machine learning (1)
code switching (1)
language identification (1)
natural language processing (1)
social media (1)
Creator / Publisher:
Barman, Utsab (1)
Das, Amitava (1)
Foster, Jennifer (1)
Wagner, Joachim (1)
Year
Medium
Type
BLLDB-Access
Search in the Catalogues and Directories
All fields
Title
Creator / Publisher
Keyword
Year
AND
OR
AND NOT
All fields
Title
Creator / Publisher
Keyword
Year
AND
OR
AND NOT
All fields
Title
Creator / Publisher
Keyword
Year
AND
OR
AND NOT
All fields
Title
Creator / Publisher
Keyword
Year
AND
OR
AND NOT
All fields
Title
Creator / Publisher
Keyword
Year
Sort by
creator [A → Z]
'
creator [Z → A]
'
publishing year ↑ (asc)
'
publishing year ↓ (desc)
'
title [A → Z]
'
title [Z → A]
'
Simple Search
Hits 1 – 1 of 1
1
Code mixing: a challenge for language identification in the language of social media
Barman, Utsab
;
Das, Amitava
;
Wagner, Joachim
;
Foster, Jennifer
In: Barman, Utsab, Das, Amitava orcid:0000-0003-3418-463X , Wagner, Joachim orcid:0000-0002-8290-3849 and Foster, Jennifer orcid:0000-0002-7789-4853 (2014) Code mixing: a challenge for language identification in the language of social media. In: First Workshop on Computational Approaches to Code Switching, 25 Oct 2014, Doha, Qatar. (2014)
Abstract:
In social media communication, multilingual speakers often switch between languages, and, in such an environment, automatic language identification becomes both a necessary and challenging task. In this paper, we describe our work in progress on the problem of automatic language identification for the language of social media. We describe a new dataset that we are in the process of creating, which contains Facebook posts and comments that exhibit code mixing between Bengali, English and Hindi. We also present some preliminary word-level language identification experiments using this dataset. Different techniques are employed, including a simple unsupervised dictionary-based approach, supervised word-level classification with and without contextual clues, and sequence labelling using Conditional Random Fields. We find that the dictionary-based approach is surpassed by supervised classification and sequence labelling, and that it is important to take contextual clues into consideration.
Keyword:
Artificial intelligence
;
code switching
;
Computational linguistics
;
language identification
;
Machine learning
;
natural language processing
;
social media
URL:
http://doras.dcu.ie/25186/
BASE
Hide details
Mobile view
All
Catalogues
UB Frankfurt Linguistik
0
IDS Mannheim
0
OLC Linguistik
0
UB Frankfurt Retrokatalog
0
DNB Subject Category Language
0
Institut für Empirische Sprachwissenschaft
0
Leibniz-Centre General Linguistics (ZAS)
0
Bibliographies
BLLDB
0
BDSL
0
IDS Bibliografie zur deutschen Grammatik
0
IDS Bibliografie zur Gesprächsforschung
0
IDS Konnektoren im Deutschen
0
IDS Präpositionen im Deutschen
0
IDS OBELEX meta
0
MPI-SHH Linguistics Collection
0
MPI for Psycholinguistics
0
Linked Open Data catalogues
Annohub
0
Online resources
Link directory
0
Journal directory
0
Database directory
0
Dictionary directory
0
Open access documents
BASE
1
Linguistik-Repository
0
IDS Publikationsserver
0
Online dissertations
0
Language Description Heritage
0
© 2013 - 2024 Lin|gu|is|tik
|
Imprint
|
Privacy Policy
|
Datenschutzeinstellungen ändern