2 |
A Transformer-Based Neural Machine Translation Model for Arabic Dialects That Utilizes Subword Units
In: Sensors ; Volume 21 ; Issue 19 (2021)
BASE
3 |
Complete Variable-Length Codes: An Excursion into Word Edit Operations
In: LATA 2020 ; https://hal.archives-ouvertes.fr/hal-02389403 ; LATA 2020, Mar 2020, Milan, Italy (2020)
BASE
4 |
Acoustic Data-Driven Subword Units Obtained through Segment Embedding and Clustering for Spontaneous Speech Recognition
In: Applied Sciences ; Volume 10 ; Issue 6 (2020)
BASE
5 |
Subunits Inference and Lexicon Development Based on Pairwise Comparison of Utterances and Signs
In: Information ; Volume 10 ; Issue 10 (2019)
Abstract:
Communication languages convey information through a set of symbols or units. Typically, this unit is the word. When developing language technologies, words in a language do not all have the same prior probability, so there may not be sufficient training data to model each word. Furthermore, the training data may not cover all possible words in the language. Because of these data sparsity and word unit coverage issues, language technologies model subword units, or subunits, which are based on prior linguistic knowledge. For instance, the development of speech technologies such as automatic speech recognition systems presumes that a phonetic dictionary, or at least a writing system, exists for the target language. Such knowledge is not available for all languages in the world. To address this, the article develops an abstract hidden Markov model-based methodology to extract subword units given only pairwise comparisons between utterances (or realizations of words in the mode of communication), i.e., whether or not two utterances correspond to the same word. We validate the proposed methodology through investigations on spoken language and sign language. In the case of spoken language, we demonstrate that the proposed methodology can lead to the discovery of a phone set and the development of a phonetic dictionary. In the case of sign language, we demonstrate how hand movement information can be effectively modeled for sign language processing and synthesized back to gain insight into the derived subunits.
Keyword:
hidden Markov model; phone set; pronunciation lexicon; sign language processing; speech processing; subword units; under-resourced
URL: https://doi.org/10.3390/info10100298
BASE
6 |
Learning Subword Embedding to Improve Uyghur Named-Entity Recognition
In: Information ; Volume 10 ; Issue 4 (2019)
BASE
9 |
Using pronunciation-based morphological subword units to improve OOV handling in keyword search
BASE
10 |
Quotient Complexity of Bifix-, Factor-, and Subword-Free Regular Languages
BASE
11 |
An STD system for OOV query terms using various subword units
In: http://research.nii.ac.jp/ntcir/workshop/OnlineProceedings9/NTCIR/10-NTCIR9-SpokenDoc-SaitoH.pdf (2011)
BASE
12 |
Contextual verification for open vocabulary spoken term detection
In: Fraunhofer IAIS (2011)
BASE
13 |
Volkov, Modular and threshold subword counting and matrix representations of finite monoids
In: http://www.fc.up.pt/cmup/home/jalmeida/preprints/radicalshort2.pdf (2005)
BASE
14 |
Verbumculus and the discovery of unusual words
In: Apostolico, A; Gong, F C; & Lonardi, S. (2004). Verbumculus and the discovery of unusual words. Journal of Computer Science and Technology, 19(1), 22-41. UC Riverside. Retrieved from: http://www.escholarship.org/uc/item/5m66k36w (2004)
BASE
15 |
On average sequence complexity
In: http://www.cs.ucr.edu/~stelo/papers/tcs04.pdf (2003)
BASE
16 |
Subword Latent Semantic Analysis for TextTiling-Based Automatic Story Segmentation of Chinese Broadcast News
In: http://isca-speech.org/archive_open/archive_papers/iscslp2008/358.pdf
BASE
17 |
and
In: http://www.mimuw.edu.pl/~rytter/MYPAPERS/PSC08_journal.pdf
BASE
18 |
Performance Analysis of Instruction Set Architecture Extensions for Multimedia
In: http://www.cs.berkeley.edu/~slingn/publications/mm_isa_perf/mm_isa_perf_msp3.pdf
BASE
19 |
Parikh Matrices and Words over Tertiary Ordered Alphabet
In: http://research.ijcaonline.org/volume85/number4/pxc3893069.pdf
BASE
20 |
Morph-Based Speech Recognition and Modeling of Out-of-Vocabulary Words Across Languages
In: http://www-speech.sri.com/cgi-bin/run-distill?papers/acm2007-morph-asr.ps.gz
BASE