Search in the Catalogues and Directories

Hits 1 – 20 of 37

1
Recovering capitalization and punctuation marks for automatic speech recognition: case study for Portuguese broadcast news
Abstract: This study addresses the recovery of punctuation marks and capitalization information in European Portuguese broadcast news speech transcriptions. Different approaches to capitalization were tested, both generative and discriminative: finite state transducers built automatically from language models, and maximum entropy models. Several resources were used, including lexica, written newspaper corpora and speech transcriptions. Finite state transducers produced the best results for written newspaper corpora, but the maximum entropy approach also proved to be a good choice, suitable for capitalizing speech transcriptions and allowing straightforward on-the-fly capitalization. Evaluation results are presented both for written newspaper corpora and for broadcast news speech transcriptions. The frequency of each punctuation mark in broadcast news (BN) speech transcriptions was analyzed for three languages: English, Spanish and Portuguese. The punctuation task was performed using a maximum entropy modeling approach that combines different types of information, both lexical and acoustic. The contribution of each feature was analyzed individually, and separate results are given for each focus condition, making it possible to analyze the performance differences between planned and spontaneous speech. All results were evaluated on speech transcriptions of a Portuguese broadcast news corpus. The benefits of enriching speech recognition with punctuation and capitalization are shown in an example illustrating the effects of the described experiments on spoken texts.
Keyword: Capitalization; Domain/Scientific Area::Agricultural Sciences::Other Agricultural Sciences; Domain/Scientific Area::Natural Sciences::Computer and Information Sciences; Domain/Scientific Area::Natural Sciences::Physical Sciences; Language modeling; Maximum entropy; Punctuation recovery; Rich transcription; Sentence boundary detection; Truecasing; Weighted finite state transducers
URL: https://doi.org/10.1016/j.specom.2008.05.008
http://hdl.handle.net/10071/22063
BASE
2
Strong Generative Capacity of Morphological Processes
In: Proceedings of the Society for Computation in Linguistics (2021)
BASE
3
Past, present and future: Computational approaches to mapping historical Irish cognate verb forms
FRANSEN, THEODORUS LEMAN. - : Trinity College Dublin. School of Linguistic Speech & Comm Sci. C.L.C.S., 2019
BASE
4
Decomposing phonological transformations in serial derivations
In: Proceedings of the Society for Computation in Linguistics (2018)
BASE
5
Named entity recognition within Arabic text and their semantic relations ; Extraction d'information à partir d'un texte arabe : extraction des entités nommées et leurs relations sémantiques
Doumi, Noureddine. - : HAL CCSD, 2017
In: https://hal.archives-ouvertes.fr/tel-01716911 ; Intelligence artificielle [cs.AI]. Université Djillali Liabes de Sidi Bel Abbès, 2017. Français (2017)
BASE
6
Long-distance consonant agreement and subsequentiality
In: Glossa: a journal of general linguistics; Vol 2, No 1 (2017); 52 ; 2397-1835 (2017)
BASE
7
A Compact Representation of Pronunciation Lexicons Using Finite-state Super Transducers
In: Slovenščina 2.0: Empirične, aplikativne in interdisciplinarne raziskave, Vol 4, Iss 1, Pp 79-96 (2017)
BASE
8
A Semi-automatic and Low Cost Approach to Build Scalable Lemma-based Lexical Resources for Arabic Verbs
In: International Journal of Information Technology and Computer Science(IJITCS) ; https://hal.archives-ouvertes.fr/hal-01270974 ; International Journal of Information Technology and Computer Science(IJITCS), IJIT, 2016, 8 (2), pp.1-13 ; http://www.mecs-press.org/ijitcs/ijitcs-v8-n2/IJITCS-V8-N2-1.pdf (2016)
BASE
9
Phonétisation statistique adaptable d'énoncés pour le français
In: Journées d'Études sur la Parole ; https://hal.inria.fr/hal-01321358 ; Journées d'Études sur la Parole, Jul 2016, Paris, France (2016)
BASE
10
Evaluating the impact of using a domain-specific bilingual lexicon on the performance of a hybrid machine translation approach
In: 10th International Conference on Recent Advances in Natural Language Processing, RANLP 201 ; https://hal-cea.archives-ouvertes.fr/cea-01844051 ; 10th International Conference on Recent Advances in Natural Language Processing, RANLP 201, Sep 2015, Hissar, Bulgaria. pp.579-587 (2015)
BASE
11
Using finite-state transducers to build lexical resources for Unitex Arabic package
In: CEC-TAL ; https://hal.archives-ouvertes.fr/hal-01177017 ; CEC-TAL, Mar 2015, Sousse, Tunisia. pp.90-100 (2015)
BASE
12
Adaptive Statistical Utterance Phonetization for French
In: Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ; https://hal.inria.fr/hal-01109757 ; Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2015, Brisbane, Australia. 5 p., 2 columns (2015)
BASE
13
Weighted tree automata and transducers for syntactic natural language processing ...
May, Jonathan David Louis. - : University of Southern California Digital Library (USC.DL), 2015
BASE
14
Intersection of Multitape Transducers vs. Cascade of Binary Transducers: The Example of Egyptian Hieroglyphs Transliteration
In: FSMNLP - International Workshop on Finite State Methods and Natural Language Processing ; https://hal.inria.fr/hal-00660490 ; FSMNLP - International Workshop on Finite State Methods and Natural Language Processing, Jul 2011, Blois, France. pp.74-82 ; http://www.aclweb.org/anthology/W11-4410 (2011)
BASE
15
A weighted finite state transducer implementation of phoneme rewrite rules for English-to-Korean pronunciation conversion
In: Faculty Publications (2011)
BASE
16
GREAT: open source software for statistical machine translation
González Mollá, Jorge; Casacuberta Nolla, Francisco. - : Springer Netherlands, 2011
BASE
17
Methods and Tools for Weak Problems of Translation ; Méthodes et outils pour les problèmes faibles de traduction
Malik, Muhammad Ghulam Abbas. - : HAL CCSD, 2010
In: https://tel.archives-ouvertes.fr/tel-00502192 ; Computer Science [cs]. Université Joseph-Fourier - Grenoble I, 2010. English (2010)
BASE
18
Onoma: un conjugador de verbos y neologismos verbales ; Onoma: a conjugator tool for verbs and verb neologisms
Rello, Luz; Basterrechea, Eduardo. - : Sociedad Española para el Procesamiento del Lenguaje Natural, 2010
BASE
19
Boosting Robustness of a Named Entity Recognizer
In: EISSN: 1793-7108 ; International Journal of Semantic Computing ; https://hal.archives-ouvertes.fr/hal-00436301 ; International Journal of Semantic Computing, World Scientific, 2009, 3 (1), pp.1-14 (2009)
BASE
20
Recognition of Personal Names in Serbian Texts
In: International Conference Recent Advances in Natural Language Processing (RANLP'05) ; https://hal.archives-ouvertes.fr/hal-01108230 ; International Conference Recent Advances in Natural Language Processing (RANLP'05), 2005, Borovets, Bulgaria. pp.288-292 (2005)
BASE

Hits by source type: Open access documents 37; Catalogues, Bibliographies, Linked Open Data catalogues and Online resources 0.
© 2013 - 2024 Lin|gu|is|tik