1 |
An EM algorithm for context-based searching and disambiguation with application to synonym term alignment
In: http://nlp.csie.ncnu.edu.tw/~shin/doc/PACLIC.2009/EM.CBS.C2C.PACLIC09.2009.1019.v18.15.+proof.pdf (2010)
2 |
A Preliminary Study on Probabilistic Models for Chinese Abbreviations
In: http://nlp.csie.ncnu.edu.tw/~shin/doc/acl04/acl04.sighan.v08R.Jing.Shin.Chang.PDF (2004)
3 |
A preliminary study on probabilistic models for Chinese abbreviations
In: http://www.aclweb.org/anthology/W/W04/W04-1102.pdf (2004)
4 |
A customizable, self-learnable parameterized MT system: the next generation
In: http://www.mt-archive.info/MTS-1999-Su.pdf (1999)
5 |
An Unsupervised Iterative Method for Chinese New Lexicon Extraction
In: http://nlp.csie.ncnu.edu.tw/~shin/doc/ukw9706f.ps.gz (1997)
6 |
A Multivariate Gaussian Mixture Model for Automatic Compound Word Extraction
In: http://www.bdc.com.tw/~shin/doc/rocling/cpnr.RocX.final.ps.Z (1997)
7 |
An Overview of Corpus-Based Statistics-Oriented (CBSO) Techniques for Natural Language Processing
In: http://nlp.csie.ncnu.edu.tw/~shin/doc/overview.cbso.tech.ps.gz (1996)
8 |
Automatic construction of a Chinese electronic dictionary
In: http://www.mt-archive.info/VLC-1995-Chang.pdf (1995)
Abstract:
In this paper, an unsupervised approach for constructing a large-scale Chinese electronic dictionary is surveyed. The main purpose is to enable cheap and quick acquisition of a large-scale dictionary from a large untagged text corpus with the aid of the information in a small tagged seed corpus. The basic model is based on a Viterbi reestimation technique. During the dictionary construction process, it tries to optimize the automatic segmentation and tagging process by repeatedly refining the set of parameters of the underlying language model. The refined parameters are then used to obtain a better tagging result. In addition, a two-class classifier, which is capable of classifying an n-gram either as a word or a non-word, is used in combination with the Viterbi training module to improve the system performance. Two different system configurations were developed to construct the dictionary. The configurations include (1) a Viterbi word identification module followed by a Viterbi POS tagging module and (2) a two-class classification module as the postfilter for the above Viterbi word identification module. With a seed of 1,000 sentences and an untagged corpus of 311,591 sentences, the performance for bigram word identification is 56.88% in precision and 77.37% in recall when the two-class classifier is applied to the word list suggested by the Viterbi word identification module. The Viterbi part-of-speech tag reestimation stage yields weighted precision rates of 71.16% and 71.81% and weighted recall rates of 73.42% and 73.83% for the two configurations when using a seed corpus of 9,676 sentences.
URL: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.574.2536 http://www.mt-archive.info/VLC-1995-Chang.pdf
9 |
A corpus-based two-way design for parameterized mt systems: Rationale, architecture and training issues
In: http://www.mt-archive.info/TMI-1995-Su.pdf (1995)
10 |
A Corpus-Based Statistics-Oriented Two-Way Design for Parameterized MT Systems: Rationale, Architecture and Training Issues
In: http://www.bdc.com.tw/~shin/doc/cbso.2way.mt.ps.gz (1995)
11 |
Introduction to Corpus-based Statistics-oriented (CBSO) Techniques
In: http://www.aclclp.org.tw/clclp/v1n1/v1n1a4.pdf (1994)
12 |
A Corpus-Based Statistics-Oriented Transfer and Generation Model for Machine Translation
In: http://www.mt-archive.info/TMI-1993-Chang.pdf (1993)
13 |
Statistical models for word segmentation and unknown resolution
In: http://nlp.csie.ncnu.edu.tw/~shin/doc/sws/sws.PDF (1992)
14 |
GPSM: A Generalized Probabilistic Semantic Model for Ambiguity Resolution
In: http://nlp.csie.ncnu.edu.tw/~shin/doc/archtran/PDF/gpsm.acl92.2c.PDF (1992)
15 |
GPSM: A Generalized Probabilistic Semantic Model for Ambiguity Resolution
In: http://acl.ldc.upenn.edu/P/P92/P92-1023.pdf (1992)
16 |
Why Corpus-Based Statistics-Oriented Machine Translation
In: http://www.mt-archive.info/TMI-1992-Su.pdf (1992)
17 |
The Semantic Score Approach to the Disambiguation of PP Attachment Problem
In: http://www.cs.nccu.edu.tw/~chaolin/papers/rocling90.pdf (1990)
18 |
Mining Atomic Chinese Abbreviation Pairs with a Probabilistic Single Character Word Recovery Model
In: http://nlp.csie.ncnu.edu.tw/~shin/doc/LRE/AAP.LRE.0612.v08R.29.short.21.Jing.Shin.Chang.pdf
19 |
A Chinese-to-Chinese Statistical Machine Translation Model for Mining Synonymous Simplified-Traditional Chinese Terms
In: http://nlp.csie.ncnu.edu.tw/~shin/crs/doc/summit/MTS-2007-Chang-2.pdf
20 |
Computational Tools and Resources for Linguistic Studies
In: http://nlp.csie.ncnu.edu.tw/~shin/doc/rocling/CLTools.ps.Z