1 |
Evaluation of Chinese Natural Language Processing System Based on Metamorphic Testing
|
|
|
|
In: Mathematics; Volume 10; Issue 8; Pages: 1276 (2022)
|
|
Abstract:
A natural language processing system enables effective communication between humans and computers in natural language. Because its evaluation relies on large amounts of labeled data and human judgment, systematically evaluating its quality remains a challenging task. In this article, we use metamorphic testing to evaluate natural language processing systems from the user’s perspective, to help users better understand the functionalities of these systems and then select the appropriate system for their specific needs. We defined three metamorphic relation patterns, each focusing on characteristics of a different aspect of natural language processing. On this basis, we defined seven metamorphic relations and chose three tasks (text similarity, text summarization, and text classification) to evaluate the quality of the system. Chinese is used as the target language. We extended the abstract metamorphic relations to these tasks, generating seven specific metamorphic relations for each task. We then judged whether the metamorphic relations were satisfied for each task and used the results to evaluate the quality and robustness of the natural language processing system without reference output. We further applied metamorphic testing to three mainstream natural language processing systems (BaiduCloud API, AliCloud API, and TencentCloud API) on the PAWS-X, LCSTS, and THUCNews datasets. The experiments revealed the advantages and disadvantages of each system. These results further show that metamorphic testing can effectively test natural language processing systems without annotated data.
|
|
Keywords:
metamorphic testing; natural language processing; quality assessment
|
|
URL: https://doi.org/10.3390/math10081276
|
|
BASE
|
|
2 |
Language Assessment Literacy of Middle School English Teachers in Mexico
|
|
|
|
In: Languages; Volume 7; Issue 1; Pages: 32 (2022)
|
|
BASE
|
|
3 |
Validation of a large-scale task-based test: functional progression in dialogic speaking performance ; Task-based language teaching and assessment: Contemporary reflections from across the world
|
|
|
|
BASE
|
|
4 |
CAF across proficiency levels and profiles: an investigation of ESL student writings in an English placement test
|
|
|
|
BASE
|
|
6 |
A Rationale for Using a Scenario-Based Assessment to Measure Competency-Based, Situated Second and Foreign Language Proficiency
|
|
|
|
BASE
|
|
7 |
The Use of an Elicited Imitation Test to Measure Global Oral Proficiency of L2 Chinese at the Postsecondary Classroom
|
|
|
|
In: Chinese Language Teaching Methodology and Technology (2021)
|
|
BASE
|
|
8 |
The diagnosis of listening in English as a foreign language, with a special focus on lexical knowledge ; Diagnostic et remédiation orientés vers le lexique en compréhension aurale de l’anglais
|
|
|
|
In: https://hal.archives-ouvertes.fr/tel-03170753 ; Linguistique. Université Lyon 2 Lumière, 2021. Français (2021)
|
|
BASE
|
|
9 |
The diagnosis of listening in English as a foreign language, with a special focus on lexical knowledge ; Diagnostic et remédiation orientés vers le lexique en compréhension aurale de l'anglais
|
|
|
|
In: https://tel.archives-ouvertes.fr/tel-03235381 ; Linguistique. Université de Lyon, 2021. Français. ⟨NNT : 2021LYSE2004⟩ (2021)
|
|
BASE
|
|
10 |
MODERN MEANS OF EVALUATING THE RESULTS OF LEARNING FOREIGN LANGUAGES ...
|
|
|
|
BASE
|
|
11 |
Establishing Equity: Aligning Dual Language Bilingual Education to HB3 Sec. 11.185 Texas Early Childhood Literacy & Mathematics Proficiency Plans ...
|
|
|
|
BASE
|
|
14 |
Teaching to the test: The effects of coaching on English-proficiency scores for university entry
|
|
|
|
In: Journal of the European Second Language Association; Vol 5, No 1 (2021); 1–15 ; 2399-9101 (2021)
|
|
BASE
|
|
15 |
Language neutrality of the LLAMA test explored: The case of agglutinative languages and multiple writing systems
|
|
|
|
In: Journal of the European Second Language Association; Vol 5, No 1 (2021); 87–100 ; 2399-9101 (2021)
|
|
BASE
|
|
16 |
English Proficiency as a Predictor of ACT Scores: A Predictive Correlational Study
|
|
|
|
In: Doctoral Dissertations and Projects (2021)
|
|
BASE
|
|
17 |
Improving communication outcomes for children with hearing loss in their early years: tracking progress and guiding intervention
|
|
|
|
BASE
|
|
18 |
Policy in Practice: Teachers’ Conceptualizations of L2 English Oral Proficiency as Operationalized in High-Stakes Test Assessment
|
|
|
|
In: Languages; Volume 6; Issue 4; Pages: 204 (2021)
|
|
BASE
|
|
19 |
Tungafɛtaa Badara ye Sidabana sɔrɔ cogo min na ; How Badara got AIDS ; L'aventurier Comment Badara a eu le SIDA
|
|
|
|
BASE
|
|
20 |
Tuŋasigeya - Badara sidaakirɛn baanaa ; L'aventurier - Comment Badara a eu le SIDA
|
|
|
|
BASE