1	The Usefulness of the Computer-Based Speaking Tasks of the AP Japanese Exam
	Suzumura, Nana. University of Hawai'i at Manoa, 2020
	Abstract: The Advanced Placement (AP) Japanese Language and Culture exam (the AP Japanese exam) is a large-scale, high-stakes test targeting high school students in the United States. It is a computer-based test that consists of speaking, listening, reading, and writing sections. The present study focuses on the speaking section, which aims to measure examinees’ interpersonal communication skills and presentational skills. The usefulness of the speaking tasks was investigated from a variety of perspectives. Drawing upon Bachman and Palmer’s (1996) model of test usefulness, the present study scrutinized the reliability, construct validity, and authenticity of the AP Japanese speaking test tasks. To strengthen arguments regarding the usefulness of these tasks, the present study employed a mixed methods research design. The primary design was a concurrent design (R. B. Johnson & Onwuegbuzie, 2004; Teddlie & Tashakkori, 2006, 2009). The present study also employed a partial sequential design to integrate quantitative and qualitative methods at the analysis stage. Evidence for the usefulness of the test was collected by examining the speaking tasks used for the AP Japanese exam in the past, examinees’ language samples and test scores from a simulation of the AP Japanese exam, and examinee and rater survey responses. Four sets of conversation tasks and two sets of cultural perspective presentation tasks were selected from past test items and used in the simulation test. A total of 111 high school students from two U.S. states who were planning to take the AP Japanese exam participated in the present study. Their performance was rated by six raters: three for the conversation task and three for the presentation task.
From the quantitative strand, generalizability theory and many-facet Rasch measurement (MFRM) (Linacre, 1989) were employed to analyze the test scores and to collect evidence primarily for reliability and construct validity. From the qualitative strand, discourse analysis was employed to analyze test task characteristics and examinees’ performance and to collect evidence primarily for authenticity and construct validity. The present study found that overall the AP Japanese speaking section has a reasonable level of usefulness in that it showed a reasonable level of reliability and it aims to represent more dimensions of speaking ability than other computer-based language tests. Yet it also has some issues related to construct validity and authenticity, especially for the conversation task. In particular, the present study recommends reviewing the amount of contextual information, the proportion of information-seeking prompts and non-information-seeking prompts, and the appropriateness of the conversation scoring criteria in relation to the nature of the conversation prompts. The present study illustrated how intricately different facets of test usefulness can interact. This study exemplified how useful a mixed methods approach can be in collecting more comprehensive evidence for test validation and how that in turn can allow for more meaningful and practical suggestions to improve the overall usefulness of a language test. ; Ph.D.
	Keyword: AP exam; discourse analysis; Educational tests & measurements; generalizability theory; Language; Linguistics; many-facet Rasch measurement; mixed methods research; performance assessment
	URL: http://hdl.handle.net/10125/68977
	BASE
