1. Automatic Dialect Density Estimation for African American English ...
BASE

2. DIALKI: Knowledge Identification in Conversational Systems through Dialogue-Document Contextualization ...
BASE

3. Dialogue State Tracking with a Language Model using Schema-Driven Prompting ...
BASE

5. Neural Models for Integrating Prosody in Spoken Language Understanding
BASE

6. Automatic Analysis of Language Use in K-16 STEM Education and Impact on Student Performance
Abstract:
Thesis (Ph.D.)--University of Washington, 2020. There is a growing community of research focusing on educational applications of natural language processing (NLP). These applications tend to focus on analysis of student writing for scoring and feedback, and on analysis of language learning. There has been less focus on analysis of language use in educational content, such as assessment questions and textbooks, which is largely an expert-driven process. This work examines this space, presenting automated tools for analysis of language use in K-16 science, technology, engineering and mathematics (STEM) education, and demonstrates the utility of automatically extracted features in studying student performance. This work also serves to bridge research in educational measurement and machine learning, providing a machine learning framework for analysis of factors that contribute to the difficulty of science assessment items. Within the broader umbrella of language use, this work focuses on two aspects: language difficulty (or linguistic complexity) and gender representation. Linguistic complexity has been studied both from the expert-driven educational perspective and in the context of machine learning and NLP-based tools. For the latter, models have shown high agreement with expert annotation for longer documents but have not been shown to work well for shorter, informational texts. This work presents a discourse-aware hierarchical neural model for classification of linguistic complexity, quantified as grade level, that is demonstrated to work accurately for shorter texts, achieving state-of-the-art performance. Unlike most existing NLP-based methods, the performance of our model is also validated on the downstream task of predicting student performance, where we find an impact for both K-12 and college-level STEM assessments. The classification model also generalizes to other text classification problems.
Educational measurement research on predicting the difficulty of assessment questions is important in the context of assessment design and analysis of student learning. To understand the relative importance of factors impacting difficulty, many past studies have relied on linear models for predicting item difficulty given item characteristics. Some more recent work has looked at non-linear tree-based ensemble methods, but without analysis to identify important item characteristics. In our work with linear methods, we provide specific examples showing that the commonly used assumptions of feature independence and of a linear relationship between features and difficulty do not hold in practice. We also use non-linear ensemble models for the prediction problem but, unlike previous work, present a robust analysis of model performance and apply recently introduced methods of feature interpretation to analyze aspects that contribute to question difficulty. Our results demonstrate that some item characteristics, including linguistic complexity, have a non-linear impact on item difficulty.
Analysis of how gender roles are depicted in content, including assessment questions, is also a growing area of research in the educational space. This is important since negative stereotypes can impact both student performance and retention of students in STEM. Expert annotation for this task is very time-consuming and can be prohibitively expensive for large text collections. Our work presents NLP-based methods to automate this process for STEM textbooks and middle school assessment items. Specifically, we extract gendered mention counts; more nuanced aspects of roles, agency and authority of gendered characters; and activity characteristics. Using these features, we develop tools for analysis of content and assessments for gender biases, showing that biases exist both in the frequency with which masculine and feminine characters appear in the texts and in the activities, roles, agency and authority of these mentions. Together, these results show the utility of NLP tools for analysis of language use in educational content, providing downstream validation with analysis of student performance. Our findings demonstrate that NLP-based analysis tools can identify sources of difficulty even in expert-curated educational content.
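As a rough illustration of the simplest feature named in the abstract, gendered mention counts, the following minimal sketch counts masculine and feminine tokens by lexicon lookup. The word lists and the function name `gendered_mention_counts` are hypothetical and chosen only for this example; the abstract does not specify the thesis's actual lexicon or extraction method, which also covers roles, agency and authority.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny word lists (an assumption for illustration;
# not the lexicon used in the thesis).
MASCULINE = {"he", "him", "his", "man", "men", "boy", "boys", "father", "mr"}
FEMININE = {"she", "her", "hers", "woman", "women", "girl", "girls", "mother", "ms", "mrs"}


def gendered_mention_counts(text: str) -> Counter:
    """Count masculine and feminine mentions by simple token lookup."""
    # Lowercase and keep alphabetic runs only, so "Mr." becomes "mr".
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(masculine=0, feminine=0)
    for tok in tokens:
        if tok in MASCULINE:
            counts["masculine"] += 1
        elif tok in FEMININE:
            counts["feminine"] += 1
    return counts


if __name__ == "__main__":
    item = "He said she would meet his father."
    print(gendered_mention_counts(item))
```

A lookup of this kind only captures frequency; it cannot resolve names, ambiguous words, or who is acting on whom, which is presumably why the work also extracts role, agency and authority features.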
Keyword:
Computer science; Computer science and engineering; Educational Applications; Educational tests & measurements; Electrical engineering; Gender Bias; Linguistic Complexity; Machine Learning; Natural Language Processing
URL: http://hdl.handle.net/1773/46349
BASE

7. Asynchronous Speech Recognition Affects Physician Editing of Notes
BASE

9. Parsing Speech: A Neural Approach to Integrating Lexical and Acoustic-Prosodic Information ...
BASE

10. Effective Use of Cross-Domain Parsing in Automatic Speech Recognition and Error Detection
BASE

16. Graph-based Algorithms for Lexical Semantics and its Applications
BASE