Research Article
Abstract
Both tree models and logistic regression models are widely used to analyze multifactorial data in recent corpus studies. Drawing on my previous corpus study of relative clauses, this paper argues that tree models have difficulty handling the integrated effect of multiple linguistic factors, that is, a three-way interaction of non-syntactic factors that affect the preference for relative clause types. This integrated interaction effect cannot be captured by adding interaction terms to a logistic regression model; it can be captured instead by suppressing the intercept and creating a single variable that combines all three factors. Finally, a mixed-effects logistic regression analysis is implemented by adding a random effect of register, which has been ignored in the corpus linguistics literature on relative clauses.
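The coding strategy described in the abstract (no intercept, one variable combining all three factors) can be illustrated with a minimal sketch. It is not the paper's own analysis: the factor levels (`a1`, `b1`, `c1`, and so on), the helper names, and the counts are hypothetical placeholders invented for illustration, and the random effect of register is left out. The sketch relies on a standard property of logistic regression: when the only predictors are indicators for mutually exclusive cells and there is no intercept, the maximum likelihood coefficient of each cell is the log-odds of the outcome observed in that cell.

```python
import math
from collections import Counter


def combine(a, b, c):
    """Collapse three factor levels into one label for a single combined variable."""
    return f"{a}.{b}.{c}"


def cell_log_odds(rows):
    """Return {combined level: log-odds of outcome 1}.

    With one indicator per combined level and no intercept, the maximum
    likelihood coefficient of each indicator equals the empirical logit
    of that cell, so no iterative fitting is needed in this sketch.
    """
    hits, totals = Counter(), Counter()
    for a, b, c, y in rows:
        key = combine(a, b, c)
        totals[key] += 1
        hits[key] += y
    coefs = {}
    for key, n in totals.items():
        k = hits[key]
        if k == 0 or k == n:
            # Complete separation in this cell: no finite estimate exists.
            coefs[key] = math.inf if k == n else -math.inf
        else:
            coefs[key] = math.log(k / (n - k))
    return coefs


def expand(counts):
    """Turn {(a, b, c): (n_ones, n_total)} into one row per observation."""
    rows = []
    for (a, b, c), (k, n) in counts.items():
        rows += [(a, b, c, 1)] * k + [(a, b, c, 0)] * (n - k)
    return rows


# Invented toy counts, NOT corpus data: (number of 1 outcomes, cell size).
TOY_COUNTS = {
    ("a1", "b1", "c1"): (30, 40),
    ("a1", "b1", "c2"): (10, 40),
    ("a2", "b2", "c2"): (20, 40),
}

coefs = cell_log_odds(expand(TOY_COUNTS))
for level, beta in sorted(coefs.items()):
    prob = 1 / (1 + math.exp(-beta))
    print(f"{level}: coefficient={beta:.3f}, probability={prob:.3f}")
```

Because each coefficient is read directly as the log-odds for one combination of factor levels, no reference level or separate interaction term has to be interpreted. Adding a random effect of register, as the abstract proposes, would require a mixed-effects fitting routine and is beyond this sketch.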
- Publisher: The Modern Linguistic Society of Korea
- Publisher (Ko): 한국현대언어학회
- Journal Title: The Journal of Studies in Language
- Journal Title (Ko): 언어연구
- Volume: 37
- No: 3
- Pages: 291-305
- DOI: https://doi.org/10.18627/jslg.37.3.202111.291