A Linguistic Study on Types of Hallucinations in Large Language Models

doi:10.18627/jslg.42.1.202605.7

2026 Vol.42, Issue 1

Research Article

Yang, Chae-yeon¹, Nam, Jee-Sun¹*

¹Hankuk University of Foreign Studies

31 May 2026. pp. 7-30

Abstract

This study investigates LLM hallucinations as linguistic patterns emerging at the interface of contextual constraints and intrinsic knowledge, operationalized as a failure of contextual faithfulness. Moving beyond top-down error injection, it adopts a bottom-up inductive approach to analyze spontaneous linguistic elaboration. To ensure selection validity and avoid self-referential evaluation, 341 high-purity cases were extracted from 10,360 Wikipedia-based QA pairs through automated pre-screening and expert linguistic verification. Results show that models prioritize narrative coherence, primarily through ‘Causal Narrative Reconstruction’ (35.48%) and ‘Target Property Reconstruction’ (32.26%). Notably, ‘Input Premise Modification’ exhibited the highest syntactic complexity, with an average length of 64.8 characters. These findings reveal that LLMs use specific linguistic markers and syntactic expansion to maintain internal coherence. By identifying concrete predictive features, such as contrastive particles and sentence length, this study provides an empirical foundation for robust detection strategies, bridging descriptive analysis and practical mitigation.
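The per-type figures reported in the abstract (share of cases and average length in characters) can be illustrated with a minimal sketch. The function name `summarize_by_type` and the toy records are hypothetical and are not taken from the study; the only values drawn from the abstract are the total of 341 cases and the two reported percentages, which are used for a simple consistency check.

```python
from collections import defaultdict


def summarize_by_type(cases):
    """Return {type: (share_percent, mean_char_length)} for labeled cases.

    `cases` is an iterable of (hallucination_type, response_text) pairs.
    This layout is an assumption made for illustration only.
    """
    lengths = defaultdict(list)
    for label, text in cases:
        lengths[label].append(len(text))  # length measured in characters
    total = sum(len(v) for v in lengths.values())
    return {
        label: (round(100 * len(v) / total, 2), round(sum(v) / len(v), 1))
        for label, v in lengths.items()
    }


# Hypothetical toy records, NOT data from the study.
toy_cases = [
    ("Causal Narrative Reconstruction", "a" * 40),
    ("Causal Narrative Reconstruction", "b" * 50),
    ("Target Property Reconstruction", "c" * 30),
    ("Input Premise Modification", "d" * 64),
]
summary = summarize_by_type(toy_cases)

# Consistency check on the reported shares: with 341 cases in total,
# the whole-number counts implied by 35.48% and 32.26% are recovered
# by rounding. These counts are inferred, not stated in the abstract.
implied_counts = {p: round(p / 100 * 341) for p in (35.48, 32.26)}
```

Here `summary` maps each type label to its share and mean length for the toy records, and `implied_counts` gives the case counts that the published percentages would correspond to under the stated total.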

Keywords

Large Language Model (LLM)

hallucination

bottom-up analysis

analysis of hallucination types

linguistic features

Information

Publisher: The Modern Linguistic Society of Korea
Publisher (Ko): 한국현대언어학회
Journal Title: The Journal of Studies in Language
Journal Title (Ko): 언어연구
Volume: 42
No: 1
Pages: 7-30
DOI: https://doi.org/10.18627/jslg.42.1.202605.7

The Journal of Studies in Language (언어연구) ISSN: 1225-4770 (Print), 2671-6151 (Online)
