Research Article
Abstract
This study, situated in computational linguistics, investigates how multi-word expressions (MWEs), which play an important role in disambiguating text and communicating a paper’s content, are represented as word embeddings in large language models (LLMs) and used in context. We identify MWEs such as “artificial intelligence” and “machine learning” through phrase mining in scientific papers, extract acronyms such as “AI” and “GPT”, and analyze the characteristics of technical terms using MWE word embeddings. To this end, we collect 41,230 abstracts from recent international academic papers related to ChatGPT and extract and analyze the MWEs they contain. The study also extends the application of word embeddings for MWEs in natural language processing and seeks to clarify how natural language processing technology is integrated into the workings of generative AI.
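The two extraction steps named in the abstract, phrase mining for MWEs and acronym extraction, can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes a simple pointwise mutual information (PMI) score over adjacent word pairs as a stand-in for a full phrase-mining system, and a plain uppercase-token pattern as a stand-in for acronym extraction. The function names, the `min_count` threshold, and the two sample sentences are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into alphabetic tokens (hyphens inside a token are kept)."""
    return re.findall(r"[A-Za-z][A-Za-z\-]*", text)


def mine_bigrams(abstracts, min_count=2):
    """Score adjacent word pairs by PMI; an illustrative stand-in for phrase mining."""
    unigrams = Counter()
    bigrams = Counter()
    for text in abstracts:
        tokens = [t.lower() for t in tokenize(text)]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    total_uni = sum(unigrams.values())
    total_bi = sum(bigrams.values())
    scores = {}
    for (a, b), n in bigrams.items():
        if n < min_count:  # assumed frequency threshold, not from the paper
            continue
        p_ab = n / total_bi
        p_a = unigrams[a] / total_uni
        p_b = unigrams[b] / total_uni
        scores[f"{a} {b}"] = math.log(p_ab / (p_a * p_b))
    # Highest-scoring candidate phrases first
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def extract_acronyms(abstracts):
    """Count standalone all-uppercase tokens of 2 to 6 letters as acronym candidates."""
    pattern = re.compile(r"\b[A-Z]{2,6}\b")
    counts = Counter()
    for text in abstracts:
        counts.update(pattern.findall(text))
    return counts


# Hypothetical sample sentences, not drawn from the study's corpus
sample = [
    "Artificial intelligence (AI) and machine learning drive GPT models.",
    "Machine learning and artificial intelligence shape how AI is studied.",
]

phrases = mine_bigrams(sample)
acronyms = extract_acronyms(sample)
```

On a real corpus, the candidate phrases would still need filtering (for example by part of speech or stop words) before being treated as MWEs, and each accepted phrase could then be joined into a single token so that an embedding model learns one vector per expression.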
- Publisher: The Modern Linguistic Society of Korea
- Publisher (Korean): 한국현대언어학회
- Journal Title: The Journal of Studies in Language
- Journal Title (Korean): 언어연구
- Volume: 41
- No.: 2
- Pages: 109-126
- DOI: https://doi.org/10.18627/jslg.41.2.202508.109







