One-Shot Prompting for Russian Dependency Parsing
DOI: https://doi.org/10.14529/jsfi250308

Keywords: LLM, parsing, dependency tree, one-shot, prompt-tuning, Russian language

Abstract
This study investigates the application of Large Language Models (LLMs) to dependency parsing of Russian sentences. We evaluated several models (including Qwen, RuAdapt, YandexGPT, T-pro, T-lite, and Llama) in one-shot mode across multiple Russian treebanks: SynTagRus, GSD, PUD, Poetry, and Taiga. Among the models tested, Llama70 achieved the highest unlabeled and labeled attachment scores (UAS and LAS), and we observed a general trend of larger models performing better. Our analysis also revealed that parsing quality for Qwen4 and RuAdapt4 on the Taiga treebank was notably sensitive to prompt design. However, the results from all LLMs remained lower than those obtained from classical neural parsers. A key challenge for many models was a mismatch between the generated token set and the gold token set, observed in a considerable share of sentences in each treebank; in addition, the T-pro and T-lite models produced a significant number of extra output lines. The implementation for this study is publicly available at https://github.com/Derinhelm/llm_parsing/tree/main.
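The abstract does not reproduce the prompts or the scoring code, so the following is a minimal sketch of the general one-shot setup it describes: a single worked example is shown to the model, the model is asked to emit a tab-separated dependency analysis for a new sentence, and the output is scored against gold annotations with UAS and LAS. The example sentence, prompt wording, field layout, and all function names below are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of one-shot LLM dependency parsing with UAS/LAS scoring.
# NOT the paper's implementation: prompt wording, the CoNLL-U-like
# 4-field format, and all names here are illustrative assumptions.

ONE_SHOT_EXAMPLE = """Sentence: Мама мыла раму .
Parse:
1\tМама\t2\tnsubj
2\tмыла\t0\troot
3\tраму\t2\tobj
4\t.\t2\tpunct"""

def build_prompt(sentence: str) -> str:
    """Compose a one-shot prompt: one worked example, then the target."""
    return (
        "Produce a dependency parse in the tab-separated format "
        "ID, FORM, HEAD, DEPREL (HEAD=0 marks the root).\n\n"
        f"{ONE_SHOT_EXAMPLE}\n\n"
        f"Sentence: {sentence}\nParse:\n"
    )

def parse_output(text: str) -> dict[int, tuple[str, int, str]]:
    """Read model output back into {token_id: (form, head, deprel)}."""
    parsed = {}
    for line in text.strip().splitlines():
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # drop extra/garbled lines (an error mode noted above)
        idx, form, head, deprel = fields
        parsed[int(idx)] = (form, int(head), deprel)
    return parsed

def uas_las(pred: dict, gold: dict) -> tuple[float, float]:
    """UAS = share of tokens with the correct head;
    LAS = share with the correct head AND dependency label.
    Tokens whose forms differ from gold count as errors here."""
    uas_hits = las_hits = 0
    for idx, (form, head, deprel) in gold.items():
        p = pred.get(idx)
        if p is None or p[0] != form:
            continue  # generated token set differs from the gold one
        if p[1] == head:
            uas_hits += 1
            if p[2] == deprel:
                las_hits += 1
    n = len(gold) or 1
    return uas_hits / n, las_hits / n
```

Under this sketch, the token-mismatch problem the abstract mentions would show up as sentences where `parse_output` yields a token set different from the gold one; the paper's actual matching and scoring policy may differ.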