Document-Level Approach to Extracting Argumentation Structures from the Russian Texts of Scientific Communication

Authors

E. A. Sidorova, I. R. Akhmadeeva, D. V. Ilina, I. S. Kononenko, A. S. Sery, Y. A. Zagorulko

DOI:

https://doi.org/10.14529/jsfi250304

Keywords:

argument mining, document-level argument relation prediction, long-range argumentative relation, text2text generative language model, scientific communication

Abstract

This study addresses the automatic extraction of argumentative structures from Russian-language scientific communication texts. Such texts have a branched logical structure with distant references and interrelations. To handle this complexity, recent methods use the text itself as the context for extracting relations. The study presents a generative approach that reframes argumentative relation prediction as the generation of marked-up text, which makes the method end-to-end rather than a traditional pipeline. Two Russian-language corpora were used in the experiments: ruMTC, a translated corpus of microtexts, and ArgNetSC, an annotated corpus of scientific communication texts. A comparative analysis evaluated T5-architecture models trained with supervised fine-tuning (SFT) and large language models on various Russian-language datasets. To support the analysis of long texts, a sliding-window text segmentation method was proposed. The highest performance in argumentative relation extraction was consistently achieved on the corpus of microtexts. Notably, the smaller models trained with SFT and the large language models prompted to generate marked-up text performed comparably (F1 ~ 0.32–0.37). For longer texts, however, this trend did not hold: the FRED-T5 model outperformed all other models, with F1 ~ 0.23 on texts in the scientific article genre.
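The abstract mentions sliding-window segmentation for long texts without giving details. The following is a minimal sketch of the general idea, not the authors' implementation: it assumes the text is already split into sentences, and the function name `sliding_windows` and the `size` and `stride` parameters are hypothetical choices made for illustration.

```python
from typing import List


def sliding_windows(sentences: List[str], size: int = 4, stride: int = 2) -> List[List[str]]:
    """Split a list of sentences into overlapping windows.

    Assumption: windows are measured in sentences; the paper may use
    tokens or other units. `size` and `stride` are illustrative values.
    """
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    if not sentences:
        return []
    if len(sentences) <= size:
        # Short texts fit into a single window.
        return [list(sentences)]

    windows: List[List[str]] = []
    start = 0
    while True:
        windows.append(sentences[start:start + size])
        # Stop once the current window reaches the end of the text.
        if start + size >= len(sentences):
            break
        start += stride
    return windows


if __name__ == "__main__":
    text = [f"s{i}" for i in range(1, 8)]
    for window in sliding_windows(text, size=4, stride=2):
        print(window)
```

With a stride smaller than the window size, consecutive windows overlap, so a relation between nearby sentences that straddles a window boundary still appears inside at least one window. Relations spanning more than `size` sentences would still be missed under this simple scheme.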

Published

2025-12-25

How to Cite

Sidorova, E. A., Akhmadeeva, I. R., Ilina, D. V., Kononenko, I. S., Sery, A. S., & Zagorulko, Y. A. (2025). Document-Level Approach to Extracting Argumentation Structures from the Russian Texts of Scientific Communication. Supercomputing Frontiers and Innovations, 12(3), 47–62. https://doi.org/10.14529/jsfi250304