Evolution of neural text generation models: a systematic review of 2022–2024 research

Artem Slobodianiuk; Serhiy Semerikov

doi:10.31558/2786-9482.2024.2.4

Author(s)

Artem Slobodianiuk, Kryvyi Rih State Pedagogical University https://orcid.org/0009-0007-9425-1255
Serhiy Semerikov, Kryvyi Rih State Pedagogical University; Institute for Digitalisation of Education of the NAES of Ukraine; Zhytomyr Polytechnic State University; Kryvyi Rih National University; Academy of Cognitive and Natural Sciences https://orcid.org/0000-0003-0789-0272

DOI:

https://doi.org/10.31558/2786-9482.2024.2.4

Keywords:

neural text generation, deep learning, systematic review, natural language processing, metrics, datasets, low-resource languages, transformers, attention mechanisms

Abstract

Recent years have seen significant progress in neural text generation, driven by the emergence of large language models and growing interest in the field. This systematic review identifies and summarizes current trends, approaches, and methods in neural text generation for 2022–2024, complementing an earlier review that covered 2015–2021. Following the PRISMA methodology, 89 articles were initially selected from the Scopus database, of which 43 remained after the inclusion and exclusion criteria were applied. The review finds a shift in emphasis toward innovative model architectures, such as Transformer-based models (GPT-2, GPT-3, BERT), attention mechanisms, and controllable text generation. BLEU, ROUGE, and human evaluation remain the most popular metrics, although new ones have also appeared, notably BERTScore. Datasets span diverse domains and data types, and interest in unannotated data is growing. Application areas have expanded to text generation from tables and knowledge graphs, abstract generation, and machine translation. Among specific domains, medical text generation stands out. Although English continues to dominate, research on low-resource languages, including German and Chinese, is growing. The review also highlights current challenges in the field, including adapting models to low-resource languages, generating text with limited training data, and the ethical aspects of using powerful language models. The authors stress the importance of developing more efficient and interpretable architectures, improving controllable text generation methods, and creating new evaluation metrics. The results underscore the rapid evolution of neural text generation methods and the expansion of their areas of application. The review also outlines promising directions for future research that take current challenges and ethical principles into account.
