
2024 | Original Paper | Book Chapter

LongStory: Coherent, Complete and Length Controlled Long Story Generation

Authors: Kyeongman Park, Nakyeong Yang, Kyomin Jung

Published in: Advances in Knowledge Discovery and Data Mining

Publisher: Springer Nature Singapore


Abstract

A human author can write a story of any length without losing coherence and can always bring it to a proper ending, abilities that current language models lack. In this work, we present LongStory for coherent, complete, and length-controlled long story generation. LongStory introduces two novel methodologies: (1) the long and short-term contexts weight calibrator (CWC) and (2) long story structural positions (LSP). The CWC adjusts the weights of the long-term context (Memory) and the short-term context (Cheating), acknowledging their distinct roles. The LSP employs discourse tokens to convey the structural position within a long story. Trained on three datasets with varied average story lengths, LongStory outperforms other baselines, including the strong story generator PlotMachines, in coherence, completeness, relevance, and repetitiveness. We also perform zero-shot tests on each dataset to assess the model's ability to generalize beyond its training data, and we validate our methodology by comparing its performance with variants of our model.
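
The abstract describes the CWC as a learned weighting of two context representations. The following is a minimal sketch of that general idea only, assuming pooled vector representations for Memory and Cheating and a hypothetical linear scorer; the actual LongStory architecture, its layer sizes, and how the calibrated weights feed the decoder are not specified here.

```python
import torch
import torch.nn as nn

class ContextWeightCalibrator(nn.Module):
    """Hypothetical sketch of a CWC-style mixer, not the authors' exact module."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.scorer = nn.Linear(hidden_size, 1)  # assumed scoring head

    def forward(self, memory: torch.Tensor, cheating: torch.Tensor) -> torch.Tensor:
        # memory, cheating: (batch, hidden) pooled representations of the
        # long-term context ("Memory") and the short-term context ("Cheating").
        scores = torch.cat([self.scorer(memory), self.scorer(cheating)], dim=-1)
        w = torch.softmax(scores, dim=-1)               # weights sum to 1
        return w[:, :1] * memory + w[:, 1:] * cheating  # weighted mixture
```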


Footnotes
1
We cover a test in Sect. 4.3.2 where \(\alpha \) is a learnable parameter rather than a constant hyperparameter. In this test, BERT-tiny predicts \(\alpha \), \(\beta \), and \(\gamma \) independently, and each is then divided by their sum so that the three weights add up to 1.
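
A minimal sketch of this normalization, assuming the three scores predicted by BERT-tiny are non-negative scalars (that assumption, and the example values, are illustrative and not from the paper):

```python
def calibrate(alpha: float, beta: float, gamma: float) -> tuple[float, float, float]:
    # Divide each predicted score by the sum of all three so that the
    # calibrated weights add up to 1 (assumes non-negative predictions).
    total = alpha + beta + gamma
    return alpha / total, beta / total, gamma / total

# Example with hypothetical raw predictions for (alpha, beta, gamma):
print(calibrate(0.8, 0.5, 0.2))  # (0.533..., 0.333..., 0.133...)
```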
 
5
Note that the in-self-BLEU score is not the same as the self-BLEU score [6]. Self-BLEU takes one whole generated document as the hypothesis and the other documents as references, so it cannot capture repetitiveness within a single document.
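
As an illustration, here is a minimal sketch of how a within-document score of this kind could be computed with NLTK, assuming it treats each sentence of one generated story as the hypothesis and the remaining sentences of the same story as references; the sentence tokenization and smoothing choices are assumptions, not the paper's exact setup.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def in_self_bleu(sentences: list[list[str]]) -> float:
    # sentences: one generated story, pre-split into sentences of tokens.
    # Each sentence is scored against the other sentences of the SAME story,
    # so higher values indicate repetition inside the story.
    smooth = SmoothingFunction().method1
    scores = []
    for i, hyp in enumerate(sentences):
        refs = [s for j, s in enumerate(sentences) if j != i]
        scores.append(sentence_bleu(refs, hyp, smoothing_function=smooth))
    return sum(scores) / len(scores)
```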
 
References
2. Lin, Z., Riedl, M.O.: Plug-and-blend: a framework for plug-and-play controllable story generation with sketches. In: Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, vol. 17, no. 1, pp. 58–65 (2021)
4. OpenAI: GPT-4 technical report (2023)
5. Peng, N., Ghazvininejad, M., May, J., Knight, K.: Towards controllable story generation. In: Proceedings of the First Workshop on Storytelling, pp. 43–49 (2018)
6. Rashkin, H., Celikyilmaz, A., Choi, Y., Gao, J.: PlotMachines: outline-conditioned generation with dynamic plot state tracking. arXiv preprint arXiv:2004.14967 (2020)
7. Yang, K., Peng, N., Tian, Y., Klein, D.: Re3: generating longer stories with recursive reprompting and revision. arXiv preprint arXiv:2210.06774 (2022)
8. Tang, C., Lin, C., Huang, H., Guerin, F., Zhang, Z.: EtriCA: event-triggered context-aware story generation augmented by cross attention. arXiv preprint arXiv:2210.12463 (2022)
9. Brown, T., et al.: Language models are few-shot learners. In: Advances in Neural Information Processing Systems, vol. 33, pp. 1877–1901 (2020)
10. Lewis, M., et al.: BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv preprint arXiv:1910.13461 (2019)
11. Rose, S., Engel, D., Cramer, N., Cowley, W.: Automatic keyword extraction from individual documents. In: Text Mining: Applications and Theory, pp. 1–20 (2010)
12. Wang, S., Durrett, G., Erk, K.: Narrative interpolation for generating and understanding stories. arXiv preprint arXiv:2008.07466 (2020)
13. Yang, K., Klein, D., Peng, N., Tian, Y.: DOC: improving long story coherence with detailed outline control. arXiv preprint arXiv:2212.10077 (2022)
14. Kryściński, W., Rajani, N., Agarwal, D., Xiong, C., Radev, D.: BookSum: a collection of datasets for long-form narrative summarization. arXiv preprint arXiv:2105.08209 (2021)
15. Yao, L., Peng, N., Weischedel, R., Knight, K., Zhao, D., Yan, R.: Plan-and-write: towards better automatic storytelling. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, no. 01, pp. 7378–7385 (2019)
16. Alabdulkarim, A., Li, W., Martin, L.J., Riedl, M.O.: Goal-directed story generation: augmenting generative language models with reinforcement learning. arXiv preprint arXiv:2112.08593 (2021)
17. Pradyumna, T., Murtaza, D., Lara, J.M., Mehta, A., Harrison, B.: Controllable neural story plot generation via reward shaping. In: Proceedings of the International Joint Conference on Artificial Intelligence, pp. 5982–5988 (2019)
18. Guan, J., Huang, F., Zhao, Z., Zhu, X., Huang, M.: A knowledge-enhanced pretraining model for commonsense story generation. Trans. Assoc. Comput. Linguist. 8, 93–108 (2020)
19. Peng, X., Li, S., Wiegreffe, S., Riedl, M.: Inferring the reader: guiding automated story generation with commonsense reasoning. arXiv preprint arXiv:2105.01311 (2021)
20. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
21. Lin, C.Y.: ROUGE: a package for automatic evaluation of summaries. In: Text Summarization Branches Out, pp. 74–81 (2004)
22. Papineni, K., Roukos, S., Ward, T., Zhu, W.J.: BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pp. 311–318 (2002)
23. Safovich, Y., Azaria, A.: Fiction sentence expansion and enhancement via focused objective and novelty curve sampling. In: 2020 IEEE 32nd International Conference on Tools with Artificial Intelligence (ICTAI), pp. 835–843. IEEE (2020)
24. Li, J., Bing, L., Qiu, L., Chen, D., Zhao, D., Yan, R.: Learning to write stories with thematic consistency and wording novelty. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, no. 01, pp. 1715–1722 (2019)
25. Hu, Z., Chan, H.P., Liu, J., Xiao, X., Wu, H., Huang, L.: PLANET: dynamic content planning in autoregressive transformers for long-form text generation. arXiv preprint arXiv:2203.09100 (2022)
26.
27. Sakaguchi, K., Bhagavatula, C., Bras, R.L., Tandon, N., Clark, P., Choi, Y.: proScript: partially ordered scripts generation via pre-trained language models. arXiv preprint arXiv:2104.08251 (2021)
28. Budzianowski, P., Vulić, I.: Hello, it's GPT-2 – how can I help you? Towards the use of pretrained language models for task-oriented dialogue systems. arXiv preprint arXiv:1907.05774 (2019)
29. Welleck, S., Kulikov, I., Kim, J., Pang, R.Y., Cho, K.: Consistency of a recurrent language model with respect to incomplete decoding. arXiv preprint arXiv:2002.02492 (2020)
30. Zellers, R., et al.: Defending against neural fake news. In: Advances in Neural Information Processing Systems, vol. 32 (2019)
31. Guan, J., Mao, X., Fan, C., Liu, Z., Ding, W., Huang, M.: Long text generation by modeling sentence-level and discourse-level coherence. arXiv preprint arXiv:2105.08963 (2021)
32. Lewis, P., et al.: Retrieval-augmented generation for knowledge-intensive NLP tasks. In: Advances in Neural Information Processing Systems, vol. 33, pp. 9459–9474 (2020)
33. McCoy, R.T., Smolensky, P., Linzen, T., Gao, J., Celikyilmaz, A.: How much do language models copy from their training data? Evaluating linguistic novelty in text generation using RAVEN. Trans. Assoc. Comput. Linguist. 11, 652–670 (2023)
Metadata
Title
LongStory: Coherent, Complete and Length Controlled Long Story Generation
Authors
Kyeongman Park
Nakyeong Yang
Kyomin Jung
Copyright year
2024
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-97-2253-2_15
