
2024 | Original Paper | Book Chapter

LongStory: Coherent, Complete and Length Controlled Long Story Generation

Authors: Kyeongman Park, Nakyeong Yang, Kyomin Jung

Published in: Advances in Knowledge Discovery and Data Mining

Publisher: Springer Nature Singapore


Abstract

A human author can write a story of any length without losing coherence and can always bring it to a proper ending, abilities that current language models lack. In this work, we present LongStory for coherent, complete, and length-controlled long story generation. LongStory introduces two novel methodologies: (1) the long and short-term contexts weight calibrator (CWC) and (2) long story structural positions (LSP). The CWC adjusts the weights of the long-term context (Memory) and the short-term context (Cheating), acknowledging their distinct roles. The LSP employs discourse tokens to convey the structural position within a long story. Trained on three datasets with varied average story lengths, LongStory outperforms other baselines, including the strong story generator PlotMachines, in coherence, completeness, relevance, and repetitiveness. We also perform zero-shot tests on each dataset to assess the model's ability to generalize beyond its training data, and we validate our methodology by comparing its performance with variants of our model.
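
The abstract describes the CWC as a learned weighting of two context representations. The following is a minimal sketch of that general idea only, assuming pooled vector representations for Memory and Cheating and a hypothetical linear scorer; the actual LongStory architecture, its layer sizes, and how the calibrated weights feed the decoder are not specified here.

```python
import torch
import torch.nn as nn

class ContextWeightCalibrator(nn.Module):
    """Hypothetical sketch of a CWC-style mixer, not the authors' exact module."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.scorer = nn.Linear(hidden_size, 1)  # assumed scoring head

    def forward(self, memory: torch.Tensor, cheating: torch.Tensor) -> torch.Tensor:
        # memory, cheating: (batch, hidden) pooled representations of the
        # long-term context ("Memory") and the short-term context ("Cheating").
        scores = torch.cat([self.scorer(memory), self.scorer(cheating)], dim=-1)
        w = torch.softmax(scores, dim=-1)               # weights sum to 1
        return w[:, :1] * memory + w[:, 1:] * cheating  # weighted mixture
```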


Footnotes
1
We cover a test in Sect. 4.3.2 where \(\alpha \) is a learnable parameter rather than a constant hyperparameter. In this test, BERT-tiny predicts \(\alpha \), \(\beta \), and \(\gamma \) independently, and each is then divided by their sum so that the three weights add up to 1.
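
A minimal sketch of this normalization, assuming the three scores predicted by BERT-tiny are non-negative scalars (that assumption, and the example values, are illustrative and not from the paper):

```python
def calibrate(alpha: float, beta: float, gamma: float) -> tuple[float, float, float]:
    # Divide each predicted score by the sum of all three so that the
    # calibrated weights add up to 1 (assumes non-negative predictions).
    total = alpha + beta + gamma
    return alpha / total, beta / total, gamma / total

# Example with hypothetical raw predictions for (alpha, beta, gamma):
print(calibrate(0.8, 0.5, 0.2))  # (0.533..., 0.333..., 0.133...)
```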
 
5
Note that the in-self-BLEU score is not the same as the self-BLEU score [6]. Self-BLEU takes one whole generated document as the hypothesis and the other documents as references, so it cannot capture repetitiveness within a single document.
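
As an illustration, here is a minimal sketch of how a within-document score of this kind could be computed with NLTK, assuming it treats each sentence of one generated story as the hypothesis and the remaining sentences of the same story as references; the sentence tokenization and smoothing choices are assumptions, not the paper's exact setup.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def in_self_bleu(sentences: list[list[str]]) -> float:
    # sentences: one generated story, pre-split into sentences of tokens.
    # Each sentence is scored against the other sentences of the SAME story,
    # so higher values indicate repetition inside the story.
    smooth = SmoothingFunction().method1
    scores = []
    for i, hyp in enumerate(sentences):
        refs = [s for j, s in enumerate(sentences) if j != i]
        scores.append(sentence_bleu(refs, hyp, smoothing_function=smooth))
    return sum(scores) / len(scores)
```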
 
References
2. Lin, Z., Riedl, M.O.: Plug-and-blend: a framework for plug-and-play controllable story generation with sketches. In: Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, vol. 17, no. 1, pp. 58–65 (2021)
4. OpenAI: GPT-4 technical report (2023)
5. Peng, N., Ghazvininejad, M., May, J., Knight, K.: Towards controllable story generation. In: Proceedings of the First Workshop on Storytelling, pp. 43–49 (2018)
6. Rashkin, H., Celikyilmaz, A., Choi, Y., Gao, J.: PlotMachines: outline-conditioned generation with dynamic plot state tracking. arXiv preprint arXiv:2004.14967 (2020)
7. Yang, K., Peng, N., Tian, Y., Klein, D.: Re3: generating longer stories with recursive reprompting and revision. arXiv preprint arXiv:2210.06774 (2022)
8. Tang, C., Lin, C., Huang, H., Guerin, F., Zhang, Z.: EtriCA: event-triggered context-aware story generation augmented by cross attention. arXiv preprint arXiv:2210.12463 (2022)
9. Brown, T., et al.: Language models are few-shot learners. In: Advances in Neural Information Processing Systems, vol. 33, pp. 1877–1901 (2020)
10. Lewis, M., et al.: BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv preprint arXiv:1910.13461 (2019)
11. Rose, S., Engel, D., Cramer, N., Cowley, W.: Automatic keyword extraction from individual documents. In: Text Mining: Applications and Theory, pp. 1–20 (2010)
12. Wang, S., Durrett, G., Erk, K.: Narrative interpolation for generating and understanding stories. arXiv preprint arXiv:2008.07466 (2020)
13. Yang, K., Klein, D., Peng, N., Tian, Y.: DOC: improving long story coherence with detailed outline control. arXiv preprint arXiv:2212.10077 (2022)
14. Kryściński, W., Rajani, N., Agarwal, D., Xiong, C., Radev, D.: BookSum: a collection of datasets for long-form narrative summarization. arXiv preprint arXiv:2105.08209 (2021)
15. Yao, L., Peng, N., Weischedel, R., Knight, K., Zhao, D., Yan, R.: Plan-and-write: towards better automatic storytelling. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, no. 01, pp. 7378–7385 (2019)
16. Alabdulkarim, A., Li, W., Martin, L.J., Riedl, M.O.: Goal-directed story generation: augmenting generative language models with reinforcement learning. arXiv preprint arXiv:2112.08593 (2021)
17. Pradyumna, T., Murtaza, D., Lara, J.M., Mehta, A., Harrison, B.: Controllable neural story plot generation via reward shaping. In: Proceedings of the International Joint Conference on Artificial Intelligence, pp. 5982–5988 (2019)
18. Guan, J., Huang, F., Zhao, Z., Zhu, X., Huang, M.: A knowledge-enhanced pretraining model for commonsense story generation. Trans. Assoc. Comput. Linguist. 8, 93–108 (2020)
19. Peng, X., Li, S., Wiegreffe, S., Riedl, M.: Inferring the reader: guiding automated story generation with commonsense reasoning. arXiv preprint arXiv:2105.01311 (2021)
20. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
21. Lin, C.Y.: ROUGE: a package for automatic evaluation of summaries. In: Text Summarization Branches Out, pp. 74–81 (2004)
22. Papineni, K., Roukos, S., Ward, T., Zhu, W.J.: BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pp. 311–318 (2002)
23. Safovich, Y., Azaria, A.: Fiction sentence expansion and enhancement via focused objective and novelty curve sampling. In: 2020 IEEE 32nd International Conference on Tools with Artificial Intelligence (ICTAI), pp. 835–843. IEEE (2020)
24. Li, J., Bing, L., Qiu, L., Chen, D., Zhao, D., Yan, R.: Learning to write stories with thematic consistency and wording novelty. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, no. 01, pp. 1715–1722 (2019)
25. Hu, Z., Chan, H.P., Liu, J., Xiao, X., Wu, H., Huang, L.: PLANET: dynamic content planning in autoregressive transformers for long-form text generation. arXiv preprint arXiv:2203.09100 (2022)
26.
27. Sakaguchi, K., Bhagavatula, C., Bras, R.L., Tandon, N., Clark, P., Choi, Y.: proScript: partially ordered scripts generation via pre-trained language models. arXiv preprint arXiv:2104.08251 (2021)
28. Budzianowski, P., Vulić, I.: Hello, it's GPT-2 – how can I help you? Towards the use of pretrained language models for task-oriented dialogue systems. arXiv preprint arXiv:1907.05774 (2019)
29. Welleck, S., Kulikov, I., Kim, J., Pang, R.Y., Cho, K.: Consistency of a recurrent language model with respect to incomplete decoding. arXiv preprint arXiv:2002.02492 (2020)
30. Zellers, R., et al.: Defending against neural fake news. In: Advances in Neural Information Processing Systems, vol. 32 (2019)
31. Guan, J., Mao, X., Fan, C., Liu, Z., Ding, W., Huang, M.: Long text generation by modeling sentence-level and discourse-level coherence. arXiv preprint arXiv:2105.08963 (2021)
32. Lewis, P., et al.: Retrieval-augmented generation for knowledge-intensive NLP tasks. In: Advances in Neural Information Processing Systems, vol. 33, pp. 9459–9474 (2020)
33. McCoy, R.T., Smolensky, P., Linzen, T., Gao, J., Celikyilmaz, A.: How much do language models copy from their training data? Evaluating linguistic novelty in text generation using RAVEN. Trans. Assoc. Comput. Linguist. 11, 652–670 (2023)
Metadata
Title
LongStory: Coherent, Complete and Length Controlled Long Story Generation
Authors
Kyeongman Park
Nakyeong Yang
Kyomin Jung
Copyright year
2024
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-97-2253-2_15
