
2024 | Original Paper | Book Chapter

From Low Resource Information Extraction to Identifying Influential Nodes in Knowledge Graphs

Authors: Erica Cai, Olga Simek, Benjamin A. Miller, Danielle Sullivan, Evan Young, Christopher L. Smith

Published in: Complex Networks XV

Publisher: Springer Nature Switzerland


Abstract

We propose a pipeline for identifying important entities from intelligence reports that constructs a knowledge graph, where nodes correspond to entities of fine-grained types (e.g. traffickers) extracted from the text and edges correspond to extracted relations between entities (e.g. cartel membership). The important entities in intelligence reports then map to central nodes in the knowledge graph. We introduce a novel method that extracts fine-grained entities in a few-shot setting (few labeled examples), given limited resources available to label the frequently changing entity types that intelligence analysts are interested in. It outperforms other state-of-the-art methods. Next, we identify challenges facing previous evaluations of zero-shot (no labeled examples) methods for extracting relations, affecting the step of populating edges. Finally, we explore the utility of the pipeline: given the goal of identifying important entities, we evaluate the impact of relation extraction errors on the identification of central nodes in several real and synthetic networks. The impact of these errors varies significantly by graph topology, suggesting that confidence in measurements based on automatically extracted relations should depend on observed network features.
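The last two steps of the pipeline (mapping extracted relations to edges, then checking how extraction errors shift the central nodes) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say which centrality measure is used, so degree centrality stands in for it; the triples, the relation names, and the helper names `build_graph`, `degree_centrality`, `top_k`, `drop_edges`, and `top_k_overlap` are hypothetical; and missed relations are simulated by dropping triples at random.

```python
import random
from collections import defaultdict


def build_graph(triples):
    """Build an undirected adjacency map from (head, relation, tail) triples."""
    adj = defaultdict(set)
    for head, _relation, tail in triples:
        if head != tail:  # ignore self-loops
            adj[head].add(tail)
            adj[tail].add(head)
    return adj


def degree_centrality(adj):
    """Degree centrality: number of neighbors divided by (n - 1).

    Assumption: the abstract does not name a centrality measure;
    degree centrality is used here only as a simple stand-in.
    """
    n = len(adj)
    if n <= 1:
        return {node: 0.0 for node in adj}
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def top_k(scores, k):
    """Return the k highest-scoring nodes, breaking ties by node name."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [node for node, _ in ranked[:k]]


def drop_edges(triples, miss_rate, seed=0):
    """Simulate missed relations by dropping each triple with probability miss_rate."""
    rng = random.Random(seed)
    return [t for t in triples if rng.random() >= miss_rate]


def top_k_overlap(a, b):
    """Jaccard overlap between two top-k node lists."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0


# Hypothetical extracted triples (not taken from the paper).
triples = [
    ("A", "member_of", "Cartel"),
    ("B", "member_of", "Cartel"),
    ("C", "member_of", "Cartel"),
    ("A", "works_with", "B"),
    ("C", "supplies", "D"),
]

scores = degree_centrality(build_graph(triples))
central = top_k(scores, 2)

# Compare the central nodes of the clean graph with those of a noisy copy.
noisy = drop_edges(triples, miss_rate=0.3, seed=1)
noisy_central = top_k(degree_centrality(build_graph(noisy)), 2)
overlap = top_k_overlap(central, noisy_central)
```

Repeating the comparison over many seeds and over graphs with different topologies would give a rough picture of how sensitive the top-ranked nodes are to extraction errors, which is the kind of question the abstract raises.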


Metadata
Title
From Low Resource Information Extraction to Identifying Influential Nodes in Knowledge Graphs
Authors
Erica Cai
Olga Simek
Benjamin A. Miller
Danielle Sullivan
Evan Young
Christopher L. Smith
Copyright Year
2024
DOI
https://doi.org/10.1007/978-3-031-57515-0_2
