2024 | OriginalPaper | Book chapter

Imputation Analysis of Time-Series Data Using a Random Forest Algorithm

Written by: Nur Najmiyah Jaafar, Muhammad Nur Ajmal Rosdi, Khairur Rijal Jamaludin, Faizir Ramlie, Habibah Abdul Talib

Published in: Intelligent Manufacturing and Mechatronics

Publisher: Springer Nature Singapore

Abstract

Missing data poses a significant challenge in large datasets, particularly those containing time-series information, and can lead to inaccuracies in data analysis and machine learning model development. To address this issue, this paper compares and evaluates four imputation methods, MissForest, MICE, Simplefill, and Softimpute, in a study that utilizes the Random Forest algorithm. The research examines the impact of missing ratios and temporal variations on the performance of each method. The results indicate that MissForest consistently outperforms the other methods, exhibiting the lowest RMSE values and a high coefficient of determination (R²), which reflects its accuracy and its ability to explain the variation in the data. Graphical analyses further demonstrate the stability of MissForest over time, whereas MICE and Simplefill are more sensitive to date changes. Softimpute is relatively consistent but performs slightly below MissForest. Overall, the study identifies MissForest as the preferred imputation method for AVL time-series data.
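The evaluation idea described in the abstract (hide a known fraction of values, impute them, then score the imputed values with RMSE and R²) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the synthetic sine series, the function names, and the two stand-in imputers (a mean fill and a linear interpolation) are all hypothetical and chosen only so the example runs with the Python standard library. MissForest, MICE, and Softimpute themselves are not implemented here.

```python
import math
import random
from statistics import mean


def mask_values(series, missing_ratio, seed=0):
    """Hide a fraction of entries (set to None); return masked copy and hidden indices."""
    rng = random.Random(seed)
    n_missing = int(round(missing_ratio * len(series)))
    hidden = sorted(rng.sample(range(len(series)), n_missing))
    masked = list(series)
    for i in hidden:
        masked[i] = None
    return masked, hidden


def mean_fill(masked):
    """Stand-in imputer: replace each missing entry with the mean of observed entries."""
    fill = mean(v for v in masked if v is not None)
    return [fill if v is None else v for v in masked]


def linear_fill(masked):
    """Stand-in imputer: interpolate between nearest observed neighbours,
    copying the nearest observed value at the series edges."""
    observed = [i for i, v in enumerate(masked) if v is not None]
    filled = list(masked)
    for i, v in enumerate(masked):
        if v is not None:
            continue
        left = max((j for j in observed if j < i), default=None)
        right = min((j for j in observed if j > i), default=None)
        if left is None:
            filled[i] = masked[right]
        elif right is None:
            filled[i] = masked[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = masked[left] + w * (masked[right] - masked[left])
    return filled


def rmse(truth, estimate):
    """Root mean squared error between true and imputed values."""
    return math.sqrt(mean((t - e) ** 2 for t, e in zip(truth, estimate)))


def r_squared(truth, estimate):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    t_mean = mean(truth)
    ss_res = sum((t - e) ** 2 for t, e in zip(truth, estimate))
    ss_tot = sum((t - t_mean) ** 2 for t in truth)
    return 1.0 - ss_res / ss_tot


def evaluate(series, impute, missing_ratio, seed=0):
    """Score an imputer on the hidden entries only."""
    masked, hidden = mask_values(series, missing_ratio, seed)
    filled = impute(masked)
    truth = [series[i] for i in hidden]
    estimate = [filled[i] for i in hidden]
    return rmse(truth, estimate), r_squared(truth, estimate)


# Synthetic series with a 24-step cycle (illustrative data, not from the paper).
series = [10.0 + 3.0 * math.sin(2 * math.pi * t / 24) for t in range(240)]

for ratio in (0.1, 0.3, 0.5):
    for name, imputer in (("mean", mean_fill), ("linear", linear_fill)):
        score_rmse, score_r2 = evaluate(series, imputer, ratio)
        print(f"missing={ratio:.0%} method={name:<6} RMSE={score_rmse:.3f} R2={score_r2:.3f}")
```

Scoring only the hidden positions keeps the metrics from being inflated by values that were never missing. Any other imputer with the same list-in, list-out shape could be passed to `evaluate` in place of the two stand-ins.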

Metadata
Title
Imputation Analysis of Time-Series Data Using a Random Forest Algorithm
Written by
Nur Najmiyah Jaafar
Muhammad Nur Ajmal Rosdi
Khairur Rijal Jamaludin
Faizir Ramlie
Habibah Abdul Talib
Copyright year
2024
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-99-8819-8_4