
2024 | Original Paper | Book Chapter

On the Interaction Between Software Engineers and Data Scientists When Building Machine Learning-Enabled Systems

Written by: Gabriel Busquim, Hugo Villamizar, Maria Julia Lima, Marcos Kalinowski

Published in: Software Quality as a Foundation for Security

Publisher: Springer Nature Switzerland


Abstract

In recent years, Machine Learning (ML) components have been increasingly integrated into the core systems of organizations. Engineering such systems presents various challenges from both a theoretical and a practical perspective. One of the key challenges is the effective interaction between actors with different backgrounds who need to work closely together, such as software engineers and data scientists. This paper presents an exploratory case study that aims to understand the current interaction and collaboration dynamics between these two roles in ML projects. We conducted semi-structured interviews with four practitioners from a large ML-enabled system project who have experience in software engineering and data science, and analyzed the data using reflexive thematic analysis. Our findings reveal several challenges that can hinder collaboration between software engineers and data scientists, including differences in technical expertise, unclear definitions of each role’s duties, and the lack of documents that support the specification of the ML-enabled system. We also indicate potential solutions to address these challenges, such as fostering a collaborative culture, encouraging team communication, and producing concise system documentation. This study contributes to understanding the complex dynamics between software engineers and data scientists in ML projects and provides insights for improving collaboration and communication in this context. We encourage future studies to investigate this interaction in other projects.


Metadata
Title
On the Interaction Between Software Engineers and Data Scientists When Building Machine Learning-Enabled Systems
Written by
Gabriel Busquim
Hugo Villamizar
Maria Julia Lima
Marcos Kalinowski
Copyright Year
2024
DOI
https://doi.org/10.1007/978-3-031-56281-5_4
