nach oben

Service Oriented Computing and Applications

Erschienen in:

Open Access 09.06.2023 | Editorial

Service composition in the ChatGPT era

verfasst von: Marco Aiello, Ilche Georgievski

Erschienen in: Service Oriented Computing and Applications | Ausgabe 4/2023

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Patentsuche

Aus

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

ChatGPT recently attracted vast attention in and outside the research community for its conversational abilities that mimic human ones exceptionally well. At the heart of systems like ChatGPT are Large Language Models (LLM). These models, rooted in deep neural networks, have the ability to predict the next textual token in a series of tokens based on statistical occurrences in extremely large data sets [1]. When the models are sufficiently big and well-tuned, one observes the “unreasonable effectiveness of data” [2] in how the system generates perfectly intelligible and believable sentences. Such ability to have human-like conversations with a software system is both stunning for the quality of the conversation and mind-blowing in terms of the potential impact on society and the job market in particular [3, 4]. There is controversy on whether or not such systems manifest forms of artificial intelligence. Researchers at Microsoft, for instance, attribute signs of intelligence to the current fourth version of Generative Pre-trained Transformer (GPT-4), which is in development at the time of writing [5]. Some authors have successfully solved Theory of Mind tasks using such tools. Kosinski reports a success rate of 95% using GPT-4 in solving false-belief tasks. Other authors are more careful with the excessive anthropomorphization of ChatGPT-like systems [6, 7]. What is sure is that the embedding of an LLM into a system makes it a very powerful tool. Of interest to us in this editorial is the LLM capability to generate programs [8] and its potential impact on Service-Oriented Computing and Applications.

The problem of automated service composition is central to the field of Service-Oriented Computing and Applications [9]. Seamless, unsupervised, automated composition of services available on a network is a potent way to build adaptive information systems. It is the idea that one can execute any task relying on multiple, loosely coupled services, possibly without prior knowledge of such services. This would guarantee that virtually any task can be executed by relying on third-party implementations and resources.

Such a vision of automation is rooted in the fields of software engineering and component-based software engineering. It emerged in a fervent moment of technological evolution. At the beginning of the century, the Internet was becoming pervasive, and the Web emerged as a central technology also for businesses. Every company was moving toward having a Web presence, which also meant having a web server with a port always open as a gateway to their enterprise information systems [10]. Standardized interoperation was also emerging as the Web language HTML was being stripped of its layout and presentation aspects and turned into a platform-agnostic data description language, XML. Specific dialects of XML were proposed to describe wiring of messages for remote interoperation (SOAP), for describing services (WSDL), for describing service orchestrations (BPEL), and service discovery infrastructure was also proposed (UDDI) [11]. The core pattern behind using such technology is that services are published, then they can be discovered, and once found, they can be interacted with (publish-find-bind).

The technological availability of XML-described services inspired many researchers to propose techniques for achieving automated service composition, see [12] for an early survey of the systems. One of the biggest challenges that emerged from the beginning was the necessity to understand what a service was capable of by simply looking at the signature of its interfaces. While the format and type of the exchange data are declared, the service’s actual implementation and business goals are unknown to the service composition engine. This means that either the proposed systems were just doing syntactic matching of the interfaces but could not really provide guarantees of what the composition would do, or some additional semantic information regarding the services had to be (manually) provided. Both the syntactic and the semantic approaches would prove to have drawbacks that made the proposals inapplicable in practice. However, the road of providing additional semantics to the service description was the more intriguing and practiced one. The very strong assumption is that all service implementations come with additional descriptions of the preconditions for running a service and the outcome of invoking the service. In other words, if one invokes a weather service, a time and place need to be provided, and in the end, one will know the temperature in that place at that time. Notice that this is much more information than simply asking to provide three inputs, two floats and a string, and receive back as output an integer.

While semantic service annotations describing the inner working of the services would have made automated service composition much more realistic, the truth is that the world went in another direction and such descriptions were never broadly available. It was not the precise semantic descriptions that would come but a huge amount of data that, taken jointly, would be statistically meaningful and would provide enough information for performing basic tasks. This is how the first versions of the Google search engine become successful, not by understanding the content of the webpages, but by looking at the frequency of words and link structure of the Web. The promises of ChatGPT today come from the same place of data abundance and statistical relevance.¹

There is no point to go against technological evolution. The approach of adding semantic information to service descriptions (and web pages via the Semantic Web, for that matter) was never successful and was superseded by the simple abundance of data, directly or indirectly generated by humans. The question then becomes: Can such data abundance help in automated service composition?

In the vision paper about service composition for the next 50 years, we draw a parallel between the levels of autonomous driving and the development of autonomous service composition, shown in Fig. 1, from [13]. The reasoning is that, similarly to autonomous driving, the automation of service composition goes through levels and needs technological innovations to advance from one step to another. Machine learning and computer vision advancements have helped cars achieve “Partial Automation,” i.e., Level 2. For automated service composition, we have been long stuck with systems of Levels 1-2, and after the initial results of a couple of decades ago, not much innovation has emerged. LLM promises to be a technological innovation that can revive the field and bring it from Level 2 to higher levels of automation. Let us look at an example to see what is already possible today with an available tool such as ChatGPT.

The classic examples of service composition have to do with traveling around the world. Say that one wants a composition that given a country provides the weather in its capital city and the currency conversion rate to the Euro. This requires interaction with several Web services about weather, geography, and currency conversion. It requires passing parameters around that have to fit the syntax and data types of each service and requires knowing that a country has a capital and a currency. The traditional approach to solve this composition is for an engineer to search for the available services, explore their APIs, and then write some orchestration of the services in a workflow language or in some programming language to be run in an application container. This would cost a developer between a few hours and a day, based on his experience and abilities. Could ChatGPT do it for us? We could phrase the composition as: “Given the name of a country, find what the weather is in the capital using the API of https://open-meteo.com and what is the value of the local currency for one euro, and show the city with the currency value on Google Maps via the Google Maps API. In Python, please.”

If we use GPT version 3.5 (GPT\(-\)3.5) and provide it the text in quotes above, we do get Python code! This is both surprising and impressive. The code, generated on May 10, 2023, is displayed in Fig. 2, and it is generated in less than a minute. It looks ready for execution, though it does need editing, most notably, one has to add the API keys for the identified services and create a container for running the code. In other terms, GPT\(-\)3.5 has automated an important number of aspects related to composition. It has performed service discovery. The query did mention where to find the map and weather services but not where to find the currency conversion service. The system has interpreted the interfaces and was able to create input and output parameters. It did even more; it created a function to extract the text identifying the capital of a country given the country’s name. Finally, it has created the “business logic” by ordering the invocations and returning both the values and an invocation of the web browser with the requested results. There is though a small problem: It has invented some parts of the APIs. For Open-Meteo, no API key is required, while the generated code requires and appends an API key to the endpoint URL. Furthermore, the weather API requires a location to be specified using longitude and latitude coordinates, while the generated code appends a city name to the endpoint URL. The rest of the endpoint URL is also faulty. In summary, ChatGPT generated an endpoint URL with non-existing, faulty, and incomplete parameters. Looking at the generated code responsible for handling the message returned by the weather API, it correctly assumes that the message is in a JSON format; however, the code tries to parse wrong or non-existing fields in the received JSON message. Notice that ChatGPT generated code for acquiring a capital city of the given country. It does so by a call to an Open-Meteo’s endpoint that does not exist (see the get_capital(country) function). The code even tries to extract the name of the capital from a JSON message. Similar observations can be made for the other APIs.

While it is impressive that ChatGPT was able to discover services to perform the composition, in our case, Exchange Rates API, that it could “understand” the logic of the request and generate the Python code for the composition, that it “understood” how to send requests to the services; it has difficulties interpreting the interfaces of the services by using wrong and non-existing parameters and has made a call to a non-existing API, a phenomenon often called hallucination. We can engage in a chat with the system to try to refine the generated code to be correct. The process is somewhat frustrating as it requires many iterations to only fix the problem partially. For example, we needed to confabulate with ChatGPT just to let it know that Open-Meteo cannot work with a city name and needs a city’s longitude and latitude coordinates instead. Eventually, ChatGPT corrected this by engaging a fourth service, namely OpenStreetMap, to get the coordinates of a given city. We can also simply correct the code manually and have the composition ready for execution. The next step is to deploy the code and let it orchestrate the services. We can resort to ChatGPT also for this task by stating something like “Can you write Kubernetes code to deploy the above Python program?”, and we get the mostly correct Kubernetes YAML code presented in Fig. 3 together with instructions on how to customize the YAML code and the Unix shell commands to run (for brevity reasons not shown here).

This short experience shows that ChatGPT is capable of service discovery, generating compositions, and service interface interpretation. This is a very powerful automation that potentially eliminates the need for service registries (e.g., UDDI), composition languages (e.g., BPEL), and interface description languages (e.g., WSDL). It also shows that it can take care of the deployment aspects of the code. It also shows that the output is not entirely correct and iterations and supervision by a domain expert are still necessary. Going back to the maturity level diagram shown in Fig. 1, ChatGPT shows very high potential to boost the state of the art beyond Level 2 and be at the heart of tools that can automate the creation of service-oriented systems with some small supervision. It shows that ChatGPT can be the type of tool that will move automated service composition from its current state to Automation Level 3.

To meet such promise, a number of research directions emerge. These are new to the field of automated service composition.

Prompt engineering. What is the best way to formulate a service composition request? If it is natural language, how should this be done effectively? And how to refine such requests through interaction in order to possibly converge to a correct output? Can one integrate natural language, formal specifications, and programming languages into one request? Is there a way the system can ask back questions to the user in order to resolve ambiguities or composition alternatives?
Composition verification and testing. Compositions generated using a formal language approach are correct by design (e.g., [14]). This is not true for compositions generated by ChatGPT. How does one perform verification and testing of compositions generated by LLM-like tools? How can one ensure that the composition is satisfying the original request of the user, especially if expressed in natural language?
Execution monitoring. Once the composition is deployed and in execution, how can one monitor that it is actually performing as intended? How can it deal with run-time contingencies and non-deterministic and Byzantine service behaviors? These are general problems of composition executions, the new aspect here is to investigate if LLM can be also a tool to address these issues, and if so, how.

After the enthusiasm at the beginning of the century about automating service composition, the field has slowed down, and not much progress has been made in the last decade. We believe that the emergence of tools like ChatGPT will have an important impact on automated service composition and its actual practical use, if the open research challenges, including the ones listed above, are addressed. More research and development will show whether the claim of moving soon to Level 3 is correct or not. As a final note, we state that, except for the output screenshots shown in Figs. 2 and 3, no text of this article has been generated by ChatGPT, but it is entirely the responsibility—for good or for worse—of the authors.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Nächster Artikel Recommendation-based trust computation and rating prediction model for security enhancement in cloud computing systems

Incidentally, the fortune of the Semantic Web followed a similar destiny to that of automated service composition for basically the same reasons. The Semantic Web is the technological evolution of the Web where textual content is augmented with semantic annotations [10]. The annotations refer to formally defined ontological information on which one can reason. The idea behind its definition was to add descriptions of the meaning of terms to the Web so that more precise searches could be performed and potentially also reasoning. But again, the world did not find enough value in adding such descriptions and the Semantic Web technology was never widely adopted.

Wolfram S (2023) What is ChatGPT doing ... and why does it work? Wolfram Research, Incorporated, Champaign, Illinois

Halevy A, Norvig P, Pereira F (2009) The unreasonable effectiveness of data. IEEE Intell Syst 24(2):8–12CrossRef

Eloundou T, Manning S, Mishkin P, Rock D (2023) GPTs are GPTs: an early look at the labor market impact potential of large language models. Technical Report arXiv:2303.10130

Smith MS (2023) AI can’t take over everyone’s jobs soon (if ever). IEEE Spectrum, 4

Bubeck S, Chandrasekaran V, Eldan R, Gehrke J, Horvitz E, Kamar E, Lee P, Lee YT, Li Y, Lundberg S, Nori H, Palangi H, Ribeiro MT, Zhang Y (2023) Sparks of Artificial General Intelligence: early experiments with GPT-4. Technical Report arXiv:2303.12712,

Shanahan M (2023) Talking about large language models. Technical Report arXiv:2212.03551

Brooks R (2023) Just calm down about GPT-4 already. IEEE Spectrum

Meyer B (2022) What do ChatGPT and AI-based automatic program generation mean for the future of software. Commun ACM 65(12):5

Papazoglou M (2003) Service-oriented computing: concepts, characteristics and directions. In: Web Information Systems Engineering, 2003. WISE 2003. Proceedings of the fourth international conference on. IEEE. pp. 3–12

10.

Aiello M (2018) The Web was done by amateurs: a reflection on one of the largest collective systems ever engineered. Springer, HeidelbergCrossRef

11.

Weerawarana S, Curbera F, Leymann F, Storey T, Ferguson DF (2005) Web services platform architecture: SOAP, WSDL, WS-Policy, WS-addressing, WS-BPEL, WS-reliable messaging, and more. Pearson, New York City

12.

Dustdar S, Schreiner W (2005) A survey on Web services composition. Int J Web Grid Serv 1(1):1–30CrossRef

13.

Aiello M (2022) A challenge for the next 50 years of automated service composition. In: Proceedings of 20th international conference of service-oriented computing (ICSOC 2022). Lecture Notes in Computer Science, vol. 13740, Springer, Heidelberg, pp. 635–643

14.

Kaldeli E, Lazovik A, Aiello M (2016) Domain-independent planning for services in uncertain and dynamic environments. Artif Intell 236(7):30–64MathSciNetCrossRefMATH

Titel: Service composition in the ChatGPT era
verfasst von: Marco Aiello
Ilche Georgievski
Publikationsdatum: 09.06.2023
Verlag: Springer London
Erschienen in: Service Oriented Computing and Applications / Ausgabe 4/2023
Print ISSN: 1863-2386
Elektronische ISSN: 1863-2394
DOI: https://doi.org/10.1007/s11761-023-00367-7

Springer Professional

Service composition in the ChatGPT era

Publisher's Note

Publisher's Note

Premium Partner

Springer Professional

Publisher's Note

Publisher's Note

Weitere Artikel der Ausgabe 4/2023

A framework for REST services discovery and composition

Critical path-driven exploration on the MiniGrid environment

Self-improved algorithm for cloud load balancing under SLA constraints

Handwritten Chinese signature detection with simple Copy–Paste augmentation on power plants technical documents

Recommendation-based trust computation and rating prediction model for security enhancement in cloud computing systems

Premium Partner