Spring AI is one of the most exciting projects in the Spring Framework portfolio. It helps Spring developers create AI-powered applications and easily integrate Spring-based applications with AI models. Spring AI is still under heavy development, and with the release of version 1.0.0 M2, it is one step closer to becoming production-ready (version 1). The key focus of Spring AI 1.0.0 M2 was adding observability functionality to the framework. In this article, we will look at the observability features that were added in Spring AI 1.0.0 M2.
A short introduction to Spring AI
I am writing a series of articles about Spring AI and implementing a practical project with it, so I will keep this introduction short. If you are unfamiliar with Spring AI, I recommend reading the introduction in the first part of the series.
In short, Spring AI provides portable APIs and abstractions for AI model types and providers, vector databases, and more, built on top of Spring Framework concepts. It offers a feature set that makes developing AI-driven applications much easier.
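To make this concrete, here is a minimal sketch of a chat endpoint built with Spring AI's ChatClient (assuming a Spring Boot application with a chat model starter, such as OpenAI, on the classpath; the controller and mapping are illustrative, not from the Employee Chatbot project):

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
class ChatController {

    private final ChatClient chatClient;

    // Spring AI auto-configures a ChatClient.Builder for the configured model provider
    ChatController(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    @GetMapping("/chat")
    String chat(@RequestParam String question) {
        // One fluent call: build the prompt, call the model, extract the text content
        return chatClient.prompt()
                .user(question)
                .call()
                .content();
    }
}
```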
Why is Observability important for Spring AI?
We wrote a detailed article about the importance of observability here.
Spring AI is no exception to this rule. By collecting metrics from our AI-powered applications, we gain more control over them. For example, with metrics for the tokens sent to the AI model, we can measure usage and prevent unexpected additional costs.
On the other hand, tracing can show us the flow of data and calls between our application components and the AI model.
Observability in Spring AI
Spring Framework supports observability through Micrometer for metrics and Micrometer Tracing for distributed tracing. Like other projects, libraries, and frameworks built on top of the Spring Framework, Spring AI builds upon these observability features of the Spring ecosystem to provide metrics and tracing for the AI-related operations in our Spring application.
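Under the hood, Spring AI's instrumentation is built on Micrometer's Observation API, which emits a metric and, when a tracing backend is present, a span from a single instrumentation point. A rough sketch of that mechanism (the observation name and tag here are made up for illustration, they are not Spring AI's):

```java
import io.micrometer.observation.Observation;
import io.micrometer.observation.ObservationRegistry;

class ObservationExample {

    void observedWork(ObservationRegistry registry) {
        // A single Observation produces a timer metric and, if a tracing
        // backend is configured, a span with the same name and tags.
        Observation.createNotStarted("demo.operation", registry)
                .lowCardinalityKeyValue("demo.component", "example")
                .observe(() -> {
                    // ... the actual work to measure ...
                });
    }
}
```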
Enabling Observability functionality in Spring AI
As in a typical Spring Boot project, to enable the observability functionality and collect metrics and traces, we need to add a few dependencies to our project.
- Adding Spring Boot Actuator
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
- Adding a Tracer Implementation
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-tracing-bridge-otel</artifactId>
</dependency>
- Adding an exporter to store the traces
<dependency>
    <groupId>io.opentelemetry</groupId>
    <artifactId>opentelemetry-exporter-otlp</artifactId>
</dependency>
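Depending on your setup, two optional Spring Boot properties are also worth knowing about: by default, Spring Boot samples only 10% of requests for tracing, and the actuator exposes a limited set of endpoints over HTTP. A small application.properties sketch (both are standard Spring Boot properties; adjust them to your needs):

```properties
# Trace every request instead of the default 10% sample
management.tracing.sampling.probability=1.0

# Make sure the metrics endpoint used later in this article is exposed
management.endpoints.web.exposure.include=health,metrics
```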
Using Digma to analyze observability data
To explore the new observability functionality in Spring AI, instead of using Docker Compose with Zipkin or Prometheus containers, we will use the Digma plugin for IntelliJ.
Digma is a Continuous Feedback platform that dynamically analyses our code and helps us find performance issues. Besides this, since Digma uses OpenTelemetry behind the scenes, we can set its observability mode to Micrometer and see the metrics and traces generated by our Spring Boot application right inside the IDE through the Digma IntelliJ plugin.
4 observability functionalities in Spring AI
As I mentioned before, the main focus of the Spring AI 1.0.0 milestone 2 release was adding observability functionality to Spring AI. I have created a branch called spring-ai-observability in the Employee Chatbot project GitHub repository that contains all the configurations needed to start the chatbot and see the metrics and traces in the Digma IntelliJ plugin. The only things you need to set up in your local development environment are:
- Create an OpenAI API key and define it as an environment variable named OPENAI-API-KEY (see the configuration sketch after this list).
- Install the Digma IntelliJ plugin and change its Spring Boot observability mode to Micrometer.
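For reference, the application can pick the key up through Spring AI's standard OpenAI property. A minimal application.properties sketch (the property name spring.ai.openai.api-key is Spring AI's standard property; the placeholder assumes the environment variable named in the list above):

```properties
# Resolve the OpenAI key from the environment variable defined above
spring.ai.openai.api-key=${OPENAI-API-KEY}
```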
1- Providing observability functionalities for core components
Based on the observability features in the Spring Framework (Micrometer), Spring AI provides metrics and tracing functionality for its core components, including ChatClient, Advisors, ChatModel, EmbeddingModel, ImageModel, and VectorStore.
In this release, Spring AI only supports observability for the OpenAI, Ollama, Mistral, and Anthropic Chat Model implementations. However, the team announced that support for other Chat Model implementations will be added in upcoming versions.
2- Introducing several metrics related to Spring AI and AI Models
Spring AI introduces various useful metrics as low-cardinality keys in its core components. You can read the complete list for each component in the official Spring AI documentation, but we will mention some of the important ones here.
First, run the Employee Chatbot project in the IntelliJ IDE, and then ask it some questions like these in your browser:
http://localhost:8080/employee/chat/1/my%20name%20is%20Deli
http://localhost:8080/employee/chat/1/What%20is%20my%20name?
It is worth mentioning that Digma automatically changes the project run configuration and adds some additional parameters that enable it to collect observability data.
Now, if you go to the actuator metrics URL:

http://localhost:8080/actuator/metrics
You can see at least 7 more metrics there related to Spring AI:
{ "names": [ . . . "gen_ai.client.operation", "gen_ai.client.operation.active", "gen_ai.client.token.usage", . . . "spring.ai.chat.client.advisor", "spring.ai.chat.client.advisor.active", "spring.ai.chat.client.operation", "spring.ai.chat.client.operation.active", . . . ] }
For each metric, there are several tags. For example, if you open the URL for the gen_ai.client.token.usage metric:

http://localhost:8080/actuator/metrics/gen_ai.client.token.usage

you can see its description (Measures number of input and output tokens used) and its available tags:
{ "name": "gen_ai.client.token.usage", "description": "Measures number of input and output tokens used", "measurements": [ { "statistic": "COUNT", "value": 146 } ], "availableTags": [ { "tag": "gen_ai.operation.name", "values": [ "chat" ] }, { "tag": "gen_ai.response.model", "values": [ "gpt-4o-2024-05-13" ] }, { "tag": "gen_ai.request.model", "values": [ "gpt-4o" ] }, { "tag": "gen_ai.token.type", "values": [ "output", "input", "total" ] }, { "tag": "gen_ai.system", "values": [ "openai" ] } ] }
On the other hand, if we check the Digma Observability view, we can see that Digma detected our calls to the Employee Chatbot URL and provided us with a lot of helpful information.
Digma Observability view
3- Providing distributed tracing data
Spring AI provides tracing data for its core components. Several high- and low-cardinality keys (tags) will be added to the traces. You can check the list of keys for each component at this link.
Let’s check the tracing data collected by Digma. If you click on the Trace button in a row (asset) in the Digma Observability view, you will see the trace timeline view for that request:
Digma Trace view
As you can see, for each interaction between Spring AI components, such as ChatClient, Advisor, or ChatModel, we have a span. In addition, there is one span with the details of the call to the Chat Model implementation (in our case, the OpenAI API).
By clicking on each span, we can see all the tags that Spring AI adds. There is also some information about the time spent on each interaction.
4- Exposing input and output data in the observations
We can configure the Spring AI ChatClient, ChatModel, ImageModel, and VectorStore components to include the input, prompt, completion, or query response in an observation as span attributes. However, since these data are usually too big, storing them as span attributes often does not make sense. The Spring AI documentation explains the underlying limitation:
These data are typically too big to be included in an observation as span attributes. The preferred way to store large data is as span events, which are supported by OpenTelemetry but not yet surfaced through the Micrometer APIs. Spring AI supports storing these fields as events in OpenTelemetry and will provide a more general event-based solution once the issue github.com/micrometer-metrics/micrometer/issues/5238 is resolved.
This feature is disabled by default for all supported components. We can enable it with these configuration properties:
spring.ai.chat.client.observations.include-input=true
spring.ai.chat.observations.include-prompt=true
spring.ai.chat.observations.include-completion=true
spring.ai.image.observations.include-prompt=true
spring.ai.vectorstore.observations.include-query-response=true
For example, if we enable the ChatClient input data as a span attribute, we will see it in the Digma trace view:
Trace’s span details in Digma
Final Thoughts
Observability is an integral part of any library or framework that wants to be used in today’s modern services, and Spring AI is no exception. Adding these observability functionalities in milestone 2 of version 1.0.0 is a big step toward the final release of Spring AI framework version 1. As you saw throughout the article, we used the Digma plugin for IntelliJ, which made it very easy for us to view and review the observability data.