When discussing observability, OpenTelemetry is crucial because it enables organizations to understand the internal state of their systems through telemetry data. In this article, we will discuss the important role of OpenTelemetry in enabling Observability in Java applications.

What is Observability?

In modern software development, where more and more teams are adopting a microservices architecture, keeping track of everything becomes a new challenge. Finding solutions that simplify our workflows and enhance our productivity therefore becomes essential. We want our system to notify us when a part of it isn’t functioning correctly, rather than constantly verifying its status ourselves.

Basically, Observability is about how well we can understand a system’s internal state and behavior based on its external outputs or signals. In software and systems, it’s all about how easily developers and operators can figure out how a system is doing by checking its logs, metrics, traces, and other data.

We can achieve Observability by implementing tools and practices that enable us to monitor, analyze, and understand a system’s behavior.

Key Components of Observability

Critical components of Observability include:

  • Logs: Records of events, activities, and transactions occurring within a system, including errors, warnings, user actions, and application events.
  • Metrics: Metrics are quantitative measurements that provide information about a system’s state and performance, such as CPU usage, memory consumption, response times, and error rates.
  • Traces: Traces are sequences of events that occur while handling a specific request or transaction across distributed systems. They enable developers to understand the flow of requests and identify bottlenecks or issues.
  • Alerts: Notifications triggered by predefined conditions or thresholds. They indicate potential problems or anomalies in the system, allowing teams to proactively detect and address issues before they escalate.

What is OpenTelemetry?

OpenTelemetry is an open-source observability framework for cloud-native software. It’s designed to collect telemetry data (metrics, logs, and traces) from your applications and infrastructure to provide insights into their performance and behavior.

It also provides libraries for various programming languages and frameworks, making it easy to instrument applications without extensive manual effort.

OpenTelemetry prioritizes generating, collecting, and exporting telemetry data while leaving the storage and visualization aspects to other tools. Its primary aim is to simplify instrumentation across diverse application environments, irrespective of language or infrastructure.

OpenTelemetry Components

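OpenTelemetry comprises several major components, which the following sections cover: the language APIs and SDKs, the OpenTelemetry Protocol (OTLP), and the Collector.

These components work together to provide a framework for creating and managing telemetry data, from generation and intake through processing to export.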

OpenTelemetry components are vendor-agnostic, which means they are not tied to any specific vendor or tool. They are designed to interoperate with a wide range of open-source and commercial observability backends. This is one of the important features of OpenTelemetry because it allows us to choose the best tools for our needs without being locked into a specific vendor or technology stack. For example, OpenTelemetry can export data to open-source tools like Jaeger (for distributed tracing) and Prometheus (for metrics), as well as commercial offerings like Datadog, New Relic, and Splunk. It enables us to switch between backends as our needs change without having to rewrite our instrumentation code.

Why are OpenTelemetry APIs, SDKs, and tools important?

The OpenTelemetry APIs define a standard interface for instrumenting code and generating telemetry data in each supported programming language. The OpenTelemetry SDKs implement these APIs for a specific language, allowing us to quickly integrate OpenTelemetry into our code.

OpenTelemetry provides support for multiple programming languages, including Java, JavaScript, Python, Go, C++, C#, Rust, and Erlang/Elixir. It also integrates with popular libraries and frameworks like Akka, ASP.NET Core, Django, and Express.

Another important task of the language SDKs is exporting telemetry data. Each SDK can be configured through environment variables. For example, by setting the OTEL_LOGS_EXPORTER, OTEL_METRICS_EXPORTER, or OTEL_TRACES_EXPORTER environment variable, we can specify which exporter is used for logs, metrics, or traces. There are also environment variables that let us configure the OTLP/gRPC or OTLP/HTTP endpoint to which traces, metrics, and logs are exported.
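To make the per-signal selection concrete, here is a minimal Java sketch that reads the three variables named above from an environment-style map. It is not the SDK's own configuration code: the class name ExporterSettings and the otlp fallback used when a variable is unset are assumptions for illustration, so check your SDK's documentation for its actual defaults.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: shows how one exporter name per signal can be read
// from environment-style settings. Not the SDK's configuration code.
class ExporterSettings {
    // Assumed fallback when a variable is unset (hypothetical for this sketch).
    static final String ASSUMED_DEFAULT = "otlp";

    // Returns signal name -> exporter name for traces, metrics, and logs.
    static Map<String, String> resolve(Map<String, String> env) {
        Map<String, String> result = new LinkedHashMap<>();
        result.put("traces", env.getOrDefault("OTEL_TRACES_EXPORTER", ASSUMED_DEFAULT));
        result.put("metrics", env.getOrDefault("OTEL_METRICS_EXPORTER", ASSUMED_DEFAULT));
        result.put("logs", env.getOrDefault("OTEL_LOGS_EXPORTER", ASSUMED_DEFAULT));
        return result;
    }

    public static void main(String[] args) {
        // System.getenv() exposes the process environment as a map.
        System.out.println(resolve(System.getenv()));
    }
}
```

Running it with, say, OTEL_TRACES_EXPORTER=zipkin set in the shell shows that only the traces entry changes, which is the point of having one variable per signal.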

How does OpenTelemetry instrument telemetry data?

For a system to be observable, it must first be instrumented: its components must emit traces, metrics, and logs.

OpenTelemetry offers two main ways for instrumentation:

1. Utilizing official APIs and SDKs across various languages.

2. Leveraging zero-code solutions.

Code-based solutions, which leverage the OpenTelemetry API, provide in-depth insights and rich telemetry directly from your application. They complement the telemetry generated by zero-code solutions.

Zero-code solutions are ideal for initial setup or situations where application modification is not feasible. They offer comprehensive telemetry from libraries and the application environment, capturing data at the edges of your application.

Both solutions can be used concurrently for comprehensive observability.

OpenTelemetry protocol (OTLP)

The OpenTelemetry Protocol (OTLP) is a specification that defines how telemetry data is encoded, transported, and delivered between telemetry sources (clients), intermediate nodes such as collectors, and telemetry backends such as Jaeger and Prometheus.

OTLP is designed to be vendor-agnostic and general-purpose, allowing it to be used with a wide range of observability tools and platforms. It is a core component of the OpenTelemetry project. By providing a standard protocol for exchanging telemetry data, OTLP enables interoperability between different observability components and tools.

OTLP can be implemented over different transport protocols, including gRPC and HTTP/1.1. The protocol specification defines how the protocol should be implemented and provides a Protocol Buffers schema for the payloads. OTLP also supports Gzip as the transport compression mechanism.

Understanding Pipelines (Data Stream) in OpenTelemetry

In OpenTelemetry, pipelines are like pathways that manage how information moves from one place to another. Think of them as organized channels that guide data from where it starts to where it’s needed. Receivers are the entry gates of these pathways, collecting data from different sources and letting it in. Once inside, processors take over, tweaking and organizing the data so it’s easier to understand. Finally, exporters send the processed data on to its destination. Together, these components ensure that data flows smoothly and is ready for analyzing and understanding what’s happening in your systems.
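The flow described above can be modeled in a few lines of plain Java. This is only a conceptual sketch: PipelineSketch and its run method are hypothetical names, telemetry is reduced to strings, and none of it is a Collector API. It simply shows received data passing through each processor in order before being handed on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Conceptual model of a pipeline: received items pass through each
// processor in order, and the result is what an exporter would send on.
class PipelineSketch {
    static List<String> run(List<String> received,
                            List<UnaryOperator<List<String>>> processors) {
        List<String> data = new ArrayList<>(received);
        for (UnaryOperator<List<String>> processor : processors) {
            data = processor.apply(data);
        }
        return data;
    }

    public static void main(String[] args) {
        List<UnaryOperator<List<String>>> processors = new ArrayList<>();
        // Processor 1: drop noisy health-check entries.
        processors.add(items -> {
            List<String> kept = new ArrayList<>();
            for (String item : items) {
                if (!item.contains("/health")) {
                    kept.add(item);
                }
            }
            return kept;
        });
        // Processor 2: enrich each entry with an environment label.
        processors.add(items -> {
            List<String> enriched = new ArrayList<>();
            for (String item : items) {
                enriched.add(item + " env=dev");
            }
            return enriched;
        });
        List<String> exported = run(List.of("GET /orders", "GET /health"), processors);
        System.out.println(exported); // prints [GET /orders env=dev]
    }
}
```

The order of the processors matters: filtering before enriching means the dropped entries are never touched by later steps.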

What is OpenTelemetry Collector?

The OTel Collector is a vendor-agnostic service that receives, processes, and exports telemetry data to various destinations. Other important features of the OTel Collector are:

  • Easy setup
  • Noise reduction in telemetry data (through processors)
  • Export in different formats and standards
  • A single agent that can collect data from multiple sources (programming languages and frameworks)
  • And more

Install and configure a Collector

There are different ways to install the OTel Collector: Docker, Docker Compose, Kubernetes (DaemonSet and Helm charts), Nomad, binary packages, and manual installation. (See the Collector installation documentation to learn more.)

The OTel Collector is configured in YAML, and the configuration file is located at /etc/<otel-directory>/config.yaml on the server.

The configuration file contains the definition of pipelines, which are built from four kinds of components: Receivers, Processors, Exporters, and Connectors.

How does OpenTelemetry collect telemetry data?

In an OTel Collector pipeline, receivers are responsible for collecting telemetry data from various sources. Depending on the type of receiver configured, receivers can be pull or push-based and collect data related to traces, metrics, logs, and more. To configure receivers in OpenTelemetry using a YAML file, you typically define receiver configurations under the receivers section. Each receiver type has its own configuration parameters.

For example, the following YAML configuration defines an otlp receiver with both the http and grpc protocols enabled. It allows the OpenTelemetry Collector to receive data over both HTTP and gRPC through the OpenTelemetry Protocol (OTLP):

receivers:
  otlp:
    protocols:
      http:
      grpc:

How does OpenTelemetry process telemetry data?

Processors receive telemetry data from the receivers and apply transformations, filtering, and enrichment to the data. Processors can manipulate the data in various ways, such as aggregating metrics, adding attributes to spans, or sampling traces.

Once the data is collected, processors step in to organize and refine it. They might filter out irrelevant information, aggregate similar data points, or add extra details to make the data more useful.

Here are some common types of processors in OpenTelemetry:

  1. Attribute Processor: This processor manipulates attributes within telemetry data, allowing for the addition, modification, or removal of attribute values.
  2. Batch Processor: This processor groups telemetry data into batches before exporting it, improving efficiency and reducing overhead during export.
  3. Resource Processor: This processor extracts, modifies, or enriches resource-related information associated with telemetry data, such as host or service metadata.
  4. Span Processor: This processor is specifically designed to manipulate distributed traces, performing tasks like filtering, adding annotations, or modifying the structure of trace data.
  5. Sampling Processor: Controls the rate at which telemetry data is sampled for export, managing data volume while still providing representative data.
  6. Reduction Processor: Limits the size or length of telemetry data, such as truncating long log messages or reducing the precision of numeric values.
  7. Redaction Processor: Removes sensitive or personally identifiable information (PII) from telemetry data before export to ensure privacy and security compliance.
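As an illustration of the sampling idea in the list above, the following Java sketch makes a deterministic keep-or-drop decision per trace. It is not the Collector's algorithm: the class name SamplingSketch, the use of String.hashCode, and the 0 to 99 bucket scheme are assumptions chosen for brevity. What it does show is that the decision depends only on the trace ID, so every span of a trace gets the same answer.

```java
// Conceptual sketch of ratio-based sampling keyed on the trace ID.
// The hashing scheme is an assumption, not the Collector's implementation.
class SamplingSketch {
    // Keeps roughly `percent` out of every 100 distinct trace IDs.
    static boolean keep(String traceId, int percent) {
        // Map the trace ID onto a bucket in 0..99 and keep the lowest buckets.
        int bucket = Math.floorMod(traceId.hashCode(), 100);
        return bucket < percent;
    }

    public static void main(String[] args) {
        // Example trace ID in the usual 32-hex-character form.
        String traceId = "4bf92f3577b34da6a3ce929d0e0e4736";
        System.out.println("keep at 25%: " + keep(traceId, 25));
        System.out.println("keep at 100%: " + keep(traceId, 100));
    }
}
```

Because the result is a pure function of the trace ID, separate Collector instances using the same rule would agree on which traces to keep without coordinating.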

Here’s an example YAML configuration demonstrating the use of a Batch Processor:

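receivers:
  otlp:
    protocols:
      grpc:

processors:
  batch:
    # Flush a batch after this much time, even if it is not full.
    timeout: 1s
    # Flush as soon as this many items have accumulated.
    send_batch_size: 100

exporters:
  # OTLP exporter instance named "jaeger"; Jaeger accepts OTLP over gRPC.
  otlp/jaeger:
    endpoint: jaeger-collector:4317
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/jaeger]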
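The size-based part of batching, which send_batch_size controls in the configuration above, can be sketched in plain Java as follows. This is a simplified illustration under stated assumptions: BatchSketch and toBatches are hypothetical names, and the sketch ignores the timeout-based flush that a real batch processor also performs.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified illustration of size-based batching; timeout flushing is omitted.
class BatchSketch {
    // Splits items into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> toBatches(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> spans = List.of("s1", "s2", "s3", "s4", "s5");
        // prints [[s1, s2], [s3, s4], [s5]]
        System.out.println(toBatches(spans, 2));
    }
}
```

Five spans leave the sketch in three export calls instead of five, which is where the reduced export overhead comes from.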

How does OpenTelemetry export telemetry data?

Both the OpenTelemetry SDKs and the Collector export telemetry data, but for different purposes. An SDK exports data out of the instrumented application, either directly to a backend or to a Collector. The Collector, in turn, exports the data it has received and processed to one or more open-source or commercial backends.

A Collector pipeline can have multiple exporters, including several of the same type. One interesting aspect of Collector exporters is that they can be push-based or pull-based.

The format of the exported telemetry data depends on the exporter type and can be OTLP, Jaeger, Zipkin, or Prometheus.

To configure an exporter, we need to specify the destination URL and security-related configs like authentication or TLS, in addition to the exporter type.

There are several different exporters in the OTel Collector, such as file, Kafka, debug, OTLP, OpenCensus, Zipkin, Prometheus, and more. Each exporter supports all or some of the OpenTelemetry signals (traces, metrics, and logs). For example, otlp/jaeger in the configuration below is an OTLP exporter instance named jaeger; the type/name syntax lets us declare several exporters of the same type. It sends trace data over OTLP/gRPC to Jaeger, a popular open-source distributed tracing system. Jaeger accepts OTLP natively, so the Collector does not need to convert the data into a Jaeger-specific format. Because Jaeger is a tracing backend, this exporter belongs in traces pipelines only.

exporters:
  otlp/jaeger:
    endpoint: jaeger-server:4317
    tls:
      insecure: true

OpenTelemetry for Java application

OpenTelemetry is important for Java applications because it provides a standardized way to collect, process, and export various forms of telemetry data (metrics, logs, and traces), simplifying the instrumentation of Java applications.

OpenTelemetry has a specific implementation for Java applications, and its components are in stable status.

Instrumentation for Java application

By using opentelemetry-javaagent.jar, which contains the agent and everything required for automatic instrumentation, we can auto-instrument any Java application (Java 8+) that is packaged as a jar file by attaching the agent with the -javaagent flag:

java -javaagent:path/to/opentelemetry-javaagent.jar -jar ./build/libs/java-app.jar

In this way, OpenTelemetry dynamically injects bytecode to capture telemetry from many popular libraries and frameworks.

Alternatively, we can instrument the application manually by using the opentelemetry-api Java library.
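To give a feel for what manual instrumentation records, here is a stdlib-only Java sketch that wraps a unit of work and stores a span-like record with its name, duration, and failure flag. It deliberately does not use opentelemetry-api: ManualTimingSketch, SpanRecord, and traced are hypothetical names for illustration, and a real application would create spans through the library's tracer instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Stand-in for what manual instrumentation captures around a unit of work.
// Not the OpenTelemetry API; all names here are hypothetical.
class ManualTimingSketch {
    static final class SpanRecord {
        final String name;
        final long durationNanos;
        final boolean failed;

        SpanRecord(String name, long durationNanos, boolean failed) {
            this.name = name;
            this.durationNanos = durationNanos;
            this.failed = failed;
        }
    }

    // In-memory list standing in for an exporter.
    static final List<SpanRecord> RECORDED = new ArrayList<>();

    // Runs the work, then records its name, duration, and whether it threw.
    static <T> T traced(String name, Supplier<T> work) {
        long start = System.nanoTime();
        boolean failed = true;
        try {
            T result = work.get();
            failed = false;
            return result;
        } finally {
            RECORDED.add(new SpanRecord(name, System.nanoTime() - start, failed));
        }
    }

    public static void main(String[] args) {
        Integer total = traced("sum-orders", () -> 2 + 3);
        SpanRecord span = RECORDED.get(0);
        System.out.println(total + " computed in span '" + span.name
                + "', failed=" + span.failed);
    }
}
```

The finally block is the important design choice: the record is written whether the work returns or throws, so failures show up in the telemetry instead of disappearing.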

It is worth mentioning that there is a great library called Micrometer, which acts as an observability facade layer. This library makes integration between a Java application and OpenTelemetry even simpler.

That’s a lot of great data. But what to do with it?

Digma is a great tool that you can install as an IDE plugin. It is designed around the concept of Continuous Feedback, on top of the OpenTelemetry framework (and other libraries). Digma transparently runs a lightweight observability stack based on OpenTelemetry under the hood and then provides valuable insights from the telemetry data that our application sends during the development process. This is an overview of how Digma works under the hood:

How Digma uses OpenTelemetry under the hood

Digma uses advanced algorithms and machine learning techniques to process the telemetry data our application generates. It analyzes this data and turns it into insights that help us understand our code’s performance and identify areas for improvement, so that we can optimize the code and keep its quality and efficiency high. Some of the important Digma insights that can surface issues during development are:

  • Suspected N+1
  • Excessive API calls (chatty API)
  • Bottleneck
  • and more (see a complete list in the Digma documentation)

Digma insight: High number of queries

Final thought

This article tried to give a brief introduction to Observability, OpenTelemetry, and its importance for Java applications. As I suggested in this interview, I highly recommend learning more about OpenTelemetry if you are interested in the observability concept.
