AWS Open Source Blog

Adding security workflows to OpenTelemetry

In this blog post, intern engineers Karen Xu and Kelvin Lo describe their experience working on the popular open source project OpenTelemetry. They explain how they added security scanning workflows to the project and how this work supports a major milestone in readying the software for production use.

In any important, widely adopted open source project, mitigating security concerns is a vital step for avoiding service interruptions that could affect the software's users. Mandatory vulnerability scanning and patching of major vulnerabilities increases customer confidence in the software and helps projects meet the security policies and guidelines that many user organizations require.

OpenTelemetry is an open source Cloud Native Computing Foundation (CNCF) project that provides a single agent solution for the management of all telemetry types and aims to become the vendor-agnostic open standard for telemetry collection. It has been adopted by many leaders in the observability space and continues to grow in popularity. OpenTelemetry provides a telemetry data collection agent, along with a set of APIs and SDKs to collect telemetry data from a variety of user services and software.

As we evaluated OpenTelemetry’s project repositories for security readiness, we found security scanning workflows missing from many of the code repositories. In an effort to keep the repositories up to date and consistent, we started this project to design and add security scanning workflows to the CI/CD pipelines across all OpenTelemetry repositories.

Initial evaluation

We first evaluated what a viable security workflow tool for the OpenTelemetry repositories should look like.

Because the OpenTelemetry project is hosted on GitHub and uses the built-in GitHub Actions for CI/CD, any selected tool had to integrate easily with GitHub Actions. Furthermore, the chosen security scanning tool had to support the programming languages of the repository it was scanning.

Comparison of code scanning tools

The next step was to evaluate potential tools that were viable for use by OpenTelemetry repositories.

Two tools already adopted in a few OpenTelemetry code repositories were CodeQL and GoSec. Both are widely used, reputable security tools that satisfied our security scanning tool requirements. Thus, we evaluated whether they could also support any of the OpenTelemetry repositories that were missing security workflows.

CodeQL is the built-in security scanning tool for GitHub Actions. Because it supports a variety of programming languages (Python, Java, C#, C++, Go, and JavaScript), it had the range of features that we needed in our top tool of choice, and it could be extended to a number of additional OpenTelemetry repositories.

GoSec is another widely used security scanning tool, designed for code written in the Go programming language. Besides satisfying the two major requirements for our security scanning tools, GoSec could be initiated as a golint action before every commit, in addition to running as a GitHub Actions workflow. This tool could be used for any of the existing OpenTelemetry repositories written in Go.

However, because CodeQL supports only six programming languages (Python, Go, Java, JavaScript, C++, and C#/Dotnet), and GoSec supports only Go, we also looked at potential providers for repositories that were not written in those supported languages.

For some of these languages, such as Rust, PHP, and Ruby, we were able to find alternatives. Although all these alternatives met our mandatory requirements, none were as powerful or comprehensive in their detection capabilities as CodeQL or GoSec. For example, Rust Audit Check relied on user reports of security vulnerabilities rather than on its own analysis of the code. Overall, we found limitations or drawbacks in each of the alternative providers and were not sure whether they would suit the needs of the repositories.

Furthermore, because third-party providers are often written by one author, we had to consider whether the source was reputable for our needs. We decided to take these concerns to the owners of the repository by filing issues.

Proposed solution

Next, we evaluated which repositories needed additional security workflow tooling and implementation.

Existing and proposed security scanning workflows

In the following table, we identified which repositories already had security workflows and which were missing them. For the repositories marked "Missing security scan workflow," we proposed viable security workflow providers that could support them.

| Repository | Language used | Existing security workflows | Proposed security workflows |
|---|---|---|---|
| opentelemetry-operator | Go | CodeQL / GoSec | |
| opentelemetry-go-contrib | Go | CodeQL / GoSec | |
| opentelemetry-go | Go | CodeQL / GoSec | |
| opentelemetry-collector | Go | CodeQL / GoSec | |
| opentelemetry-collector-contrib | Go | CodeQL / GoSec | |
| opentelemetry-dotnet-contrib | C#/Dotnet | CodeQL | |
| opentelemetry-dotnet | C#/Dotnet | CodeQL | |
| opentelemetry-python-contrib | Python | CodeQL | |
| opentelemetry-python | Python | CodeQL | |
| opentelemetry-js-contrib | JavaScript | CodeQL | |
| opentelemetry-js | JavaScript | CodeQL | |
| opentelemetry-java-instrumentation | Java | CodeQL | |
| opentelemetry-java | Java | CodeQL | |
| opentelemetry-log-collection | Go | Missing security scan workflow | CodeQL / GoSec |
| opentelemetry-collector-builder | Go | Missing security scan workflow | CodeQL / GoSec |
| opentelemetry-java-contrib | Java | Missing security scan workflow | CodeQL |
| opentelemetry-js-api | JavaScript | Missing security scan workflow | CodeQL |
| opentelemetry-dotnet-instrumentation | C#/Dotnet | Missing security scan workflow | CodeQL |
| opentelemetry-cpp | C++ | Missing security scan workflow | CodeQL |
| opentelemetry-cpp-contrib | C++ | Missing security scan workflow | CodeQL |
| opentelemetry-rust | Rust | Missing security scan workflow | Filed issues proposing alternatives (Rust Audit Check, RustSec) |
| opentelemetry-php | PHP | Missing security scan workflow | Filed issues proposing alternatives (Psalm) |
| opentelemetry-ruby | Ruby | Missing security scan workflow | Filed issues proposing alternatives (Brakeman, Snyk) |
| opentelemetry-swift | Swift | Missing security scan workflow | None found for GitHub Actions |
| opentelemetry-erlang | Erlang | Missing security scan workflow | None found for GitHub Actions |

We chose to stay consistent with the convention of existing OpenTelemetry repositories and use both CodeQL and GoSec as the security scanning tools for the languages that they support. The repositories for which we proposed CodeQL or GoSec are written in languages those tools support. For the other languages (Rust, PHP, Ruby, Swift, and Erlang), finding supported code scanning tools was difficult. For the Swift and Erlang repositories, for example, we could not find any reliable and popular tools that could be integrated into GitHub Actions.

Configuration

As we began writing the configurations for each of the security scanning tools, we started with the basic configurations of GoSec and CodeQL. We compared the existing CodeQL and GoSec workflows in the repositories that already had them and reused many of the same triggers and build processes.

A sample GitHub Actions workflow for CodeQL consists of the following main parts:

  1. Workflow name
  2. Workflow prompts (the events that trigger a run)
  3. A list of jobs to be run and the individual steps taken for each job
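
The three parts listed above might look like the following in a workflow file. This is a minimal sketch rather than the configuration merged into any OpenTelemetry repository: the workflow name, the job name `analyze`, the action version tags, the `go` language value, and the `permissions` block are all assumptions chosen for illustration.

```yaml
# 1. Workflow name (illustrative)
name: "CodeQL Analysis"

# 2. Workflow prompts: the events that start a run
on:
  pull_request:
  schedule:
    - cron: '30 1 * * *'   # daily at 01:30; GitHub evaluates cron in UTC

# 3. Jobs and the individual steps of each job
jobs:
  analyze:                 # hypothetical job name
    runs-on: ubuntu-latest
    permissions:
      security-events: write   # lets the job report results to the Security tab
      contents: read
    steps:
      - name: Check out the repository
        uses: actions/checkout@v4
      - name: Initialize CodeQL
        uses: github/codeql-action/init@v3
        with:
          languages: go    # assumed; set to the language of the repository
      - name: Build the code
        uses: github/codeql-action/autobuild@v3
      - name: Run the CodeQL analysis
        uses: github/codeql-action/analyze@v3
```

For a repository written in another CodeQL-supported language, only the `languages` value (and possibly the build step) would be expected to change.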

The following example shows a GoSec configuration that we used for the opentelemetry-log-collection repository:

[Image: GoSec configuration used for the opentelemetry-log-collection repository]

In the preceding configuration file, the on section in lines 2–17 shows the triggers for the workflow. To follow existing conventions in other OpenTelemetry repositories, the security scan takes place on every pull request and is also scheduled to run routinely at 1:30 AM daily. In the jobs section in lines 19–35, we configured the tasks of the job. First we set up the workflow environment, and then we ran the GoSec security scanning tool. The last step uploads the results of the security scan to the GitHub Security tab, where alerts are shown if security vulnerabilities are found.
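
For readers without access to the original configuration, the structure described above can be sketched as follows. This is an approximation, not the file merged into opentelemetry-log-collection, so the line numbers cited in the preceding paragraph do not apply to it. The workflow and job names, the action version tags, the GoSec arguments, the `results.sarif` file name, and the `permissions` block are assumptions based on the usage that the GoSec project documents for GitHub Actions.

```yaml
name: "Security Scan"        # illustrative name

# Triggers: every pull request, plus a daily scheduled run
on:
  pull_request:
  schedule:
    - cron: '30 1 * * *'     # daily at 01:30; GitHub evaluates cron in UTC

jobs:
  gosec:                     # hypothetical job name
    runs-on: ubuntu-latest
    permissions:
      security-events: write # lets the job upload results to the Security tab
      contents: read
    steps:
      # Set up the workflow environment
      - name: Check out the repository
        uses: actions/checkout@v4

      # Run the GoSec scanner and write its findings as a SARIF report
      - name: Run GoSec security scanner
        uses: securego/gosec@master
        with:
          # -no-fail keeps the step green so findings surface as alerts instead
          args: '-no-fail -fmt sarif -out results.sarif ./...'

      # Upload the report so alerts appear in the GitHub Security tab
      - name: Upload SARIF results
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: results.sarif
```

Writing the report in SARIF format is what allows the upload step to turn GoSec findings into alerts in the Security tab.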

Results

The final set of repositories to which we added security workflows are:

  • opentelemetry-log-collection (PR #153, PR #154)
  • opentelemetry-collector-builder (PR #45, PR #46)
  • opentelemetry-java-contrib (PR #35)
  • opentelemetry-js-api (PR #75)
  • opentelemetry-dotnet-instrumentation (PR #170)
  • opentelemetry-cpp (PR #770)
  • opentelemetry-cpp-contrib

For all the repositories in the preceding list, we added CodeQL security workflows. Additionally, for the first two repositories, we added GoSec configurations.

For the remaining repositories without existing workflows, we filed issues that proposed alternative tools and described our concerns about them, so that the community and maintainers of each repository could determine whether the alternatives were suitable.

In all, we filed eight pull requests (PRs) and six issues to propose a design that met our security requirements and to add the security scanning workflows. We recorded the progress for each repository on the tracking issue. Adding these code scans has improved consistency across the repositories and raised the testing bar for each pull request.

The following image shows a screenshot of a CodeQL test passing on a PR.

[Image: screenshot of a CodeQL test passing on a PR]

With the addition of these security workflow features, if a potential vulnerability is found during a daily scheduled run of a security scan, an alert will be triggered in the Security tab of the repository. An example of an alert is shown in the following:

[Image: example of an alert; the message says "incorrect conversion between integer types"]

Status badges

While adding security workflows, we noticed another inconsistency across OpenTelemetry repositories: certain status badges (such as CI status badges and code coverage badges) were missing from the main README files of some repositories. The following image shows an example of the README for the opentelemetry-go repository, which provides quick and valuable information through its various status badges. As shown in the image, the CI is currently failing, and the code coverage target is not yet being met.

[Image: opentelemetry-go README badges showing that CI is currently failing and the code coverage target is not yet being met]

Because these status badges provide an easy, readable way to assess the status of each repository, we think they are a reasonable and valuable addition to any of the OpenTelemetry repositories.

In total, we filed five PRs and 13 issues to make these enhancements. The badges serve as a quick indicator of a repository's code quality and make it possible to detect problems at first glance, as shown in the previous example.

Conclusion

We are excited to have completed this task of reinforcing security best practices in a popular open source project, with a positive impact on OpenTelemetry. We are grateful to have been able to work with a variety of cutting-edge open source technologies and to explore best practices in open source. By adding security workflows and status badges, we hope that the OpenTelemetry project will be able to meet the incubation and production-use requirements sooner and become a better and more robust project for users.

Kelvin Lo

Kelvin Lo is a senior majoring in computer science at the University of British Columbia. He is currently working as a software engineer intern at AWS and is interested in observability and infrastructure.

Karen Xu

Karen is a fourth-year student at the University of Waterloo studying computer science and business. She is an AWS intern engineer and is interested in observability and distributed systems.

Alolita Sharma

Alolita is a senior manager at AWS, where she leads open source observability engineering and collaboration for OpenTelemetry, Prometheus, Cortex, and Grafana. Alolita is co-chair of the CNCF Technical Advisory Group for Observability, a member of the OpenTelemetry Governance Committee, and a board director of the Unicode Consortium. She contributes to open standards at OpenTelemetry, Unicode, and W3C. She has served on the boards of the OSI and SFLC.in. Alolita has led engineering teams at Wikipedia, Twitter, PayPal, and IBM. Two decades of doing open source continue to inspire her. You can find her on Twitter @alolita.