AWS Open Source Blog

Announcing Amazon CloudWatch for Ray

Amazon CloudWatch is now available for Ray on Amazon Elastic Compute Cloud (Amazon EC2). Ray is an open source (Apache 2.0 License) framework to build and scale distributed applications. CloudWatch is a monitoring and observability service that provides data and actionable insights to monitor your applications, respond to system-wide performance changes, and optimize resource utilization. With CloudWatch for Ray, you can now deploy your Ray applications in production on Amazon EC2 and monitor their health with near real-time metrics, logs, and alarms.

The release adds support for extended Amazon EC2 metrics, Ray metrics, and Ray logs, all integrated with CloudWatch. The extended Amazon EC2 metrics cover key indicators of application health, such as memory utilization, disk utilization, and running process count. Ray metrics available on CloudWatch include both high-level aggregates at the cluster level and low-level detail at the individual Amazon EC2 instance level. These metrics are automatically integrated into default, configurable Ray application dashboards, so you can quickly identify high-level trends in your Ray clusters and drill down into the health of a single Amazon EC2 instance. CloudWatch logs for your Ray applications provide a durable history of events, which is critical for troubleshooting problems in high-availability production environments.
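
As an illustration of how the memory utilization metric could feed an alarm, here is a minimal Python sketch that builds the parameters for CloudWatch's PutMetricAlarm API and passes them to boto3. It is not taken from the Ray integration itself. The helper names, the alarm name, the "CWAgent" namespace, the "mem_used_percent" metric name, and the InstanceId dimension are all assumptions based on common CloudWatch agent defaults; check the namespace and dimensions your own cluster publishes before using anything like this.

```python
from typing import Any, Dict


def memory_alarm_params(
    instance_id: str,
    threshold_percent: float = 90.0,
    namespace: str = "CWAgent",  # assumption: CloudWatch agent default namespace
    metric_name: str = "mem_used_percent",  # assumption: agent memory metric name
) -> Dict[str, Any]:
    """Build keyword arguments for a PutMetricAlarm call (hypothetical helper)."""
    if not 0 < threshold_percent <= 100:
        raise ValueError("threshold_percent must be in (0, 100]")
    return {
        "AlarmName": f"ray-high-memory-{instance_id}",  # illustrative name
        "Namespace": namespace,
        "MetricName": metric_name,
        # assumption: the metric is published with only an InstanceId dimension
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 60,  # seconds per datapoint
        "EvaluationPeriods": 3,  # alarm after three consecutive breaches
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
    }


def create_memory_alarm(instance_id: str, region: str = "us-west-2") -> None:
    """Create the alarm in CloudWatch; requires boto3 and AWS credentials."""
    import boto3  # imported here so the builder above has no AWS dependency

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    cloudwatch.put_metric_alarm(**memory_alarm_params(instance_id))
```

Keeping the parameter builder separate from the boto3 call lets you inspect or unit test the alarm definition without touching an AWS account.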

Figure 1: Sample Amazon CloudWatch dashboard for Ray applications

Getting Started

To learn more about this integration and start running your Ray applications on AWS, refer to the setup and usage guide in the Ray docs. If you have questions about the integration or run into problems, please file an issue.

Daniel Yeo

Daniel Yeo is a Senior Technical Program Manager at Amazon. He is passionate about advancing technologies to make machine learning scale seamlessly. His team actively contributes improvements and novel ideas to open source Ray, so customers can realize the full potential of Ray.

Yiqin (Miranda) Zhu

Miranda is a Software Development Engineer on the Ray team at Amazon. She is passionate about developing open source Ray and integrating Ray with Amazon technologies.