aws-samples/amazon-mwaa-extract-metadata

Persist and analyze metadata in a transient Amazon MWAA environment

This repository contains sample code for persisting and analyzing metadata in a transient Amazon MWAA environment. Storing this metadata in your data lake makes pipeline monitoring and analysis easier. Tearing down instances while preserving the metadata lets you further reduce the cost of Amazon MWAA.

This blog provides a detailed overview and step-by-step instructions on how to export, persist, and analyze Airflow metadata.
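
As a rough illustration of the export step, the sketch below serializes rows from an Airflow metadata table to CSV and builds a date-partitioned Amazon S3 object key. It is not the code in dags/extract_metadata.py. The function names, the `dt=` partition layout, and the sample columns are assumptions made for this example. The upload helper uses the boto3 `put_object` call and needs AWS credentials, so it is kept separate from the pure helpers.

```python
import csv
import io
from datetime import date, datetime


def rows_to_csv(columns, rows):
    """Serialize column names and row tuples into a CSV string."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    for row in rows:
        # Render dates and timestamps in ISO 8601 so query engines can parse them.
        writer.writerow(
            [v.isoformat() if isinstance(v, (date, datetime)) else v for v in row]
        )
    return buffer.getvalue()


def build_s3_key(prefix, table, export_date):
    """Build a date-partitioned key (hypothetical layout chosen for this sketch)."""
    return f"{prefix.strip('/')}/{table}/dt={export_date.isoformat()}/{table}.csv"


def upload_csv(bucket, key, body):
    """Upload the CSV text to Amazon S3 (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without AWS access

    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))


if __name__ == "__main__":
    # Sample data standing in for rows read from the Airflow metadata database.
    columns = ["dag_id", "state", "execution_date"]
    rows = [("example_dag", "success", datetime(2024, 1, 31, 12, 0))]
    print(rows_to_csv(columns, rows))
    print(build_s3_key("export", "dag_run", date(2024, 1, 31)))
```

Partitioning keys by export date is one common way to keep each run's output separate in the data lake. The layout the repository actually uses may differ.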

Solution Overview

The diagram below illustrates the solution architecture. Note that Amazon QuickSight is NOT included in the CloudFormation stack in this repository. It appears in the diagram to show that the metadata can be visualized using a business intelligence tool.

Figure 1 - Solution Architecture

Prerequisites

To implement the solution, you will need the following:

Deploy Infrastructure

The provisioning takes about 30 minutes to complete.

The CloudFormation template generates the following resources:

What this repo contains

cloudformation/
  managed-airflow-cfn.yaml
dags/
  extract_metadata.py
  run-simple-dags.py
  run-glue-jobs.py
glue-etl/
  glue-transform.py
  glue-csv-parquet.py
images/
  solution-architecture.png
CODE_OF_CONDUCT.md
CONTRIBUTING.md
LICENSE
README.md

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.
