EMR Studio Samples

This repository contains a script and AWS CloudFormation template samples for the Amazon EMR Studio preview. You can use these samples to create EMR Studios in AWS Organizations member accounts. For more information about using EMR Studio, see Use EMR Studio in the Amazon EMR Management Guide.

You can submit feedback and requests for changes by opening an issue in this repo or by making proposed changes and submitting a pull request.

Creating an EMR Studio using the demo script

⚠️ WARNING
Charges accrue for the AWS resources that the demo script provisions, such as the Amazon VPC, subnets, and the AWS Service Catalog portfolio in the AWS CloudFormation stack.
  1. Make sure your AWS credentials are configured. For more information, see Configuring the AWS CLI. The IAM principal must have at least the Minimum Studio Admin permissions and the permissions in AdditionalPermissionForDemoScript.json in this repo.
  2. Make sure your AWS CLI version is awscli-1.18.184 or later (version 1), or awscli-2.1.4 or later (version 2).
  3. Clusters created inside EMR Studio use the default EMR resources: the EMR_EC2_DefaultRole and EMR_DefaultRole roles and the S3 bucket for logging EMR steps (e.g. s3://aws-logs-123456789012-us-east-1/elasticmapreduce/). Make sure these resources exist. If your account has never created an EMR cluster, they will be missing. The easiest way to bootstrap them is to create an EMR cluster using the console.
  4. Clone this repository, or download create_demo_studio_with_dependencies.sh, using one of the following commands:
    • Clone: git clone https://github.com/aws-samples/emr-studio-samples.git
    • Download: curl https://raw.githubusercontent.com/aws-samples/emr-studio-samples/main/create_demo_studio_with_dependencies.sh --output create_demo_studio_with_dependencies.sh
  5. In the terminal, navigate to the directory where you saved create_demo_studio_with_dependencies.sh
  6. Run: bash create_demo_studio_with_dependencies.sh
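
The prerequisites in steps 1 to 3 can be checked before running the demo script. The following is a minimal sketch, not part of this repository: the helper names (version_ge, parse_awscli_version, awscli_meets_minimum, preflight) are hypothetical, it assumes a sort that supports -V, and it assumes your IAM principal is allowed to call iam:GetRole. It does not check the S3 logging bucket.

```shell
# Returns 0 when version $1 is greater than or equal to version $2.
# Assumption: `sort -V` (version sort) is available on this machine.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Extracts "X.Y.Z" from a string such as "aws-cli/2.1.4 Python/3.7.4 ...".
parse_awscli_version() {
  printf '%s\n' "$1" | sed -n 's|^aws-cli/\([0-9][0-9.]*\).*|\1|p'
}

# Compares a version against the minimum from step 2 for its major version.
awscli_meets_minimum() {
  case "$1" in
    1.*) version_ge "$1" "1.18.184" ;;
    2.*) version_ge "$1" "2.1.4" ;;
    *)   return 1 ;;
  esac
}

# Hypothetical helper: runs read-only checks for steps 1 to 3.
preflight() {
  # Step 1: credentials resolve to an identity.
  aws sts get-caller-identity >/dev/null ||
    { echo "AWS credentials are not configured" >&2; return 1; }

  # Step 2: CLI version (older v1 releases print the version to stderr).
  version=$(parse_awscli_version "$(aws --version 2>&1)")
  awscli_meets_minimum "$version" ||
    { echo "AWS CLI version '$version' is older than required" >&2; return 1; }

  # Step 3: default EMR roles exist.
  for role in EMR_DefaultRole EMR_EC2_DefaultRole; do
    aws iam get-role --role-name "$role" >/dev/null ||
      { echo "Missing default role: $role" >&2; return 1; }
  done
}
```

After sourcing these functions, running preflight before step 6 prints the first failing prerequisite to stderr and returns a non-zero status; it returns 0 when all checks pass.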

Creating an EMR Studio using your own S3 bucket, VPC and cluster templates

⚠️ WARNING
Make sure your VPC and subnets have the required tag: key = "for-use-with-amazon-emr-managed-policies", value = "true". Also update the S3 resource in the service role policy to point to your own S3 bucket.
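
The required tag can be applied from the AWS CLI with aws ec2 create-tags. The sketch below only prints the command so that you can review it before running it yourself; the helper name print_emr_tag_command and any resource IDs you pass to it are illustrative and not part of this repository.

```shell
# Tag key and value required for the VPC and subnets.
EMR_TAG_KEY="for-use-with-amazon-emr-managed-policies"
EMR_TAG_VALUE="true"

# Hypothetical helper: prints, without running, an `aws ec2 create-tags`
# command that tags every resource ID passed as an argument.
print_emr_tag_command() {
  if [ "$#" -eq 0 ]; then
    echo "usage: print_emr_tag_command <vpc-or-subnet-id>..." >&2
    return 1
  fi
  printf 'aws ec2 create-tags --resources %s --tags Key=%s,Value=%s\n' \
    "$*" "$EMR_TAG_KEY" "$EMR_TAG_VALUE"
}
```

For example, print_emr_tag_command "$your_vpc" "$your_subnet_1" "$your_subnet_2" prints a single command that covers the VPC and both subnets.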

If you prefer to use an existing S3 bucket, VPC, private subnets (with NAT) and Service Catalog products, use min_studio_dependencies.yml to create a minimum resource stack for your Studio. This stack contains only the resources needed to create an EMR Studio: one service role, one user role, three example session policies and two security groups.

  1. If you did not clone the repository, download min_studio_dependencies.yml to your local machine using the following command: curl https://raw.githubusercontent.com/aws-samples/emr-studio-samples/main/min_studio_dependencies.yml -o min_studio_dependencies.yml
  2. Create a new CloudFormation stack from min_studio_dependencies.yml using the AWS Management Console or the AWS CLI. (Provide your VPC ID for the stack parameter VPC.)
  3. Remove the egress rule of EngineSecurityGroup. (Unfortunately, CloudFormation does not support creating a security group with no egress rules.)
  4. Note down the CloudFormation stack outputs: EMRStudioServiceRoleArn, EMRStudioUserRoleArn, EngineSecurityGroup and WorkspaceSecurityGroup.
  5. Run the following command, substituting your own values for the variables:
aws emr create-studio --region $region \
--name $studio_name \
--auth-mode SSO \
--vpc-id $your_vpc \
--subnet-ids $your_subnet_1 $your_subnet_2 \
--service-role $service_role \
--user-role $user_role \
--workspace-security-group-id $workspace_sg \
--engine-security-group-id $engine_sg \
--default-s3-location s3://$your_s3_bucket
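
Steps 2 to 4 can also be done from the AWS CLI. The following is a sketch under stated assumptions, not part of this repository: the helper names (create_min_stack, stack_output, remove_all_egress) are hypothetical, the template is assumed to need no parameters other than VPC, and CAPABILITY_NAMED_IAM is assumed to be sufficient for the IAM roles it creates. remove_all_egress revokes whatever egress rules the group currently has, so inspect the group first if you are unsure.

```shell
# Hypothetical helper for step 2: creates the stack and waits for completion.
# $1 = stack name, $2 = VPC ID. Run it from the directory that contains
# min_studio_dependencies.yml.
create_min_stack() {
  aws cloudformation create-stack \
    --stack-name "$1" \
    --template-body file://min_studio_dependencies.yml \
    --parameters "ParameterKey=VPC,ParameterValue=$2" \
    --capabilities CAPABILITY_NAMED_IAM || return 1
  aws cloudformation wait stack-create-complete --stack-name "$1"
}

# Hypothetical helper for step 4: prints one output value of a stack.
# $1 = stack name, $2 = output key (for example EngineSecurityGroup).
stack_output() {
  aws cloudformation describe-stacks \
    --stack-name "$1" \
    --query "Stacks[0].Outputs[?OutputKey=='$2'].OutputValue" \
    --output text
}

# Hypothetical helper for step 3: revokes every egress rule currently
# attached to the security group whose ID is passed as $1.
remove_all_egress() {
  rules=$(aws ec2 describe-security-groups --group-ids "$1" \
    --query 'SecurityGroups[0].IpPermissionsEgress' --output json) || return 1
  # Nothing to revoke when the group already has no egress rules.
  if [ "$rules" = "[]" ]; then
    return 0
  fi
  aws ec2 revoke-security-group-egress --group-id "$1" --ip-permissions "$rules"
}
```

With these functions sourced, the variables used by create-studio can be filled from the stack, for example engine_sg=$(stack_output "$stack_name" EngineSecurityGroup) followed by remove_all_egress "$engine_sg", and likewise for EMRStudioServiceRoleArn, EMRStudioUserRoleArn and WorkspaceSecurityGroup. Here $stack_name is whatever name you passed to create_min_stack.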

Security

See CONTRIBUTING for more information.

Copyright and License

All content in this repository, unless otherwise stated, is Copyright © Amazon Web Services, Inc. or its affiliates. All rights reserved.

The sample code within this repository is made available under the MIT-0 License. See the LICENSE file.
