aws-samples/serverless-rss-filtered-feed-gen

Filtered RSS feed generator

Description

RSS is a popular web syndication format that allows users to follow updates to a site. This is a configurable serverless solution that generates filtered RSS feeds and makes them publicly accessible. The defined RSS sources are read at a given interval, and new filtered feeds are generated and stored.

⚠️ This template can only be deployed in the us-east-1 region. The reason for this is the need for a certificate for CloudFront to be created in us-east-1. For deployment in other regions, a stackset for certificate creation in us-east-1 is required.


Architecture overview

Architecture overview diagram

The architecture uses a minimum number of AWS services to keep it easy to maintain and cost-effective.

Resources

  • Sources Table - Amazon DynamoDB table that holds all RSS feed sources, their filter keywords, and the target feed parameters
  • RSS feeds Queue - Amazon SQS queue that dispatches a message for each RSS feed
  • List Sources Function - AWS Lambda function that lists all RSS feed sources for processing
  • Process RSS Feeds Function - AWS Lambda function that checks an RSS feed for new content
  • RSS feeds bucket - Amazon S3 bucket that stores the new RSS feeds and serves them to feed consumers through an Amazon CloudFront distribution
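
The fan-out between the List Sources Function and the RSS feeds Queue could look roughly like the sketch below. It is not the repository's actual handler. The function name `dispatch_sources` is hypothetical, and `table` and `queue` are assumed to behave like a boto3 DynamoDB `Table` resource (`scan`) and an SQS `Queue` resource (`send_message`). They are passed in as arguments so the logic can be tried without AWS access.

```python
import json


def dispatch_sources(table, queue):
    """Send one queue message per source item and return the number sent.

    Assumption: `table.scan()` returns a dict with "Items" and, when more
    pages exist, "LastEvaluatedKey"; `queue.send_message()` accepts a
    MessageBody string. Both mirror the boto3 resource interfaces.
    """
    sent = 0
    scan_kwargs = {}
    while True:
        page = table.scan(**scan_kwargs)
        for item in page.get("Items", []):
            # Items in the Sources Table hold only strings and lists of
            # strings, so they serialize to JSON without custom encoders.
            queue.send_message(MessageBody=json.dumps(item))
            sent += 1
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return sent
        # Continue the scan where the previous page stopped.
        scan_kwargs["ExclusiveStartKey"] = last_key
```

Sending one message per source lets each feed be processed, and retried, independently of the others.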

Sources Table - Schema

| Attribute | Type | Description |
| --- | --- | --- |
| source | String | An HTTP/S URL for the RSS feed |
| filter | List of String | A list of filter keywords. Regex filters are also supported, e.g.: "Europe (.?Zurich.?)\)" |
| newfeedname | String | Key (filename) of the filtered feed object on S3, accessible via CloudFront |
| [newfeedtitle] | String | Optional: Title of the filtered feed (Default: title of the source feed) |
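
As an illustration of how a `filter` list with keywords and regular expressions might be evaluated, here is a minimal sketch. It assumes every entry is treated as a case-sensitive regular expression and that one match is enough to keep an entry; the actual matching rules of the Process RSS Feeds Function may differ. The helper name `matches_any_filter` and the sample patterns are hypothetical.

```python
import re


def matches_any_filter(text, filters):
    """Return True if at least one filter pattern is found in text."""
    # Plain keywords are valid regular expressions too, so a single
    # re.search call covers both kinds of filter entry.
    return any(re.search(pattern, text) for pattern in filters)


# Hypothetical filter list: one plain keyword and one regex.
filters = ["Lambda", r"Europe \(.*Zurich.*\)"]

print(matches_any_filter(
    "Service A is now available in the Europe (Zurich) Region", filters))
print(matches_any_filter(
    "Service A is now available in the Asia Pacific (Tokyo) Region", filters))
```

The first call prints `True` and the second prints `False`. Parentheses in a region name must be escaped, because an unescaped `(` starts a regex group.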

Process flow example

In this example the AWS What's New RSS feed is filtered for announcements related to the Europe (Zurich) Region.

Feed process overview
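
The filtering step in this flow can be sketched with the Python standard library. This is an assumption-laden illustration, not the code that the solution deploys: it expects a plain RSS 2.0 document, matches filters against each item's `title` and `description`, and the function name `filter_feed` and the sample feed are made up for the example.

```python
import re
import xml.etree.ElementTree as ET


def filter_feed(rss_xml, filters, new_title=None):
    """Return the RSS document with only the items matching a filter."""
    root = ET.fromstring(rss_xml)
    channel = root.find("channel")
    if channel is None:
        raise ValueError("not an RSS 2.0 document: <channel> is missing")
    title_el = channel.find("title")
    if new_title is not None and title_el is not None:
        # Mirrors the optional newfeedtitle attribute of the Sources Table.
        title_el.text = new_title
    for item in channel.findall("item"):
        text = " ".join(
            item.findtext(tag) or "" for tag in ("title", "description"))
        if not any(re.search(pattern, text) for pattern in filters):
            channel.remove(item)
    return ET.tostring(root, encoding="unicode")


SAMPLE = """<rss version="2.0"><channel><title>What's New</title>
<item><title>Service A now available in the Europe (Zurich) Region</title></item>
<item><title>Service B now available in the Asia Pacific (Tokyo) Region</title></item>
</channel></rss>"""

filtered = filter_feed(
    SAMPLE, [r"Europe \(Zurich\)"], new_title="Zurich announcements")
print(filtered)
```

Only the Zurich item remains in the printed feed, and the channel title is replaced by the new title.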


Prerequisites and tools

This application can be deployed prepackaged via AWS Serverless Application Repository or manually using the AWS Serverless Application Model Command Line Interface (AWS SAM CLI).

Regardless of the deployment option, you must have a registered domain name, such as example.com, that points to a Route 53 hosted zone in the same AWS account in which you deploy this solution. For more information, see Configuring Amazon Route 53 as your DNS service.

This application deploys the following resources:

  • Amazon CloudWatch Event Rule
  • Amazon DynamoDB Table
  • AWS Lambda Functions
  • Amazon SQS Queue
  • Amazon S3 Bucket
  • AWS Certificate Manager Certificate
  • Amazon Route 53 RecordSet

Parameters

  • CloudFrontHostname: The hostname to be used as the alias for the CloudFront distribution.

    • Required: Yes
  • R53HostedZoneId: The ID of the Route 53 hosted zone in which the certificate validation record and the CloudFront alias record are created.

    • Required: Yes
  • [ScheduleExpression]: Defines how frequently feeds are updated.

  • [RSSFeedQueueVisibilityTimeout]: The number of seconds to wait before a message in the RSS feed SQS queue becomes visible again if prior processing was not successful.

    • Required: No
    • Default: 90s (min 60s)
  • [RSSFeedQueueRetention]: The number of seconds to retain a message in the RSS feed SQS queue.

    • Required: No
    • Default: 300s

Build, deployment, configuration and testing

If you have deployed the application via AWS Serverless Application Repository, you can skip to step 3 (adding feed information to the DynamoDB table).

  1. Build the serverless application (rss-filtered-feed-gen) for deployment

    sam build --use-container
  2. Deploy the serverless application (rss-filtered-feed-gen) to the AWS Cloud

    sam deploy --parameter-overrides ParameterKey="CloudFrontHostname",ParameterValue="<hostname>" ParameterKey="R53HostedZoneId",ParameterValue="<R53 HostedZone Id>"
  3. Once deployed, add the first source feed to the DynamoDB table.

    1. In the AWS console, go to Amazon DynamoDB.
    2. Click Tables.
    3. Click the table name rss-filtered-feed-gen...
    4. Click Actions -> Create Item.
    5. Click JSON View.
    6. Paste and customize the following JSON:
      {
        "source": {
          "S": "<SOURCE FEED URL>"
        },
        "filter": {
          "L": [
            {
              "S": "<FILTER STRING>"
            }
          ]
        },
        "newfeedname": {
          "S": "<NEW FEED NAME>"
        },
        "newfeedtitle": {
          "S": "<NEW FEED TITLE>"
        }
      }
  4. Optional: For testing, you can manually run the rss-filtered-feed-gen-ListSourcesFunction... Lambda function to trigger the flow. Issue a test run (no input parameters needed) in the AWS Lambda console or via the CLI.
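
If you prefer to generate the item from step 3 rather than edit the JSON by hand, a small helper along these lines can produce it in the DynamoDB JSON format shown above. This is only a sketch: the function name `build_source_item`, the example URL, the filter, and the file name are placeholders and not values from this repository.

```python
import json


def build_source_item(source, filters, newfeedname, newfeedtitle=None):
    """Build a Sources Table item in DynamoDB JSON (typed) notation."""
    item = {
        "source": {"S": source},
        "filter": {"L": [{"S": pattern} for pattern in filters]},
        "newfeedname": {"S": newfeedname},
    }
    # newfeedtitle is optional; when it is omitted, the title of the
    # source feed is used as documented in the schema.
    if newfeedtitle is not None:
        item["newfeedtitle"] = {"S": newfeedtitle}
    return item


item = build_source_item(
    source="https://example.com/feed.xml",  # placeholder source URL
    filters=[r"Europe \(Zurich\)"],         # placeholder filter
    newfeedname="zurich.xml",               # placeholder S3 key
)
print(json.dumps(item, indent=2))
```

The printed JSON can be pasted into the JSON View of the Create Item dialog.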

License

License: MIT
