This project pulls in AWS documentation and extracts, consolidates, and organizes all admonitions.
It is used to generate the content for AWS Reference Notes.
AWS Reference Notes is a compilation of the Note-able sections of AWS services [1].
It is compiled by parsing all sections of the AWS documentation and extracting specific admonitions (e.g. Note, Important, Considerations), which are then organized and compiled here.
AWS Reference Notes exists because I observed that any section of the AWS docs that starts with a Note is worth paying attention to. These sections document gotchas, limits, and other caveats of a particular service. When overlooked, these caveats can take anywhere from hours to weeks to work around.
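To make the idea of "extracting admonitions" concrete, here is a minimal sketch. It is not the project's actual parser: the `Admonition` type and `extractAdmonitions` function are hypothetical, and it assumes an admonition is written as a line containing only a bold label (for example `**Note**`) followed by body lines up to the next blank line.

```typescript
// Hypothetical shape for an extracted admonition; not the project's real type.
interface Admonition {
  kind: string; // e.g. "Note" or "Important"
  body: string; // admonition text, joined into a single line
}

// Labels to look for, taken from the examples mentioned above.
const ADMONITION_KINDS = ["Note", "Important", "Considerations"];

// Assumption: an admonition is a line holding only a bold label,
// followed by its body lines, terminated by a blank line or another label.
function extractAdmonitions(markdown: string): Admonition[] {
  const labelPattern = new RegExp(
    `^\\*\\*(${ADMONITION_KINDS.join("|")})\\*\\*\\s*$`
  );
  const lines = markdown.split(/\r?\n/);
  const results: Admonition[] = [];

  for (let i = 0; i < lines.length; i++) {
    const match = labelPattern.exec(lines[i].trim());
    if (!match) continue;

    // Collect body lines until a blank line or the next label.
    const bodyLines: string[] = [];
    let j = i + 1;
    while (
      j < lines.length &&
      lines[j].trim() !== "" &&
      !labelPattern.test(lines[j].trim())
    ) {
      bodyLines.push(lines[j].trim());
      j++;
    }

    if (bodyLines.length > 0) {
      results.push({ kind: match[1], body: bodyLines.join(" ") });
    }
    i = j - 1; // resume scanning at the line that ended the body
  }
  return results;
}
```

Calling `extractAdmonitions` on the text of a doc page returns one entry per labelled block and ignores ordinary paragraphs.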
- Clone the repo
    git clone https://github.com/kevinslin/aws-doc-extractor.git
- Install dependencies
    cd aws-doc-extractor
    yarn
- Compile TypeScript
    yarn watch
- Generate docs (optional)
    yarn gen:all
NOTE: currently, the default is to write the docs in Dendron-flavored markdown.
These instructions go over generating content to update AWS Reference Notes.
- Create a dependencies folder
    mkdir dependencies && cd dependencies
    git clone https://github.com/dendronhq/dendron-api-v2
    git clone https://github.com/kevinslin/aws-reference-notes
- Run the api server
    cd dendron-api-v2
    yarn
    yarn dev
- Sync the docs
    # $ROOT is where the package.json of aws-doc-extractor is.
    # The shell does not expand $ROOT inside the single-quoted JSON body,
    # so replace it there with the absolute path.
    cd $ROOT
    curl --location 'localhost:8080/sync/to' \
      --header 'Content-Type: application/json' \
      --data '{
        "src": "$ROOT/build/artifacts",
        "dest": "$ROOT/dependencies/aws-reference-notes/services",
        "targetFormat": "markdown",
        "include": "hierarchies=*",
        "exclude": "hierarchies=ignore.*",
        "deleteMissing": true
      }'
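Because the JSON body of the sync request needs absolute paths, it can be easier to generate it than to type it. The sketch below builds the same body from a given root directory. The field names are copied from the curl command above; the `SyncRequest` interface, the `buildSyncRequest` function, and the example path are hypothetical.

```typescript
// Field names mirror the curl example; the interface name is hypothetical.
interface SyncRequest {
  src: string;
  dest: string;
  targetFormat: string;
  include: string;
  exclude: string;
  deleteMissing: boolean;
}

// `root` is the absolute path of the aws-doc-extractor checkout ($ROOT).
function buildSyncRequest(root: string): SyncRequest {
  const base = root.replace(/\/+$/, ""); // drop any trailing slashes
  return {
    src: `${base}/build/artifacts`,
    dest: `${base}/dependencies/aws-reference-notes/services`,
    targetFormat: "markdown",
    include: "hierarchies=*",
    exclude: "hierarchies=ignore.*",
    deleteMissing: true,
  };
}

// Serialized body, ready to pass to `curl --data` or any HTTP client.
// The path below is a placeholder, not a real location.
const requestBody = JSON.stringify(
  buildSyncRequest("/path/to/aws-doc-extractor"),
  null,
  2
);
console.log(requestBody);
```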
The following describes how the docs are extracted, in pseudocode:
- services.forEach
  - upsertDevGuide: clone aws repo or pull
  - upsertToc: fetch table of contents for particular service
- services.forEach
  - extractNotesFromService: extract Note sections for each service
  - generateSiteToc: generate a global table of contents for all services
  - processMarkdownFiles: extract Note sections from aws docs
  - combineTocAndNotes: merge extracted sections with aws table of contents
  - renderFromJSON(renderTargetFormat): write sections into target format
  - filterSectionWithContent: exclude all docs that don't have Note sections
  - section2VFiles: convert sections to virtual files
  - new {renderTargetFormat}.write: write output to given target formatter
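A rough sketch of three of the steps above (merging notes with the table of contents, dropping empty sections, and rendering) might look like the following. The function names `combineTocAndNotes` and `filterSectionWithContent` come from the pseudocode, but their signatures, the `TocEntry` and `Section` types, and `renderMarkdown` are assumptions made for illustration, not the project's actual code.

```typescript
// Hypothetical data shapes; the project's real types may differ.
interface TocEntry {
  title: string;
  file: string;
}

interface Section {
  title: string;
  file: string;
  notes: string[];
}

// combineTocAndNotes: attach extracted notes to each table-of-contents entry.
function combineTocAndNotes(
  toc: TocEntry[],
  notesByFile: Record<string, string[]>
): Section[] {
  return toc.map((entry) => ({
    title: entry.title,
    file: entry.file,
    notes: notesByFile[entry.file] || [],
  }));
}

// filterSectionWithContent: drop sections that have no Note content.
function filterSectionWithContent(sections: Section[]): Section[] {
  return sections.filter((section) => section.notes.length > 0);
}

// Stand-in for a target writer: renders each section as a heading plus bullets.
function renderMarkdown(sections: Section[]): string {
  return sections
    .map((s) => `## ${s.title}\n\n${s.notes.map((n) => `- ${n}`).join("\n")}`)
    .join("\n\n");
}
```

Filtering after the merge, rather than before, keeps the table of contents as the source of ordering while still excluding pages that contributed nothing.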
This describes the file layout of the project
- build/
  - artifacts/
    - {service}/
      - {target}/
    - SUMMARY.md
  - staging/
    - {service}/
- docs/
  - {service}/
    - developer-guide/
    - toc.json
- build: contains the extracted notes
  - staging: intermediary files created when extracting notes
  - {service}: specific AWS service
  - {target}: target format (e.g. markdown, html)
  - SUMMARY.md: table of contents for all services
- docs: AWS upstream doc repos
  - {service}:
    - developer-guide: doc repo for a particular AWS service
    - toc.json: the table of contents for a particular AWS service
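The layout can also be expressed as a few path helpers. This is only a sketch: the helper names are hypothetical, and it assumes the nesting `build/artifacts/{service}/{target}`, `build/staging/{service}`, and `docs/{service}/developer-guide`.

```typescript
// Join path segments with "/" and collapse duplicate slashes.
const joinPath = (...parts: string[]): string =>
  parts.join("/").replace(/\/{2,}/g, "/");

// Extracted notes for one service in one target format (assumed nesting).
function artifactDir(root: string, service: string, target: string): string {
  return joinPath(root, "build", "artifacts", service, target);
}

// Intermediary files created while extracting notes for one service.
function stagingDir(root: string, service: string): string {
  return joinPath(root, "build", "staging", service);
}

// Upstream AWS doc repo for one service.
function devGuideDir(root: string, service: string): string {
  return joinPath(root, "docs", service, "developer-guide");
}
```

Keeping the layout in one place like this means a change to the directory structure only has to be made once.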
- Jest tests don't work. The package was converted to ES modules, which has caused Jest to fail.
- HTML target does not work. The way targets work was refactored, and the HTML target spec has not been updated yet.
Footnotes
[1] NOTE: AWS Reference Notes currently has 66 services represented.