aws-neuron/transformers-neuronx

Transformers Neuron for Trn1 and Inf2 is a software package that enables PyTorch users to perform large language model (LLM) inference on second-generation Neuron hardware (see NeuronCore-v2).

Transformers Neuron (transformers-neuronx) Documentation

Please refer to the Transformers Neuron documentation for setup and developer guides.

Installation

Stable Release

To install the most rigorously tested stable release, use the PyPI pip wheel:

pip install transformers-neuronx --extra-index-url=https://pip.repos.neuron.amazonaws.com

Development Version

The AWS Neuron team is currently restructuring the contribution model of this GitHub repository. The repository content does not reflect the latest features and improvements of the transformers-neuronx library. Please install the stable release from https://pip.repos.neuron.amazonaws.com to get the latest features and improvements.

Release Notes and Supported Models

Please refer to the transformers-neuronx release notes to see the latest supported features and models.

Troubleshooting

Please refer to our Contact Us page for additional information and support resources. If you intend to file a ticket and you can share your model artifacts, please re-run your failing script with NEURONX_DUMP_TO=./some_dir. This will dump compiler artifacts and logs to ./some_dir. You can then include this directory in your correspondence with us. The artifacts and logs are useful for debugging the specific failure.
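
As a rough sketch of the re-run step described above, the helper below sets NEURONX_DUMP_TO for a single command and then archives the dump directory so it can be attached to a ticket. The function name rerun_with_dump, the archive name, and the mkdir step are illustrative assumptions, not part of transformers-neuronx; only the NEURONX_DUMP_TO variable comes from this README.

```shell
# Hypothetical helper: re-run a command with NEURONX_DUMP_TO set, then
# archive the dump directory.
# Usage: rerun_with_dump <dump_dir> <command> [args...]
rerun_with_dump() {
    dump_dir="$1"
    shift
    # Create the directory up front (assumption: harmless if the tooling
    # would also create it on its own).
    mkdir -p "$dump_dir"
    # Set the variable for this one command only. The exit status is
    # ignored because the script being re-run is expected to fail.
    env NEURONX_DUMP_TO="$dump_dir" "$@" || true
    # Bundle everything that was written, e.g. ./some_dir -> ./some_dir.tar.gz
    tar -czf "${dump_dir%/}.tar.gz" "$dump_dir"
}
```

For example, `rerun_with_dump ./some_dir python failing_script.py` would re-run a script (failing_script.py is a placeholder for your own) and leave ./some_dir.tar.gz next to the dump directory.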

Security

See CONTRIBUTING for more information.

License

This library is licensed under the Apache License 2.0.