Text to Speech on Android Using AWS Amplify

AWS Amplify offers many categories that make specific use cases easier to implement, using a variety of AWS services under the hood. The Amplify Predictions category enables you to integrate machine learning into your application without any prior machine learning experience.

In this blog post, you will learn how to use the Predictions category to implement text to speech in an Android app.

Creating the App

Start by creating a new Android phone project and selecting the Empty Compose Activity template.

Name the project and set the Minimum SDK to API 24 or higher.

Initializing Amplify

In Android Studio, open the Terminal and create a new Amplify project by running the following command:

amplify init

Select the default value for each of the prompts or make adjustments as you see fit. The values I entered are listed in the snippet below:

? Enter a name for the project TextToSpeechBlog
The following configuration will be applied:

Project information
| Name: TextToSpeechBlog
| Environment: dev
| Default editor: Visual Studio Code
| App type: android
| Res directory: app/src/main/res

? Initialize the project with the above configuration? Yes
Using default provider  awscloudformation
? Select the authentication method you want to use: AWS profile
? Please choose the profile you want to use default

You will see the following output in the Terminal if you created the initial project successfully:

✅ Initialized your environment successfully.

Next, add the Predictions category by running the following command:

amplify add predictions

The Predictions category requires the Auth category to control who can access the Predictions resources. Enter the following values when prompted:

? Please select from one of the categories below Convert
? You need to add auth (Amazon Cognito) to your project in order to add storage for user files. Do you want to add auth now? Yes
? Do you want to use the default authentication and security configuration? Default configuration
? How do you want users to be able to sign in? Username
? Do you want to configure advanced settings? No, I am done.
? What would you like to convert? Generate speech audio from text
? Provide a friendly name for your resource speechGeneratorce5ed73c
? What is the source language? US English
? Select a speaker Joanna - Female
? Who should have access? Auth and Guest users

If you have successfully configured the Auth and Predictions categories, you will see the following output:

✅ Successfully updated auth resource locally.
Successfully added resource speechGeneratorce5ed73c locally

Push the Auth and Predictions configurations up to the cloud by running the following command:

amplify push -y

The -y flag allows you to push your configuration without needing to confirm the changes that will be applied to your Amplify project.

You will see the following output when your resources have successfully been configured:

✔ All resources are updated in the cloud

Installing Dependencies

Now that the Amplify backend is configured, it’s time to add Amplify as a dependency for the Android project. Add the following code to the dependencies block of the app-level build.gradle file:

// 1
implementation 'com.amplifyframework:aws-auth-cognito:2.0.0'
implementation 'com.amplifyframework:aws-predictions:2.0.0'
// 2
implementation "androidx.compose.material:material-icons-extended:$compose_ui_version"

  1. Both the aws-auth-cognito and aws-predictions packages are needed to configure the respective plugins with Amplify.
  2. material-icons-extended provides additional Material icons, which are used as part of the UI.

Then click Sync Now to install the dependencies. You should see the following output in the Build section:

BUILD SUCCESSFUL in 1s

Building the UI

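The Android project is ready to work with the Amplify resources. Open MainActivity.kt and replace the MainActivity class with the following, keeping the package declaration and letting Android Studio add any missing imports. AmplifyTextToSpeechTheme is the theme generated for this project, so substitute the theme name from your own project if it differs: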

class MainActivity : ComponentActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContent {
            AmplifyTextToSpeechTheme {
                Surface(
                    modifier = Modifier.fillMaxSize(),
                    color = MaterialTheme.colors.background
                ) {
                    TextToSpeechScreen {}
                }
            }
        }
    }
}

@Composable
fun TextToSpeechScreen(message: (String) -> Unit) {
    val messageState = remember { mutableStateOf("") }

    Column(
        verticalArrangement = Arrangement.Center,
        horizontalAlignment = Alignment.CenterHorizontally,
        modifier = Modifier.fillMaxSize()
    ) {
        TextField(value = messageState.value, onValueChange = { messageState.value = it })

        IconButton(onClick = { message(messageState.value) }, modifier = Modifier.size(100.dp)) {
            Icon(
                Icons.Filled.VolumeUp,
                contentDescription = "Volume",
                modifier = Modifier.fillMaxSize(0.75f)
            )
        }
    }
}

The snippet above creates a simple Jetpack Compose UI that consists of a TextField used to capture user input and an IconButton that will be used to trigger the text-to-speech functionality.

Adding Plugins

Amplify categories must be configured before the app can use them. Create the following function in MainActivity:

private fun configureAmplify() {
    try {
        Amplify.addPlugin(AWSCognitoAuthPlugin())
        Amplify.addPlugin(AWSPredictionsPlugin())
        Amplify.configure(applicationContext)
        Log.i("AmplifyProject", "Amplify Configured")
    } catch (error: Exception) {
        Log.e("AmplifyProject", "Failed Configure", error)
    }
}

The configureAmplify method will attempt to add the AWSCognitoAuthPlugin and AWSPredictionsPlugin to Amplify which enables their respective APIs. If there is an issue with the configuration, it will be logged in Logcat.

Next, call configureAmplify() in the onCreate method before any Amplify resources are used:

... // super.onCreate(savedInstanceState)

configureAmplify()

... // setContent {

Build and run the app, and you will see the following output in Logcat:

I/AmplifyProject: Amplify Configured
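
One thing to keep in mind: onCreate runs again whenever the Activity is recreated (for example, on rotation), so configureAmplify() can be called more than once. Amplify reports an error when it is already configured, which would show up as a "Failed Configure" entry in Logcat. A common alternative is to configure Amplify once in an Application subclass. The sketch below assumes a hypothetical class name, TextToSpeechApp, which is not part of the original project and would also need to be registered with the android:name attribute on the application element in AndroidManifest.xml.

```kotlin
import android.app.Application
import android.util.Log
import com.amplifyframework.AmplifyException
import com.amplifyframework.auth.cognito.AWSCognitoAuthPlugin
import com.amplifyframework.core.Amplify
import com.amplifyframework.predictions.aws.AWSPredictionsPlugin

// Hypothetical Application subclass; the name is an assumption,
// not something generated by Android Studio or the Amplify CLI.
class TextToSpeechApp : Application() {
    override fun onCreate() {
        super.onCreate()
        try {
            // Runs once per process, so the plugins are only added a single time.
            Amplify.addPlugin(AWSCognitoAuthPlugin())
            Amplify.addPlugin(AWSPredictionsPlugin())
            Amplify.configure(applicationContext)
            Log.i("AmplifyProject", "Amplify Configured")
        } catch (error: AmplifyException) {
            Log.e("AmplifyProject", "Failed Configure", error)
        }
    }
}
```

If you take this route, the configureAmplify() call in MainActivity is no longer needed. For a single-screen demo like this one, configuring in the Activity works fine as well.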

Handling Audio

Now you can add the logic for playing audio from an InputStream. Add the following property and method to MainActivity:

private val mp = MediaPlayer()

private fun playAudio(data: InputStream) {
    val mp3File = File(cacheDir, "audio.mp3")
    try {
        // Write the synthesized audio to a file in the app's cache directory.
        FileOutputStream(mp3File).use { out ->
            val buffer = ByteArray(8 * 1024)
            var bytesRead: Int
            while (data.read(buffer).also { bytesRead = it } != -1) {
                out.write(buffer, 0, bytesRead)
            }
        }
        // Start playback only after the file has been fully written and closed.
        mp.reset()
        mp.setOnPreparedListener { obj: MediaPlayer -> obj.start() }
        // The descriptor can be closed as soon as setDataSource returns.
        FileInputStream(mp3File).use { input -> mp.setDataSource(input.fd) }
        mp.prepareAsync()
    } catch (error: IOException) {
        Log.e("AmplifyProject", "Error writing audio file.", error)
    }
}

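When playAudio is passed an InputStream, it writes the data to an MP3 file in the app's cache directory using a FileOutputStream. The MediaPlayer then reads that file back through a FileInputStream and starts playback as soon as it is prepared.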
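
The copy step in playAudio can also be written with the Kotlin standard library instead of a manual buffer loop. The following is a minimal sketch, not the approach used in this post: writeStreamToFile is a hypothetical helper name, and it relies only on InputStream.copyTo and File.outputStream from kotlin.io.

```kotlin
import java.io.File
import java.io.InputStream

// Hypothetical helper (not part of Amplify or the original post):
// copies every byte of `data` into `target`, closes both streams,
// and returns the number of bytes written.
fun writeStreamToFile(data: InputStream, target: File): Long =
    data.use { input ->
        target.outputStream().use { output ->
            // copyTo uses an 8 KB buffer by default, like the manual loop above.
            input.copyTo(output)
        }
    }
```

Inside playAudio, a call such as writeStreamToFile(data, mp3File) could stand in for the FileOutputStream block, with the MediaPlayer setup following it unchanged.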

Next, create a function that will use the Amplify Predictions API to convert a String into an InputStream:

private fun readMessage(message: String) {
    Amplify.Predictions.convertTextToSpeech(
        message,
        { playAudio(it.audioData) },
        { Log.e("AmplifyProject", "Error", it) }
    )
}

The Amplify APIs follow a consistent pattern: you select the category for your use case and call one of the methods that category offers. In this case, convertTextToSpeech is passed a String, which machine learning resources under the hood convert into speech audio. The first callback runs on success and passes the resulting audioData to the playAudio function so the device reads the message aloud. The second callback logs any error.

Lastly, update the TextToSpeechScreen call in setContent so that it calls readMessage when the user taps the speaker button:

TextToSpeechScreen {
    readMessage(it)
}

Build and run. You will now be able to enter a message into the text field and press the button to hear your message read aloud. 🎉
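
One small refinement you might consider: if the user taps the button with an empty text field, there is nothing to synthesize, so the request can be skipped. The helper below is a sketch under that assumption. The name speakableTextOrNull is hypothetical and not part of Amplify.

```kotlin
// Hypothetical helper: returns the trimmed text to speak,
// or null when the input is empty or only whitespace.
fun speakableTextOrNull(raw: String): String? =
    raw.trim().takeIf { it.isNotEmpty() }
```

In the TextToSpeechScreen callback, this could be used as speakableTextOrNull(it)?.let { text -> readMessage(text) }, so that readMessage only runs when there is something to read.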

Conclusion

Just like all Amplify categories, the Amplify Predictions category makes it easy to integrate AWS resources into your Android projects. As you use Amplify to build your next project, be sure to reach out on the GitHub repository or through the #android-help channel on the Amplify Discord server to help us prioritize features and enhancements.

Clean Up

Now that you’ve finished this walkthrough, you can delete the backend resources to avoid incurring unexpected costs using the command amplify delete.