Quickstart

Prerequisites

Before you begin, ensure you have the following installed:

Android Studio Hedgehog (2023.1.1) or newer
JDK 11 or higher
Android SDK with API level 24 or higher
An Android device or emulator running Android 7.0 (API 24) or higher

The app requires the RECORD_AUDIO permission. Make sure your device or emulator has microphone access enabled.

Installation

Clone the repository

Clone the project to your local machine:

terminal

git clone <repository-url>
cd voice-to-text-android

Open in Android Studio

Open Android Studio and select “Open an existing project”. Navigate to the cloned directory and click “OK”.

Android Studio will automatically start syncing Gradle files. This may take a few minutes on the first run.

Sync Gradle

Wait for Gradle to sync all dependencies. The app uses standard AndroidX libraries:

dependencies

implementation(libs.androidx.core.ktx)
implementation(libs.androidx.lifecycle.runtime.ktx)
implementation(libs.androidx.activity.compose)
implementation(platform(libs.androidx.compose.bom))
implementation(libs.androidx.material3)

Configure device or emulator

Set up your target device.

For a physical device:

Enable Developer Options and USB debugging
Connect your device via USB
Verify the device appears in Android Studio’s device dropdown

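For an emulator:

Open Device Manager in Android Studio (Tools > Device Manager)
Create a new virtual device with API 24 or higher
Ensure the emulator can record audio: in the emulator’s Extended Controls > Microphone, turn on “Virtual microphone uses host audio input”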

Run the app

Click the Run button (green play icon) or press Shift + F10 (Control + R on macOS). Android Studio will build and install the app on your selected device.

The first build may take several minutes as Gradle downloads dependencies and compiles the project.

Using the app

Once the app is running, you’ll see a simple interface with a text field and a “Speak” button.

Grant microphone permission

Tap the “Speak” button. On first use, the app will request permission to record audio. Tap “Allow” to grant the permission.

MainActivity.kt:125-143

if (ContextCompat.checkSelfPermission(
        context,
        Manifest.permission.RECORD_AUDIO
    ) == PackageManager.PERMISSION_GRANTED
) {
    // Launch speech recognizer
    val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
    speechRecognizerLauncher.launch(intent)
} else {
    // Request permission
    ActivityCompat.requestPermissions(
        context as Activity,
        arrayOf(Manifest.permission.RECORD_AUDIO),
        100
    )
}

Speak your message

After granting permission, tap “Speak” again. A Google speech recognition dialog will appear with the prompt “Speak now…”.

Speak clearly into your device’s microphone. The recognized text will automatically populate the text field.
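
The flow above needs a second tap on “Speak” once permission is granted. One possible alternative, sketched here under assumptions, uses the ActivityResultContracts.RequestPermission contract so the recognizer opens as soon as the user taps “Allow”. SpeakButton, onSpokenText, and recognizerIntent are hypothetical names for illustration and are not taken from the project.

```kotlin
import android.Manifest
import android.content.Intent
import android.content.pm.PackageManager
import android.speech.RecognizerIntent
import androidx.activity.compose.rememberLauncherForActivityResult
import androidx.activity.result.contract.ActivityResultContracts
import androidx.compose.material3.Button
import androidx.compose.material3.Text
import androidx.compose.runtime.Composable
import androidx.compose.ui.platform.LocalContext
import androidx.core.content.ContextCompat

// Hypothetical helper: builds a minimal free-form recognition intent.
private fun recognizerIntent(): Intent =
    Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH).putExtra(
        RecognizerIntent.EXTRA_LANGUAGE_MODEL,
        RecognizerIntent.LANGUAGE_MODEL_FREE_FORM
    )

// Hypothetical composable, not part of the project.
@Composable
fun SpeakButton(onSpokenText: (String) -> Unit) {
    val context = LocalContext.current

    // Receives the recognizer's result and forwards the top candidate, if any.
    val recognizerLauncher = rememberLauncherForActivityResult(
        ActivityResultContracts.StartActivityForResult()
    ) { result ->
        result.data
            ?.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS)
            ?.firstOrNull()
            ?.let(onSpokenText)
    }

    // Launches the recognizer immediately when the permission is granted.
    val permissionLauncher = rememberLauncherForActivityResult(
        ActivityResultContracts.RequestPermission()
    ) { granted ->
        if (granted) recognizerLauncher.launch(recognizerIntent())
    }

    Button(onClick = {
        val hasPermission = ContextCompat.checkSelfPermission(
            context, Manifest.permission.RECORD_AUDIO
        ) == PackageManager.PERMISSION_GRANTED
        if (hasPermission) {
            recognizerLauncher.launch(recognizerIntent())
        } else {
            permissionLauncher.launch(Manifest.permission.RECORD_AUDIO)
        }
    }) {
        Text("Speak")
    }
}
```

This variant also avoids casting the context to an Activity and tracking a request code, because the launcher delivers the grant result directly to the callback.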

View results

The recognized speech appears in the text field. You can also manually type or edit the text if needed.

MainActivity.kt:80-91

val speechRecognizerLauncher = rememberLauncherForActivityResult(
    contract = ActivityResultContracts.StartActivityForResult(),
    onResult = { result ->
        val spokenText = result.data
            ?.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS)
            ?.firstOrNull()
        if (spokenText != null) {
            prompt = spokenText  // Update text field
        } else {
            Toast.makeText(context, "Failed to recognize speech", Toast.LENGTH_SHORT).show()
        }
    }
)
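
The handler above shows the failure toast whenever no text comes back, which includes the case where the user simply dismisses the dialog. A minimal sketch of one way to tell these cases apart is shown here. SpeechOutcome and interpretSpeechResult are hypothetical names, and the two result codes are copied from android.app.Activity so the helper stays plain Kotlin.

```kotlin
// Values mirrored from android.app.Activity (assumption: kept local so this
// helper has no Android dependency).
const val RESULT_OK = -1       // Activity.RESULT_OK
const val RESULT_CANCELED = 0  // Activity.RESULT_CANCELED

// Hypothetical outcome type, not part of the project.
sealed interface SpeechOutcome {
    data class Recognized(val text: String) : SpeechOutcome
    object Cancelled : SpeechOutcome
    data class Failed(val resultCode: Int) : SpeechOutcome
}

// Maps the activity result to an outcome:
// - a cancelled dialog is not treated as an error
// - RESULT_OK with a non-blank candidate yields the first such candidate
// - anything else is reported as a failure with its result code
fun interpretSpeechResult(resultCode: Int, candidates: List<String>?): SpeechOutcome {
    if (resultCode == RESULT_CANCELED) return SpeechOutcome.Cancelled
    val best = candidates?.firstOrNull { it.isNotBlank() }
    return if (resultCode == RESULT_OK && best != null) {
        SpeechOutcome.Recognized(best)
    } else {
        SpeechOutcome.Failed(resultCode)
    }
}
```

Inside onResult, this could be called with result.resultCode and result.data?.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS), showing the toast only for the Failed case.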

Project structure

Here’s an overview of the key files in the project:

app/src/main/
├── java/com/android/example/voicetotext/
│   ├── MainActivity.kt           # Main activity and UI components
│   └── ui/theme/
│       ├── Color.kt             # Material3 color definitions
│       ├── Theme.kt             # App theme configuration
│       └── Type.kt              # Typography styles
├── res/
│   ├── values/
│   │   └── strings.xml          # String resources
│   └── xml/
│       └── backup_rules.xml     # Backup configuration
└── AndroidManifest.xml          # App manifest with permissions

Understanding the core components

MainActivity

The MainActivity class in MainActivity.kt:45-61 sets up the app with edge-to-edge display and Jetpack Compose:

MainActivity.kt

class MainActivity : ComponentActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        enableEdgeToEdge()
        setContent {
            VoiceTotextTheme {
                Scaffold(
                    modifier = Modifier.fillMaxSize(),
                    topBar = { AppBar() }) { innerPadding ->
                    VoiceRecognitionScreen(
                        modifier = Modifier.padding(innerPadding)
                    )
                }
            }
        }
    }
}

VoiceRecognitionScreen

This composable function (MainActivity.kt:75-152) contains all the UI and voice recognition logic:

Text field: A BasicTextField with a custom decoration box for placeholder text
Voice button: Triggers the speech recognition flow
Result handler: Processes the recognized speech and updates the UI state
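
As a rough illustration of the text field described above, a BasicTextField can draw placeholder text through its decorationBox parameter. This is a sketch rather than the project's actual code: PromptField, its parameters, and the placeholder string are assumptions.

```kotlin
import androidx.compose.foundation.layout.Box
import androidx.compose.foundation.layout.fillMaxWidth
import androidx.compose.foundation.text.BasicTextField
import androidx.compose.material3.MaterialTheme
import androidx.compose.material3.Text
import androidx.compose.runtime.Composable
import androidx.compose.ui.Modifier

// Hypothetical composable; the real field lives inside VoiceRecognitionScreen.
@Composable
fun PromptField(
    value: String,
    onValueChange: (String) -> Unit,
    modifier: Modifier = Modifier,
    placeholder: String = "Tap Speak or start typing" // assumed wording
) {
    BasicTextField(
        value = value,
        onValueChange = onValueChange,
        modifier = modifier.fillMaxWidth(),
        decorationBox = { innerTextField ->
            Box {
                // Show the placeholder only while the field is empty.
                if (value.isEmpty()) {
                    Text(
                        text = placeholder,
                        color = MaterialTheme.colorScheme.onSurfaceVariant
                    )
                }
                // Always draw the editable text so the cursor stays visible.
                innerTextField()
            }
        }
    )
}
```

Hoisting value and onValueChange lets the same state be updated both by typing and by the speech result handler.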

RecognizerIntent configuration

The app configures the speech recognizer with these parameters in MainActivity.kt:130-136:

val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
intent.putExtra(
    RecognizerIntent.EXTRA_LANGUAGE_MODEL,
    RecognizerIntent.LANGUAGE_MODEL_FREE_FORM  // Free-form speech
)
// EXTRA_LANGUAGE is documented as an IETF BCP 47 tag string such as "en-US"
intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault().toLanguageTag())
intent.putExtra(RecognizerIntent.EXTRA_PROMPT, "Speak now...")

LANGUAGE_MODEL_FREE_FORM selects a language model based on free-form speech recognition, which suits general dictation and conversational input. The alternative, LANGUAGE_MODEL_WEB_SEARCH, is tuned for short search queries.
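
RecognizerIntent.EXTRA_LANGUAGE is documented as taking an IETF BCP 47 language tag string. The sketch below shows how a java.util.Locale maps to such a tag. recognizerLanguageTag is a hypothetical helper name, not something defined in the project.

```kotlin
import java.util.Locale

// Hypothetical helper: converts a Locale into the BCP 47 tag string that
// EXTRA_LANGUAGE expects. Defaults to the device's current locale.
fun recognizerLanguageTag(locale: Locale = Locale.getDefault()): String =
    locale.toLanguageTag()
```

For example, recognizerLanguageTag(Locale.US) returns "en-US", and the result can be passed as the value of intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, ...).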

Troubleshooting

Permission denied

If you see a “Permission denied” error:

Check that the RECORD_AUDIO permission is declared in AndroidManifest.xml
Ensure you granted the permission when prompted
Go to device Settings > Apps > VoiceTotext > Permissions and verify microphone access
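
If the user has permanently denied the permission, the system dialog will no longer appear, so the app may want to send them to its settings page directly. The following is a minimal sketch of that idea. openAppSettings is a hypothetical helper and is not part of the project.

```kotlin
import android.content.Context
import android.content.Intent
import android.net.Uri
import android.provider.Settings

// Hypothetical helper: opens this app's details page in system Settings,
// where the user can reach Permissions and enable the microphone.
fun openAppSettings(context: Context) {
    val intent = Intent(
        Settings.ACTION_APPLICATION_DETAILS_SETTINGS,
        Uri.fromParts("package", context.packageName, null)
    ).addFlags(Intent.FLAG_ACTIVITY_NEW_TASK) // needed when context is not an Activity
    context.startActivity(intent)
}
```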

Speech recognition not available

If you get a “Failed to recognize speech” error:

Ensure your device has an active internet connection (Google’s speech recognition typically needs network access unless an offline language pack is installed)
Verify that the Google app is installed and up to date on your device
Check that the device microphone is working properly
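
On devices without a speech recognition activity, launching the intent throws ActivityNotFoundException. A defensive launch could look like the sketch below. launchRecognizerSafely and the toast wording are assumptions for illustration, not the project's code.

```kotlin
import android.content.ActivityNotFoundException
import android.content.Context
import android.content.Intent
import android.speech.RecognizerIntent
import android.widget.Toast
import androidx.activity.result.ActivityResultLauncher

// Hypothetical helper: returns true if the recognizer was launched,
// false if no activity on the device can handle the request.
fun launchRecognizerSafely(
    context: Context,
    launcher: ActivityResultLauncher<Intent>
): Boolean {
    val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH).putExtra(
        RecognizerIntent.EXTRA_LANGUAGE_MODEL,
        RecognizerIntent.LANGUAGE_MODEL_FREE_FORM
    )
    return try {
        launcher.launch(intent)
        true
    } catch (e: ActivityNotFoundException) {
        // No speech recognition activity is installed or enabled.
        Toast.makeText(
            context,
            "Speech recognition is not available on this device",
            Toast.LENGTH_LONG
        ).show()
        false
    }
}
```

This gives the user a distinct message for the missing-recognizer case instead of the generic “Failed to recognize speech” toast.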

Emulator microphone issues

If the emulator isn’t detecting your voice:

Open emulator Settings > Privacy > Microphone
Ensure “VoiceTotext” has microphone permission
In the running emulator, open Extended Controls (the three-dot button in the emulator toolbar) and select Microphone
Turn on “Virtual microphone uses host audio input” so the emulator records from your computer’s microphone

Next steps

Now that you have the app running, explore these topics:

Speech Recognition

Deep dive into how speech-to-text works

UI Components

Learn about the Jetpack Compose UI components

API Reference

Explore detailed API documentation

Troubleshooting

Solutions to common issues
