Prerequisites

Before you begin, ensure you have the following installed:
  • Android Studio Hedgehog (2023.1.1) or newer
  • JDK 11 or higher (projects built with Android Gradle Plugin 8.0 or newer need JDK 17)
  • Android SDK with API level 24 or higher
  • An Android device or emulator running Android 7.0 (API 24) or higher
The app requires the RECORD_AUDIO permission. Make sure your device or emulator has microphone access enabled.

Installation

1

Clone the repository

Clone the project to your local machine:
terminal
git clone <repository-url>
cd voice-to-text-android
2

Open in Android Studio

Open Android Studio and select “Open” (labeled “Open an existing project” in older versions). Navigate to the cloned directory and click “OK”.
Android Studio will automatically start syncing Gradle files. This may take a few minutes on the first run.
3

Sync Gradle

Wait for Gradle to sync all dependencies. The app uses standard AndroidX libraries:
dependencies
implementation(libs.androidx.core.ktx)
implementation(libs.androidx.lifecycle.runtime.ktx)
implementation(libs.androidx.activity.compose)
implementation(platform(libs.androidx.compose.bom))
implementation(libs.androidx.material3)
4

Configure device or emulator

Set up your target device.
For physical device:
  • Enable Developer Options and USB debugging
  • Connect your device via USB
  • Verify the device appears in Android Studio’s device dropdown
For emulator:
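  • Open Device Manager in Android Studio (Tools > Device Manager; called AVD Manager in older versions)
  • Create a new virtual device with API 24 or higher
  • In the running emulator, open Extended controls > Microphone and turn on “Virtual microphone uses host audio input”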
5

Run the app

Click the Run button (green play icon) or press Shift + F10 (Control + R on macOS). Android Studio will build and install the app on your selected device.
The first build may take several minutes as Gradle downloads dependencies and compiles the project.

Using the app

Once the app is running, you’ll see a simple interface with a text field and a “Speak” button.
1

Grant microphone permission

Tap the “Speak” button. On first use, the app will request permission to record audio. Tap “Allow” to grant the permission.
MainActivity.kt:125-143
if (ContextCompat.checkSelfPermission(
        context,
        Manifest.permission.RECORD_AUDIO
    ) == PackageManager.PERMISSION_GRANTED
) {
    // Launch speech recognizer
    val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
    speechRecognizerLauncher.launch(intent)
} else {
    // Request permission
    ActivityCompat.requestPermissions(
        context as Activity,
        arrayOf(Manifest.permission.RECORD_AUDIO),
        100
    )
}
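The excerpt above requests the permission through ActivityCompat.requestPermissions, which delivers its result to the activity rather than to the composable, so the user has to tap the button a second time. As an alternative sketch (not the project's code), the Activity Result API's RequestPermission contract can hand the result straight back to the composable. The names SpeakButtonSketch, onGranted, and onDenied are hypothetical.

```kotlin
import android.Manifest
import android.content.pm.PackageManager
import androidx.activity.compose.rememberLauncherForActivityResult
import androidx.activity.result.contract.ActivityResultContracts
import androidx.compose.material3.Button
import androidx.compose.material3.Text
import androidx.compose.runtime.Composable
import androidx.compose.ui.platform.LocalContext
import androidx.core.content.ContextCompat

// Hypothetical composable: SpeakButtonSketch, onGranted, and onDenied are
// illustrative names, not taken from the project.
@Composable
fun SpeakButtonSketch(onGranted: () -> Unit, onDenied: () -> Unit) {
    val context = LocalContext.current

    // Registers a permission request whose result is delivered to this lambda.
    val permissionLauncher = rememberLauncherForActivityResult(
        contract = ActivityResultContracts.RequestPermission()
    ) { granted ->
        if (granted) onGranted() else onDenied()
    }

    Button(onClick = {
        val alreadyGranted = ContextCompat.checkSelfPermission(
            context, Manifest.permission.RECORD_AUDIO
        ) == PackageManager.PERMISSION_GRANTED

        // Skip the system dialog when the permission is already held.
        if (alreadyGranted) onGranted()
        else permissionLauncher.launch(Manifest.permission.RECORD_AUDIO)
    }) {
        Text("Speak")
    }
}
```

With this shape, onGranted could launch the speech recognizer directly, so a single tap would cover both the permission prompt and the recognition dialog.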
2

Speak your message

After granting permission, tap “Speak” again. A Google speech recognition dialog will appear with the prompt “Speak now…”.
Speak clearly into your device’s microphone. The recognized text will automatically populate the text field.
3

View results

The recognized speech appears in the text field. You can also manually type or edit the text if needed.
MainActivity.kt:80-91
val speechRecognizerLauncher = rememberLauncherForActivityResult(
    contract = ActivityResultContracts.StartActivityForResult(),
    onResult = { result ->
        val spokenText = result.data
            ?.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS)
            ?.firstOrNull()
        if (spokenText != null) {
            prompt = spokenText  // Update text field
        } else {
            Toast.makeText(context, "Failed to recognize speech", Toast.LENGTH_SHORT).show()
        }
    }
)
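The handler above shows the failure toast whenever no text comes back, which includes the case where the user dismisses the dialog. A minimal sketch of a helper that checks the result code before reading the extras is shown here. The function name extractSpokenText is hypothetical and not part of the project.

```kotlin
import android.app.Activity
import android.content.Intent
import android.speech.RecognizerIntent

// Hypothetical helper, not part of the project: returns the top recognition
// hypothesis, or null when the dialog was cancelled or nothing was recognized.
fun extractSpokenText(resultCode: Int, data: Intent?): String? {
    // EXTRA_RESULTS is only meaningful when the recognizer reports RESULT_OK.
    if (resultCode != Activity.RESULT_OK) return null
    return data
        ?.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS)
        ?.firstOrNull()                  // hypotheses are ordered best-first
        ?.takeIf { it.isNotBlank() }     // treat an empty string as "no result"
}
```

Inside the launcher callback this could be called as extractSpokenText(result.resultCode, result.data), leaving the caller to decide whether a null result deserves a toast.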

Project structure

Here’s an overview of the key files in the project:
app/src/main/
├── java/com/android/example/voicetotext/
│   ├── MainActivity.kt           # Main activity and UI components
│   └── ui/theme/
│       ├── Color.kt             # Material3 color definitions
│       ├── Theme.kt             # App theme configuration
│       └── Type.kt              # Typography styles
├── res/
│   ├── values/
│   │   └── strings.xml          # String resources
│   └── xml/
│       └── backup_rules.xml     # Backup configuration
└── AndroidManifest.xml          # App manifest with permissions

Understanding the core components

MainActivity

The MainActivity class in MainActivity.kt:45-61 sets up the app with edge-to-edge display and Jetpack Compose:
MainActivity.kt
class MainActivity : ComponentActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        enableEdgeToEdge()
        setContent {
            VoiceTotextTheme {
                Scaffold(
                    modifier = Modifier.fillMaxSize(),
                    topBar = { AppBar() }) { innerPadding ->
                    VoiceRecognitionScreen(
                        modifier = Modifier.padding(innerPadding)
                    )
                }
            }
        }
    }
}

VoiceRecognitionScreen

This composable function (MainActivity.kt:75-152) contains all the UI and voice recognition logic:
  • Text field: A BasicTextField with a custom decoration box for placeholder text
  • Voice button: Triggers the speech recognition flow
  • Result handler: Processes the recognized speech and updates the UI state
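To illustrate the text field item above, here is a minimal sketch of a BasicTextField whose decoration box draws placeholder text while the field is empty. It is not the project's implementation: the composable name PromptFieldSketch and the placeholder wording are assumptions, and only the prompt state name comes from the excerpts on this page.

```kotlin
import androidx.compose.foundation.layout.Box
import androidx.compose.foundation.text.BasicTextField
import androidx.compose.material3.Text
import androidx.compose.runtime.Composable
import androidx.compose.runtime.getValue
import androidx.compose.runtime.mutableStateOf
import androidx.compose.runtime.remember
import androidx.compose.runtime.setValue
import androidx.compose.ui.Modifier
import androidx.compose.ui.graphics.Color

// Hypothetical composable; the real screen also hosts the Speak button.
@Composable
fun PromptFieldSketch(modifier: Modifier = Modifier) {
    // Holds the current text, whether typed or filled in by speech recognition.
    var prompt by remember { mutableStateOf("") }

    BasicTextField(
        value = prompt,
        onValueChange = { prompt = it },
        modifier = modifier,
        decorationBox = { innerTextField ->
            Box {
                // Draw the placeholder behind the field only while it is empty.
                if (prompt.isEmpty()) {
                    Text(text = "Tap Speak or type here", color = Color.Gray)
                }
                innerTextField()
            }
        }
    )
}
```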

RecognizerIntent configuration

The app configures the speech recognizer with these parameters in MainActivity.kt:130-136:
val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
intent.putExtra(
    RecognizerIntent.EXTRA_LANGUAGE_MODEL,
    RecognizerIntent.LANGUAGE_MODEL_FREE_FORM  // Free-form speech
)
intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault())
intent.putExtra(RecognizerIntent.EXTRA_PROMPT, "Speak now...")
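LANGUAGE_MODEL_FREE_FORM selects a language model based on free-form speech recognition, which suits general dictation and conversational input such as this app's. EXTRA_LANGUAGE requests the device's default locale, and EXTRA_PROMPT sets the text shown in the recognition dialog.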
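The same configuration can be wrapped in a small builder function, sketched below under a few assumptions. The Android reference describes EXTRA_LANGUAGE as an IETF language tag string such as "en-US", so this sketch passes Locale.toLanguageTag() rather than the Locale object. The function name buildRecognizerIntent and the EXTRA_MAX_RESULTS setting are illustrative additions, not part of the project.

```kotlin
import android.content.Intent
import android.speech.RecognizerIntent
import java.util.Locale

// Hypothetical builder, not part of the project.
fun buildRecognizerIntent(locale: Locale = Locale.getDefault()): Intent =
    Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH).apply {
        putExtra(
            RecognizerIntent.EXTRA_LANGUAGE_MODEL,
            RecognizerIntent.LANGUAGE_MODEL_FREE_FORM
        )
        // Pass the locale as a BCP 47 tag string, for example "en-US".
        putExtra(RecognizerIntent.EXTRA_LANGUAGE, locale.toLanguageTag())
        putExtra(RecognizerIntent.EXTRA_PROMPT, "Speak now...")
        // Assumption: the app only reads the first hypothesis, so ask for one.
        putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 1)
    }
```

A caller could then write speechRecognizerLauncher.launch(buildRecognizerIntent()), or pass a specific locale such as Locale.FRANCE to request recognition in another language.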

Troubleshooting

Permission denied

If you see a “Permission denied” error:
  1. Check that the RECORD_AUDIO permission is declared in AndroidManifest.xml
  2. Ensure you granted the permission when prompted
  3. Go to device Settings > Apps > VoiceTotext > Permissions and verify microphone access
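For reference when checking the first item, a RECORD_AUDIO declaration in AndroidManifest.xml generally looks like the fragment below. This is a sketch of the standard element, not a copy of the project's manifest, and the application element is abbreviated.

```xml
<manifest xmlns:android="http://schemas.android.com/apk/res/android">

    <!-- Must sit directly under <manifest>, outside <application>. -->
    <uses-permission android:name="android.permission.RECORD_AUDIO" />

    <application>
        <!-- existing activities and other components stay unchanged -->
    </application>
</manifest>
```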

Speech recognition not available

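The “Failed to recognize speech” message appears whenever the recognizer returns no text, including when you dismiss the dialog without speaking. If it appears after you have spoken:
  • Ensure your device has an active internet connection (Google’s speech recognition usually needs network access)
  • Verify that the Google app is installed and up to date on your device
  • Check that the device microphone is working properly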
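A related failure occurs when no installed app handles ACTION_RECOGNIZE_SPEECH: launching the intent then throws ActivityNotFoundException instead of returning a result. The sketch below guards the launch. The function name launchRecognizerSafely and the toast wording are hypothetical and not part of the project.

```kotlin
import android.content.ActivityNotFoundException
import android.content.Context
import android.content.Intent
import android.widget.Toast
import androidx.activity.result.ActivityResultLauncher

// Hypothetical wrapper, not part of the project.
fun launchRecognizerSafely(
    context: Context,
    launcher: ActivityResultLauncher<Intent>,
    intent: Intent
) {
    try {
        launcher.launch(intent)
    } catch (e: ActivityNotFoundException) {
        // No activity on this device can handle the speech recognition intent.
        Toast.makeText(
            context,
            "No speech recognition app is installed",
            Toast.LENGTH_LONG
        ).show()
    }
}
```

The launcher returned by rememberLauncherForActivityResult can be passed as the launcher argument, since it is an ActivityResultLauncher.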

Emulator microphone issues

If the emulator isn’t detecting your voice:
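  1. In the emulator, open Settings > Apps > VoiceTotext > Permissions and confirm that Microphone is allowed (the exact menu path varies by Android version)
  2. Open the emulator’s Extended controls (the “…” button on the emulator toolbar) and select Microphone
  3. Turn on “Virtual microphone uses host audio input” so the emulator receives audio from your computer’s microphone
  4. Confirm that your computer’s operating system allows Android Studio or the emulator to use the microphone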

Next steps

Now that you have the app running, explore these topics:

  • Speech Recognition: Deep dive into how speech-to-text works
  • UI Components: Learn about the Jetpack Compose UI components
  • API Reference: Explore detailed API documentation
  • Troubleshooting: Solutions to common issues
