[Kotlin & Android] Implementing Continuous Speech Recognition on Android (SpeechRecognizer)

Overview

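The previous posts covered Foreground services and the SpeechRecognizer API.
The SpeechRecognizer API provided by Google has no built-in mode for continuous recognition,
and because it is a free API it comes with many limitations.
This post combines the two features from those posts and restarts recognition every time a session ends,
so that speech recognition keeps running in the background.
See the previous posts for the details of each feature.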
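
The restart-on-finish idea can be modelled without any Android classes. The following is only a minimal sketch, not the service code itself: `SessionEnd` and `runContinuously` are hypothetical names introduced for illustration, and each list element stands in for one recognition session that ends through onResults() or onError().

```kotlin
// Hypothetical model of how one recognition session can end.
sealed class SessionEnd {
    data class Result(val text: String) : SessionEnd() // corresponds to onResults()
    data class Error(val code: Int) : SessionEnd()     // corresponds to onError()
}

// Plays the given sessions back to back and returns the recognized texts
// together with the number of restarts that were triggered.
fun runContinuously(sessions: List<SessionEnd>): Pair<List<String>, Int> {
    val recognized = mutableListOf<String>()
    var restarts = 0
    for (end in sessions) {
        if (end is SessionEnd.Result) recognized += end.text
        // Both outcomes lead to the same next step: start listening again.
        restarts++
    }
    return recognized to restarts
}
```

Feeding it a result, an error, and another result yields both texts and three restarts, which is the behaviour the service aims for: an error session does not break the loop.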

1. How it works

<MainActivity.kt>

        btn_start = findViewById(R.id.btn_start)
        btn_start!!.setOnClickListener(View.OnClickListener {
            Toast.makeText(this@MainActivity, "Speech recognition started", Toast.LENGTH_SHORT).show()
            val intent = Intent(this@MainActivity, Foreground::class.java)
            startService(intent)
        })

Tapping the button starts the foreground service.

<Foreground.kt>

    override fun onCreate() {
        super.onCreate()

        // Create the RecognizerIntent
        intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
        intent!!.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, packageName)
        intent!!.putExtra(RecognizerIntent.EXTRA_LANGUAGE, "ko-KR")
        createNotification()
        startSTT()
    }

The RecognizerIntent is created in onCreate().

<Foreground.kt>

    private fun startSTT() {
        stopSTT()
        mRecognizer = SpeechRecognizer.createSpeechRecognizer(this)
        mRecognizer!!.setRecognitionListener(listener)
        mRecognizer!!.startListening(intent)
    }

startSTT(): releases any existing recognizer, then creates a new one and starts speech recognition.

<Foreground.kt>

    private fun stopSTT() {
        if (mRecognizer != null) {
            mRecognizer!!.destroy()
            mRecognizer = null
        }
    }

stopSTT(): if a recognizer is currently alive, destroy() it so that startSTT() can start again with a fresh instance.
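
The destroy-then-create order used by startSTT() and stopSTT() can also be expressed without `!!`. This is a minimal sketch under assumptions: `RestartableHolder` is a hypothetical helper, not part of the Android SDK. In the service, `create` would wrap `SpeechRecognizer.createSpeechRecognizer(this)` and `release` would call `destroy()`.

```kotlin
// Hypothetical helper: owns at most one instance and always releases the old
// one before creating a new one.
class RestartableHolder<T : Any>(
    private val create: () -> T,
    private val release: (T) -> Unit
) {
    var current: T? = null
        private set

    // Same order as startSTT(): stop first, then create a fresh instance.
    fun restart(): T {
        stop()
        val fresh = create()
        current = fresh
        return fresh
    }

    // Same as stopSTT(): release the instance if there is one, otherwise do nothing.
    fun stop() {
        current?.let(release)
        current = null
    }
}
```

Because the null check lives in one place, calling `stop()` twice or calling `restart()` before anything was created is harmless.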

<Foreground.kt>

        override fun onError(i: Int) {
            // Called when a network or recognition error occurs
            val message = when (i) {
                SpeechRecognizer.ERROR_AUDIO -> "Audio error"
                SpeechRecognizer.ERROR_CLIENT -> "Client error"
                SpeechRecognizer.ERROR_INSUFFICIENT_PERMISSIONS -> "Insufficient permissions"
                SpeechRecognizer.ERROR_NETWORK -> "Network error"
                SpeechRecognizer.ERROR_NETWORK_TIMEOUT -> "Network timeout"
                SpeechRecognizer.ERROR_NO_MATCH -> "No match"
                SpeechRecognizer.ERROR_RECOGNIZER_BUSY -> "Recognizer busy"
                SpeechRecognizer.ERROR_SERVER -> "Server error"
                SpeechRecognizer.ERROR_SPEECH_TIMEOUT -> "Speech timeout"
                else -> "Unknown error"
            }
            Log.d(TAG, "Error occurred: [$message]")
            startSTT()
        }

onError(): when an error occurs, speech recognition is started again. (If no speech comes in, the ERROR_NO_MATCH error is raised.)
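
Restarting on every error is simple, but some errors may repeat immediately (for example, a missing RECORD_AUDIO permission), which can turn the restart into a tight loop. The following is a sketch of one possible policy, not the behaviour of the code above: `restartDelayMillis` and all delay values are assumptions chosen for illustration. The constants mirror the values of `android.speech.SpeechRecognizer` so that the sketch runs off-device.

```kotlin
// Values copied from android.speech.SpeechRecognizer so the sketch runs off-device.
const val ERROR_NETWORK_TIMEOUT = 1
const val ERROR_NETWORK = 2
const val ERROR_AUDIO = 3
const val ERROR_SERVER = 4
const val ERROR_CLIENT = 5
const val ERROR_SPEECH_TIMEOUT = 6
const val ERROR_NO_MATCH = 7
const val ERROR_RECOGNIZER_BUSY = 8
const val ERROR_INSUFFICIENT_PERMISSIONS = 9

// Hypothetical policy: returns the delay in milliseconds before restarting,
// or null when restarting would only repeat the same failure.
fun restartDelayMillis(errorCode: Int): Long? = when (errorCode) {
    ERROR_INSUFFICIENT_PERMISSIONS -> null                      // needs user action first
    ERROR_NO_MATCH, ERROR_SPEECH_TIMEOUT -> 0L                  // silence: listen again at once
    ERROR_RECOGNIZER_BUSY -> 500L                               // give the recognizer time to settle
    ERROR_NETWORK, ERROR_NETWORK_TIMEOUT, ERROR_SERVER -> 2_000L // back off on connectivity problems
    else -> 1_000L                                              // assumed default for other errors
}
```

In onError(), a non-null delay could be passed to `Handler(Looper.getMainLooper()).postDelayed({ startSTT() }, delay)`, while a null result would skip the restart and leave the decision to the user.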

<Foreground.kt>

        override fun onResults(results: Bundle) {
            val matches = results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)
            // The list can be null or empty, so show a Toast only when a result exists
            matches?.firstOrNull()?.let {
                Toast.makeText(this@Foreground, it, Toast.LENGTH_SHORT).show()
            }
            startSTT()
        }

onResults(): shows the top result in a Toast message and starts speech recognition again.
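
The result Bundle can also carry confidence scores under the `SpeechRecognizer.CONFIDENCE_SCORES` key (values from 0.0 to 1.0, or -1 when a score is unavailable). As a hedged sketch of how low-confidence results might be ignored: `acceptedMatch` is a hypothetical helper and the 0.5 threshold is an arbitrary assumption, not something the original code does.

```kotlin
// Hypothetical helper: returns the top candidate, or null when there is none
// or when its confidence score is known and below minConfidence.
fun acceptedMatch(
    matches: List<String>?,
    scores: FloatArray?,
    minConfidence: Float = 0.5f
): String? {
    val top = matches?.firstOrNull() ?: return null
    val score = scores?.firstOrNull() ?: return top // no scores delivered
    if (score < 0f) return top                      // -1 marks an unavailable score
    return if (score >= minConfidence) top else null
}
```

Inside onResults(), the two arguments would come from `results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)` and `results.getFloatArray(SpeechRecognizer.CONFIDENCE_SCORES)`.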

<Foreground.kt>

    override fun onDestroy() {
        super.onDestroy()
        if (mRecognizer != null) {
            mRecognizer!!.stopListening()
            mRecognizer!!.destroy()
            mRecognizer = null
        }
    }

onDestroy(): when the service is stopped, speech recognition is shut down as well.

2. Result

3. Full code

<activity_main.xml>

<?xml version="1.0" encoding="utf-8"?>
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    tools:context=".MainActivity"
    android:orientation="vertical"
    android:gravity="center">

    <Button
        android:id="@+id/btn_start"
        android:layout_width="150dp"
        android:layout_height="wrap_content"
        android:text="Start"/>

    <Button
        android:id="@+id/btn_stop"
        android:layout_width="150dp"
        android:layout_height="wrap_content"
        android:text="STOP"/>

</LinearLayout>

<AndroidManifest.xml>

<?xml version="1.0" encoding="utf-8"?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.continuous_vocie_recognition">

    <uses-permission android:name="android.permission.INTERNET"/>
    <uses-permission android:name="android.permission.RECORD_AUDIO"/>
    <uses-permission android:name="android.permission.FOREGROUND_SERVICE"/>

    <queries>
        <intent>
            <action android:name="android.speech.RecognitionService" />
        </intent>
    </queries>

    <application
        android:allowBackup="true"
        android:icon="@mipmap/ic_launcher"
        android:label="@string/app_name"
        android:roundIcon="@mipmap/ic_launcher_round"
        android:supportsRtl="true"
        android:theme="@style/Theme.Continuousvocierecognition">
        <service
            android:name=".Foreground"
            android:enabled="true"
            android:exported="true"></service>

        <activity android:name=".MainActivity"
            android:exported="true">
            <intent-filter>
                <action android:name="android.intent.action.MAIN" />

                <category android:name="android.intent.category.LAUNCHER" />
            </intent-filter>
        </activity>
    </application>

</manifest>

<MainActivity.kt>

package com.example.continuous_voice_recognition

import android.Manifest
import android.content.Intent
import android.content.pm.PackageManager
import android.os.Build
import android.os.Bundle
import android.view.View
import android.widget.Button
import android.widget.Toast
import androidx.appcompat.app.AppCompatActivity
import androidx.core.app.ActivityCompat
import androidx.core.content.ContextCompat

class MainActivity : AppCompatActivity() {

    private val TAG = "MainTag"

    //Button
    var btn_start: Button? = null
    var btn_stop: Button? = null

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)

        requestPermission()

        btn_start = findViewById(R.id.btn_start)
        btn_start!!.setOnClickListener(View.OnClickListener {
            Toast.makeText(this@MainActivity, "Speech recognition started", Toast.LENGTH_SHORT).show()
            val intent = Intent(this@MainActivity, Foreground::class.java)
            startService(intent)
        })

        btn_stop = findViewById(R.id.btn_stop)
        btn_stop!!.setOnClickListener(View.OnClickListener {
            Toast.makeText(this@MainActivity, "Speech recognition stopped", Toast.LENGTH_SHORT).show()
            val intent = Intent(this@MainActivity, Foreground::class.java)
            stopService(intent)
        })
    }

    private fun requestPermission() {
        if (Build.VERSION.SDK_INT >= 23 && ContextCompat.checkSelfPermission(
                this,
                Manifest.permission.RECORD_AUDIO
            ) != PackageManager.PERMISSION_GRANTED
        ) {
            ActivityCompat.requestPermissions(
                this, arrayOf(Manifest.permission.RECORD_AUDIO), 0
            )
        }
    }
}

<Foreground.kt>

package com.example.continuous_voice_recognition

import android.app.NotificationChannel
import android.app.NotificationManager
import android.app.PendingIntent
import android.app.Service
import android.content.Intent
import android.graphics.Color
import android.os.Build
import android.os.Bundle
import android.os.IBinder
import android.speech.RecognitionListener
import android.speech.RecognizerIntent
import android.speech.SpeechRecognizer
import android.util.Log
import android.widget.Toast
import androidx.core.app.NotificationCompat

class Foreground : Service() {
    // STT
    private var intent: Intent? = null
    private var mRecognizer: SpeechRecognizer? = null
    override fun onBind(intent: Intent): IBinder? {
        // This service is only started, never bound, so no binder is provided
        return null
    }

    override fun onCreate() {
        super.onCreate()

        // Create the RecognizerIntent
        intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
        intent!!.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, packageName)
        intent!!.putExtra(RecognizerIntent.EXTRA_LANGUAGE, "ko-KR")
        createNotification()
        startSTT()
    }

    private fun startSTT() {
        stopSTT()
        mRecognizer = SpeechRecognizer.createSpeechRecognizer(this)
        mRecognizer!!.setRecognitionListener(listener)
        mRecognizer!!.startListening(intent)
    }

    private fun stopSTT() {
        if (mRecognizer != null) {
            mRecognizer!!.destroy()
            mRecognizer = null
        }
    }

    private val listener: RecognitionListener = object : RecognitionListener {
        override fun onReadyForSpeech(bundle: Bundle) {
            // Ready to receive speech
        }

        override fun onBeginningOfSpeech() {
            // The user has started speaking
        }

        override fun onRmsChanged(v: Float) {
            // The input sound level changed
        }

        override fun onBufferReceived(bytes: ByteArray) {
            // More captured audio was received in the buffer
        }

        override fun onEndOfSpeech() {
            // The user has stopped speaking
        }

        override fun onError(i: Int) {
            // Called when a network or recognition error occurs
            val message = when (i) {
                SpeechRecognizer.ERROR_AUDIO -> "Audio error"
                SpeechRecognizer.ERROR_CLIENT -> "Client error"
                SpeechRecognizer.ERROR_INSUFFICIENT_PERMISSIONS -> "Insufficient permissions"
                SpeechRecognizer.ERROR_NETWORK -> "Network error"
                SpeechRecognizer.ERROR_NETWORK_TIMEOUT -> "Network timeout"
                SpeechRecognizer.ERROR_NO_MATCH -> "No match"
                SpeechRecognizer.ERROR_RECOGNIZER_BUSY -> "Recognizer busy"
                SpeechRecognizer.ERROR_SERVER -> "Server error"
                SpeechRecognizer.ERROR_SPEECH_TIMEOUT -> "Speech timeout"
                else -> "Unknown error"
            }
            Log.d(TAG, "Error occurred: [$message]")
            startSTT()
        }

        override fun onResults(results: Bundle) {
            val matches = results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)
            // The list can be null or empty, so show a Toast only when a result exists
            matches?.firstOrNull()?.let {
                Toast.makeText(this@Foreground, it, Toast.LENGTH_SHORT).show()
            }
            startSTT()
        }

        override fun onPartialResults(bundle: Bundle) {
            // Partial recognition results
        }

        override fun onEvent(i: Int, bundle: Bundle) {
            // Reserved for events added in the future
        }
    }

    private fun createNotification() {
        val builder = NotificationCompat.Builder(this, "default")
        builder.setSmallIcon(R.mipmap.ic_launcher)
        builder.setContentTitle("STT conversion")
        builder.setContentText("Recognizing speech..")
        builder.color = Color.RED
        val notificationIntent = Intent(this, MainActivity::class.java)
        notificationIntent.addFlags(Intent.FLAG_ACTIVITY_NEW_TASK or Intent.FLAG_ACTIVITY_SINGLE_TOP)
        // Apps targeting Android 12 (API 31) or later must set a mutability flag on a PendingIntent
        val pendingIntent = PendingIntent.getActivity(
            this, 0, notificationIntent, PendingIntent.FLAG_IMMUTABLE
        )
        builder.setContentIntent(pendingIntent) // Opens MainActivity when the notification is tapped

        // Notification channels are required on Android 8.0 (API 26) and later
        val notificationManager = this.getSystemService(NOTIFICATION_SERVICE) as NotificationManager
        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.O) {
            notificationManager.createNotificationChannel(
                NotificationChannel(
                    "default",
                    "Default channel",
                    NotificationManager.IMPORTANCE_DEFAULT
                )
            )
        }
        // startForeground() posts the notification itself; the id is a unique int per notification
        startForeground(NOTI_ID, builder.build())
    }

    override fun onDestroy() {
        super.onDestroy()
        if (mRecognizer != null) {
            mRecognizer!!.stopListening()
            mRecognizer!!.destroy()
            mRecognizer = null
        }
    }

    companion object {
        private const val TAG = "ForegroundTag"

        // Notification
        private const val NOTI_ID = 1
    }
}

Wrap-up

This post covered continuous speech recognition and converting the recognized speech to text.

Github

GitHub - jaemin-Yoo/continuous-voice-recognition: Implementation of continuous speech recognition using SpeechRecognizer api

github.com
