Android's Standard Speech Recognition Framework: Wrapping and Invoking SpeechRecognizer

Preface

Previously, I covered voice on Android in two articles:

  1. How to build in-car voice interaction: Google Voice Interaction has the answer: how a 3rd-party app can use the Voice Interaction API to quickly invoke the system's voice interaction service for basic confirm/pick-option dialogs.
  2. Straight to the principles: fully understand Android's TextToSpeech mechanism in 5 diagrams: how a TTS engine app provides the text-to-speech service, and how 3rd-party apps conveniently call it.

That leaves the last piece: how a speech recognition (SpeechRecognizer) service is provided to the system, how 3rd-party apps use it, and how the system ties the two together.

This article fills in that gap.

How to implement a recognition service?

First, we need to provide the recognition service implementation. In short, extend RecognitionService and implement its key abstract methods:

  1. Define an abstract recognition engine interface, IRecognitionEngine
  2. When the RecognitionService starts, obtain the engine vendor's implementation instance
  3. In onStartListening(), parse the parameters from the request Intent (language, max result count, etc.), pack them into a JSON string, and pass it to the engine's start call. The engine adjusts its recognition accordingly and reports the relevant states and results along the way, e.g. beginningOfSpeech(), endOfSpeech(), and partialResults()
  4. In onStopListening(), call the engine's stop; likewise the engine reports back, e.g. the final results()
  5. In onCancel(), call the engine's release() to unbind the engine and free its resources
interface IRecognitionEngine {
    fun init()
    fun startASR(parameter: String, callback: Callback?)
    fun stopASR(callback: Callback?)
    fun release(callback: Callback?)
}
class CommonRecognitionService : RecognitionService() {
    private val recognitionEngine: IRecognitionEngine by lazy {
        RecognitionProvider.provideRecognition()
    }
    override fun onCreate() {
        super.onCreate()
        recognitionEngine.init()
    }
    override fun onStartListening(intent: Intent?, callback: Callback?) {
        val params: String = "" // TODO: parse recognition parameters from the intent extras
        recognitionEngine.startASR(params, callback)
    }
    override fun onStopListening(callback: Callback?) {
        recognitionEngine.stopASR(callback)
    }
    override fun onCancel(callback: Callback?) {
        recognitionEngine.release(callback)
    }
}
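The parameter-parsing step left as a TODO in onStartListening() above flattens the Intent extras into the JSON string handed to the engine. Below is a minimal sketch of that flattening; the key names (`language`, `maxResults`) and the JSON layout are my own assumptions, and a plain `Map` stands in for the Android Intent extras so the logic runs anywhere:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class EngineParams {
    // Flatten recognition parameters into a JSON string for the engine.
    // A plain Map stands in for Intent extras; key names are hypothetical.
    static String toEngineParams(Map<String, Object> extras) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> e : extras.entrySet()) {
            if (!first) sb.append(',');
            first = false;
            sb.append('"').append(e.getKey()).append("\":");
            Object v = e.getValue();
            sb.append(v instanceof Number ? v : "\"" + v + "\"");
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        Map<String, Object> extras = new LinkedHashMap<>();
        extras.put("language", "en-US"); // e.g. from EXTRA_LANGUAGE
        extras.put("maxResults", 3);     // e.g. from EXTRA_MAX_RESULTS
        System.out.println(toEngineParams(extras));
        // {"language":"en-US","maxResults":3}
    }
}
```

A real service would of course use a proper JSON library and the actual RecognizerIntent extras; the point is only that the engine receives one flat, serializable parameter blob.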

And of course, don't forget to declare it in the manifest:

<service
    android:name=".recognition.service.CommonRecognitionService"
    android:exported="true">
    <intent-filter>
        <action android:name="android.speech.RecognitionService"/>
    </intent-filter>
</service>

How to request recognition?

First, declare the RECORD_AUDIO permission. Since it is a runtime permission, you also need the code to request it at runtime.

<manifest ... >
    <uses-permission android:name="android.permission.RECORD_AUDIO"/>
</manifest>

Also, on Android 11 and above, add a queries declaration for the recognition service action so its package is visible:

<manifest ... >
    ...
    <queries>
        <intent>
            <action
                android:name="android.speech.RecognitionService" />
        </intent>
    </queries>
</manifest>

With permissions in place, it is best to first check whether the system has any recognition service available; if not, simply bail out.

class RecognitionHelper(val context: Context) {
    fun prepareRecognition(): Boolean {
        if (!SpeechRecognizer.isRecognitionAvailable(context)) {
            Log.e("RecognitionHelper", "System has no recognition service yet.")
            return false
        }
        ...
    }
}

If one is available, create the recognizer entry instance via SpeechRecognizer's static factory method; this method must be called on the main thread.

class RecognitionHelper(val context: Context) : RecognitionListener{
    private lateinit var recognizer: SpeechRecognizer
    fun prepareRecognition(): Boolean {
        ...
        recognizer = SpeechRecognizer.createSpeechRecognizer(context)
        ...
    }
}

If the system ships more than one service and you know its package name, you can specify which implementation to use:

public static SpeechRecognizer createSpeechRecognizer (Context context,
                ComponentName serviceComponent)

Next, set the recognition listener, whose callbacks correspond to the various states during recognition, for example:

  • onPartialResults() delivers intermediate results; read the recognized strings from the Bundle via getStringArrayList(String) with the SpeechRecognizer#RESULTS_RECOGNITION key
  • onResults() delivers the final results, parsed the same way
  • onBeginningOfSpeech(): the start of speech was detected
  • onEndOfSpeech(): the end of speech was detected
  • onError() reports errors matching the SpeechRecognizer#ERROR_XXX constants; for example, without microphone permission you get ERROR_INSUFFICIENT_PERMISSIONS
  • and so on
class RecognitionHelper(val context: Context) : RecognitionListener{
    ...
    fun prepareRecognition(): Boolean {
        ...
        recognizer.setRecognitionListener(this)
        return true
    }
    override fun onReadyForSpeech(p0: Bundle?) {
        TODO("Not yet implemented")
    }
    override fun onBeginningOfSpeech() {
        TODO("Not yet implemented")
    }
    override fun onRmsChanged(p0: Float) {
        TODO("Not yet implemented")
    }
    override fun onBufferReceived(p0: ByteArray?) {
        TODO("Not yet implemented")
    }
    override fun onEndOfSpeech() {
        TODO("Not yet implemented")
    }
    override fun onError(p0: Int) {
        TODO("Not yet implemented")
    }
    override fun onResults(p0: Bundle?) {
        TODO("Not yet implemented")
    }
    override fun onPartialResults(p0: Bundle?) {
        TODO("Not yet implemented")
    }
    override fun onEvent(p0: Int, p1: Bundle?) {
        TODO("Not yet implemented")
    }
}

Then build the Intent carrying the necessary recognition info and start listening. The extras include:

  • EXTRA_LANGUAGE_MODEL: required; the preferred language model, e.g. the free-form LANGUAGE_MODEL_FREE_FORM set in the code below, or the web-search-oriented LANGUAGE_MODEL_WEB_SEARCH, etc.
  • EXTRA_PARTIAL_RESULTS: optional; whether the service should deliver partial results along the way, false by default
  • EXTRA_MAX_RESULTS: optional; the maximum number of results the service may return, an int
  • EXTRA_LANGUAGE: optional; the recognition language, defaulting to the region's Locale.getDefault() (the recognition service I used is provided by Google Assistant and does not support Chinese yet, so Locale is set to ENGLISH here)

Also note two things: 1. startListening() must be called after the listener is set; 2. it must be called on the main thread.

class RecognitionHelper(val context: Context) : RecognitionListener{
    ...
    fun startRecognition() {
        val intent = createRecognitionIntent()
        recognizer.startListening(intent)
    }
    ...
}
fun createRecognitionIntent() = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH).apply {
    putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM)
    putExtra(RecognizerIntent.EXTRA_PARTIAL_RESULTS, true)
    putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 3)
    putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.ENGLISH)
}

Next, let's add a layout that uses the RecognitionHelper above to initialize and start recognition, and displays the results.


We also add an interface for the partial and final results that interact with the UI, carrying RecognitionListener's data back.

interface ASRResultListener {
    fun onPartialResult(result: String)
    fun onFinalResult(result: String)
}
class RecognitionHelper(private val context: Context) : RecognitionListener {
    ...
    private lateinit var mResultListener: ASRResultListener
    fun prepareRecognition(resultListener: ASRResultListener): Boolean {
        ...
        mResultListener = resultListener
        ...
    }
    ...
    override fun onPartialResults(bundle: Bundle?) {
        bundle?.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)?.let {
            Log.d(
                "RecognitionHelper", "onPartialResults() with:$bundle" +
                        " results:$it"
            )
            mResultListener.onPartialResult(it[0])
        }
    }
    override fun onResults(bundle: Bundle?) {
        bundle?.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)?.let {
            Log.d(
                "RecognitionHelper", "onResults() with:$bundle" +
                        " results:$it"
            )
            mResultListener.onFinalResult(it[0])
        }
    }
}

Next, the Activity implements this interface and shows the data in a TextView. To make the progression of partial results visible to the naked eye, each TextView update is staggered 300 ms after the previous one.

class RecognitionActivity : AppCompatActivity(), ASRResultListener {
    private lateinit var binding: RecognitionLayoutBinding
    private val recognitionHelper: RecognitionHelper by lazy {
        RecognitionHelper(this)
    }
    private var updatingTextTimeDelayed = 0L
    private val mainHandler = Handler(Looper.getMainLooper())
    override fun onCreate(savedInstanceState: Bundle?) {
        ...
        if (!recognitionHelper.prepareRecognition(this)) {
            Toast.makeText(this, "Recognition not available", Toast.LENGTH_SHORT).show()
            return
        }
        binding.start.setOnClickListener {
            Log.d("RecognitionHelper", "startRecognition()")
            recognitionHelper.startRecognition()
        }
        binding.stop.setOnClickListener {
            Log.d("RecognitionHelper", "stopRecognition()")
            recognitionHelper.stopRecognition()
        }
    }
    override fun onStop() {
        super.onStop()
        Log.d("RecognitionHelper", "onStop()")
        recognitionHelper.releaseRecognition()
    }
    override fun onPartialResult(result: String) {
        Log.d("RecognitionHelper", "onPartialResult() with result:$result")
        updatingTextTimeDelayed += 300L
        mainHandler.postDelayed(
            {
                Log.d("RecognitionHelper", "onPartialResult() updating")
                binding.recoAsr.text = result
            }, updatingTextTimeDelayed
        )
    }
    override fun onFinalResult(result: String) {
        Log.d("RecognitionHelper", "onFinalResult() with result:$result")
        updatingTextTimeDelayed += 300L
        mainHandler.postDelayed(
            {
                Log.d("RecognitionHelper", "onFinalResult() updating")
                binding.recoAsr.text = result
            }, updatingTextTimeDelayed
        )
    }
}

Click the "START RECOGNITION" button and a mic recording indicator appears in the top-right corner of the phone. After saying "Can you introduce yourself", the text appears in the TextView bit by bit, like a typewriter.

Below are the logs from a run, which also reflect the recognition flow:

// Initialization
08-15 22:43:13.963  6879  6879 D RecognitionHelper: onCreate()
08-15 22:43:14.037  6879  6879 E RecognitionHelper: audio recording permission granted
08-15 22:43:14.050  6879  6879 D RecognitionHelper: onStart()
// Start recognition
08-15 22:43:41.491  6879  6879 D RecognitionHelper: startRecognition()
08-15 22:43:41.577  6879  6879 D RecognitionHelper: onReadyForSpeech()
08-15 22:43:41.776  6879  6879 D RecognitionHelper: onRmsChanged() with:-2.0
...
08-15 22:43:46.532  6879  6879 D RecognitionHelper: onRmsChanged() with:-0.31999993
// Start of speech detected
08-15 22:43:46.540  6879  6879 D RecognitionHelper: onBeginningOfSpeech()
// Partial result 1: Can
08-15 22:43:46.541  6879  6879 D RecognitionHelper: onPartialResults() with:Bundle[{results_recognition=[Can], android.speech.extra.UNSTABLE_TEXT=[]}] results:[Can]
08-15 22:43:46.541  6879  6879 D RecognitionHelper: onPartialResult() with result:Can
// Partial result 2: Can you
08-15 22:43:46.542  6879  6879 D RecognitionHelper: onPartialResults() with:Bundle[{results_recognition=[Can you], android.speech.extra.UNSTABLE_TEXT=[]}] results:[Can you]
08-15 22:43:46.542  6879  6879 D RecognitionHelper: onPartialResult() with result:Can you
// Partial result 3: Can you in
08-15 22:43:46.542  6879  6879 D RecognitionHelper: onPartialResults() with:Bundle[{results_recognition=[Can you in], android.speech.extra.UNSTABLE_TEXT=[]}] results:[Can you in]
08-15 22:43:46.542  6879  6879 D RecognitionHelper: onPartialResult() with result:Can you in
// Partial result 4: Can you intro
08-15 22:43:46.542  6879  6879 D RecognitionHelper: onPartialResults() with:Bundle[{results_recognition=[Can you intro], android.speech.extra.UNSTABLE_TEXT=[]}] results:[Can you intro]
08-15 22:43:46.542  6879  6879 D RecognitionHelper: onPartialResult() with result:Can you intro
// Partial result n: Can you introduce yourself
08-15 22:43:46.542  6879  6879 D RecognitionHelper: onPartialResults() with:Bundle[{results_recognition=[Can you introduce yourself], android.speech.extra.UNSTABLE_TEXT=[]}] results:[Can you introduce yourself]
08-15 22:43:46.542  6879  6879 D RecognitionHelper: onPartialResult() with result:Can you introduce yourself
// End of speech detected
08-15 22:43:46.543  6879  6879 D RecognitionHelper: onEndOfSpeech()
08-15 22:43:46.543  6879  6879 D RecognitionHelper: onEndOfSpeech()
08-15 22:43:46.545  6879  6879 D RecognitionHelper: onResults() with:Bundle[{results_recognition=[Can you introduce yourself], confidence_scores=[0.0]}] results:[Can you introduce yourself]
// Final result: Can you introduce yourself
08-15 22:43:46.545  6879  6879 D RecognitionHelper: onFinalResult() with result:Can you introduce yourself

How does the system dispatch it?

Unlike text-to-speech, SpeechRecognizer has no standalone entry in Settings; its default app is configured in tandem with VoiceInteraction.

But the following command reads out the system's default recognition service:

adb shell settings get secure voice_recognition_service

On an emulator, it shows Google's recognition service as the default:

com.google.android.tts/com.google.android.apps.speech.tts.googletts.service.GoogleTTSRecognitionService

On a Samsung device, it is the recognition service provided by Samsung:

com.samsung.android.bixby.agent/.mainui.voiceinteraction.RecognitionServiceTrampoline

Let's explore how recognition is implemented, starting from the APIs mentioned in the request section.

Checking for a recognition service

The availability check is straightforward: query PackageManager with the dedicated recognition action ("android.speech.RecognitionService"); if at least one resolvable service exists, the system is considered to have recognition available.

    public static boolean isRecognitionAvailable(final Context context) {
        final List<ResolveInfo> list = context.getPackageManager().queryIntentServices(
                new Intent(RecognitionService.SERVICE_INTERFACE), 0);
        return list != null && list.size() != 0;
    }

Initializing the recognizer

As described in the "How to request recognition?" section, the static createSpeechRecognizer() performs the initialization: internally it checks that the Context is non-null and, depending on whether a recognition service was specified, records the target service component name.

    public static SpeechRecognizer createSpeechRecognizer(final Context context) {
        return createSpeechRecognizer(context, null);
    }
    public static SpeechRecognizer createSpeechRecognizer(final Context context,
            final ComponentName serviceComponent) {
        if (context == null) {
            throw new IllegalArgumentException("Context cannot be null");
        }
        checkIsCalledFromMainThread();
        return new SpeechRecognizer(context, serviceComponent);
    }
    private SpeechRecognizer(final Context context, final ComponentName serviceComponent) {
        mContext = context;
        mServiceComponent = serviceComponent;
        mOnDevice = false;
    }

Calling setRecognitionListener() on the resulting SpeechRecognizer is slightly more involved:

  1. Check that the call originates from the main thread
  2. Create the dedicated MSG_CHANGE_LISTENER Message
  3. If the connection to SpeechRecognitionManagerService (the system service that handles recognition requests) is not yet established, queue the Message into a pending queue; it is delivered to the Handler once the connection is created when recognition actually starts
  4. Otherwise, post it to the Handler directly
    public void setRecognitionListener(RecognitionListener listener) {
        checkIsCalledFromMainThread();
        putMessage(Message.obtain(mHandler, MSG_CHANGE_LISTENER, listener));
    }
    private void putMessage(Message msg) {
        if (mService == null) {
            mPendingTasks.offer(msg);
        } else {
            mHandler.sendMessage(msg);
        }
    }
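The queue-or-send behavior of putMessage() can be modeled outside Android in a few lines of plain Java; the class and member names below are illustrative, not the framework's:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class PendingQueueDemo {
    // Models SpeechRecognizer#putMessage(): while the system service is not
    // yet connected, messages wait in a pending queue; once the connection
    // callback fires, the queue is flushed to the handler in FIFO order.
    private final Queue<String> pendingTasks = new ArrayDeque<>();
    final List<String> handled = new ArrayList<>();
    private boolean connected = false;

    void putMessage(String msg) {
        if (!connected) {
            pendingTasks.offer(msg); // mService == null: park the message
        } else {
            handled.add(msg);        // connected: deliver to the handler
        }
    }

    // Stands in for IRecognitionServiceManagerCallback#onSuccess
    void onServiceConnected() {
        connected = true;
        while (!pendingTasks.isEmpty()) {
            handled.add(pendingTasks.poll());
        }
    }

    public static void main(String[] args) {
        PendingQueueDemo demo = new PendingQueueDemo();
        demo.putMessage("MSG_CHANGE_LISTENER"); // queued: not connected yet
        demo.putMessage("MSG_START");           // queued
        demo.onServiceConnected();              // flushed in order
        System.out.println(demo.handled);       // [MSG_CHANGE_LISTENER, MSG_START]
    }
}
```

This is why setRecognitionListener() can be called before any connection exists: the listener change simply waits in the queue until startListening() triggers the connection.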

The Handler then updates the listener instance via handleChangeListener().

    private Handler mHandler = new Handler(Looper.getMainLooper()) {
        @Override
        public void handleMessage(Message msg) {
            switch (msg.what) {
                ...
                case MSG_CHANGE_LISTENER:
                    handleChangeListener((RecognitionListener) msg.obj);
                    break;
                ...
            }
        }
    };
    private void handleChangeListener(RecognitionListener listener) {
        if (DBG) Log.d(TAG, "handleChangeListener, listener=" + listener);
        mListener.mInternalListener = listener;
    }

Starting recognition

startListening() first ensures the request Intent is non-null, otherwise it throws an IllegalArgumentException with "intent must not be null"; it then checks that the caller is on the main thread, otherwise it throws "SpeechRecognizer should be used only from the application's main thread".

Then it makes sure the service connection is ready; if not, connectToSystemService() establishes the connection to the recognition service.

    public void startListening(final Intent recognizerIntent) {
        if (recognizerIntent == null) {
            throw new IllegalArgumentException("intent must not be null");
        }
        checkIsCalledFromMainThread();
        if (mService == null) {
            // First time connection: first establish a connection, then dispatch #startListening.
            connectToSystemService();
        }
        putMessage(Message.obtain(mHandler, MSG_START, recognizerIntent));
    }

The first step of connectToSystemService() is getSpeechRecognizerComponentName(), which resolves the recognition service's component name: either the one specified by the requesting app, or the current recognition service package stored under VOICE_RECOGNITION_SERVICE in SettingsProvider (kept in line with the VoiceInteraction app). If no component can be resolved, it bails out.

If it can, a create-session request is sent via IRecognitionServiceManager.aidl to SpeechRecognitionManagerService, the system service in SystemServer that manages speech recognition.

    /** Establishes a connection to system server proxy and initializes the session. */
    private void connectToSystemService() {
        if (!maybeInitializeManagerService()) {
            return;
        }
        ComponentName componentName = getSpeechRecognizerComponentName();
        if (!mOnDevice && componentName == null) {
            mListener.onError(ERROR_CLIENT);
            return;
        }
        try {
            mManagerService.createSession(
                    componentName,
                    mClientToken,
                    mOnDevice,
                    new IRecognitionServiceManagerCallback.Stub(){
                        @Override
                        public void onSuccess(IRecognitionService service) throws RemoteException {
                            mService = service;
                            while (!mPendingTasks.isEmpty()) {
                                mHandler.sendMessage(mPendingTasks.poll());
                            }
                        }
                        @Override
                        public void onError(int errorCode) throws RemoteException {
                            mListener.onError(errorCode);
                        }
                    });
        } catch (RemoteException e) {
            e.rethrowFromSystemServer();
        }
    }

SpeechRecognitionManagerService delegates the handling to SpeechRecognitionManagerServiceImpl.

// SpeechRecognitionManagerService.java
    final class SpeechRecognitionManagerServiceStub extends IRecognitionServiceManager.Stub {
        @Override
        public void createSession(
                ComponentName componentName,
                IBinder clientToken,
                boolean onDevice,
                IRecognitionServiceManagerCallback callback) {
            int userId = UserHandle.getCallingUserId();
            synchronized (mLock) {
                SpeechRecognitionManagerServiceImpl service = getServiceForUserLocked(userId);
                service.createSessionLocked(componentName, clientToken, onDevice, callback);
            }
        }
        ...
    }

SpeechRecognitionManagerServiceImpl in turn lets the RemoteSpeechRecognitionService class bind to the app's recognition service; as the code shows, RemoteSpeechRecognitionService is responsible for communicating with it.

// SpeechRecognitionManagerServiceImpl.java
    void createSessionLocked( ... ) {
        ...
        RemoteSpeechRecognitionService service = createService(creatorCallingUid, serviceComponent);
        ...
        service.connect().thenAccept(binderService -> {
            if (binderService != null) {
                try {
                    callback.onSuccess(new IRecognitionService.Stub() {
                        @Override
                        public void startListening( ... )
                                        throws RemoteException {
                            ...
                            service.startListening(recognizerIntent, listener, attributionSource);
                        }
                        ...
                    });
                } catch (RemoteException e) {
                    tryRespondWithError(callback, SpeechRecognizer.ERROR_CLIENT);
                }
            } else {
                tryRespondWithError(callback, SpeechRecognizer.ERROR_CLIENT);
            }
        });
    }

Once the connection to the recognition service app is established (or already exists), the MSG_START Message is sent and the main Handler continues with handleStartListening(). It first re-checks that mService exists, to guard against an NPE.

It then sends the start-listening request to the AIDL proxy object.

    private Handler mHandler = new Handler(Looper.getMainLooper()) {
        @Override
        public void handleMessage(Message msg) {
            switch (msg.what) {
                case MSG_START:
                    handleStartListening((Intent) msg.obj);
                    break;
                ...
            }
        }
    };
    private void handleStartListening(Intent recognizerIntent) {
        if (!checkOpenConnection()) {
            return;
        }
        try {
            mService.startListening(recognizerIntent, mListener, mContext.getAttributionSource());
        }
        ...
    }

The AIDL interface is defined in the following file:

// android/speech/IRecognitionService.aidl
oneway interface IRecognitionService {
    void startListening(in Intent recognizerIntent, in IRecognitionListener listener,
            in AttributionSource attributionSource);
    void stopListening(in IRecognitionListener listener);
    void cancel(in IRecognitionListener listener, boolean isShutdown);
    ...
}

Its implementation lives in the system's recognition management class SpeechRecognitionManagerServiceImpl:

// com/android/server/speech/SpeechRecognitionManagerServiceImpl.java
    void createSessionLocked( ... ) {
        ...
        service.connect().thenAccept(binderService -> {
            if (binderService != null) {
                try {
                    callback.onSuccess(new IRecognitionService.Stub() {
                        @Override
                        public void startListening( ...) {
                            attributionSource.enforceCallingUid();
                            if (!attributionSource.isTrusted(mMaster.getContext())) {
                                attributionSource = mMaster.getContext()
                                        .getSystemService(PermissionManager.class)
                                        .registerAttributionSource(attributionSource);
                            }
                            service.startListening(recognizerIntent, listener, attributionSource);
                        }
                        ...
                    });
                } ...
            } else {
                tryRespondWithError(callback, SpeechRecognizer.ERROR_CLIENT);
            }
        });
    }

After that there is one more hop through RemoteSpeechRecognitionService:

// com/android/server/speech/RemoteSpeechRecognitionService.java
void startListening(Intent recognizerIntent, IRecognitionListener listener,
            @NonNull AttributionSource attributionSource) {
        ...
        synchronized (mLock) {
            if (mSessionInProgress) {
                tryRespondWithError(listener, SpeechRecognizer.ERROR_RECOGNIZER_BUSY);
                return;
            }
            mSessionInProgress = true;
            mRecordingInProgress = true;
            mListener = listener;
            mDelegatingListener = new DelegatingListener(listener, () -> {
                synchronized (mLock) {
                    resetStateLocked();
                }
            });
            final DelegatingListener listenerToStart = this.mDelegatingListener;
            run(service ->
                    service.startListening(
                            recognizerIntent,
                            listenerToStart,
                            attributionSource));
        }
    }
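The busy guard above is essentially a tiny state machine: a second start while a session is in progress is rejected with ERROR_RECOGNIZER_BUSY, and cancel resets the state. A plain-Java model of just that guard (the error value matches SpeechRecognizer.ERROR_RECOGNIZER_BUSY; everything else is illustrative):

```java
public class SessionGuard {
    static final int OK = 0;
    static final int ERROR_RECOGNIZER_BUSY = 8; // SpeechRecognizer.ERROR_RECOGNIZER_BUSY

    private boolean sessionInProgress = false;

    // Models RemoteSpeechRecognitionService#startListening(): reject a second
    // start while a session is running.
    int startListening() {
        if (sessionInProgress) {
            return ERROR_RECOGNIZER_BUSY;
        }
        sessionInProgress = true;
        return OK;
    }

    // Models cancel(): reset the session state so a new start may proceed.
    void cancel() {
        sessionInProgress = false;
    }

    public static void main(String[] args) {
        SessionGuard guard = new SessionGuard();
        System.out.println(guard.startListening()); // 0: first start succeeds
        System.out.println(guard.startListening()); // 8: busy, rejected
        guard.cancel();
        System.out.println(guard.startListening()); // 0: start succeeds again
    }
}
```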

The call finally lands in the concrete service implementation, naturally inside RecognitionService: its Binder stub posts a MSG_START_LISTENING Message to the main thread:

/** Binder of the recognition service */
    private static final class RecognitionServiceBinder extends IRecognitionService.Stub {
        ...
        @Override
        public void startListening(Intent recognizerIntent, IRecognitionListener listener,
                @NonNull AttributionSource attributionSource) {
            final RecognitionService service = mServiceRef.get();
            if (service != null) {
                service.mHandler.sendMessage(Message.obtain(service.mHandler,
                        MSG_START_LISTENING, service.new StartListeningArgs(
                                recognizerIntent, listener, attributionSource)));
            }
        }
        ...
    }
    private final Handler mHandler = new Handler() {
        @Override
        public void handleMessage(Message msg) {
            switch (msg.what) {
                case MSG_START_LISTENING:
                    StartListeningArgs args = (StartListeningArgs) msg.obj;
                    dispatchStartListening(args.mIntent, args.mListener, args.mAttributionSource);
                    break;
                ...
            }
        }
    };

The Handler likewise hands the event over to dispatchStartListening(). Its most important job is checking whether the request Intent supplies an audio source via EXTRA_AUDIO_SOURCE, or whether the requesting app holds the RECORD_AUDIO permission.

private void dispatchStartListening(Intent intent, final IRecognitionListener listener,
            @NonNull AttributionSource attributionSource) {
        try {
            if (mCurrentCallback == null) {
                boolean preflightPermissionCheckPassed =
                        intent.hasExtra(RecognizerIntent.EXTRA_AUDIO_SOURCE)
                        || checkPermissionForPreflightNotHardDenied(attributionSource);
                if (preflightPermissionCheckPassed) {
                    mCurrentCallback = new Callback(listener, attributionSource);
                    RecognitionService.this.onStartListening(intent, mCurrentCallback);
                }
                if (!preflightPermissionCheckPassed || !checkPermissionAndStartDataDelivery()) {
                    listener.onError(SpeechRecognizer.ERROR_INSUFFICIENT_PERMISSIONS);
                    if (preflightPermissionCheckPassed) {
                        // If we attempted to start listening, cancel the callback
                        RecognitionService.this.onCancel(mCurrentCallback);
                        dispatchClearCallback();
                    }
                }
                ...
            }
        } catch (RemoteException e) {
            Log.d(TAG, "onError call from startListening failed");
        }
    }
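The preflight decision boils down to a simple predicate: proceed if the intent carries EXTRA_AUDIO_SOURCE or the caller holds RECORD_AUDIO, otherwise report ERROR_INSUFFICIENT_PERMISSIONS. A plain-Java sketch of just that decision (names are illustrative; the error value matches the framework constant):

```java
public class PreflightCheck {
    static final int OK = 0;
    static final int ERROR_INSUFFICIENT_PERMISSIONS = 9; // SpeechRecognizer constant

    // Models the gate in dispatchStartListening(): an explicit audio source in
    // the intent OR the RECORD_AUDIO permission lets recognition proceed;
    // otherwise the listener receives ERROR_INSUFFICIENT_PERMISSIONS.
    static int preflight(boolean hasAudioSourceExtra, boolean hasRecordAudio) {
        return (hasAudioSourceExtra || hasRecordAudio)
                ? OK
                : ERROR_INSUFFICIENT_PERMISSIONS;
    }

    public static void main(String[] args) {
        System.out.println(preflight(false, true));  // 0: permission granted
        System.out.println(preflight(true, false));  // 0: explicit audio source
        System.out.println(preflight(false, false)); // 9: rejected
    }
}
```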

If either condition holds, the service implementation's onStartListening() is invoked to start recognition. The concrete logic is up to each service, which ultimately reports recognition states and results through the Callback, corresponding to the RecognitionListener callbacks in the "How to request recognition?" section.

protected abstract void onStartListening(Intent recognizerIntent, Callback listener);

Stopping & canceling recognition

The subsequent paths for stopping (stopListening()) and canceling (cancel()) largely mirror the start path, eventually reaching RecognitionService's onStopListening() and onCancel() callbacks.

The only difference: stop merely pauses recognition and keeps the connection to the recognition app alive, while cancel disconnects and resets the related state:

void cancel(IRecognitionListener listener, boolean isShutdown) {
        ...
        synchronized (mLock) {
            ...
            mRecordingInProgress = false;
            mSessionInProgress = false;
            mDelegatingListener = null;
            mListener = null;
            // Schedule to unbind after cancel is delivered.
            if (isShutdown) {
                run(service -> unbind());
            }
        }
    }

Conclusion


Finally, let's recap the SpeechRecognizer pipeline end to end:

  1. The app needing recognition sends a request through SpeechRecognizer
  2. When recognition starts, SpeechRecognizer notifies the SpeechRecognitionManagerService system service in SystemServer via IRecognitionServiceManager.aidl, which fetches the default recognition service package from SettingsProvider
  3. SpeechRecognitionManagerService does not bind to it directly; it delegates to SpeechRecognitionManagerServiceImpl
  4. SpeechRecognitionManagerServiceImpl in turn hands binding and management over to RemoteSpeechRecognitionService
  5. RemoteSpeechRecognitionService talks to the concrete RecognitionService implementation via IRecognitionService.aidl
  6. RecognitionService switches to the main thread via a Handler, calls the recognition engine to process the request, and returns recognition states and results through its Callback inner class
  7. RecognitionService then passes the results back to SystemServer via IRecognitionListener.aidl, and onward to the requesting app

Further reading

  • How to build in-car voice interaction: Google Voice Interaction has the answer
  • Straight to the principles: fully understand Android's TextToSpeech mechanism in 5 diagrams

References

  • SpeechRecognizer
  • RecognitionService
  • System sample project