Speech Recognition Using Google Speech API and Python: 4 Steps

Speech Recognition

Speech recognition is a part of Natural Language Processing, which is a subfield of Artificial Intelligence. Put simply, speech recognition is the ability of computer software to identify words and phrases in spoken language and convert them to human-readable text. It is used in many applications, such as voice assistants, home automation, voice-based chatbots, and voice-interacting robots.

There are several APIs (Application Programming Interfaces) for recognizing speech. Some are free and some are paid. These are:

  • CMU Sphinx
  • Google Speech Recognition
  • Google Cloud Speech API
  • Wit.ai
  • Microsoft Bing Voice Recognition
  • Houndify API
  • IBM Speech To Text
  • Snowboy Hotword Detection

We will use Google Speech Recognition here, because it can be used through the SpeechRecognition library without supplying your own API key. This tutorial aims to provide an introduction to using the Google Speech Recognition library in Python with the help of an external microphone such as the ReSpeaker USB 4-Mic Array from Seeed Studio. An external microphone is not mandatory, however; the built-in microphone of a laptop works too.

Step 1: ReSpeaker USB 4-Mic Array

The ReSpeaker USB Mic is a quad-microphone device designed for AI and voice applications, developed by Seeed Studio. It has 4 high-performance, built-in omnidirectional microphones designed to pick up your voice from anywhere in the room, plus 12 programmable RGB LED indicators. The ReSpeaker USB Mic supports the Linux, macOS, and Windows operating systems. Details can be found here.

The ReSpeaker USB Mic comes in a neat package containing the following items:

  • A user guide
  • ReSpeaker USB Mic Array
  • Micro USB to USB cable

So we are ready to get started.

Step 2: Install Required Libraries

For this tutorial, I assume you are using Python 3.x.

Let's install the libraries:

pip3 install SpeechRecognition

On macOS, first install PortAudio with Homebrew, and then install PyAudio with pip3:

brew install portaudio

Then run the command below to install PyAudio:

pip3 install pyaudio

On Linux, you can install PyAudio with apt:

sudo apt-get install python-pyaudio python3-pyaudio

On Windows, you can install PyAudio with pip:

pip install pyaudio
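
Before moving on, it can help to confirm that both packages are importable. The sketch below is one way to do that using only the standard library; the helper names `check_packages` and `report` are made up for this illustration and are not part of either library. It assumes the import names `speech_recognition` and `pyaudio`, which are the module names the two packages install.

```python
import importlib.util


def check_packages(module_names):
    """Return a dict mapping each top-level module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in module_names}


def report(status):
    """Format the result of check_packages() as one line per module."""
    lines = []
    for name, ok in status.items():
        lines.append(f"{name}: {'installed' if ok else 'MISSING'}")
    return "\n".join(lines)


# Module names (not pip package names) used in this tutorial.
print(report(check_packages(["speech_recognition", "pyaudio"])))
```

If either line reports MISSING, rerun the matching install command for your operating system.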

Create a new Python file:

nano get_index.py

Paste the code snippet below into get_index.py:

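import pyaudio

p = pyaudio.PyAudio()

# Query the first host API for the number of audio devices it exposes
info = p.get_host_api_info_by_index(0)
numdevices = info.get('deviceCount')

# Print every device that has at least one input channel (i.e. a microphone)
for i in range(0, numdevices):
    device = p.get_device_info_by_host_api_device_index(0, i)
    if device.get('maxInputChannels') > 0:
        print(f"Input Device id {i} - {device.get('name')}")

# Release PortAudio resources
p.terminate()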

Run the following command:

python3 get_index.py

In my case, the command prints the following output to the screen:

Input Device id 1 - ReSpeaker 4 Mic Array (UAC1.0)

Input Device id 2 - MacBook Air Microphone

Change device_index to the index number of your choice in the code snippet below.

import speech_recognition as sr

r = sr.Recognizer()
speech = sr.Microphone(device_index=1)

with speech as source:
    print("say something!…")
    # Calibrate the energy threshold against background noise, then record
    r.adjust_for_ambient_noise(source)
    audio = r.listen(source)

try:
    recog = r.recognize_google(audio, language='en-US')
    print("You said: " + recog)
except sr.UnknownValueError:
    print("Google Speech Recognition could not understand audio")
except sr.RequestError as e:
    print("Could not request results from Google Speech Recognition service; {0}".format(e))

The device index is set to 1 because the ReSpeaker 4 Mic Array serves as the main audio source.
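
Device indices can change when hardware is plugged in or removed, so hardcoding 1 may not always select the ReSpeaker. As a hedged alternative, the sketch below looks the index up by name. `find_device_index` and `open_microphone` are hypothetical helpers written for this illustration; `sr.Microphone.list_microphone_names()` is the library call they rely on. The sample list mirrors the output shown earlier, with "Built-in Output" as an assumed placeholder for index 0.

```python
def find_device_index(device_names, keyword):
    """Return the index of the first device whose name contains keyword, or None."""
    keyword = keyword.lower()
    for index, name in enumerate(device_names):
        if name is not None and keyword in name.lower():
            return index
    return None


def open_microphone(keyword):
    """Open the microphone matching keyword, or the default one if none matches."""
    import speech_recognition as sr  # imported lazily; needs SpeechRecognition and PyAudio

    names = sr.Microphone.list_microphone_names()
    index = find_device_index(names, keyword)
    # device_index=None makes the library use the system default microphone
    return sr.Microphone(device_index=index)


# Illustration with a made-up device list (index 0 is an assumed placeholder)
sample_names = [
    "Built-in Output",
    "ReSpeaker 4 Mic Array (UAC1.0)",
    "MacBook Air Microphone",
]
print(find_device_index(sample_names, "respeaker"))
```

With this in place, `speech = open_microphone("respeaker")` could stand in for the hardcoded `sr.Microphone(device_index=1)`.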

Step 3: Text-to-speech in Python With the pyttsx3 Library

There are several APIs available for converting text to speech in Python. One of them is pyttsx3, which in my opinion is the best text-to-speech package available. It works on Windows, Mac, and Linux. Check the official documentation to see how this is done.

Install the package

Use pip to install the package:

pip install pyttsx3

If you are on Windows, you may also need an additional package, pypiwin32, which provides access to the native Windows speech API.

pip install pypiwin32

Convert text to speech with a Python script

Below is the code snippet for text to speech using pyttsx3:

import pyttsx3

engine = pyttsx3.init()

engine.setProperty('rate', 150)  # Speaking rate in words per minute

engine.setProperty('volume', 0.9)  # Volume 0-1

engine.say("Hello, world!")

engine.runAndWait()
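
Besides rate and volume, pyttsx3 also exposes the installed voices through `engine.getProperty('voices')`, and a voice can be selected with `engine.setProperty('voice', voice.id)`. The sketch below shows one possible way to pick a voice by name. `pick_voice_id` and `speak_with_voice` are hypothetical helpers, and the voice names "Alex" and "Samantha" are placeholders; the voices actually available depend on your operating system.

```python
from types import SimpleNamespace


def pick_voice_id(voices, keyword):
    """Return the id of the first voice whose name contains keyword, or None."""
    keyword = keyword.lower()
    for voice in voices:
        if keyword in (voice.name or "").lower():
            return voice.id
    return None


def speak_with_voice(text, keyword, rate=150, volume=0.9):
    """Speak text with the first installed voice matching keyword (default voice otherwise)."""
    import pyttsx3  # imported lazily so pick_voice_id() works without it

    engine = pyttsx3.init()
    engine.setProperty('rate', rate)      # words per minute
    engine.setProperty('volume', volume)  # 0.0 to 1.0
    voice_id = pick_voice_id(engine.getProperty('voices'), keyword)
    if voice_id is not None:
        engine.setProperty('voice', voice_id)
    engine.say(text)
    engine.runAndWait()


# Stand-in voice objects for illustration only (ids and names are assumptions)
sample_voices = [
    SimpleNamespace(id="voice.a", name="Alex"),
    SimpleNamespace(id="voice.b", name="Samantha"),
]
print(pick_voice_id(sample_voices, "sam"))
```

Calling `speak_with_voice("Hello, world!", "sam")` would then use the matching voice if one is installed, and fall back to the default voice otherwise.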

Step 4: Putting It All Together: Building Speech Recognition With Python Using the Google Speech Recognition API and the pyttsx3 Library

The code below recognises human speech using Google Speech Recognition and converts the resulting text back into speech using the pyttsx3 library.

import speech_recognition as sr
import pyttsx3

# Set up the text-to-speech engine
engine = pyttsx3.init()
engine.setProperty('rate', 200)
engine.setProperty('volume', 0.9)

r = sr.Recognizer()
speech = sr.Microphone(device_index=1)

with speech as source:
    # Calibrate against background noise, then record one phrase
    r.adjust_for_ambient_noise(source)
    audio = r.listen(source)

try:
    recog = r.recognize_google(audio, language='en-US')
    print("You said: " + recog)
    engine.say("You said: " + recog)
    engine.runAndWait()
except sr.UnknownValueError:
    engine.say("Google Speech Recognition could not understand audio")
    engine.runAndWait()
except sr.RequestError as e:
    engine.say("Could not request results from Google Speech Recognition service; {0}".format(e))
    engine.runAndWait()

The script prints the recognised text to the terminal and also speaks it aloud.

You said: London is the capital of Great Britain
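
The script above handles a single phrase and then exits. As a rough sketch of how it could be extended, the loop below keeps listening until a stop word is heard. Everything here is an assumption layered on top of the tutorial: `run_echo_loop` and `make_google_listener` are hypothetical helpers, and the stop word "stop" and the `max_turns` limit are arbitrary choices. The listening and speaking steps are passed in as functions so the loop can be tried without a microphone.

```python
def run_echo_loop(listen_once, speak, stop_word="stop", max_turns=10):
    """Repeat what is heard until stop_word is recognised or max_turns is reached.

    listen_once() must return the recognised text, or None if nothing was understood.
    Returns the list of phrases that were recognised.
    """
    heard = []
    for _ in range(max_turns):
        text = listen_once()
        if text is None:
            speak("Sorry, I did not catch that")
            continue
        heard.append(text)
        if text.strip().lower() == stop_word:
            speak("Goodbye")
            break
        speak("You said: " + text)
    return heard


def make_google_listener(device_index=None, language="en-US"):
    """Build a listen_once() function backed by Google Speech Recognition."""
    import speech_recognition as sr  # imported lazily; needs SpeechRecognition and PyAudio

    recognizer = sr.Recognizer()
    microphone = sr.Microphone(device_index=device_index)

    def listen_once():
        with microphone as source:
            recognizer.adjust_for_ambient_noise(source)
            audio = recognizer.listen(source)
        try:
            return recognizer.recognize_google(audio, language=language)
        except sr.UnknownValueError:
            return None  # speech was unintelligible; sr.RequestError is left to propagate

    return listen_once


# Dry run with scripted phrases instead of a microphone, and print() instead of speech
scripted = iter(["hello", None, "stop"])
run_echo_loop(lambda: next(scripted), print)
```

To use real hardware, pass `make_google_listener(device_index=1)` as `listen_once`, and a small function that calls `engine.say(...)` followed by `engine.runAndWait()` as `speak`.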

I hope you now have a better understanding of how speech recognition works in general and, most importantly, how to implement it using the Google Speech Recognition API with Python.

If you have any questions or feedback, leave a comment below. Stay tuned!
