Coding Challenge 188: Voice Chatbot
In this coding challenge, I build a conversational voice chatbot entirely in the browser with p5.js. I combine three pieces: speech-to-text with OpenAI's Whisper model, text-to-speech with Kokoro TTS, and a "brain" for the bot. I also explore the transformers.js pipeline API and the Web Audio API. For the bot's brain, I start with a simple ELIZA-style therapist, then incorporate a RiveScript number-guessing game, and finally a local LLM.
Code: https://thecodingtrain.com/challenges/188-voice-chatbot

Watch this video ad-free on Nebula: https://nebula.tv/videos/codingtrain-coding-challenge-188-voice-chatbot

p5.js Web Editor Sketches:
LLM Chatbot: https://editor.p5js.org/codingtrain/sketches/RHhT9I4Nm
Number Guessing Bot: https://editor.p5js.org/codingtrain/sketches/AJw7zMN9q
Therapy Bot: https://editor.p5js.org/codingtrain/sketches/37LFEPUVV
Model Loading Bars: https://editor.p5js.org/codingtrain/sketches/E9Ob3x8eJ
Waveform of Recording: https://editor.p5js.org/codingtrain/sketches/cck49wDub
Real Time Waveform: https://editor.p5js.org/codingtrain/sketches/aaRIT-x6a

Previous: https://youtu.be/g3-PXyF8U70?list=PLRqwX-V7Uu6ZiZxtDDRCi6uhfTH4FilpH
All: https://www.youtube.com/playlist?list=PLRqwX-V7Uu6ZiZxtDDRCi6uhfTH4FilpH

References:
p5.2 Reference: https://beta.p5js.org
Introducing Whisper: https://cdn.openai.com/papers/whisper.pdf
Model Cards for Model Reporting: https://arxiv.org/abs/1810.03993
Open Neural Network Exchange: https://onnx.ai
Onnx-community Whisper-tiny.en model: https://huggingface.co/onnx-community/whisper-tiny.en
Xenova: https://github.com/xenova
Transformers.js: https://huggingface.co/docs/transformers.js/installation
Announcing the new p5.sound.js library!: https://medium.com/processing-foundation/announcing-the-new-p5-sound-js-library-42efc154bed0
getUserMedia() documentation: https://developer.mozilla.org/en-US/docs/Web/API/MediaDevices/getUserMedia
MediaRecorder() documentation: https://developer.mozilla.org/en-US/docs/Web/API/MediaRecorder
Kokoro Repo: https://github.com/hexgrad/kokoro
KokoroTTS Model: https://huggingface.co/hexgrad/Kokoro-82M
ELIZA: https://en.wikipedia.org/wiki/ELIZA
Rivescript: https://www.rivescript.com
SmolLM3: https://huggingface.co/HuggingFaceTB/SmolLM3-3B
Running models on WebGPU: https://huggingface.co/docs/transformers.js/guides/webgpu
Using quantized models (dtypes): https://huggingface.co/docs/transformers.js/v3.8.1/guides/dtypes

Videos:
https://youtu.be/0Ad5Frf8NBM
https://youtu.be/KR61bXsPlLU

Live Stream Archives:
https://www.youtube.com/watch?v=KRDJAHArqaw

Related Coding Challenges:
https://youtu.be/eGFJ8vugIWA
https://youtu.be/8Z9FRiW2Jlc
https://youtu.be/iFTgphKCP9U

Timestamps:
0:00:00 Hello!
0:00:35 Mapping out the pieces: speech-to-text, text-to-speech, and the brain
0:01:07 Thoughts on AI and creative exploration
0:02:44 Choosing the tools: Whisper and Kokoro TTS
0:04:06 Building a push-to-talk UI in p5.js
0:04:51 Finding models on Hugging Face with Transformers.js
0:05:36 About the Whisper model and model cards
0:06:55 Loading the Whisper pipeline in p5.js
0:09:04 Accessing the microphone with getUserMedia
0:10:44 Capturing audio with MediaRecorder
0:12:05 Processing audio chunks into a waveform
0:15:55 Speech-to-text working!
0:16:36 Building the chatbot brain (ELIZA-style therapist)
0:18:50 Setting up Kokoro TTS for text-to-speech
0:21:07 Playing synthesized audio with AudioBufferSource
0:23:41 Text-to-speech working!
0:25:32 Handling playback events
0:26:56 Swapping in a RiveScript number-guessing brain
0:31:22 Adding a language model (SmolLM2) as the brain
0:38:33 Final demo: the random number chatbot
0:39:03 Goodbye!

Editing by Mathieu Blanchette
Animations by Jason Heglund
Music from Epidemic Sound

Website: https://thecodingtrain.com/
Share Your Creation! https://thecodingtrain.com/guides/passenger-showcase-guide
Suggest Topics: https://github.com/CodingTrain/Suggestion-Box
GitHub: https://github.com/CodingTrain
Discord: https://thecodingtrain.com/discord
Membership: http://youtube.com/thecodingtrain/join
Store: https://standard.tv/codingtrain
Twitter: https://twitter.com/thecodingtrain
Instagram: https://www.instagram.com/the.coding.train/

https://www.youtube.com/playlist?list=PLRqwX-V7Uu6ZiZxtDDRCi6uhfTH4FilpH
https://www.youtube.com/playlist?list=PLRqwX-V7Uu6Zy51Q-x9tMWIv9cueOFTFA

p5.js: https://p5js.org
p5.js Web Editor: https://editor.p5js.org/
Processing: https://processing.org

Code of Conduct: https://github.com/CodingTrain/Code-of-Conduct

This description was auto-generated. If you spot a problem, please open an issue: https://github.com/CodingTrain/thecodingtrain.com/issues/new

#texttospeech #speechtotext #chatbot #rivescript #llms #agents #ai #transformersjs #webaudioapi #javascript #p5js
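
For readers who want a feel for the first "brain" before watching, here is a minimal sketch of an ELIZA-style responder in plain JavaScript. It is not the code from the video: the rule table, the `respond` function name, and the reply wording are all hypothetical, chosen only to illustrate the pattern-matching idea. In the full chatbot, the input string would come from the speech-to-text step and the returned reply would be handed to text-to-speech.

```javascript
// Hypothetical rule table (not taken from the video's sketch):
// each rule pairs a regular expression with a function that builds a reply.
const rules = [
  { pattern: /\bi feel (.+)/i, reply: (m) => `Why do you feel ${m[1]}?` },
  { pattern: /\bi am (.+)/i, reply: (m) => `How long have you been ${m[1]}?` },
  {
    pattern: /\b(mother|father|family)\b/i,
    reply: (m) => `Tell me more about your ${m[1].toLowerCase()}.`,
  },
];

// Used when no rule matches the input.
const fallback = "Please, go on.";

// Return the reply for the first matching rule, or the fallback.
function respond(input) {
  // Trim whitespace and drop trailing sentence punctuation so it
  // does not end up inside the echoed phrase.
  const text = input.trim().replace(/[.!?]+$/, "");
  for (const rule of rules) {
    const match = text.match(rule.pattern);
    if (match) return rule.reply(match);
  }
  return fallback;
}
```

Because the rules are checked in order, more specific patterns should come first; swapping in a different brain (RiveScript or an LLM, as in the video) only requires replacing `respond` with another function that maps a string to a string.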