Coding Challenge 188: Voice Chatbot

Apr 27, 2026 11:20 PM


In this coding challenge, I build a conversational voice chatbot entirely in the browser with p5.js. I combine three pieces: speech-to-text with OpenAI's Whisper model, text-to-speech with Kokoro TTS, and a "brain" for the bot. I also explore the transformers.js pipeline API and the Web Audio API. For the bot's brain, I start with a simple ELIZA-style therapist, then integrate a RiveScript number-guessing game, and finally a local LLM.

Code: https://thecodingtrain.com/challenges/188-voice-chatbot

🚀 Watch this video ad-free on Nebula https://nebula.tv/videos/codingtrain-coding-challenge-188-voice-chatbot

p5.js Web Editor Sketches:
🕹️ LLM Chatbot: https://editor.p5js.org/codingtrain/sketches/RHhT9I4Nm
🕹️ Number Guessing Bot: https://editor.p5js.org/codingtrain/sketches/AJw7zMN9q
🕹️ Therapy Bot: https://editor.p5js.org/codingtrain/sketches/37LFEPUVV
🕹️ Model Loading Bars: https://editor.p5js.org/codingtrain/sketches/E9Ob3x8eJ
🕹️ Waveform of Recording: https://editor.p5js.org/codingtrain/sketches/cck49wDub
🕹️ Real Time Waveform: https://editor.p5js.org/codingtrain/sketches/aaRIT-x6a

🎥 Previous: https://youtu.be/g3-PXyF8U70?list=PLRqwX-V7Uu6ZiZxtDDRCi6uhfTH4FilpH
🎥 All: https://www.youtube.com/playlist?list=PLRqwX-V7Uu6ZiZxtDDRCi6uhfTH4FilpH

References:
📓 p5.2 Reference: https://beta.p5js.org
📓 Introducing Whisper: https://cdn.openai.com/papers/whisper.pdf
📓 Model Cards for Model Reporting: https://arxiv.org/abs/1810.03993
📓 Open Neural Network Exchange: https://onnx.ai
📓 Onnx-community Whisper-tiny.en model: https://huggingface.co/onnx-community/whisper-tiny.en
📓 Xenova: https://github.com/xenova
📓 Transformers.js: https://huggingface.co/docs/transformers.js/installation
📓 Announcing the new p5.sound.js library!: https://medium.com/processing-foundation/announcing-the-new-p5-sound-js-library-42efc154bed0
📓 getUserMedia() documentation: https://developer.mozilla.org/en-US/docs/Web/API/MediaDevices/getUserMedia
📓 MediaRecorder() documentation: https://developer.mozilla.org/en-US/docs/Web/API/MediaRecorder
📓 Kokoro Repo: https://github.com/hexgrad/kokoro
📓 KokoroTTS Model: https://huggingface.co/hexgrad/Kokoro-82M
📓 ELIZA: https://en.wikipedia.org/wiki/ELIZA
📓 Rivescript: https://www.rivescript.com
📓 SmolLM3: https://huggingface.co/HuggingFaceTB/SmolLM3-3B
📓 Running models on WebGPU: https://huggingface.co/docs/transformers.js/guides/webgpu
📓 Using quantized models (dtypes): https://huggingface.co/docs/transformers.js/v3.8.1/guides/dtypes

Videos:
🚂 https://youtu.be/0Ad5Frf8NBM
🚂 https://youtu.be/KR61bXsPlLU

Live Stream Archives:
🔴 https://www.youtube.com/watch?v=KRDJAHArqaw

Related Coding Challenges:
🚂 https://youtu.be/eGFJ8vugIWA
🚂 https://youtu.be/8Z9FRiW2Jlc
🚂 https://youtu.be/iFTgphKCP9U

Timestamps:
0:00:00 Hello!
0:00:35 Mapping out the pieces: speech-to-text, text-to-speech, and the brain
0:01:07 Thoughts on AI and creative exploration
0:02:44 Choosing the tools: Whisper and Kokoro TTS
0:04:06 Building a push-to-talk UI in p5.js
0:04:51 Finding models on Hugging Face with Transformers.js
0:05:36 About the Whisper model and model cards
0:06:55 Loading the Whisper pipeline in p5.js
0:09:04 Accessing the microphone with getUserMedia
0:10:44 Capturing audio with MediaRecorder
0:12:05 Processing audio chunks into a waveform
0:15:55 Speech-to-text working!
0:16:36 Building the chatbot brain (ELIZA-style therapist)
0:18:50 Setting up Kokoro TTS for text-to-speech
0:21:07 Playing synthesized audio with AudioBufferSource
0:23:41 Text-to-speech working!
0:25:32 Handling playback events
0:26:56 Swapping in a RiveScript number-guessing brain
0:31:22 Adding a language model (SmolLM2) as the brain
0:38:33 Final demo: the random number chatbot
0:39:03 Goodbye!
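
For readers who want a feel for the simplest "brain" mentioned above, here is a minimal sketch of an ELIZA-style responder in plain JavaScript. It is not the code from the video: the `rules` list, the `respond` function name, and the reply wording are all hypothetical, chosen only to illustrate the idea of matching transcribed text against ordered patterns and echoing part of the input back.

```javascript
// Hypothetical ELIZA-style "brain": an ordered list of pattern/reply rules.
// The rule set and names below are illustrative assumptions, not the video's code.
const rules = [
  { pattern: /\bi feel (.+)/i, reply: (m) => `Why do you feel ${m[1]}?` },
  { pattern: /\bi am (.+)/i, reply: (m) => `How long have you been ${m[1]}?` },
  { pattern: /\b(hello|hi)\b/i, reply: () => "Hello. What is on your mind?" },
];

// Takes the text produced by speech-to-text and returns the text to speak back.
function respond(input) {
  // Drop trailing punctuation that a transcriber may append.
  const text = input.trim().replace(/[.!?]+$/, "");
  for (const rule of rules) {
    const match = text.match(rule.pattern);
    if (match) return rule.reply(match); // first matching rule wins
  }
  return "Tell me more."; // fallback when nothing matches
}

console.log(respond("I feel tired today."));
```

Because the brain is just a function from text to text, the same slot could in principle hold a RiveScript bot or a local LLM instead, which is the progression the description outlines.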
Editing by Mathieu Blanchette
Animations by Jason Heglund
Music from Epidemic Sound

🚂 Website: https://thecodingtrain.com/
👾 Share Your Creation! https://thecodingtrain.com/guides/passenger-showcase-guide
🚩 Suggest Topics: https://github.com/CodingTrain/Suggestion-Box
💡 GitHub: https://github.com/CodingTrain
💬 Discord: https://thecodingtrain.com/discord
💖 Membership: http://youtube.com/thecodingtrain/join
🛒 Store: https://standard.tv/codingtrain
🖋️ Twitter: https://twitter.com/thecodingtrain
📸 Instagram: https://www.instagram.com/the.coding.train/

🎥 https://www.youtube.com/playlist?list=PLRqwX-V7Uu6ZiZxtDDRCi6uhfTH4FilpH
🎥 https://www.youtube.com/playlist?list=PLRqwX-V7Uu6Zy51Q-x9tMWIv9cueOFTFA

🔗 p5.js: https://p5js.org
🔗 p5.js Web Editor: https://editor.p5js.org/
🔗 Processing: https://processing.org

📄 Code of Conduct: https://github.com/CodingTrain/Code-of-Conduct

This description was auto-generated. If you spot a problem, please open an issue: https://github.com/CodingTrain/thecodingtrain.com/issues/new

#texttospeech #speechtotext #chatbot #rivescript #llms #agents #ai #transformersjs #webaudioapi #javascript #p5js