DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
kaldi-asr/kaldi is the official location of the Kaldi project.
Port of OpenAI's Whisper model in C/C++
Speech recognition module for Python, supporting several engines and APIs, online and offline.
A Deep-Learning-Based Chinese Speech Recognition System
NeMo: a toolkit for conversational AI
A PyTorch-based Speech Toolkit
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
Lingvo
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Gathers machine learning and TensorFlow deep learning models for NLP problems, 1.13 < TensorFlow < 2.0
Kalliope is a framework that will help you to create your own personal assistant.
Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
Free, easy, portable audio engine for games
The open-source virtual assistant for Ubuntu-based Linux distributions