huggingface / transformers

Open

Add error message to Wav2Vec2 & Hubert if labels > vocab_size

1

patrickvonplaten commented Jun 20, 2021

🚀 Feature request

Add a better error message to HubertForCTC and Wav2Vec2ForCTC when labels are larger than the vocab size.

Motivation

Following huggingface/transformers#12264, it is clear that an error should be raised if any of the labels are > self.config.vocab_size; otherwise silent errors can sneak into the training script.

So w
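The check requested above could be sketched roughly as follows. This is a minimal, framework-free illustration, not the code used in transformers: the helper name `check_labels`, the `IGNORE_INDEX` padding value of -100, and the use of plain Python lists instead of tensors are all assumptions made for the example. It also assumes valid label ids run from 0 to vocab_size - 1, so it rejects labels equal to the vocab size as well as larger ones.

```python
from typing import Iterable, Sequence

# Assumption: padded label positions are marked with -100 and should be skipped.
IGNORE_INDEX = -100


def check_labels(batch_labels: Iterable[Sequence[int]], vocab_size: int) -> None:
    """Raise ValueError if any non-padding label cannot index the vocabulary.

    Hypothetical helper for illustration only; valid ids are assumed to be
    0 .. vocab_size - 1.
    """
    for row, labels in enumerate(batch_labels):
        for label in labels:
            if label == IGNORE_INDEX:
                continue  # padding, not a real target
            if label < 0 or label >= vocab_size:
                raise ValueError(
                    f"Label {label} in example {row} is outside the valid "
                    f"range [0, {vocab_size - 1}] for vocab_size={vocab_size}"
                )


# Passes silently: every real label is below the vocab size.
check_labels([[0, 5, 31, IGNORE_INDEX]], vocab_size=32)
```

Running such a check before computing the loss turns a silent training error into an immediate, readable failure.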

Good First Issue

Open

[Performance] Tracking open Issues and PRs (pytorch transformers)

Open

Getting time offsets of beginning and end of each word in Wav2Vec2

17

mozilla / DeepSpeech

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

machine-learning embedded deep-learning offline tensorflow speech-recognition neural-networks speech-to-text deepspeech on-device

Updated Jun 27, 2021
C++

kaldi-asr / kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

shell c-plus-plus cuda speech speech-recognition speech-to-text kaldi speaker-verification speaker-id

Updated Jul 1, 2021
Shell

kmario23 / deep-learning-drizzle

Drench yourself in Deep Learning, Reinforcement Learning, Machine Learning, Computer Vision, and NLP by learning from these exciting lectures!!

machine-learning natural-language-processing deep-neural-networks reinforcement-learning computer-vision deep-learning optimization machine-translation deep-reinforcement-learning medical-imaging speech-recognition artificial-neural-networks pattern-recognition probabilistic-graphical-models bayesian-statistics artificial-intelligence-algorithms visual-recognition graph-neural-networks

Updated May 21, 2021

leon-ai / leon

Open

Fedora & apt-get

2

AsterYujano commented Oct 5, 2019

Specs

Leon version: latest
OS (or browser) version: Fedora 30
Node.js version: 10.16.3
Complete "npm run check" output:

➡ Here is the diagnosis about your current setup
✔ Run
✔ Run modules
✔ Reply you by texting
❗ Amazon Polly text-to-speech
❗ Google Cloud text-to-speech
❗ Watson text-to-speech
❗ Offline text-to-speech
❗ Google Cloud speech-to-text
❗ Watson spee

bug good first issue

Open

Can I contribute to crypto package

6

Open

How old are you package

6

TalAter / annyang

💬 Speech recognition for your site

demo gui tutorial voice speech speech-recognition speech-to-text hacktoberfest

Updated Mar 26, 2021
JavaScript

flashlight / wav2letter

Facebook AI Research's Automatic Speech Recognition Toolkit

deep-learning cpp end-to-end speech-recognition wav2letter

Updated Jul 1, 2021
C++

Uberi / speech_recognition

Speech recognition module for Python, supporting several engines and APIs, online and offline.

audio python speech-recognition speech-to-text

Updated Feb 28, 2021
Python

nl8590687 / ASRT_SpeechRecognition

A Deep-Learning-Based Chinese Speech Recognition System

python tensorflow keras cnn speech-recognition speech-to-text ctc chinese-speech-recognition asrt

Updated May 16, 2021
Python

espnet / espnet

End-to-End Speech Processing Toolkit

deep-learning chainer end-to-end machine-translation pytorch speech-synthesis speech-recognition kaldi voice-conversion speech-separation speech-enhancement speech-translation

Updated Jul 4, 2021
Python

NVIDIA / NeMo

NeMo: a toolkit for conversational AI

nlp text-to-speech deep-learning neural-network machine-translation speech-synthesis speech-recognition speech-to-text nmt nlp-machine-learning

Updated Jul 4, 2021
Jupyter Notebook

cmusphinx / pocketsphinx

PocketSphinx is a lightweight speech recognition engine, specifically tuned for handheld and mobile devices, though it works equally well on the desktop

python c speech-recognition

Updated Jan 8, 2021
C

zzw922cn / Automatic_Speech_Recognition

End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow

audio deep-learning tensorflow paper end-to-end evaluation cnn lstm speech-recognition rnn automatic-speech-recognition feature-vector data-preprocessing phonemes timit-dataset layer-normalization rnn-encoder-decoder chinese-speech-recognition

Updated May 25, 2021
Python

speechbrain / speechbrain

A PyTorch-based Speech Toolkit

audio transformers pytorch voice-recognition speech-recognition speech-to-text language-model speaker-recognition speaker-verification speech-processing audio-processing asr speaker-diarization speechrecognition speech-separation speech-enhancement spoken-language-understanding huggingface speech-toolkit speechbrain

Updated Jul 4, 2021
Python

tensorflow / lingvo

Lingvo

nlp research translation tensorflow machine-translation speech distributed tts speech-synthesis mnist speech-recognition lm seq2seq speech-to-text gpu-computing language-model asr

Updated Jul 4, 2021
Python

pannous / tensorflow-speech-recognition

🎙Speech recognition using the tensorflow deep learning framework, sequence-to-sequence neural networks

deep-learning neural-network tensorflow speech-recognition speech-to-text stt

Updated Nov 20, 2018
Python

mravanelli / pytorch-kaldi

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.

deep-neural-networks deep-learning speech dnn pytorch recurrent-neural-networks lstm gru speech-recognition rnn kaldi rnn-model asr lstm-neural-networks multilayer-perceptron-network timit dnn-hmm

Updated Mar 15, 2021
Python

alphacep / vosk-api

Open

Compress symbol table

3

nshmyrev commented Aug 4, 2020

One can use https://github.com/s-yata/marisa-trie to save a lot of space for symbols.

good first issue help wanted

zzw922cn / awesome-speech-recognition-speech-synthesis-papers

Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)

roadmap neural-network cnn dnn tts speech-synthesis speech-recognition rnn seq2seq automatic-speech-recognition papers language-model attention-mechanism speaker-verification timit-dataset acoustic-model recognition-synthesis

Updated Jun 30, 2021

yanshengjia / ml-road

Machine Learning Resources, Practice and Research

nlp machine-learning computer-vision deep-learning tensorflow pytorch speech-recognition

Updated Apr 11, 2021
Python

syhw / wer_are_we

An attempt at tracking the state of the art and recent results (bibliography) on speech recognition.

speech-recognition wer deep-neural-network

Updated Dec 21, 2020

astorfi / lip-reading-deeplearning

🔓 Lip Reading - Cross Audio-Visual Recognition using 3D Architectures

computer-vision deep-learning tensorflow speech-recognition 3d-convolutional-network

Updated Mar 3, 2020
Python

bjoernkarmann / project_alias

Alias is a teachable “parasite” that is designed to give users more control over their smart assistants, both when it comes to customisation and privacy. Through a simple app the user can train Alias to react on a custom wake-word/sound, and once trained, Alias can take control over your home assistant by activating it for you.

raspberry-pi machine-learning hack smarthome microphone speech-recognition classification alias sound-synthesis wakeword