Updated Mar 28, 2022 - Python
#speech-processing
Here are 370 public repositories matching this topic...
A PyTorch-based Speech Toolkit
audio
deep-learning
transformers
pytorch
voice-recognition
speech-recognition
speech-to-text
language-model
speaker-recognition
speaker-verification
speech-processing
audio-processing
asr
speaker-diarization
speechrecognition
speech-separation
speech-enhancement
spoken-language-understanding
huggingface
speech-toolkit
Reading list for research topics in multimodal machine learning
machine-learning
natural-language-processing
reinforcement-learning
computer-vision
deep-learning
robotics
healthcare
reading-list
representation-learning
speech-processing
multimodal-learning
Updated Mar 23, 2022
WaveNet vocoder
Updated Nov 2, 2020 - Python
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
tutorial
detection
extraction
citation
pytorch
pretrained-models
speaker-recognition
speaker-verification
speech-processing
speaker-diarization
voice-activity-detection
speech-activity-detection
speaker-change-detection
speaker-embedding
pyannote-audio
overlapped-speech-detection
speaker-diarization-pipeline
Updated Mar 25, 2022 - Python
SincNet is a neural architecture for efficiently processing raw audio samples.
audio
python
deep-learning
signal-processing
waveform
cnn
pytorch
artificial-intelligence
speech-recognition
neural-networks
convolutional-neural-networks
digital-signal-processing
filtering
speaker-recognition
speaker-verification
speech-processing
audio-processing
asr
timit
speaker-identification
Updated Apr 28, 2021 - Python
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
machine-learning
awesome
deep-learning
speech-recognition
awesome-list
speech-processing
speaker-diarization
Updated Mar 16, 2022
manrajgrover commented Jul 16, 2020
What?
Currently, the API manually throws its own messages and errors. We should move them to Werkzeug exceptions.
good first issue
Good for newcomers
text-to-speech
tts
speech-synthesis
voice-recognition
speech-recognition
speech-to-text
stt
speech-processing
voice-activity-detection
speech-separation
speech-emotion-recognition
voice-cloning
Updated Jan 25, 2022
A neural network for end-to-end speech denoising
machine-learning
deep-learning
end-to-end
speech
neural-networks
wavenet
speech-processing
speech-denoising
Updated Jul 24, 2019 - Python
Speech recognition toolkit for the Arduino
Updated May 5, 2021 - C++
A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.
deep-neural-networks
signal-processing
machine-learning-algorithms
speech-processing
speech-enhancement
Updated Dec 1, 2020 - MATLAB
Problem Agnostic Speech Encoder
deep-learning
pytorch
unsupervised-learning
speech-processing
multi-task-learning
waveform-analysis
self-supervised-learning
Updated May 20, 2020 - Python
Novoic's audio feature extraction library
audio
python
machine-learning
statistics
signal-processing
waveform
healthcare
feature-extraction
dimension
speech-processing
audio-processing
docstrings
alzheimers-disease
parkinsons-disease
Updated Mar 4, 2022 - Python
Library to build speech synthesis systems designed for easy and fast prototyping.
Updated Jan 4, 2022 - Python
A python wrapper for Speech Signal Processing Toolkit (SPTK).
Updated Jan 4, 2022 - Python
This repository contains an implementation of "Neural Voice Cloning With Few Samples"
deep-learning
voice
tts
speech-processing
voice-synthesis
saidl
speaker-adaptation
voice-cloning
speaker-encodings
mel-spectogram
Updated Feb 23, 2021 - Python
The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain, users can easily create speech processing systems, including speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.
deep-learning
neural-network
speech
speech-recognition
neural-networks
deeplearning
speech-to-text
speaker-recognition
speaker-verification
speech-processing
speech-recognizer
beamforming
speech-analysis
timit
speechrecognition
speech-api
speech-separation
librispeech
speech-emotion-recognition
speaker-identification
Updated Mar 9, 2022 - HTML
TensorFlow 2.x implementation of the DTLN real-time speech denoising model, with TF-Lite, ONNX, and real-time audio processing support.
audio
raspberry-pi
deep-learning
tensorflow
keras
speech-processing
dns-challenge
noise-reduction
audio-processing
real-time-audio
speech-enhancement
speech-denoising
onnx
tf-lite
noise-suppression
dtln-model
Updated Mar 9, 2022 - Python
This repo summarizes the tutorials, datasets, papers, code, and tools for the speech separation and speaker extraction tasks. Pull requests are welcome.
deep-neural-networks
deep-learning
signal-processing
speech-processing
speech-analysis
speech-separation
Updated Jan 9, 2021 - MATLAB
On-device speech-to-text engine powered by deep learning
deep-learning
voice-commands
voice-recognition
speech-to-text
transcription
voice-control
speech-processing
asr
voice-assistant
edge-computing
on-device
speech-recoginition
Updated Mar 22, 2022 - Java
Real-time GCC-NMF Blind Speech Separation and Enhancement
machine-learning
real-time
gcc
speech
ipython-notebook
low-latency
dictionary-learning
speaker
speech-processing
cross-correlation
nmf
real-time-processing
unsupervised-machine-learning
speech-separation
speech-enhancement
gcc-nmf
generalized-cross-correlation
tdoa
Updated Apr 8, 2019 - Python
VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network
Updated Feb 4, 2022 - Python
Tracking the progress in non-autoregressive generation (translation, transcription, etc.)
natural-language-processing
machine-translation
artificial-intelligence
speech-recognition
natural-language-generation
speech-processing
Updated Feb 12, 2022
Implementation of the research paper "Neural Voice Cloning with Few Samples" by Baidu
speech
speech-synthesis
encodings
speech-processing
speaker-embeddings
mel-spectrogram
voice-cloning
speaker-encodings
Updated Feb 23, 2021 - Python
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
audio
reproducible-research
paper
speech
pytorch
band
speech-processing
noise-reduction
denoising
speech-separation
speech-enhancement
narrow-band
single-channel
pretrained-model
band-fusion-model
full-band
sub-band
Updated Jan 21, 2022 - Python
PyTorch implementation of VQ-VAE + WaveNet by [Chorowski et al., 2019] and VQ-VAE on speech signals by [van den Oord et al., 2017]
Updated Aug 13, 2019 - Python
ngragaei commented Jul 27, 2020
frames[-1] = np.append(frames[-1], np.array([0]*(frame_length - len(frames[0]))))
TypeError: can't multiply sequence by non-int of type 'float'
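The error above is what Python raises when a list is repeated by a float, for example `[0] * 2.0`. A minimal sketch of one possible fix follows, assuming `frame_length` arrives as a float (for instance, a window duration multiplied by a sample rate). The helper name `pad_last_frame` is hypothetical and is not part of the library. The sketch also measures the last frame rather than the first when computing the padding, which is an assumption about the intended behavior.

```python
import numpy as np


def pad_last_frame(frames, frame_length):
    """Zero-pad the final frame so it holds `frame_length` samples.

    Hypothetical helper for illustration; not the library's own code.
    """
    # Assumption: frame_length may be a float, so convert it to an int
    # before using it as a sample count.
    frame_length = int(round(frame_length))
    frames = list(frames)  # avoid mutating the caller's list
    # Assumption: the padding should be based on the last frame's length.
    missing = frame_length - len(frames[-1])
    if missing > 0:
        # np.pad appends `missing` zeros to the right of the last frame.
        frames[-1] = np.pad(np.asarray(frames[-1]), (0, missing), mode="constant")
    return frames


frames = [np.ones(4), np.ones(4), np.ones(2)]
padded = pad_last_frame(frames, frame_length=4.0)
print([len(f) for f in padded])
```

Converting once at the top keeps every later use of `frame_length` integer-valued, so the same float cannot cause a similar failure further down.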
bug
Something isn't working
good first issue
Good for newcomers
spafe.utils
scripts under spafe/utils
UniSpeech - Large Scale Self-Supervised Learning for Speech
speech
pytorch
speech-recognition
speaker-verification
speech-processing
speech-separation
diarization
speech-diarization
Updated Jan 10, 2022 - Python
I'd like to train this model on 8 V100 GPUs. Does it support multi-GPU training?