A PyTorch-based Speech Toolkit
Updated Jul 27, 2023 - Python
Reading list for research topics in multimodal machine learning
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
WaveNet vocoder
Foundation Architecture for (M)LLMs
PyTorch implementation of convolutional neural network-based text-to-speech synthesis models
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
SincNet is a neural architecture for efficiently processing raw audio samples.
Open source audio annotation tool for humans
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
A neural network for end-to-end speech denoising
A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.
General Speech Restoration
TensorFlow 2.x implementation of the DTLN real-time speech denoising model, with TensorFlow Lite, ONNX, and real-time audio processing support.
Speech recognition toolkit for the Arduino
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
Problem Agnostic Speech Encoder
This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.