# speech-processing

Here are 370 public repositories matching this topic...

speechbrain / speechbrain

A PyTorch-based Speech Toolkit

audio deep-learning transformers pytorch voice-recognition speech-recognition speech-to-text language-model speaker-recognition speaker-verification speech-processing audio-processing asr speaker-diarization speechrecognition speech-separation speech-enhancement spoken-language-understanding huggingface speech-toolkit

Updated Mar 28, 2022
Python

pliang279 / awesome-multimodal-ml

Reading list for research topics in multimodal machine learning

machine-learning natural-language-processing reinforcement-learning computer-vision deep-learning robotics healthcare reading-list representation-learning speech-processing multimodal-learning

Updated Mar 23, 2022

r9y9 / wavenet_vocoder

WaveNet vocoder

python speech pytorch speech-synthesis wavenet speech-processing wavenet-vocoder neural-vocoder

Updated Nov 2, 2020
Python

r9y9 / deepvoice3_pytorch

Open

Multi GPU Support

tanmayb123 commented Mar 4, 2018

I'd like to train this model on 8 V100 GPUs - does it support multi GPU training?

enhancement help wanted good first issue

pyannote / pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

tutorial detection extraction citation pytorch pretrained-models speaker-recognition speaker-verification speech-processing speaker-diarization voice-activity-detection speech-activity-detection speaker-change-detection speaker-embedding pyannote-audio overlapped-speech-detection speaker-diarization-pipeline

Updated Mar 25, 2022
Python

mravanelli / SincNet

SincNet is a neural architecture for efficiently processing raw audio samples.

audio python deep-learning signal-processing waveform cnn pytorch artificial-intelligence speech-recognition neural-networks convolutional-neural-networks digital-signal-processing filtering speaker-recognition speaker-verification speech-processing audio-processing asr timit speaker-identification

Updated Apr 28, 2021
Python

wq2012 / awesome-diarization

A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

machine-learning awesome deep-learning speech-recognition awesome-list speech-processing speaker-diarization

Updated Mar 16, 2022

midas-research / audino

Open

Move error handling to Flask

manrajgrover commented Jul 16, 2020

What?

Currently, the API manually builds and returns its own error messages. We should replace them with werkzeug exceptions.

good first issue
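The change requested in the issue above could look roughly like the following. This is a minimal sketch, not audino's actual code: the `PROJECTS` store, the `/projects/<id>` route, and the JSON payload shape are all hypothetical, and only the Flask and werkzeug calls themselves are real APIs.

```python
from flask import Flask, jsonify
from werkzeug.exceptions import HTTPException, NotFound

app = Flask(__name__)

# Hypothetical in-memory store standing in for the real database.
PROJECTS = {1: {"id": 1, "name": "demo"}}


@app.errorhandler(HTTPException)
def handle_http_exception(error):
    """Turn any werkzeug HTTP exception into a uniform JSON response."""
    payload = {"error": error.name, "message": error.description}
    return jsonify(payload), error.code


@app.route("/projects/<int:project_id>")
def get_project(project_id):
    project = PROJECTS.get(project_id)
    if project is None:
        # Raise instead of hand-building an error response in every view.
        raise NotFound(f"Project {project_id} does not exist")
    return jsonify(project)
```

With a single handler registered for `HTTPException`, each view only raises the appropriate exception class and the response format is defined in one place.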

Open

Add filename to annotation dashboard

Open

Convert tutorial assets to drawio

coqui-ai / open-speech-corpora

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

text-to-speech tts speech-synthesis voice-recognition speech-recognition speech-to-text stt speech-processing voice-activity-detection speech-separation speech-emotion-recognition voice-cloning

Updated Jan 25, 2022

drethage / speech-denoising-wavenet

A neural network for end-to-end speech denoising

machine-learning deep-learning end-to-end speech neural-networks wavenet speech-processing speech-denoising

Updated Jul 24, 2019
Python

arjo129 / uSpeech

Speech recognition toolkit for the Arduino

arduino speech-recognition signal speech-processing

Updated May 5, 2021
C++

nanahou / Awesome-Speech-Enhancement

A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.

deep-neural-networks signal-processing machine-learning-algorithms speech-processing speech-enhancement

Updated Dec 1, 2020
MATLAB

santi-pdp / pase

Problem Agnostic Speech Encoder

deep-learning pytorch unsupervised-learning speech-processing multi-task-learning waveform-analysis self-supervised-learning

Updated May 20, 2020
Python

novoic / surfboard

Novoic's audio feature extraction library

audio python machine-learning statistics signal-processing waveform healthcare feature-extraction dimension speech-processing audio-processing docstrings alzheimers-disease parkinsons-disease

Updated Mar 4, 2022
Python

r9y9 / nnmnkwii

Library to build speech synthesis systems designed for easy and fast prototyping.

python machine-learning text-to-speech speech-synthesis voice-conversion speech-processing

Updated Jan 4, 2022
Python

r9y9 / pysptk

A Python wrapper for the Speech Signal Processing Toolkit (SPTK).

python dsp speech speech-synthesis python-wrapper digital-signal-processing speech-processing sptk

Updated Jan 4, 2022
Python

Ryuk17 / SpeechAlgorithms

Speech Algorithms Collections

speech-processing

Updated Mar 21, 2022
C

SforAiDl / Neural-Voice-Cloning-With-Few-Samples

This repository contains an implementation of "Neural Voice Cloning With Few Samples"

deep-learning voice tts speech-processing voice-synthesis saidl speaker-adaptation voice-cloning speaker-encodings mel-spectogram

Updated Feb 23, 2021
Python

speechbrain / speechbrain.github.io

The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.

deep-learning neural-network speech speech-recognition neural-networks deeplearning speech-to-text speaker-recognition speaker-verification speech-processing speech-recognizer beamforming speech-analysis timit speechrecognition speech-api speech-separation librispeech speech-emotion-recognition speaker-identification

Updated Mar 9, 2022
HTML

breizhn / DTLN

TensorFlow 2.x implementation of the DTLN real-time speech denoising model, with TF-lite, ONNX, and real-time audio processing support.

audio raspberry-pi deep-learning tensorflow keras speech-processing dns-challenge noise-reduction audio-processing real-time-audio speech-enhancement speech-denoising onnx tf-lite noise-suppression dtln-model

Updated Mar 9, 2022
Python

gemengtju / Tutorial_Separation

This repo summarizes tutorials, datasets, papers, code, and tools for the speech separation and speaker extraction tasks. Pull requests are welcome.

deep-neural-networks deep-learning signal-processing speech-processing speech-analysis speech-separation

Updated Jan 9, 2021
MATLAB

Picovoice / leopard

On-device speech-to-text engine powered by deep learning

deep-learning voice-commands voice-recognition speech-to-text transcription voice-control speech-processing asr voice-assistant edge-computing on-device speech-recoginition

Updated Mar 22, 2022
Java

seanwood / gcc-nmf

Real-time GCC-NMF Blind Speech Separation and Enhancement

machine-learning real-time gcc speech ipython-notebook low-latency dictionary-learning speaker speech-processing cross-correlation nmf real-time-processing unsupervised-machine-learning speech-separation speech-enhancement gcc-nmf generalized-cross-correlation tdoa

Updated Apr 8, 2019
Python

rishikksh20 / VocGAN

VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network

text-to-speech speech-synthesis gan speech-processing vocoder melgan vocgan

Updated Feb 4, 2022
Python

kahne / NonAutoregGenProgress

Tracking the progress in non-autoregressive generation (translation, transcription, etc.)

natural-language-processing machine-translation artificial-intelligence speech-recognition natural-language-generation speech-processing

Updated Feb 12, 2022

Sharad24 / Neural-Voice-Cloning-with-Few-Samples

Implementation of Neural Voice Cloning with Few Samples Research Paper by Baidu

speech speech-synthesis encodings speech-processing speaker-embeddings mel-spectrogram voice-cloning speaker-encodings

Updated Feb 23, 2021
Python

haoxiangsnr / FullSubNet

PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."

audio reproducible-research paper speech pytorch band speech-processing noise-reduction denoising speech-separation speech-enhancement narrow-band single-channel pretrained-model band-fusion-model full-band sub-band

Updated Jan 21, 2022
Python

swasun / VQ-VAE-Speech

PyTorch implementation of VQ-VAE + WaveNet by [Chorowski et al., 2019] and VQ-VAE on speech signals by [van den Oord et al., 2017]

speech pytorch wavenet speech-processing vq-vae vq-vae-wavenet

Updated Aug 13, 2019
Python

SuperKogito / spafe

Open

Error in preprocessing

ngragaei commented Jul 27, 2020

frames[-1] = np.append(frames[-1], np.array([0]*(frame_length - len(frames[0]))))

TypeError: can't multiply sequence by non-int of type 'float'

bug good first issue spafe.utils
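The `TypeError` reported in the issue above is what Python raises when a list is multiplied by a float, which typically happens when `frame_length` is computed as sample rate times window length in seconds. The sketch below shows one possible workaround, assuming that is the cause. The helper `pad_last_frame` is hypothetical and is not part of spafe. It also measures the last frame rather than `frames[0]`, which is an assumption about the intended behavior.

```python
import numpy as np


def pad_last_frame(frames, frame_length):
    """Zero-pad the final frame so it holds frame_length samples.

    frames is a list of 1-D arrays. frame_length may arrive as a float
    (e.g. 16000 * 0.025), so it is converted to an int first.
    """
    frame_length = int(round(frame_length))
    missing = frame_length - len(frames[-1])
    if missing > 0:
        frames[-1] = np.append(frames[-1], np.zeros(missing))
    return frames


# Example: a 25 ms window at 16 kHz gives a float frame length.
frame_length = 16000 * 0.025
frames = [np.ones(400), np.ones(250)]
frames = pad_last_frame(frames, frame_length)
```

Converting once at the top of the helper keeps the rest of the arithmetic in integers, so the padding size is always a valid array length.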

Open

add a VAD

Open

missing tests for utils.cepstral.py

microsoft / UniSpeech

UniSpeech - Large Scale Self-Supervised Learning for Speech

speech pytorch speech-recognition speaker-verification speech-processing speech-separation diarization speech-diarization

Updated Jan 10, 2022
Python
