speech-to-text
Here are 1,523 public repositories matching this topic...
Fedora & apt-get
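Polyphonic characters are currently handled with pypinyin or g2pM, whose accuracy is limited. I would like to build a polyphone prediction model based on BERT (or ERNIE). Put simply, suppose a language has 100 polyphonic characters and each has at most 3 pronunciations: we can then attach 100 three-way classifiers (simple fc layers are enough) after BERT, and at prediction time look up the classifier for the character in question and use it to classify.
Reference paper:
tencent_polyphone.pdf
For data, the dataset provided by https://github.com/kakaobrain/g2pM can be used.
Advanced: multi-task BERT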
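The per-character classifier lookup described above can be sketched in a few lines. This is only an illustration under loud assumptions: the encoder (BERT or ERNIE) is not included, `hidden` stands in for its output vector at the character's position, and the `HEADS` table, its toy weights, and the function name `predict_pronunciation` are all hypothetical. The heads here have two classes each; the proposal allows up to three.

```python
# Hypothetical toy heads: hidden size 2, one small linear (fc) classifier
# per polyphonic character. Real weights would be learned jointly with the encoder.
HEADS = {
    "行": {"labels": ["xing2", "hang2"],
           "weights": [[1.0, 0.0], [0.0, 1.0]],
           "bias": [0.0, 0.0]},
    "长": {"labels": ["chang2", "zhang3"],
           "weights": [[1.0, 0.0], [0.0, 1.0]],
           "bias": [0.0, 0.0]},
}


def predict_pronunciation(char, hidden):
    """Route `hidden` to the classifier that belongs to `char`.

    Returns the highest-scoring pronunciation label, or None when the
    character has no head (i.e. it is not treated as polyphonic).
    """
    head = HEADS.get(char)
    if head is None:
        return None
    # Linear layer: one logit per candidate pronunciation.
    logits = [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(head["weights"], head["bias"])
    ]
    # Argmax over this head's classes only.
    best = max(range(len(logits)), key=logits.__getitem__)
    return head["labels"][best]
```

Keeping one small head per character means each classifier only ever competes among that character's own pronunciations, which is the point of the design in the proposal.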
, Chorus, One Poll, One Zero, Pole Zero, Two Pole, Two Zero, etc
A library called stk, under the zlib license, already implements these; maybe we can implement some of them.
Seek performance
Creating CSV files manually is a lot of work. This could be automated by a script if the name of the WAV file is the same as the transcript.
The same could be done for creating a language model input text file. A script could pull the transcript from the WAV file name.
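A minimal sketch of such a script follows. It assumes that each WAV file name spells out its transcript with underscores or hyphens between words (for example `hello_world.wav`), and that the CSV needs the columns `wav_filename`, `wav_filesize`, and `transcript`. Both the naming scheme and the column names are assumptions, not something the issue specifies, and the function names are hypothetical.

```python
import csv
from pathlib import Path

# Assumed column layout; adjust to whatever the training tool expects.
FIELDS = ["wav_filename", "wav_filesize", "transcript"]


def transcript_from_name(wav_path):
    """Derive a transcript from the file name: 'Hello_World.wav' -> 'hello world'."""
    stem = Path(wav_path).stem
    return stem.replace("_", " ").replace("-", " ").lower().strip()


def build_manifest(wav_dir, csv_path):
    """Write one CSV row per .wav file in wav_dir and return the rows."""
    rows = []
    for wav in sorted(Path(wav_dir).glob("*.wav")):
        rows.append({
            "wav_filename": wav.name,
            "wav_filesize": wav.stat().st_size,
            "transcript": transcript_from_name(wav),
        })
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return rows


def write_lm_corpus(rows, txt_path):
    """Write one transcript per line, as input text for a language model."""
    with open(txt_path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(row["transcript"] + "\n")
```

Calling `build_manifest("clips", "train.csv")` and then passing the returned rows to `write_lm_corpus` would produce both files from a single pass over the directory.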
Specs