natural-language-processing
Natural language processing (NLP) is a field of computer science that studies how computers and humans interact through natural language. In the 1950s, Alan Turing published an article that proposed a measure of machine intelligence, now called the Turing test. More recently, techniques such as deep learning have produced strong results in language modeling, parsing, and many other natural-language tasks.
Here are 9,521 public repositories matching this topic...
Change tensor.data to tensor.detach() due to pytorch/pytorch#6990 (comment): tensor.detach() is more robust than tensor.data.
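A minimal sketch of why, assuming PyTorch >= 0.4 (both share the tensor's storage, but only .detach() lets autograd notice a later in-place edit):

import torch

# .data hides in-place edits from autograd, so gradients can be silently wrong.
x = torch.ones(3, requires_grad=True)
y = x.sigmoid()          # sigmoid saves its output y for the backward pass
y.data.zero_()           # in-place edit, invisible to autograd
y.sum().backward()       # runs fine, but x.grad is all zeros: silently wrong
print(x.grad)            # tensor([0., 0., 0.]) instead of ~0.1966 per element

# .detach() shares storage too, but the version counter catches the edit.
x2 = torch.ones(3, requires_grad=True)
y2 = x2.sigmoid()
y2.detach().zero_()      # in-place edit, tracked by autograd
try:
    y2.sum().backward()
except RuntimeError as e:
    print("caught:", e)  # loud error instead of a corrupted gradient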
In gensim/models/fasttext.py:
model = FastText(
    vector_size=m.dim,
    window=m.ws,
    epochs=m.epoch,
    negative=m.neg,
    # FIXME: these next 2 lines read in unsupported FB FT modes (loss=3 softmax
    # or loss=4 onevsall, or model=3 supervised)
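For orientation, a hedged usage sketch of the public entry point that reaches this code, assuming gensim 4.x and a hypothetical Facebook-format file model.bin:

from gensim.models.fasttext import load_facebook_model

# Load a model trained with Facebook's fastText binary; gensim maps the
# native hyperparameters (dim, ws, epoch, neg, ...) onto FastText(...)
# keyword arguments as in the snippet above.
model = load_facebook_model("model.bin")  # hypothetical path
print(model.wv["language"])   # vector for an in-vocabulary word
print(model.wv["languagee"])  # OOV words still get a char-n-gram vector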
Although the results look correct and are consistent across all frameworks, there is a small consistency issue: the TensorFlow training loss/accuracy plots look as if they are sampled at fewer points, appearing straighter, smoother, and less wiggly than the PyTorch or MXNet plots. This can be seen clearly in chapter 6 (CNN LeNet).
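The difference is plausibly one of logging frequency rather than training behavior; a hypothetical illustration with synthetic data, plotting the same run logged every batch versus averaged per epoch:

import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
steps = np.arange(1000)
loss = np.exp(-steps / 300) + 0.05 * rng.standard_normal(1000)  # noisy decay

# Per-batch logging: a wiggly curve (what the PyTorch/MXNet plots resemble).
plt.plot(steps, loss, alpha=0.5, label="every batch")

# Per-epoch logging: the same run averaged in chunks of 100 steps looks
# smoother and straighter (what the TensorFlow plots resemble).
plt.plot(steps[::100], loss.reshape(10, 100).mean(axis=1), label="epoch mean")
plt.xlabel("step"); plt.ylabel("loss"); plt.legend(); plt.show()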
Is your feature request related to a problem? Please describe.
I typically use compressed datasets (e.g. gzipped) to save disk space. This works fine with AllenNLP during training because I can write my dataset reader to load the compressed data. However, the predict command opens the file and reads lines for the Predictor itself, so it fails when it tries to load data from my compressed files.
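One hedged sketch of the kind of fix being requested, assuming a helper (hypothetical name open_maybe_compressed) that the predict path could call instead of a bare open():

import gzip
from typing import IO

def open_maybe_compressed(path: str, mode: str = "rt") -> IO:
    """Open a file, transparently decompressing gzip based on the extension."""
    if path.endswith(".gz"):
        return gzip.open(path, mode)  # text mode yields lines like plain open()
    return open(path, mode)

# Usage: the predictor reads lines the same way for both formats.
# with open_maybe_compressed("dataset.jsonl.gz") as f:  # hypothetical file
#     for line in f:
#         ...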
Rather than simply caching nltk_data until the cache expires and it's forced to re-download the entire nltk_data, we should check index.xml and refresh the cache only when it differs from the previously cached copy. I would advise doing this in the same way that it's done for requirements.txt:
https://github.com/nltk/nltk/blob/59aa3fb88c04d6151f2409b31dcfe0f332b0c9ca/.github/wor
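A minimal sketch of that check, assuming the cache key is simply a hash of index.xml (the URL below is the nltk_data index location; the real CI wiring lives in the truncated workflow file above):

import hashlib
import urllib.request

INDEX_URL = "https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml"

def index_fingerprint() -> str:
    """Hash the current index.xml so the cache key changes whenever it does."""
    with urllib.request.urlopen(INDEX_URL) as resp:
        return hashlib.sha256(resp.read()).hexdigest()

# In CI, use the fingerprint as (part of) the cache key: when upstream
# nltk_data publishes a new index.xml, the key changes and the stale cache
# is rebuilt instead of waiting for it to expire.
print(index_fingerprint())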
Add T9 decoder
Hey Hackers of this spoopy month!
Welcome to the Ciphey repo(s)!
This issue requires you to add a decoder.
This wiki section walks you through EVERYTHING you need to know, and we've added some more links at the bottom of this issue to detail more about the decoder.
https://github.com/Ciphey/Ciphey/wiki#adding-your-own-crackers--decoders
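As a starting point, a hedged sketch of the decoding logic itself, assuming the multi-tap variant commonly labeled T9 in CTF challenges (repeated presses of a digit cycle through its letters, groups separated by spaces); Ciphey's actual decoder interface is documented in the wiki linked above:

KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

def decode_t9(ciphertext: str) -> str:
    """Decode multi-tap T9: e.g. '44' -> 'h' (two presses of the 4 key)."""
    out = []
    for group in ciphertext.split():
        digit, presses = group[0], len(group)
        letters = KEYPAD.get(digit)
        if letters is None or group != digit * presses:
            raise ValueError(f"not a T9 group: {group!r}")
        out.append(letters[(presses - 1) % len(letters)])
    return "".join(out)

print(decode_t9("44 33 555 555 666"))  # hello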
Good first issue
A current error occurs when a user forwards a batched tensor of input_ids that includes a padding token, e.g.
input_ids = torch.tensor([["hello", "this", "is", "a", "long", "string"], ["hello", "<pad>", "<pad>", "<pad>", "<pad>", "<pad>"]])
(here words stand in for token ids). In this case, the attention_mask should be provided as well; otherwise the output hidden_states will be computed incorrectly.
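A minimal sketch of the correct call, assuming the Hugging Face transformers API and the bert-base-uncased checkpoint; the tokenizer builds the attention_mask for you:

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

# Padding a batch: the tokenizer returns both input_ids and attention_mask,
# with 0s marking the padded positions.
batch = tokenizer(["hello this is a long string", "hello"],
                  padding=True, return_tensors="pt")
print(batch["attention_mask"])  # e.g. [[1,1,1,1,1,1,1,1], [1,1,1,0,0,0,0,0]]

# Pass both to the model: without attention_mask, every position attends to
# the pad tokens as if they were real input, corrupting the hidden states
# of the real tokens.
with torch.no_grad():
    out = model(input_ids=batch["input_ids"],
                attention_mask=batch["attention_mask"])
print(out.last_hidden_state.shape)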