All Questions
Tagged with punctuation nltk
13 questions
2 votes, 1 answer, 354 views
How do I solve AttributeError: 'float' object has no attribute 'encode'
This is the code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
plt.style.use('ggplot')
import nltk
df = pd.read_csv('/kaggle/input/starbucks-review-...
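The excerpt is cut off before the failing line, so the cause is not visible here. One common source of this error is a missing value in a text column: it loads as a float NaN, and any string method called on it fails. A minimal sketch of a guard, assuming that is the cause (the sample values and the `to_text` helper are hypothetical):

```python
# Hypothetical review values; a missing CSV cell typically loads as float NaN.
reviews = ["Great coffee!", float("nan"), "Too slow."]


def to_text(value):
    """Return the value unchanged if it is a string, otherwise an empty string."""
    return value if isinstance(value, str) else ""


# Normalise every entry to a string before handing it to a tokenizer.
texts = [to_text(r) for r in reviews]
print(texts)
```

In pandas, the analogous step would be filling the missing values of the text column with empty strings before processing it; the column name is not shown in the excerpt.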
0 votes, 0 answers, 254 views
Tokenizing Strings without Punctuation in Python and putting punctuation back subsequently
After reading here for a while already, I have decided to make a post because I am not getting anywhere with my problem. Unfortunately, I am just a "finance guy" and need some help in coding ...
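One possible approach, sketched under the assumption that the goal is to process only the words and then reassemble the original text: `re.split` with a capturing group keeps the separators, so punctuation and spacing can be joined back unchanged. The sample sentence and the upper-casing step are placeholders for the real processing.

```python
import re

text = "Revenue rose, but costs (sadly) rose too."

# The capturing group makes re.split keep the punctuation/whitespace separators.
parts = re.split(r"(\W+)", text)

# Transform only the word parts; upper-casing stands in for real processing.
processed = [p.upper() if re.fullmatch(r"\w+", p) else p for p in parts]

# Joining the parts restores punctuation and spacing in their original places.
restored = "".join(processed)
print(restored)
```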
-1 votes, 4 answers, 605 views
Remove punctuation marks from tokenized text using for loop
I'm trying to remove punctuation from tokenized text in Python like so:
word_tokens = nltk.word_tokenize(text)
w = word_tokens
for e in word_tokens:
    if e in punctuation_marks:
        w.remove(e)
...
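In the snippet above, `w` and `word_tokens` refer to the same list, so items are removed from the list while it is being iterated, which causes elements to be skipped. A minimal sketch of the usual alternative, building a new list instead (the token list here is a hypothetical stand-in for the tokenizer output):

```python
import string

# Hypothetical token list standing in for the tokenizer output.
word_tokens = ["Hello", ",", ",", "world", "!", "Bye", "."]
punctuation_marks = set(string.punctuation)

# Build a new list rather than removing items from the list being iterated.
filtered = [tok for tok in word_tokens if tok not in punctuation_marks]
print(filtered)
```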
1 vote, 0 answers, 270 views
Removing punctuation with an exception in python
I am trying to remove punctuation from a given string in python.
It works well, however the data I am using includes lots of ":D" or ":)" or ":(".
Therefore when I ...
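A sketch of one way to do this with a single regular expression: try the emoticons first and keep them, and drop any other punctuation character. The emoticon list and the function name are assumptions for illustration.

```python
import re

# Assumed emoticon set; extend as needed.
EMOTICONS = [":D", ":)", ":("]
emoticon_re = "|".join(re.escape(e) for e in EMOTICONS)

# Either an emoticon (captured, kept) or one other punctuation character (dropped).
pattern = re.compile(rf"({emoticon_re})|[^\w\s]")


def strip_punct_keep_emoticons(text):
    # group(1) is set only when an emoticon matched; otherwise replace with "".
    return pattern.sub(lambda m: m.group(1) or "", text)


print(strip_punct_keep_emoticons("Great day, isn't it? :D So sad... :("))
```

Because the emoticon alternative comes first, a ":" that starts an emoticon is never consumed as plain punctuation.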
0 votes, 1 answer, 1k views
Error when using string.punctuation to remove punctuation for a string
Quick question:
I'm using string and nltk.stopwords to strip a block of text of all its punctuation and stopwords as part of data pre-processing before feeding it into some natural language ...
-2 votes, 2 answers, 2k views
Python program to put proper punctuation in a given string
I want to put proper punctuation marks in a given paragraph containing many unpunctuated sentences.
E.g.:
input: hey how are you can you come today
output: hey, how are you? can you come today?
I just ...
1 vote, 3 answers, 14k views
Removing stop words and string.punctuation
I can't figure out why this doesn't work:
import nltk
from nltk.corpus import stopwords
import string
with open('moby.txt', 'r') as f:
    moby_raw = f.read()
stop = set(stopwords.words('...
6 votes, 1 answer, 7k views
How to preserve punctuation marks in Scikit-Learn text CountVectorizer or TfidfVectorizer?
Is there any way to preserve the punctuation marks !, ?, " and ' in my text documents using the parameters of scikit-learn's CountVectorizer or TfidfVectorizer?
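Both vectorizers accept a `token_pattern` regular expression, so one option is a pattern that also matches the wanted marks as tokens. The sketch below only tests such a pattern with `re.findall` to stay dependency-free; the pattern itself is an assumption, built from the default word shape (two or more word characters) plus the four marks.

```python
import re

# Assumed pattern: a word of 2+ characters, or one of the marks ! ? " '
TOKEN_PATTERN = r"""(?u)\b\w\w+\b|[!?"']"""

# Preview which tokens the pattern would extract from a sample document.
tokens = re.findall(TOKEN_PATTERN, 'He said "stop!" Why?')
print(tokens)
```

If the preview looks right, the same string could be passed as `token_pattern=TOKEN_PATTERN` when constructing the vectorizer.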
0 votes, 0 answers, 112 views
Double quote is not recognised as punctuation in Python 2.7? [duplicate]
I have a question about string.punctuation.
I'm using NLTK and I need to clear my text from punctuation (the text is already divided in tokens with function word_tokenize(my_str)).
I wrote a simple ...
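A likely explanation, assuming the default `word_tokenize` behaviour: the tokenizer rewrites double quotes as the two-character tokens `` and '', which are not members of `string.punctuation`. A minimal sketch that filters those as well (the token list is a hand-written stand-in for tokenizer output):

```python
import string

# Tokens as a Treebank-style tokenizer typically emits them for: He said "hi".
tokens = ["He", "said", "``", "hi", "''", "."]

# string.punctuation holds single characters only, so add the two quote tokens.
exclude = set(string.punctuation) | {"``", "''"}

words = [t for t in tokens if t not in exclude]
print(words)
```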
0 votes, 1 answer, 593 views
Python NLTK not taking out punctuation correctly
I have defined the following code:
exclude = set(string.punctuation)
lmtzr = nltk.stem.wordnet.WordNetLemmatizer()
wordList= ['"the']
answer = [lmtzr.lemmatize(word.lower()) for word in list(set(...
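The comprehension is truncated, so this is a guess at the problem: if tokens are filtered by membership in `exclude`, a token such as '"the' passes through untouched, because it is not equal to any single punctuation character. A sketch that strips punctuation from the ends of each token instead (the extra sample tokens are hypothetical):

```python
import string

word_list = ['"the', "cat.", "--", "don't"]

# strip() removes punctuation only from both ends, so inner apostrophes survive.
cleaned = [w.strip(string.punctuation).lower() for w in word_list]

# Drop tokens that were nothing but punctuation.
cleaned = [w for w in cleaned if w]
print(cleaned)
```

The cleaned tokens can then be passed to the lemmatizer as before.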
0 votes, 1 answer, 132 views
Remove selected punctuation from list of sentences
I have a list of sentences like:
[' no , 2nd main 4th a cross, uas layout, near ganesha temple/ bsnl exchange, sanjaynagar, bangalore',
' grihalakshmi apartments flat , southend road basavangudi ...
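For removing only some punctuation characters, a translation table is one simple option. The sketch below assumes commas should go and slashes should stay; which characters to remove is not stated in the excerpt, and the sample sentences are shortened stand-ins.

```python
# Shortened sample sentences in the style of the excerpt.
sentences = [" no , 2nd main 4th a cross, uas layout", "temple/ bsnl exchange"]

# Hypothetical selection: remove commas, keep slashes.
to_remove = ","
table = str.maketrans("", "", to_remove)

# Delete the selected characters, then collapse the leftover whitespace.
cleaned = [" ".join(s.translate(table).split()) for s in sentences]
print(cleaned)
```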
0 votes, 2 answers, 162 views
Insert spaces next to punctuation when writing to .txt file
I have written a function that uses an nltk tokenizer to preprocess .txt files. Basically, the function takes a .txt file, modifies it so that each sentence appears on a separate line, and overwrites ...
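A sketch of the spacing step on its own, applied to one line of text before it is written out. The function name is hypothetical, and the rule used here (a space on each side of every punctuation character) is an assumption about the desired output.

```python
import re


def space_out_punctuation(line):
    # Put a space on each side of every punctuation character,
    # then collapse runs of whitespace into single spaces.
    # Note: this also splits apostrophes, e.g. "don't" becomes "don ' t".
    spaced = re.sub(r"([^\w\s])", r" \1 ", line)
    return " ".join(spaced.split())


print(space_out_punctuation("Hello, world! How are you?"))
```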
0 votes, 1 answer, 580 views
Splitting a string after punctuation while including punctuation
I'm trying to split a string of words into a list of words via regex. I'm still a bit of a beginner with regular expressions.
I'm using nltk.regexp_tokenize, which is yielding results that are close, ...
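The desired output is cut off in the excerpt. Assuming the aim is a list in which words and punctuation marks are separate items, a two-alternative pattern with `re.findall` is a minimal sketch (the sample text is made up):

```python
import re

text = "Hello, world! It works."

# Either a run of word characters, or a single punctuation character.
PATTERN = r"\w+|[^\w\s]"

tokens = re.findall(PATTERN, text)
print(tokens)
```

The same pattern should also be usable as the pattern argument of NLTK's regexp tokenizer, though that is not exercised here.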