-1 votes
1 answer
69 views

Extracting redlines from a Word document converted to PDF using a Python library

I have a set of documents that were edited in Microsoft Word. Take for example a document that says "This is a test document." The document is then edited, with track changes on, to read ...
Anthony Tomasic
2 votes
1 answer
90 views

How can I extract the form information from a .mdb file?

I am after the form specification data: all the parameters you would type in the design view for a form. I have used Jackcess to access the .mdb file. I fiddled with permissions on MSysObjects ...
Dave
  • 73
1 vote
0 answers
49 views

Need raw text segregated into just title and content, from a Wikipedia dump (English) [duplicate]

I am working on a full-text search implementation (a sort of matching algorithm) in a tool called Tantivy_py. I tried it with a small text source and it worked smoothly. Now I want to test it on a very ...
chetxn04
1 vote
1 answer
104 views

Extracting phylogenetic tree information from images using machine learning

There are various machine learning models (Claude, ChatGPT, etc.) which can be used to extract machine-readable information from images. Has anyone seen cases of successfully extracting Newick format ...
user2667066
  • 2,149
1 vote
1 answer
151 views

Is there any OCR tool or technique that can recognize/identify radio buttons printed in a PDF document?

I have a PDF document with radio responses like the attached screenshot. I want to extract only the selected response using Python or any OCR technique. Is there any way of doing it? (https://i.sstatic....
riyagarg0597
0 votes
0 answers
20 views

Extracting conditional numeric values from character data in R

Stuck on a data tidying problem, and not sure how to work around it. I have messy character data on whisky, which I'm looking to organise so that I can conduct some analyses. Specifically, I'm looking ...
Rhys Maredudd Davies
0 votes
0 answers
135 views

Remove background fill from tables in a PDF using pymupdf/fitz or pdfminer/pdfplumber

I want to remove the background fill in the cells of a table. I tried using get_drawings() from fitz. I'm able to change the fill value in the drawing object, but it resets back to the original value before saving the PDF. ...
Devashish Ojha
0 votes
2 answers
366 views

Is there a way to extract unmatched data from a cell string in Excel?

I have been given an Excel file which contains columns, and within each cell of a column there are multiple entries separated by commas, as Column 1 Column 2 Column 3 A1, A7, A11, B12, B15 A1, A7, A11,...
beanie42
1 vote
3 answers
102 views

Identify sequences in alphanumeric strings in R

I am attempting to create a flag for when transaction IDs are sequential. For reasons that I will not get into here, these can be a red flag. The problem that I am having is that the IDs are not ...
coult
  • 137
0 votes
0 answers
17 views

Extracting a particular portion from a text file using Python

As mentioned below, I have a specific set of information, A1 to A300, in a single text file named full change.txt If ***** Begin ****A1 End Go If ***** Begin ****A2 End Go ……….. I have 300 files and each ...
Ganny Entmt
-1 votes
2 answers
90 views

How to get a number from a string without regex

Without using regex, is there a way to get a number from a string in JavaScript? For example, if the input is "id is 12345", then I want the number 12345 as output. I know a lot of regex solutions, ...
Anonymous
0 votes
1 answer
158 views

Is there a way to tell spaCy that certain words are related to a certain number? e.g. Feed rate and aspirator rate were 3l/hr and 100% respectively

I'm very new to Python, spaCy, and even Stack Overflow in general, so forgive me if my question is too vague. I would like to ask if there's a way to tell spaCy that certain words in a sentence are ...
faz
  • 1
0 votes
1 answer
174 views

Using GPT-3 to identify relationships in a corpus

I have a corpus of 15K news articles. I would like to train a GPT model (3 or 4) to ingest these texts and then output how the locations, events, actions, participants, and things described in the ...
Steve
  • 1,001
0 votes
1 answer
146 views

How can I extract one value from this .xml to a string? C#

The XML file: <?xml version="1.0" encoding="UTF-8" standalone="yes" ?> <MPL Version="2.0" Title="JRSidecar" PathSeparator="\"> ...
Stegau
  • 5
1 vote
1 answer
204 views

Is there a way to return multiple values from csv file in function with statistics? (string & float)

I'm pretty new to this. I'm working on a basic sales CSV file to extract multiple values. The CSV contains a list of months and the number of sales for each month, as well as other columns, but these ...
Alice
  • 11
