6,281 questions
1
vote
0
answers
46
views
Performing OCR of Seven Segment Display Multimeter
First, I am very new to these things, and I have come this far with the help of ChatGPT.
We recorded some videos of two multimeters that have seven-segment displays. I want to OCR them to use ...
0
votes
0
answers
19
views
Tesseract OCR misreads coloured labels in a scanned image despite correct recognition of other coloured text in the same image
I'm working on processing OCT scan images using Tesseract OCR. My goal is to extract patient information, including eye labels "OD" (right eye) and "OS" (left eye), from these ...
1
vote
1
answer
67
views
How to extract bold text from a PDF file [closed]
I'm working on a project where I need to extract only the bold text from PDF files using Python.
At first, I tried using libraries like PyMuPDF (fitz) and pdfminer, extracting the PDF as HTML and ...
-2
votes
0
answers
47
views
Pytesseract not able to extract vehicle number plate text
I have written code that detects number plates successfully, but a problem arises when I need to extract the number plate text using pytesseract and store it in Excel. It is not extracting number ...
-2
votes
0
answers
11
views
Are there any affordable alternatives to MathPix with similar accuracy? [closed]
We want to convert educational PDFs containing mathematical formulas and text (Gujarati, Hindi, and English) into LaTeX. We need to scan thousands of pages, so we are looking for an affordable ...
0
votes
0
answers
24
views
lstm-unicharset file is unable to be created during tesseract training
I am trying to fine-tune an Optical Character Recognition (OCR) model for Japanese using Tesseract's tesstrain repository. I tried encoding the bash commands into Python in VSCode as I wanted ...
-3
votes
0
answers
22
views
Getting confidence scores for fields extracted from image-to-text LLM inference [closed]
I am extracting details from cheques using a fine-tuned Qwen 2.5 VL 7B model. I want to have confidence scores for the fields, as in Azure Document Intelligence.
A few ways I think I can do it are using ...
0
votes
0
answers
28
views
Error Running OCR with Qwen2.5-VL in Colab
I am trying to run the OCR functionality of Qwen2.5-VL by following the tutorial provided in this notebook: OCR Tutorial Notebook
However, I am encountering an error when attempting to execute the ...
-4
votes
0
answers
84
views
How to improve handwriting recognition in an image [closed]
I'm trying to develop a system to read the handwriting in a chart within a written page, using a multimodal LLM. I'm using Google Apps Script. So far I've experimented:
function openRouterApiRequest() {...
-4
votes
0
answers
26
views
Need help regarding fine-tuning of mistral-small-latest model [closed]
I want to fine-tune the Mistral Small model on a custom dataset. My PC has 16 GB of RAM and 8 GB of shared GPU memory, which is not enough for training the model.
Another option is Google Colab. The free tier has ...
2
votes
1
answer
119
views
Text recognition with VNRecognizeTextRequest not working
Trying to implement text recognition using Vision Kit in my app but can't get text recognized. The input is using Apple Pencil or finger drawing on a PKCanvasView. Here's my code extracted into a ...
0
votes
0
answers
49
views
PyMuPDF - Extract table contents
I'm trying to extract the table text from a PDF:
With the following code I get:
page 0 of page-1-ocr.pdf
Tables rowsasf 49
texysdft [['', '', 'Staatlic', 'he Fische', 'rprüfung', 'in Bayern - Prü', '...
-2
votes
1
answer
41
views
Why is my TensorFlow CNN OCR model outputting incorrect characters for Persian license plates? [closed]
I’m building a FastAPI web API to detect Persian car license plates using YOLOv8 and extract their text with a custom TensorFlow CNN OCR model. YOLOv8 correctly detects the plate’s bounding box, but ...
0
votes
1
answer
40
views
How to merge split text blocks from multi-line cells in OCR table?
I'm working on OCR processing for image-based PDF files using the Google Vision API in a Python 3.11.4 environment.
The documents are structured as tables, and I need to extract the text from each ...
1
vote
1
answer
49
views
Not able to use Google ML Kit for Indian-language OCR
I'm trying to build an app for Kannada (an Indian language) OCR to flashcard conversion with the help of Cursor AI. I first created the Android Studio project for Devanagari (a more widely used Indian ...