1 vote
0 answers
46 views

Performing OCR of Seven Segment Display Multimeter

Firstly, I am very very new to these things, and I have come this far with the help of ChatGPT. We recorded some videos of two multimeters that have seven-segment displays. I want to OCR them to use ...
AP3X's user avatar
  • 11
0 votes
0 answers
19 views

Tesseract OCR misreads coloured labels in a scanned image despite correct recognition of other coloured text in the same image

I'm working on processing OCT scan images using Tesseract OCR. My goal is to extract patient information, including eye labels "OD" (right eye) and "OS" (left eye), from these ...
User901036845's user avatar
1 vote
1 answer
67 views

How to extract bold text from a PDF file [closed]

I'm working on a project where I need to extract only the bold text from PDF files using Python. At first, I tried using libraries like PyMuPDF (fitz) and pdfminer, extracting the PDF as HTML and ...
Marco Floriano's user avatar
-2 votes
0 answers
47 views

Pytesseract not able to extract vehicle number plate text

I have written code that detects number plates successfully, but a problem arises when I need to extract the number plate information using pytesseract and store it in Excel. It is not extracting number ...
Raj's user avatar
  • 19
-2 votes
0 answers
11 views

Are there any affordable alternatives to MathPix with similar accuracy? [closed]

We want to convert educational PDFs containing mathematical formulas and text (Gujarati, Hindi, and English) into LaTeX. We need to scan thousands of pages, so we are looking for an affordable ...
Viren Gothadiya's user avatar
0 votes
0 answers
24 views

lstm-unicharset file is unable to be created during tesseract training

I am trying to fine-tune an Optical Character Recognition (OCR) model for Japanese using Tesseract's tesstrain repository. I tried encoding the bash commands into Python in VSCode as I wanted ...
Jiansen Chan's user avatar
-3 votes
0 answers
22 views

Getting confidence scores for fields extracted from image-to-text LLM inference [closed]

I am extracting details from cheques using a fine-tuned Qwen 2.5 VL 7B model. I want to have confidence scores for the fields, as in Azure Document Intelligence. A few ways I think I can do it are using ...
Azim Ahmed Bijapur's user avatar
0 votes
0 answers
28 views

Error Running OCR with Qwen2.5-VL in Colab

I am trying to run the OCR functionality of Qwen2.5-VL by following the tutorial provided in this notebook: OCR Tutorial Notebook. However, I am encountering an error when attempting to execute the ...
JS3's user avatar
  • 1,879
-4 votes
0 answers
84 views

How to improve handwriting recognition in an image [closed]

I'm trying to develop a system to read the handwriting in a chart within a written page, using a multimodal LLM. I'm using Google Apps Script. So far I've experimented: function openRouterApiRequest() {...
user1592380's user avatar
  • 36.6k
-4 votes
0 answers
26 views

Need help regarding fine-tuning of mistral-small-latest model [closed]

I wanted to fine-tune the Mistral Small model on a custom dataset. My PC has 16 GB of RAM and 8 GB of shared GPU memory, which is not enough for training the model. Another option is Google Colab. The free tier one has ...
Azim Ahmed Bijapur's user avatar
2 votes
1 answer
119 views

Text recognition with VNRecognizeTextRequest not working

Trying to implement text recognition using Vision Kit in my app but can't get text recognized. The input is using Apple Pencil or finger drawing on a PKCanvasView. Here's my code extracted into a ...
Phantom59's user avatar
  • 1,088
0 votes
0 answers
49 views

PyMuPDF - Extract table contents

I am trying to extract the table text of a PDF. With the following code I get: page 0 of page-1-ocr.pdf Tables rowsasf 49 texysdft [['', '', 'Staatlic', 'he Fische', 'rprüfung', 'in Bayern - Prü', '...
Marc's user avatar
  • 3,934
-2 votes
1 answer
41 views

Why is my TensorFlow CNN OCR model outputting incorrect characters for Persian license plates? [closed]

I’m building a FastAPI web API to detect Persian car license plates using YOLOv8 and extract their text with a custom TensorFlow CNN OCR model. YOLOv8 correctly detects the plate’s bounding box, but ...
Saman Zare's user avatar
0 votes
1 answer
40 views

How to merge split text blocks from multi-line cells in OCR table?

I'm working on OCR processing for image-based PDF files using the Google Vision API in a Python 3.11.4 environment. The documents are structured as tables, and I need to extract the text from each ...
Matthew Lee's user avatar
1 vote
1 answer
49 views

Not able to use Google ML Kit for Indian languages OCR

I'm trying to build an app for Kannada (an Indian language) OCR-to-flashcard conversion with the help of Cursor AI. I first created the Android Studio project for Devanagari (a more widely used Indian ...
Kutsit's user avatar
  • 139
