🔍 Image OCR (Text Recognition)
Extract text from images using Tesseract.js with Korean model support. Digitize receipts, book pages, screenshots, business cards, and more.
How to use
1. Upload an image that contains text.
2. Pick a language (Korean / English / Korean+English / Japanese, etc.).
3. Click Start OCR.
4. Copy the recognized text or save it as a .txt file.
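The steps above can be sketched in code roughly as follows. This is a minimal illustration, not this page's actual implementation: `LANG_CODES`, `runOcr`, and `txtFileName` are hypothetical names, and the recognizer is passed in as a parameter so the sketch runs without loading any library. The language codes (`kor`, `eng`, `jpn`, `chi_sim`, `chi_tra`) are the standard Tesseract traineddata names.

```typescript
// Hypothetical mapping from the language picker's labels to Tesseract language codes.
type LanguageChoice =
  | "Korean"
  | "English"
  | "Korean+English"
  | "Japanese"
  | "Simplified Chinese"
  | "Traditional Chinese";

const LANG_CODES: Record<LanguageChoice, string> = {
  "Korean": "kor",
  "English": "eng",
  "Korean+English": "kor+eng", // several models are joined with "+"
  "Japanese": "jpn",
  "Simplified Chinese": "chi_sim",
  "Traditional Chinese": "chi_tra",
};

// Only the part of an OCR result that this sketch reads.
interface OcrResult {
  data: { text: string; confidence: number };
}

// The recognizer is injected (assumption: it takes an image and a language string).
type Recognize<Image> = (image: Image, langs: string) => Promise<OcrResult>;

// Steps 1-3: take an uploaded image and a language choice, then run OCR.
async function runOcr<Image>(
  recognize: Recognize<Image>,
  image: Image,
  choice: LanguageChoice,
): Promise<string> {
  const result = await recognize(image, LANG_CODES[choice]);
  return result.data.text.trim();
}

// Step 4: derive a .txt file name from the uploaded image's name.
function txtFileName(imageName: string): string {
  const dot = imageName.lastIndexOf(".");
  const stem = dot > 0 ? imageName.slice(0, dot) : imageName;
  return `${stem}.txt`;
}
```

In a browser, the `recognize` function exported by Tesseract.js has a similar `(image, langs)` shape and could likely be passed in as the recognizer, though its exact options may differ between versions.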
FAQ
How accurate is it?
On high-resolution printed Korean documents, accuracy is typically 90% or higher. Handwriting, tilted photos, or low-resolution images can drop it to 50–70%. Confidence scores are shown alongside the results.
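As an illustration of how confidence scores can be used, the sketch below marks words that fall under a threshold so they are easy to proofread. It assumes each recognized word carries a `text` and a `confidence` on a 0 to 100 scale, which is how Tesseract.js reports word confidence; the `flagLowConfidence` and `meanConfidence` helpers and the threshold of 70 are hypothetical choices, not this tool's actual logic.

```typescript
// Minimal shape of a recognized word (assumption: confidence is 0-100).
interface Word {
  text: string;
  confidence: number;
}

// Wrap words below the threshold in [..?] so a reader can spot likely errors.
function flagLowConfidence(words: Word[], threshold: number = 70): string {
  return words
    .map((w) => (w.confidence < threshold ? `[${w.text}?]` : w.text))
    .join(" ");
}

// Average word confidence; returns 0 for an empty result.
function meanConfidence(words: Word[]): number {
  if (words.length === 0) return 0;
  const total = words.reduce((sum, w) => sum + w.confidence, 0);
  return total / words.length;
}

const sample: Word[] = [
  { text: "영수증", confidence: 93 },
  { text: "합계", confidence: 55 },
];

console.log(flagLowConfidence(sample)); // 영수증 [합계?]
console.log(meanConfidence(sample)); // 74
```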
Why is it slow the first time?
On first use, the tool downloads the Korean training data (about 10 MB). The data is then cached, so subsequent runs are much faster.
Can it extract text from PDFs?
Not directly. Convert the PDF with [PDF to images] first, then run OCR on the resulting images. For digital PDFs, [PDF text extract] is more accurate than OCR.
Can I batch multiple images?
Not yet. Images are processed one at a time for now. Bulk OCR is slow in the browser, so desktop tools are a better fit for large batches.
Does it support Hanja or Japanese?
You can pick Japanese, Simplified Chinese, or Traditional Chinese models. For Hanja, the Japanese or Chinese model usually works best.
Is my image uploaded?
No. Tesseract.js runs as WebAssembly inside your browser, so both the image and the result stay on your device.
What kinds of images OCR well?
Images with (1) text at 12pt or larger, (2) clear contrast between text and background, (3) a straight-on camera angle, and (4) no blur. For books, lay the page flat and shoot from directly above.
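Two of these criteria can be estimated numerically. The sketch below converts a point size to pixels for a given scan resolution (one point is 1/72 inch) and computes a crude contrast measure from grayscale pixel values. Both helpers are hypothetical, the 300 DPI figure is only an example setting, and the min/max contrast measure is a simple stand-in rather than what this tool or Tesseract uses.

```typescript
// Text height in pixels for a given point size and resolution (1 pt = 1/72 inch).
function pointsToPixels(pt: number, dpi: number): number {
  return (pt * dpi) / 72;
}

// Michelson contrast of 8-bit grayscale values: (max - min) / (max + min).
// Returns a value from 0 (flat image) to 1 (pure black against pure white).
function contrastRatio(gray: ArrayLike<number>): number {
  if (gray.length === 0) return 0;
  let min = 255;
  let max = 0;
  for (let i = 0; i < gray.length; i++) {
    const v = gray[i];
    if (v < min) min = v;
    if (v > max) max = v;
  }
  return max + min === 0 ? 0 : (max - min) / (max + min);
}

// 12pt text scanned at an example resolution of 300 DPI.
console.log(pointsToPixels(12, 300)); // 50

// Dark text (30) on a light background (200) versus a flat gray image.
console.log(contrastRatio([30, 200, 120]) > 0.7); // true
console.log(contrastRatio([128, 128, 128])); // 0
```

Because the measure looks only at the darkest and brightest pixel, a single speck can inflate it; a percentile-based variant would be more robust on noisy photos.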