A community-supported supercharged version of paperless: scan, index and archive all your physical documents
Updated Jun 2, 2024 - Python
A 6 MB build of Tesseract (with English training data) that fits inside AWS Lambda
Web interface for recognizing text, proofreading OCR, and creating fully-digitized documents.
Online Handwritten Text Recognition (HTR) system implemented with PyTorch. Based on https://doi.org/10.1007/s10032-020-00350-4.
Awesome multilingual OCR toolkits based on PaddlePaddle (a practical, ultra-lightweight OCR system that supports recognition of 80+ languages, provides data annotation and synthesis tools, and supports training and deployment on server, mobile, embedded and IoT devices)
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
For entering text into the Chinese Text Project (中國哲學書電子化計劃)
A privacy-first, self-hosted, fully open source personal knowledge management software, written in typescript and golang.
Binarize, normalize, segment images and train models.
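A concise and elegant dictionary and translator macOS app for looking up words and translating text. Works out of the box, supports offline OCR, and supports Youdao Dictionary, 🍎 Apple System Dictionary, 🍎 Apple System Translation, OpenAI, Gemini, DeepL, Google, Bing, Tencent, Baidu, Alibaba, Niutrans, Caiyun and Volcano translation.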
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
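Offline OCR command-line program for Windows that recognizes text in images and outputs the result as a JSON string, making it easy for other programs to call. Provides APIs for various languages. Compiled from PaddleOCR C++.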
Streamlit project with Tesseract
CCExtractor - Official version maintained by the core team
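Free, open-source, offline OCR software. Supports screenshots and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning/generating QR codes. Includes built-in libraries for multiple languages.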
AI powered Spendenraid evaluation.
WindowTextExtractor lets you extract text from any window of an operating system, including passwords masked with asterisks
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched