Optical Character Recognition

YOLO-OCR: Read Text with Custom Models

Test and fine-tune open-source OCR models, or download the datasets behind them, on Roboflow. Detect and read text with RF-DETR, ship with commercial-safe licensing, and deploy at the edge or in the cloud.

From dataset to deployed OCR pipeline in an afternoon

YOLO-OCR is the Roboflow Universe collection of open-source OCR datasets and pre-trained models. Fork one, fine-tune it, add a reading step, and ship.

1. Start from a dataset

Browse the YOLO-OCR collection and fork a project, or label your own images in Annotate. Decide your scheme up front: for fixed sets (digits, plates), label each character as its own class; for free-form text, label the text regions and read them downstream.

2. Train the detector

In Roboflow Train, we recommend RF-DETR, Roboflow's state-of-the-art real-time detection architecture, for the detection stage. It leads current YOLO releases on accuracy and latency and ships under a commercial-friendly license. YOLO models are supported too if you need them.

3. Add the reading step

For individual characters, the detector already gives you the string by reading its class labels left to right. For free-form text, chain the detector with a vision-language model in Workflows to read each detected region, then validate the result against a format or a database.
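As a rough illustration of the per-character case, the sketch below joins single-character detections into a string. The detection fields (`x` for the horizontal center of the box, `class`, `confidence`) and the `read_characters` helper are assumptions made for this example, not a documented response format, and the approach only holds for a single line of text.

```python
from typing import Dict, List


def read_characters(detections: List[Dict], min_confidence: float = 0.5) -> str:
    """Join single-character detections into a string, ordered left to right."""
    # Drop low-confidence boxes so stray detections do not add characters.
    kept = [d for d in detections if d["confidence"] >= min_confidence]
    # Sort by the horizontal center of each box (assumed key: "x").
    kept.sort(key=lambda d: d["x"])
    return "".join(d["class"] for d in kept)


# Hypothetical detections for a meter showing "407", returned out of order.
detections = [
    {"x": 210.0, "class": "7", "confidence": 0.91},
    {"x": 50.0, "class": "4", "confidence": 0.95},
    {"x": 130.0, "class": "0", "confidence": 0.88},
    {"x": 135.0, "class": "8", "confidence": 0.21},  # weak duplicate, filtered out
]

print(read_characters(detections))  # prints 407
```

Multi-line text would need the boxes grouped into rows by vertical position before sorting each row.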

4. Deploy where you run

Serve the model with Inference on the cloud or the edge: detect text, read it, validate the format, and send the result to a database or dashboard. Use active learning to fold the frames your model is unsure about into the next version.
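One simple way to pick the frames a model is unsure about is to flag any frame whose weakest detection falls inside a confidence band. The sketch below assumes a hypothetical frame record with `id` and `detections` fields and arbitrary thresholds of 0.3 and 0.7; it illustrates the selection idea only and is not how Roboflow's active learning is implemented.

```python
from typing import Dict, List


def frames_to_review(frames: List[Dict], low: float = 0.3, high: float = 0.7) -> List[str]:
    """Return ids of frames whose least confident detection lies in [low, high)."""
    selected = []
    for frame in frames:
        confidences = [d["confidence"] for d in frame["detections"]]
        if not confidences:
            continue  # no detections; this sketch leaves empty frames alone
        if low <= min(confidences) < high:
            selected.append(frame["id"])
    return selected


# Hypothetical inference log.
frames = [
    {"id": "a", "detections": [{"confidence": 0.95}, {"confidence": 0.92}]},
    {"id": "b", "detections": [{"confidence": 0.90}, {"confidence": 0.55}]},
    {"id": "c", "detections": []},
    {"id": "d", "detections": [{"confidence": 0.40}]},
]

print(frames_to_review(frames))  # prints ['b', 'd']
```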

Try RF-DETR live in the model playground

OCR has two stages. Roboflow handles both.

Detect (find the text)

A detection model draws a bounding box around each piece of text, whether that is a whole text region, a single word, or an individual character. This is the foundation YOLO-OCR powers, and it is what everything downstream is built on. Get reliable text detection first and the reading step becomes tractable.
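To hand a detected region to a reading step, the bounding box usually has to become pixel crop coordinates. The sketch below assumes the box is given as a center point plus width and height, which is an assumption for this example; the `crop_box` helper and its `padding` parameter are hypothetical.

```python
from typing import Tuple


def crop_box(
    x: float, y: float, width: float, height: float,
    image_width: int, image_height: int, padding: int = 0,
) -> Tuple[int, int, int, int]:
    """Convert a center-format box to (left, top, right, bottom), clamped to the image."""
    left = max(0, int(round(x - width / 2)) - padding)
    top = max(0, int(round(y - height / 2)) - padding)
    right = min(image_width, int(round(x + width / 2)) + padding)
    bottom = min(image_height, int(round(y + height / 2)) + padding)
    return left, top, right, bottom


# A word in the middle of a 640x480 image, with 4 pixels of padding.
print(crop_box(100, 50, 80, 20, 640, 480, padding=4))  # prints (56, 36, 144, 64)
# A box near the corner is clamped to the image bounds.
print(crop_box(10, 10, 40, 40, 640, 480, padding=4))   # prints (0, 0, 34, 34)
```

A few pixels of padding around the box can help the reader when the detector clips the edges of characters.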

Read (turn it into a string)

Turn those detections into usable data. Train the detector to classify each character as its own class for fixed sets like digits on a meter or a plate, or crop the detected region and read it with a vision-language model or OCR engine for free-form text. Then validate the result against a format or a database.
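Validation can be as small as a pattern check followed by a lookup. The sketch below assumes a made-up plate format of three letters and four digits and an in-memory set standing in for a database; `PLATE_PATTERN` and `validate_reading` are hypothetical names for this example.

```python
import re
from typing import Optional, Set

# Hypothetical format: three uppercase letters followed by four digits.
PLATE_PATTERN = re.compile(r"[A-Z]{3}[0-9]{4}")


def validate_reading(text: str, known_values: Optional[Set[str]] = None) -> Optional[str]:
    """Normalize a raw reading and return it only if it passes both checks."""
    # Strip whitespace and hyphens, then uppercase, before matching the format.
    candidate = re.sub(r"[\s\-]", "", text).upper()
    if not PLATE_PATTERN.fullmatch(candidate):
        return None
    # Optional lookup against known values (a set stands in for a database).
    if known_values is not None and candidate not in known_values:
        return None
    return candidate


print(validate_reading("abc-1234"))               # prints ABC1234
print(validate_reading("AB1234"))                 # prints None
print(validate_reading("ABC 1234", {"XYZ9999"}))  # prints None
```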

Both stages live in one platform. Roboflow chains detection, reading, and validation into a single Workflow, which is where most production OCR value lives.

Your models and data stay yours

Commercial-safe by license, secure by architecture, and backed by 80+ open datasets to start from.

Commercial-safe licensing by default

Train and ship on RF-DETR, released under the permissive Apache 2.0 license, free to use commercially with no copyleft obligations. The Ultralytics YOLO family ships under AGPL-3.0, which in practice can require open-sourcing your application or buying a commercial license. Build an OCR model you can actually ship.

Enterprise security and data sovereignty

Roboflow is a US-based platform with SOC 2 Type II compliance, encryption in transit and at rest, and an uptime SLA. Deploy on the edge, on-prem, in your VPC, or fully air-gapped, so receipts, invoices, plates, and the documents you read never have to leave your infrastructure.

80+ open datasets and models to fork

The YOLO-OCR collection on Roboflow Universe spans receipts, invoices, document layout, table extraction, digits and meters, words, braille, and signatures. Test any project in the browser, download it as a labeled dataset, or fine-tune it into your own model.

Detect, read, and validate in one Workflow

Chain detection with a vision-language model and business logic in Roboflow Workflows: find the text, read it, validate the format, and push structured values into accounting, ERP, a database, or a dashboard, without wiring the orchestration yourself.

Vision AI is already running in production

Half the Fortune 100 build computer vision with Roboflow, with OCR and document models deployed in finance, logistics, utilities, and on the edge.

80+ open OCR datasets and models in the YOLO-OCR collection
1M+ engineers and 16,000+ organizations building on the platform
55B+ model inferences run in production across critical industries

Trusted by teams at BNSF, Rivian, GE Vernova, Cummins, USG, Pella, and Peer Robotics.

Frequently asked questions

What is YOLO-OCR?

YOLO-OCR is a Roboflow Universe collection of open-source OCR datasets and pre-trained models: more than 80 community projects covering text, numbers, receipts, invoices, document layout, table extraction, digits and meters, words, braille, and signatures. Each one can be tested in the browser, downloaded as a labeled dataset, and deployed via API. OCR has two stages, and YOLO-OCR powers the first: locating text in an image by drawing a bounding box around each text region, word, or character. The second stage, reading, turns those detections into a usable string.

How do I train a custom OCR model on Roboflow?

Start from a dataset by forking a project from the YOLO-OCR collection on Universe, or upload and label your own images in Annotate. Decide your labeling scheme up front: for a fixed character set like digits or plates, label each character as its own class; for free-form text, label the text regions and read them downstream. Train RF-DETR in Roboflow Train for the detection stage. Add the reading step, either by reading character class labels left to right, or by chaining the detector with a vision-language model in Workflows. Evaluate on real images, then deploy with Inference on the cloud or the edge.

How does the detect-then-read pipeline work?

OCR has two stages. A detection model finds and crops the text, then a reading step turns each crop into a string. If you are detecting individual characters, the trained detector already gives you the string by reading its class labels left to right, which is great for fixed sets like digits on a meter or a license plate. For free-form text, you chain the detector with a vision-language model or OCR engine in a Roboflow Workflow to read each detected region, then add a logic step that validates the result against a format or a database. That detect, read, and validate chain is where most production OCR value lives.
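The detect, read, and validate chain can be pictured as three pluggable functions run in sequence. The sketch below is plain Python, not the Workflows API; `run_pipeline` and the three stand-in stages are hypothetical, and real stages would call a trained detector and a vision-language model or OCR engine.

```python
from typing import Callable, Dict, List, Optional


def run_pipeline(
    image: object,
    detect: Callable[[object], List[Dict]],
    read: Callable[[object, Dict], str],
    validate: Callable[[str], Optional[str]],
) -> List[Dict]:
    """Detect text regions, read each one, and keep the validated value."""
    results = []
    for region in detect(image):
        raw = read(image, region)
        results.append({"region": region, "raw": raw, "value": validate(raw)})
    return results


# Stand-ins for illustration only.
def fake_detect(image):
    return [{"x": 120, "y": 40, "width": 90, "height": 20},
            {"x": 120, "y": 90, "width": 60, "height": 20}]


def fake_read(image, region):
    return "INV-0042" if region["y"] < 60 else "total??"


def fake_validate(text):
    # Accept only strings shaped like "INV-" plus four digits.
    ok = text.startswith("INV-") and len(text) == 8 and text[4:].isdigit()
    return text if ok else None


results = run_pipeline(None, fake_detect, fake_read, fake_validate)
print([r["value"] for r in results])  # prints ['INV-0042', None]
```

Keeping the raw reading next to the validated value makes rejected regions easy to inspect later.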

Is the licensing safe for commercial OCR products?

RF-DETR is released under the Apache 2.0 license, free to use commercially with no copyleft obligations, which is one reason it is the recommended model for a custom OCR detector you intend to ship. The Ultralytics YOLO family is distributed under AGPL-3.0, a strong copyleft license that in practice can require open-sourcing the application you build around the model or buying a commercial license. If you build on a YOLO model, confirm the license before you ship.

Build your OCR model today

Explore the YOLO-OCR collection, fork a dataset, and fine-tune a custom detect-and-read pipeline you can ship.

Have a question about OCR?

Ask the Roboflow assistant about forking a dataset, training RF-DETR, and chaining a read-and-validate Workflow.
