How It Works

Our industry-leading OCR technology solutions provide efficient on-demand translation services. Whether it’s a scanned PDF document or a JPG or PNG file containing text, our fully automated OCR translation solutions get it translated quickly and without hassle.

Simply upload your scanned documents or image files containing text through our online project portal to obtain an instant quote. Upon your confirmation, a project manager will review the processed OCR file to ensure that the scanned content is clean and error-free. Automatically recognized documents sometimes contain broken sentences with misplaced hard returns, typos, and other OCR anomalies caused by background colors and pictures. The Stepes manual review process eliminates these optical character recognition issues so our translators work with clean text for the highest quality translation output.
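
To illustrate one of the cleanup steps mentioned above, here is a minimal Python sketch that rejoins sentences split by misplaced hard returns. It is not Stepes' actual review tooling. The function name `merge_broken_lines` and the rule "a line that does not end in sentence punctuation continues on the next line" are assumptions made for illustration, and a human reviewer would still need to check the result.

```python
import re

# Assumption: a line ending in one of these characters is a genuine break;
# any other line ending is treated as a misplaced hard return.
SENTENCE_END = (".", "!", "?", ":")


def merge_broken_lines(ocr_text: str) -> str:
    """Rejoin lines that OCR split mid-sentence; blank lines still separate paragraphs."""
    paragraphs = re.split(r"\n\s*\n", ocr_text.strip())
    cleaned = []
    for paragraph in paragraphs:
        lines = [line.strip() for line in paragraph.splitlines() if line.strip()]
        merged = ""
        for line in lines:
            if not merged:
                merged = line
            elif merged.endswith(SENTENCE_END):
                # The previous line looks complete, so keep the line break.
                merged += "\n" + line
            else:
                # The previous line stops mid-sentence, so join with a space.
                merged += " " + line
        cleaned.append(merged)
    return "\n\n".join(cleaned)


if __name__ == "__main__":
    raw = "The contract is valid for\ntwo years from the date\nof signature.\nPayment is due monthly."
    print(merge_broken_lines(raw))
```

A simple rule like this cannot tell a heading or list item from a broken sentence, which is one reason a manual review remains part of the process.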

In cases where the OCR process produces poor results, Stepes will automatically revert to the transcription process, in which our linguists translate your documents while looking at the scanned file. Although this process is slower than working with editable text, it is Stepes’ way of ensuring the highest linguistic quality.


Best OCR Translation Applications

Using OCR for translating scanned documents has a number of distinct advantages. In addition to automatically extracting translatable text, Stepes OCR technology also converts the extracted content to the format and layout matching the original scanned document. This automatic DTP (desktop publishing) dramatically improves localization efficiency because it eliminates the need to recreate the source document, a time-consuming process. Stepes OCR translation solutions are ideal for the following documents:

  • Scanned PDF documents
  • Hardcopy business contracts
  • Photocopied legal documents
  • Scanned medical records
  • Electronic images of brochures


What is OCR?

OCR stands for optical character recognition, a technology that converts scanned documents into documents with live text, that is, editable, searchable text that you can change, copy, edit, and translate. There are a number of different OCR technologies, such as matrix matching and intelligent recognition. Matrix matching compares the shape and strokes of scanned characters against a library of character templates corresponding to ASCII characters. Intelligent recognition uses artificial intelligence to look for and analyze character features such as open spaces, character shapes, diagonal lines, and line intersections, which makes it more versatile.
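
As a rough illustration of matrix matching, the Python sketch below compares a scanned glyph, pixel by pixel, against a small library of templates and picks the closest one. The 3x3 bitmaps, the `TEMPLATES` library, and the function names are hypothetical and chosen for readability. Real OCR engines use much larger templates and normalize size, skew, and contrast first.

```python
# Hypothetical 3x3 character templates: "1" is ink, "0" is background.
TEMPLATES = {
    "I": ("010", "010", "010"),
    "L": ("100", "100", "111"),
    "T": ("111", "010", "010"),
    "O": ("111", "101", "111"),
}


def match_score(glyph, template):
    """Return the fraction of pixels on which the glyph and template agree."""
    total = 0
    same = 0
    for glyph_row, template_row in zip(glyph, template):
        for glyph_pixel, template_pixel in zip(glyph_row, template_row):
            total += 1
            if glyph_pixel == template_pixel:
                same += 1
    return same / total


def recognize(glyph):
    """Return the best-matching character and its score."""
    best = max(TEMPLATES, key=lambda char: match_score(glyph, TEMPLATES[char]))
    return best, match_score(glyph, TEMPLATES[best])


if __name__ == "__main__":
    # An "L" with one pixel lost to scanning noise.
    noisy_l = ("100", "100", "110")
    print(recognize(noisy_l))
```

Because the comparison is purely pixel-based, an unfamiliar font or a busy background lowers every score at once, which is where feature-based intelligent recognition tends to cope better.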

Frequently Asked Questions


How can I translate a scanned document?

There are three ways to translate your scanned documents.

  1. Transcription Process. If the document contains only one or two pages, it can be translated using a manual transcription process. During this process, a human translator looks at the scanned document and translates the content directly into a Word or text file. While this approach is more time-consuming and often loses the original formatting, the quality will be good.
  2. OCR with Machine Translation. If translation quality is not your top consideration, the document is first scanned with OCR to extract the content, and a machine translation program then automatically converts the content to your desired language. This process is fast and the most cost-efficient, and it retains the original format. However, linguistic quality may be unacceptable if the content is customer-facing.
  3. OCR with Professional Translator. As in option 2, the text is first extracted using OCR technology, but the content is then translated by a professional human translator for the highest quality. To ensure the best result, the text extracted through OCR should first be reviewed to remove mechanical errors (such as typos and broken sentences introduced during optical character recognition). This option also retains the format of the original document, but human translation is often more time-consuming and therefore more costly.
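
The choice among the three options above can be summarized as a small decision rule. The Python sketch below is only an illustration. The names `Workflow` and `choose_workflow`, the two-page cutoff taken from option 1, and the order in which the checks run are assumptions, not a description of how Stepes routes projects.

```python
from enum import Enum


class Workflow(Enum):
    TRANSCRIPTION = "manual transcription"
    OCR_MACHINE = "OCR with machine translation"
    OCR_PROFESSIONAL = "OCR with professional translator"


def choose_workflow(page_count: int, quality_is_priority: bool) -> Workflow:
    """Pick a workflow from page count and quality needs (illustrative rule only)."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    if page_count <= 2:
        # Assumption: very short documents are simply transcribed by hand.
        return Workflow.TRANSCRIPTION
    if not quality_is_priority:
        # Fast and low cost, but may not suit customer-facing content.
        return Workflow.OCR_MACHINE
    return Workflow.OCR_PROFESSIONAL


if __name__ == "__main__":
    print(choose_workflow(page_count=40, quality_is_priority=True).value)
```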


How do I translate a PDF?

PDF (Portable Document Format) is easy to distribute but hard to translate. This is because a PDF file no longer retains the internal content organization and format structures of the source document. This can be especially challenging if the PDF document contains many pages with an elaborate layout. Nevertheless, PDF files are among the file types people most often need translated, either because they don’t own the original source document or because they don’t know where to get it. Stepes on-demand translation solutions convert PDF files to either Adobe InDesign or Word format automatically so they can be easily translated. If you have a protected PDF, however, translation will take considerably more effort because automatic text extraction is no longer effective.


How can I translate a picture?

Picture translation is increasingly in demand because a large portion of today’s digital information exists in scanned documents or pictures people take with their phones. For example, the Stepes image translation page receives a large number of picture translation requests every day. The process of translating images is similar to that of scanned documents. If the image is a screenshot of an app or user interface that contains only a small amount of text, it can be translated quite easily using the transcription method. However, if the picture contains a lot of text, it’s best to scan it with OCR first to extract the content and then have a professional translator translate it for the best results.

Need help with Scanned Document Translations?