Optical Character Recognition (OCR)

Max can convert all typewritten text using optical character recognition (OCR). We convert paper-based records, microfilm or existing digital images into searchable PDF files. Our specialised methods give exceptional accuracy, turning your offline material into a searchable online resource.

We are able to handle jobs of all sizes and can work with all kinds of original materials, including bound volumes and broadsheet newspapers. We can output to a variety of formats, including PDF/A, text, MS-Word, XML and HTML.

Our sophisticated OCR system uses pattern recognition algorithms to identify individual characters. A dictionary-based analysis then enables the system to deduce the content word by word, even where individual characters have not been recognised correctly. The OCR process also recognises and retains content layout such as columns, tables and illustrations, so the document can be displayed in its original layout in the PDF whilst remaining fully searchable.
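The dictionary-based step described above can be illustrated with a minimal sketch. This is not Max's production system: the tiny word list, the `correct_word` and `correct_line` helper names, and the 0.75 similarity cutoff are all assumptions chosen for illustration, and the fuzzy matching comes from Python's standard `difflib` module rather than any specialised OCR engine.

```python
from difflib import get_close_matches

# Hypothetical mini-dictionary; a real system would use a full lexicon.
DICTIONARY = ["archive", "column", "document", "record", "searchable", "volume"]


def correct_word(raw, dictionary=DICTIONARY, cutoff=0.75):
    """Return the closest dictionary word to `raw`, or `raw` unchanged if none is close enough."""
    lowered = raw.lower()
    if lowered in dictionary:
        return lowered
    # get_close_matches ranks candidates by similarity ratio (0.0 to 1.0).
    matches = get_close_matches(lowered, dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else raw


def correct_line(line):
    """Apply word-by-word correction to a whitespace-separated line of OCR output."""
    return " ".join(correct_word(word) for word in line.split())


if __name__ == "__main__":
    # "1" misread for "l" or "i", and "rn" misread for "m", are typical character-level OCR errors.
    print(correct_line("searchab1e docurnent arch1ve"))
```

Words with no sufficiently close dictionary entry are left untouched, which avoids "correcting" names and unusual terms into the wrong word; the cutoff controls that trade-off.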

Clients for whom we have undertaken large-scale OCR projects include:

  • London School of Economics
  • British Universities Film & Video Council
  • Anti-Slavery International Library
  • Greenwich University

Testimonials

I have worked with Max Communications and the team for some years. In recent work with their Archivematica iteration in the College Archives and Corporate Records Unit, and other digital work required by varying cohorts in Imperial, Max Communications have been responsive, innovative and demonstrated great problem-solving abilities. The Max Communications team, from the top down, is approachable, friendly and keen to help.

An example is in their engagement for some major confidential scanning projects. The projects were discussed, scoped and agreed to a high standard of hand scanning. Benchmarking was agreed to our satisfaction when the project started, with a fast progress time for the work. They provided rapid access to files, including digitising out of sequence, e.g. one such turn around for hand scanning, editing and proofing a large file was within 3 hours. The digital delivery was by secure online transfer, and the hard disks and hard copy delivered securely by courier.

On a lighter note, digitisation of 1960s academic cine film was carried out promptly and to a high standard, such that it can be reshown at a major conference.

--Anne Barrett | College Archivist & Corporate Records Manager | Imperial College London