- 6. OCR
- 6.1 Are 2 OCR runs needed when generating searchable PDF and ALTO/MyBib eL format?
- 6.2 OCR is not performed on entire job – problems with many or very large images
- 6.3 Can Fraktur and Antiqua clicks be carried over from ABBYY 10 to ABBYY 12?
- 6.4 Why aren’t existing Black Letter clicks retrievable and usable?
6.1 Are 2 OCR runs needed when generating searchable PDF and ALTO/MyBib eL format?
The OCR procedures in BCS-2 are implemented as follows for the OCR engines. Searchable PDFs require a separate OCR run, for which ABBYY debits clicks. Additional full texts, e.g. for catalog enrichment, are extracted from the PDF as a derivative, so they cause no additional OCR runs or clicks. All other OCR outputs, e.g. formats with geometric information such as the ALTO or MyBib eL format, currently require a separate OCR run with its own click debit (ABBYY).
However, the different output formats can be generated in a single OCR run if the batch option has been selected (see description: https://www.imageware.de/poly/bcs-2/bcs-2-pro/handbuecher/bcs-2-pro-manual/menu-job/#7.3). If further formats are generated afterwards, another OCR run with click billing at ABBYY is necessary. If you create searchable PDFs or need the ALTO or MyBib eL format, you always have 2 ABBYY clicks per A4 page! The ALTO format can currently only be generated via ABBYY.
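The click arithmetic above can be sketched as a small calculation. This is only an illustration: it assumes that each OCR run debits one click per A4 page (which is how the figure of 2 clicks per A4 page for two runs reads), and the function name `estimate_abbyy_clicks` is hypothetical, not part of BCS-2 or ABBYY.

```python
def estimate_abbyy_clicks(a4_pages: int, ocr_runs: int) -> int:
    """Estimate click consumption for a job.

    Assumption (not confirmed by ABBYY documentation here):
    one click is debited per A4 page for every OCR run.
    """
    if a4_pages < 0 or ocr_runs < 0:
        raise ValueError("a4_pages and ocr_runs must be non-negative")
    return a4_pages * ocr_runs


# Searchable PDF in one run, ALTO or MyBib eL generated later in a second run.
two_runs = estimate_abbyy_clicks(a4_pages=300, ocr_runs=2)

# Searchable PDF only; full text derived from the PDF adds no further run.
one_run = estimate_abbyy_clicks(a4_pages=300, ocr_runs=1)

print(two_runs, one_run)  # 600 300
```

Under this assumption, every output format that is generated in a later, separate run adds one more click per A4 page.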
6.2 OCR is not performed on entire job – problems with many or very large images
When using ABBYY OCR, the OCR run may be aborted and BCS-2 terminated if you process many jobs at the same time, or jobs that contain many or very large images. The cause is that the OCR engine does not fully release its memory. With few OCR jobs per day and smaller jobs (up to 100 pages), this does not lead to restrictions.
If you process
- many large OCR jobs,
- large images with Black Letter/Fraktur OCR,
- or OCR for poor or difficult originals (e.g. yellowed paper, smeared print, stains or microfilm scans),
the application crashes occasionally.
To avoid this, we recommend:
- Close all other programs when processing such jobs.
- Only process a limited number of batch OCR jobs at a time. Split large OCR jobs into small chunks of 100 or 250 pages for batch processing.
- Make sure that your PC has at least twice as much free storage space as the largest job requires, so that ABBYY can swap out the interim results.
- Restart BCS-2 regularly, or have it restarted regularly.
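Two of the recommendations above, splitting large jobs into chunks and checking free storage space, can be sketched outside of BCS-2 as follows. This is a minimal sketch under stated assumptions: the file names, the job size and the helper names `split_into_chunks` and `has_enough_free_space` are hypothetical, and the script only plans the chunks; it does not control BCS-2 or ABBYY.

```python
import shutil
from typing import Iterator, List, Sequence


def split_into_chunks(pages: Sequence[str], chunk_size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most chunk_size pages."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(pages), chunk_size):
        yield list(pages[start:start + chunk_size])


def has_enough_free_space(job_bytes: int, free_bytes: int, factor: float = 2.0) -> bool:
    """Return True if free space is at least factor times the job size."""
    return free_bytes >= factor * job_bytes


# Hypothetical job with 620 page images, split into chunks of 250 pages.
pages = [f"page_{i:04d}.tif" for i in range(1, 621)]
chunks = list(split_into_chunks(pages, chunk_size=250))
print([len(chunk) for chunk in chunks])  # [250, 250, 120]

# Compare an assumed job size of 4 GiB with the free space on the current drive.
assumed_job_bytes = 4 * 1024**3
free_bytes = shutil.disk_usage(".").free
print(has_enough_free_space(assumed_job_bytes, free_bytes))
```

The chunk size of 250 and the factor of 2 mirror the recommendations in the list; both are parameters so that they can be adjusted to the local setup.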
If you frequently carry out data-intensive job operations, please contact our support. We would be happy to advise you on automatic, unattended OCR overnight. In general, ABBYY does not recommend a maximum job size. However, the OCR engine was designed for clients and typical client applications. As a rule of thumb, jobs with up to 100 pages run without problems. For data-intensive jobs, ABBYY offers a server solution.
6.3 Can Fraktur and Antiqua clicks be carried over from ABBYY 10 to ABBYY 12?
According to ABBYY regulations, Black Letter/Fraktur clicks or volume licenses are generally non-transferable when upgrading from ABBYY Runtime Engine version 10 to version 12. Purchased Black Letter/Fraktur clicks can only be consumed with the version for which they were purchased.
6.4 Why aren’t existing Black Letter clicks retrievable and usable?
For ABBYY OCR dongles with a fixed number of Antiqua and Black Letter/Fraktur clicks, make sure the Antiqua click count does not reach zero. As soon as there are no more Antiqua clicks available on the dongle, the Black Letter/Fraktur clicks can no longer be retrieved or used. To use the remaining Black Letter/Fraktur clicks, you must first replenish the quota of Antiqua clicks. This is a technical limitation on ABBYY's side and not a BCS-2 problem.
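The dependency between the two click quotas can be expressed as a simple check. The sketch below is purely illustrative: it works on click counts that you enter yourself, it does not read the dongle, and the function names and the warning threshold of 500 Antiqua clicks are hypothetical.

```python
from typing import Optional


def fraktur_clicks_usable(antiqua_remaining: int, fraktur_remaining: int) -> bool:
    """Black Letter/Fraktur clicks are only usable while Antiqua clicks remain."""
    return antiqua_remaining > 0 and fraktur_remaining > 0


def quota_warning(antiqua_remaining: int, fraktur_remaining: int,
                  antiqua_threshold: int = 500) -> Optional[str]:
    """Return a warning text, or None if no action is needed.

    antiqua_threshold is an arbitrary example value, not an ABBYY figure.
    """
    if fraktur_remaining > 0 and antiqua_remaining <= 0:
        return "Fraktur clicks blocked: replenish Antiqua clicks first."
    if fraktur_remaining > 0 and antiqua_remaining < antiqua_threshold:
        return "Antiqua clicks running low: replenish before they reach zero."
    return None


print(fraktur_clicks_usable(antiqua_remaining=0, fraktur_remaining=12000))  # False
print(quota_warning(antiqua_remaining=200, fraktur_remaining=12000))
```

Checking the Antiqua count before it reaches zero avoids the situation in which purchased Black Letter/Fraktur clicks are present on the dongle but cannot be consumed.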