Settings → Postprocess OCR

_images/voorbewerking.png

The following settings are available here:

  • Set document language on ...
    This sets the document language to the specified language. Any language from the Microsoft Locale list can be chosen.
  • Remove frames
    This option removes frames from the source document before it is processed further. Such frames can result from an incorrect export from ABBYY or from an incorrect interpretation by ABBYY.
  • Remove shapes
    This option removes all images and shapes that are present (including text boxes created by ABBYY). Images in Word are very often of poor quality, especially when exported. It is recommended to add the images to the ePUB later from another source.
  • Convert double quotation marks to single
    With this option, all double quotes are converted to single quotes.
  • Use color-coding
    This option controls the use of colors for special cases. The first paragraph after a page break in the original source is colored red, and the first paragraph after a detected scene change is colored yellow. This makes it easier to spot potential issues.
  • Scene detection
    This option tries to find all scene changes in a document. These changes are often identified by a white space or line. The minimum distance between the lines to be detected can be set; the default value is 2.5 points, which should work in most cases. The first paragraph after such a mark is colored yellow by default. This can be turned off.
    The scene change can be replaced by:
    • an empty paragraph (just an enter)
    • a tag
  • Page marker color
    The color used for this color coding can be selected. There are fifteen colors to choose from.
  • Break marker color
    The color used for this color coding can be selected. There are fifteen colors to choose from.
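
The scene detection rule described above can be sketched roughly as follows. This is only an illustration and not the tool's actual implementation: the Paragraph class, its gap_before_pt field, and the mark_scene_breaks function are hypothetical names, and it is an assumption that the white space above each paragraph is available as a value in points.

```python
from dataclasses import dataclass

SCENE_TAG = "[scbreak]"  # default tag mentioned in the note
MIN_GAP_PT = 2.5         # default minimum distance, in points


@dataclass
class Paragraph:
    text: str
    gap_before_pt: float  # hypothetical: vertical white space above the paragraph


def mark_scene_breaks(paragraphs, min_gap_pt=MIN_GAP_PT, replacement=SCENE_TAG):
    """Return the paragraph texts with `replacement` inserted before each
    paragraph whose preceding gap is at least `min_gap_pt` points."""
    result = []
    for index, paragraph in enumerate(paragraphs):
        # The very first paragraph cannot follow a scene change.
        if index > 0 and paragraph.gap_before_pt >= min_gap_pt:
            result.append(replacement)
        result.append(paragraph.text)
    return result
```

Passing replacement="" in this sketch corresponds to replacing the scene change with an empty paragraph instead of a tag.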

Note

The tag is [scbreak] by default, but it can be changed. Using a tag makes it easier to replace the scene break later with the correct syntax.
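
As an illustration of that later replacement step, the sketch below swaps every line that consists only of the tag for a piece of markup. It assumes the document has been exported to plain text or HTML with one paragraph per line; the replace_scene_tags function and the <hr class="scene-break"/> markup are hypothetical choices, not something the tool prescribes.

```python
import re


def replace_scene_tags(text, tag="[scbreak]", markup='<hr class="scene-break"/>'):
    """Replace every line that contains only `tag` with `markup`."""
    # re.escape makes the square brackets in the tag match literally.
    pattern = re.compile(r"^[ \t]*" + re.escape(tag) + r"[ \t]*$", re.MULTILINE)
    # A function is used as the replacement so that backslashes in
    # `markup` are inserted as-is and not treated as escape sequences.
    return pattern.sub(lambda _match: markup, text)
```

Because only whole lines are matched, a tag that happens to appear in the middle of a sentence is left untouched.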