Orientation Correction

Horizontal alignment of text lines is important for getting good OCR results with Tesseract. Scanned text images can arrive rotated, with the text running vertically, and have to be rotated before performing OCR. Tesseract's page segmentation modes (PSM) 11 and 12 are sparse-text modes that find as much text as possible in no particular order (PSM 12 also runs orientation and script detection), so they can pick up text in varied orientations, including vertical text. However, the character recognition models themselves are still trained on horizontally oriented text, so vertical text may not reach quite the same accuracy as horizontal text. And while vertical text recognition is possible, tagging fields for information extraction is significantly more difficult than with horizontal text layouts. Therefore, pre-processing to normalize text orientation can be very helpful.

To rotate images clockwise in ImageMagick, you can use the -rotate option:

convert /path/to/input/file/input.jpg -rotate 90 /path/to/output/file/output.jpg

To rotate an image counterclockwise in ImageMagick, you can use the -rotate option with a negative angle value:

convert /path/to/input/file/input.jpg -rotate -90 /path/to/output/file/output.jpg

If your input images with vertically aligned text are wider than they are tall, you can use -rotate "90>" so that only images whose width exceeds their height are rotated. (The "<" suffix does the opposite: it rotates only images whose width is less than their height.) See the ImageMagick documentation for details.

convert /path/to/input/file/input.jpg -rotate "90>" /path/to/output/file/output.jpg
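The width-versus-height condition can also be made explicit in a script, which helps when the decision needs to be logged or combined with other checks. The following is a minimal sketch, not a definitive implementation: the function names is_landscape and rotate_if_landscape are hypothetical, and it assumes ImageMagick's identify and convert commands are on the PATH.

```shell
#!/bin/bash
# Succeed (exit status 0) when the image is wider than it is tall.
is_landscape() {
    local width="$1" height="$2"
    [ "$width" -gt "$height" ]
}

# Hypothetical helper: rotate landscape images by 90 degrees clockwise,
# copy all other images unchanged.
rotate_if_landscape() {
    local in="$1" out="$2" width height
    # identify prints the pixel dimensions, e.g. "640 480";
    # "[0]" selects the first frame of multi-page files.
    read -r width height < <(identify -format '%w %h\n' "${in}[0]") || return 1
    if is_landscape "$width" "$height"; then
        convert "$in" -rotate 90 "$out"
    else
        cp "$in" "$out"
    fi
}
```

A call such as rotate_if_landscape /path/to/input/file/input.jpg /path/to/output/file/output.jpg would then process a single image.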

If the image orientation is unknown, or it needs to be detected automatically, you can use Tesseract's PSM 0 mode. Page segmentation mode 0 performs orientation and script detection (OSD) only: Tesseract analyzes the image content and reports whether rotation is needed to align the text horizontally. This provides an automated way to handle different orientations.

tesseract /path/to/input/file/input.jpg stdout --psm 0

Run the following script to get the orientation information for all the files in a folder. Make sure to provide a valid path to the input folder and the correct file extension (the pattern is case-sensitive).

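#!/bin/bash
# Folder containing the input images; adjust the path and the extension below.
tiff_dir="/path/to/input/folder"
shopt -s nullglob   # an unmatched pattern expands to nothing instead of itself
files=( "$tiff_dir"/*.TIFF )
for f in "${files[@]}"
do
    filename=${f##*/}   # strip the directory part, keep the file name
    # --psm 0 runs orientation and script detection (OSD) only
    tesseract "$f" stdout --psm 0
    echo "Processed $filename"
done
echo "Tesseract orientation detection completed for all files"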

For example, for the image above, the Tesseract output says:

  • the orientation of the text/objects in the image is currently 270 degrees clockwise from horizontal.
  • the image should be rotated by 90 more degrees clockwise.
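These two statements correspond to the "Orientation in degrees" and "Rotate" fields of the OSD report. As a rough sketch of how the reported correction could be applied automatically, the script below reads the Rotate value and passes it to ImageMagick. The function names osd_rotate and fix_orientation are hypothetical, and the sketch assumes that Tesseract writes the OSD report to stdout, as in the command shown earlier.

```shell
#!/bin/bash
# Print the value of the "Rotate:" field from an OSD report read on stdin.
osd_rotate() {
    awk -F': *' '/^Rotate:/ { print $2; exit }'
}

# Hypothetical helper: detect the orientation of one image and
# write a corrected copy to the given output path.
fix_orientation() {
    local in="$1" out="$2" angle
    # Assumption: the OSD report goes to stdout for this Tesseract version.
    angle=$(tesseract "$in" stdout --psm 0 | osd_rotate)
    # No Rotate field means detection failed; leave the image alone.
    [ -n "$angle" ] || return 1
    # Positive angles rotate clockwise in ImageMagick.
    convert "$in" -rotate "$angle" "$out"
}
```

Calling fix_orientation /path/to/input/file/input.jpg /path/to/output/file/output.jpg would write a rotated copy, or return a non-zero status if no Rotate value was found.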

The higher the quality of the input image, the more accurate Tesseract's orientation output.
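The OSD report also includes an "Orientation confidence" field, which can serve as a guard against acting on unreliable detections from poor scans. The sketch below is only illustrative: osd_confidence and is_confident are hypothetical names, and the threshold of 2.0 in the usage comment is an arbitrary assumption that would need tuning on your own images.

```shell
#!/bin/bash
# Print the value of the "Orientation confidence:" field from an OSD report on stdin.
osd_confidence() {
    awk -F': *' '/^Orientation confidence:/ { print $2; exit }'
}

# Succeed when the confidence is at least the threshold.
# awk is used because bash arithmetic does not handle decimals.
is_confident() {
    local confidence="$1" threshold="$2"
    awk -v c="$confidence" -v t="$threshold" 'BEGIN { exit !(c + 0 >= t + 0) }'
}

# Usage (the threshold 2.0 is an assumed value, not a recommendation):
#   conf=$(tesseract /path/to/input/file/input.jpg stdout --psm 0 | osd_confidence)
#   if is_confident "$conf" 2.0; then echo "orientation result looks usable"; fi
```

Images that fail the check can be set aside for manual review instead of being rotated automatically.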

Resources: 

https://pyimagesearch.com/2022/01/31/correcting-text-orientation-with-tesseract-and-python/