Skew Correction
Deskewing is a critical preprocessing step for text-based images when using Tesseract OCR. Even slight skew angles of 5-10 degrees can significantly degrade line detection and character recognition accuracy. Correcting the skew makes the lines of text horizontal, which lets Tesseract recognize and interpret the text more reliably. Skewed text is also harder to tag accurately when you mark the values to be extracted. Deskewing therefore both improves OCR accuracy and makes it practical to tag the fields within the text image.
If your documents seem slightly skewed to one side, you can use ImageMagick to straighten the text and improve its legibility for subsequent OCR processing. ImageMagick provides the -deskew and -rotate options for such operations.
In ImageMagick, -deskew is an automatic operation. It typically works for images skewed by about 5 degrees or less. The option "-deskew threshold{%}" lets ImageMagick estimate the skew angle and rotate the image accordingly. According to the ImageMagick documentation, -deskew 40% should work for most skewed images (https://imagemagick.org/script/command-line-options.php#deskew).
So first you can attempt to deskew your image with the following ImageMagick command. You can try 40% as a start.
convert /path/to/input/image/input.jpg -deskew 40% +repage /path/to/output/image/output.jpg
ImageMagick -deskew 40% works well on good-quality images with sufficient line height and spacing. If it doesn't work for small text (less than 10pt), consider preprocessing and resizing the image first. As an example, see Crop Text Area of an Image.
However, experimentation shows that for some images -deskew 40% reports a much smaller rotation angle, for example -0.03 degrees. By contrast, -deskew -10% reports a rotation of 1.59 degrees, which proved to be the correct deskew angle for that particular test image.
Therefore, you might need to run -deskew with a different threshold value (other than 40%) to rectify skew of your image.
By adding the +repage option after the -deskew operation, you ensure that the virtual canvas is correctly reset to match the newly cropped image dimensions. If you do not add +repage after -deskew or -rotate you might get an error: “negative image positions unsupported”.
The following command can be useful to find out the amount of skew correction (rotation angle) applied to your image during the deskewing process with the given threshold value.
convert /path/to/input/image/input.jpg -background white -deskew 40% -print '%[deskew:angle]\n' null:
If you run the command above with different threshold values and compare the reported rotation angles, you can determine the optimal angle and then rotate your image with either -deskew threshold{%} or -rotate degrees.
When processing images in bulk with varying text sizes and skew angles, we downsized all images, tried different deskew threshold values, determined the optimal rotation angle, and then rotated the image.
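A minimal shell sketch of that comparison, assuming a POSIX shell and ImageMagick on the PATH. The function name probe_angles and the IM_CONVERT variable are illustrative and not part of ImageMagick; IM_CONVERT only selects which binary is called (convert by default).

```shell
# Hypothetical override: set IM_CONVERT=magick if that is the binary you use.
IM_CONVERT=${IM_CONVERT:-convert}

# Usage: probe_angles IMAGE THRESHOLD [THRESHOLD...]
# Prints one "threshold<TAB>angle" line per threshold (thresholds given without %).
probe_angles() {
  local image="$1"
  shift
  local t angle
  for t in "$@"; do
    # Same command as above: report the detected angle, write no output image.
    angle=$("$IM_CONVERT" "$image" -background white -deskew "${t}%" \
      -print '%[deskew:angle]\n' null: 2>/dev/null)
    printf '%s%%\t%s\n' "$t" "$angle"
  done
}
```

For example, `probe_angles /path/to/input/image/input.jpg -20 -10 10 20 30 40 50 60` lists the angle reported for each of those thresholds so they can be compared side by side.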
In our case of preprocessing bulk images we came to the following conclusions:
- The correct rotation angle is normally the one with the largest magnitude, positive or negative depending on whether your image needs to be rotated clockwise or counterclockwise (provided you stay within the recommended deskew threshold range).
- The range of threshold values that can give you the correct rotation angle usually lies between -20% and 60%. Below -20%, some large angles might appear that can be mistaken for the rotation angle and make the text skew even worse; going above 60% normally doesn't produce any new angle values.
- ImageMagick might not deskew large images efficiently (-deskew threshold{%}), whereas the -rotate option works equally well for small and large images (-rotate degrees).
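The selection rule from the first conclusion can be sketched as a small filter, assuming the reported angles arrive one per line and were produced with thresholds inside the recommended range. pick_angle is a hypothetical helper name; it relies only on awk.

```shell
# Read one angle per line on stdin and print the one with the largest
# absolute value, keeping its sign. Lines that are not plain decimal
# numbers are ignored; empty input prints nothing.
pick_angle() {
  awk '
    $1 ~ /^-?[0-9]+(\.[0-9]+)?$/ {
      a = $1 + 0
      m = (a < 0) ? -a : a
      if (!seen || m > best_mag) { best = $1; best_mag = m; seen = 1 }
    }
    END { if (seen) print best }
  '
}
```

With the two angles from the test image mentioned earlier, `printf '%s\n' -0.03 1.59 | pick_angle` selects 1.59, the value that proved correct there.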
In the example below you can see the output of running -deskew on one image at different sizes. If your image is larger than 1000x1000 pixels and you have difficulty applying ImageMagick -deskew to it (you get a result of 0), consider downsizing the image first.
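One way to downsize before deskewing, as a hedged sketch: the ">" geometry flag makes -resize shrink only images that are larger than the given box and leave smaller ones untouched. The 1000x1000 limit follows the text above; deskew_downsized and IM_CONVERT are hypothetical names, and 40% is only the default starting threshold.

```shell
# Hypothetical override: set IM_CONVERT=magick if that is the binary you use.
IM_CONVERT=${IM_CONVERT:-convert}

# Usage: deskew_downsized INPUT OUTPUT [THRESHOLD_PERCENT]
deskew_downsized() {
  local input="$1"
  local output="$2"
  local threshold="${3:-40}"
  # Shrink to fit 1000x1000 only if the image is larger, then deskew
  # and reset the virtual canvas with +repage.
  "$IM_CONVERT" "$input" -resize '1000x1000>' -background white \
    -deskew "${threshold}%" +repage "$output"
}
```

Note that this writes a downsized result. To keep the original resolution, use the downsized copy only to find the angle and then apply -rotate to the original image.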
In our case of bulk image skew correction, we used a simple yet effective method based on ImageMagick's -deskew and -rotate options. First, the image is downsized, because ImageMagick does not always apply the deskew operation effectively to large, high-resolution images. Then we run -deskew over a range of thresholds (for example, from -20% to 60% in increments of 10) to identify the optimal skew angle for the text. With the rotation angle determined, we rotate the original image, or a copy resampled to any size, using the -rotate degrees option to correct the identified skew.
Here is a sample PowerShell script that was used to deskew a batch of images:
$inputFolder = "path\to\input\images"
$outputFolder = "path\to\output\images"
# Deskew thresholds to try for each image
$angles = "-20%","-15%","-10%","-5%","5%","10%","15%","20%","25%","30%","35%","40%","45%","50%","55%","60%"

Get-ChildItem $inputFolder -Filter *.tiff | ForEach-Object {
    $file = $_.FullName
    $fileName = $_.BaseName
    Write-Host "Processing file: $fileName"

    # Collect the deskew angle reported for each threshold on a downsized copy
    $allAngles = @()
    foreach ($angle in $angles) {
        $command = "magick convert '$file' -units PixelsPerInch -resample 140 -density 140 -background white -deskew $angle +repage -print '%[deskew:angle]\n' null:"
        $output = Invoke-Expression $command 2> $null
        if ($output -match "([\-]?\d+(\.\d+)?)") {
            $allAngles += [double]$matches[1]
        }
    }

    # Pick the angle with the largest magnitude, so that negative angles are chosen correctly too
    $largestAngle = $allAngles | Sort-Object { [math]::Abs($_) } -Descending | Select-Object -First 1

    # Rotate the original, full-size image by that angle
    $rotateCmd = "magick convert '$file' -rotate '$largestAngle' '$outputFolder\$fileName-rotated.jpg'"
    Invoke-Expression $rotateCmd
    Write-Host "File rotated by: $largestAngle"
    Write-Host "Angles found: $allAngles"
}
Please make sure that you adjust the following variables and values before running the script:
- paths for the input folder and output folder (inputFolder, outputFolder);
- input and output file extensions (the -Filter value and rotateCmd);
- range of deskew thresholds (angles);
- resample and density values.
The provided script has the following steps:
- Loop through each TIFF file in an input folder
- Downsize each file to 140 dpi and run the ImageMagick deskew command on it with threshold values from -20% to 60%.
- Extract the deskew angle value and add it to an array of output angles.
- Get the largest angle.
- Rotate the original TIFF by the largest angle (the documents are rotated both clockwise and counterclockwise).
- Save the rotated image as JPG in the output folder.
As a result, your images will be deskewed by the optimal angle.
Additional resources on skew correction:
http://www.fmwconcepts.com/imagemagick/textdeskew/index.php
https://felix.abecassis.me/2011/10/opencv-rotation-deskewing/
https://pyimagesearch.com/2017/02/20/text-skew-correction-opencv-python/
https://stackoverflow.com/questions/63164341/improving-image-deskew-using-python-and-opencv