ssocr -T to recognize the above image.
Seven Segment Optical Character Recognition, or ssocr for short, is a program to recognize the digits of a seven segment display. An image of one row of digits is used as input and the recognized number is written to the standard output. The program runs on GNU/Linux (and Mac OS X), and uses Imlib2 to access image data.
Source code ssocr-2.13.4.tar.bz2 (licensed under the terms of the GNU GPL version 3 or later).
The image is optionally filtered and then transformed into a monochrome representation with the digits as foreground using some form of thresholding. This image is segmented to find the digits and then each digit is recognized individually.
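The thresholding step can be illustrated with a minimal sketch. It assumes an absolute threshold given in percent and the Rec. 709 luminance formula (both appear as defaults in the usage text further down); the function names and the list-of-rows image layout are made up for this example and are not taken from the ssocr source.

```python
def luminance_rec709(r, g, b):
    """Rec. 709 luma of an 8-bit RGB pixel (result in the range 0..255)."""
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def to_monochrome(pixels, threshold_percent=50.0):
    """Return rows of booleans, True marking a foreground (dark) pixel.

    pixels is a list of rows, each row a list of (r, g, b) tuples.
    Assumption: black is the foreground color and the threshold is
    applied as is, without adapting it to the image.
    """
    limit = 255.0 * threshold_percent / 100.0
    return [[luminance_rec709(*px) < limit for px in row] for row in pixels]


image = [
    [(0, 0, 0), (255, 255, 255)],
    [(200, 200, 200), (30, 30, 30)],
]
mono = to_monochrome(image)
```

All later steps only need to know whether a pixel is set or unset, so a grid of booleans is enough to represent the monochrome image.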
Starting at the left margin, the image is searched for a column containing some foreground pixels, which marks the start of the first digit. After that, a column containing only background pixels is searched for to find the horizontal extent of the digit. This process is repeated to find the specified number of digits, or until no more digits are found.
The vertical segmentation works similarly, but gaps in digits are allowed, because in some digits the middle segment is unset.
This segmentation technique works for a single row of digits only.
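The column scan described above can be sketched as follows. This is an illustration only: the helper names, the boolean image layout (True for foreground) and the half-open ranges are assumptions of this example, not the ssocr implementation.

```python
def parse(art):
    """Turn ASCII art ('#' = foreground) into rows of booleans."""
    return [[c == "#" for c in row] for row in art]


def find_digit_columns(mono, max_digits=-1):
    """Return (start, end) column ranges of digits, end exclusive.

    A digit starts at the first column holding a foreground pixel and
    ends at the next column holding only background pixels.
    """
    if not mono:
        return []
    width = len(mono[0])
    spans, start = [], None
    for x in range(width):
        has_foreground = any(row[x] for row in mono)
        if has_foreground and start is None:
            start = x
        elif not has_foreground and start is not None:
            spans.append((start, x))
            start = None
            if max_digits > 0 and len(spans) == max_digits:
                break
    if start is not None:  # last digit touches the right margin
        spans.append((start, width))
    return spans


def find_digit_rows(mono, x0, x1):
    """Return (top, bottom) rows of a digit, bottom exclusive.

    Empty rows between the first and last foreground row are allowed.
    """
    rows = [y for y, row in enumerate(mono) if any(row[x0:x1])]
    return (rows[0], rows[-1] + 1) if rows else None


mono = parse([
    "..##..#..",
    "..#...#..",
    "......#..",
    "..##..#..",
])
columns = find_digit_columns(mono)
first_rows = find_digit_rows(mono, *columns[0])
```

The first digit in the example has an empty third row, which the vertical scan tolerates because it only looks at the first and last row containing foreground pixels.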
Every digit found by the segmentation is classified as follows: A vertical scan is started in the center top pixel of the digit to find the three horizontal segments. Any foreground pixel in the upper third is counted as part of the top segment, those in the second third as part of the middle and those in the last third as part of the bottom segment.
To examine the vertical segments, two horizontal scanlines starting at the left margin of the digit are used. The first runs a quarter of the digit height below the top, the other a quarter of the digit height above the bottom. Foreground pixels in the left half of a scanline represent the left segment, those in the right half the right segment.
The recognized segments are then used to identify the displayed digit
using a table lookup (implemented as a
Since the above algorithm cannot recognize the digit one, a digit that has a width of less than one quarter of its height is recognized as a one.
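The classification described above can be sketched as one vertical scan, two horizontal scans and a table lookup. The segment names, the lookup table layout and the `one_ratio` parameter are assumptions made for this example (it only covers the decimal digits); the ssocr source may organize this differently.

```python
# Illustrative segment names, not taken from the ssocr source.
TOP, UPPER_LEFT, UPPER_RIGHT, MIDDLE, LOWER_LEFT, LOWER_RIGHT, BOTTOM = range(7)

DIGITS = {
    frozenset({TOP, UPPER_LEFT, UPPER_RIGHT, LOWER_LEFT, LOWER_RIGHT, BOTTOM}): "0",
    frozenset({TOP, UPPER_RIGHT, MIDDLE, LOWER_LEFT, BOTTOM}): "2",
    frozenset({TOP, UPPER_RIGHT, MIDDLE, LOWER_RIGHT, BOTTOM}): "3",
    frozenset({UPPER_LEFT, UPPER_RIGHT, MIDDLE, LOWER_RIGHT}): "4",
    frozenset({TOP, UPPER_LEFT, MIDDLE, LOWER_RIGHT, BOTTOM}): "5",
    frozenset({TOP, UPPER_LEFT, MIDDLE, LOWER_LEFT, LOWER_RIGHT, BOTTOM}): "6",
    frozenset({TOP, UPPER_RIGHT, LOWER_RIGHT}): "7",
    frozenset(range(7)): "8",
    frozenset({TOP, UPPER_LEFT, UPPER_RIGHT, MIDDLE, LOWER_RIGHT, BOTTOM}): "9",
}


def classify_digit(mono, x0, y0, x1, y1, one_ratio=4):
    """Classify the digit inside the box [x0, x1) x [y0, y1).

    mono is a list of rows of booleans (True = foreground).
    Returns the digit as a string, or None if the segments match nothing.
    """
    width, height = x1 - x0, y1 - y0
    if width * one_ratio < height:  # narrow digit: treated as a one
        return "1"
    segments = set()
    # Vertical scan down the center column finds the horizontal segments.
    xc = x0 + width // 2
    for y in range(y0, y1):
        if mono[y][xc]:
            segments.add((TOP, MIDDLE, BOTTOM)[(y - y0) * 3 // height])
    # Two horizontal scans find the vertical segments.
    upper_y = y0 + height // 4
    lower_y = y1 - 1 - height // 4
    for y, left, right in ((upper_y, UPPER_LEFT, UPPER_RIGHT),
                           (lower_y, LOWER_LEFT, LOWER_RIGHT)):
        for x in range(x0, x1):
            if mono[y][x]:
                segments.add(left if (x - x0) * 2 < width else right)
    return DIGITS.get(frozenset(segments))


def parse(art):
    """Turn ASCII art ('#' = foreground) into rows of booleans."""
    return [[c == "#" for c in row] for row in art]


two = parse([".###.", "....#", "....#", "....#", ".###.",
             "#....", "#....", "#....", ".###."])
result = classify_digit(two, 0, 0, 5, 9)
```

A set of segments that matches no table entry yields `None`, which corresponds to the case where a digit could not be recognized.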
To recognize a decimal point, e.g. of a digital scale, the size of each digit (that was not recognized as a one already) is compared with the maximum digit width and height. If a digit is significantly smaller than that, it is assumed to be a decimal point. The decimal point or thousands separators count towards the number of digits to recognize.
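The size comparison can be sketched like this. The text above does not say what counts as "significantly smaller", so the `factor` parameter and its default value are pure assumptions of this example, as are the function name and the box format.

```python
def find_decimal_points(boxes, ones=(), factor=5):
    """Return the indices of boxes that look like decimal points.

    boxes is a list of (x0, y0, x1, y1) digit boxes with exclusive ends.
    ones holds the indices already recognized as the digit one.
    Assumption: a box counts as a decimal point if both its width and
    its height are at most 1/factor of the largest width and height.
    """
    if not boxes:
        return []
    max_width = max(x1 - x0 for x0, _, x1, _ in boxes)
    max_height = max(y1 - y0 for _, y0, _, y1 in boxes)
    points = []
    for index, (x0, y0, x1, y1) in enumerate(boxes):
        if index in ones:
            continue  # ones are narrow, but they are not decimal points
        if (x1 - x0) * factor <= max_width and (y1 - y0) * factor <= max_height:
            points.append(index)
    return points


# Two full-size digits with a small dot between them, followed by a one.
boxes = [(0, 0, 10, 18), (12, 16, 14, 18), (16, 0, 26, 18), (28, 0, 30, 18)]
points = find_decimal_points(boxes, ones={3})
```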
In this image the left border of a digit is represented by a red column, the right border as a blue column. Horizontal green lines of digit width show connected vertical digit parts. The gray rectangles represent the digit dimensions.
Pixels found by the vertical scanline are shown in red, green and blue for the top, middle and bottom third. Those found by the horizontal scanlines are shown in red and green for the left and right half of the digit. No scanlines are used to recognize a one.
Seven Segment Optical Character Recognition Version 2.13.4
Copyright (C) 2004-2013 by Erik Auerswald <firstname.lastname@example.org>
This program comes with ABSOLUTELY NO WARRANTY
This is free software, and you are welcome to redistribute it under the terms
of the GNU GPL (version 3 or later)

Usage: ssocr [OPTION]... [COMMAND]... IMAGE

Options:
  -h, --help               print this message
  -v, --verbose            talk about program execution
  -V, --version            print version information
  -t, --threshold=THRESH   use THRESH (in percent) to distinguish black from white
  -a, --absolute-threshold don't adjust threshold to image
  -T, --iter-threshold     use iterative thresholding method
  -n, --number-pixels=#    number of pixels needed to recognize a segment
  -i, --ignore-pixels=#    number of pixels ignored when searching digit boundaries
  -d, --number-digits=#    number of digits in image (-1 for auto)
  -r, --one-ratio=#        height/width ratio to recognize a 'one'
  -o, --output-image=FILE  write processed image to FILE
  -O, --output-format=FMT  use output format FMT (Imlib2 formats)
  -p, --process-only       do image processing only, no OCR
  -D, --debug-image[=FILE] write a debug image to FILE or testbild.png
  -P, --debug-output       print debug information
  -f, --foreground=COLOR   set foreground color (black or white)
  -b, --background=COLOR   set foreground color (black or white)
  -I, --print-info         print image dimensions and used lum values
  -g, --adjust-gray        use T1 and T2 as percentages of used values
  -l, --luminance=KEYWORD  compute luminance using formula KEYWORD
                           use -l help for list of KEYWORDS

Commands:
  dilation                 dilation algorithm (with mask of 1 pixel)
  erosion                  erosion algorithm (with mask of 9 pixels)
  closing [N]              closing algorithm
                           ([N times] dilation then [N times] erosion)
  opening [N]              opening algorithm
                           ([N times] erosion then [N times] dilation)
  remove_isolated          remove isolated pixels
  make_mono                make image monochrome
  grayscale                transform image to grayscale
  invert                   make inverted monochrome image
  gray_stretch T1 T2       stretch luminance values from [T1,T2] to [0,255]
  dynamic_threshold W H    make image monochrome w. dynamic thresholding
                           with a window of width W and height H
  rgb_threshold            make image monochrome by setting every pixel with
                           any values of red, green or blue below below the
                           threshold to black
  r_threshold              make image monochrome using only red channel
  g_threshold              make image monochrome using only green channel
  b_threshold              make image monochrome using only blue channel
  white_border [WIDTH]     make border of WIDTH (or 1) of image have
                           background color
  shear OFFSET             shear image OFFSET pixels (at bottom) to the right
  rotate THETA             rotate image by THETA degrees
  crop X Y W H             crop image with upper left corner (X,Y) with
                           width W and height H
  set_pixels_filter MASK   set pixels that have at least MASK neighbor pixels
                           set (including checked position)
  keep_pixels_filter MASK  keeps pixels that have at least MASK neighbor
                           pixels set (not counting the checked pixel)

Defaults:
  needed pixels          = 1
  ignored pixels         = 0
  no. of digits          = 6
  threshold              = 50.00
  foreground             = black
  background             = white
  luminance              = Rec709
  height/width threshold = 3

Operation:
  The IMAGE is read, the COMMANDs are processed in the sequence they are
  given, in the resulting image the given number of digits are searched and
  recognized, after which the recognized number is written to STDOUT.
  The recognition algorithm works with set or unset pixels and uses the
  given THRESHOLD to decide if a pixel is set or not.
  Use - for IMAGE to read the image from STDIN.

Exit Codes:
   0  if correct number of digits have been recognized
   1  if a different number of digits have been found
   2  if one of the digits could not be recognized
   3  if successful image processing only
  42  if -h, -V, or -l help
  99  otherwise
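For scripted use, the exit codes listed in the usage text can be checked by a small wrapper. The following sketch assumes that an `ssocr` binary is on the PATH; the function names and the image file name `token.png` are made up for this example, and only options and commands from the usage text are used.

```python
import subprocess

# Meaning of the exit codes, paraphrased from the usage text.
EXIT_CODES = {
    0: "correct number of digits recognized",
    1: "a different number of digits found",
    2: "one of the digits could not be recognized",
    3: "successful image processing only",
    42: "help, version or luminance keyword list printed",
    99: "other error",
}


def build_command(image, threshold=None, digits=None, commands=()):
    """Build an argument list in the order OPTIONs, COMMANDs, IMAGE."""
    cmd = ["ssocr"]
    if threshold is not None:
        cmd += ["-t", str(threshold)]
    if digits is not None:
        cmd += ["-d", str(digits)]
    for command in commands:
        cmd += [str(part) for part in command]
    cmd.append(image)
    return cmd


def run_ssocr(image, **kwargs):
    """Run ssocr and return (recognized text, exit code description)."""
    result = subprocess.run(build_command(image, **kwargs),
                            capture_output=True, text=True)
    return result.stdout.strip(), EXIT_CODES.get(result.returncode, "other error")


cmd = build_command("token.png", threshold=20,
                    commands=[("crop", 230, 195, 220, 60)])
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.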
Imlib2 (and therefore ssocr) does not work well with Netpbm images.
This program was developed as a proof of concept to test the recognition algorithm (this still shows in the source code...).
ssocr crop 230 195 220 60 -t 20 to get the token from the
Once upon a time a fellow member of the UNIX-AG got issued an RSA SecurID 600 token, but did not want to carry it around all the time. The available general OCR software was not able to recognize the digits, mainly because the segments are not connected. This gap was filled by ssocr.
Since then, a USB camera points at the token inside a cookie box and ssocr is used to get the number into the computer. A script using this number and a password is then used for login.
This setup means that the user has no need to carry the token with him and it can even be easily shared with co-workers. The complicated login procedure requiring a password and in this case two token numbers (which means a one minute wait for the next number) was incentive enough to replace the two factor authentication by traditional authentication measures. The security of the system is determined by the weakest link, which is not the one-time passcode provided by the token.
The first versions of ssocr did not contain the image manipulation algorithms. A separate program called ssocrpp (seven segment OCR preprocessor) was used instead. Since this program used Imlib2 as well, an intermediate image file had to be used. To overcome this, versions 2.x.x of ssocr include all functionality of ssocrpp.
The second major version of ssocr integrated all functionality in one binary. This was the first publicly released ssocr version. Development concentrated on adding image manipulation functions. No external image manipulation programs were needed any more, thus easing use of ssocr on differing Linux distributions. Since version 2.9.0 the image can be read from a pipe, easing the use of external image manipulation programs. Since version 2.11.0 a decimal point can be detected. Since version 2.12.0 hexadecimal digits can be detected. Since version 2.13.0 the number of digits can be determined automatically. Recognition of a decimal point and an arbitrary number of digits has been added to read the display of digital scales.
A similar project in Perl, published by the German Linux Magazin.
LimID, another project using specialized OCR to read a seven segment display. This actually includes some hardware to push a button on the token.
RoastLogger, another project that uses OCR to read seven segment displays (non-free).
Alex Samorukov blogged about using ssocr for the original use case of reading the number shown by an RSA SecurID token. :-)
Matt Kirchstein imported ssocr version 2.9.7 into a
and made two minor changes to compile ssocr on Mac OS X.
Equivalent changes have been incorporated into ssocr version 2.13.3.
I learned about this from someone trying to use ssocr from that site
and having problems, not from Matt. :-(
If you have to change something to make ssocr work for you, please tell me so I can improve ssocr for everyone. Thanks.
The FobCAM shows the RSA SecurID
token of someone.
This page is currently (2009-07-17) offline, but you can try the Wayback Machine.
My simple image grabber for Linux.
A comparison of formulas to create grayscale images.
Image processing can be done with Netpbm, ImageMagick, GREYC's Magic Image Converter, GraphicsMagick, ExactImage, or cvtool, among others (consider GIMP or ImageJ for interactive use).
leptonlib, found at
is a C library for image processing and analysis.
The web site contains some good reading besides library documentation as well.
The GD Graphics Library
(new site) is a comprehensive graphics
GEGL is a newcomer from the GIMP community.
GFXprim is a recent 2D bitmap graphics
library with C and Python APIs.
Anyway, I'd like a simple wrapper around the different libraries for working
with image formats, to easily load an image into memory and access the
individual pixels (and nothing else), but have not found this yet.
There are too many C++ libraries for image processing and computer vision. Some of them are: CImg, FreeImage, GIL (part of Boost), libCVD, LTI-Lib, Mimas toolkit, NASA Vision Workbench (github repository), OpenCV, OpenImageIO, ORFEO Tool Box (alternative link), VIPS, VSIPL++, VXL.
A few free OCR programs are Clara OCR, Conjecture, Cuneiform (with YAGF GUI), GNU Ocrad, GOCR, ocre, OCRFeeder, ocropus, Tesseract OCR.
unpaper is an interesting program to process scanned paper sheets. Scan Tailor is an interactive program to post-process scanned images.
ssocr uses Imlib2 for image I/O.