Scanning Terms & Definitions
ADF
Automatic Document Feeder. The mechanism by which a scanner feeds paper documents through for scanning.
Backfile Conversion
The process of converting large volumes of repository documents into digital images. This is typically done when an organization first starts to use a digital imaging system so that their older documents and future documents will all be stored in the same digital format.
Black & White
Also known as Bitonal. The scanned image is made up of pixels that are either black or white. There are no shades of gray or color.
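As a rough illustration of how a bitonal image relates to a grayscale one, the sketch below reduces gray values to black or white pixels with a simple threshold. The function name to_bitonal, the 0-255 gray range, and the cut-off of 128 are assumptions made for this example; they are not taken from any particular scanner's software.

```python
def to_bitonal(gray_rows, threshold=128):
    """Map each gray value (0-255) to 0 (black) or 1 (white).

    The threshold of 128 is an illustrative assumption; real scanning
    software may choose the cut-off differently.
    """
    return [[1 if value >= threshold else 0 for value in row]
            for row in gray_rows]


# A tiny 2 x 4 "image" of gray values.
sample = [
    [0, 127, 128, 255],
    [30, 200, 90, 140],
]
print(to_bitonal(sample))
```

Every output pixel is either 0 or 1, so each pixel needs only one bit of storage instead of the eight bits a 0-255 gray value requires.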
Day Forward
A document scanning approach in which only documents created after a defined cut-off point are scanned.
Deskew
The process of straightening an image that was scanned at a slight angle.
Despeckle
Removal of excess noise in a scanned image.
Document
A broadly used term that refers to word-processing files, e-mail messages, spreadsheets, database tables, faxes, business forms, images, or any other collection of organized data. A document can consist of a single image or several thousand images. Documents are also referred to as 'records'.
DPI
Dots Per Inch. Also known as resolution. The fineness or coarseness of an image. The greater the DPI, the finer the image will appear. Documents are typically scanned at 200 DPI for archival and at 300 DPI when OCR needs to be performed on the document. A higher DPI results in larger image files.
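To make the link between DPI and image size concrete, the sketch below works out the pixel dimensions and the uncompressed size of a scanned page. The function names and the 8.5 x 11 inch page are assumptions chosen for illustration, and actual file sizes will normally be smaller once compression is applied.

```python
def scan_dimensions(width_in, height_in, dpi):
    """Pixel dimensions of a page scanned at the given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


def uncompressed_bytes(width_px, height_px, bits_per_pixel):
    """Raw image size in bytes, padding each row up to a whole byte."""
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px


# Compare the two resolutions mentioned above for an 8.5 x 11 inch page.
for dpi in (200, 300):
    w, h = scan_dimensions(8.5, 11, dpi)
    bitonal = uncompressed_bytes(w, h, 1)    # 1 bit per pixel
    grayscale = uncompressed_bytes(w, h, 8)  # 8 bits per pixel
    print(f"{dpi} DPI: {w} x {h} px, "
          f"bitonal {bitonal} bytes, grayscale {grayscale} bytes")
```

For this page size, 200 DPI gives 1700 x 2200 pixels and 300 DPI gives 2550 x 3300 pixels, so moving from 200 to 300 DPI produces 2.25 times as many pixels.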
Duplex Scanning
Scanning both sides of a sheet of paper.
Electronic Document
A document that has been scanned, or was originally created on a computer. Documents become more useful when stored electronically because they can be distributed widely and instantly, and can be searched. HTML and PDF are well-known electronic document formats.
Flatbed Scanner
A scanner in which the document is placed on a glass window for scanning. A Flatbed Scanner does not have an ADF (Automatic Document Feeder).
Grayscale
An image type that uses black, white, and a range of shades of gray. A Grayscale image file size is much larger than that of a Black & White image.
ICR
Intelligent Character Recognition. A process that recognizes handwriting or printed text and converts it to editable alphanumeric data. Similar to OCR.
Index Field
Database fields used
Image
Refers to a single digital image. If a single sheet of paper has information on both sides, when scanned it will be represented by two individual images.
JPEG
A commonly used method of lossy compression for images scanned in color. The degree of compression can be adjusted, allowing a selectable trade-off between storage size and image quality.
Lossy Compression
A term describing a data compression algorithm which actually reduces the amount of information in the data, rather than just the number of bits used to represent that information. The lost information is usually removed because it is subjectively less important to the quality of the data or because it can be recovered reasonably by interpolation from the remaining data. JPEG is an example of a lossy compression technique.
Lossless Compression
A term describing a data compression algorithm which retains all the information in the data, allowing it to be recovered perfectly by decompression. TIFF can use Lossless Compression.
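The difference between the two kinds of compression can be sketched in a few lines. The example uses zlib from the Python standard library as a stand-in for a lossless method, and a simple rounding step as a stand-in for a lossy one. The sample bytes and the quantize helper are assumptions for illustration only; neither is how JPEG or TIFF work internally.

```python
import zlib

# Hypothetical pixel data: a short pattern of gray values, repeated.
original = bytes([10, 10, 10, 12, 200, 200, 201, 255] * 100)

# Lossless: decompression returns exactly the bytes that went in.
packed = zlib.compress(original)
restored = zlib.decompress(packed)
print(len(original), len(packed), restored == original)


def quantize(data, step=16):
    """Illustrative lossy step: round each value down to a multiple of step.

    Small differences between values are discarded and cannot be recovered.
    """
    return bytes((b // step) * step for b in data)


lossy = quantize(original)
print(lossy == original)  # some detail has been thrown away
print(len(zlib.compress(lossy)) <= len(packed))
```

After the lossless round trip the data is identical to the original. After the lossy step, values such as 10 and 12 have both become 0, so the original values can no longer be told apart.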
OCR
Optical Character Recognition. The process by which scanned images are electronically "read" to convert them into editable text. This conversion is performed after scanning, and may output formatted text or text-only files (flat ASCII files). Text generated by OCR is often input into text search databases, allowing retrieval of the original scanned image based on its content.
Page
One sheet of paper. If information is on the front and back of a page, it will be represented by two images when scanned.
PDF
Portable Document Format. PDF is a multi-platform file format developed by Adobe Systems.
PDF – Image over Unedited Text
A Searchable PDF. A document that has been processed by an OCR engine to convert the image into searchable text. The results of the OCR engine are not reviewed for accuracy.
Pixel
A single point in a raster image. The pixel is the smallest addressable screen element; it is the smallest unit of a picture that can be controlled.
Raster Image
A type of digital image composed of individual pixels of various colors. A raster image may be put in many file formats such as GIF, JPEG, TIFF, BMP, PICT, and PCX.
Simplex Scanning
Scanning only one side of a sheet of paper. If a page has information on the front and back then Duplex Scanning is required.
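The relationship between sheets and images in Simplex and Duplex Scanning is simple arithmetic, sketched below. The function name image_count is an assumption for this example, and the sketch assumes every side that is scanned is kept as an image.

```python
def image_count(sheets, duplex):
    """Number of images produced when scanning the given number of sheets.

    Simplex scanning captures one side per sheet; duplex captures both.
    """
    sides_per_sheet = 2 if duplex else 1
    return sheets * sides_per_sheet


print(image_count(50, duplex=False))
print(image_count(50, duplex=True))
```

A 50-sheet stack therefore produces 50 images when scanned simplex and 100 images when scanned duplex.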
Separator Sheets
Barcoded sheets used in production scanning to indicate where a new document or batch begins. Separator Sheets can be used to help automate indexing.
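One way to picture how Separator Sheets divide a scanned stack is the sketch below, which splits a list of scanned pages into documents wherever a separator appears. The function name, the "SEP" marker, and the page labels are all hypothetical; real production software would detect the barcode on the sheet itself.

```python
def split_on_separators(pages, is_separator):
    """Group scanned pages into documents, starting a new one at each separator.

    Separator sheets themselves are dropped, and empty documents
    (for example, two separators in a row) are discarded.
    """
    documents = [[]]
    for page in pages:
        if is_separator(page):
            documents.append([])        # a separator starts a new document
        else:
            documents[-1].append(page)  # an ordinary page joins the current one
    return [doc for doc in documents if doc]


# Hypothetical scan order: "SEP" stands in for a barcoded separator sheet.
scanned = ["SEP", "invoice-p1", "invoice-p2", "SEP", "letter-p1"]
print(split_on_separators(scanned, lambda page: page == "SEP"))
```

Each resulting group can then be indexed as a single document, which is how separators help automate indexing.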
TIFF (Tagged Image File Format)
An industry-standard file format developed for the purpose of storing high-resolution bit-mapped, gray-scale, and color images.
