Proceedings of the
Seventh International Workshop on Frontiers in Handwriting Recognition
September 11-13 2000, Amsterdam

Edited by L.R.B. Schomaker and L.G. Vuurpijl



Full Papers


Full Paper #1

CROSS-DOMAIN SEARCHING USING HANDWRITTEN QUERIES

D. LOPRESTI, G. WILFONG

Bell Labs, Lucent Technologies Inc.,
600 Mountain Avenue,
Murray Hill, NJ 07974, USA
E-mail: [dpl,gtw]@research.bell-labs.com

In this paper, we show how cross-domain approximate string matching can be applied to searching a database of scanned typeset documents using handwritten queries without requiring the correction of recognition errors. We present preliminary experimental results that suggest this approach can significantly improve retrieval effectiveness.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 3-12.
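
As background for the abstract above, approximate string matching can be sketched with the classical edit-distance recurrence, in the variant that lets a query match anywhere inside a longer text. This is only an illustration of the general technique, not the cross-domain formulation of the paper; the function name `best_substring_distance` and the unit edit costs are assumptions made for the example.

```python
def best_substring_distance(query: str, text: str) -> int:
    """Smallest edit distance between `query` and any substring of `text`.

    Unit costs for insertion, deletion and substitution (an assumption).
    """
    # prev[j] = cost of matching the query prefix seen so far against some
    # substring of `text` ending at position j. Row 0 is all zeros, so a
    # match may start anywhere in the text.
    prev = [0] * (len(text) + 1)
    for i, qc in enumerate(query, start=1):
        curr = [i] + [0] * len(text)
        for j, tc in enumerate(text, start=1):
            curr[j] = min(
                prev[j] + 1,               # query character left unmatched
                curr[j - 1] + 1,           # text character left unmatched
                prev[j - 1] + (qc != tc),  # match or substitution
            )
        prev = curr
    # The match may also end anywhere in the text.
    return min(prev)
```

A query is then ranked against a document by this distance, so a recognition error in either string raises the cost instead of ruling out the match.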



Full Paper #2

ADAPTIVE CHARACTER RECOGNIZER FOR A HAND-HELD DEVICE: IMPLEMENTATION AND EVALUATION SETUP

Vuokko VUORI, Matti AKSELA, Jorma LAAKSONEN, Erkki OJA

Helsinki University of Technology
Laboratory of Computer and Information Science
P.O.Box 5400, FIN-02015 HUT, Finland
email: {vuokko.vuori,matti.aksela,jorma.laaksonen}@hut.fi

Jari KANGAS

Nokia Research Center
P.O.Box 100, FIN-33701 Tampere, Finland
email: jari.a.kangas@nokia.com

In this work, we describe a character recognition system that we have implemented for experimenting with a self-supervised adaptation method. The Dynamic Time Warping algorithm is used for matching input characters to prototypes, and recognition is carried out according to the k-nearest neighbor rule. The prototype set is adapted by adding new prototypes and by reshaping existing ones with a method based on Learning Vector Quantization. The adaptation process is supervised by the user's reactions to the recognition results and by other indirect information obtained from the text input user interface. We also discuss the practical problems encountered in implementing a computationally heavy recognition method on a device with limited resources.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 13-22.
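
The matching step named in the abstract above (Dynamic Time Warping distances fed to a k-nearest neighbor vote) can be sketched as follows. This is a minimal sketch under assumptions: characters are taken to be sequences of 2-D pen points, the local cost is Euclidean distance, and the names `dtw_distance` and `knn_classify` are hypothetical. The adaptation step based on Learning Vector Quantization is not shown.

```python
import math
from collections import Counter

def dtw_distance(a, b):
    """Dynamic Time Warping cost between two non-empty point sequences."""
    inf = math.inf
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = math.dist(a[i - 1], b[j - 1])  # local point-to-point distance
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[len(a)][len(b)]

def knn_classify(sample, prototypes, k=3):
    """Majority label among the k prototypes nearest to `sample` under DTW.

    `prototypes` is a list of (label, point_sequence) pairs.
    """
    nearest = sorted(prototypes, key=lambda p: dtw_distance(sample, p[1]))[:k]
    votes = Counter(label for label, _ in nearest)
    return votes.most_common(1)[0][0]
```

Every classification computes one full DTW table per prototype, which hints at why the abstract calls the method computationally heavy for a device with limited resources.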



Full Paper #3

ON-LINE CHARACTER RECOGNITION ADAPTIVELY CONTROLLED BY HANDWRITING QUALITY

Masahiko HAMANAKA and Keiji YAMADA

Computer & Communication Media Research, NEC Corporation
4-1-1 Miyazaki, Miyamae-ku, Kawasaki, Kanagawa 216-8555, Japan
E-mail: fm-hamanaka@az, kg-yamada@cpg.jp.nec.com

On-line character recognition that can adapt to handwriting quality is proposed. In character recognition, it is difficult to recognize both clearly and roughly written characters accurately. For Japanese characters, the number of strokes often varies slightly when characters are written roughly. In a previous method, the ranges of the number of strokes were set widely enough for recognition; however, these ranges were not optimal for clearly written characters. The proposed method adaptively controls a distribution model of the number of strokes according to handwriting quality, and it uses this model for pre-candidate selection and fine classification. Recognition experiments demonstrated that the proposed method has greater recognition accuracy and speed than the previous method. In particular, accuracy improved from 91.4% to 94.3% and speed increased by about 50% when recognizing clearly written data.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 23-32.



Full Paper #4

INTER-LINE DISTANCE ESTIMATION AND TEXT LINE EXTRACTION FOR UNCONSTRAINED ONLINE HANDWRITING

Eugene H. RATZLAFF

IBM T.J. Watson Research Center, P.O. Box 218, Yorktown Heights, NY 10598, USA
E-mail: ratzlaff@us.ibm.com

Methods for detecting and extracting whole text lines from unconstrained online handwritten text are described. The general approach is a "bottom-up" clustering of discrete strokes into small groups that are then merged into isolated lines of text. Initial clustering of strokes into groups is based on combined temporal and spatial stroke proximity. Spatial stroke proximity is gauged relative to estimated inter-line distance and mean character height. Two methods applicable to off-line or on-line data are described for estimating the inter-line distance: autocorrelation (self-convolution) of the Y-axis projection histogram, and a fitting function. Inter-line distance is accurately determined for 99% of all text pages. Text line extraction accuracy on letters (correspondence) is 98.7% and on tables is 94.9%.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 33-42.
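
The first of the two estimators named in the abstract above, autocorrelation of the Y-axis projection histogram, can be sketched as follows. This is a sketch under assumptions, not the paper's implementation: the profile is a plain list of ink counts per row, the autocorrelation is unnormalized, and the name `estimate_line_spacing` and the `min_lag` parameter are hypothetical.

```python
def estimate_line_spacing(profile, min_lag=2):
    """Lag (in rows) at which the profile's autocorrelation is largest.

    `profile[y]` is the amount of ink in row y, i.e. the Y-axis projection
    histogram of a non-empty page. Returns None if the profile is too
    short to test any lag.
    """
    n = len(profile)
    mean = sum(profile) / n
    centered = [v - mean for v in profile]  # remove the constant offset
    best_lag, best_score = None, float("-inf")
    # Only lags up to half the page height are considered (an assumption).
    for lag in range(min_lag, n // 2 + 1):
        score = sum(centered[y] * centered[y + lag] for y in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

Because the sum is not normalized by the number of overlapping rows, shorter lags have more terms, which in this sketch favors the fundamental line spacing over its multiples.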



Full Paper #5

LINE REMOVAL AND CHARACTER RESTORATION USING BAG REPRESENTATION OF FORM IMAGES

Soo Hyung KIM, Seon H. JEON and Hee K. KWAG

Department of Computer Science, Chonnam National University
300 Yongbongdong, Bukgu, Kwangju, 500-757, Korea
E-mail: {shkim, swjong, hkkwag}@chonnam.chonnam.ac.kr

This paper proposes a new algorithm for text/line separation in forms processing. It can detect and remove various styles of horizontal lines, such as lines rotated by up to ±45°, slightly curved lines, dashed lines, lines of non-uniform thickness, and so on. After removing the lines, it restores the character strokes distorted by the removal. All operations are performed efficiently with a BAG (Block Adjacency Graph) representation of the input binary image. An experiment with 200 samples, in which handwritten Korean characters are written on a guiding line, shows the superiority of our algorithm: 96.5% accuracy and about 1 second of processing time per sample.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 43-52.



Full Paper #6

SLANT ESTIMATION FOR HANDWRITTEN WORDS BY DIRECTIONALLY REFINED CHAIN CODE

Yimei DING, Fumitaka KIMURA, Yasuji MIYAKE

Faculty of Engineering, Mie University, 1515 Kamihama,
Tsu 514-8507, JAPAN
E-mail: tei@hi.info.mie-u.ac.jp

Malayappan SHRIDHAR

ECE Dept., University of Michigan-Dearborn,
Dearborn, MI 48128-1491, USA

The authors earlier proposed a chain code method for slant estimation and correction. Although the method usually gives a good estimate of the word slant in a simple way, the slant tends to be underestimated when its absolute value is close to or greater than 45°. To solve this problem, we propose an 8-directional method that suppresses the underestimation and improves the accuracy of slant estimation effectively without sacrificing processing speed or simplicity. The relationship between slant estimation accuracy and the directional refinement of the chain code is also discussed. Although the range of linear estimation widens as the directional resolution increases, the slant tends to be overestimated. However, we find that if we neglect the chain elements close to the horizontal, the overestimation can be suppressed properly.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 53-62.



Full Paper #7

A GENERIC SYSTEM TO EXTRACT AND CLEAN HANDWRITTEN DATA FROM BUSINESS FORMS

Xiangyun YE1,2, Mohamed CHERIET1,2 and Ching Y. SUEN1

1Centre for Pattern Recognition and Machine Intelligence
Concordia University, Suite GM606, 1455 de Maisonneuve Blvd. West
Montréal, Québec H3G 1M8, Canada

2Imagery, Vision and Artificial Intelligence Laboratory
École de Technologie Supérieure, University of Québec
1100, Notre-Dame West, Montréal, Québec H3C 1K3, Canada
E-mail: {xyye, suen}@cenparmi.concordia.ca, cheriet@gpa.etsmtl.ca

A generic system is proposed to automatically extract and clean handwritten items from business forms. Handwritten data usually touch or cross preprinted form frames and texts. Having assumed that the item-of-interest can be located roughly by existing form registration methods, we focus only on the extraction and cleaning of the filled-in items. The proposed system includes training and cleaning phases. In the training phase, a model template is generated automatically from a blank form. Features such as the position and stroke width of the preprinted entities (including form frames and instructions) are extracted. In the cleaning phase, the system registers the template to the input form by landmark alignment. The form frames are removed and the handwritings are restored by morphological operations. When the handwritings are found touching or crossing preprinted texts, morphological operations based on statistical features are used to clean them. Both subjective and objective evaluations show promising results for the proposed system.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 63-72.



Full Paper #8

A FRAMEWORK FOR DOCUMENT PRE-PROCESSING IN FORENSIC HANDWRITING ANALYSIS

Katrin FRANKE and Mario KÖPPEN

Fraunhofer Institute for Production Systems and Design Technology IPK,
Pascalstr. 8-9, D-10587 Berlin, Germany,
Phone: +49 30 39006194, Fax: +49 30 3917517,
E-mail: {katrin.franke, mario.koeppen}@ipk.fhg.de

We propose an open layered framework that can be adapted to fulfill sophisticated demands in forensic handwriting analysis. Because of the conflicting requirements of processing a huge number of different document types while providing high-quality processed images for individual document classes, neither a standardized queue of processing stages with fixed parameter sets nor fixed image operations are suitable for such a framework concept. The open layered framework proposed in this paper provides adaptation abilities at the parameter level, the operator level and the algorithm level. Moreover, an embedded module that uses genetic programming can generate specific filters for background removal on the fly. In this paper, the layered framework is presented, and aspects of its implementation and results of its application are given.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 73-82.



Full Paper #9

BACKGROUND ELIMINATION IN BANK CHECKS USING GREYSCALE MORPHOLOGY

Santhosh SHETTY1, M. SHRIDHAR1 and Gilles HOULE2

1University of Michigan - Dearborn, USA
2Computer Sciences Corp, USA
E-mail: mals@umich.edu, ghoule2@csc.com

In this paper we propose a new method of background elimination in personal bank checks to facilitate machine recognition of user-entered information. One of the key problems that affect the extraction of user-entered information is the wide diversity of check backgrounds. They have different patterns, including scenes from nature, cartoon characters and sometimes even paintings. These backgrounds add artifacts, which are picked up by the pre-processing systems and produce errors in the later stages, culminating in erroneous recognition. This method does not require a pre-stored image of a blank check to be used as a template. Instead it generates a "pseudo template" from a filled-in input image, which is then used as a reference to eliminate the background. The input image is subtracted from the generated reference image. The result is then separated into foreground pixels and background pixels based on a relative threshold that takes into account the magnitude of the pixels.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 83-91.



Full Paper #10

STROKE LEVEL MODELING OF ON LINE HANDWRITING THROUGH MULTI-MODAL SEGMENTAL MODELS

T. ARTIÈRES, J-M. MARCHAND, P. GALLINARI

LIP6
E-mail: Thierry.Artieres@lip6.fr, Patrick.Gallinari@lip6.fr

B. DORIZZI

INT
E-mail: dorizzi@int-evry.fr

Within a few years, Hidden Markov Models (HMMs) have become the main technology for on-line handwritten word recognition (HWR). We consider here segment models, which generalize HMMs; these models aim to model the signal at a global level rather than at the frame level and have been shown to outperform standard HMMs in modeling ability. We propose a new segment model that automatically handles different writing styles. We compare our system with a reference system and a baseline segment model on the isolated character set of the UNIPEN database.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 93-102.



Full Paper #11

FEATURE SELECTION USING GENETIC ALGORITHMS FOR HANDWRITTEN CHARACTER RECOGNITION

Gyeonghwan KIM1 and Sekwang KIM2

1Dept. of Electronic Engineering, Sogang University
CPO Box 1142, Seoul 100-611, Korea
E-mail: gkim@ccs.sogang.ac.kr
2Turbo Tek, 16-6 Sunae, Pundang, Sungnam Kyungki, Korea
E-mail: skim@turbotek.co.kr

A feature selection method is proposed that uses genetic algorithms, which are a suitable means of selecting an appropriate set of features from a feature set of very high dimension. SGA (Simple Genetic Algorithm) and modified versions of it are applied to improve both recognition speed and recognition accuracy. Experimental results show that the proposed methods improve recognition performance with a significant reduction in feature dimension. Several trials have also been made to investigate how the outcome of feature selection is affected as the feature dimension changes.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 103-112.



Full Paper #12

A NEW STRATEGY FOR IMPROVING FEATURE SETS IN A DISCRETE HMM-BASED HANDWRITING RECOGNITION SYSTEM

F. GRANDIDIER and R. SABOURIN

CENPARMI, Concordia University, 1455 de Maisonneuve Blvd West, Montréal H3G 1M8, Canada
LIVIA, Ecole de Technologie Supérieure, 1100 rue Notre Dame Ouest, Montréal H3C 1K3, Canada
E-mail: frede@cenparmi.concordia.ca, sabourin@gpa.etsmtl.ca

C.Y. SUEN

CENPARMI, Concordia University,
1455 de Maisonneuve Blvd West, Montréal H3G 1M8, Canada

M. GILLOUX

RMO, Service de Recherche Technique de La Poste,
10, rue de l'Ile Mabon, BP86334, 44263 Nantes Cedex 2, France

In this paper we introduce a new strategy for improving a discrete HMM-based handwriting recognition system, by integrating several information sources from specialized feature sets. For a given system, the basic idea is to keep the most discriminative features, and to replace the others with new ones obtained from new feature spaces. After evaluating the individual discriminative power of each single feature, the set is divided into two subsets: one containing the discriminative features, and the second the others. Considering feature classes in the non-discriminative feature subset allows the specialization of new feature sets on specific problems. The application of this strategy to an existing system showed an improvement of 16% in the recognition rate when a lexicon of 1000 city names was used.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 113-122.



Full Paper #13

ON THE REJECTION ABILITY REQUIRED IN MULTIPLE HYPOTHESIS TECHNIQUES

Hiroshi SAKO, Tatsuhiko KAGEHIRO and Hiromichi FUJISAWA

Hitachi Central Research Laboratory
1-280 Higashi-Koigakubo, Kokubunji, Tokyo 185-8601, JAPAN
E-mail: sakou@crl.hitachi.co.jp

The so-called multiple hypothesis technique is applied to solve a recognition problem that can be divided into at least two sub-problems. The principle of the technique is to solve the sub-problems by recognisers, a pre-recogniser and a post-recogniser, and to allow the pre-recogniser to leave several possible solutions to the post-recogniser. The pre-recogniser uses several hypotheses based on information or a priori knowledge. The post-recogniser tries to solve its assigned sub-problem by using the solutions from the pre-recogniser and different a priori knowledge. Therefore, there must be co-operation between the recognisers in order to achieve better total performance in solving the recognition problem. In this study, required abilities of the pre- and the post-recognisers are analysed in order to attain better recognition performance. This analysis gives guidelines for two special factors: number of outputs of the pre-recogniser and required recognition rate of each recogniser. These guidelines are applied to an actual mail-address reading system using a multiple hypothesis technique.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 123-132.



Full Paper #14

RECOGNITION METHOD FOR CURSIVE JAPANESE WORD WRITTEN IN LATIN CHARACTERS

Kenichi MARUYAMA and Yasuaki NAKANO

Dept. of Information Engineering, Shinshu University
4-17-1 Wakasato, Nagano, 380-8553, Japan
E-mail: {zmaru, nakano}@cs.shinshu-u.ac.jp

This paper proposes a recognition method for cursive Japanese words written in Latin characters. The method integrates multiple classifiers, using duplicated candidates in multiple classifiers and orders of classifiers, to improve the word recognition rate by combining their results. In experiments using two classifiers, the word recognition rate was 68.4%, and the cumulative recognition rate among the ten best candidates was 92.5%.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 133-142.



Full Paper #15

CLASSIFIER COMBINATION: THE ROLE OF A-PRIORI KNOWLEDGE

V. DI LECCE1, G. DIMAURO2, A. GUERRIERO1, S. IMPEDOVO2, G. PIRLO2, A. SALZO2

1Dipartimento di Ing. Elettronica - Politecnico di Bari - Via Re David - 70126 Bari - Italy
2Dipartimento di Informatica - Università di Bari - Via Orabona, 4 - 70126 Bari - Italy

The aim of this paper is to investigate the role of a-priori knowledge in the process of classifier combination. For this purpose, three combination methods that use different levels of a-priori knowledge are compared. The performance of the methods is measured under different working conditions by simulating sets of classifiers with different characteristics. To do so, a random variable is used to simulate each classifier, and an estimator of stochastic correlation is used to measure the agreement among classifiers. The experimental results, which clarify the conditions under which each combination method provides better performance, show to what extent a-priori knowledge of the characteristics of the set of classifiers can improve the effectiveness of classifier combination.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 143-152.



Full Paper #16

IMPROVEMENT OF RECOGNITION ACCURACY USING 2-STAGE CLASSIFICATION

Kr. IANAKIEV and V. GOVINDARAJU

CEDAR, Department of Computer Science
UB Commons, Suite 202
Amherst, NY 14228, USA
E-mail: {ianakiev,govind}@cedar.buffalo.edu

Typical digit recognizers classify an unknown digit pattern by computing its distance from the cluster centers in a feature space. The K-Nearest Neighbor (KNN) rule assigns the unknown pattern to the class belonging to the majority of its K neighbors. These and other traditional methods adopt a uniform rule irrespective of the "difficulty" of the unknown pattern. In this paper, we propose a methodology which uses a multiple classification scheme. The classification rules of each stage are dependent on the "difficulty" of the unknown sample. Samples "far" from the center, which tend to fall on the boundaries of classes, are error-prone and hence "difficult". An "overlapping zone" is defined in the feature space to identify such difficult samples. We have tested this methodology on a large set (30,398) of handwritten digit images. The method described in this paper has improved the performance of the GSC digit recognizer [7]. Our method successfully reduces its error rate from 2.85% to 1.96%, i.e., by 0.89 percentage points, which is more than 30% of the initial error. We have tested our method on other available classifiers and have obtained similar results.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 153-165.
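
The two figures quoted in the abstract above, the absolute and the relative error reduction, can be checked with a few lines of arithmetic. The helper name `error_reduction` is hypothetical and serves only to make the calculation explicit.

```python
def error_reduction(before_pct, after_pct):
    """Return (absolute reduction in percentage points, relative reduction in %)."""
    absolute = before_pct - after_pct
    relative = 100.0 * absolute / before_pct
    return absolute, relative

absolute, relative = error_reduction(2.85, 1.96)
print(f"{absolute:.2f} points, {relative:.1f}% relative")
# prints "0.89 points, 31.2% relative"
```

The relative figure of about 31.2% is what the abstract refers to as "more than 30% of the initial error".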



Full Paper #17

RECOGNITION AND VERIFICATION OF TOUCHING HANDWRITTEN NUMERALS

Jie ZHOU

IBM Toronto Laboratory
330 University Avenue, Toronto Canada M5G 1R7
E-mail: jiez@ca.ibm.com

Adam KRZYZAK, Ching Y. SUEN

Centre for Pattern Recognition and Machine Intelligence
Concordia University, Montreal, Canada H3G 1M8
E-mail: {krzyzak,suen}@cenparmi.concordia.ca

In the field of financial document processing, recognition of touching handwritten numerals has been limited by the lack of good benchmarking databases and the low reliability of algorithms. This paper describes efforts toward solving these two problems. Two databases, IRIS-Bell'98 and TNIST, are built/organized to serve as standard data sets. Working with samples from these databases, we propose a Recognition & Verification system measured by precision rate, which reflects the system reliability in a class-specific manner. The graph-based recognizer combines the segmentation-based and segmentation-free approaches, while the verifier incorporates both general and domain-specific verification schemes. Results support the effectiveness of the proposed verification scheme.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 179-188.



Full Paper #18

RESULTS FROM A PERFORMANCE EVALUATION OF HANDWRITTEN ADDRESS RECOGNITION SYSTEMS FOR THE UNITED STATES POSTAL SERVICE

Donald P. D'AMATO

Mitretek Systems, Inc., 7525 Colshire Drive, McLean, VA, 22102-7400 USA
E-mail: ddamato@mitretek.org

Edward J. KUEBERT and Alfred LAWSON

United States Postal Service, 8403 Lee Highway, Merrifield, VA, 22082-8101 USA
E-mail: ekuebert@email.usps.gov, alawson@email.usps.gov

For a cost-incentive-based procurement (known as HIP), the U.S. Postal Service (USPS) developed a methodology to predict the recognition performance of Remote Computer Reader (RCR) systems for handwritten letter mail. Very high volumes of mail in the United States mean that slight changes in mail piece finalization and error rates have substantial cost consequences. Thus, high measurement precision and carefully truthed data are required. Because of considerable regional and seasonal variability in address quality, the HIP evaluation required large, representative databases of images and confirmation using high volumes of live-mail. At least four RCR versions were evaluated in HIP. In comparison to a baseline RCR system, the final HIP RCR system achieved the considerable gain of approximately 33 percent in the finalization rate for an image database, while reducing the error rate to about 2.5 percent. Live-mail measurements from 25 diverse sites corroborated the database results and illustrated the high variability in address quality and consequent recognition performance. USPS' testing confirmed that evaluation with sufficiently large and representative databases is an effective means for predicting performance on live-mail.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 189-198.



Full Paper #19

A NEW HYBRID APPROACH FOR LEGAL AMOUNT RECOGNITION

V. DI LECCE1, G. DIMAURO2, A. GUERRIERO1, S. IMPEDOVO2, G. PIRLO2, A. SALZO2

1Dipartimento di Ing. Elettronica - Politecnico di Bari - Via Re David - 70126 Bari - Italy
2Dipartimento di Informatica - Università di Bari - Via Orabona, 4 - 70126 Bari - Italy

This paper presents a new hybrid approach for legal amount recognition on Italian bank checks. It exploits the observation that a legal amount can be described as a sequence of 'core' groups of words separated by suitable 'separator' words. Therefore, an analytical strategy is used to segment the amount into 'core' groups of words, which are then recognized according to a global approach. For this purpose, lexical and syntactical a-priori knowledge of the application domain is used for both amount segmentation and recognition. The experimental results demonstrate the effectiveness of the new approach.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 199-208.



Full Paper #20

HMM BASED HIGH ACCURACY OFF-LINE CURSIVE HANDWRITING RECOGNITION BY A BASELINE DETECTION ERROR TOLERANT FEATURE EXTRACTION APPROACH

W. WANG, A. BRAKENSIEK, A. KOSMALA, G. RIGOLL

Dept. of Computer Science, Faculty of Electrical Engineering
Gerhard-Mercator-University Duisburg
Bismarckstr. 90, 47057 Duisburg, Germany
E-mail: {wwwang, anja, kosmala, rigoll}@fb9-ti.uni-duisburg.de

Hidden Markov Models (HMMs) can model the similarity and variation among samples of a class through a doubly stochastic process. The main difficulty in applying them to off-line recognition of cursive words is to produce a consistent sequence of feature vectors from the input word image. In conventional HMM-based methods, a sequence of thin fixed-width vertical frames is extracted as feature vectors from the image. The extracted features are sensitive to errors in pre-processing steps such as baseline detection. In this paper we present an HMM-based modeling approach together with an extended sliding window feature extraction method to decrease the influence of baseline detection errors. Experiments show that our novel approach achieves better recognition performance and significantly reduces the relative error rate compared with traditional methods.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 209-218.



Full Paper #21

VERIFICATION OF GRAPHEMES USING NEURAL NETWORKS IN AN HMM-BASED ON-LINE KOREAN HANDWRITING RECOGNITION SYSTEM

Sung J. CHO, Jahwan KIM, and Jin H. KIM

Department of Electrical Engineering & Computer Science, KAIST
373-1, Kusong-dong, Yusong-ku, Taejon, 305-701, Korea
E-mail: {sjcho, jahwan, jkim}@ai.kaist.ac.kr

This paper presents a neural network based verification method in an HMM-based on-line Korean handwriting recognition system. It penalizes unreasonable grapheme hypotheses and adds global and structural information to the HMM-based recognition system, which is intrinsically based on local information. In the proposed system, each grapheme has one neural network verifier as well as one HMM recognizer. The verifier takes as input the grapheme hypothesis generated by the HMM and outputs an a posteriori probability as its validity. This probability is then incorporated into the search process by the Viterbi algorithm during recognition. The global and structural information for the verifier is obtained from the relationship between primitive strokes in each grapheme by analyzing their correspondence with the HMM states. The experimental results show that the recognition error of the baseline HMM network can be reduced by 39.2% with the proposed verification scheme.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 219-228.



Full Paper #22

K-MEANS CLUSTERING FOR HIDDEN MARKOV MODELS

Michael P. PERRONE and Scott D. CONNELL

Pen Technologies Group,
IBM T.J. Watson Research Center
mpp@us.ibm.com

An unsupervised k-means clustering algorithm for hidden Markov models is described and applied to the task of generating subclass models for individual handwritten character classes. The algorithm is compared to a related clustering method and shown to give a relative change in the error rate of as much as 8% on a 30,000-word vocabulary, unconstrained-style, on-line, writer-independent handwriting recognition task.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 229-238.



Full Paper #23

DATA DRIVEN DESIGN OF HMM TOPOLOGY FOR ON-LINE HANDWRITING RECOGNITION

Jay J. LEE, Jahwan KIM, and Jin H. KIM

Dept. of Electrical Engineering & Computer Science, KAIST,
373-1, Kusong-dong, Yusong-gu, Taejon 305-701, Korea
E-mail: {joony, jahwan, jkim}@ai.kaist.ac.kr

Although HMM is widely used for on-line handwriting recognition, there is no simple and well-established way of designing the HMM topology. We propose a data-driven systematic method to design HMM topology. Data samples in a single pattern class are structurally simplified into a sequence of straight-line segments, and then these simplified representations of the samples are clustered. An HMM is constructed for each of these clusters by assigning a state to each straight-line segment. Then the resulting multiple models of the class are combined to form an architecture of a multiple parallel-path HMM, which behaves as a single HMM. To avoid excessive growth in the number of states, parameter tying is applied so that structural similarity among patterns is reflected. Experiments on on-line Hangul recognition showed an error reduction of about 19% compared to the previous intuitive design methods.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 239-248.

Download PostScript
Download PDF


Full Paper #24

NPEN++: AN ON-LINE HANDWRITING RECOGNITION SYSTEM

S. JAEGER, S. MANKE, A. WAIBEL

Interactive Systems Laboratories
University of Karlsruhe
Computer Science Department, 76128 Karlsruhe, Germany
and
Carnegie Mellon University
School of Computer Science, Pittsburgh, PA 15213-3890, USA
E-mail: {stefan.jaeger,waibel}@ira.uka.de

This paper presents the on-line handwriting recognition system NPen++ developed at the University of Karlsruhe and Carnegie Mellon University. The NPen++ recognition engine is based on a Multi-State Time Delay Neural Network and yields recognition rates from 96% for a 5000 word dictionary to 93.4% on a 20,000 word dictionary and 91.2% for a 50,000 word dictionary. The proposed tree search and pruning technique reduces the search space considerably without losing too much recognition performance compared to an exhaustive search. This allows running the NPen++ recognizer in real-time with large dictionaries.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 249-260.

Download PostScript
Download PDF


Full Paper #25

TWO TREE-FORMATION METHODS AND FAST PATTERN SEARCH USING NEAREST NEIGHBOR AND NEAREST-CENTROID MATCHING

L. SCHOMAKER1, D. MANGALAGIU2, L. VUURPIJL3, M. WEINFELD4

1schomaker@nici.kun.nl
2mangalagiu@hec.fr
3vuurpijl@nici.kun.nl
4weinfeld@lix.polytechnique.fr

This paper describes tree-based classification of character images, comparing two methods of tree formation and two methods of matching: nearest neighbor and nearest centroid. The first method, Preprocess Using Relative Distances (PURD), is a tree-based reorganization of a flat list of patterns, designed to speed up nearest-neighbor matching. The second method is a variant of agglomerative hierarchical clustering (HCLUS) which aims at finding a hierarchical structure of centroids in the pattern space. Results indicate that the PURD method is a very fast, effective and convenient method for the speedup of 1NN search, from which it is, however, difficult to derive usable character prototypes. HCLUS can be used to obtain very fast search with acceptable classification rate while providing character prototypes, however, at the cost of significant training efforts.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 261-270.

Download PostScript
Download PDF


Full Paper #26

ACCURACY IMPROVEMENT OF HANDWRITTEN CHARACTER RECOGNITION BY GLVQ

Tsuyoshi FUKUMOTO, Tetsushi WAKABAYASHI, Fumitaka KIMURA, and Yasuji MIYAKE

Faculty of Engineering, Mie University, 1515 Kamihama,
Tsu 514-8507, JAPAN
E-mail: kimura@hi.info.mie-u.ac.jp

This paper deals with accuracy improvement of handwritten character recognition by GLVQ (generalized learning vector quantization). In the literature [3], the combination of FDA (Fisher discriminant analysis) and GLVQ was investigated and found to be effective for handwritten Chinese character recognition employing the minimum Euclidean distance classifier. In this paper, the projection distance and the modified projection distance are employed in addition to the Euclidean distance, and handwritten numerals as well as Chinese characters are used for the evaluation test. The experimental results show that learning the reference vectors by GLVQ improves the recognition accuracy of not only the Euclidean distance classifier but also the projection distance classifier and the modified projection distance classifier. The highest accuracy (98.41%) for Chinese character recognition was obtained when the FDA, GLVQ and the modified projection distance were employed. The highest accuracy (99.36%) for numeral recognition was obtained when the GLVQ and the modified projection distance were employed.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 271-280.

Download PostScript
Download PDF


Full Paper #27

GENERATING NEW SAMPLES FROM HANDWRITTEN NUMERALS BASED ON POINT CORRESPONDENCE

Minoru MORI, Akira SUZUKI, Akio SHIO, and Sakuichi OHTSUKA

NTT Cyber Space Laboratories
1-1 Hikari-no-oka, Yokosuka-shi, Kanagawa, 239-0847 Japan
E-mail: mmori@marsh.hil.ntt.co.jp

This paper describes a character generation method based on point correspondence between patterns. The number of training samples used in constructing a recognition dictionary strongly affects its recognition performance. Unfortunately, gathering large numbers of new samples is so time-consuming that it is more useful to generate new samples from a few original ones. The character generation method proposed herein is based on the point correspondence between each sample and the template derived from all samples. The proposed method can automatically generate new samples that appear to be written naturally and extend the handwriting deformation seen in the original samples. Initial experiments show that using the samples so generated can improve the recognition performance.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 281-290.

Download PostScript
Download PDF


Full Paper #28

ON THE COMPLEXITY OF COGNITION

S. JAEGER

Interactive Systems Laboratories
University of Karlsruhe
Computer Science Department, 76128 Karlsruhe, Germany
email: stefan.jaeger@ira.uka.de

This paper presents an investigation of a cognitive problem in terms of complexity theory. Two global optimization approaches are presented for recovering trajectories from static, handwritten word images. Both take graph-theoretical representations of symbols as input. The first, polynomial approach minimizes the length of the recovered trajectories. This approach cannot recover trajectories traversing parts of the word more than twice. The second approach, which minimizes costs at distinguished nodes of the trajectory, is more powerful in this respect and is proved to be NP-hard. An efficient divide-and-conquer method is proposed that splits handwritten words into independent subparts and recovers trajectories for every subpart, which turn out to be very small in practice. The splitting technique exploits morphological features from the static word image.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 291-302.

Download PostScript
Download PDF


Full Paper #29

From Off-line to On-line Handwriting Recognition

P.M. LALLICAN1, C. VIARD-GAUDIN2, S. KNERR1

1Vision Objects, 11 Rue de la Fontaine Caron, 44300 Nantes, France
2Ecole Polytechnique de l'Université de Nantes, IRCCyN/UMR CNRS 6597, Rue C. Pauc,
BP 60601, 44306 Nantes Cedex 3, France
E-mail: {pmlallican,stefan.knerr}@visionobjects.com,cviard@ireste.fr

On-line handwriting includes more information on time order of the writing signal and on the dynamics of the writing process than off-line handwriting. Therefore, on-line recognition systems achieve higher recognition rates. This can be concluded from results reported in the literature, and has been demonstrated empirically as part of this work.
We propose a new approach for recovering the time order of the off-line handwriting signal. Starting from an over-segmentation of the off-line handwriting into regular and singular parts, the time ordering of these parts and recognition of the word are performed simultaneously. This approach, termed "OrdRec", is based on a graph description of the handwriting signal and a recognition process using Hidden Markov Models (HMM). A complete omni-scriptor isolated word recognition system has been developed. Using a dynamic lexicon and models for upper and lower case characters, our system can process binary and gray value word images of any writing style (script, cursive or mixed).
Using a dual handwriting data base which features both the on-line and the off-line signal for each of the 30000 words written by about 700 scriptors, we have shown experimentally that such an off-line recognition system, using the recovered time order information, can achieve recognition performances close to those of an on-line recognition system.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 303-312.

Download PostScript
Download PDF


Full Paper #30

A FAST LEXICALLY CONSTRAINED VITERBI ALGORITHM FOR ON-LINE HANDWRITING RECOGNITION

Alain LIFCHITZ

Laboratoire d'Informatique de Paris 6,
Université P6 & CNRS (UMR 7606), Case 169,
4, place Jussieu F-75252 Paris Cedex 05, France
E-mail: alain.lifchitz@lip6.fr

Frederic MAIRE

School of Computing Science,
Queensland University of Technology,
2 George Street, GPO Box 2434 Brisbane,
Q4001 Australia
E-mail: f.maire@qut.edu.au

Most on-line cursive handwriting recognition systems use a lexical constraint to help improve the recognition performance. Traditionally, the vocabulary lexicon is stored in a trie (automaton whose underlying graph is a tree). In this paper, we propose a solution based on a more compact data structure, the directed acyclic word graph (DAWG). We show that our solution is equivalent to the traditional system. Moreover, we propose a number of heuristics to reduce the size of the DAWG and present experimental results demonstrating a significant improvement.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 313-322.

Download PostScript
Download PDF


Full Paper #31

IMPROVEMENT IN HANDWRITTEN NUMERAL STRING RECOGNITION BY SLANT NORMALIZATION AND CONTEXTUAL INFORMATION

Alceu de S. BRITTO JR.

Pontifícia Universidade Católica do Paraná (PUC-PR), R. Imaculada Conceição, 1155 Curitiba (PR) 80215-901 - Brazil
Universidade Estadual de Ponta Grossa (UEPG), Praça Santos Andrade S/N, Centro, Ponta Grossa (PR) 84100-000 - Brazil
E-mail: alceu@cenparmi.concordia.ca

Robert SABOURIN

École de Technologie Supérieure (ETS), 1100 Rue Notre Dame Ouest Montreal (QC) H3C 1K3 - Canada
Centre for Pattern Recognition and Machine Intelligence (CENPARMI), 1455 de Maisonneuve Blvd. West, Suite GM 606 - Montreal (QC) H3G 1M8 - Canada
E-mail: sabourin@gpa.etsmtl.ca

Edouard LETHELIER and Flavio BORTOLOZZI

Pontifícia Universidade Católica do Paraná (PUC-PR), R. Imaculada Conceição, 1155 Curitiba (PR) 80215-901 - Brazil
E-mail: {edouard, fborto}@ppgia.pucpr.br

Ching Y. SUEN

Centre for Pattern Recognition and Machine Intelligence (CENPARMI), 1455 de Maisonneuve Blvd. West, Suite GM 606 - Montreal (QC) H3G 1M8 - Canada
E-mail: suen@cenparmi.concordia.ca

This work describes a way of enhancing handwritten numeral string recognition by considering slant normalization and contextual information to train an implicit segmentation-based system. A word slant normalization method is modified in order to improve the results for handwritten numeral strings. We assume that each connected component (CC) in the string has its own slant. The slant and contour length of each CC are used for obtaining the mean slant of the string. Both the original and modified methods are evaluated by means of some interesting analyses on the NIST SD19 database. These analyses show (a) the positive impact of slant correction on the number of overlapping numerals in strings, and (b) the difference in normalizing isolated numerals based on the slant estimated from their own images and the slant estimated from their original string images. Slant normalization and contextual information regarding string slant and digit size variations within the string are used to train numeral HMMs. Preliminary string recognition results, produced by a system under construction, are shown.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 323-332.

Download PostScript
Download PDF


Full Paper #32

MULTIPLE FEATURE INTEGRATION FOR WRITER VERIFICATION

Sung-Hyuk CHA and Sargur N. SRIHARI

Center of Excellence for Document Analysis and Recognition
State University of New York at Buffalo
Buffalo, NY, 14260
{scha,srihari}@cedar.buffalo.edu

Given two handwritten documents, the writer verification problem is to determine whether the two documents were written by the same person. It is tackled by extracting various features and classifying the patterns into their classes. Features are diverse in type, while techniques in pattern recognition typically require that features be homogeneous. The proposed solution overcomes both the non-homogeneity of features and the intractability of an infinite number of writers by a dichotomy transformation. In this model, the distance between each homogeneous feature type is used. We integrate several distance measures for many feature types (element, histogram, string, convex hull, etc.) into a single measure useful for writer verification. Experiments with 1,000 writers and three sample documents per writer, using only 12 feature distances, yield 97% accuracy.
Keywords: Dichotomizer, Multiple Feature Integration, Writer Verification

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 333-342.

Download PostScript
Download PDF


Full Paper #33

OFF-LINE HANDWRITING RECOGNITION USING VARIOUS HYBRID MODELING TECHNIQUES AND CHARACTER N-GRAMS

A. BRAKENSIEK, J. ROTTLAND, A. KOSMALA, G. RIGOLL

Dept. of Computer Science, Faculty of Electrical Engineering
Gerhard-Mercator-University Duisburg
47057 Duisburg, Germany
E-mail: {anja, rottland, kosmala, rigoll}@fb9-ti.uni-duisburg.de

In this paper a system for on-line cursive handwriting recognition is described. The system is based on Hidden Markov Models (HMMs) using discrete and hybrid modeling techniques. Here, we focus on two aspects of the recognition system. First, we present different hybrid modeling techniques, where one depends on an information theory-based neural network (MMI-criterion) used as a vector quantizer and the other uses a neural net for estimating the a posteriori probabilities to replace the codebook of a tied-mixture HMM system. This is the first paper where we present this novel approach, called tied posteriors, for handwriting recognition. Second, we demonstrate the usage of a language model that consists of character n-grams as an alternative to recognition with a large dictionary of German words. Our resulting system for character recognition yields significantly better recognition results using an unlimited vocabulary.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 343-352.

Download PostScript
Download PDF


Full Paper #34

NEURAL NETWORK-BASED CONTEXT DRIVEN RECOGNITION OF ON-LINE CURSIVE SCRIPT

Predrag NESKOVIC and Leon N. COOPER

Physics Department and Institute for Brain and Neural Systems
Brown University, Providence, RI 02912, USA
email: pedja@cns.brown.edu and Leon_Cooper@Brown.edu

Most of the state-of-the-art systems for cursive script recognition are based on a combination of neural networks (NN) and hidden Markov models (HMMs) [1,2]. The post-processing stage is almost exclusively modeled using HMMs and the dynamic programming (DP) technique (the Viterbi algorithm) is used to efficiently search the space of possible segmentations. In this work we introduce a neural network-based model for representing handwritten patterns as an alternative to HMMs. In addition, we present a new algorithm that uses context information to segment, modify and organize bottom up information in order to achieve successful recognition.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 353-362.

Download PostScript
Download PDF


Full Paper #35

ENHANCING CURSIVE WORD RECOGNITION PERFORMANCE BY THE INTEGRATION OF ALL THE AVAILABLE INFORMATION

C. SCAGLIOLA, G. NICCHIOTTI, F. CAMASTRA

Elsag spa, Via Puccini, 2 - 16154 Genova, ITALY
E-mail: {carlo.scagliola,gianluca.nicchiotti,francesco.camastra}@elsag.it

Segmentation-by-recognition is a successful approach for recognizing cursively handwritten words. Its main strength is that the interdependence of strokes forming a letter is correctly taken into account by the use of a character recognizer that evaluates an aggregate of strokes (character hypothesis) as a whole. However, a straightforward implementation of such an approach would fail to take into account the dependencies of each character hypothesis on the adjacent hypotheses and on global characteristics of the image, like the position of upperline and baseline, the average dimensions of strokes, etc. This paper describes a cursive handwritten word recognition system in which recognition performance is enhanced by the use of several complementary sources of information: the relationships of the strokes that make up a hypothesis among themselves and with the preceding strokes, the position of the hypothesis with respect to baseline and upperline, the statistics of the number of strokes making up letters belonging to different classes, the dispersion of character data around the different code vectors used to measure distances, and the plausibility of a hypothesis being a spurious stroke (extra ink). Experimental results are presented, highlighting the contribution of each source of information to the overall performance.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 363-372.

Download PostScript
Download PDF


Full Paper #36

WORD LEXICON REDUCTION BY CHARACTER SPOTTING

Didier GUILLEVIC, Daisuke NISHIWAKI, Keiji YAMADA

Computer & Communication Media Research
NEC Corporation
Kawasaki 216-8555, Japan

We describe a system, currently under development, to dynamically reduce a lexicon of city names, making use exclusively of the information found in a word image. Isolated characters are 'spotted' within the word. The recognition results on those isolated characters are then used to initialize a Hidden Markov Model (HMM) like module to dynamically reduce the lexicon.
Keywords: Word Lexicon Reduction, Character Spotting, Finite Automata, Hidden Markov Models (HMM), Multi Layer Perceptron (MLP).

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 373-382.

Download PostScript
Download PDF


Full Paper #37

QUANTIFYING THE CONTRIBUTION OF LANGUAGE MODELING TO WRITER-INDEPENDENT ON-LINE HANDWRITING RECOGNITION

John F. PITRELLI and Eugene H. RATZLAFF

Pen Technologies Group, IBM T. J. Watson Research Center
P. O. Box 218, Yorktown Heights, NY 10598, U.S.A.
E-mail: {pitrelli,ratzlaff}@us.ibm.com

We describe experiments varying the degree of language-model constraint applied to writer-independent on-line handwriting recognition. Six types of models are used, varying statistical components and hard constraints which govern recognition search during the sequencing of characters to form valid texts. Experiments on constrained texts, such as dates and phone numbers, show that although tighter language models cause more inputs to be out-of-domain, they can still eliminate up to 50% of string errors and 75% of character errors compared to using a null language model.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 383-392.

Download PostScript
Download PDF


Full Paper #38

WORD LEVEL DISCRIMINATIVE TRAINING
FOR HANDWRITTEN WORD RECOGNITION

Wen-Tsong CHEN

Department of Electrical Engineering
University of Missouri -- Columbia
239 Engineering Building West
Columbia, Missouri 65211
E-mail: wchen@ece.missouri.edu

Paul GADER

Department of Computer Engineering and Computer Science
University of Missouri -- Columbia
201 Engineering Building West
Columbia, Missouri 65211
E-mail: gader@cecs.missouri.edu

Word level training refers to the process of learning the parameters of a word recognition system based on word level criteria functions. Previously, researchers trained lexicon-driven handwritten word recognition systems at the character level individually. These systems generally use statistical or neural based character recognizers to produce character level confidence scores. In the case of neural networks, the objective functions used in training involve minimizing the difference between some desired outputs and the actual outputs of the network. Desired outputs are generally not directly tied to word recognition performance. In this paper, we describe methods to optimize the parameters of these networks using word level optimization criteria. Experimental results show that word level discriminative training without desired outputs not only outperforms character level training but also eliminates the difficulty of choosing desired outputs. The method can also be applied to all segmentation based handwritten word recognition systems.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 393-402.

Download PostScript
Download PDF


Full Paper #39

ACTIVE HANDWRITTEN WORD RECOGNITION

Jaehwa PARK and Venu GOVINDARAJU

Center of Excellence for Document Analysis and Recognition
Department of Computer Science and Engineering
State University of New York at Buffalo
Amherst, NY 14260, USA
E-mail: {jaehwap,govind}@cedar.buffalo.edu

An active word recognition paradigm using recursive recognition processing is proposed. To achieve a successful recognition result with the minimum required processing effort, a recursive system architecture that actively combines a recognition engine and a decision making module is introduced. In the proposed model, a closed loop connection between recognizer and decision maker operates recursively with successive upgrades of recognition accuracy. The recursion eventually reaches either a satisfactory terminal condition or a rejection state in which all the resources have been exhausted. The proposed model is implemented in a segmentation based lexicon driven word recognition application and experiments show enhanced recognition results.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 403-412.

Download PostScript
Download PDF


Full Paper #40

HANDWRITTEN TEXT RECOGNITION USING A MULTIPLE-AGENT ARCHITECTURE TO ADAPT THE RECOGNITION TASK

L. HEUTTE, T. PAQUET, A. NOSARY AND C. HERNOUX

Laboratoire PSI, Université de Rouen,
F-76821 Mont-Saint-Aignan Cedex, France.
E-mail: Laurent.Heutte@univ-rouen.fr

This communication investigates the automatic reading of unconstrained omni-writer handwritten texts. It shows how to endow the reading system with the learning faculties necessary to adapt the recognition to each writer's handwriting. In the first part of this communication, we explain how the recognition system can be adapted to a current handwriting by exploiting the graphical context defined by the writer's invariants. This adaptation is guaranteed by activating interaction links over the whole text between the recognition procedures of word entities and those of letter entities. In the second part, we justify the need for an open multiple-agent architecture to support the implementation of such a principle of adaptation. The proposed platform allows expert treatments dedicated to handwriting analysis to be plugged in. We show that this platform helps to implement specific collaboration or cooperation schemes between agents which bring out new trends in the automatic reading of handwritten texts.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 413-422.

Download PostScript
Download PDF


Full Paper #41

TWO-STAGE CHARACTER CLASSIFICATION: A COMBINED APPROACH OF CLUSTERING AND SUPPORT VECTOR CLASSIFIERS

Louis VUURPIJL and Lambert SCHOMAKER

vuurpijl@nici.kun.nl, schomaker@computer.org

This paper describes a two-stage classification method for (1) classification of isolated characters and (2) verification of the classification result. Character prototypes are generated using hierarchical clustering. For those prototypes known to sometimes produce wrong classification results, a "support vector classifier" (svc) is trained. The svc can be used to increase the confidence that a classification is correct and furthermore decide on a classification if the confidence using the standard method is too low. Experiments with the iUF UNIPEN database yield 94% recognition rate. In cases where both classifiers agree, the error rate is zero.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 423-432.

Download PostScript
Download PDF


Full Paper #42

HYBRID SCHEMES OF HOMOGENEOUS AND HETEROGENEOUS CLASSIFIERS FOR CURSIVE WORD RECOGNITION

Jin Ho KIM1,2, Kye Kyung KIM1 and Ching Y. SUEN1

1CENPARMI, Concordia University, Montreal, Quebec, Canada
E-mail: {kjinho, kkkim, suen}@cenparmi.concordia.ca

2Department of Electronic Engineering, Kyungil University, Kyungsan, Kyungpook, Korea
E-mail: kjinho@bear.kyungil.ac.kr

Sophisticated hybrid schemes of the homogeneous and heterogeneous classifiers for cursive word recognition are presented. Two homogeneous MLPs (multi-layer perceptrons) are combined into a new single powerful classifier at the architectural level, and HMM (hidden Markov model) is added to the new classifier as a heterogeneous one at the output level. This is based on the idea that classifiers with more different methodologies and different features can better complement each other. The presented scheme achieves a recognition rate of 92.7% for English legal words of a CENPARMI database, a performance which is better than several previous hybrid schemes reported in the literature.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 433-442.

Download PostScript
Download PDF


Full Paper #43

VARIANTS OF THE BORDA COUNT METHOD FOR COMBINING RANKED CLASSIFIER HYPOTHESES

Merijn van ERP and Lambert SCHOMAKER

NICI, P.O.Box 9104,
6500 HE Nijmegen,
The Netherlands
{M.vanErp, schomaker}@nici.kun.nl

The Borda count is a simple yet effective method of combining rankings. In pattern recognition, classifiers are often able to return a ranked set of results. Several experiments have been conducted to test the ability of the Borda count and two variant methods to combine these ranked classifier results. By using artificial data, domain-specific results were avoided. The results show the strength of the Borda count when many errors occur in the results, but also show its weakness in case of a limited number of large ranking errors.
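
As an illustration of the basic method named in this abstract, the following is a minimal sketch of the standard Borda count applied to ranked classifier outputs. The function name `borda_count` and the sample labels are hypothetical, and the two variant methods studied in the paper are not reproduced here because the abstract does not specify them.

```python
from collections import defaultdict


def borda_count(rankings):
    """Combine ranked label lists with the standard Borda count.

    Each ranking lists class labels from most to least likely.
    A label at position p in a ranking of length n earns n - 1 - p points.
    (Illustrative helper, not taken from the paper.)
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, label in enumerate(ranking):
            scores[label] += n - 1 - position
    # Highest total first; ties are broken by label for determinism.
    return sorted(scores, key=lambda label: (-scores[label], label))


# Three hypothetical classifiers ranking the same three classes.
rankings = [["a", "b", "c"], ["b", "a", "c"], ["a", "c", "b"]]
combined = borda_count(rankings)
```

Here class "a" collects 5 points, "b" 3 and "c" 1, so the combined ranking puts "a" first even though one classifier ranked "b" on top.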

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 443-452.

Download PostScript
Download PDF


Full Paper #44

CASCADING MULTIPLE CLASSIFIERS AND REPRESENTATIONS FOR OPTICAL AND PEN-BASED HANDWRITTEN DIGIT RECOGNITION

E. ALPAYDIN, C. KAYNAK, F. ALIMOGLU

Department of Computer Engineering
Bogaziçi University
TR-80815 Istanbul, Turkey
E-mail: alpaydin@boun.edu.tr

We discuss a multistage method, cascading, where there is a sequence of classifiers ordered in terms of complexity (of the classifier or the representation) and specificity, in that early classifiers are simple and general and later ones are more complex and local. For building portable, low-cost handwriting recognizers, memory and computational requirements are as critical as accuracy, and our proposed method, cascading, is a way to gain from having multiple classifiers without losing much in cost. Simulation results on optical and pen-based handwritten digit recognition indicate that when compared with voting, mixture of experts and stacking, our proposed method, cascading, does stand out as the most realistic combination scheme.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 453-462.

Download PostScript
Download PDF


Short Papers


Short Paper #1

UNCONSTRAINED HANDWRITING RECOGNITION: LANGUAGE MODELS, PERPLEXITY, AND SYSTEM PERFORMANCE

U.-V. MARTI and H. BUNKE

Institut für Informatik und angewandte Mathematik
Universität Bern, Neubrückstrasse 10, CH-3012 Bern
Switzerland
email:{marti,bunke}@iam.unibe.ch

In this paper we present a number of language models and their behavior in the recognition of unconstrained handwritten English sentences. We use the perplexity to compare the different models and their prediction power, and relate it to the performance of a recognition system under different language models. In the recognition experiments a system with the classical architecture of preprocessing, feature extraction and recognition by means of Hidden Markov Models is used. In the recognition phase the language model constrains the possible next words.
Keywords: handwriting recognition, unconstrained English sentence recognition, unigram probability, bigram probability, perplexity.
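
For readers unfamiliar with the measure, perplexity can be sketched as below, using the usual textbook definition: the exponential of the negative mean log-probability that a model assigns to each word of a test text. The function name `perplexity` and the probabilities are hypothetical, and the abstract does not state which exact formulation the authors use.

```python
import math


def perplexity(word_probs):
    """Perplexity of a text, given the model probability of each word.

    word_probs[i] is the probability the language model assigned to the
    i-th word (for example, a unigram or bigram probability).
    (Illustrative helper, not taken from the paper.)
    """
    n = len(word_probs)
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / n)


# A model that always spreads its mass uniformly over four candidate words.
uniform_pp = perplexity([0.25, 0.25, 0.25, 0.25])
```

A uniform choice among four words gives a perplexity of about 4, which matches the intuition that lower perplexity means fewer plausible next words and therefore a tighter constraint on recognition.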

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 463-468.

Download PostScript
Download PDF


Short Paper #2

ADAPTIVE CONTEXT PROCESSING IN ON-LINE HANDWRITTEN CHARACTER RECOGNITION

Naomi IWAYAMA and Kazushi ISHIGAKI

Fujitsu Laboratories Ltd., 64 Nishiwaki, Okubo-cho, Akashi, Hyogo 674-8555, Japan
E-mail: naomi@flab.fujitsu.co.jp, ishigaki@flab.fujitsu.co.jp

We propose a new approach to context processing in on-line handwritten character recognition (OLCR). Based on the observation that writers often repeat the strings that they input, we take the approach of adaptive context processing (ACP). In ACP, the strings input by a writer are automatically added to a dictionary designated for ACP. This dictionary can thereby provide good coverage of the strings a writer inputs. Furthermore, the dictionary is compact enough to be loaded on a small terminal. In our experiments, the first-hit rate of OLCR with ACP was 95.44% after all the strings to be input had been added to the ACP dictionary, while that without ACP was 86.09%.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 469-474.

Download PostScript
Download PDF


Short Paper #3

AN INVESTIGATION INTO THE USE OF LINGUISTIC CONTEXT IN AUTOMATIC CURSIVE SCRIPT RECOGNITION

N.H. BRAMMALL, J.H. CONNOLLY and C.J. HINDE

Department of Computer Studies, Loughborough University, UK

The highly ambiguous nature of cursive writing, with high variability not only between different writers but also between different samples from the same writer, means that automatic recognition systems based on purely visual information are prone to errors. It is suggested that the application of linguistic knowledge to the recognition task may improve recognition accuracy. There are many forms of linguistic knowledge that may be used to this end. This paper looks specifically at the use of collocation as a source of linguistic knowledge. Collocation describes the statistical tendency of certain words to co-occur in a language, within a defined range. The construction and use of a post-processing system incorporating collocational knowledge is described, as are a number of experiments to test the effectiveness of collocation as an aid to text recognition.
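
As a rough illustration of the raw statistic behind collocation, the sketch below counts ordered word pairs that occur within a fixed window of each other. The window size and the function name are assumptions made here; the abstract does not say which association measure or range the authors use.

```python
from collections import Counter


def collocation_counts(tokens, window=2):
    """Count ordered pairs (word, later_word) at most `window` positions apart."""
    counts = Counter()
    for i, word in enumerate(tokens):
        for other in tokens[i + 1:i + 1 + window]:
            counts[(word, other)] += 1
    return counts
```

Counts of this kind, gathered from a large corpus, could then be used by a post-processor to prefer candidate words that tend to co-occur with their neighbours.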

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 475-480.

Download PostScript
Download PDF


Short Paper #4

A METHOD FOR HANDWRITING INPUT AND CORRECTION ON SMARTPHONES

Gareth LOUDON, Olle PELLIJEFF, Li ZHONG-WEI

Cyberlab Singapore, Ericsson Research,
#18-00 SLF Building, 510 Thomson Road, Singapore 298135
gareth.loudon@ericsson.com,
olle.pellijeff@ericsson.com,
zhong-wei.li@ericsson.com

This paper describes a new approach for entering text on a "Smartphone" via natural handwriting input. The approach focuses on ease of use within the confines of a Smartphone's display size and processing limitations, and therefore has two integrated components. The first is a new handwriting recognition engine designed to have very high recognition accuracy (98.3% character accuracy), to support sentence-based handwriting input, and to have a small memory footprint (84 kb) and fast processing time. The second is a method that allows very simple editing and correction of recognized characters.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 481-486.

Download PostScript
Download PDF


Short Paper #5

ANALYTIC WORD RECOGNITION WITHOUT SEGMENTATION BASED ON MARKOV RANDOM FIELDS

Christophe CHOISY and Abdel BELAID

LORIA/CNRS
Campus scientifique, BP 239,
54506 Vandoeuvre-les-Nancy cedex, France
Christophe.Choisy@loria.fr,
Abdel.Belaid@loria.fr

In this paper, a method for analytic handwritten word recognition based on causal Markov random fields is described. The word models are HMMs in which each state corresponds to a letter; each letter is modelled by an NSHP-HMM (Markov field). Global models are built dynamically and used for recognition and learning with the Baum-Welch algorithm. Learning of letter and word models is done using the parameters re-estimated on the generated global models. No segmentation is necessary: the system itself determines the best limits between the letters during learning. First experiments on a real database of French check amount words give an encouraging recognition rate of 83.4%. Keywords: HMM, NSHP-HMM, Cross-learning, Meta-models, Baum-Welch Algorithm.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 487-492.

Download PostScript
Download PDF


Short Paper #6

OFF-LINE CURSIVE SCRIPT RECOGNITION BASED ON CONTINUOUS DENSITY HMM

A.VINCIARELLI and J.LUETTIN

IDIAP - Institut Dalle Molle d'Intelligence Artificielle Perceptive
Rue du Simplon 4, CP592 - 1920 Martigny, Switzerland
{vincia,luettin}@idiap.ch

A system for off-line cursive script recognition is presented. A new normalization technique (based on statistical methods) to compensate for the variability of writing style is described. The key problem of segmentation is avoided by applying a sliding window on the handwritten words. A feature vector is extracted from each frame isolated by the window. The feature vectors are used as observations in letter-oriented continuous density HMMs that perform the recognition. Feature extraction and modeling techniques are illustrated. In order to allow the comparison of the results, the system has been trained and tested using the same data and experimental conditions as in other published works. Performances comparable to those of more complex systems have been achieved.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 493-498.

Download PostScript
Download PDF


Short Paper #7

A SIMPLE AND EFFECTIVE CURSIVE WORD SEGMENTATION METHOD

G.NICCHIOTTI, C.SCAGLIOLA

Elsag spa, Via Puccini 2 - 16154 Genova - ITALY
E-mail: gianluca.nicchiotti@elsag.it, carlo.scagliola@elsag.it

and S. RIMASSA

Polo Nazionale Bioelettronica, Via Roma 28, 57030 Marciana (LI) - ITALY
E-mail: simone.rimassa@mailcity.com

A simple procedure for cursive word oversegmentation is presented, which is based on the analysis of the handwritten profiles and on the extraction of "white holes". It follows the policy of using simple rules on complex data and sophisticated rules on simpler data. Experimental results show robustness and performance comparable with the best reported in the literature.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 499-504.

Download PostScript
Download PDF


Short Paper #8

THE SEGMENTATION OF A TEXT LINE FOR A HANDWRITTEN UNCONSTRAINED DOCUMENT USING THINING ALGORITHM

Shinji TSURUOKA, Yusuke ADACHI and Tomohiro YOSHIKAWA

Dept. of Electrical and Electronic Engineering
Faculty of Engineering,
Mie University, 1515 Kamihama, Tsu, Mie 514-8507, Japan
E-mail: tsuruoka@elec.mie-u-ac.jp

For printed documents, projection analysis of black pixels is widely used to segment text lines. For handwritten documents, however, we consider projection analysis inappropriate, because on unruled paper the border separating two text lines is not a straight line. We therefore extract a curved separating border line. In this paper, we propose a new method for segmenting text lines in a handwritten document image using a thinning algorithm. In most documents, text lines are separated by a background region so that the reader can detect them easily. From this point of view, we apply a new thinning algorithm to the background region to detect the separating border lines. In the thinned result, chains that are useless for the border line are eliminated by gradual conditions. To confirm the usefulness of this method, we applied it to 475 text lines in 22 handwritten documents as test images and obtained an accuracy of 90.0%.
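
For context, the projection analysis that the abstract describes as the usual approach for printed documents can be sketched as below. This is the straight-line baseline, not the authors' thinning-based method; the list-of-rows image format and the `segment_lines` name are assumptions made for this example.

```python
def segment_lines(image):
    """Split a binary image (list of rows of 0/1) into text lines.

    Uses the horizontal projection profile: a text line is a maximal run of
    rows that contain at least one black pixel.
    Returns a list of (first_row, last_row) pairs, both inclusive.
    """
    profile = [sum(row) for row in image]  # black pixels per row
    lines = []
    start = None
    for y, ink in enumerate(profile):
        if ink > 0 and start is None:
            start = y                      # a text line begins
        elif ink == 0 and start is not None:
            lines.append((start, y - 1))   # a blank row ends the line
            start = None
    if start is not None:
        lines.append((start, len(profile) - 1))
    return lines
```

Because the cut is always a straight horizontal row, two handwritten lines that drift or overlap vertically would be merged into one run, which is the limitation the abstract points to.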

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 505-510.

Download PostScript
Download PDF


Short Paper #9

AN APPROACH FOR ACTIVE SEGMENTATION OF UNCONSTRAINED HANDWRITTEN KOREAN STRINGS USING RUN-LENGTH CODE

Jason JeongSuk YOON and Gyeonghwan KIM

Dept. of Electronic Engineering, Sogang University
CPO Box 1142, Seoul 100-611 Korea
E-mail: jasonyoon@ieee.org, gkim@ccs.sogang.ac.kr

We propose an active handwritten Hangul segmentation method. A manageable structure based on run-length code is defined for use in preprocessing and segmentation. Three fundamental candidate estimation functions are also introduced to detect clues to touching points, and touching types are classified according to the structural peculiarities of Hangul. Our experiments show a segmentation performance of 88.2% on touching characters with minimal over-segmentation.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 511-516.

Download PostScript
Download PDF


Short Paper #10

A NEW METHOD FOR AUTOMATIC HANDWRITING QUALITY MEASUREMENT

Jeong-Seon Park, Seong-Whan Lee

Center for Artificial Vision Research,
Department of Computer Science and Engineering, Korea University
Anam-dong, Seongbuk-ku, Seoul 136-701, Korea
E-mail: {jspark, swlee}@image.korea.ac.kr

With a surge of interest in OCR in the 1990s, a large number of handwriting and hand-printing databases have been built one after another around the world. One problem that researchers encounter today is that the databases differ in various ways, including script quality. This paper proposes a method for measuring handwriting quality that can be used for comparing databases and for objective testing of character recognizers. The key idea is to classify character samples into a number of groups, each characterizing a set of qualities. To evaluate the proposed method, we carried out experiments on the KU-1 database. The result we achieve is meaningful and the method is helpful for the target tasks.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 517-522.

Download PostScript
Download PDF


Short Paper #11

GLOBAL AND STRICT CURVE FITTING METHOD

Y. NAKAJIMA and S.MORI

Department of Computer Software, The University of Aizu,
Tsuruga, Ikki-machi, Aizu-Wakamatsu City 965-8580, Japan
E-mail: {nakajima, s-mori}@u-aizu.ac.jp

To obtain a global and smooth curve fit, the cubic B-spline method and the gathering-line method are investigated. Segmenting and recognizing the contour curve of a character shape requires a global method. To connect contour curves around a singular point such as a crossing point, separated contours lying on either side of the singular point must be merged. For this purpose, the cubic B-spline method and a new gathering-line method are investigated and proposed. The result is that the cubic B-spline bends too easily: because it bends easily it covers singular points smoothly, and so is not suited to detecting them. The gathering-line method represents the contour by overlapping line segments; by overlapping lines, arcs can be represented in a natural way. Some investigations and experimental results are shown. Keywords: line fitting, segmentation, character recognition, cubic B-spline, gathering-line method, solving singular point.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 523-528.

Download PostScript
Download PDF


Short Paper #12

A SPECIAL SKELETONIZATION ALGORITHM FOR CURSIVE WORDS

Tal STEINHERZ, Nathan INTRATOR

Tel-Aviv University
Ramat Aviv 69978, Israel
(talstz,nin)@math.tau.ac.il

and Ehud RIVLIN

Department of Computer Science
Technion
Haifa 32000, Israel
ehudr@cs.technion.ac.il

We present a novel approach for finding a pseudo-skeleton of a cursive word's image. This pseudo-skeleton preserves all the necessary components of a cursive word, such as loops, curves, junctions and end-points. It is expected to be useful for cursive word recognition. Keywords: Skeleton, Thinning, Cursive Words.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 529-534.

Download PostScript
Download PDF


Short Paper #13

A FRACTAL JUSTIFICATION OF THE NORMALIZATION STEP FOR ONLINE HANDWRITING RECOGNITION

N. VINCENT

LI/E3i - Université de Tours - 64, av. J. Portalis - 37200 Tours - France
E-mail: vincent@univ-tours.fr

B. DORIZZI

INT - Dept. EPH - 9, rue Charles Fourier - 91011 Evry - France
E-mail: Bernadette.Dorizzi@int-evry.fr

This paper presents an example of the use of fractal approaches in online handwriting processing. The box counting method is adapted to compute the fractal dimension of online handwriting, and the influence of different parameters is studied. This explains why the value of the proposed parameter is invariant to the tablet, the writing speed and the writing size. A study of the transforms chosen in the REMUS software shows that they match quite well the quantitative results obtained with fractal methods. A posteriori, this gives a theoretical confirmation of the value of the methods involved.
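
The generic box counting estimate can be sketched as follows, treating a pen trajectory as a set of (x, y) points. This is the textbook method under simple assumptions (square boxes, a least-squares slope of log N against log 1/s); the function names are illustrative and the authors' adaptation to online handwriting is not reproduced here.

```python
import math


def box_count(points, size):
    """Number of distinct size-by-size grid boxes that contain a point."""
    return len({(math.floor(x / size), math.floor(y / size)) for x, y in points})


def box_counting_dimension(points, sizes):
    """Least-squares slope of log(box count) against log(1 / box size)."""
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(points, s)) for s in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator
```

For a densely sampled straight stroke the estimate comes out close to 1, while more convoluted traces that fill the plane more fully give larger values.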

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 535-540.

Download PostScript
Download PDF


Short Paper #14

SIMILARITY MEASURES FOR WRITER CLUSTERING

Jayashree SUBRAHMONIA

IBM T.J. Watson Research, P.O. Box 218 / Route 134,
Yorktown Heights, NY 10598, U. S. A.
E-mail: jays@watson.ibm.com

This paper addresses the problem of improving the performance of an online, writer-independent, large-vocabulary, unconstrained handwriting recognition system by clustering writers with similar writing styles. Recognition performance is enhanced by identifying the writer cluster that a test writer is closest to and using a model trained for the corresponding writer cluster in decoding. The recognition system is based on hidden Markov models. A common set of features is computed for all writers and then projected to a lower dimensional space that preserves most of the information in the original feature set. The reduced dimensional space varies from writer to writer. This paper describes two measures of similarity between writing styles. The first is based on the distance between the writer-dependent reduced dimensional feature subspaces. The second is based on the hidden Markov model output probabilities.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 541-546.

Download PostScript
Download PDF


Short Paper #15

NOVEL RULE-BASED STATIC AND DYNAMIC FEATURE EXTRACTION FROM FIGURE COPYING TASKS FOR THE DETECTION OF VISUO-SPATIAL NEGLECT

R.M. GUEST, M.C. FAIRHURST and J.M. POTTER*

Electronic Engineering Laboratory, University of Kent, Canterbury, Kent, UK
*Nunnery Fields Hospital, Canterbury, Kent, UK

A series of static rule-based assessment criteria and dynamic constructional features are defined and used to analyse the hand-drawn responses from a geometric figure copying task. Assessment subjectivity is removed by the algorithmic definition of analysis criteria and test diagnostic sensitivity to the condition of visuo-spatial neglect is increased through the analysis of the novel dynamic features. This sensitivity increase is demonstrated by the identification of constructional performance deficits in test responses which appear 'normal' by conventional static assessment. The investigation is carried out with a population of stroke patients.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 547-552.

Download PostScript
Download PDF


Short Paper #16

A PERTURBATION-BASED APPROACH FOR MULTI-CLASSIFIER SYSTEM DESIGN

V. DI LECCE (1), G. DIMAURO (2), A. GUERRIERO (1), S. IMPEDOVO (2), G. PIRLO (2), A. SALZO (2)

(1) Dipartimento di Ing. Elettronica - Politecnico di Bari - Via Re David - 70126 Bari - Italy
(2) Dipartimento di Informatica - Università di Bari - Via Orabona, 4 - 70126 Bari - Italy

This paper presents a perturbation-based approach useful to select the best combination method for a multi-classifier system. The basic idea is to simulate small variations in the performance of the set of classifiers and to evaluate to what extent they influence the performance of the combined classifier. In the experimental phase, the Behavioural Knowledge Space and the Dempster-Shafer combination methods have been considered. The experimental results, carried out in the field of hand-written numeral recognition, demonstrate the effectiveness of the new approach.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 553-558.

Download PostScript
Download PDF


Short Paper #17

WORD-MATCHING METHOD BASED ON THE PROJECTION OF THE VOTING MATRIX

T. AKAGI, T. HAMAMURA, H. MIZUTANI, B. IRIE

TOSHIBA Corp.,
70 Yanagi-cho, Saiwai-ku, Kawasaki-shi,
Kanagawa 212-8501, Japan
E-mail: takuma.akagi@toshiba.co.jp

A new word-matching method is proposed that is able to achieve matching speedily and correctly even in the presence of noise at the beginning, at the end or within the word. The method can use discriminant functions to choose the best-fitting word from the database. These functions take into consideration factors such as the character recognition rate and the word segmentation rate. In addition, the method can determine the difference in noise between the word in the database (reference word) and the word extracted from the character line and recognized (character-recognition result).

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 559-564.

Download PostScript
Download PDF


Short Paper #18

ENROLMENT MODEL STABILITY IN STATIC SIGNATURE VERIFICATION

C. ALLGROVE and M.C. FAIRHURST

Electronic Engineering Laboratory,
University of Kent, Canterbury
Kent, CT2 7NT, United Kingdom

The stability of enrolment models used in a static verification system is assessed, in order to provide an enhanced characterisation of signatures through the validation of the enrolment process. A number of static features are used to illustrate the effect of the variation in enrolment model size on the stability of the representation of signatures.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 565-570.

Download PostScript
Download PDF


Short Paper #19

RAPID ANALYTICAL VERIFICATION OF HANDWRITTEN ALPHANUMERIC ADDRESS FIELDS

C. K. LEE and G. LEEDHAM

School of Computer Engineering,
Nanyang Technological University, Singapore 639798
Email: travck@yahoo.com, asgleedham@ntu.edu.sg

This paper presents a combination of fuzzy system and dynamic analytical model to deal with imprecise data derived from feature extraction in handwritten address images which are compared against postulated addresses for address verification. A dynamic building­number locator is able to locate and recognise the building­number, without knowing exactly where the building­number starts in the candidate address line. The overall system achieved a correct sorting rate of 72.9%, 27.1% rejection rate and 0.0% error rate on a blind test set of 450 cursive handwritten addresses.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 571-576.

Download PostScript
Download PDF


Short Paper #20

A NEW APPROACH TO SEGMENT HANDWRITTEN DIGITS

L. S. Oliveira (1), E. Lethelier (1), F. Bortolozzi (1), and R. Sabourin (2)

(1) Pontifícia Universidade Católica do Paraná - LARDOC, Rua Imaculada Conceiçao 1155, 80215-901 - Curitiba, PR - BRAZIL
E-mail: {soares,edouard,fborto}@ppgia.pucpr.br

(2) Ecole de Technologie Superieure, 1100, rue Notre Dame Ouest, Montreal (Quebec) H3C 1K3, CANADA
E-mail: sabourin@gpa.etsmtl.ca

This article presents a new segmentation approach applied to unconstrained handwritten digits. The novelty of the proposed algorithm lies in the combination of two types of structural features in order to provide the best segmentation path between connected entities. In this article, we first present the features used to generate our basic segmentation points. Then, we define our segmentation paths depending on the encountered configurations with only a few heuristic rules. Finally, we evaluate the output of our segmentation by using a combination of classifiers.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 577-582.

Download PostScript
Download PDF


Short Paper #21

ZONING DESIGN FOR HAND-WRITTEN NUMERAL RECOGNITION

V. DI LECCE (1), G. DIMAURO (2), A. GUERRIERO (1), S. IMPEDOVO (2), G. PIRLO (2), A. SALZO (2)

(1) Dipartimento di Ing. Elettronica - Politecnico di Bari, Via Re David - 70126 Bari - Italy
(2) Dipartimento di Informatica - Università di Bari, Via Orabona, 4 - 70126 Bari - Italy

In the field of Optical Character Recognition (OCR), zoning is used to extract topological information from patterns. In this paper zoning is considered as the result of an optimisation problem and a new technique is presented for automatic zoning. More precisely, local analysis of feature distribution based on Shannon's entropy estimation is performed to determine "core" zones of patterns. An iterative region-growing procedure is applied on the "core" zones to determine the final zoning.
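
The entropy measure referred to above is the standard Shannon entropy of a discrete distribution. A minimal sketch is given below for a histogram of feature counts observed in one candidate zone; how the authors turn these values into "core" zones is not stated in the abstract, so only the measure itself is shown and the function name is an assumption.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy, in bits, of the distribution given by raw counts."""
    total = sum(counts)
    entropy = 0.0
    for count in counts:
        if count > 0:                      # empty bins contribute nothing
            p = count / total
            entropy -= p * math.log2(p)
    return entropy
```

A zone in which one feature value dominates gives an entropy near zero, while a zone whose feature values are evenly spread gives the maximum, log2 of the number of bins.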

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 583-588.

Download PostScript
Download PDF


Short Paper #22

CHARACTER RECOGNITION BY MATCHING SEQUENCES OF PSEUDO-STROKE POSITIONS AND DIRECTIONS

Hanhong XUE and Venu GOVINDARAJU

CEDAR, State University of New York at Buffalo, Buffalo, NY 14260, USA
E-mail: {hxue, govind}@cedar.buffalo.edu

Chain-coded contours are informative in off-line character recognition. As approximations to contours, sequences of pseudo-strokes consisting of both positional and directional information make up feature vectors for character images. In order to carry out fast pattern matching, a scheme of generating fixed-length feature vectors that combine information about outer contour and inner contours into a uniform data structure is proposed and tested on CEDAR databases.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 589-594.

Download PostScript
Download PDF


Short Paper #23

A MOVING WINDOW CLASSIFIER FOR OFF-LINE CHARACTER RECOGNITION

M. S. HOQUE, M. C. FAIRHURST

Electronic Engineering Laboratory, University of Kent, Canterbury,
Kent CT2 7NT, United Kingdom.
E-mail: {msh4,mcf}@ukc.ac.uk

A new classification scheme, primarily aimed at applications in document image processing, is presented. Features are extracted from a partial image and a sub-classifier generates scores based on the likelihood of the sub-image belonging to the candidate classes. This partial classification is carried out for several overlapping image segments and scores are combined to make the final classification. The scheme shows promising results in OCR applications where high processing speeds are achievable with minimal compromise in the recognition accuracy.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 595-600.

Download PostScript
Download PDF


Short Paper #24

DATABASES FOR RECOGNITION OF HANDWRITTEN ARABIC CHEQUES

Yousef AL-OHALI (1), Mohamed CHERIET (1,2) and Ching Y. SUEN (1)

(1) CENPARMI, Concordia University, GM-606, 1455 de Maisonneuve W., Montreal, Quebec H3G 1M8, Canada
(2) Imagery, Vision and Artificial Intelligence Laboratory, École de Technologie Supérieure, University of Québec, 1100, Notre-Dame West, Montréal, Québec H3C 1K3, Canada
E-mail: {yousef, suen}@cenparmi.concordia.ca, cheriet@gpa.etsmtl.ca

This paper describes an effort toward building Arabic cheque databases for research in recognition of handwritten Arabic cheques. Databases of Arabic legal amounts, Arabic sub-words, courtesy amounts, Indian digits, and Arabic cheques are provided. This paper highlights the characteristics of the Arabic language and presents the various steps that have been completed to achieve this goal including segmentation, binarization, tagging and validation.

In: L.R.B. Schomaker and L.G. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: International Unipen Foundation,
ISBN 90-76942-01-3
pp. 601-606.

Download PostScript
Download PDF




Invited Lectures


RECOLLECTIONS OF CONVERSATIONS WITH PROFESSOR J.C. SIMON

[INVITED LECTURE / ABSTRACT]

Theo PAVLIDIS

Department of Computer Science
State University of New York at Stony Brook
Stony Brook, NY 11794-4400, USA
t.pavlidis@ieee.org

I was fortunate to have had numerous conversations with Professor J.C. Simon over the years and I will recount three of our recurring topics, all of them central to his work in OCR.


1. The importance of irregularities for feature selection. The volume "From Pixels to Features" edited by him (Amsterdam: North Holland, 1989) contains two papers:

  • J-C. Simon, "A Complementary Approach to Feature Selection," pp. 229-236.
  • T. Pavlidis and D. Lee, "Residual Analysis for Feature Extraction," pp. 219-227.

The key idea in both papers is that what is predictable is not interesting. A smooth straight curve does not contain any information. Thus the shape I may be considered as a zero. L is distinguished because of its corner, the anomaly in its shape. Similarly the intersection of the two lines in X is the informative feature. One may argue that this approach is presumed by the old analysis of the difference between representation and discrimination, but there is more to the story. The traditional statistical analysis assumes that features have been already extracted; in order to apply the concept to feature extraction one must define what is the predictable basis.


2. The importance of the engineering approach to solving problems as opposed to relying on a single methodology. The fields of image analysis and pattern recognition have been plagued by fashions: statistical pattern recognition, syntactic pattern recognition, graph grammars, connectionism, relaxation techniques, neural networks, fuzzy logic, hidden Markov models, etc. All of these techniques are quite valid and useful tools under certain conditions. Problems arise only when each is presented as a panacea that can be used to solve all problems in the field. The engineering approach requires understanding the structure of the objects to be recognized and applying the appropriate combination of techniques. The following papers are examples of such approaches that integrate different methodologies:

  • S. Kahan, T. Pavlidis, and H. S. Baird, "On the Recognition of Printed Characters of Any Font And Size," IEEE Trans. on Pattern Analysis and Machine Intelligence, PAMI-9 (1987), pp. 274-288.
  • J-C. Simon, "Off-Line Cursive Word Recognition," Proc. IEEE, vol. 80 (1992), pp. 1150-1161.


3. The importance of senior researchers being closely involved in the research to the point of writing code themselves. Both Professor Simon and I spent considerable time writing code, to the bewilderment of our colleagues, who thought that senior researchers should not write code themselves. Once he asked what he should tell such critics. I told him my "stock" reply to such criticism: professors of surgery perform surgery with their own hands, no matter how senior they are; therefore it is only appropriate that professors of computer science should write their own programs. Would a person be willing to be treated by a physician who has been taught by professors who had never treated patients? Maybe the hiring of college graduates who have been taught by faculty who never did any programming themselves causes the sorry state of modern software. I will elaborate on this point in view of the advances in software tool development.

In: L.R.B. Schomaker and L. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: Int. Unipen Foundation,
ISBN 90-76942-01-3
pp. 1-2.


READING RESEARCH IN MOTION

[INVITED LECTURE / ABSTRACT]

Angelique W. HENDRIKS

NICI, P.O. Box 9104,
6500 HE Nijmegen,
The Netherlands
hendriks@nici.kun.nl

When you read these words, your brain is actively engaged in directing its visual sensors, the eyes, in acquiring the information wanted. The eyes are directed to particular areas in the text, thereby selecting parts of the image, while preventing others from entering the visual system. This active directing of the eyes is based on knowledge of the world ("The next word is to the right"), is purposive ("Does the text contain typing errors?") and takes into account cognitive states ("I don't understand this sentence"). Despite the fact that the influence of eye movements on what is perceived, and vice versa, seems so obvious, models of human visual perception and reading have traditionally treated perception as if it were something static. Experiments with brief presentations of stimuli, intended to prevent eye movements from "interfering", have long dominated the field. And even the last few decades of research on eye movements during reading did not really seem to necessitate a major change in focus. In this talk, I begin with a brief description of the two main types of models of human word recognition, as they are based on the traditional type of research. I then summarize the knowledge that we have of eye movement behavior during reading. Finally, I discuss recent findings that suggest how the traditional concept of reading can be enriched to encompass dynamic processes. Possibly, the concepts of selective attention and top-down driven segmentation presented in this lecture may have an influence on the design of control structures in machine-based reading.

In: L.R.B. Schomaker and L. Vuurpijl (Eds.)
Proceedings of the Seventh International Workshop on Frontiers
in Handwriting Recognition, September 11-13 2000, Amsterdam,
Nijmegen: Int. Unipen Foundation,
ISBN 90-76942-01-3
p. 177.



Editor