Feature Selection for Classification of Old Slavic Letters

Cveta Martinovska Bande; Mimoza Klekovska; Igor Nedelkovski; Dragan Kaevski

doi:10.61416/ceai.v16i4.2370

Abstract

This paper describes a methodology for extracting discriminative features for fuzzy classification of Old Slavic characters. The recognition process is based on structural and statistical features, such as the number and position of spots in the outer segments, the presence and position of horizontal and vertical lines and holes, compactness, and symmetry. Preprocessing is divided into the following steps: conversion to black-and-white bitmaps, normalization, contour extraction, and segmentation. Features are extracted from contour profiles, histograms, and character intersections. C4.5 decision trees are used for feature selection. The same feature set is appropriate for different Old Slavic Cyrillic alphabets because of the similarity of their graphemes. Classification accuracy and precision are tested on Old Macedonian manuscripts, and decision trees are created for two alphabets, Macedonian and Bosnian. The main advantage of the proposed method is that it saves processing resources and eliminates the need for the large training sets required by Bayesian classifiers or neural networks.
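As a rough illustration of how a C4.5-style criterion can rank candidate features, the sketch below computes the gain ratio (the splitting measure used by C4.5) for a few discrete features. It is a minimal sketch only: the toy samples, the feature names `holes`, `vertical_line` and `symmetric`, and the helper functions are hypothetical and are not taken from the paper, and the fuzzy classification stage is not reproduced.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of one discrete feature with respect to the class labels."""
    total = len(labels)
    # Group the class labels by the value the feature takes.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the labels after splitting on the feature.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes features with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: one row per character sample (not from the paper).
samples = [
    {"holes": 2, "vertical_line": "left", "symmetric": True,  "letter": "В"},
    {"holes": 2, "vertical_line": "left", "symmetric": False, "letter": "В"},
    {"holes": 1, "vertical_line": "none", "symmetric": True,  "letter": "О"},
    {"holes": 1, "vertical_line": "left", "symmetric": False, "letter": "О"},
    {"holes": 0, "vertical_line": "left", "symmetric": True,  "letter": "Г"},
    {"holes": 0, "vertical_line": "none", "symmetric": False, "letter": "Г"},
]

feature_names = ["holes", "vertical_line", "symmetric"]
labels = [s["letter"] for s in samples]

# Score every feature and rank from most to least discriminative.
scores = {
    name: gain_ratio([s[name] for s in samples], labels)
    for name in feature_names
}
ranking = sorted(feature_names, key=lambda name: scores[name], reverse=True)

for name in ranking:
    print(f"{name}: {scores[name]:.3f}")
```

In this toy set the feature that separates the classes cleanly ranks first and the feature that is independent of the class ranks last. A full C4.5 tree applies the same measure recursively at each node, and the features that appear in the resulting tree form the selected set.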

Keywords

classifiers, decision tree, fuzzy logic, character recognition, precision and recall, historical manuscripts

Journal of Control Engineering and Applied Informatics
