Multimed Tools Appl https://doi.org/10.1007/s11042-018-6289-6
A comparative study of graphic symbol recognition methods Irshad Khan1 · Naveed Islam2 · Hafeez Ur Rehman3 · Murad Khan1
Received: 28 February 2018 / Revised: 27 May 2018 / Accepted: 18 June 2018 © Springer Science+Business Media, LLC, part of Springer Nature 2018
Abstract From the very beginning of written scripts, contents of documents generally comprise of text, images, figures, graphs and graphic symbols. A graphic recognition system involves representation of graphic symbols, description of features extracted from the symbol and classification of the unknown symbols. Due to the wide range of symbols, no generalize technique is available that can recognize the symbol for all the application domains. this paper, we present an overview of the many models and methodologies available to symbol recognition for representation, description and classification. We provide a general survey of symbol recognition process, beginning with the basic definition of symbol, which is further classified into their types based on application areas. distinctive part of the survey is categorization of different symbol recognition methods into four categories i.e. statistical, structural, syntactical and hybrid methods, which is aimed to help potential researchers in exploring areas of research in the field of graphic symbol recognition. Keywords Symbol recognition · Object recognition and classification
Irshad Khan
[email protected] Naveed Islam
[email protected] Hafeez Ur Rehman
[email protected] Murad Khan
[email protected] 1
Sarhad University of Science and Information Technology, Peshawar, Pakistan
2
Islamia College University, Peshawar, Pakistan
3
FAST-National University of Computer and Emerging Sciences, Peshawar, Pakistan
Multimed Tools Appl
1 Introduction 1.1 What is a symbol? Symbol is a visual representation of concept, idea, belief or action. The word symbol is derived from Greek word symbolon meaning token or watchword. Symbol is a mean of complex communication that often have multiple level of meanings. This separate it from the sign, because sign have only one meaning. Symbol is defined as an image, object etc., by Oxford Advanced Learner’s dictionary [65]. Symbols are 2D images or shapes which have different colors, i.e. black & white, gray scale, and multicolor (red, green, blue) etc.
1.2 Types of symbol Symbols can represent more than one meaning, for example, a circle can represent an orbitrary shape or it can be used to represent the moon in the astronomy. Therefore, it is very important to understand the context of the symbol, in the symbol recognition field. Context explains the type and the domain of the graphical symbol. Symbols are generally categorized into the following list based on graphemes and the domain where they are used: • •
Basic communication symbols: Such symbols are used in the communication applications. These symbols represent the status, emotions, or like and dislike etc. Scientific and engineering symbols: These symbols are used to represent the concepts and rules, in the science and engineering domain. The scientific and engineering symbols are further categorized into the following categories: – – – – –
– – • • • •
Alchemical symbols: These symbols denote the elements and compounds in the field of chemistry. Astronomical symbols: These symbols represent the astronomical objects in the astronomy field (e.g. sun, moon). Electronic symbol: These symbols represent circuit diagrams, for example resisters and capacitors etc. Engineering drawing symbols: These symbols give detail representation of an engineering drawing. Energy systems language: These are also referred as, “energese”. These symbols are used to represent the energy circuit language, for example, source, consumption, and generic glow etc. Mathematical symbols: These symbols belongs to different branches of mathematics, for instance, + is used for mathematical addition operation. Rod of Asclepius / Caduceus as a symbol of medicine: These symbols are associated with the field of medicine and health care.
Consumer symbols: This category of symbols used for consumer domain, like currency symbols used for currency (e.g. $ for dollar currency). Navigational symbols: This category contains symbols related to navigation, for example traffic signs. ISO 7001 standards for general navigational symbols provided a details list of these symbols [67]. Food and packaging material: Often packaging contains symbols like glass, or some food symbol like egg etc., such symbols are fall in the category of food and packaging symbols. Hazard Symbols: Symbols to warn about dangerous material, objects and locations. High voltage symbol is an example of warning hazard symbols.
Multimed Tools Appl
• • • • • •
Property and pricing symbols: Symbols belong to property and pricing are fall in this category, an example of property symbol is trademark ™. Technology symbols: These symbols use in the domain of media and computers, some examples of technology symbols are media control symbols, power symbols and computer icons etc. Religious and mystical symbols: Different symbols used in different religions for example, cross is used in Christianity. Vexillological symbols: Symbols related to the study of flags fall in this category. Sports and game symbols: Symbols related to the domain of different sports and games. Musical score: Musical symbols are the marks and symbols used in musical scores and musical notations. Table 1 shows some example symbols of the above mentioned categories.
1.3 Graphic symbols and symbol analysis The above section generally discussed the symbol and their categories, while this section and the remaining paper will focus on the symbol as a system and application point of view. Therefore, in the literature we find the term “graphical symbols” for symbol recognition systems.
1.3.1 Graphic symbols Graphics or graphic symbols are set of graphic signs or 2D graphic shapes or designs which denotes something in its domain, i.e. some conceptual information. This information is the semantic assign to them. In fact symbols are designed according to the rules and they are not like some free shapes. This information and rules help us to recognize them, for example a red octagon traffic sign with “STOP” word means, to stop in the domain of transportation Table 1 Some example symbols from different categories of symbols Symbol categories Basic communication symbols Scientific and engineering Consumer symbols Navigational symbols Food and packaging Material Hazard symbols Property and pricing symbols Technology symbols Religious symbols Vexillological symbols Sports and game symbols Musical score
Some example symbols
Multimed Tools Appl
system, if a driver sees it on the road it will stop the vehicle. These graphic symbols play very important role in a variety of applications. Therefore, graphic symbol recognition is the subject of intensive research in the field of pattern recognition.
1.3.2 Symbol analysis Documents (file and image) are made up of different entities, and its significant part is graphic symbol. Graphic symbols are not like characters and it is distributed over the document or image. These graphic symbols may appear separately, or often appear to be connected with other graphic components like illustration, text or color. Symbols vary by their characteristics for example, an architectural drawing is different that from the shape of an object. Similarly, a musical score is different from the traffic signs on the road. Every domain of graphic symbols have its own intrinsic characteristics. These characteristics make the recognition and generalization task difficult. Other issue with symbol recognition system is that symbols are often embedded. Therefore, for correctly recognize the symbol we need to segment it, and for correctly segment the symbol we need to recognize it. From the symbol analysis point of view, to locate the symbols or regions for applying recognition task, researcher must rely on the prior knowledge about the nature of the symbols or documents. Therefore, different approaches have been proposed in the past for different types of application domains of graphical symbols. The task of graphical symbol recognition is very difficult and different, because of the higher number and variety of available symbols. It is impossible to provide a database for the entire graphic symbols, and develop a generalized technique to recognize them all. Document analysis is categorized into two sub domains, one is text processing and other is graphics processing. Both domains involve the basic tasks of segmentation of an image, understanding the layout, and the recognition of graphical symbol. The process of graphic symbol recognition is either comparing parts of symbols like lines and arcs and the relationship between them or matching features between a query and datasets of symbols. The symbol recognition is distinguished from character recognition, because character recognition involves the conversion of handwritten or typed text into machine encoded text, where symbol recognition involves the recognition of graphics in its particular domain. This paper focuses on graphic symbol recognition. Where detail methods of pattern recognition can be found in [49, 119]. There are few survey papers available on symbol recognition which completely or partially discussed the symbol recognition domain [17]. Chhabra provided an overview of graphic symbol recognition [32]. Tombre provided a survey on engineering drawings [164, 165]. Kanungo also provide a survey on the understanding of engineering drawings [74]. O’Gorman discussed basic techniques of graphic symbols [120]. Tang et al. provided document processing survey [161]. Blostien et al. provided a survey on musical image analysis [16]. Cordella et al. provided a survey on the technique and methods used for graphical symbols [37]. Llados et al. discussed the challenges and advances of graphical symbols [101]. The first goal of this paper is, to provide up to date literature review of graphic symbol recognition. The second goal is, to categorize the symbol recognition domain, and to provide a comparative study of all the techniques and to investigate the area for future directions specially the hybrid area of graphic symbol recognition. Rest of the sections are categorized as follows: Section 2 provides the application domains of symbol recognition, similarly, Section 3 discusses different graphic symbol recognition system. In Section 4, different methods are discussed which are used in the domain of graphic symbol recognition. In the end, in Section 5, the conclusion of the article is given.
Multimed Tools Appl
2 Application domain of graphical symbol recognition Symbol recognition is the main area of pattern recognition. Graphic symbols appear in a variety of domains (e.g. electronics, architecture, engineering, maps, music, logo, and web). The automatic interpreting and recognition of graphic symbols in these fields are domain dependent. Therefore, application domain of symbol recognition is categorized in the followings;
2.1 Logic circuit symbols recognition Logic circuit symbol recognition involves, automatic interpreting of logic circuit diagram to CAD systems. It is an earlier domain of graphic symbol recognition. These diagrams provide the loop structures and rectilinear connections between them, which is an advantage to segment them by distinguishing lines and loops. The usual approach for logic diagram recognition is, to convert the input image into binary image, and then detect the straight line and segments, structural elements, and interconnection patterns, as done by bunk et al. [22], [21]. The same work is modified by Groen et al. [62] using probabilistic graph matching. In which the authors represented the skeleton of symbol by a graph, and then used probabilistic matching technique by calculating the likelihood of each match. Kuner et al. [84] provided neural network strategies to solve the graph matching problems in reasonable time. Okazaki et al. [121] first segmented the symbols and then identified it by hybrid approaches, i.e. feature extraction and template matching. Lee et al. [88] provided attributed graph to recognize hand-written circuit symbols. The loops and lines features used by Kim et al. [80], to automatically recognize the symbols. Jiang et al. [70] used a generalized median graph approach, to measure the graph matching. In the proposed method authors used Genetic Algorithm, to tackle the exponential cost of graph matching. Feng et al. [56] use Dynamic Programming for online hand written electric circuit diagram recognition. A simple logic circuit on the left, and complex logic circuit on the right is shown in the Fig. 1. The complex circuit contains the text as well as graphics which makes the recognition task very difficult. Recently, authors proposed a system for online hand drawn logical circuits [125, 131].
2.2 Engineering drawing symbols recognition Engineering drawing recognition system involves the process of automatic segmentation and recognition of components in engineering drawing and documents, for CAD/CAM (Computer Aided Manufacturing) systems. Engineering drawings are difficult, because you cannot assume standardized diagrammatic notation. A part of complex drawing is shown in Fig. 2. Some authors proposed syntactic approaches based on some grammar for interpreting and recognizing the engineering drawings. Collin et al. [35] used special grammar, to describe dimensions of drawings, where a dimension is a collection of basic primitives, for example contour (C), witness (W), Shape (S), Tail (L), Text blocks (T), and Arrowheads (A). Dori et al. [44] proposed syntactical and geometrical approaches along with a deterministic finite automata, for recognizing dimensions in engineering machine drawings. Min et al. [115] proposed web grammar to recognize dimension in machine drawing, where a web is an undirected node labeled graph. The nodes are dimensioning components, and their relation is described as a web. The detection of hatched patterns is an important detection problem in engineering drawing. In [76], the authors proposed a method for hatched pattern detection. The proposed
Multimed Tools Appl
Fig. 1 Logic Circuits [56]
method detect the hatched pattern by clustering those lines which are parallel, and have same slope and angle. The method sorts these parallel lines along a normal direction. In [1], the authors distinguished four types of graphic symbols: arc and straight lines, crosshatched areas, dashed lines, and dimension. These information provides details like angular information, a section of a part etc. Further, In [2], the authors presented the study of arc detection using the work proposed in [46]. The proposed work is based on the idea of segmenting a curve into a set of arcs and segments. Wenyin et al. [171] proposed an example driven approach for the engineering drawings, by using knowledge acquisition to learn graphical knowledge, i.e. geometrical constraints from the graphic object examples, represented in vector form.
Fig. 2 Part of a Complex Drawing with dimension part and hatched patterns annotated
Multimed Tools Appl
2.3 Maps symbols recognition Map recognition and conversion of graphical information into a GIS (Geographic Information System) is another challenging sub domain of graphic symbol recognition. Maps are categorized into three types: cadastral maps, utility maps, and geographical maps, on their bases it deals with different techniques and notations. Recently, A survey of digital map processing techniques is provided by Chiang et al. [33]. 1.
Cadastral maps: These maps contain polygonal shapes and hatching patterns as shown in Fig. 3a. These shapes represent areas and their surrounding streets annotated with text. The symbols are recognized through polygonal shapes and hatching patterns [7, 99, 142]. Boatto et al. [18] proposed engineering drawing hatching technique for cadastral maps. 2. Utility maps: These maps contain information of utilities like water, telephone, gas etc., as shown in Fig. 3b. The proposed techniques related to this domain, convert images into binary images, and then extract lines and small primitive geometric shapes [10]. Madej et al. [110] provided a raster to CAD conversion. Hartog et al. [42] proposed a knowledge based approach for map interpretation, which used prior knowledge of map objects and their spatial relationships. Adam et al. [3] recognized multi-oriented and multi-scaled characters, by using Mellin Fourier Transform and Genetic Algorithm. 3. Geographical maps: These maps contain geographical entities. These entities are connected through lines, which represent roads, as shown in Fig. 3c. The related work on geographical maps is available in [66, 118]. Samet et al. [148] proposed a system called MARCO, which is used for the acquisition, storage, indexing, and retrieval of map images. Reiher et al. [136] proposed Neural Network and directed Hausdorff distance for recognizing maps. The Hausedorff distance measures the degree of mismatch between two set of points, by measuring the distance of a point of one set that is the farthest away from any point of the other set. Various work of such domain can be found in the CBIR (content Base Image Retrieval) [156]. Fei et al. [55] proposed a method for recognition of oil-well identifier in scanned petroleum geological maps, based on fuzzy classification and template matching. Recently, authors proposed a system called Shear Line Segment Generalized Hough Transform (SLS-GHT) to recognize the point symbols in the scanned topographic maps [114].
(a)
(b)
Fig. 3 a Cadastral Map Example b Part of Utility Map c Part of Geographical Map
(c)
Multimed Tools Appl
2.4 Music score symbol recognition This application domain involves, the recognition of music symbols available in the music documents. entire music symbol interpretation system is categorized into three steps. In the first step, the staff lines are extracted. In the second step, individual notes are recognized, and in the third and final step, the entire musical score is interpreted [6]. The extraction of staff lines, allows the system to segment the individual symbols. Individual note recognition can be easily done through any simple technique, because there are finite set of standard symbols. The recognition of note symbols for musical symbols is available in the [96]. Stuckelberg et al. [155] provided a probabilistic framework using Hidden Markov Model (HMM). Miyao et al. [116] proposed a system, to extract a note heads and stems from the note using candidate regions, and then three layers Back Propagation Neural Network (BPNN) is trained for classification. Yadid et al. [173] proposed a technique for handwritten musical scores based on modified neocognitron model, which is a variation of neural network. In [11], authors proposed a filtering process for segmentation and correction of a musical score, based on hierarchical and recursive approach. Randriamahefa et al. [134] proposed an attributed graph based technique for musical score recognition. In the proposed method, initially staff lines are eliminated using thinning and polygonalization, and then attributed graph is constructed for the notes. Fahmy et al. [54] proposed a graph grammar for musical score. Recently, authors proposed a system for the recognition of musical scores in mobile devices [168]. Similarly, an online handwritten musical score recognition system is proposed by [91]. A comparative study of recognition algorithms for musical scores is available in [135]. An example of music score with staff line is given in Fig. 4, where note head and stem is also pointed out.
2.5 Architectural drawings symbol recognition The architecture drawings symbol recognition system involve the recognition of entities, such as windows, doors, furniture, and wall etc., as shown in Fig. 5. The symbols in the architectural drawings are characterized as, prototype based symbols, and texture based symbols. Prototype based symbols are based on door and window. Texture based symbols are based on wall, floors and stairs. Due to the lack of standard notation availability for interpreting the symbols makes symbol recognition task difficult for this domain. Ah-soon et al. [4] proposed a geometrical constraint model for architectural drawings. Aoki et al. [8] proposed a model for hand sketched drawings, to convert it into CAD. Sanchez et al. [100] proposed a technique for texture symbols in the architectural drawings. The proposed method used string based representation, and then clustering is performed based on string edit distance for similarity measure. Valveny et al. [166] proposed a Bayesian framework and deformable template matching method for distorted hand-drawn architectural drawings.
Fig. 4 An Example of Musical Score with Staff Lines
Multimed Tools Appl
Fig. 5 An example of Architectural Drawing
Dosch et al. [47] provided a graphic recognition software environment based on different applicative modules. Recently, In [40], the authors proposed a method based on statistical segmentation and structural recognition for floor plan interpretation.
2.6 Logo symbol recognition Logo or trademark classification, [146, 182] involves the recognition of logos in the documents. Logo recognition involves extracting features from the image using contours and other techniques. Logo often combines text and graphics, therefore, OCR process is also included in some cases. Chang et al. [30] proposed a two-dimentional pseduo-hidden Markov model for deformed trademarks retrieval. Cesarini et al. [29] proposed a method based on neural architecture for recognition of noisy logo. Francesconi et al. [59] proposed recursive neural network for logo recognition. Doermann et al. [43] recognized logos using a method based on algebraic and differential invariant. Soffer et al. [154] proposed shape features for matching the similarities of logos. Recently, authors used Self Organization Map (SOM) and relevance feedback technique for the recognition of color logos [128]. An example of two different logos is given in Fig. 6.
2.7 Formula recognition Formula recognition is a combination of symbol recognition and OCR. Celik et al. [28] segment the input expression to isolate the symbols, and then each symbols is recognized by OCR. In the proposed method, the OCR process is followed by syntactic(grammar rules) analysis to validate the structure of the formula. Lavirotte et al. [86] provided a contextsensitive graph grammar for mathematical formulas. Lee et al. [89] proposed a dynamic programming based technique for features extracted from mathematical expressions, which are scanned from a printed graphical document. Ramel et al. [132] proposed a structural
Multimed Tools Appl
Fig. 6 Example Logos
approach for hand written symbols. In [31], the authors proposed an approach based on structural features for recognition and interpretation of mathematical formulas. Recently, in [95], the authors proposed a method based on machine learning techniques and heuristic rules for formula identification in PDF(Portable Document Format) documents. An example of a hand-written mathematical formula is given in Fig. 7. The CROHME(Competition on Recognition of On-line Handwritten Mathematical Expressions) is a handwritten mathematical expression’s recognition competition [117]. The competition provided a platform for comparing the methods and bringing out the relative matrices of individual approaches. Because, there are few benchmark datasets available publicly, and there is variation in the evaluation matrices proposed by the researchers. Similarly, authors proposed a system based on densely connected convolutional network to strengthen the feature extraction [181].
2.8 Online pen-based symbol recognition Online man-machine interfaces, involves the online recognition of symbolic shortcuts such as select, delete, copy etc., in graphic environment. Similarly, it provides an interface to correctly identify and sketch, the exact shapes and symbols drawn by the user(e.g. rectangle, circle). Wenyin et al. [170] proposed three different approaches for comparison and development of a system. These approaches are rule-based, SVM-based, and ANN-based. In [172], the authors proposed a spatial relation graph for online composite graphic representation. Liu et al. [94] proposed a structural approach based on spatial relational tree, to recognize online graphical symbols, before the user finishes drawing all its strokes. Lee Fig. 7 An example of online formula
Multimed Tools Appl
et al. [90] proposed an attributed relational graph based approach for online multi-strokes graphical symbols.
2.9 WWW documents The need of making queries by graphical content on the internet is high, because the World Wide Web(WWW) is rich with such documents, and it enables a more efficient search of information into the website [160]. Paek et al. [122] addressed the locating of text in web images. Kherfi et al. [78] provided a survey on image retrieval from the WWW, which discussed the issues, techniques and systems. In [60], the authors proposed a search system, based on meta-data extraction by field (caption, words, in the article title, etc), and this meta-data become the basis of region, time and subject indexing.
3 Graphic symbol recognition system The graphical symbol recognition problem is the focus of many researchers in the field of pattern recognition. The work is going on to develop efficient and effective recognition systems. A graphical symbol recognition system is a system capable of interpreting and recognition of symbols or segment of symbols in an image or document. In general, the graphical symbol recognition system is based on three phases. categorically, these phases are: representation phase, description phase, and classification phase, as shown in Fig. 8. Details of each phase is discussed in the following sections.
3.1 Representation phase This phase involves the extraction of information reducing the noise and presenting the data in such a way that the differences among the required information becomes clearer. The objective of the representation phase should not only be the segment or spot the symbol, but, this phase should extract key information (points), which is then described as features for further classification. For example, in logical circuit diagram, the key information are lines and loops in the circuit. Similarly, in the musical scores the key information is to extract the notes from the image. To enhance the symbols and remove noise, sometimes this phase involves preprocessing step. For example, the conversion of a color image into a gray scale image etc. Some basic techniques involve in this phase are given here; Binarization: is the process in which image is converted into a binary image [147]. Thinning: is the morphological operation used to remove noise and represent the symbol [85]. Polygon approximation: use polygonal shapes to represent a symbol [142]. Run length coding: use to convert image into code for compression. Connected component labeling: is a method which segmenting a document into regions, and among those regions it selects those components which corresponds to a symbol or part of a symbol [76]. Curve fitting: use curve to represent the symbols, and detecting arcs and circles in the image [46]. Hough transformation: is used to detect lines in the image [75]. Thinning combined with polygonal approximation, for the purpose of vectorization is the most common method used for representation phase. The drawback of thinning methods is
Multimed Tools Appl
Fig. 8 Flow Chart of Symbol Recognition System
that, it cannot discriminate filled symbols with empty ones (loops). In this case medial axis transform is useful, because it preserves the information of local thickness [139].
3.2 Description phase This phase use to describes a symbol in such a way that its feature can be easily extracted, trained, and recognized by a classifier in the next phase. These features should also be invariant to translation, rotation, and scaling. From structural point of view to describe a symbol, graph is one of the popular data structure, which is used to represent symbol features. Attributed graph is used to represent the structural features of a symbol, by labeled node and links of graph with different attributes like length, angle etc. It can be used as a simple linked list of pixels, or it can be used as region adjacency graph, to describe a symbol, where symbol segments are graph nodes, and their relationships represented by links. It can also be used to describe polylines. In polylines description graph nodes are associated to the segments of the polygonal, and edges are associated to the spatial relations between
Multimed Tools Appl
adjacent segments. This phase can use some grammars to describe a symbol syntactically. Different graph grammar based methods have been proposed in the literature [27]. Graph grammars, effectively represented repeated patterns in the symbols (e.g. hatched pattern in architectural drawings) [102]. Similarly, in the statistical methods, symbols are described in terms of feature vectors. Statistical descriptors mostly use pixels values to describe a symbol. In statistical methods, the symbol is described as a single feature vector. Different statistical approaches are used to describe the feature of a symbol as a feature vector. Fourier descriptor is a common choice to describe the shape of a symbol [50]. Other feature descriptors are also used for example, moments invariant and histogram based descriptors etc. Developing a good descriptor is still an open challenge for the researchers of graphical symbol recognition field. Because, the main challenges researchers face when developing a descriptor, are the high variable appearances of symbols, such as rotation, partial occlusion, elastic deformation, and variations in intra-class and inter-class.
3.3 Classification phase The classification phase involves, the training and testing of a graphical recognition system. When the features are described statistically in the form of a feature vector, then well known techniques are K-Nearest Neighbors [176], Back-propagation Neural Network [175], Bayesian Network [103] etc. Similarly, from structural point of view template matching, decision trees, graph matching (exact graph matching or sub graph isomorphism) are used for classification, when graph is used as a feature descriptor [27]. The graphical symbol is described as a feature vector when classification is based on statistical approaches. Symbol describing as a feature vector provides number of useful properties, specially, the collection of available variety of mathematical operations in vector space, i.e. sum, product, mean, or the distance measuring of two entities. These operations can be efficiently accomplished. However, the using single feature vector for the symbol have two limitations. First limitation is that vectors always represent a predefined set of features, therefore, irrespective of the size or complexity of the corresponding graphical symbol, all the feature vectors in a given application will have to keep the same length. The second limitations of feature vector is that, it provides very less information related to the relationship among different part of a symbol. These two issues in a graphical symbol classification phase affect the performance of a statistical classifier, when the underlying patterns of a symbol need to be characterized by complex structures and their relationships rather than the statistical distribution of a fixed size set of patterns [24]. In structural classification based approaches, the graphical symbol is represented by data structure like strings, trees, or graphs. Among which graph is the most popular and powerful tool to represent the underlying symbol, because strings and trees both are the substructures of graphs. The structured based approaches overcome the size constraints, and the ability of representing the relationships, of the feature vector based approach of statistical classifier. But the major drawback of graph based classifiers is the computational complexity. For example, the comparison of two feature vectors for similarity checking accomplishes in a non-exponential time. Where comparing the two graphs for isomorphism (similarity) requires exponential time. Another limitation while using the graph, is the availability of little mathematical structures. For example, calculating the sum or product of a pair of graphs is not possible. Therefore, because of these problems we see less availability of algorithmic tools for graph-based pattern recognition. From the information processing point of view, either top-down or bottom-up strategy is adopted for graphical symbol recognition. The top-down strategy, starts from a model of
Multimed Tools Appl
a symbol to be found and try to fit the model into the data. The bottom-up strategy, starts from the lowest level of information and move to the highest level. It stops until the final comparison is done between the sample object and known object. The researchers always experimenting to combine different feature extraction techniques with the classification techniques, to increase the performance of symbol recognition system.
4 Symbol recognition methods The task of developing an efficient and accurate graphical symbol recognition system is very challenging and difficult. Because there are vast variety of symbols available in different domains. Therefore, in order to develop an efficient and accurate graphical symbol recognition system, researchers have been using different types of methods. Every method has some advantages and disadvantages due to the nature of mechanism it provided. These methods are categorized by the way they utilize information from the images: Statistical methods [68], Structural methods [36, 126], Syntactical methods [81] and Hybrid methods [130]. The detailed description of each method is given below:
4.1 Statistical methods Statistical methods utilize the statistics of a graphical symbol. In such methods, statistics of the symbol is used to extract the features, such as pixels values, number of black pixels etc. These features are extracted from the considered featured space of a symbol. In such methods, features selection and model selection are the two basic aspects, because it affects the performance of the entire graphical recognition system. In these methods, patterns of a graphical symbol is represented by an n-dimensional feature vectors, these feature vectors are then processed by a classifier to accomplish the recognition task. The classifier partition these feature vectors into different classes, which are labeled in a feature vector space. In feature vector space, one class is used for each type of symbol, such as symbol class for rectangles, symbol class for circles, or symbol class for other shapes. The accuracy of a classifier mainly depends on the feature vector. Such feature vector is described by a feature descriptor. A feature descriptor is a way of describing interesting points (features) in the image. A feature description technique is said to be efficient, if it minimizes the distance among the patterns which belong to the same class, and it maximizes the distance among the patterns which belong to different classes. Further it should also hold the properties of invariance to the transformation and robustness to the noise and distortion. There are different ways to describe the feature vectors. The remaining part of this section will provide a survey of such descriptors.
4.1.1 Contour based statistical methods Contour based descriptors deal with the boundary of the symbol. These descriptors, describe small set of features, such as length, angle, distance from the center etc. These feature set make these methods robust to noise and small distortion. Contour based descriptors consider the shape of the graphic symbols. Fourier Descriptor (FD), based on the Fourier Theory is a well-known technique used for shape analysis [127]. The basic idea of FD for shape feature is to use Fourier transformed boundary of a shape [177]. It can be easily integrated, because of its simplicity in many applications [77]. Since, FD is a shape descriptor, it can work with different variety of shape signature. A shape signature is a one dimensional function
Multimed Tools Appl
representing two dimensional areas of boundaries. These shape signatures capture the perceptual feature of the shape. Some shape signatures are real and some are complex. Radial distance, chord length distance, angular function, triangular centroid area, triangular area representation, complex coordinates, polar coordinates, angular radial coordinates, and farthest point distance are most commonly used shape signatures available in literature [50]. A comparative study of the three FD methods in the domain of content-based image retrieval can be found in [179]. The authors compared conventional FD, affine FD, and short-time FD. The conventional FD is the discrete Fourier transform of 1D signature. Affine FD is invariant under affine transformation [9]. Short-time FD is used in the proposed method to locate the local boundary features. The experimental results show that Affine FD is not suitable for general shapes retrieval, because it is sensitive to boundary changes. Further, FD outperformed short-time FD in terms of retrieval effectiveness and computational complexity. Different modification have been proposed to enhance the shape discrimination ability of Fourier descriptors, for example, the Elliptic Fourier Descriptor (EFD) describes a contour of any object using a set of ellipses [93]. The spatial descretization of shapes is considered with new distance measure for closed planer curves in Multimedia Analysis and Retrieval Systems (MARS) [143]. The proposed system reduce the computational cost of feature extraction and also has low feature matching cost. Polygonal approximation and string matching is another contour based graphical symbol recognition technique [111]. In this technique, a string is formed from the symbol based on the polygon and then for classification two strings are matched using string edit distance. In [12], authors provided a new multi-resolution polygonal shape descriptor. This shape descriptor segments the contour of any symbol equally. The proposed descriptor is invariant to scale, rotation and translation. In the this work the matching process measures the similarity using elastic comparisons of symbols segments. Elastic matching provides advantages when matching partially occluded symbols. Hough Transform is also used for contour based shape analysis of the symbol. For instance, in [14], the authors proposed Generalized Hough Transform. The usefulness of this technique is that it provides direction information with edge information to calculate the shape information of a symbol. But the drawback of this technique is that, it has the space and time complexity. By summarizing the contour based descriptors, it is concluded that it works well for symbols having silhouette shapes, but it is limited to some applications, because it cannot capture the interior contents, and will also not work for disconnected shape, where boundary information is not available.
4.1.2 Region based statistical methods Region based descriptors consider the region of the symbol including contours and internal pixels in the pattern of the graphical symbol. In such approaches, the features of the symbol are represented by a single feature vector by computing some statistical measures. A disadvantage of region based approaches is that they are less discriminate, because a single vector regarding region is not enough to represent the entire symbol. Some of the region based approaches are discussed in the proceeding of the section. Moments invariant is a region based approach, used as a feature selection technique for graphical symbols [162]. Generally, moments describe numeric values from a reference point or axis at some distance. An image moment is a certain weighted averages of intensities of the image pixels or it is a function of such weighted average. These moments represent some properties or interpretation. In the domain of graphical symbols different variation of moments have been proposed. The basic and simplest form of moments
Multimed Tools Appl
invariants are regular moments. Regular moments have the form of projection of f (x, y) onto the monomial x p y q of order p + q [163]. Regular moments have the redundant information and are not orthogonal. This property of regular moments leads to the disadvantage of computational complexity, when using as a feature selection technique for graphical symbol recognition. But it also has the advantage of simple translational and scale invariance. Orthogonal moments are another variation of regular moments used for the feature extraction of graphical symbols. In the orthogonal moments the polynomial basis is orthogonal, that is, if its elements satisfy the condition of orthogonality, which produces orthogonal basis sets [13]. The advantage of orthogonal moments over regular moments is that it requires lower computational complexity. Zernike moments is a well known moments invariant used in the domain of pattern recognition. It is defined as, the set of orthogonal polynomials defined on the unit disk [79, 92]. Zernike moments are simple rotation invariance and it has less information redundancy. It also provides better image reconstruction as compared to normal moments. Zernike moments also has a disadvantage, i.e., Zernike moments needs to normalize an image in order to achieve scale invariance, which leads to inaccuracy of the classifier because of the re-sampling and re-quantization of digital images. To overcome this drawback various methods have been proposed such as, modified Zernike moments and Pseudo Zernike moments [34, 73]. Despite the fact that moments invariant are computationally complex it also provides some advantages. First, they are easy to compute. Second, they have relation with geometric properties. Third, they can be made invariant to scaling, rotation, and translation. A comparative study and survey of moments in the field of pattern recognition for two set of data i.e. handwritten numerals and aircraft shapes is available in [15]. It also presented a detailed study of Zernike moments invariants. The study of Cartesian moment theory and its application to object recognition and image analysis is available in [129].
4.1.3 Transformation based statistical methods Features can also be selected through transformation of an image, such as Fourier Transform [178], Radon Transform (RT), and Trace transform. Radon transform using all possible projections to globally describe an object [41]. Trace transform is another feature extraction technique to use global patterns [72]. Trace transform consists of tracing an image with straight lines, each tracing line parameters are calculated by a 2D function. In [158], authors proposed a method which combined radon transform with angular signatures. The angular objects are considered in the proposed method. Similarly, in order to capture the shape of a symbol at different levels, a new method based on Radon transform and distance transform is proposed in [157]. To increase the capability of Radon transform R-transform is proposed in [159]. R-transform is modified version of Radon transform which is invariant to common geometrical transformations and it is also capable of identifying complex shapes.
4.1.4 Model based statistical methods Model based descriptors have been proposed recently to obtain rich information from the symbol. Blurred Shape Model (BSM) is proposed for symbol having irregular deformation in [52]. The proposed method codify the shape in term of a set of key-points. These keypoints are defined by high magnitude pixels. The BSM descriptor defines a set of spatial regions by means of a grid. Then, the method computes the spatial relations among keypoints from neighbor regions, in order to obtain the descriptor features. The proposed BSM
Multimed Tools Appl
method is not rotation invariant. Therefore, to overcome this issue, BSM is modified and a new method is proposed in [51]. The proposed method is called Circular Blurred Shape Model (CBSM) used for feature extraction of a symbol. It captures the spatial arrangement of significant characteristics of an object in a correlogram structure. A correlogram C = {rg{1,1} , ..., rg{Co,Sec} } is defined as radial distribution of the sub-regions of the image, given a number of cocentric circles Co, a radius rad, a number of sections Sec, and an image region I rg. Each region rg is defined by its centroid coordinates. The proposed method shared the shape information from the object within correlogram regions, where a prior blurring degree defines the allowed level of distortion in the symbol. This makes the descriptor tolerant to irregular deformations. Further, to outline the shape of the symbol, probability of appearance of each pixel is encoded using BSM to extract new features. The BSM is also rotation invariant. In [5], authors proposed a method called Active Appearance Model (AAM), to combine shape deformations with appearance variability. In this work authors modified the original BSM to overcome its rigidity limitation. This work is applied on handwritten digits and symbols, and it showed better results in the representation and classification as compared with BSM.
4.1.5 Histogram based statistical methods Histogram is a popular graphical tool used for graphical representation of distribution of data. Histogram based technique are also used for description of features for graphic symbol. In [176], a new descriptor is proposed by constructing histograms for every pixel. These histogram are used to figure out the distribution of constraints among the pixels. Then, these histograms are converted to feature vectors by using statistical integration methods. It works well in the symbol where shape is close to skeleton. But it is not accurate in the filled shape symbols, where topological attributes are important, because the proposed method compute the constraints of every pixel with the reference of their neighborhood pixel. It also has the high computational complexity of roughly O(N 3 ), where N is the number of points in the shape. In [180], a new approach is proposed. In the proposed work a symbol is represented by a 2D kernel density and the similarity measure is based on Kullback-Leibler divergence. However, both techniques are not sufficient for complex symbols.
4.1.6 Geometrical information based statistical methods Geometrical feature selection is another approach which deals the geometric statistics of the symbols, such as centroids, axes of inertia, circularity, area, line intersections, holes and projection etc. These features are used to recognize different patterns in image analysis [138]. However, such stand-alone methods for extracting geometric features may not be suitable for addressing wide graphics recognition problems, as discussed in [141]. But, they are still useful for such symbols which have distinct shapes [123].
4.2 Structural methods Structural methods consider the structure of the symbol instead of the statistics of the symbol. In structural methods, primitive components of the graphical symbols are the basic of any algorithm. These primitive components include straight lines, circles, arcs, and curves segments etc. These features are extracted through any structural mean to produce the feature set. Further, their relationship is also considered to produce a robust feature set for graphical recognition system. Other geometric primitives, such as circles and rectangle
Multimed Tools Appl
are also used to obtain the features from an image. In [48], [144], the authors proposed vectorial signatures for symbol discrimination. The authors reported that the use of vectorial signature significantly reduces the symbol recognition time as well as solves the symbol segmentation problem. This method is based on region of interest, rather general graph matching. Similarly, in [171], the authors proposed an example-driven approach to engineering drawing. The proposed work represents graphical objects in terms of its structural components and their syntactical relationships and the system used an interactive approach by combining learning and user interaction. Some researchers proposed Galois Lattice as a classifier for graphic symbol recognition [19, 20, 38]. Other than the above mentioned structural methods, the following techniques are mostly available in the literature.
4.2.1 Vectorization based structural methods In structural methods where basic primitives of a symbol are extracted using vectorization, a vectorization process is needed. The vectorization process involves the extracting of straight line segments, a sample of vectorization process is given in Fig. 9. Kultanen et al. [83] proposed Randomized Hough Transform to extract these vectors from engineering drawings. The basic work of vectorization is to find the lines and curves of an object in the image. But the system can work well when embedded with the thinning method. A survey of thinning methods is available in [85]. In [76], authors proposed a system, in which first binary image is obtained by digitizing a paper based document, and segmented it into text-string and graphic image, then lines and arcs are identified, to obtain basic geometric primitives. Dori [45] proposed a method called Orthogonal Zig-Zag (OZZ), it is used to vectorization of the engineering drawings. It is fast as compare to Hough Transform, because it avoids pixel by pixel examination. Janssen et al. [69] presented a novel adaptive approach for vectorization process. It is an iterative algorithm, based on maximum threshold morphology to adjust the vector extremes. But it is sensitive to noise, because of local adjustments. In [64], authors proposed a method, which is not based on domain knowledge, rather it is based on random sampling. It shows good results for segmentation of line and curves, but it is computationally expensive. Recently, De Paramita et al. [39] proposed a method called ASKME: adaptive sampling with knowledge-driven vectorization of mechanical engineering drawings. The proposed work, not only identifies the lines, circles, and circular arcs, of a mechanical engineering drawing, but also classifies these primitives into different types that are usually found in practice. It employing certain digital geometric properties of these primitives in the discrete plane, which makes the sampling procedure effective. The proposed work outperform the work of Hilare and Tombre method [64] for engineering drawings. Wonkyo Seo et al. [153] proposed a method that locates tables and their cells based on junction detection and labeling, in camera-captured document images.
Fig. 9 (Vectorization) From left to right: source image, vectorized arcs, vectorized lines
Multimed Tools Appl
4.2.2 Graph based structural methods Graph is the choice of many researches for the representation of graphic symbol recognition in structural approaches [63, 97]. Graph is a powerful tool to represent the underlying data. In the domain of graphical symbol graph, not only able to describes the graphic symbol, but it also provides relationships between different parts of an underlying symbol through the edges [58]. Groen et al. [62] proposed graphs to represent electric symbol in circuit image, and recognition is done by probabilistic matching, based on likelihood. Kuner et al. [84] proposed a technique based on dealing with the problem of complexity of graph matching. Jiang et al. [70] proposed the idea of median graph for graph matching problem, for the recognition of graphic symbols. The authors proposed Genetic Algorithm to deal the complexity of median graph matching problem. Lee et al. [87] proposed attributed graph for graphic symbols. In the proposed method authors provided a model based approach, using attributed observed graphs, and then its pose is estimated in terms of translation, rotation, and scale with respect to the model. Park et al. [124] proposed a method based on probabilistic attributed relational graph for partially occluded objects. The proposed method is robust to noise and occlusions. Graph isomorphism is the similarity measure between two graphs. Two graph are said to be isomorphic, if they contain the same number of graph vertices connected in the same way. The algorithms which exist to test the two graph for isomorphism have exponential cost [112]. Therefore, researchers often test subgraph isomorphism instead of graph isomorphism for matching purpose. Subgraph isomorphism is the process in which two graphs are matched and find a similarity between there subgraphs, rather than exact graph matching technique. Here we will discuss the techniques related to subgraph isomorphism for graphic symbol recognition. Llados et al. [98] proposed attributed graph structure as a symbolic representation to recognize the architectural images. The proposed methods is focused on the matching process by inexact subgraph isomorphism procedure using relaxed labeling technique. Messmer et al. [113] proposed the idea of subgraph isomorphism, where in an offline process, each model graph is decomposed into smaller subgraphs, and then unknown input graphs are matched with these model graph by using decomposed subgraphs. The advantage of this approach is that those subgraphs which appear many times in the same or in the different model graphs must be matched only once. Wenyin et al. [170] proposed a method using shape fitting and shape regularization approach for online graphic symbol recognition. Xiaogang et al. [172] proposed a Spatial Relation Graph (SRG) for online graphic recognition, by identifying their primitive components, i.e. lines, arcs, ellipses. The method break down multistroke into single-stroke pairs. The SRG is used to represent graphic object, with its nodes representing the primitive components of the graphic object, and its edges represents the spatial relationship between these components. Similarly, Liu et al. [94] proposed a method for online graphic symbol recognition by recognizing the incomplete graphic objects, to recognize a graphic object before user finishes drawing all its strokes. Lee et al. [90] proposed an attributed relational graph that contains geometrical and topological features as its attributes for online graphic symbol. A comprehensive discussion and review of graph based approaches for pattern recognition are available in [36, 57, 71]. In [25–27], the authors provided a survey on graph based approach in which he discussed graph matching advancement and its application in document analysis. In [102], the authors provided a comprehensive survey of graph matching and graph parsing for the problem of symbol recognition. Authors also proposed a combined approach of both the strategies for symbol recognition. In [137], the authors calculated the eigen-structure of the adjacency matrix. In the proposed method, the purpose of the
Multimed Tools Appl
eigenvector is to normalize the adjacency matrix for a random walk to convert the graph into strings, and then string matching technique is used for the recognition purpose. In [145], the authors first extract the local key points of symbols and text from an image using different descriptors (e.g. invariant features), and then a proximity graph is constructed.
4.3 Hybrid methods Hybrid methods integrates both structural and statistical methods for graphic symbol recognition. A detail discussion of hybrid methods is given in the following sub-sections.
4.3.1 Graph based hybrid methods In graph based hybrid methods, the feature representation technique use structural methods for representing underlying complex objects in graphical symbols, because structural methods use graph as a powerful data structure. Graph is capable of extracting rich features and their relationships. Moreover, these features are converted into feature vector for classification purpose which uses well-known classifiers for this purpose. In the proceeding of this section, we will discuss hybrid approaches used for graphic symbol recognition. In [130], the authors proposed a work based on the combination of symbolic and statistical features for the problem of symbol recognition using graphs. In the proposed method, initially the attributed relational graph is constructed from the symbol using polygonal approximation [133, 169]. Figure 10 illustrate the extraction of attributed graph from a symbol using quadrilaterals. The features extracted from the graph are; quantitative features, symbolic features, and range features. Quantitative features help in bringing graph of the same size together and to differentiate it from other larger or smaller graphs present in the database. Symbol features help to cluster graphs corresponding to symbols having similar shapes. Range features provide discrimination between two graphs presents in the same cluster. Because, it is possible that the graphs of more than one symbol may have same number of nodes or same number of symbolic features, but it is sure that all of them cannot have same relative length as nodes attributes and same relative angle as edge attribute. Based on these features a feature vector is constructed. The authors applied the Nearest Neighbor Rule (NNR) to classify the unknown symbol. Graphic RECognition workshop (GREC) dataset is used for training and testing purpose. International Association of Pattern Recognition (IAPR) organized and sponsored GREC workshop [61]. One of the aim of the GREC workshop is to evaluate
(a)
(b)
(c)
(d)
Fig. 10 a An input symbol. b Extracting Quadrilaterals using polygon approximation c Constructing a graph using node as quadrilaterals, and edges shows their relationship through a zone of influence. d Attributed graph with node attributed with relative lengths, and edges attributed with relative angles and symbolic attribute, the symbolic attribute shows topological relationship
Multimed Tools Appl
stat of the art graphic recognition techniques. In [103], the authors extended the work of Qureshi et al. [130] by classifying through Bayesian classifier to overcome the irregularities in the signatures caused by noise. Further, The work is extended by introducing the fuzzy intervals to the features of the graph [104, 105]. The objective of introducing fuzzy intervals is to deal the sensitivity problem of structural representation to deformation and degradation. These features are divided into five groups. Group one and group two encode the size and arrangement of components respectively, and it is computed in a straight forward manner. Group three encodes the density of connection for nodes, and for this purpose it first computed the list of connection densities for all the nodes of Attributed Relational Graph (ARG). Then use this list, to count the connection density features. Group 4 and 5 encodes lengths and angle attributes for symbol. To compute 4 and 5, the relative length and relative angle is divided into three overlapping intervals. The overlapping intervals (fuzzy approach) is used to handle the issues of deformation (caused by distortion) and degradations (caused by noise). Further, the authors extended the previous work in [106–108], to use histogram based approach for calculating the feature vector through fuzzy intervals, which is called Fuzzy Structure Multilevel Feature Vector (FSMFV). The method embeds the graph into feature vector by using fuzzy and crisp histograms for the three levels of information extracted from the graph. These three levels of information are the graph level information (first level), structural level information (second level), and elementary level information (third level). Numeric features are encoded with fuzzy histograms, and symbolic features are encoded with crisp histograms for fuzzy structure multilevel feature vector (FSMFV). The first level of information (Graph level information) is embedded to a feature vector by two numeric features, graph order and graph size. Graph order, to discriminate between the topology of graphs, for example, it will discriminate between a small graph and bigger graphs. Graph size feature is used to discriminate on the topological relationship of graph, for example, two equal ordered graphs are different by graph size, i.e. a thin graph is different from a thicker graph. Structural level information involves; node degree, node attribute resemblance and edge attribute resemblance.
4.3.2 Bag-of-relation based hybrid methods Recently, another hybrid method namely bag of relations is proposed in [152]. In the proposed technique authors used information related to topological relations, to categorize them in terms of bags, and to guide directional relations. Authors addressed usefulness of the proposed work for symbol spotting, and for user-friendly symbol retrieval applications. The authors also provided a structural and statistical integrated method for graphic symbol recognition [151]. The proposed method applied spatial organization descriptors to the identified shape features within a fixed visual vocabulary that compose a symbol. Then an attributed relation graph expressing the spatial relations between those visual vocabulary elements. The integration of structural and statistical approaches for floor plan architectural drawings is proposed in [40]. The method applies two recognition steps in bottom-up manner. First, the statistical patch-based segmentation is applied to detect the basic building blocks, i.e. walls, doors, and windows. Second, graph is generated to apply structural pattern recognition for locating the main entities, i.e. room of the building.
4.4 Syntactical methods Syntactical methods involve formal grammar approach for symbol recognition [109, 140].
Multimed Tools Appl
4.4.1 Graph grammar based syntactical methods A grammar stores all the instances of a symbol, the recognition of an input symbol is performed by parsing it, to test whether this symbol can be generated by the given grammar or not [82]. A well known grammar based approach is the use of graph grammar for graphic symbol recognition [167]. Graph grammars can be used to parse, generate, or transforms graphs. A graph grammar is specified by a start graph g and a set of production rules. The production rules is used to replace one subgraph by another. For example a production g1 → gr (E), where g1 and g2 are unattributed graphs which may have labeled nodes and/or edges, and (E) specifies an embedding. During application of a production, we locate a subgraph g1host of the host graph which is isomorphic to g1. Then g1host is replaced by grhost , which is a a subgraph isomorphic to gr . Let the RestGraph denote the host graph minus g1host . There are different ways of attaching grhost to RestGraph. This attachment is specified by embedding transformation. Embedding transformation process is used to establish edges between the nodes of RestGraph and the nodes of g1host . The research of graph grammar is mainly focused on the study of embedding transformation study. A survey on graph grammar based approach is available in [53]. In [23], the authors proposed that how attributed programmed graph grammars can be used for the interpretation of schematic diagrams. The process of diagram understanding are divided into two stages, i.e. symbol recognition and high level recognition. The high level recognition is the process of extracting information content from a given two-dimensional arrangement of symbols. In [54], authors proposed a graph grammar based approach, which used the strategy of build-weed-incorporate organization, for automatic recognition of musical scores for high level recognition, i.e. diagram recognition. The graph grammar is programmed to repeatedly process the following three stages. The first stage is to formed all edges which represent potentially meaningful associations (Build). Secondly, edges representing conflicting or uninteresting associations are removed (Weed). Thirdly, the association’s semantics of the remaining associations are incorporated into the attributes of the nodes (Incorporate). Graph grammar is also used for parsing mathematical formula in optical formula recognition [86]. In the proposed method, for efficient parsing of mathematical formula, authors provided a context-sensitive graph grammar to parse the mathematical formula. In [149], the authors proposed a new technique for texture recognition in architectural drawings. The technique is based on graph grammar to model textural symbol in architectural plans by deriving a graph grammar from a structure texture, which is based on a region adjacency graph representation, and the production rules are based on the neighboring relations.
4.4.2 Automata and logic based syntactical methods There is some research work available in the domain of graphical symbol other than graph grammar based syntactical methods. For example, in [44], the authors proposed a DFA (Deterministic Finite Automata) based approach, to demonstrate the profile of a dimension set of engineering machine drawings. In [174], the authors proposed a novel syntactical approach for graphical symbol recognition. The authors claim that the proposed method build up a mathematical model to preciously describe the location information of a primitive with respect to an object instead of describing geometric relation between primitive pairs. The proposed method is also invariant to rotation and scaling. In [150], authors proposed another syntactical method, i.e. inductive logic programming to deal the problem of graphical symbol recognition.
Multimed Tools Appl
5 Conclusion In this paper, we revise and discuss the problem of graphical symbol recognition. The graphic symbol recognition approaches can be categorized into: statistical based approaches, structural based approaches, hybrid methods and syntactical approach. Each of these techniques have advantages and disadvantages in their respective application domains. The major problem being identified in all these approaches is the selection of optimal set of features for the recognition of graphic symbol. Due to irregular structure and shape, wellknown features do not perform well during the classification step of graphic recognition process. Similarly, it has been observed that computation complexity is the common limitation in each of the approach, due to the complex nature of the problem. The main challenges researchers still face in the symbol recognition domain are symbol spotting, segmentation, and symbol representation, for feature extraction phase. In terms of matching and classification, the graphic symbol recognition has the challenge of scalability, robustness, efficiency, and generality. In must be noted that, the application domain of graphical symbol recognition is very vast and symbol recognition can be applied in other field of image processing and pattern matching.
Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References 1. Ablameyko S (1997) An introduction to interpretation of graphic images. SPIE Press, vol 27 2. Ablameyko S, Bereishik V, Frantskevich O, Homenko M, Paramonova N (1998) Knowledge-based recognition of crosshatched areas in engineering drawings. In: Advances in Pattern Recognition. Springer, pp 460–467 3. Adam S, Ogier J-M, Cariou C, Gardes J, Mullot R, Lecourtier Y (2000) Combination of invariant pattern recognition primitives on technical documents. In: Graphics Recognition Recent Advances. Springer, pp 238–245 4. Ah-Soon C, Tombre K (2001) Architectural symbol recognition using a network of constraints. Pattern Recogn Lett 22(2):231–248 5. Almaz´an J, Forn´es A, Valveny E (2012) A non-rigid appearance model for shape description and recognition. Pattern Recogn 45(9):3105–3113 ´ Co¨uasnon B, Dambreville F (2000) A symbol classifier able to reject wrong shapes for 6. Anquetil E, document recognition systems. In: Graphics Recognition Recent Advances. Springer, pp 209–218 7. Antoine D, Collin S, Tombre K (1992) Analysis of technical documents: The redraw system. In: Structured document image analysis. Springer, pp 385–402 8. Aoki Y, Shio A, Arai H, Odaka K (1996) A prototype system for interpreting hand-sketched floor plans. In: 1996., Proceedings of the 13th International Conference on Pattern Recognition. IEEE, vol 3, pp 747–751 9. Arbter K, Snyder WE, Burkhardt H, Hirzinger G (1990) Application of affine-invariant fourier descriptors to recognition of 3-d objects. IEEE Trans Pattern Anal Mach Intell 12(7):640–647 10. Arias JF, Lai CP, Chandran S, Kasturi R, Chhabra A (1993) Interpretation of telephone system manhole drawings. In: 1993., Proceedings of the Second International Conference on Document Analysis and Recognition. IEEE, pp 365–368 11. Armand J-P (1993) Musical score recognition: a hierarchical and recursive approach. In: 1993., Proceedings of the Second International Conference on Document Analysis and Recognition. IEEE, pp 906–909 12. Attalla E, Siy P (2005) Robust shape similarity retrieval based on contour segmentation polygonal multiresolution and elastic matching. Pattern Recogn 38(12):2229–2241
Multimed Tools Appl 13. Bailey RR, Srinath M (1996) Orthogonal moment features for use with parametric and non-parametric classifiers. IEEE Trans Pattern Anal Mach Intell 18(4):389–399 14. Ballard DH (1981) Generalizing the hough transform to detect arbitrary shapes. Pattern Recogn 13(2):111–122 15. Belkasim SO, Shridhar M, Ahmadi M (1991) Pattern recognition with moment invariants: a comparative study and new results. Pattern Recogn 24(12):1117–1138 16. Blostein D, Baird HS (1992) A critical survey of music image analysis. In: Structured Document Image Analysis. Springer, pp 405–434 17. Blostein D (1995) General diagram-recognition methodologies. In: Graphics Recognition Methods and Applications. Springer, pp 106–122 18. Boatto L, Consorti V, Del Buono M, Di Zenzo S, Eramo V, Esposito A, Melcarne F, Meucci M, Morelli A, Mosciatti M et al (1992) An interpretation system for land register maps. Computer 25(7):25–33 19. Boumaiza A, Tabbone S (2012) Impact of a codebook filtering step on a galois lattice structure for graphics recognition. In: 2012 21st International Conference on Pattern Recognition (ICPR). IEEE, pp 278–281 20. Boumaiza A, Tabbone S (2012) Symbol recognition using a galois lattice of frequent graphical patterns. In: 2012 10th IAPR International Workshop on Document Analysis Systems (DAS). IEEE, pp 165–169 21. Bunke H (1982) Automatic interpretation of lines and text in circuit diagrams. In: Pattern Recognition Theory and Applications. Springer, pp 297–310 22. Bunke H (1982) Experience with several methods for the analysis of schematic diagrams. In: Proceedings of 6th International Conference on Pattern Recognition, pp 710–712 23. Bunke H (1982) Attributed programmed graph grammars and their application to schematic diagram interpretation. IEEE Trans Pattern Anal Mach Intell 6:574–582 24. Bunke H, Sanfeliu A (1990) Syntactic and structural pattern recognition, theory and applications. World Scientific, vol 7 25. Bunke H, Messmer BT (1995) Efficient attributed graph matching and its application to image analysis. In: Image Analysis and Processing. Springer, pp 44–55 26. Bunke H, Messmer BT (1997) Recent advances in graph matching. Int J Pattern Recognit Artif Intell 11(01):169–203 27. Bunke H, Riesen K (2011) Recent advances in graph-based pattern recognition with applications in document analysis. Pattern Recogn 44(5):1057–1067 28. Celik M, Yanikoglu B (2011) Probabilistic mathematical formula recognition using a 2d context-free graph grammar. In: 2011 International Conference on Document Analysis and Recognition (ICDAR). IEEE, pp 161–166 29. Cesarini F, Francesconi E, Gori M, Marinai S, Sheng J, Soda G (1997) A neural-based architecture for spot-noisy logo recognition. In: 1997., Proceedings of the Fourth International Conference on Document Analysis and Recognition. IEEE, vol 1, pp 175–179 30. Chang M-T, Chen S-Y (2001) Deformed trademark retrieval based on 2d pseudo-hidden markov model. Pattern Recogn 34(5):953–967 31. Chaudhuri BB, Garain U (2000) An approach for recognition and interpretation of mathematical expressions in printed document. Pattern Anal Appl 3(2):120–131 32. Chhabra A (1998) Graphic symbol recognition: An overview. In: Tombre K, Chhabra A (eds) Graphics Recognition Algorithms and Systems, ser. Lecture Notes in Computer Science, vol 1389. Springer, Berlin, pp 68?-79 33. Chiang Y-Y, Leyk S, Knoblock CA (2014) A survey of digital map processing techniques. ACM Comput Surv (CSUR) 47(1):1 34. Chong C-W, Raveendran P, Mukundan R (2003) A comparative analysis of algorithms for fast computation of zernike moments. Pattern Recogn 36(3):731–742 35. Collin S, Colnet D (1994) Syntactic analysis of technical drawing dimensions. Int J Pattern Recognit Artif Intell 8(05):1131–1148 36. Conte D, Foggia P, Sansone C, Vento M (2004) Thirty years of graph matching in pattern recognition. Int J Pattern Recogn Artif Intell 18(03):265–298 37. Cordella LP, Vento M (2000) Symbol recognition in documents: a collection of techniques? Int J Doc Anal Recogn 3(2):73–88 38. Coustaty M, Bertet K, Visani M, Ogier J-M (2011) A new adaptive structural signature for symbol recognition by using a galois lattice as a classifier. IEEE Trans Syst Man Cybern Part B: Cybern 41(4):1136–1148 39. De P, Mandal S, Bhowmick P, Das A (2015) Askme: adaptive sampling with knowledge-driven vectorization of mechanical engineering drawings. International Journal on Document Analysis and Recognition (IJDAR) 19(1):11–29
Multimed Tools Appl 40. de las Heras L-P, Ahmed S, Liwicki M, Valveny E, S´anchez G (2014) Statistical segmentation and structural recognition for floor plan interpretation. Int J Doc Anal Recogn (IJDAR) 17(3):221–237 41. Deans SR (2007) The Radon transform and some of its applications. Courier Corporation 42. den Hartog J, Ten Kate T, Gerbrands J (1996) Knowledge-based segmentation for automatic map interpretation. In: Graphics Recognition Methods and Applications. Springer, pp 159–178 43. Doermann D, Rivlin E, Weiss I (1996) Applying algebraic and differential invariants for logo recognition. Mach Vis Appl 9(2):73–86 44. Dori D (1989) A syntactic/geometric approach to recognition of dimensions in engineering machine drawings. Comput Vis Graph Image Process 47(3):271–291 45. Dori D (1997) Orthogonal zig-zag: an algorithm for vectorizing engineering drawings compared with hough transform. Adv Eng Softw 28(1):11–24 46. Dosch P, Masini G, Tombre K (2000) Improving arc detection in graphics recognition. In: Icpr. IEEE, pp 2243 47. Dosch P, Tombre K, Ah-Soon C, Masini G (2000) A complete system for the analysis of architectural drawings. Int J Doc Anal Recognit 3(2):102–116 48. Dosch P, Llad´os J (2004) Vectorial signatures for symbol discrimination. In: Graphics Recognition. Recent Advances and Perspectives. Springer, pp 154–165 49. Duda RO, Hart PE et al (1973) Pattern classification and scene analysis, vol 3. Wiley, New York 50. El-ghazal A, Basir O, Belkasim S (2007) A new shape signature for fourier descriptors. In: 2007. ICIP 2007. IEEE International Conference on Image Processing. IEEE, vol 1, pp I–161 51. Escalera S, Forn´es A, Pujol O, Llad´os J, Radeva P (2011) Circular blurred shape model for multiclass symbol recognition. IEEE Trans Syst Man Cybern Part B: Cybern 41(2):497–506 52. Escalera S, Forn´es A, Pujol O, Radeva P, S´anchez G, Llad´os J (2009) Blurred shape model for binary and grey-level symbol recognition. Pattern Recogn Lett 30(15):1424–1433 53. Fahmy H, Blostein D (1992) A survey of graph grammars: Theory and applications. In: 1992. Vol. II. Conference B: Pattern Recognition Methodology and Systems, Proceedings., 11th IAPR International Conference on Pattern Recognition. IEEE, pp 294–298 54. Fahmy H, Blostein D (1993) A graph grammar programming style for recognition of music notation. Mach Vis Appl 6(2-3):83–99 55. Fei T, Rui W, Qingxin D, Lei X New approach for oil-well symbol recognition in petroleum geological structure map. In: 2010 International Conference on Electrical and Control Engineering (ICECE). IEEE, 2010, pp 5357–5360 56. Feng G, Viard-Gaudin C, Sun Z (2009) On-line hand-drawn electric circuit diagram recognition using 2d dynamic programming. Pattern Recogn 42(12):3215–3223 57. Foggia P, Percannella G, Vento M (2014) Graph matching and learning in pattern recognition in the last 10 years. Int J Pattern Recogn Artif Intell 28(01):1450001 58. Foulds LR (2012) Graph theory applications. Springer Science & Business Media, Berlin 59. Francesconi E, Frasconi P, Gori M, Marinai S, Sheng J, Soda G, Sperduti A (1998) Logo recognition by recursive neural networks. In: Graphics Recognition Algorithms and Systems. Springer, pp 104–117 60. Gelernter J (2009) Image indexing in article component databases. J Amer Soc Inf Sci Technol 60(10):1965–1976 61. GREC (2003) International symbol recognition contest at grec2003 62. Groen FC, Sanderson AC, Schlag JF (1985) Symbol recognition in electrical diagrams using probabilistic graph matching. Pattern Recogn Lett 3(5):343–350 63. Hamada AH (1991) Structural recognition of disturbed symbols using discrete relaxation. Centre de Recherche en Informatique 64. Hilaire X, Tombre K (2006) Robust and accurate vectorization of line drawings. IEEE Trans Pattern Anal Mach Intell 28(6):890–904 65. Hornby AS, Wehmeier S (1995) Oxford advanced learner’s dictionary, vol 1428. Oxford University Press, Oxford 66. Hoshino T, Suzuki S, Kosugi M (1986) Automatic input method for large scale maps. In: Proceedings of the 8th International Conference on Pattern Recognition, pp 449–453 67. ISO I (2007) Standard graphical symbols: Public information symbols (iso 7001: 2007). International Standards Organisation (ISO), Geneva 68. Jain AK, Duin RP, Mao J (2000) Statistical pattern recognition: A review. IEEE Trans Pattern Anal Mach Intell 22(1):4–37 69. Janssen RD, Vossepoel AM (1997) Adaptive vectorization of line drawing images. Comput Vis Image Understand 65(1):38–56 70. Jiang X, M¨unger A, Bunke H (2000) Synthesis of representative graphical symbols by computing generalized median graph. In: Graphics Recognition Recent Advances. Springer, pp 183–192
Multimed Tools Appl 71. Jouili S, Tabbone S (2011) Towards performance evaluation of graph-based representation. In: GraphBased Representations in Pattern Recognition. Springer, pp 72–81 72. Kadyrov A, Petrou M (2001) The trace transform and its applications. IEEE Trans Pattern Anal Mach Intell 23(8):811–828 73. Kamila N, Mahapatra S, Nanda S (2005) Retracted: Invariance image analysis using modified zernike moments. Pattern Recogn Lett 26(6):747–753 74. Kanungo T, Haralick R, Dori D (1995) Understanding engineering drawings: A survey. In: Proceedings of First IARP Workshop on Graphics Recognition, pp 217–228 75. Kassim AA, Tan T, Tan K (1999) A comparative study of efficient generalised hough transform techniques. Image Vis Comput 17(10):737–748 76. Kasturi R, Bow ST, El-Masri W, Shah J, Gattiker JR, Mokate UB (1990) A system for interpretation of line drawings. IEEE Trans Pattern Anal Mach Intell 12(10):978–992 77. Kauppinen H., Sepp¨anen T, Pietik¨ainen M (1995) An experimental comparison of autoregressive and fourier-based descriptors in 2d shape classification. IEEE Trans Pattern Anal Mach Intell 17(2):201– 207 78. Kherfi ML, Ziou D, Bernardi A (2004) Image retrieval from the world wide web, Issues, techniques, and systems. ACM Comput Surv (CSUR) 36(1):35–67 79. Khotanzad A, Hong YH (1990) Invariant image recognition by zernike moments. IEEE Trans Pattern Anal Mach Intell 12(5):489–497 80. Kim S, Suh J, Kim J (1993) Recognition of logic diagrams by identifying loops and rectilinear polylines. In: 1993., Proceedings of the Second International Conference on Document Analysis and Recognition. IEEE, pp 349–352 81. King S (1982) Syntactic pattern recognition and applications 82. Kiyko VM (1995) Recognition of objects in images of paper based line drawings. In: 1995., Proceedings of the Third International Conference on Document Analysis and Recognition. IEEE, vol 2, pp 970–973 83. Kultanen P, Oja E, Xu L (1990) {R} andomized {H} ough {T} ransform {(RHT)} in engineering drawing vectorization system 84. Kuner P, Ueberreiter B (1988) Pattern recognition by graph matching-combinatorial versus continuous optimization. Int J Pattern Recognit Artif Intell 2(03):527–542 85. Lam L, Lee S-W, Suen CY (1992) Thinning methodologies-a comprehensive survey. IEEE Trans Pattern Anal Mach Intell 14(9):869–885 86. Lavirotte S, Pottier L (1997) Optical formula recognition. In: 1997., Proceedings of the Fourth International Conference on Document Analysis and Recognition. IEEE, vol 1, pp 357–361 87. Lee S-W, Kim JH, Groen FC (1990) Translation-, rotation-and scale-invariant recognition of handdrawn symbols in schematic diagrams. Int J Pattern Recognit Artif Intell 4(01):1–25 88. Lee SW (1992) Recognizing hand-drawn electrical circuit symbols with attributed graph matching. In: Structured document image analysis. Springer, pp 340–358 89. Lee H-J, Lee M-C (1994) Understanding mathematical expressions using procedure-oriented transformation. Pattern Recogn 27(3):447–457 90. Lee W, Kara LB, Stahovich TF (2007) An efficient graph-based recognizer for hand-drawn symbols. Comput Graph 31(4):554–567 91. Lee H-L, Gong S-J, Chen L-H (2016) An online handwritten recognition system of music score. In: 2016 International Conference on Machine Learning and Cybernetics (ICMLC). IEEE, vol 2, pp 552– 557 92. Liao SX, Pawlak M (1998) On the accuracy of zernike moments for image analysis. IEEE Trans Pattern Anal Mach Intell 20(12):1358–1364 93. Lin C-S, Hwang C-L (1987) New forms of shape invariants from elliptic fourier descriptors. Pattern Recogn 20(5):535–545 94. Lin Y, Wenyin L, Jiang C (2004) A structural approach to recognizing incomplete graphic objects. In: 2004. ICPR 2004. Proceedings of the 17th International Conference on Pattern Recognition. IEEE, vol 1, pp 371–375 95. Lin X, Gao L, Tang Z, Baker J, Sorge V (2014) Mathematical formula identification and performance evaluation in pdf documents. Int J Doc Anal Recogn (IJDAR) 17(3):239–255 96. Liu X (2012) Note symbol recognition for music scores. In: Intelligent Information and Database Systems. Springer, pp 263–273 97. Llad´os J (1997) Combining graph matching and hough transform for handdrawn graphical document analysis, Application to Architectural Drawings. Phd, Universitat Autonoma de Barcelona 98. Llad´os J, L´opez-Krahe J, Mart´ı E (1997) A system to understand hand-drawn floor plans using subgraph isomorphism and hough transform. Mach Vis Appl 10(3):150–158
Multimed Tools Appl 99. Llad´os J, Mart´ı E, L´opez-Krahe J (1999) A hough-based method for hatched pattern detection in maps and diagrams. In: 1999. ICDAR’99. Proceedings of the Fifth International Conference on Document Analysis and Recognition. IEEE, pp 479–482 100. Llad´os J, S´anchez G, Mart´ı E (1998) A string based method to recognize symbols and structural textures in architectural plans. In: Graphics Recognition Algorithms and Systems. Springer, pp 91–103 101. Llad´os J, Valveny E, S´anchez G, Mart´ı E (2002) Symbol recognition: Current advances and perspectives. In: Graphics Recognition Algorithms and Applications. Springer, pp 104–128 102. Llados J, Sanchez G (2004) Graph matching versus graph parsing in graphics recognition a combined approach. Int J Pattern Recognit Artif Intell 18(03):455–473 103. Luqman MM, Brouard T, Ramel J-Y (2009) Graphic symbol recognition using graph based signature and bayesian network classifier. In: Document Analysis and Recognition, 2009. ICDAR’09. 10th International Conference on. IEEE, pp 1325–1329 104. Luqman MM, Delalandre M, Brouard T, Ramel J-Y, Llad´os J (2010) Employing fuzzy intervals and loop-based methodology for designing structural signature: an application to symbol recognition, arXiv:1004.5427 105. Luqman MM, Delalandre M, Brouard T, Ramel J-Y, Llad´os J (2010) Fuzzy intervals for designing structural signature: An application to graphic symbol recognition. In: graphics recognition. Achievements, Challenges, and Evolution. Springer, pp 12–24 106. Luqman MM, Llad´os J, Ramel J-Y, Brouard T (2010) A fuzzy-interval based approach for explicit graph embedding. In: Recognizing patterns in signals, speech, images and videos. Springer, pp 93–98 107. Luqman MM, Llad´os J, Ramel J-Y, Brouard T (2011) Dimensionality reduction for fuzzy-interval based explicit graph embedding. In: Ninth IAPR International Workshop on Graphics RECognition, vol 9, pp 117–120 108. Luqman MM, Ramel J-Y, Llad´os J, Brouard T (2013) Fuzzy multilevel graph embedding. Pattern Recogn 46(2):551–565 109. Lutz S (2014) Whats right with a syntactic approach to theories and models. Erkenntnis 79(8):1475– 1492 110. Madej D (1991) An intelligent map-to-cad conversion system. In: Proceedings of 1st. Int. Conf. on Document Analysis and Recognition, pp 602–610 111. Maes M (1991) Polygonal shape recognition using string-matching techniques. Pattern Recogn 24(5):433–440 112. Mehlhorn K, Brauer W, Rozenberg G, Salomaa A (1984) Data structures and algorithms. Volume 2: Graph algorithms and NP-completeness. Springer, Berlin 113. Messmer BT, Bunke H (2000) Efficient subgraph isomorphism detection: a decomposition approach. IEEE Trans Knowl Data Eng 12(2):307–323 114. Miao Q, Xu P, Li X, Song J, Li W, Yang Y (2017) The recognition of the point symbols in the scanned topographic maps. IEEE Trans Image Process 26(6):2751–2766 115. Min W, Tang Z, Tang L (1993) Using web grammar to recognize dimensions in engineering drawings. Pattern Recogn 26(9):1407–1416 116. MIYAO H, NAKANO Y (1996) Note symbol extraction for printed piano scores using neural networks*. IEICE Trans Inf Syst 79(5):548–554 117. Mouch`ere H, Zanibbi R, Garain U, Viard-Gaudin C (2016) Advancing the state of the art for handwritten math recognition: the crohme competitions, 2011–2014. Int J Doc Anal Recogn (IJDAR) 19(2):173–189 118. Myers GK, Mulgaonkar PG, Chen C-H, DeCurtins JL, Chen E (1996) Verification-based approach for automated text and feature extraction from raster-scanned maps. In: Graphics Recognition Methods and Applications. Springer, pp 190–203 119. Nagy G (1968) State of the art in pattern recognition. IEEE 56(5):836–863 120. O’Gorman L (1995) Basic techniques and symbol-level recognition-an overview. In: Graphics Recognition Methods and Applications. Springer, pp 1–12 121. Okazaki A, Kondo T, Mori K, Tsunekawa S, Kawamoto E (1988) An automatic circuit diagram reader with loop-structure-based symbol recognition. IEEE Trans Pattern Anal Mach Intell 10(3):331–341 122. Paek S, Smith JR (1998) Detecting image purpose in world wide web documents. In: Photonics West’98 Electronic Imaging. International Society for Optics and Photonics, pp 151–158 123. Pal SK, Rosenfeld A (1991) A fuzzy medial axis transformation based on fuzzy disks. Pattern Recogn Lett 12(10):585–590 124. Park BG, Lee KM, Lee SU, Lee JH (2003) Recognition of partially occluded objects using probabilistic arg (attributed relational graph)-based matching. Comput Vis Image Underst 90(3):217–241 125. Patare MD, Joshi MS (2016) Hand-drawn digital logic circuit component recognition using svm. International Journal of Computer Applications 143(3):24–28
Multimed Tools Appl 126. Pavlidis T (1977) Structural pattern recognition, vol 2, Springer-verlag, New York 127. Persoon E, Fu K-S (1977) Shape discrimination using fourier descriptors. IEEE Trans Syst Man Cybern 7(3):170–179 128. Pinjarkar L, Sharma M, Selot S (2018) Efficient system for color logo recognition based on selforganizing map and relevance feedback technique. In: Smart Computing and Informatics. Springer, pp 53–62 129. Prokop RJ, Reeves AP (1992) A survey of moment-based techniques for unoccluded object representation and recognition. CVGIP: Graph Models Image Process 54(5):438–460 130. Qureshi RJ, Ramel J-Y, Cardot H, Mukherji P (2007) Combination of symbolic and statistical features for symbols recognition. In: 2007. ICSCN’07. International Conference on Signal Processing, Communications and Networking. IEEE, pp 477–482 131. Rabbani M, Khoshkangini R, Nagendraswamy H, Conti M (2016) Hand drawn optical circuit recognition. Procedia Comput Sci 84:41–48 132. Ramel J-Y, Boissier G, Emptoz H (2000) A structural representation adapted to handwritten symbol recognition. In: Graphics Recognition Recent Advances. Springer, pp 228–237 133. Ramel J-Y, Vincent N, Emptoz H (2000) A structural representation for understanding line-drawing images. Int J Doc Anal Recognit 3(2):58–66 134. Randriamahefa R, Cocquerez JP, Fluhr C, Pepin F, Philipp S (1993) Printed music recognition. In: 1993., Proceedings of the Second International Conference on Document Analysis and Recognition. IEEE, pp 898–901 135. Rebelo A, Capela G, Cardoso JS (2010) Optical recognition of music symbols. Int J Doc Anal Recogn (IJDAR) 13(1):19–31 136. Reiher E, Li Y, Donne VD, Lalonde M, Hayne C, Zhu C (1996) A system for efficient and robust map symbol recognition. In: 1996., Proceedings of the 13th International Conference on Pattern Recognition. IEEE, vol 3, pp 783–787 137. Robles-Kelly A, Hancock ER (2004) String edit distance, random walks and graph matching. Int J Pattern Recognit Artif Intell 18(03):315–327 138. Rosenfeld A (1984) Image analysis: problems, progress and prospects. Pattern Recogn 17(1):3–12 139. Rosenfeld A (1986) Axial representations of shape. Comput Vis Graph Image Process 33(2):156–173 140. Rosenfeld A (1990) Array, tree and graph grammars. Syntactic and Structural Pattern Recognition: Theory and Applications, pp 85–115 141. Rosenfeld A, Kak AC (2014) Digital picture processing, vol 1. Elsevier, New York 142. Rosin PL (1997) Techniques for assessing polygonal approximations of curves. IEEE Trans Pattern Anal Mach Intell 19(6):659–666 143. Rui Y, She AC, Huang TS (1997) A modified fourier descriptor for shape matching in mars. Ser Softw Eng Knowl Eng 8:165–180 144. Rusi˜nol M., Llad´os J. (2006) Symbol spotting in technical drawings using vectorial signatures. In: Graphics Recognition. Ten Years Review and Future Perspectives. Springer, pp 35–46 145. Rusi˜nol M, Llad´os J (2008) Word and symbol spotting using spatial organization of local descriptors. In: 2008. DAS’08. The Eighth IAPR International Workshop on Document Analysis Systems. IEEE, pp 489–496 146. Rusinol M, Llados J (2009) Logo spotting by a bag-of-words approach for document categorization. In: 2009. ICDAR’09. 10th International Conference on Document Analysis and Recognition. IEEE, pp 111–115 147. Sahoo PK, Soltani S, Wong AK (1988) A survey of thresholding techniques. Comput Vis Graph Image Process 41(2):233–260 148. Samet H, Soffer A (1996) Marco: Map retrieval by content. IEEE Trans Pattern Anal Mach Intell 18(8):783–798 149. S´anchez G, Llad´os J (2001) A graph grammar to recognize textured symbols. In: 2001. Proceedings. Sixth International Conference on Document Analysis and Recognition. IEEE, pp 465–469 150. Santosh K, Lamiroy B, Ropers J-P (2009) Inductive logic programming for symbol recognition. In: 2009. ICDAR’09. 10th International Conference on Document Analysis and Recognition. IEEE, pp 1330–1334 151. Santosh K, Lamiroy B, Wendling L (2014) Integrating vocabulary clustering with spatial relations for symbol recognition. Int J Doc Anal Recogn (IJDAR) 17(1):61–78 152. Santosh K, Wendling L, Lamiroy B (2014) Bor: Bag-of-relations for symbol retrieval, vol 28 153. Seo W, Koo HI, Cho NI (2015) Junction-based table detection in camera-captured document images. Int J Doc Anal Recogn (IJDAR) 18(1):47–57 154. Soffer A, Samet H (1998) Using negative shape features for logo similarity matching. In: 1998. Proceedings. Fourteenth International Conference on Pattern Recognition. IEEE, vol 1, pp 571–573
Multimed Tools Appl 155. St¨uckelberg MV, Doermann D (1999) On musical score recognition using probabilistic reasoning. In: 1999. ICDAR’99. Proceedings of the Fifth International Conference on Document Analysis and Recognition. IEEE, pp 115–118 156. Suzuki S, Yamada T (1990) Maris: Map recognition input system. In: Mapping and spatial modelling for navigation. Springer, pp 95–116 157. Tabbone S, Wendling L (2003) Binary shape normalization using the radon transform. In: Discrete Geometry for Computer Imagery. Springer, pp 184–193 158. Tabbone S, Wendling L, Tombre K (2003) Matching of graphical symbols in line-drawing images using angular signature information. Doc Anal Recogn 6(2):115–125 159. Tabbone S, Wendling L, Salmon J-P (2006) A new shape descriptor defined on the radon transform. Comput Vis Image Underst 102(1):42–51 160. Tan Q, Mitra P, Giles CL (2009) Effectively searching maps in web documents. In: Advances in Information Retrieval. Springer, pp 162–176 161. Tang YY, Lee S-W, Suen CY (1996) Automatic document processing: a survey. Pattern Recogn 29(12):1931–1952 162. Teague MR (1980) Image analysis via the general theory of moments*. JOSA 70(8):920–930 163. Teh C-H, Chin RT (1988) On image analysis by the methods of moments. IEEE Trans Pattern Anal Mach Intell 10(4):496–513 164. Tombre K (1997) Analysis of engineering drawings: State of the art and challenges. In: Graphics Recognition Algorithms and Systems. Springer, pp 257–264 165. Tombre K, Ah-Soon C, Dosch P, Habed A, Masini G (1998) Stable, robust and off-the-shelf methods for graphics recognition, vol 1, IEEE 166. Valveny E, Marti E (2000) Hand-drawn symbol recognition in graphic documents using deformable template matching and a bayesian framework. In: 2000. Proceedings. 15th International Conference on Pattern Recognition. IEEE, vol 2, pp 239–242 167. Viswanathan M (1992) Analysis of scanned documents a syntactic approach. In: Structured Document Image Analysis. Springer, pp 115–136 168. Vo Q, Lee G, Kim S, Yang H (2017) Recognition of music scores with non-linear distortions in mobile devices. Multimedia Tools and Applications:1–19. https://doi.org/10.1007/s1104 169. Wall K, Danielsson P-E (1984) A fast sequential method for polygonal approximation of digitized curves. Graph Image Process Comput Vis 28(2):220–227 170. Wenyin L, Qian W, Xiao R, Jin X (2001) Smart sketchpad-an on-line graphics recognition system. In: 2001. Proceedings. Sixth International Conference on Document Analysis and Recognition. IEEE, pp 1050–1054 171. Wenyin L, Zhang W, Yan L (2007) An interactive example-driven approach to graphics recognition in engineering drawings. Int J Doc Anal Recogn (IJDAR) 9(1):13–29 172. Xiaogang X, Zhengxing S, Binbin P, Xiangyu J, Wenyin L (2004) An online composite graphics recognition approach based on matching of spatial relation graphs. Doc Anal Recogn 7(1):44–55 173. Yadid-Pecht O, Gerner M, Dvir L, Brutman E, Shimony U (1996) Recognition of handwritten musical notes by a modified neocognitron. Mach Vis Appl 9(2):65–72 174. Yajie Y, Zhang W, Wenyin L (2007) A new syntactic approach to graphic symbol recognition. In: 2007. ICDAR 2007. Ninth International Conference on Document Analysis and Recognition. IEEE, vol 1, pp 516–520 175. Yang D, Webster JL, Renmdell L, Garrett Jr JH, Shaw DS (1993) Management of graphical symbols in a cad environment: a neural network approach. In: 1993. TAI’93. Proceedings., Fifth International Conference on Tools with Artificial Intelligence. IEEE, pp 272–279 176. Yang S (2005) Symbol recognition via statistical integration of pixel-level constraint histograms: a new descriptor. IEEE Trans Pattern Anal Machi Intell 2:278–281 177. Zahn CT, Roskies RZ (1972) Fourier descriptors for plane closed curves. IEEE Trans Comput 100(3):269–281 178. Zhang D, Lu G (2002) Shape-based image retrieval using generic fourier descriptor. Signal Process Image Commun 17(10):825–848 179. Zhang D, Lu G (2005) Study and evaluation of different fourier methods for image retrieval. Image Vis Comput 23(1):33–49 180. Zhang W, Wenyin L, Zhang K (2006) Symbol recognition with kernel density matching. IEEE Trans Pattern Anal Mach Intell 12:2020–2024 181. Zhang J, Du J, Dai L (2018) Multi-scale attention with dense encoder for handwritten mathematical expression recognition, arXiv:1801.03530 182. Zhu G, Doermann D (2009) Logo matching for document image retrieval. In: 2009. ICDAR’09. 10th International Conference on Document Analysis and Recognition. IEEE, pp 606–610
Multimed Tools Appl
Mr. Irshad Khan received his M.S degree in computer science from National University of Computer and Emerging Sciences Peshawar 2009. He is pursuing his Ph.D. degree in computer science from National University of Computer and Emerging Sciences Peshawar. He is currently an Assistant Professor at Sarhad University of Science and Information Technology, Peshawar, Pakistan. His area of expertise includes computer vision and image processing, pattern matching.
Dr. Naveed Islam has completed his PhD in computer science from University of Montpellier II, France in 2011. His research interests include digital image processing, computer vision, artificial intelligence, visual data protection and security. He is the author of numerous international journal and conference articles. He has worked as a computer vision R&D engineer at different organization in France. He is currently Assistant Professor at Islamia College University, Peshawar, Pakistan.
Multimed Tools Appl
Dr. Hafeez Ur Rehman Received his B.S. degree in computer science from Comsats Abbottabad Pakistan in 2006. He completed his Ph.D. degree in information and system engineering from Politecnico Di Tornio, Italy. His research interests include digital image process, bioinformatics, computer vision and artificial intelligence. He is currently Assistant Professor at FAST-National University of Computer and Emerging Sciences, Peshawar, Pakistan.
Dr. Murad Khan Received his B.S. degree in computer science from university of Peshawar Pakistan in 2008. He completed his Ph.D. degree in computer science and engineering from School of Computer Science and Engineering in Kyungpook National University, Daegu, Korea. Dr. Khan published over 40 International conference and Journal papers along with two books chapters in Springer and CRC press. He also served as a TPC member in world reputed conferences and as a reviewer in numerous journals such as Future Generation Systems (Elsevier), IEEE Access, etc. In 2016, he was awarded with Qualcomm innovation award at Kyungpook National University for designing a Smart Home Control System. He was also awarded with Bronze Medal in ACM SAC 2015, Salamanca, Spain, on his distinguished work in Multi-criteria based Handover Techniques. He is a member of various communities such as ACM and IEEE, CRC press, etc. His area of expertise includes ad-hoc and wireless networks, architecture designing for Internet of Things, and Communication Protocols designing for smart cities and homes, Big Data Analytics, etc.