Similarity Measure of Multiple Sets and its Application to Pattern Recognition

The multiple set is a recent addition to the family of generalized sets that models uncertainty together with multiplicity. It can represent numerous uncertain features of objects, each with several values. Multiple set theory has an edge over the well-established fuzzy set theory because it handles uncertainty and multiplicity simultaneously. Similarity measures of fuzzy sets are well studied in the literature and have found prominent applications in various domains. As the multiple set is a generalization of the fuzzy set, the concept and theory of similarity measure can be extended to multiple set theory and applied to various real-life problems. This paper introduces the concept of similarity measure of multiple sets, proposes two different similarity measures of multiple sets and investigates their properties. Finally, this work applies the concept of similarity measure of multiple sets to pattern recognition. A numerical illustration demonstrates the effectiveness of the proposed technique in this application.


Introduction
Various mathematical models are available in the literature to represent concepts like uncertainty, vagueness and inexactness. Such models include fuzzy sets, L-fuzzy sets [1], multisets [2], rough sets [3], intuitionistic fuzzy sets [4], fuzzy multisets [5], vague sets [6], multi fuzzy sets [7], etc. Each of these models has advanced into an elaborate theory and has numerous practical applications. A fuzzy set is characterized by a membership function which assigns a grade of membership to each object in the universal set. Even though the concept of fuzzy set is strong enough to handle uncertain data successfully, it can manage only one uncertain feature of an object at a time. Also, a fuzzy set fails to handle the multiplicity of objects. Later, the notion of fuzzy multiset was defined as an extension of the fuzzy set. A fuzzy multiset gives fuzzy membership values for identical copies of each object. The main advantage of the fuzzy multiset over the fuzzy set is that it can handle the multiplicity of objects. However, it can handle only one feature of the object at a time. On the other hand, the multi fuzzy set is also an extension of the fuzzy set, and gives fuzzy membership values for different features of objects. The main advantage of the multi fuzzy set over the fuzzy set is that it can simultaneously manage numerous uncertain characteristics of objects, but it fails to handle the multiplicity of objects. Recently, the multiple set was introduced to model uncertainty together with multiplicity. Its advantage lies in the fact that it simultaneously captures numerous uncertain features of objects together with their multiplicity. It was put forward by Shijina et al. [8,9] as a generalization of fuzzy set, multiset, fuzzy multiset and multi fuzzy set. Later, Shijina et al. [10,11] defined more operations, viz. aggregation operators and matrix norms, on multiple sets.
Then, the concept of relation on multiple sets was introduced and applied to a medical diagnosis problem [12]. As a continuation, this work attempts to extend the concept of similarity measure to multiple sets. Measuring the similarity between objects plays a crucial role in many real-life problems involving image processing, image retrieval, image compression, pattern recognition, clustering, information retrieval, etc. Many measures of similarity have been proposed and researched in the literature, and it has been shown that similarity measures are proficient in coping with uncertain information. For example, the theory of fuzzy sets, introduced by Zadeh [13], is a successful approach to confronting uncertainty. Fuzzy sets have considerable power to describe the objective world that we live in, and their strength has been demonstrated in several real-life applications. Zadeh himself initiated the idea of similarity measure of fuzzy sets [14]. Later, similarity measures of fuzzy sets were explored widely by many researchers [15,16,17,18,19,20,21,22,23], who applied them to real-life problems involving pattern recognition [24], image processing [25,26,27,28,29,30], etc. As an extension of fuzzy set theory, intuitionistic fuzzy set theory has been found to be highly useful in dealing with imprecision and uncertainty. Many different similarity measures between intuitionistic fuzzy sets have been proposed and are extensively applied in areas such as decision making [31,32], pattern recognition [33,34,35,36,37,38,39], etc. As a combined concept of intuitionistic fuzzy set and interval valued fuzzy set, Atanassov [40] introduced interval valued intuitionistic fuzzy sets. They furnish additional capability to deal with vague information and to model non-statistical uncertainty by providing both membership and nonmembership intervals.
A similarity measure of interval valued intuitionistic fuzzy sets was also proposed and has found applications in pattern recognition and multi-criteria decision making [41]. Type-2 fuzzy sets, which are an extension of fuzzy sets, were also proposed by Zadeh [42]. Their membership values are fuzzy sets on the interval [0, 1]. Type-2 fuzzy sets can improve certain kinds of inference better than fuzzy sets with increasing imprecision, uncertainty and fuzziness in information. Hung and Yang [43] presented a similarity measure of type-2 fuzzy sets based on the fuzzy Hausdorff distance. There were further studies of similarity measures on type-2 fuzzy sets [44,45,46], which have found applications in clustering [47,48,49], pattern recognition [50], students' evaluation [51], etc. The hesitant fuzzy set was first introduced by Torra [52] and Torra and Narukawa [53]. It permits the membership degree of an element to a set to comprise several possible values between 0 and 1. Hesitant fuzzy sets are very useful in dealing with situations where people are hesitant in providing their preference over objects in a decision making process. Therefore, the hesitant fuzzy set has played a significant role in uncertain systems and received much attention from researchers. Similarity measures of hesitant fuzzy sets [54] have been proposed, but they have not yet gained wide acceptance. Analogously, several similarity measures between sets have been proposed and have found many real-life applications. Here, however, we restrict our attention to the theory of similarity measures of fuzzy sets and its various applications, so that it can be used to define the similarity measure of multiple sets. Before presenting the theory of similarity measure of fuzzy sets, it is desirable to have a short discussion of its application in day-to-day life. So, in the following, the potential of similarity measures of fuzzy sets in real-life applications is reviewed. Weken et al.
[25] gave an overview of similarity measures of fuzzy sets which can be applied to images. These similarity measures are all pixel-based and fail to produce satisfactory results consistently. To overcome this drawback, Weken et al. [26] extended their work to propose similarity measures based on neighbourhoods so that the relevant structures of the images are observed better. In their survey paper on similarity measures of fuzzy sets, Weken et al. [27] established measures for image comparison. The same authors presented an overview of the possible application of similarity measures of fuzzy sets to colour images in [28]. Nachtegael et al. [30] presented a colour image retrieval system using a specific similarity measure of fuzzy sets. Li et al. [55] presented a faster algorithm for similarity measure using the center of gravity of fuzzy sets in content-based image retrieval. The discussion in [55] covers nearly all the similarity measures of fuzzy sets, which may be greatly helpful to both the development and application of fuzzy set theory for content-based image retrieval. Chen et al. [29] proposed a novel algorithm, viz. the normalized fuzzy similarity measure, to deal with nonlinear distortion in fingerprint images. Chaira and Ray [24] presented a region extraction algorithm to identify a colour region similar to the query image from an image database containing images with different types of colours. Here, the matching process is based on a similarity measure of fuzzy sets between the query image and the images in the database. Capitaine [56] proposed a general framework for designing similarity measures based on residual implication functions, presenting some new families of parametric similarity measures using parametric residual implications and an algorithm to learn the parameter of each similarity measure based on relevance degrees. El-Sayed and Aboelwafa [57] introduced a new approach for face recognition based on similarity measure of fuzzy sets.
Xu et al. [58] proposed a new similarity measure of fuzzy sets based on the extension of the Dice and cosine similarity measures and then applied the variation coefficient similarity to emergency group decision-making problems. They also gave a practical example evaluating the emergency management capability for a major snow disaster in Hunan province, China. Baccour [59] applied similarity measures of fuzzy sets reported in the existing literature to classification of shapes, mosaic recognition and Arabic sentence recognition. As discussed above, similarity measures of fuzzy sets have found widespread application in various fields such as image processing, pattern recognition, decision making, etc. The multiple set, which is an extension of the fuzzy set, is capable of handling uncertainty and multiplicity simultaneously. Motivated by the benefits of similarity measures of fuzzy sets, this work extends similarity measure to multiple sets. This paper proposes two different types of similarity measures: one is based on the similarity measure of fuzzy sets; the other is based on the similarity measure of fuzzy sets and fuzzy aggregation operators. We believe that similarity measures of multiple sets can handle uncertain information that involves both several features and multiplicity, and therefore have good scope for real-life applications. To substantiate this claim, we have applied the concept of similarity measure of multiple sets to pattern recognition, which is the first application of its kind. The rest of the paper is organized as follows. In Section 2, we briefly review some standard facts on multiple sets and the similarity measures of fuzzy sets. In Section 3, we derive two formulas for similarity measure on multiple sets and establish some of their properties. In Section 4, we indicate how these techniques may be applied to pattern recognition problems. In Section 5, we end the paper by summarizing the main conclusions.

Preliminaries
In this section, we first give some basic concepts related to multiple sets. Then, we proceed with a brief exposition of similarity measures of fuzzy sets. Throughout this paper, the following notation is used: $\mathbb{R}^+ = [0, \infty)$; $X$ is the universe of discourse; $|X|$ is the cardinality of $X$; capital letters $A$, $B$, $C$, etc. are fuzzy sets on $X$ and also denote the corresponding membership functions; $A(x)$ is the fuzzy membership value of the element $x$ in $X$; $\varphi$ is the fuzzy set with all membership values equal to 0; $I$ is the fuzzy set with all membership values equal to 1; $M$ is the fuzzy set with all membership values equal to 0.5; $\bar{A}$ is the complement of the fuzzy set $A$; $FS(X)$ is the class of all fuzzy sets of $X$; $P(X)$ is the class of all crisp subsets of $X$. Let $\mathcal{M} = M_{n \times k}([0, 1])$ denote the set of all matrices of order $n \times k$ with entries from $[0, 1]$ and, for $\alpha \in [0, 1]$, let $[\alpha]_{n \times k}$ denote the matrix in $\mathcal{M}$ with all its entries equal to $\alpha$.

For matrices $M$ and $N$ in $\mathcal{M}$:
4. Join of $M$ and $N$, denoted by $M \vee N$, is the matrix in $\mathcal{M}$ defined by $(M \vee N)_{ij} = M_{ij} \vee N_{ij}$ for every $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, k$.
5. Meet of $M$ and $N$, denoted by $M \wedge N$, is the matrix in $\mathcal{M}$ defined by $(M \wedge N)_{ij} = M_{ij} \wedge N_{ij}$ for every $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, k$.

From this definition it can be noted that $(\mathcal{M}, \le, [0]_{n \times k}, [1]_{n \times k})$ is a bounded lattice.
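The lattice operations on membership matrices can be sketched in a few lines of Python. This is only an illustration: the function names `join`, `meet`, `leq` and `constant` are hypothetical, matrices are plain nested lists, and the order relation is assumed to be the entrywise one.

```python
from typing import List

Matrix = List[List[float]]  # n x k matrix with entries in [0, 1]


def join(m: Matrix, n: Matrix) -> Matrix:
    """Entrywise maximum: (M v N)_ij = max(M_ij, N_ij)."""
    return [[max(a, b) for a, b in zip(rm, rn)] for rm, rn in zip(m, n)]


def meet(m: Matrix, n: Matrix) -> Matrix:
    """Entrywise minimum: (M ^ N)_ij = min(M_ij, N_ij)."""
    return [[min(a, b) for a, b in zip(rm, rn)] for rm, rn in zip(m, n)]


def leq(m: Matrix, n: Matrix) -> bool:
    """Assumed entrywise order: M <= N iff M_ij <= N_ij for all i, j."""
    return all(a <= b for rm, rn in zip(m, n) for a, b in zip(rm, rn))


def constant(alpha: float, n: int, k: int) -> Matrix:
    """The n x k matrix with every entry equal to alpha."""
    return [[alpha] * k for _ in range(n)]
```

Under these assumptions, `constant(0.0, n, k)` and `constant(1.0, n, k)` play the role of the least and greatest elements, and `meet(m, n)` never exceeds `join(m, n)` in the entrywise order.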

Multiple sets
A multiple set is a unified structure that represents numerous uncertain features of objects simultaneously, each with multiplicity. A multiple set uses distinct fuzzy membership functions to describe each uncertain feature of an object and assigns several values to each membership function according to the multiplicity. This is expressed by assigning a matrix to each object, where each row of the matrix corresponds to a distinct fuzzy membership function, that is, to one feature of the object. The entries in a row are the different values of the corresponding membership function according to its multiplicity. A multiple set can be defined as follows:

Definition 2.2. Let $X$ be a non-empty crisp set called the universal set and $A_1, A_2, \ldots, A_n$ be $n$ distinct fuzzy sets of $X$. For each $i = 1, 2, \ldots, n$, let $A_i^1(x), A_i^2(x), \ldots, A_i^k(x)$ be the membership values of the fuzzy set $A_i$ for $k$ identical copies of the element $x \in X$, in descending order. Then a multiple set $A$ of order $(n, k)$ over $X$ is an object of the form $A = \{(x, A(x)) : x \in X\}$, where for each $x \in X$ its membership value is an $n \times k$ matrix in $\mathcal{M} = M_{n \times k}([0, 1])$ given by

$$A(x) = \begin{bmatrix} A_1^1(x) & A_1^2(x) & \cdots & A_1^k(x) \\ A_2^1(x) & A_2^2(x) & \cdots & A_2^k(x) \\ \vdots & \vdots & & \vdots \\ A_n^1(x) & A_n^2(x) & \cdots & A_n^k(x) \end{bmatrix}.$$

Note that the fuzzy sets $A_1, A_2, \ldots, A_n$ evaluate $n$ distinct properties of objects and are called the underlying fuzzy sets of the multiple set $A$. Further, each underlying fuzzy set $A_i$ corresponds to $k$ fuzzy sets $A_i^1, A_i^2, \ldots, A_i^k$. The set of all multiple sets of order $(n, k)$ over $X$ is denoted by $MS_{(n,k)}(X)$. A multiple set $A$ of order $(n, k)$ over $X$ can thus be viewed as a function $A : X \to \mathcal{M}$, which maps each $x \in X$ to its $n \times k$ membership matrix $A(x)$ in $\mathcal{M}$.
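One way to picture Definition 2.2 in code is as a dictionary that maps each element of the universe to its membership matrix. The sketch below assumes this dictionary representation; the helper names `make_multiple_set` and `component` are hypothetical, and the sample grades in the usage note are invented for illustration.

```python
from typing import Dict, List

Matrix = List[List[float]]
MultipleSet = Dict[str, Matrix]  # element x -> n x k membership matrix A(x)


def make_multiple_set(grades: Dict[str, Matrix]) -> MultipleSet:
    """Validate the grades and sort every row in descending order."""
    result: MultipleSet = {}
    for x, matrix in grades.items():
        for row in matrix:
            if any(not 0.0 <= value <= 1.0 for value in row):
                raise ValueError(f"membership values of {x} must lie in [0, 1]")
        result[x] = [sorted(row, reverse=True) for row in matrix]
    return result


def component(ms: MultipleSet, i: int, j: int) -> Dict[str, float]:
    """Fuzzy set A_i^j (1-based i and j): maps x to entry (i, j) of A(x)."""
    return {x: matrix[i - 1][j - 1] for x, matrix in ms.items()}
```

For instance, `make_multiple_set({"x1": [[0.3, 0.9, 0.5], [0.2, 0.2, 0.8]]})` stores the first row as `[0.9, 0.5, 0.3]`, and `component(A, 1, 1)` then collects the largest grade of the first feature for every element.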
As an example, a multiple set can be used to represent the evaluation of a set of students by three experts with respect to intelligence, extracurricular activities, communication skill and personality. Suppose $X$ is the universal set of students under consideration and a panel of three experts evaluates the students under these four criteria. Then the performance of the students can be represented by a multiple set of order (4, 3): each student is assigned a $4 \times 3$ membership matrix with one row per criterion, and the entries of each row are the grades given by the three experts, in descending order.

Definition 2.8. The complement of $A$ is a multiple set in $MS_{(n,k)}(X)$, denoted by $\bar{A}$, whose membership matrix for each $x \in X$ is the $n \times k$ matrix $\bar{A}(x) = [\bar{A}_i^j(x)]$, where $\bar{A}_i^j(x) = 1 - A_i^{k-j+1}(x)$ for every $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, k$.

Similarity measure of fuzzy sets
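As an important topic in the theory of fuzzy sets, the similarity measure of fuzzy sets has been investigated extensively by many researchers from different points of view. However, there is no unique definition of similarity measure of fuzzy sets. Instead, there exist many special-purpose definitions which have been employed with success in cluster analysis, pattern recognition, image processing, classification, diagnostics and many other fields. Several similarity measures have been proposed and used for various purposes. For example, Zwick et al. [15] reviewed 19 measures of similarity and compared their performance in a behavioral experiment. Xuecheng [16] gave an axiomatic definition of similarity measure of fuzzy sets as follows:

Definition 2.9. A real function $S : FS(X) \times FS(X) \to \mathbb{R}^+$ is called a similarity measure of fuzzy sets if it satisfies the following axioms:
(1) $S(A, B) = S(B, A)$ for all $A, B \in FS(X)$;
(2) $S(D, \bar{D}) = 0$ for all $D \in P(X)$;
(3) $S(C, C) = \max_{A, B \in FS(X)} S(A, B)$ for all $C \in FS(X)$;
(4) For all $A, B, C \in FS(X)$, if $A \subseteq B \subseteq C$, then $S(A, B) \ge S(A, C)$ and $S(B, C) \ge S(A, C)$.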
On the basis of this definition, Xuecheng proposed a similarity measure on $\mathcal{F}$ for $X = [0, 1]$, defined for all $A, B \in \mathcal{F}$ by means of a function measurable with respect to the Borel field $\mathcal{B}_1$. Pappis and Karacapilidis [17] presented three similarity measures: (1) a measure based on the operations of union and intersection (2.2); (2) a measure based on the maximum difference (2.3); (3) a measure based on the difference and the sum of grades of membership (2.4). The authors summarized that similarity measures (2.2) and (2.4) satisfy a common list of properties, referred to below as (p1) to (p5). Hyung et al. [18] proposed a similarity measure of fuzzy sets using maximum and minimum operators. Three further measures have been compared: (1) a measure based on the geometric distance model (2.6); (2) a measure based on the set theoretic approach (2.7); (3) a measure based on the matching function [60] (2.8). It was summarized that similarity measure (2.6) satisfies the properties (p1), (p2), (p4) and (p5) and fails to satisfy (p3); similarity measure (2.7) satisfies the properties (p1) and (p3) and fails to satisfy (p2), (p4) and (p5); and similarity measure (2.8) satisfies the properties (p1) to (p5). Later, Wang et al. [19] made a comparative study of similarity measures and commented on the study of similarity measures introduced by Pappis [17]. They also introduced a new class of similarity measures derived from the work of Bandler and Kohout on fuzzy power sets [61], parametrized by a fuzzy implication operator $I$. Wang [21] proposed two new similarity measures of fuzzy sets, (2.10) and (2.11), showed that both satisfy Definition 2.9, and compared them with the similarity measures of [17] and [18]. Rezaei et al. [22] developed a new similarity measure of fuzzy sets based on their relative sigma count, defined where $A \ne \varphi$ or $B \ne \varphi$, together with the convention $S(\varphi, \varphi) = 1$. They proved that this similarity measure satisfies Definition 2.9 and also satisfies the properties (p1) to (p5).
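As a rough illustration of the three kinds of measures attributed to Pappis and Karacapilidis, the following Python sketch uses forms that are common in the fuzzy-set literature. The exact expressions (2.2) to (2.4) are not restated in the text, so the function names and formulas below are assumptions rather than the authors' definitions; fuzzy sets are represented as dictionaries over a shared finite universe.

```python
from typing import Dict

FuzzySet = Dict[str, float]  # element -> membership grade in [0, 1]


def sim_union_intersection(a: FuzzySet, b: FuzzySet) -> float:
    """Assumed form: sum of minima divided by sum of maxima."""
    denominator = sum(max(a[x], b[x]) for x in a)
    if denominator == 0:  # both sets empty: treated here as fully similar
        return 1.0
    return sum(min(a[x], b[x]) for x in a) / denominator


def sim_max_difference(a: FuzzySet, b: FuzzySet) -> float:
    """Assumed form: one minus the largest pointwise difference."""
    return 1.0 - max(abs(a[x] - b[x]) for x in a)


def sim_difference_sum(a: FuzzySet, b: FuzzySet) -> float:
    """Assumed form: one minus (sum of differences / sum of grades)."""
    denominator = sum(a[x] + b[x] for x in a)
    if denominator == 0:  # both sets empty: treated here as fully similar
        return 1.0
    return 1.0 - sum(abs(a[x] - b[x]) for x in a) / denominator
```

With these forms, a crisp set and its complement have similarity 0 and every set has similarity 1 with itself, which matches axioms (2) and (3) of the axiomatic definition for measures bounded by 1.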

Similarity measure of multiple sets
In this section, we first introduce the axiomatic definition of similarity measure of multiple sets. Let $\xi_{(n,k)}(X)$ be the subset of $MS_{(n,k)}(X)$ consisting of all multiple sets over $X$ whose membership matrices are either $[0]_{n \times k}$ or $[1]_{n \times k}$. In the following, we propose two similarity measures between multiple sets: one is based on the similarity measure of fuzzy sets; the other is based on the similarity measure of fuzzy sets and a fuzzy aggregation operator.

For the first measure, let $S$ be any similarity measure of fuzzy sets satisfying Definition 2.9. For multiple sets $A$ and $B$ in $MS_{(n,k)}(X)$, denote by $S(A, B)$ the resulting similarity between $A$ and $B$. Its properties can be proved easily using the properties of the fuzzy similarity measure and the definition of similarity measure of multiple sets.

For the second measure, based on the similarity measure of fuzzy sets and a fuzzy aggregation operator, let $S$ be any similarity measure of fuzzy sets satisfying Definition 2.9 and $H$ be any fuzzy aggregation operator [62]. For multiple sets $A$ and $B$ in $MS_{(n,k)}(X)$, denote by $S_H(A, B)$ the similarity obtained by combining, through $H$, the values of $S$ on the fuzzy sets $A_i^j$ and $B_i^j$.

Theorem 3.6. $S_H(A, B)$ is a similarity measure between the multiple sets $A$ and $B$ in $X$.
Proof. Axioms (1) and (2) follow directly from axioms (1) and (2) of Definition 2.9, respectively.

Axiom (3): For any $A, B \in MS_{(n,k)}(X)$, the required bound holds for every $i = 1, 2, \ldots, n$; combining equations (3.5) and (3.6) then gives axiom (3).

Axiom (4): Suppose $A, B, C \in MS_{(n,k)}(X)$ are such that $A \subseteq B \subseteq C$. Then $A_i^j \subseteq B_i^j \subseteq C_i^j$ for every $j = 1, 2, \ldots, k$ and $i = 1, 2, \ldots, n$. From axiom (4) of Definition 2.9 for the fuzzy similarity measure $S$, we have $S(A_i^j, B_i^j) \ge S(A_i^j, C_i^j)$ for every $j = 1, 2, \ldots, k$ and $i = 1, 2, \ldots, n$. Therefore, since the aggregation operator $H$ is monotonically non-decreasing in each argument, $H(S(A_i^1, B_i^1), \ldots, S(A_i^k, B_i^k)) \ge H(S(A_i^1, C_i^1), \ldots, S(A_i^k, C_i^k))$ for every $i = 1, 2, \ldots, n$, and hence $S_H(A, B) \ge S_H(A, C)$. In a similar way, we can prove that $S_H(B, C) \ge S_H(A, C)$. That is, $S_H(A, B)$ satisfies all the axioms of Definition 3.1. Thus $S_H(A, B)$ is a similarity measure between the multiple sets $A$ and $B$ in $X$.

Further properties of $S_H$ can be proved easily using the properties of the fuzzy similarity measure and the definition of similarity measure of multiple sets.
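The construction behind $S_H$ can be sketched in Python under an explicit assumption, since the displayed formula is not restated in the text: for each feature $i$, the $k$ values $S(A_i^j, B_i^j)$ are aggregated by $H$, and the $n$ aggregated values are then averaged. The names `fuzzy_sim`, `component` and `similarity_h` are hypothetical, and the min/max ratio used as the underlying fuzzy similarity is only one possible choice of $S$.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence

Matrix = List[List[float]]
MultipleSet = Dict[str, Matrix]  # element x -> n x k membership matrix
FuzzySet = Dict[str, float]


def fuzzy_sim(a: FuzzySet, b: FuzzySet) -> float:
    """Illustrative fuzzy similarity: sum of minima over sum of maxima."""
    denominator = sum(max(a[x], b[x]) for x in a)
    return 1.0 if denominator == 0 else sum(min(a[x], b[x]) for x in a) / denominator


def component(ms: MultipleSet, i: int, j: int) -> FuzzySet:
    """Fuzzy set A_i^j with 0-based indices i (feature) and j (copy)."""
    return {x: matrix[i][j] for x, matrix in ms.items()}


def similarity_h(
    a: MultipleSet,
    b: MultipleSet,
    s: Callable[[FuzzySet, FuzzySet], float] = fuzzy_sim,
    h: Callable[[Sequence[float]], float] = min,
) -> float:
    """Assumed form: average over features of H applied to the k values of S."""
    first = next(iter(a.values()))
    n, k = len(first), len(first[0])
    per_feature = []
    for i in range(n):
        values = [s(component(a, i, j), component(b, i, j)) for j in range(k)]
        per_feature.append(h(values))
    return sum(per_feature) / n
```

Passing `h=min`, `h=max` or `h=mean` gives the three aggregation choices used later in the numerical example; any monotone aggregation operator that accepts a sequence of values could be substituted.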

Applications of similarity measures to pattern recognition
The capability of recognizing and classifying patterns is one of the most fundamental characteristics of human intelligence. Pattern recognition may be defined as a process by which we search for structures in data and classify these structures into categories such that the degree of association is high among structures of the same category and low between structures of different categories. There are three fundamental problems in pattern recognition. The first one is sensing problem which is concerned with the representation of input data obtained by measurements on objects that are to be recognized. In general, each object is represented by a vector, known as pattern vector, in which each component represents a particular characteristic of the object. The second problem is feature extraction problem, which concerns the extraction of characteristic features from the input data in terms of which the dimensionality of pattern vectors can be reduced. The features should be characterizing attributes by which the given pattern classes are well discriminated. The third problem is classification of given patterns. This is usually done by defining an appropriate discrimination function for each class, which assigns a real number to each pattern vector. Individual pattern vectors are evaluated by these discrimination functions, and their classification is decided by the resulting values. Each pattern vector is classified to that class whose discrimination function yields the largest value. 
Pattern recognition systems have found vast applications in many areas such as handwritten character and word recognition; automatic screening and classification of X-ray images; electrocardiograms, electroencephalograms, and other medical diagnostic tools; speech recognition and speaker identification; fingerprint recognition; classification of remotely sensed data; analysis and classification of chromosomes; image understanding; classification of seismic waves; target identification and human face recognition.
The utility of fuzzy set theory in pattern recognition has long been recognized, and the literature dealing with fuzzy pattern recognition is now quite extensive. In their position paper [63], Mitra et al. outlined the contribution of fuzzy sets to pattern recognition. They mentioned that fuzzy sets can be used at the feature level for representing input data as an array of membership values denoting the degree of possession of certain properties; for representing linguistically phrased input features for their processing; and for weakening the strong commitments in extracting ill-defined image regions, properties, primitives, and relations among them. Fuzzy sets can also be used at the classification level, for representing class membership of objects and for providing an estimate (or representation) of missing information in terms of membership values. As mentioned above, fuzzy sets are very effective in representing different patterns in pattern recognition. Since the multiple set is a generalization of the fuzzy set and can represent numerous features simultaneously, multiple sets are well suited to model patterns. In this section, we establish a new procedure for pattern recognition with the aid of similarity measures on multiple sets. Assume that there exist $m$ patterns, represented by multiple sets $A_r$ for $r = 1, 2, \ldots, m$, and that a sample to be recognized is represented by a multiple set $B$. According to the principle of the maximum degree of similarity between multiple sets, the sample is assigned to the pattern $A_r$ for which $S(A_r, B)$ is maximum. In the following, a fictitious numerical example is given to show the application of the similarity measures to pattern recognition problems.
Let three patterns be represented by multiple sets $A_1$, $A_2$ and $A_3$ on $X = \{x_1, x_2, x_3\}$, given by their membership matrices, and let the sample to be recognized be represented by a multiple set $B$; the corresponding similarity values are given in Table 2. The similarity measures $S(A_r, B)$ for $r = 1, 2, 3$ given by Definition 3.4, based on the similarity measures $S_1$, $S_2$ and $S_3$ of fuzzy sets and the fuzzy aggregation operators $H = \min$, $\max$ or avg, are given in Tables 3, 4 and 5.
From Tables 2, 3, 4 and 5, we can see that $S(A_1, B)$ has the maximum value. The important point to note here is that all the formulae for the similarity measure of multiple sets considered here lead to the same conclusion: the sample $B$ belongs to the pattern represented by the multiple set $A_1$.
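The maximum-similarity principle used in this example can be sketched as follows. Everything concrete in the sketch is an assumption made for illustration: the membership values are invented and are not those of the paper's tables, the underlying fuzzy similarity is a min/max ratio, and the multiple-set similarity averages the aggregated values over the features. The names `fuzzy_sim`, `similarity` and `recognize` are hypothetical.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence

Matrix = List[List[float]]
MultipleSet = Dict[str, Matrix]  # element x -> n x k membership matrix


def fuzzy_sim(a: List[float], b: List[float]) -> float:
    """Illustrative fuzzy similarity: sum of minima over sum of maxima."""
    denominator = sum(max(p, q) for p, q in zip(a, b))
    return 1.0 if denominator == 0 else sum(min(p, q) for p, q in zip(a, b)) / denominator


def similarity(a: MultipleSet, b: MultipleSet,
               h: Callable[[Sequence[float]], float] = min) -> float:
    """Assumed form: average over features of H applied to the k values of S."""
    xs = sorted(a)
    n, k = len(a[xs[0]]), len(a[xs[0]][0])
    total = 0.0
    for i in range(n):
        total += h([fuzzy_sim([a[x][i][j] for x in xs], [b[x][i][j] for x in xs])
                    for j in range(k)])
    return total / n


def recognize(patterns: Dict[str, MultipleSet], sample: MultipleSet,
              h: Callable[[Sequence[float]], float] = min) -> str:
    """Return the name of the pattern with the largest similarity to the sample."""
    return max(patterns, key=lambda name: similarity(patterns[name], sample, h))


# Hypothetical patterns and sample of order (2, 2) on X = {x1, x2}.
PATTERNS = {
    "A1": {"x1": [[0.8, 0.6], [0.4, 0.2]], "x2": [[0.9, 0.7], [0.5, 0.3]]},
    "A2": {"x1": [[0.2, 0.1], [0.9, 0.8]], "x2": [[0.3, 0.1], [1.0, 0.6]]},
}
SAMPLE = {"x1": [[0.8, 0.5], [0.4, 0.3]], "x2": [[0.9, 0.6], [0.5, 0.2]]}

if __name__ == "__main__":
    for aggregate in (min, max, mean):
        print(aggregate.__name__, recognize(PATTERNS, SAMPLE, aggregate))
```

With this invented data the sample is assigned to the same pattern for all three aggregation operators, which mirrors the kind of agreement reported in the numerical example, although the numbers themselves are unrelated to the paper's tables.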

Conclusion
Similarity measure of fuzzy sets is a mature research field and has found applications in diverse areas such as pattern recognition, image processing, decision making, etc. Comparatively, similarity measure of multiple sets is a new topic. This paper deals with the similarity measure of multiple sets. Two formulas for similarity measure of multiple sets are proposed and their properties are investigated. This new concept is applied to pattern recognition problem and the suitability of proposed method is demonstrated using a numerical example. We believe that the concept can be extended to other applications such as image processing, decision making, etc. Investigation along these lines will be considered as a part of future work.