Soft Computing (2018) 22:4971–4987 https://doi.org/10.1007/s00500-018-3191-0
FOCUS
An interval type 2 hesitant fuzzy MCDM approach and a fuzzy c means clustering for retailer clustering
Sultan Ceren Oner1 · Başar Oztaysi1
Published online: 17 April 2018 © Springer-Verlag GmbH Germany, part of Springer Nature 2018
Abstract Owing to advancements in information and telecommunication technologies, mobile location-based services are able to use previously collected mobile check-in data. These multi-dimensional data open new opportunities for research problems such as establishing new platforms for location-based advertising and location-based personalized recommendations. In other words, the location data of potential customers indicate personal interests and visiting preferences. In some cases, however, customers' preferences cannot be easily determined or predicted from the visiting patterns of mobile users. In this respect, this study provides a novel retailer segmentation approach based on multi-criteria decision-making (MCDM) combined with fuzzy data clustering. The proposed model consists of two phases: (1) an interval type 2 hesitant MCDM approach for the determination of location perceived value and (2) retailer (store) clustering via different product sale prices with fuzzy data-based fuzzy c means (FcM) clustering. The proposed approach simplifies the adaptation of FcM clustering to non-symmetric fuzzy data by using a dissimilarity measure. Using this integrated approach, advertisers and recommender system suppliers will be able to manage their product-specific offerings to customers considering retailer segments and shopping mall characteristics. Additionally, the proposed approach constitutes the infrastructure of location-based recommender systems under an imprecise environment.
Keywords Retailer clustering · Location clustering · Fuzzy c means clustering · Fuzzy data clustering · Interval type 2 hesitant fuzzy sets
Communicated by C. Kahraman.
Electronic supplementary material The online version of this article (https://doi.org/10.1007/s00500-018-3191-0) contains supplementary material, which is available to authorized users.
Sultan Ceren Oner · Başar Oztaysi
1 Industrial Engineering Department, Faculty of Management, Istanbul Technical University, 34367 Macka, Istanbul, Turkey

1 Introduction Segmentation is one of the essential tools for targeted marketing and customer-centric decision making. As a new research paradigm, location-based services rely on determining the geographical position of mobile users via technologies such as the Global Positioning System (GPS) and Bluetooth (via beacons). These services pinpoint users' real-time locations and provide valuable insights into customer visiting preferences. Additionally, location-based services offer an alternative gateway by providing the necessary infrastructure for sending special offers based on consumers' preferences or previous visits (Lee et al. 2015). For personalized marketing, effective segmentation contributes to competitive advantage by revealing new marketing directions under limited advertising budgets. Thus, location segmentation, which derives personalized properties from previously visited locations, provides valuable insights into customers' tendency toward potential future visits (D'Urso et al. 2015). As a result of the increasing penetration rate of telecommunication strategies, mobile location-based services can use mobile check-in data, geospatial data and customer comments for a specific place. These multi-dimensional data provide new opportunities for research problems such as establishing new platforms based on the prediction of customers' future visiting preferences. These systems rely on time and preference-based services including monitoring
the current position and determining the potential visiting locations of mobile users in order to find appropriate suggestions or promotions. To build location-based systems, Junglas and Watson (2008) mentioned that two major steps are essential: location detection models that investigate "potential" future locations from geospatial data processing, and prediction algorithms that consider previous search patterns, previous location visits, online profiles and online comments. For targeted marketing management activities, location detection and the estimation of future visits are fundamental processes to capture customers' needs and dynamic changes in their interests. Because location visiting preferences can be determined from mobile devices, the context of the promotion or message can be defined consistently in order to keep communication channels alive. Therefore, companies tend to send instant messages to their customers according to their recent locations. However, customers often do not want to follow these instant offers because of previously pushed irrelevant offers and promotion contexts (Shin and Lin 2016). Other reasons for the failure of push messages are customers' location privacy concerns, such as the disruption caused by simultaneous location and time detection. These reasons, along with accidental keystrokes, also pose a significant barrier to the penetration of location-based services (Pingley et al. 2012). Thus, companies seek to adopt more notable and flexible services that satisfy personalized expectations without disruption, reaching a broad spectrum of customers via personal and commonly held recommendations. However, the review of the literature indicates that users' interests and expectations change continuously, and the differentiation of these needs and interests makes it difficult to build capable recommender systems.
Thus, location prediction and location segmentation are among the most crucial topics in location-based systems, since user movements reflect customer preferences or needs (Fan et al. 2015). Additionally, in some cases, customers' needs and visiting tendencies cannot be directly or precisely detected due to imprecise location information or lack of data. From this point of view, the leading research direction of this study is the appropriate modeling of retailer segmentation. The study combines two different approaches: (1) an interval type 2 hesitant MCDM approach for the determination of location perceived value, in which data from Foursquare, Google and Facebook are integrated with real estate and sector-specific statistics; and (2) retailer (store) clustering via different product group sale prices using FcM clustering of fuzzy data. In the first phase, a hesitancy-based MCDM procedure is applied for the determination of location perceived value. Existing methods based on hesitant fuzzy sets (HFSs) assume that the membership is a set of possible crisp values or exact type 1
fuzzy numbers (T1FNs). However, these models cannot reflect the complexity of the socioeconomic environment when a set of possible crisp and exact values is inefficient or inadequate, or when there is uncertainty about the membership function, as indicated by Hu et al. (2015). Additionally, type 2 hesitant fuzzy sets (T2HFSs) can capture uncertainty and fuzziness by considering primary and secondary membership, which enriches the theory of hesitant fuzzy sets. Moreover, T2HFSs provide an indirect way to deal with hesitant fuzzy linguistic term sets (HFLTSs), because HFLTSs cannot be handled directly when vague information is presented in the hesitant fuzzy environment. Finally, the use of T2HFSs overcomes the limitations of the computation process of HFLTSs, which requires a considerable transformation phase for linguistic evaluations. Therefore, T2HFSs are preferred over HFSs when modeling vagueness in the decision-making process for shopping mall perceived value. In the second phase, sales price data are also represented by fuzzy numbers to capture the uncertainty of variations in sales prices. This procedure supports the determination of alternative retailers. Clustering is selected to divide heterogeneous data into homogeneous subgroups, based on experts' opinions and sales prices, according to similarities in common characteristics. For this purpose, retailer clustering can be applied by grouping specific locations using sales prices, and alternative retailer detection can be conducted using the similarities of shopping mall perceived values. In addition, retailer sales price data are uncertain because of seasonal prices and special day discounts.
Since the problem involves imprecise data and conflicting opinions from diverse shopping mall valuation experts, the clustering problem becomes a fuzzy clustering problem with fuzzy data, in particular, with a fuzzy partition of clusters (Aliahmadipour et al. 2017). The remainder of this paper is structured as follows: Sect. 2 briefly explains the concepts of location-based services. The third part presents basic concepts of fuzzy clustering and location-based clustering. The proposed methodology is then presented with a numerical application, and a comparative analysis is given for both the interval type 2 hesitant MCDM approach and FcM clustering with fuzzy data. Section 6 includes the conclusions and future directions.
2 Location-based services Location-based systems are described as services or applications that use the geographical location of the consumer in order to provide a service or a marketing message (Mobile Marketing Association 2011). In other words, location-based systems provide real-time
location data of consumers. For instance, location-based systems capture location data, ensure mobile connection to other mobile devices and send related content, including promotions or attractive messages, to mobile customers when they appear in certain areas, such as shopping malls (Anagnostopoulos et al. 2015). Another aspect of location-based services (LBSs) is that they can be regarded as a subset of web services that provide location-aware functions. The utilization of such services focuses on extracting knowledge from the places where the services are deployed. Until now, LBSs have been associated with a distributed mobile computing infrastructure in which the geographical locations of users are specifically used for application-related optimization. From a different viewpoint, sensor data contain previous interactions between users and locations, which can be accessed using an Internet connection or Bluetooth technology. For this reason, large-scale retail companies, such as Wal-Mart, have commonly transferred their retail activities to LBSs to raise brand awareness (Zou and Huang 2015). Other examples of LBSs include location-based advertising, coordination of traffic flow, natural disaster search and rescue, tourist route recommender systems and nearest available park-and-ride applications (Cheverst et al. 2000). The literature review indicates that LBSs are an emerging research topic in indoor and outdoor navigation, including location-based advertising and location-based mobile advertising (Li and Du 2012), mobile shopping (Yang et al. 2008), travel recommendation systems (Sun et al. 2013; Versichele et al. 2014), recognition of places for future predictions (Vu et al. 2009), group buying (Li et al. 2014) and social media-based recommender systems (Li and Li 2014). In addition to these studies, user satisfaction with LBS systems was investigated by Kuo et al.
(2009), and a novel concept of a mobile ad hoc network was proposed by Ramya and Prasad Babu (2014) using a circular data aggregation technique. These studies show that automatic location-based applications have become widespread, which eases the traceability of individuals' physical movement in different indoor shopping areas. For this reason, personal positioning and segmentation algorithms play critical roles in detecting individuals' positions and locations, enabling a practical analysis of customers' group behavior and their shopping and visiting tendencies. In other words, location-aware systems make it possible to follow up customers' shopping needs with location-dependent offers and promotions in order to cope with competition. Although remarkable advantages have been realized, location-based systems have some drawbacks that emerge in practice. Two major problems are privacy issues and the disruption caused by messages: customers do not prefer to send their location data to service providers and, generally, are not disposed to receive instant messages
especially when they are not available at certain times. Additionally, the irrelevance of the message is another problem that is related to incorrect predictions about customers. In our case, retailer clustering requires collecting data from various sources to describe the relationships among customers, locations (shopping malls) and retailers. In other words, the clustering procedure provides the desired services to satisfy customer needs considering customer tendencies (Gavalas et al. 2014). In this respect, the main research question of this study is how to state personalized sales suggestions with respect to the value of previously visited locations, and how to determine alternative retailers according to similarities in common location characteristics. For this purpose, location clustering can be applied for grouping retailers, and alternative retailer suggestions can be made using the similarities that emerge from the location clusters and product segments. Thus, a literature review is given in Sect. 3 to describe previous studies on location clustering.
3 Literature review Advancements in mobile technologies and the wide application of location-based technologies have triggered the utilization of LBSs. Data gathered from LBSs include precise or imprecise visiting similarities that can be useful for sending proper services to customers when they visit a specific location. These applications aim to reflect the purchasing decision considering location properties by applying clustering methods directly or indirectly (Schilke et al. 2004). Therefore, location clustering under an imprecise environment is essential for understanding the visiting behavior of customers. Accordingly, this section reviews recent studies on location clustering and fuzzy clustering to provide basic background for readers.
3.1 Recent studies on location clustering The technological improvements in positioning technologies triggered the penetration of location-based services concerning user similarities. Today, LBS providers tend to direct their messages by taking account of location information that can be gathered from mobile device signals (Lin et al. 2016). The consolidation of location-based services with clustering techniques emerges from the need of understanding the customer purchasing decision-making process with respect to location, time, personal interests and current needs (Schilke et al. 2004). The current demand can be extracted from demographical information, consumer lifestyle and consumption habits and also from the former purchasing decisions (Shin and Lin 2016). Nevertheless, these factors do not directly influence the conclusive purchasing decisions or reactions to
marketing messages. For this reason, researchers and practitioners investigate other factors that can influence customer tendency, such as geospatial data and the search history of special products, to cope with variations in customers' final shopping decisions (Gavalas et al. 2014). Location clustering is essential for the successful adaptation of location-based recommender systems. In this respect, user preference similarities can be captured from previously visited location types, geospatial data and online ratings. To calculate the degree of similarity between locations, user preferences should be collected in advance, directly (with ratings) or indirectly (Bluetooth data or GPS signals), and analyzed using clustering and classification approaches. Locations can be differentiated through categorization, in other words segmentation, using supervised and unsupervised learning techniques such as cluster analysis, classification techniques, heuristic methods, regression, neural networks, k nearest neighbors, decision trees and association rule mining (Park et al. 2012). These techniques reveal patterns of user diversification in order to analyze variations in user habits and make realistic suggestions in acceptable time. Besides that, the computation time of the methodology is a substantive parameter for the adaptation of location clustering (Oztaysi et al. 2016). For better LBS performance, or better analysis of the relationship between human mobility patterns and recommendation systems, measures that capture frequently visited locations and the characteristics of sub-locations within a specific location are required. The literature review shows that previous research on this topic is limited. Human mobility tendency is generally used in travel recommender systems (Sun et al. 2013), location recommendation systems (Versichele et al. 2014) and location prediction (Vu et al. 2009).
However, no accepted measures for location clustering can be identified from these studies. Some of the studies mainly focused on users' movements for the prediction of future locations without giving alternative locations (Vu et al. 2009). Other relevant work on location clustering is sequential pattern mining, in which temporally ordered itemsets are determined from frequent itemsets via support values from large user transaction information (Hipp et al. 2000). In addition, models based on mobile users' velocity and direction of movement have utilized probability distributions and Markov chains. However, they are sensitive to minor changes in user visiting tendency, so prediction accuracy can be influenced substantially. Another drawback of these studies is the multi-scale issue: location visiting timestamps are ignored when customer entry information is missing. In this regard, one interesting result can be found in the research titled "Defining Measures for Location Visiting Preference" by Song and Choi (2015). According to this research, position frequency (PF) indicates the visiting frequency of a certain location and is similar to
term frequency (TF). In addition, the term frequency × inverse document frequency (TF × IDF) structure is chosen as a powerful indicator of the perceived value of a visited location. Therefore, in our study, voters' visiting frequency and visiting timestamp frequency from Foursquare are considered to determine the perceived value of the locations. Additionally, a pivot table lists whether each retailer has opened a shop in a given shopping mall. In this way, the shopping mall perceived value calculated from the MCDM process is linked with retailers' sales prices, which can be regarded as an economic indicator of location value. Social reflections are gathered from Foursquare, Google and Facebook ratings, and the real estate index of the relevant shopping mall is also considered in the MCDM process to represent the perceived value of the shopping mall and obtain more reliable clustering results.
3.2 Fuzzy clustering and fuzzy c means clustering Clustering can be explained as the method of splitting data into subgroups, which are called "clusters." A considerable number of techniques are available for clustering; their purpose is to group input data so that similar objects fall in the same cluster and dissimilar objects are distributed to different clusters (Han and Kamber 2001). Crisp clustering algorithms assign each input object to one specific cluster. In contrast, fuzzy clustering algorithms assign an object to several clusters simultaneously, each with a membership degree (Oztaysi and Isik 2014). One of the most widely applied fuzzy clustering algorithms is FcM clustering, in which the number of clusters should be determined in advance (Chen et al. 2014). Note that the input is a set of data or objects, each of which consists of different attributes or features. Cluster analysis can then be performed using similarity or dissimilarity measures derived from a distance measure, such as the Euclidean distance, to define the similarity of given observations. A fuzzy partition matrix for extracting clusters was defined by Ruspini (1970) with the following conditions:

$$\mu_{ik} \in [0, 1], \quad 1 \le i \le c, \; 1 \le k \le N, \tag{1a}$$

$$\sum_{i=1}^{c} \mu_{ik} = 1, \quad 1 \le k \le N, \tag{1b}$$

$$0 < \sum_{k=1}^{N} \mu_{ik} < N, \quad 1 \le i \le c \tag{1c}$$
Equation (1b) states that the membership degrees of each observation over all clusters should sum to 1, and Eq. (1a) states that each membership degree lies in the interval [0, 1]. The main goal of FcM clustering is the minimization of the following objective function, which constitutes a nonlinear optimization problem:

$$J(Z, U, V) = \sum_{i=1}^{c} \sum_{j=1}^{N} (\mu_{ij})^{m} \lVert z_j - v_i \rVert^{2} \tag{2}$$

where $Z$ is the data set to be partitioned, $U$ is the fuzzy partition matrix, and $V$ is the vector of cluster centers. Here $N$ is the number of observations, $\mu_{ij}$ is the membership value, $c$ is the number of clusters, and $m$ is the fuzzifier parameter, which determines the degree of fuzziness of the final clusters and takes values greater than 1. Besides that, $\lVert z_j - v_i \rVert$ denotes the distance from observation $j$ to the center of cluster $i$. The first step of the FcM algorithm is to obtain the fuzzy partition matrix $U = [\mu_{ij}]$, where $U^{(0)}$ denotes the fuzzy partition matrix in the first phase. After that, the center vectors $V^{(k)} = [v_i]$ are calculated with $U^{(k)}$ using the center vector formula

$$v_i = \frac{\sum_{j=1}^{N} \mu_{ij}^{m} \, z_j}{\sum_{j=1}^{N} \mu_{ij}^{m}}.$$

Then, the fuzzy partition matrix in the kth step ($U^{(k)}$) is updated for the next step as

$$\mu_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \dfrac{\lVert z_j - v_i \rVert}{\lVert z_j - v_k \rVert} \right)^{\frac{2}{m-1}}}$$

with an acceptable computational error $\delta$.

Fig. 1 Steps of the proposed methodology

Fuzzy clustering has been widely applied in the literature. Devi and Gandhi (2015) applied it to sentence similarity detection, combining the "Page Rank" algorithm with an expectation-maximization (EM) framework. Textual document archive clustering by Torra et al. (2005) extends fuzzy clustering techniques using Gambal system-based visualization of documents. The real-time flood forecasting study by Ren et al. (2010) utilized a fuzzy clustering model with a back-propagation (BP) neural network training model. Sowmya and Rani (2011) evaluated image segmentation using the FcM algorithm combined with the possibilistic fuzzy c means (PFcM) algorithm and a competitive neural network (CNN). Besides these studies, precision agriculture (Fu et al. 2010) used a fuzzy clustering algorithm optimized by particle swarm optimization (PSO). In other domains, user profiling (Han and Chen 2009) and logistics enterprise evaluation (Fu and Yin 2012) have also been investigated with FcM clustering. Additionally, some studies focused on comparing the performance of different forms of fuzzy c means clustering (Gosain and Dahiya 2016) and on determining the most appropriate number of clusters for fuzzy clustering (Erilli et al. 2011). Most of these studies used crisp data to convert input data into mutually exclusive subsets. In our case, however, the problem involves imprecise data and conflicting prices from diverse retailers. From this point of view, the clustering problem becomes a fuzzy clustering problem with fuzzy data. Additionally, MCDM-based clustering can be beneficial for retailer clustering by assigning a location value to each retailer. Thus, the proposed methodology includes an interval type 2 hesitant MCDM phase for location value detection and an FcM clustering phase based on fuzzy product sales prices for retailer clustering. The proposed methodology is given in the following section.
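As an illustration of the FcM iteration described in Sect. 3.2 (center update followed by membership update until the partition matrix stabilizes), a minimal Python sketch on crisp data is given below. The function name `fcm`, the random initialization, the seed, the tolerance and the handling of zero distances are assumptions made for illustration; they are not taken from the study.

```python
import random


def fcm(data, c, m=2.0, tol=1e-6, max_iter=200, seed=0):
    """Fuzzy c-means sketch on a list of equal-length numeric tuples.

    Returns (u, centres): u[i][j] is the membership of observation j in
    cluster i, centres[i] is the centre of cluster i.
    """
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])

    # Assumed initialisation: random memberships, normalised so that each
    # observation's memberships sum to 1 (condition 1b).
    u = [[rng.random() + 1e-9 for _ in range(n)] for _ in range(c)]
    for j in range(n):
        s = sum(u[i][j] for i in range(c))
        for i in range(c):
            u[i][j] /= s

    centres = []
    for _ in range(max_iter):
        # Centre update: weighted mean with weights mu_ij ** m.
        centres = []
        for i in range(c):
            w = [u[i][j] ** m for j in range(n)]
            total = sum(w)
            centres.append(
                [sum(w[j] * data[j][d] for j in range(n)) / total
                 for d in range(dim)]
            )

        # Membership update from the distance ratios.
        new_u = [[0.0] * n for _ in range(c)]
        for j in range(n):
            dist = [
                sum((data[j][d] - centres[i][d]) ** 2 for d in range(dim)) ** 0.5
                for i in range(c)
            ]
            zero = [i for i in range(c) if dist[i] == 0.0]
            for i in range(c):
                if zero:
                    # Observation coincides with a centre: share membership
                    # among the coinciding centres only.
                    new_u[i][j] = 1.0 / len(zero) if i in zero else 0.0
                else:
                    new_u[i][j] = 1.0 / sum(
                        (dist[i] / dist[k]) ** (2.0 / (m - 1.0))
                        for k in range(c)
                    )

        # Stop when the largest membership change is below the tolerance.
        delta = max(abs(new_u[i][j] - u[i][j])
                    for i in range(c) for j in range(n))
        u = new_u
        if delta < tol:
            break
    return u, centres
```

Calling `fcm(points, c=2)` on two well-separated groups of points should return memberships close to 0 or 1 for each group, while points lying between groups receive intermediate memberships.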
4 Proposed methodology The proposed methodology consists of two phases. The first phase applies the interval type 2 hesitant MCDM model of Hu et al. (2015) to define retailers' store prestige in a specific shopping mall, and the second phase uses the shopping mall prestige scores to reallocate retailer sales prices. In the second phase, retailer sales prices are grouped according to a dissimilarity measure based on the methodology given in D'Urso and Giordani (2006) and Coppi et al. (2012). Finally, clusters are extracted using triangular fuzzy numbers. The schematic diagram of the proposed methodology for retailer clustering is given in Fig. 1.

Definition 1 (Hu et al. 2015) An interval type-2 hesitant fuzzy set (IT2HFS) on a fixed set $X$ is a function that maps each $x$ in $X$ to a subset of interval type-2 fuzzy numbers (IT2FNs). An IT2HFS is denoted by the symbol $G$ and given as

$$G = \{ \langle x, \tilde{h}_G(x) \rangle \mid x \in X \},$$

where $\tilde{h}_G(x)$ is a set of IT2FNs representing the membership degree of the element $x \in X$ to the set $G$. The set

$$\tilde{h}_G(x) = \tilde{h} = \left\{ \tilde{A}_i \;\middle|\; \tilde{A}_i \in \tilde{h}, \; \tilde{A}_i = \left( \left( a^U_{i1}, a^U_{i2}, a^U_{i3}, a^U_{i4}; H_1(\tilde{A}^U_i), H_2(\tilde{A}^U_i) \right), \left( a^L_{i1}, a^L_{i2}, a^L_{i3}, a^L_{i4}; H_1(\tilde{A}^L_i), H_2(\tilde{A}^L_i) \right) \right) \right\}$$

presents the interval type-2 hesitant fuzzy number (IT2HFN).

Step 1 Consider a multi-criteria decision-making problem with $n$ relevant criteria $C = \{c_1, c_2, \ldots, c_n\}$ and a set of alternatives $A = \{a_1, a_2, \ldots, a_m\}$, using the criterion weight vector $W = \{w_1, w_2, \ldots, w_n\}$ subject to the constraint $\sum_{j=1}^{n} w_j = 1$. The linguistic term set and linguistic expressions should be defined in advance. Additionally, the importance levels of the criteria are defined using the interval type 2 hesitant fuzzy linguistic term set (IT2HFLTS) and the corresponding values given by Hu et al. (2015).

Step 2 Gather the individual preference relations ($B^l$) of $k$ decision makers for criteria, sub-criteria and alternatives, where $l \in \{1, 2, \ldots, k\}$, and express each IT2HFN by its lower and upper bounds as $[b_{ij}^{l-}, b_{ij}^{l+}]$. The pairwise comparison matrix is given in the application phase. Then, transform the hesitant fuzzy linguistic evaluations into interval type 2 hesitant fuzzy terms (IT2HFTs) to determine the relevant ratings $\tilde{h}_{ij}$, where $i$ denotes the alternative ($A_i$) and $j$ denotes the criterion ($c_j$). The IT2HFT-based $H$ matrix is obtained as follows for $\forall j$:

$$\tilde{h} = \left\{ \tilde{A}_i \mid \tilde{A}_i \in \tilde{h} \right\} = \left\{ \left( \left( a^U_{i1}, a^U_{i2}, a^U_{i3}, a^U_{i4}; H_1(\tilde{A}^U_i), H_2(\tilde{A}^U_i) \right), \left( a^L_{i1}, a^L_{i2}, a^L_{i3}, a^L_{i4}; H_1(\tilde{A}^L_i), H_2(\tilde{A}^L_i) \right) \right) \right\}$$

Step 3 Calculate pessimistic and optimistic preferences for each alternative and criterion. Define linguistic interval utilities $b_i^{g} = [b_i^{g-}, b_i^{g+}]$ according to the aggregated preferences $LP^{g} = \{b_1^{g}, b_2^{g}, \ldots, b_n^{g}\}$ for the alternatives and main criteria. The decision makers' individual preferences are aggregated using the interval type 2 hesitant fuzzy weighted average (IT2HFWA) operator to calculate the optimistic and pessimistic preferences as follows:
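$$\tilde{h}_i = \mathrm{IT2HFWA}\left( \tilde{h}_{i1}, \tilde{h}_{i2}, \ldots, \tilde{h}_{in} \right) = \bigoplus_{j=1}^{n} w_j \tilde{h}_{ij} \quad (i = 1, 2, \ldots, m)$$

$$= \bigcup_{\tilde{A}_{i1} \in \tilde{h}_{i1}, \, \tilde{A}_{i2} \in \tilde{h}_{i2}, \ldots, \, \tilde{A}_{in} \in \tilde{h}_{in}} \left\{ \left( \begin{array}{l} \left( l^{-1}\!\left( \sum_{j=1}^{n} w_j \, l(a^U_{ij1}) \right), \; l^{-1}\!\left( \sum_{j=1}^{n} w_j \, l(a^U_{ij2}) \right), \; l^{-1}\!\left( \sum_{j=1}^{n} w_j \, l(a^U_{ij3}) \right), \; l^{-1}\!\left( \sum_{j=1}^{n} w_j \, l(a^U_{ij4}) \right); \; \min_j H_1(\tilde{A}^U_{ij}), \; \min_j H_2(\tilde{A}^U_{ij}) \right), \\[6pt] \left( l^{-1}\!\left( \sum_{j=1}^{n} w_j \, l(a^L_{ij1}) \right), \; l^{-1}\!\left( \sum_{j=1}^{n} w_j \, l(a^L_{ij2}) \right), \; l^{-1}\!\left( \sum_{j=1}^{n} w_j \, l(a^L_{ij3}) \right), \; l^{-1}\!\left( \sum_{j=1}^{n} w_j \, l(a^L_{ij4}) \right); \; \min_j H_1(\tilde{A}^L_{ij}), \; \min_j H_2(\tilde{A}^L_{ij}) \right) \end{array} \right) \right\} \tag{3}$$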
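Step 4 Calculate the score $s(\tilde{h}_i)$ $(i = 1, 2, \ldots, m)$ of each aggregated $\tilde{h}_i$ $(i = 1, 2, \ldots, m)$ using the score function definition given by Hu et al. (2015):

$$\mathrm{score}(\tilde{h}) = \frac{1}{\# \tilde{h}} \sum_{\tilde{A} \in \tilde{h}} \mathrm{score}(\tilde{A}) = \frac{1}{\# \tilde{h}} \sum_{\tilde{A} \in \tilde{h}} \left( \frac{a^U_1 + a^U_4}{2} + \frac{H_1(\tilde{A}^U) + H_2(\tilde{A}^U) + H_1(\tilde{A}^L) + H_2(\tilde{A}^L)}{4} \right) \times \frac{a^U_1 + a^U_2 + a^U_3 + a^U_4 + a^L_1 + a^L_2 + a^L_3 + a^L_4}{8} \tag{4}$$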
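To make the score function in Eq. (4) concrete, the following Python sketch evaluates it for a small IT2HFN. It assumes that the endpoint term and the height term are summed before being multiplied by the mean of the eight reference points, which is one reading of the printed formula. The function names and the tuple layout `(a1, a2, a3, a4, H1, H2)` are hypothetical choices for illustration.

```python
def score_it2fn(upper, lower):
    """Score of one interval type-2 fuzzy number.

    upper and lower are tuples (a1, a2, a3, a4, H1, H2) for the upper and
    lower trapezoidal membership functions (assumed layout).
    """
    a_u, h_u = upper[:4], upper[4:]
    a_l, h_l = lower[:4], lower[4:]
    endpoint_term = (a_u[0] + a_u[3]) / 2.0
    height_term = (h_u[0] + h_u[1] + h_l[0] + h_l[1]) / 4.0
    mean_point = (sum(a_u) + sum(a_l)) / 8.0
    # Assumed grouping: (endpoint term + height term) * mean reference point.
    return (endpoint_term + height_term) * mean_point


def score_it2hfn(h):
    """Average score over all IT2FNs in the hesitant set h.

    h is a list of (upper, lower) pairs; len(h) plays the role of #h.
    """
    return sum(score_it2fn(upper, lower) for upper, lower in h) / len(h)


example = [
    ((0.0, 0.2, 0.4, 0.6, 1.0, 1.0), (0.1, 0.25, 0.35, 0.5, 0.9, 0.9)),
]
example_score = score_it2hfn(example)
```

For the single illustrative IT2FN above, the endpoint term is 0.3, the height term is 0.95 and the mean reference point is 0.3, so the score is approximately 0.375. The numbers are made up for the example and do not come from the application in the paper.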
where $\# \tilde{h}$ denotes the number of IT2FNs $\tilde{A} \in \tilde{h}$, and $\mathrm{score}(\tilde{h})$ is a crisp value. Note that if $\tilde{h}_1$ and $\tilde{h}_2$ are two IT2HFSs, then $\mathrm{score}(\tilde{h}_1) \ge \mathrm{score}(\tilde{h}_2)$ implies $\tilde{h}_1 \ge \tilde{h}_2$. According to this definition, scores should be calculated for both pessimistic and optimistic values. Finally, the average of the pessimistic and optimistic values gives the "final score" of the alternatives for each criterion and sub-criterion.

Step 5 Construct the dominance matrix using the difference between each pair of preference relations to determine the dominance degree ($DD_{ij}$) of the alternatives. After that, apply the non-dominance rule of Rodriguez et al. (2012) to the ith criterion and alternative using the non-dominance degree ($NDD_i$) expression. The alternatives' scores ($NNDD_i$) are determined after normalization, as seen in Eq. (7).

$$DD_{ij} = \max\left\{ 0, \; B_{I_i > I_j} - B_{I_j > I_i} \right\} \tag{5}$$

$$NDD_i = \left| \min\left( (1 - DD_1), (1 - DD_2), \ldots, (1 - DD_n) \right) \right|, \quad \text{where } n \ne i \tag{6}$$

$$NNDD_i = \frac{NDD_i}{\sum_{i=1}^{n} NDD_i} \tag{7}$$
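Equations (5) to (7) can be sketched in a few lines of Python. The sketch assumes that the preference relation is available as a square matrix `b` with entries in [0, 1], where `b[i][j]` is the degree to which alternative i is preferred to alternative j, and that the minimum in Eq. (6) runs over the degrees to which every other alternative dominates alternative i. Both the matrix layout and the function name are assumptions for illustration.

```python
def non_dominance_scores(b):
    """Return (dd, ndd, nndd) for a square preference matrix b.

    dd[i][j]  : dominance degree of alternative i over j (Eq. 5)
    ndd[i]    : non-dominance degree of alternative i (Eq. 6)
    nndd[i]   : normalised non-dominance degree (Eq. 7)
    """
    n = len(b)
    dd = [[max(0.0, b[i][j] - b[j][i]) for j in range(n)] for i in range(n)]
    # Assumed indexing: alternative i is compared against every j != i that
    # may dominate it, i.e. the terms are 1 - dd[j][i].
    ndd = [abs(min(1.0 - dd[j][i] for j in range(n) if j != i))
           for i in range(n)]
    total = sum(ndd)
    nndd = [value / total for value in ndd]
    return dd, ndd, nndd


# Illustrative preference matrix for three alternatives (made-up values).
b_example = [
    [0.5, 0.7, 0.6],
    [0.3, 0.5, 0.8],
    [0.4, 0.2, 0.5],
]
dd_example, ndd_example, nndd_example = non_dominance_scores(b_example)
```

With these made-up values, the first alternative is not dominated by any other, so it has the largest non-dominance degree and the largest normalized score; the normalized scores sum to 1.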
In some real-life cases, data may contain hesitancy, or crisp data clustering must be performed with missing data. Such imprecise information can be treated as uncertain data (fuzzy data) or as a crisp data set with uncertain clusters (Aliahmadipour et al. 2017). Thus, the FcM clustering algorithm should be properly adapted to fuzzy data. Before clustering, the location perceived value of the shopping malls is calculated using the total $NNDD_i$ scores. The procedure is described in the following.

Step 6 (Shopping mall perceived value adaptation) Let $SV_{ni}$ be the shopping mall perceived value matrix, consisting of the total $NNDD_i$ scores for $N$ shopping malls and $I$ retailers. If a store appears in the relevant shopping mall, it gets the corresponding shopping mall perceived value; if the store does not appear in the relevant shopping mall, the value is 0. Additionally, let $Y \equiv \{ \tilde{y}_{ij} = (l_{ij}, m_{ij}, k_{ij}) \}$ be the triangular fuzzy data for the ith retailer with the jth product sales price. To reflect the shopping mall perceived value on the fuzzy data, the following operation is applied:

$$\tilde{Y} = \begin{pmatrix} sv_{11} & \cdots & sv_{n1} \\ \vdots & \ddots & \vdots \\ 0 & \cdots & sv_{ni} \end{pmatrix}_{n \times i} \begin{pmatrix} y_{11} & \cdots & y_{1j} \\ \vdots & \ddots & \vdots \\ y_{i1} & \cdots & y_{ij} \end{pmatrix}_{i \times j} \tag{8}$$

where $\tilde{Y} \equiv \{ \tilde{y}_{nj} = (l_{nj}, m_{nj}, k_{nj}) \}$ is the modified fuzzy sales price data of the retailer in a specific shopping mall. Note that $\tilde{Y}$ is normalized for the diversified sales prices of the product as seen below.

$$\tilde{Y} = \left\{ \frac{l_{nj}}{\max k_{nj}}, \; \frac{m_{nj}}{\max m_{nj}}, \; \frac{k_{nj}}{\max l_{nj}} \right\} = \left( l_{nj}, m_{nj}, k_{nj} \right) \tag{9}$$
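The operations in Eqs. (8) and (9) can be illustrated with a short Python sketch. It assumes that the product in Eq. (8) is applied separately to the lower, middle and upper values of each triangular fuzzy price, and that the maxima in Eq. (9) are taken over shopping malls for each product; neither detail is stated explicitly in the text. The normalization follows Eq. (9) as printed, and all names and numbers are hypothetical.

```python
def apply_mall_value(sv, y):
    """Eq. (8) sketch: weight fuzzy prices by shopping mall perceived value.

    sv[n][i] : perceived value of mall n for retailer i (0 if no store)
    y[i][j]  : triangular fuzzy price (l, m, k) of retailer i, product j
    Returns y_mod[n][j] as a tuple (l, m, k).
    """
    n_malls, n_retailers, n_products = len(sv), len(y), len(y[0])
    return [
        [
            tuple(
                sum(sv[n][i] * y[i][j][t] for i in range(n_retailers))
                for t in range(3)
            )
            for j in range(n_products)
        ]
        for n in range(n_malls)
    ]


def normalize(y_mod):
    """Eq. (9) sketch, as printed: l / max k, m / max m, k / max l.

    Assumption: each maximum is taken over malls for a fixed product j.
    """
    n_malls, n_products = len(y_mod), len(y_mod[0])
    out = [[None] * n_products for _ in range(n_malls)]
    for j in range(n_products):
        max_l = max(y_mod[n][j][0] for n in range(n_malls))
        max_m = max(y_mod[n][j][1] for n in range(n_malls))
        max_k = max(y_mod[n][j][2] for n in range(n_malls))
        for n in range(n_malls):
            l, m, k = y_mod[n][j]
            out[n][j] = (l / max_k, m / max_m, k / max_l)
    return out


# Two malls, two retailers, one product (made-up values).
sv_example = [[0.6, 0.0], [0.4, 0.5]]
y_example = [[(10.0, 12.0, 15.0)], [(20.0, 22.0, 26.0)]]
y_mod_example = apply_mall_value(sv_example, y_example)
y_norm_example = normalize(y_mod_example)
```

In this toy case the second retailer has no store in the first mall, so the first mall's modified price depends only on the first retailer.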
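Step 7 (Fuzzy c means clustering of fuzzy data) The FcM algorithm for fuzzy data was introduced by D'Urso and Giordani (2006) using LR type fuzzy numbers. Coppi et al. (2012) discussed fuzzy and probabilistic FcM clustering, in which fuzzy data can be partitioned with segmentation variables and the allocation of membership degrees represents vagueness while assigning elements to a specific cluster. Unlike crisp clustering, fuzzy data-based FcM clustering considers center (centroid) and spread distances under a suitable weighting system. For the clustering process, the distance measure between two units defined by Coppi et al. (2012) is given as follows:

$$d_f^2\left( \tilde{y}_n, \tilde{y}_{n'} \right) = w_K^2 \left[ \lVert l_{1n} - l_{1n'} \rVert^2 + \lVert m_{1n} - m_{1n'} \rVert^2 + \lVert k_{1n} - k_{1n'} \rVert^2 \right] + w_S^2 \left[ \lVert a_n - a_{n'} \rVert^2 + \lVert b_n - b_{n'} \rVert^2 + \lVert d_n - d_{n'} \rVert^2 \right] \tag{10}$$

where $\tilde{y}_n \equiv \{ \tilde{y}_{nj} = (l_{nj}, m_{nj}, k_{nj}) \}$ represents the fuzzy data vector of the nth object; $l_{1n} \equiv (l_{1j1}, \ldots, l_{1jn}, \ldots, l_{1jN})$, $m_{1n} \equiv (m_{1j1}, \ldots, m_{1jn}, \ldots, m_{1jN})$ and $k_{1n} \equiv (k_{1j1}, \ldots, k_{1jn}, \ldots, k_{1jN})$ are the center components of the fuzzy data, and the squared Euclidean distances between two triangular fuzzy numbers for the center components are $\lVert l_{1n} - l_{1n'} \rVert^2$, $\lVert m_{1n} - m_{1n'} \rVert^2$ and $\lVert k_{1n} - k_{1n'} \rVert^2$. Similarly, $a_n \equiv (a_{j1}, \ldots, a_{jn}, \ldots, a_{jN})$, $b_n \equiv (b_{j1}, \ldots, b_{jn}, \ldots, b_{jN})$ and $d_n \equiv (d_{j1}, \ldots, d_{jn}, \ldots, d_{jN})$ are the spread components, and the squared Euclidean distances for the spread components are $\lVert a_n - a_{n'} \rVert^2$, $\lVert b_n - b_{n'} \rVert^2$ and $\lVert d_n - d_{n'} \rVert^2$. $w_K$ and $w_S$ are suitable weights for the distance measure, with $w_K, w_S \ge 0$ and $w_K + w_S = 1$. Note that the distance measure is calculated according to Agrawal (2015). Specifically, the objective function that must be minimized is given as follows:

$$\min \sum_{n=1}^{N} \sum_{c=1}^{C} u_{nc}^{\lambda} \, d_f^2\left( \tilde{y}_n, \tilde{h}_c \right) = \sum_{n=1}^{N} \sum_{c=1}^{C} u_{nc}^{\lambda} \left[ w_K^2 \left( \lVert l_{1n} - h_c^{L1} \rVert^2 + \lVert m_{1n} - h_c^{M1} \rVert^2 + \lVert k_{1n} - h_c^{K1} \rVert^2 \right) + w_S^2 \left( \lVert a_n - h_c^{A} \rVert^2 + \lVert b_n - h_c^{B} \rVert^2 + \lVert d_n - h_c^{D} \rVert^2 \right) \right] \tag{11}$$

subject to:

$$\sum_{c=1}^{C} u_{nc} = 1, \quad u_{nc} \ge 0, \quad w_K \ge w_S \ge 0, \quad w_K + w_S = 1$$

where $u_{nc}$ denotes the membership degree of the nth point in the cth cluster, $\lambda > 0$ is the fuzziness parameter, and $d_f^2(\tilde{y}_n, \tilde{h}_c)$ indicates the dissimilarity between the nth point and the centroid and spreads of the cth cluster. The fuzzy vectors $\tilde{h}_c \equiv \{ h_{cj} = (h_c^{L1}, h_c^{M1}, h_c^{K1}) \}$ and $\tilde{h}_c \equiv \{ h_{cj} = (h_c^{A}, h_c^{B}, h_c^{D}) \}$ $(j = 1, \ldots, J)$ represent the centroid and spread points of each cluster $(c = 1, \ldots, C)$; $h_c^{L1}, h_c^{M1}, h_c^{K1}$ denote the minimum, medium and maximum points of the centroid of the cth cluster, and $h_c^{A}, h_c^{B}, h_c^{D}$ represent the spreads of the cth cluster, respectively. The iterative solution of the constrained optimization problem is given in the following:

$$u_{nc} = \frac{\left[ w_K^2 \left( \lVert l_{1n} - h_c^{L1} \rVert^2 + \lVert m_{1n} - h_c^{M1} \rVert^2 + \lVert k_{1n} - h_c^{K1} \rVert^2 \right) + w_S^2 \left( \lVert a_n - h_c^{A} \rVert^2 + \lVert b_n - h_c^{B} \rVert^2 + \lVert d_n - h_c^{D} \rVert^2 \right) \right]^{-\frac{1}{\lambda - 1}}}{\sum_{c'=1}^{C} \left[ w_K^2 \left( \lVert l_{1n} - h_{c'}^{L1} \rVert^2 + \lVert m_{1n} - h_{c'}^{M1} \rVert^2 + \lVert k_{1n} - h_{c'}^{K1} \rVert^2 \right) + w_S^2 \left( \lVert a_n - h_{c'}^{A} \rVert^2 + \lVert b_n - h_{c'}^{B} \rVert^2 + \lVert d_n - h_{c'}^{D} \rVert^2 \right) \right]^{-\frac{1}{\lambda - 1}}}$$

$$h_c^{L1} = \frac{\sum_{n=1}^{N} u_{nc}^{\lambda} \, l_{1n}}{\sum_{n=1}^{N} u_{nc}^{\lambda}}, \quad h_c^{M1} = \frac{\sum_{n=1}^{N} u_{nc}^{\lambda} \, m_{1n}}{\sum_{n=1}^{N} u_{nc}^{\lambda}}, \quad h_c^{K1} = \frac{\sum_{n=1}^{N} u_{nc}^{\lambda} \, k_{1n}}{\sum_{n=1}^{N} u_{nc}^{\lambda}},$$

$$h_c^{A} = \frac{\sum_{n=1}^{N} u_{nc}^{\lambda} \, a_n}{\sum_{n=1}^{N} u_{nc}^{\lambda}}, \quad h_c^{B} = \frac{\sum_{n=1}^{N} u_{nc}^{\lambda} \, b_n}{\sum_{n=1}^{N} u_{nc}^{\lambda}}, \quad h_c^{D} = \frac{\sum_{n=1}^{N} u_{nc}^{\lambda} \, d_n}{\sum_{n=1}^{N} u_{nc}^{\lambda}}$$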
(12)
N wK =
n=1
2 2 2 λ A B D u − h + − h + − h a b d n n n nc c c c c=1
C
2 2 2 2 2 2 N C λ L1 M1 K 1 A B D n=1 c=1 u nc l1n − h c + m 1n − h c + k1n − h c + an − h c + bn − h c + dn − h c
123
(13)
(14)
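As a brief check on the membership update in Eq. (12), the same form can be recovered from the objective in Eq. (11) with a standard Lagrange multiplier argument. The sketch below assumes fixed prototypes and weights, $\lambda > 1$ and strictly positive dissimilarities. The shorthand $d_{nc}^2$ (the dissimilarity between the nth object and the prototypes of the cth cluster) and the multiplier $\mu_n$ are introduced here for illustration and are not part of the original notation.

```latex
\begin{aligned}
\mathcal{L}_n &= \sum_{c=1}^{C} u_{nc}^{\lambda}\, d_{nc}^{2} - \mu_n \Big( \sum_{c=1}^{C} u_{nc} - 1 \Big), \\
\frac{\partial \mathcal{L}_n}{\partial u_{nc}} &= \lambda\, u_{nc}^{\lambda - 1} d_{nc}^{2} - \mu_n = 0
\;\Rightarrow\;
u_{nc} = \Big( \frac{\mu_n}{\lambda\, d_{nc}^{2}} \Big)^{\frac{1}{\lambda - 1}}, \\
\sum_{c'=1}^{C} u_{nc'} = 1
&\;\Rightarrow\;
u_{nc} = \frac{ \big( d_{nc}^{2} \big)^{-\frac{1}{\lambda - 1}} }{ \sum_{c'=1}^{C} \big( d_{nc'}^{2} \big)^{-\frac{1}{\lambda - 1}} }.
\end{aligned}
```

For instance, with $\lambda = 2$ and two clusters at dissimilarities 0.1 and 0.4, the memberships are $10 / (10 + 2.5) = 0.8$ and $0.2$, so the closer prototype receives the larger membership.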
Table 1 The decision criteria and sub-criteria for shopping mall perceived value determination
Foursquare rating
Number of votes in Foursquare
Number of uploaded photographs
Number of check-ins
Facebook rate
Number of comments in Foursquare
Recommendation rate in Google
Number of comments on Facebook
Number of comments in Google
Voting day (just after visiting or not)
Variety of spare time activities
Voting intention
Visiting day or hour
Visiting frequency of Foursquare
Total number of visits (monthly)
Number of recommendations in Foursquare
Average time spent in the location
Page traffic
Availability (transportation)
Variety of restaurants
Real estate index (TL/m²)
Financial turnover
Change in rent (%)
Annual gyro
Variety of “on sales”
Average advertisement costs
Minimum rent per m²
Annual gyro for each category
Variety of stores
Overall area (m²)
Available area (m²)
Number of competitors in the same location
The fundamental assumption for triangular fuzzy numbers is that they inherit their topology from the observed data. Note that centroids are updated using weighted means of the observed data, and the weights are calculated from the membership degrees as indicated in D’Urso et al. (2015)’s study. The general algorithm for FcM clustering with fuzzy data is given as follows:
Step 7.1 Generate the membership degree matrix $U^{(0)}$ randomly using the objective function.
Step 7.2 Generate the $\tilde{H}^{(0)}$ matrix including centroids and spreads using the formulas of $h_c^{A}, h_c^{B}, h_c^{D}$ according to the membership degree matrix $U^{(0)}$.
Step 7.3 Generate the weights $w_K^{(T)}$ and $w_S^{(T)}$ by fixing $U^{(T-1)}$ and $\tilde{H}^{(T-1)}$, where T represents the iteration number.
Step 7.4 Update $\tilde{H}^{(T)}$ by fixing $U^{(T-1)}$.
Step 7.5 Update $U^{(T)}$ using the formulas of $h_c^{L1}, h_c^{M1}, h_c^{K1}$ given $\tilde{H}^{(T)}$, $w_K^{(T)}$ and $w_S^{(T)}$.
Step 7.6 If $\|U^{(T)} - U^{(T-1)}\| < \varepsilon$, stop the algorithm. Otherwise, go to Step 7.3.
Section 5 presents the adaptation of the proposed methodology to a real case for retailer clustering.
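Steps 7.1 to 7.6 can be sketched in NumPy as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `fcm_fuzzy_data`, its parameters, and the layout of the inputs (one array holding all center components and one holding all spread components per object) are hypothetical, the maximum absolute change in U is used as the stopping norm, and the constraint $w_K \ge w_S$ is not enforced.

```python
import numpy as np

def fcm_fuzzy_data(centers, spreads, n_clusters=3, lam=2.0, eps=1e-6, max_iter=200, seed=0):
    """Fuzzy c-means for fuzzy data, loosely following Steps 7.1-7.6.

    centers: (N, P) array, e.g. the (l, m, k) points of every feature side by side.
    spreads: (N, Q) array, e.g. the (a, b, d) spread components of every feature.
    Returns memberships U (N, C), center prototypes, spread prototypes and w_K.
    """
    centers = np.asarray(centers, dtype=float)
    spreads = np.asarray(spreads, dtype=float)
    rng = np.random.default_rng(seed)

    def prototypes(u):
        # Membership-weighted means of the observed data (form of Eq. 13).
        um = u ** lam
        denom = um.sum(axis=0)[:, None]
        return um.T @ centers / denom, um.T @ spreads / denom

    def sq_dists(h_k, h_s):
        # Squared Euclidean distances of every object to every prototype, shape (N, C).
        d_k = ((centers[:, None, :] - h_k[None, :, :]) ** 2).sum(axis=2)
        d_s = ((spreads[:, None, :] - h_s[None, :, :]) ** 2).sum(axis=2)
        return d_k, d_s

    # Step 7.1: random memberships whose rows sum to 1.
    u = rng.random((centers.shape[0], n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    # Step 7.2: initial prototypes from the initial memberships.
    h_k, h_s = prototypes(u)
    w_k = 0.5
    for _ in range(max_iter):
        # Step 7.3: weights from the previous memberships and prototypes (form of Eq. 14).
        um = u ** lam
        d_k, d_s = sq_dists(h_k, h_s)
        total = (um * (d_k + d_s)).sum()
        w_k = (um * d_s).sum() / total if total > 0 else 0.5
        w_s = 1.0 - w_k
        # Step 7.4: prototypes from the previous memberships.
        h_k, h_s = prototypes(u)
        # Step 7.5: memberships from the weighted dissimilarity (form of Eq. 12).
        d_k, d_s = sq_dists(h_k, h_s)
        d = np.maximum(w_k ** 2 * d_k + w_s ** 2 * d_s, 1e-12)  # guard against division by zero
        inv = d ** (-1.0 / (lam - 1.0))
        u_new = inv / inv.sum(axis=1, keepdims=True)
        # Step 7.6: stop when the memberships no longer change.
        converged = np.abs(u_new - u).max() < eps
        u = u_new
        if converged:
            break
    return u, h_k, h_s, w_k
```

A call such as `fcm_fuzzy_data(centers, spreads, n_clusters=3, lam=2.0)` returns the membership matrix together with the prototypes. Keeping centers and spreads in separate arrays mirrors the two weighted terms of the dissimilarity, so the weight update only needs the two partial sums.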
5 Application
As discussed in the introduction, location similarity detection is the fundamental process for studies based on customer mobility prediction and for recommendation systems. In this study, location similarity is evaluated as a retailer segmentation process with a holistic view: (1) an interval type 2 hesitant MCDM approach is used for the determination of location perceived value; (2) retailer (store) clustering is applied to the sale prices of different product groups using FcM clustering of fuzzy data. In the first phase, shopping malls are evaluated according to several criteria and sub-criteria to assign the shopping mall perceived value. After that, retailer sales prices are updated according to this value. Finally, the fuzzy data-based FcM clustering procedure is applied to determine retailer clusters as a preliminary phase of building location-based systems.
5.1 Data collection and data processing
Data for this study are collected from diversified sources. For the evaluation of location perceived value, six main criteria and 23 sub-criteria are defined by shopping mall valuation experts. To give a general point of view, the relevant locations are limited to three shopping malls, named A1, A2 and A3, located in different regions of Istanbul. The main criteria and sub-criteria are given in Table 1. As seen from Table 1, the shopping mall data contain Foursquare, Google and Facebook ratings and the number of comments in Foursquare/Facebook. These data can reflect anonymous users’ opinions about a specific location (see Fig. 2). The total number of visits over the last 3 months, availability, average time spent in the shopping mall, variety of stores over the last 3 years, financial turnover (annual gyro) and change in rental value over the last 3 years also affect the decision on the perceived value of shopping malls. After that, three decision makers evaluated these criteria, and finally, the shopping malls’ perceived values are extracted.
Fig. 2 Location information extracted from Google for location perceived value
In the second phase, retailer sales prices are collected with the assistance of 15 trained volunteers who investigated minimum, maximum and average prices including price discounts, and each price level is represented as a triangular fuzzy number. The sales price data include the following categories: ready-to-wear (n = 53), menswear (n = 16), womenswear (n = 98), accessories (n = 32), shoes and bags (n = 59), electronics (n = 8), kids wear and toys (n = 16), home textile (n = 37), sportswear (n = 36) and cosmetics (n = 50). The data for each category consist of four types of diversified products, as presented in Table 2. For instance, retailers in the “electronics” category are clustered according to the refrigerator, washing machine, notebook and mobile phone sales data. Note that before starting the clustering process, outliers of the retailer sales price data are eliminated. Because the variables have diversified units of measurement, the corresponding centroids and spreads are standardized using the mean and the standard deviation, as in Coppi et al. (2012)’s study, to eliminate missing or noisy data without losing necessary information.
5.2 Numeric application and results
In the first phase, the interval type 2 hesitant fuzzy decision-making procedure is implemented. Here, shopping mall perceived value is evaluated according to the relevant criteria and sub-criteria. Step 1 The main criteria and sub-criteria are determined as given in Table 1. Three famous shopping malls are assessed
for location perceived value determination. Criteria weights are taken as equal. The linguistic term set and semantics are defined according to Hu et al. (2015)’s study.
Step 2–3 Three decision makers’ individual preference relations ($B^l$) for criteria, sub-criteria and alternatives are gathered, and the relations are expressed in the form of IT2HFN according to the lower and upper bounds as $([b_{ij}^{l-}, b_{ij}^{l+}])$. The collected preferences for major criteria are presented in Appendix A. After that, the hesitant fuzzy linguistic evaluations are transformed into IT2HFTs and pairwise linguistic evaluations are obtained. The evaluations are then turned into numeric evaluations with respect to the corresponding IT2HFLTS, as in Hu et al. (2015)’s study. For brevity, only a short example is presented. The pairwise comparison of “Number of Votes in Foursquare” and “Variety of stores” is evaluated as [M, H] and [ML, H]. For [ML, H], the relevant interval type 2 hesitant fuzzy element (IT2HFE) can be assigned as (0.2325, 0.255, 0.325, 0.3575; 0.8, 0.8), (0.17, 0.22, 0.36, 0.42; 1.0, 1.0); (0.7825, 0.815, 0.885, 0.9075; 0.8, 0.8), (0.72, 0.78, 0.92, 0.97; 1.0, 1.0), and for [M, H], the IT2HFE is specified as (0.4025, 0.4525, 0.5375, 0.5675; 0.8, 0.8), (0.32, 0.41, 0.58, 0.65; 1.0, 1.0); (0.7825, 0.815, 0.885, 0.9075; 0.8, 0.8), (0.72, 0.78, 0.92, 0.97; 1.0, 1.0). Finally, the decision makers’ individual preferences are aggregated using the IT2HFWA operator. For instance, the aggregated preference relation for “Number of Votes in Foursquare” with respect to “Variety of stores” is calculated in the following according to the aggregation operator from Step 3.

\[
\begin{aligned}
\tilde{h}_{24} &= \mathrm{IT2HFWA}(\tilde{h}_{i1}, \tilde{h}_{i2}, \ldots, \tilde{h}_{in}) = \bigoplus_{j=1}^{n} w_j \tilde{h}_{ij} \quad (i = 1, 2, \ldots, m) \\
&= \tfrac{1}{3} \otimes \{ \langle (0.4025, 0.4525, 0.5375, 0.5675; 0.8, 0.8), (0.32, 0.41, 0.58, 0.65; 1, 1) \rangle, \langle (0.7825, 0.815, 0.885, 0.9075; 0.8, 0.8), (0.72, 0.78, 0.92, 0.97; 1, 1) \rangle \} \\
&\quad \oplus \tfrac{1}{3} \otimes \{ \langle (0.2325, 0.255, 0.325, 0.3575; 0.8, 0.8), (0.17, 0.22, 0.36, 0.42; 1, 1) \rangle, \langle (0.7825, 0.815, 0.885, 0.9075; 0.8, 0.8), (0.72, 0.78, 0.92, 0.97; 1, 1) \rangle \} \\
&\quad \oplus \tfrac{1}{3} \otimes \{ \langle (0.2325, 0.255, 0.325, 0.3575; 0.8, 0.8), (0.17, 0.22, 0.36, 0.42; 1, 1) \rangle, \langle (0.7825, 0.815, 0.885, 0.9075; 0.8, 0.8), (0.72, 0.78, 0.92, 0.97; 1, 1) \rangle \} \\
&= \{ \langle (0.294, 0.328, 0.405, 0.437; 0.8, 0.8), (0.223, 0.289, 0.444, 0.510; 1, 1) \rangle, \langle (0.782, 0.815, 0.885, 0.907; 0.8, 0.8), (0.72, 0.780, 0.920, 0.970; 1, 1) \rangle \}
\end{aligned}
\]

Table 2 A sample of fuzzy data of retailer sales prices for accessories
Retailer ID   Diamond jewelry                Pearl necklace                 Sunglasses                     Watch
              Minimum  Average  Maximum      Minimum  Average  Maximum      Minimum  Average  Maximum      Minimum  Average  Maximum
1             0        0        0            0.002    0.003    0.01         0        0        0            0        0.001    0.001
2             0.001    0.023    0.094        0.026    0.054    0.126        0        0        0            0.007    0.019    0.085
3             0.002    0.02     0.152        0.044    0.184    0.827        0        0        0            0.004    0.02     0.073
4             0.002    0.019    0.261        0.049    0.13     0.262        0        0        0            0        0        0
5             0.002    0.021    0.285        0.078    0.13     0.268        0        0        0            0        0        0
6             0        0        0            0        0        0            0.009    0.105    0.658        0        0        0
7             0.002    0.019    0.073        0        0        0            0        0        0            0        0        0
8             0        0        0            0        0        0            0        0        0            0        0.005    0.769
9             0.016    0.094    0.646        0        0        0            0        0        0            0.111    0.436    0.917
10            0        0        0            0        0        0            0.022    0.072    0.231        0.001    0.002    0.005

Step 4 The scores $s(\tilde{h}_i)$ $(i = 1, 2, \ldots, m)$ for $\tilde{h}_i$ $(i = 1, 2, \ldots, m)$ are calculated using the score function definition as seen below. The overall scores of the pairwise comparison matrix are given in Appendix B. For example, the score function result for “Number of Votes in Foursquare” according to “Variety of stores” is determined in the following way:

\[
\text{Optimistic score}(\tilde{h}) = \frac{1}{\#\tilde{h}} \sum_{\tilde{A} \in \tilde{h}} \mathrm{score}(\tilde{A})
= \frac{1}{\#\tilde{h}} \sum_{\tilde{A} \in \tilde{h}} \left[ \frac{(0.782 + 0.907)}{2} + \frac{(0.8 + 0.8 + 1 + 1)}{4} \right] \times \frac{(0.72 + 0.780 + 0.920 + 0.970 + 0.782 + 0.815 + 0.885 + 0.907)}{8} = 1.479
\]

\[
\text{Pessimistic score}(\tilde{h}) = \frac{1}{\#\tilde{h}} \sum_{\tilde{A} \in \tilde{h}} \mathrm{score}(\tilde{A})
= \frac{1}{\#\tilde{h}} \sum_{\tilde{A} \in \tilde{h}} \left[ \frac{(0.437 + 0.294)}{2} + \frac{(0.8 + 0.8 + 1 + 1)}{4} \right] \times \frac{(0.223 + 0.289 + 0.444 + 0.510 + 0.294 + 0.328 + 0.405 + 0.437)}{8} = 0.463
\]
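The aggregation and score values reported for Steps 2 to 4 can be checked numerically. The sketch below is illustrative only: the function names `it2hfwa` and `score` are hypothetical, the aggregation form $1 - \prod_j (1 - a_j)^{w_j}$ applied to each reference point is an assumption that happens to reproduce the aggregated values printed for the worked example, and the score form is read off the optimistic and pessimistic calculations shown in the example rather than taken from a stated general definition.

```python
import numpy as np

def it2hfwa(points, weights):
    """Aggregate reference points as 1 - prod((1 - a_j) ** w_j) (assumed form).

    points: one row of four trapezoid reference points per decision maker.
    """
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)[:, None]
    return 1.0 - np.prod((1.0 - points) ** weights, axis=0)

def score(lmf, umf, heights):
    """Score as used in the worked example (inferred form):
    (mean of the LMF end points + mean height) * mean of all eight reference points."""
    lmf, umf = np.asarray(lmf, dtype=float), np.asarray(umf, dtype=float)
    return ((lmf[0] + lmf[-1]) / 2 + np.mean(heights)) * np.mean(np.concatenate([umf, lmf]))

# Linguistic terms from the example: (lower membership function, upper membership function).
M = ([0.4025, 0.4525, 0.5375, 0.5675], [0.32, 0.41, 0.58, 0.65])
ML = ([0.2325, 0.255, 0.325, 0.3575], [0.17, 0.22, 0.36, 0.42])
H = ([0.7825, 0.815, 0.885, 0.9075], [0.72, 0.78, 0.92, 0.97])
heights = [0.8, 0.8, 1.0, 1.0]
w = [1 / 3, 1 / 3, 1 / 3]  # equal decision maker weights

# Decision makers gave [M, H], [ML, H] and [ML, H].
low_lmf = it2hfwa([M[0], ML[0], ML[0]], w)
low_umf = it2hfwa([M[1], ML[1], ML[1]], w)
high_lmf = it2hfwa([H[0], H[0], H[0]], w)
high_umf = it2hfwa([H[1], H[1], H[1]], w)

optimistic = score(high_lmf, high_umf, heights)
pessimistic = score(low_lmf, low_umf, heights)
overall = (optimistic + pessimistic) / 2
print(np.round(low_lmf, 3), round(optimistic, 3), round(pessimistic, 3), round(overall, 4))
```

Under these assumptions the lower element aggregates to approximately (0.294, 0.328, 0.405, 0.437), and the optimistic, pessimistic and averaged scores come out close to 1.479, 0.463 and 0.971, in line with the worked example.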
Note that the overall score for “Number of Votes in Foursquare” is the average of the optimistic and pessimistic scores (0.9712).
Step 5 The dominance matrix, which uses the difference between each pair of preference relations to determine the dominance degree of alternatives, is given in Table 3. An example is performed for “Number of Votes in Foursquare” and “Variety of stores” as follows:

\[
DD_{24} = \max(0, B(I_2 > I_4) - B(I_4 > I_2)) = \max(0, 0.9711 - 0.5824) = 0.389
\]

Rodriguez et al. (2012)’s non-dominance rule is applied to “Number of Votes in Foursquare” and the relevant alternatives using the following expression:

\[
NDD_2 = |\min((1 - DD_1), (1 - DD_2), \ldots, (1 - DD_n))| = |\min((1 - 0.4700), (1 - 0.0711), \ldots, (1 - 0.1747))| = 1
\]
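The dominance and non-dominance calculations of Step 5 can be sketched as follows. The function names are hypothetical, and reading the non-dominance degree of criterion i as the minimum of $1 - DD_{ji}$ over the other criteria j is an assumption; it is consistent with the dominance matrix and non-dominance degrees reported for the comparison method (Table 6). The last lines normalize the six non-dominance degrees reported for the main criteria (0.527, 1.000, 0.444, 0.905, 0.905, 0.850) into weights.

```python
import numpy as np

def dominance_degrees(B):
    """DD[i, j] = max(0, B[i, j] - B[j, i]) for a matrix of preference scores B."""
    B = np.asarray(B, dtype=float)
    D = np.maximum(0.0, B - B.T)
    np.fill_diagonal(D, 0.0)
    return D

def non_dominance_degrees(D):
    """NDD_i = |min over j != i of (1 - DD[j, i])| (assumed column-wise reading)."""
    D = np.asarray(D, dtype=float)
    n = len(D)
    return np.array([abs(min(1.0 - D[j, i] for j in range(n) if j != i)) for i in range(n)])

# Pairwise example from the text: B(I2 > I4) = 0.9711 and B(I4 > I2) = 0.5824.
B_pair = [[0.0, 0.9711],
          [0.5824, 0.0]]
D_pair = dominance_degrees(B_pair)

# Dominance matrix reported for the comparison method (Table 6), diagonal set to 0.
D6 = np.array([
    [0.00, 0.00, 0.44, 0.76, 0.47, 0.67],
    [0.79, 0.00, 1.00, 1.00, 1.00, 1.00],
    [0.00, 0.00, 0.00, 1.00, 0.50, 0.65],
    [0.00, 0.00, 0.00, 0.00, 0.00, 0.05],
    [0.00, 0.00, 0.00, 0.32, 0.00, 0.32],
    [0.00, 0.00, 0.00, 0.00, 0.00, 0.00],
])
ndd6 = non_dominance_degrees(D6)

# Normalizing the non-dominance degrees of the main criteria into weights.
ndd_main = np.array([0.527, 1.000, 0.444, 0.905, 0.905, 0.850])
weights = ndd_main / ndd_main.sum()
print(np.round(D_pair[0, 1], 3), np.round(ndd6, 2), np.round(weights, 3))
```

The pairwise example yields a dominance degree of about 0.389, and the normalized weights agree with the reported values to within rounding.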
The non-dominance rule results are calculated as 0.527, 1.000, 0.444, 0.905, 0.905 and 0.850, respectively. Additionally, alternatives’ scores are gathered after the normalization process, and the normalized weights of the main criteria are extracted as 0.114, 0.216, 0.096, 0.195, 0.195 and 0.183. Similarly, after the same steps are followed for sub-criteria and alternatives, location perceived value scores are determined as given in Appendix C. As seen from the related table, location prestige scores are found as 0.527, 0.321 and 0.153 for A1, A2 and A3.

Table 3 Dominance matrix for main criteria
Criterion                               I1      I2      I3      I4      I5      I6
I1 Foursquare rating                    –       0.000   0.000   0.000   1.905   1.635
I2 Number of votes in Foursquare        0.863   –       1.444   0.389   1.571   1.693
I3 Total number of visits (monthly)     0.674   0.000   –       1.680   1.526   1.850
I4 Variety of stores                    1.527   0.000   0.000   –       0.000   0.000
I5 Real estate index (TL/m²)            0.000   0.000   0.000   1.905   –       1.070
I6 Financial turnover                   0.000   0.000   0.000   0.477   0.000   –

Step 6 (Shopping mall perceived value adaptation) The shopping mall perceived value indicator matrix consists of the total scores of the three relevant shopping malls and the retailer stores. If a retailer store does not appear in a specific shopping mall, the perceived value is 0. For instance, if a retailer under the category of “accessories” does not appear in shopping mall A1, then the perceived value is 0. From this point, matrix multiplication is performed to determine the perceived value added sales prices. A brief example considering the retailer prices for “accessories” is shown in the following:

\[
\underbrace{\begin{bmatrix} 0 & \cdots & 0.527 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 0.153 \end{bmatrix}_{3 \times 10}}_{\text{perceived values of shopping malls}}
\times
\underbrace{\begin{bmatrix} (0, 0, 0) & \cdots & (0, 0.001, 0.001) \\ \vdots & \ddots & \vdots \\ (0, 0, 0) & \cdots & (0.001, 0.002, 0.005) \end{bmatrix}_{10 \times 4}}_{\text{standardized sales prices}}
=
\underbrace{\begin{bmatrix} (0, 0, 0) & \cdots & (0.002, 0.003, 0.006) \\ \vdots & \ddots & \vdots \\ (0, 0, 0) & \cdots & (0.007, 0.009, 0.01) \end{bmatrix}}_{\text{perceived value added sales prices}}
\]

Note that the fuzzy multiplication and summation procedures for triangular fuzzy numbers are applied to obtain the retailer perceived value added sales price matrix.
Step 7 (Fuzzy c means clustering to fuzzy data) The perceived value added sales prices are clustered with FcM clustering. To illustrate the clustering procedure, the “accessories” sales data (n = 32) are processed considering c = 3 clusters and p = 4 features (categories). Before starting the iterations, the fuzziness coefficient is set to λ = 2, with $w_K = 0.6$ and $w_S = 0.4$. After four iterative steps of fuzzy clustering, the algorithm converged with an acceptable error. As seen from Fig. 3, 12 of the observations are assigned to Cluster 3 (‘high prices-high quality jewelry’), and 4 observations appear in Cluster 2 (‘casual accessories’). Additionally, Table 4 presents the centroid and spread points. In fact, the structure of the prototypes is inherited from the observed data. Therefore, the prototypes are presented as triangular fuzzy numbers. The last column of Table 5 contains the sum of squares (SSQ) between the object $\tilde{y}_n \equiv \{\tilde{y}_{nj} = (l_{nj}, m_{nj}, k_{nj})\}$ and the centroids $\tilde{h}_c = (h_c^{L}, h_c^{M}, h_c^{K})$, and also between the spreads $\tilde{y}'_{nj} = (a_{nj}, b_{nj}, d_{nj})$ and $\tilde{h}'_{cj} = (h_c^{A}, h_c^{B}, h_c^{D})$, as

\[
SSQ(\tilde{y}_{nj}, \tilde{h}_c, \tilde{y}'_{nj}, \tilde{h}'_{cj}) = \|l_{1n} - h_c^{L1}\|^2 + \|m_{1n} - h_c^{M1}\|^2 + \|k_{1n} - h_c^{K1}\|^2 + \|a_n - h_c^{A}\|^2 + \|b_n - h_c^{B}\|^2 + \|d_n - h_c^{D}\|^2 .
\]

Note that low SSQ values imply that an object is clearly assigned to a given cluster with the corresponding centroids and spreads. For example, with respect to Cluster 2, the objects with the lowest SSQ values have high membership degrees (more than 0.90), except for Retailer 13 and Retailer 18. Also from Table 5, the membership degrees of the objects (retailers) that are clearly assigned (i.e., with membership degrees > 0.50) to Cluster 2 are very high (> 0.90), and the same holds for the objects assigned to Cluster 1 except for Retailer 5 and Retailer 20.
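Returning to the perceived value adaptation of Step 6, the product of a crisp perceived value matrix with a matrix of triangular fuzzy prices can be sketched as below. The function name `add_perceived_value` and the small mall-by-retailer layout are hypothetical; the sketch assumes non-negative perceived values, so that scaling a triangular fuzzy number multiplies each of its three points and fuzzy addition sums them point by point. The prices are the Pearl necklace and Watch entries of Retailers 1 and 2 in Table 2, and the mall scores 0.527 and 0.153 are the reported values for A1 and A3; which retailer sits in which mall is invented for illustration.

```python
import numpy as np

def add_perceived_value(sv, prices):
    """Multiply crisp perceived values into triangular fuzzy prices.

    sv: (malls, retailers) perceived values, 0 where the retailer is absent.
    prices: (retailers, products, 3) triangular fuzzy numbers (l, m, k).
    Returns (malls, products, 3): for each mall and product, the point-wise
    sum over retailers of the scaled triangular fuzzy numbers.
    """
    sv = np.asarray(sv, dtype=float)
    prices = np.asarray(prices, dtype=float)
    return np.einsum("nr,rpk->npk", sv, prices)

# Hypothetical layout: mall A1 hosts Retailers 1 and 2, mall A3 hosts Retailer 2 only.
sv = [[0.527, 0.527],
      [0.000, 0.153]]
prices = [
    [(0.002, 0.003, 0.010), (0.000, 0.001, 0.001)],   # Retailer 1: pearl necklace, watch
    [(0.026, 0.054, 0.126), (0.007, 0.019, 0.085)],   # Retailer 2: pearl necklace, watch
]
adapted = add_perceived_value(sv, prices)
print(adapted.shape)
```

Storing each fuzzy price as a length-3 vector lets a single `einsum` call carry out the scalar multiplications and fuzzy additions at once, which keeps the operation close to the matrix form used in the text.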
Fig. 3 Clusters after four iterations are progressed to “Accessories” sales data (vertical axis: assigned clusters, 1–3; horizontal axis: observations, 1–32)

Table 4 Centroid and spread points of clusters
               Diamond jewelry          Pearl necklace           Sunglasses               Watch
Centroid CL1   (0.006, 0.029, 0.15)     (0.01, 0.031, 0.1)       (0.012, 0.031, 0.083)    (0.007, 0.031, 0.136)
Centroid CL2   (0.027, 0.103, 0.404)    (0.033, 0.118, 0.348)    (0.054, 0.125, 0.314)    (0.029, 0.119, 0.437)
Centroid CL3   (0.015, 0.045, 0.105)    (0.014, 0.056, 0.148)    (0.031, 0.062, 0.147)    (0.014, 0.056, 0.165)
Spread CL1     (0.021, 0.074, 0.255)    (0.024, 0.087, 0.248)    (0.042, 0.094, 0.231)    (0.022, 0.087, 0.301)
Spread CL2     (0.054, 0.201, 0.619)    (0.054, 0.201, 0.619)    (0.054, 0.201, 0.619)    (0.003, 0.004, 0.016)
Spread CL3     (0.005, 0.027, 0.067)    (0.005, 0.027, 0.067)    (0.005, 0.027, 0.067)    (0.001, 0.004, 0.025)
Table 5 Membership degrees of each cluster for each object
Retailer ID  Cluster 1  Cluster 2  Cluster 3  Total SSQ    Retailer ID  Cluster 1  Cluster 2  Cluster 3  Total SSQ
1            0.001      0.996      0.002      0.2844       17           0.487      0.001      0.513      0.4134
2            0.996      0.004      0          0.1469       18           0.269      0.434      0.297      0.1031
3            0.005      0.99       0.005      0.1209       19           0.003      0.993      0.005      0.1156
4            0.661      0.335      0.003      0.5674       20           0.502      0.001      0.497      0.1054
5            0.526      0.469      0.006      0.6543       21           0.486      0          0.513      0.3870
6            0.028      0.93       0.042      0.1765       22           0.002      0.995      0.003      0.1321
7            0.497      0.001      0.503      0.5217       23           0.49       0.001      0.51       0.3499
8            0          1          0          0.0902       24           0.487      0          0.513      0.2944
9            0.032      0.927      0.042      0.1134       25           0.002      0.995      0.003      0.1669
10           0.002      0.995      0.003      0.1231       26           0.002      0.994      0.004      0.1109
11           0.002      0.993      0.004      0.1256       27           0.07       0.904      0.027      0.1674
12           0.002      0.997      0.001      0.1098       28           0.048      0.905      0.047      0.3143
13           0.301      0.657      0.042      0.2670       29           0.001      0.998      0.001      0.1765
14           0.006      0.988      0.006      0.1006       30           0.002      0.994      0.004      0.5217
15           0.006      0.014      0.979      0.1024       31           0.023      0.964      0.014      0.0902
16           0.497      0.001      0.502      0.6432       32           0.018      0.926      0.056      0.1134
5.3 Comparative analysis
A comparative analysis is applied to validate the methodology. For the first phase, the proposed method is compared with Rodriguez et al. (2013)’s study. The same problem with the same data is used to identify the differences between the two methods. The main criteria evaluations are utilized for the comparison. In that case, the same linguistic evaluations and linguistic term set are used as presented in Step 1 of the proposed methodology. Linguistic intervals are determined as given in the proposed methodology. According to the preference relations, the dominance matrix is shown in Table 6. When the non-dominance choice degree $NDD_i$ is applied to the preference relation, the results are 0.212, 1.000, 0.000, 0.000, 0.000 and 0.000, respectively. After the normalization process, weights can be calculated only for Foursquare rating and Number of votes in Foursquare. The other criteria weights cannot be determined because their non-dominance values are zero. As seen in the results, Number of votes in Foursquare is selected as the most important criterion, and Foursquare rating is the second most important criterion. Therefore, Rodriguez et al. (2013)’s study could not ensure precise distinctions while dealing with three or more criteria. After the normalized dominance choice degrees are gathered, the relevant alternative scores are determined as Alternative 1 (0.549) > Alternative 2 (0.271) > Alternative 3 (0.180), which is the same ranking as the one obtained from the proposed methodology. As a result of this comparison, the IT2HFS-based decision-making method proposed in Hu et al. (2015)’s paper is more comprehensive than most existing methods, such as those in Lee and Chen (2013) and Rodriguez et al. (2013)’s studies. Furthermore, the interval type 2 hesitant decision-making method used here has a substantial advantage over that proposed by Rodriguez et al. (2013) because it yields more accurate results using non-dominance choice degrees without information loss. Moreover, the transformation of the linguistic fuzzy terms into IT2HFS enables more accurate results than using T1FNs, as discussed in Hu et al. (2015)’s study.

Table 6 Dominance matrix of main criteria
                                     Foursquare  Number of votes  Total number of    Variety    Real estate     Financial
                                     rating      in Foursquare    visits (monthly)   of stores  index (TL/m²)   turnover
Foursquare rating                    –           0.00             0.44               0.76       0.47            0.67
Number of votes in Foursquare        0.79        –                1.00               1.00       1.00            1.00
Total number of visits (monthly)     0.00        0.00             –                  1.00       0.50            0.65
Variety of stores                    0.00        0.00             0.00               –          0.00            0.05
Real estate index (TL/m²)            0.00        0.00             0.00               0.32       –               0.32
Financial turnover                   0.00        0.00             0.00               0.00       0.00            –

Fig. 4 Comparison of the extracted 3 clusters for “Accessories” data according to fuzzy and crisp data (series: cluster with fuzzy data, cluster with crisp data; vertical axis: clusters, 1–3; horizontal axis: number of observations, 1–32)

For the discussion of the impact of the FcM algorithm on fuzzy data, the ready-to-wear (n = 53), menswear (n = 16), womenswear (n = 98), accessories (n = 32), shoes and bags (n = 59), electronics (n = 8), kids wear and toys (n = 16), home textile (n = 37), sportswear (n = 36) and cosmetics (n = 50) datasets are submitted to the FcM algorithm with both fuzzy data and crisp data. Note that fuzzy number
is defuzzified at the beginning of the clustering process. In addition to that case, the proposed integrated clustering procedure is compared with the methodology given in D’Urso and Giordani (2006)’s study. The impact of clustering with fuzzy data is tested using the “accessories” dataset. As seen from Fig. 4, the assigned clusters differ depending on whether the data are defuzzified at the beginning of the clustering procedure or kept as fuzzy data. According to the clusters gathered from the “accessories” dataset, if clustering proceeds with fuzzy data, objects (retailers) are separated more distinctly than with the crisp dataset. For instance, Retailer 9 is assigned to Cluster 1 with crisp data. If fuzzy data are utilized, Retailer 9 is in Cluster 3. This situation indicates the importance of the information loss caused by defuzzifying data before implementing FcM clustering. For other datasets with a considerable number of objects, such as the category “womenswear,” the difference in assigned clusters is more observable and, in general, can be more critical. The reason for this situation is a reflection of the objects’ distinctness before building personalized location-based systems, as shown in Appendix D.
For describing the impact of adapting the IT2HFSs-based decision-making procedure in the FcM clustering process, the proposed methodology is compared with D’Urso and Giordani (2006)’s study. Since minimization of the dissimilarity measure in the clustering process is similar to D’Urso and Giordani (2006)’s study and these two models are based on the minimization of a dissimilarity measure, we consider dissimilarity as the comparison measure. The comparison is applied to the “womenswear” dataset, and eleven different data dimensions are used by increasing the number of objects (n = 10, n = 45, n = 98) and the number of features (p = 2, 3, 4). The index of fuzziness is set to λ = 2 and λ = 3, and the centroid weights are assumed as $w_K$ = 0.5, 0.6, 0.7. Additionally, the dataset is separated into 3 clusters and 4 iterations are conducted for each condition. The membership threshold for $u_{nc}$ is taken as 0.4, 0.67 and 0.9. Note that the higher the cluster membership, the more clearly the model assigns an object to a cluster. The recovering performance of the models is given in Table 7. On average, the adaptation of the IT2HFSs-based decision-making procedure in the FcM clustering process is successfully implemented and performs better than D’Urso and Giordani (2006)’s study. The performance of the models decreases when the number of observations increases. Likewise, when the membership degree threshold increases from 0.67 to 0.90, the number of well-classified objects decreases. The main reason for this situation is the discriminating effect of the spreads on the objects, rather than dealing only with centroids. Besides that, when the number of features increases from p = 2 to p = 4 with u = 0.4, our model outperforms D’Urso and Giordani (2006)’s study. On the other hand, the number of well-classified objects is generally higher in D’Urso and Giordani (2006)’s study under the conditions u = 0.67 and u = 0.90 and p = 2, 3, 4. The fuzziness coefficient λ = 2 does not have a significant effect on the differentiation of the performance, but for λ = 3, the IT2HFSs-based decision-making adapted model performs better than D’Urso and Giordani (2006)’s study. Similarly, the weights of the dissimilarity distance for centroids do not influence the performance except for $w_K$ = 0.7 for correctly classified objects.

Table 7 Percentage of correctly classified objects (retailers) with membership degrees higher than u = 0.4, u = 0.67 and u = 0.90
            Proposed model                    D’Urso and Giordani (2006)
            u = 0.4   u = 0.67   u = 0.90     u = 0.4   u = 0.67   u = 0.90
n = 10      0.95      0.69       0.45         0.94      0.68       0.41
n = 45      0.94      0.56       0.35         0.93      0.53       0.35
n = 98      0.91      0.53       0.32         0.91      0.55       0.25
p = 2       0.87      0.63       0.55         0.83      0.64       0.56
p = 3       0.93      0.54       0.12         0.91      0.55       0.23
p = 4       0.94      0.52       0.18         0.92      0.51       0.05
λ = 2       0.93      0.81       0.34         0.90      0.80       0.33
λ = 3       0.91      0.72       0.47         0.90      0.68       0.36
w = 0.5     0.86      0.78       0.48         0.86      0.76       0.44
w = 0.6     0.87      0.74       0.78         0.87      0.73       0.76
w = 0.7     0.89      0.67       0.65         0.88      0.65       0.68
All in all, the IT2HFSs-based decision-making adapted model performs better with an increasing number of observations, an increasing number of features and an increasing level of fuzziness. The main driver of this situation is the reallocation of retailer sales prices according to shopping mall perceived value. Unfortunately, when the centroid weight $w_K$ increases from 0.5 to 0.6, the proposed model drops behind D’Urso and Giordani (2006)’s study.
6 Conclusions and future directions
In this paper, a novel retailer segmentation approach based on MCDM combined with fuzzy data clustering is proposed. The model consists of two phases: (1) an interval type 2 hesitant MCDM approach for the determination of location perceived value and (2) retailer (store) clustering via the sale prices of different product groups with FcM clustering of fuzzy data. The required data are collected from Foursquare, Google and Facebook ratings, the number of votes in Foursquare, the monthly total number of visits, variety of stores, real estate index and financial turnover. The reasons for conducting the IT2HFS-based decision-making process are: (1) interval type-2 fuzzy sets provide more degrees of freedom in the decision-making process when modeling uncertainty compared with T1FNs; (2) compared with hesitant fuzzy sets, IT2HFSs can reflect the uncertainty of inaccurate information more efficiently through primary and secondary memberships, as stated in Mendel and John (2002) and Hu et al. (2015)’s studies. Additionally, the proposed approach fills the gap in the literature by adapting an MCDM procedure to represent the conflicting criteria that D’Urso et al. (2015) indicate in their paper, and the method also enables the adaptation of FcM clustering to non-symmetric fuzzy data using a dissimilarity measure. Reallocation of the sales price with shopping mall perceived value is another contribution toward obtaining realistic retailer segments. By using fuzzy data, retailer segmentation can be implemented more precisely, without information loss, than by using defuzzified sales price data or averaging the sales prices. For future studies, the robustness and stability of the results remain an open issue for fuzzy data-based clustering problems. In addition, the results gathered from fuzzy data need to be generalized and extended to multi-way cases with real-life problems. Other fuzzy clustering frameworks, such as entropy-based clustering and c-medoids, are also interesting directions.
Compliance with ethical standards
Conflict of interest All the authors declared that they have no conflict of interest.
Ethical approval This article does not contain any studies with animals performed by any of the authors.
Informed consent Informed consent was gathered from all individual participants included in the study.
References Agrawal V (2015) Novel fuzzy clustering algorithm for fuzzy data. In: 2015 Eighth international conference on contemporary computing (IC3), 20–22 Aug 2015 Aliahmadipour L, Torra V, Eslami E (2017) On hesitant fuzzy clustering and clustering of hesitant fuzzy data. In: Fuzzy sets, rough sets, multisets and clustering, volume 671 of the series studies in computational intelligence, pp 157–168 Anagnostopoulos C, Hadjiefthymiades S, Kolomvatsos K (2015) Timeoptimized user grouping in location based services. Comput Netw 81:220–244 Chen N, Xu ZS, Xia MM (2014) Hierarchical hesitant fuzzy K-means clustering algorithm. Appl Math A J Chin Univ 29:1–17 Cheverst K, Davies N, Mitchell K, Friday A, Efstratiou C (2000) Developing a context-aware electronic tourist guide: some issues and experiences. In: CHI ’00 Proceedings of the SIGCHI conference on human factors in computing systems, The Hague, The Netherlands, pp 17–24, 01–06 April 2000 Coppi R, D’Urso P, Giordani P (2012) Fuzzy and possibilistic clustering for fuzzy data. Comput Stat Data Anal 56(4):915–927 D’Urso P, Giordani P (2006) A weighted fuzzy c-means clustering model for fuzzy data. Comput Stat Data Anal 50(6):1496–1523 D’Urso P, Disegna M, Massari R, Prayag G (2015) Bagged fuzzy clustering for fuzzy data: an application to a tourism market. Knowl Based Syst 73:335–346 Devi MU, Gandhi GM (2015) An enhanced fuzzy clustering and expectation maximization framework based matching semantically similar sentences. Proc Comput Sci 57:1149–1159 Erilli NA, Yolcu U, E˘grio˘glu E, Alada˘g ÇK, Öner Y (2011) Determining the most proper number of cluster in fuzzy clustering by using artificial neural networks. Expert Syst Appl 38(3):2248–2252 Fan S, Lau RYK, Zhao JL (2015) Demystifying big data analytics for business intelligence through the lens of marketing mix. Big Data Res 2(1):28–32 Fu P, Yin H (2012) Logistics enterprise evaluation model based on fuzzy clustering analysis. Phys Proc 24(Part C):1583–1587
123
Fu Q, Wang Z, Jiang Q (2010) Delineating soil nutrient management zones based on fuzzy clustering optimized by PSO. Math Comput Modell 51(11–12):1299–1305
Gavalas D, Konstantopoulos C, Mastakas K, Pantziou G (2014) Mobile recommender systems in tourism. J Netw Comput Appl 39:319–333
Gosain A, Dahiya S (2016) Performance analysis of various fuzzy clustering algorithms: a review. Proc Comput Sci 79:100–111
Han L, Chen G (2009) A fuzzy clustering method of construction of ontology-based user profiles. Adv Eng Softw 40(7):535–540
Han J, Kamber M (2001) Data mining concepts and techniques. Morgan Kaufmann Publishers, Burlington, pp 5–33
Hipp J, Güntzer U, Nakhaeizadeh G (2000) Algorithms for association rule mining—a general survey and comparison. ACM SIGKDD Explor Newsl 2(1):58–64
Hu J, Xiao K, Chen X, Liu Y (2015) Interval type-2 hesitant fuzzy set and its application in multi-criteria decision making. Comput Ind Eng 87:91–103
Junglas IA, Watson RT (2008) Location-based services. Commun ACM 51(3):65–69
Kuo MH, Chen LC, Liang CW (2009) Building and evaluating a location-based service recommendation system with a preference adjustment mechanism. Expert Syst Appl 36:3543–3554
Lee LW, Chen SM (2013) Fuzzy decision making based on hesitant fuzzy linguistic term sets. In: Proceedings of the 5th Asian conference on intelligent information and database systems. Springer, Berlin, pp 21–30
Lee S, Kim KJ, Sundar SS (2015) Customization in location-based advertising: effects of tailoring source, locational congruity, and product involvement on ad attitudes. Comput Hum Behav 51:336–343
Li K, Du TC (2012) Building a targeted mobile advertising system for location-based services. Decis Support Syst 54(1):1–8
Li J, Li L (2014) A location recommender based on a hidden Markov model: mobile social networks. J Organ Comput Electron Commer 24(2–3):257–270
Li YM, Chou CL, Lin LF (2014) A social recommender mechanism for location-based group commerce. Inf Sci 274:125–142
Lin TTC, Paragas F, Goh D, Bautista JR (2016) Developing location-based mobile advertising in Singapore: a socio-technical perspective. Technol Forecast Soc Change 103:334–349
Mendel JM, John RB (2002) Type-2 fuzzy sets made simple. IEEE Trans Fuzzy Syst 10(2):117–127
Mobile Marketing Association (2011) Mobile location based services marketing whitepaper. Technical Report. Mobile Marketing Association
Oztaysi B, Isik M (2014) Supplier evaluation using fuzzy clustering. In: Kahraman C, Oztaysi B (eds) Supply chain management under fuzziness: recent developments and techniques. Springer, Berlin, pp 61–80
Oztaysi B, Gokdere U, Simsek EN, Oner SC (2016) A novel approach to segmentation using customer locations data and intelligent techniques. In: Kumar A, Dash MK, Trivedi SK (eds) Handbook of research on intelligent techniques and modeling applications in marketing analytics, IGI Global, Hershey, PA, USA, pp 21–39
Park DH, Kim HK, Choi Y, Kim JK (2012) A literature review and classification of recommender systems research. Expert Syst Appl 39:10059–10072
Pingley A, Yu W, Zhang N, Fu X, Zhao W (2012) A context-aware scheme for privacy-preserving location-based services. Comput Netw 56:2551–2568
Ramya AR, Prasad Babu BR (2014) A novel concept of MANET architecture for location based service using circular data aggregation technique. Int J Innov Res Dev 3(1):252–8
Ren M, Wang B, Liang Q, Fu G (2010) Classified real-time flood forecasting by coupling fuzzy clustering and neural network. Int J Sediment Res 25(2):134–148
Rodriguez RM, Martinez L, Herrera F (2012) Hesitant fuzzy linguistic term sets for decision making. IEEE Trans Fuzzy Syst 20(1):109–119
Rodriguez RM, Martinez L, Herrera F (2013) A group decision making model dealing with comparative linguistic expressions based on hesitant fuzzy linguistic term sets. Inf Sci 241:28–42
Ruspini EH (1970) Numerical methods for fuzzy clustering. Inf Sci 2:319–350
Schilke SW, Bleimann U, Furnell SM, Phippen AD (2004) Multidimensional-personalization for location and interest-based recommendation. Internet Res 14(5):379–385
Shin W, Lin T (2016) Who avoids location-based advertising and why? Investigating the relationship between user perceptions and advertising avoidance. Comput Hum Behav 63:444–452
Song HY, Choi DY (2015) Defining measures for location visiting preference. Proc Comput Sci 63:142–147
Sowmya B, Rani BS (2011) Colour image segmentation using fuzzy clustering techniques and competitive neural network. Appl Soft Comput 11(3):3170–3178
Sun Y, Fan H, Bakillah M, Zipf A (2013) Road-based travel recommendation using geo-tagged images. Comput Environ Urban Syst. https://doi.org/10.1016/j.compenvurbsys.2013.07.006
Torra V, Miyamoto S, Lanau S (2005) Exploration of textual document archives using a fuzzy hierarchical clustering algorithm in the GAMBAL system. Inf Process Manag 41(3):587–598
Versichele M, De Groote L, Bouuaert MC, Neutens T, Moerman I, Van de Weghe N (2014) Pattern mining in tourist attraction visits through association rule learning on Bluetooth tracking data: a case study of Ghent, Belgium. Tour Manag 44:67–81
Vu THN, Ryu KH, Park N (2009) A method for predicting future location of mobile user for location-based services system. Comput Ind Eng 57:91–105
Yang WS, Cheng HC, Dia JB (2008) A location-aware recommender system for mobile shopping environments. Expert Syst Appl 34:437–445
Zou X, Huang KW (2015) Leveraging location-based services for couponing and infomediation. Decis Support Syst 78:93–103

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.