Contextual information fusion for intrusion detection: a survey and taxonomy


Knowl Inf Syst DOI 10.1007/s10115-017-1027-3 SURVEY PAPER

Ahmed Aleroud1 · George Karabatis2

Received: 23 March 2016 / Accepted: 2 February 2017 © Springer-Verlag London 2017

Abstract Research in cyber-security has demonstrated that dealing with cyber-attacks is by no means an easy task. One particular limitation of existing research originates from the uncertainty of information that is gathered to discover attacks. This uncertainty is partly due to the lack of attack prediction models that utilize contextual information to analyze activities that target computer networks. The focus of this paper is a comprehensive review of data analytics paradigms for intrusion detection along with an overview of techniques that apply contextual information for intrusion detection. A new research taxonomy is introduced consisting of several dimensions of data mining techniques, which create attack prediction models. The survey reveals the need to use multiple categories of contextual information in a layered manner with consistent, coherent, and feasible evidence toward the correct prediction of cyber-attacks.

Keywords Context · Contextual information · Cyber-security · Netflows · Intrusion detection · Semantics

Ahmed Aleroud [email protected]
George Karabatis [email protected]

1 Department of Computer Information Systems, Yarmouk University, Irbid 21163, Jordan
2 Department of Information Systems, University of Maryland, Baltimore County (UMBC), 1000 Hilltop Circle, Baltimore, MD 21250, USA

1 Introduction

While computer systems are the backbone of Information Technology (IT), malicious computer attacks pose a real problem to IT whether they originate from a person, a group, or a country. Cyber-attacks continue to rise worldwide, leading to loss or misuse of information assets and thus costing organizations huge amounts of money each year. Over the past decades, intrusion detection systems (IDSs) have been employed as a major deterrent against computer attacks, protecting computer systems and networks. IDSs utilize logic operations, statistical techniques, and data mining approaches to identify different types of network activities [18]. Although modern IDSs are definitely useful and continue to improve, they still generate a large number of false alarms and fail to identify unknown attacks. Most of the existing IDSs depend on techniques that work on low-level raw network data to detect cyber-attacks [14].

A recent trend is to utilize knowledge-based IDSs, which store information about cyber-attacks and the corresponding vulnerabilities, and use this stored knowledge to guide the process of attack prediction. A significant limitation of knowledge-based IDSs is the lack of contextual information for attack prediction. Contextual information refers not only to the information about the configuration of the targeted systems and their vulnerabilities, but also to any relevant pre-conditions that must exist for an attack to succeed, and to the possible semantic relationships between the activities of attackers (at the time of these activities) and the targeted locations. The infusion of contextual information in the intrusion detection area has not been widely explored. Furthermore, an imprecise notion of context is neither sufficient nor effective for attack prediction. This paper introduces a novel taxonomy that focuses on the contextual elements extractable from data analyzed by IDSs, and it presents techniques to model these elements and to use the resulting context-aware models to identify known and unknown attacks.

The contributions of this survey are the following: First, we propose a novel multidimensional categorization scheme for data mining-based intrusion detection approaches that focuses on the applicability of contextual elements for attack prediction. Second, we provide a comprehensive overview and analysis of the approaches to extract, model, and use contextual information in creating intrusion detection techniques. Third, based on the identified limitations, we recommend a new set of approaches, which utilize contextual information in the attack prediction process.

The remainder of this paper is organized as follows. Section 2 presents a background on intrusion detection research. Contextual information fusion in IDSs is discussed in Sect. 3. Section 4 provides insights on modeling and using contextual features for attack prediction and introduces our research taxonomy. A comprehensive review of data mining-based intrusion detection techniques is in Sect. 5. Section 6 discusses the limitations of the existing intrusion detection techniques from the perspective of using contextual information. Section 7 summarizes our findings.

2 Intrusion detection: a background

Protecting information systems from network threats is one of the main challenges for all organizations. Most security mechanisms can be breached by unknown vulnerabilities and novel hacks. The latter has been defined as any “action the user of an information system takes when he/she is not legally allowed to” [134]. Powell and Stroud [212] have also defined an intrusion as “a malicious, externally-induced fault resulting from a successful attack.” An intrusion attempt has been defined as “a sequence of actions by means of which an intruder attempts to gain control of a system” [105]. The goal of intrusion detection is to determine that an intrusion has been attempted to gain unauthorized access to a system. In the same manner, [143] consider responding to malicious actions as a part of the intrusion detection process.

The existing intrusion detection techniques combat cyber-attacks at two levels of protection: the network level and the host level. Network-based IDSs (NIDSs) monitor the features of network connections in order to detect cyber-attacks. Conversely, host-based IDSs (HIDSs) monitor the status of workstations and the internals of a computing system using specific intrusion detection techniques to discover possible attacks at the host level. There have also been additional classifications of IDSs [26,64,65] based on other perspectives, such as the data the system analyzes (log file data, network data), the time of analysis (online, offline), and the distribution mode utilized in the analysis process (centralized, distributed).

Machine learning researchers classify IDSs into three major categories: signature-based, anomaly-based, and hybrid IDSs [14]. A signature-based IDS measures the similarity between the events under analysis and the patterns of known attacks. Alarms are generated if previously known patterns are detected. For instance, the Snort IDS [223] is one of the commonly used signature-based IDSs. Snort performs real-time traffic analysis, content searching, and content matching to discover attacks using pre-identified attack signatures. While these systems are accurate in identifying known attacks, they cannot recognize unknown attacks. In anomaly-based IDSs, benign activity profiles are created and used by an anomaly detection technique to detect outliers that deviate from such profiles. Anomaly-based IDSs rely on statistical techniques to create attack prediction models. The main advantage of anomaly-based IDSs is their capability to detect unknown attacks that do not have existing signatures; however, their major limitation is the difficulty in creating benign activity profiles. Intuitively, activities that deviate from benign activity profiles are not necessarily attacks. Failure to identify the boundaries of benign activities leads to incorrect prediction of benign activities as attacks, potentially resulting in a high false-positive rate. Hybrid IDSs combine signature-based and anomaly-based detection techniques to discover attacks. The major disadvantage of hybrid approaches is the computational overhead of using both signature matching and anomaly detection to analyze the incoming network connections.
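The difference between the three categories can be illustrated with a minimal sketch. The signature set, the single numeric feature, and the three-standard-deviation threshold are illustrative assumptions; they are not taken from Snort or from any of the surveyed systems.

```python
from statistics import mean, stdev

# Hypothetical signature set: byte patterns assumed to identify known attacks.
SIGNATURES = {
    "sql_injection": b"' OR 1=1",
    "path_traversal": b"../../",
}


def signature_alerts(payload):
    """Signature-based detection: names of all known patterns found in the payload."""
    return [name for name, pattern in SIGNATURES.items() if pattern in payload]


def build_profile(benign_values):
    """Benign activity profile: mean and standard deviation of one numeric feature."""
    return mean(benign_values), stdev(benign_values)


def is_anomalous(value, profile, threshold=3.0):
    """Anomaly-based detection: flag values far away from the benign mean."""
    mu, sigma = profile
    return abs(value - mu) > threshold * sigma


def hybrid_alert(payload, value, profile):
    """Hybrid detection: raise an alert if either technique fires."""
    return bool(signature_alerts(payload)) or is_anomalous(value, profile)
```

A payload without any known pattern is missed by `signature_alerts` but can still be flagged by `is_anomalous`, which mirrors the ability of anomaly-based IDSs to detect unknown attacks; a benign but unusual value is flagged as well, which mirrors their false-positive problem.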

3 Contextual information fusion in IDSs

Contextual information has been utilized in different computing areas where it is vital to be aware of the current situation. According to several studies, the components describing contextual information of an entity can be categorized into individuality, activity, location, time, and relation [35,96,172,196,318] as shown in Fig. 1.

Fig. 1 Contextual information categories: time, location, relation, individuality, and activity

In order for an IDS to be aware of context, it is essential to utilize these five context categories. First, the location category reveals the physical or virtual information about location. The IDS has to be aware of the location of victims and attackers. The source of activities is quite important to identify suspicious events and discard false positives. Furthermore, semantic correlation with respect to source and target locations is necessary to discover multi-step attacks. Second, the IDS has to be aware of time, which identifies the time of events that target an entity. For instance, the occurrence of two activities in several time intervals indicates a possible relationship between them. Third, activity describes events that are applicable to the system. This category of contextual information is a major element in the intrusion detection process. The information in this category covers all events that occur during the execution time of systems. The set of activities that target a system can lead to one or more cyber-attacks. In attack prediction models, the contextual features of an activity are necessary to identify relationships between suspicious events, given a specific situation. Fourth, the relation category describes dependency between multiple events. It is identified over the other categories of context such as time, location, and activity. It is very important to capture contextual relationships and use them in attack prediction. For example, if two alerts are similar in terms of time of occurrence, targeted locations, and activities that lead to them, then such a relation needs to be captured using a specific modeling approach (e.g., using a graph with nodes and edges). Fifth, the environmental characteristics of entities are identified through the individuality category. For instance, the current characteristics of computer systems, their applications, and the patches applied are considered significant to realize the impact of different activities on the targeted system. Some suspicious events are deemed safe or benign when the appropriate patches are applied.

While many existing IDSs rely on utilizing data mining algorithms and statistical models to achieve the objectives above, there are several challenges when applying such algorithms without contextual information. Below we identify these challenges and provide insight on possible solutions:

• Analyzing data at low level: Data for intrusion detection are collected from many sources, and they contain several features. Several data mining algorithms work on intrusion detection datasets, which contain data at the TCPDUMP level (Lee and Stolfo 1998) or at the alert level [181], including features such as source and destination IP addresses, port numbers, time stamps, and the duration of each connection. When data mining techniques are applied on such low-level features, they can produce a good description of an individual connection or a flow, but they need a broader context to decide with more certainty if a given connection is a suspicious activity or not. For example, while some of those techniques may consider individual connections as DoS attacks, security experts may not consider them as malicious by themselves. In general, connections from an external machine to a single port on a machine inside the network may be classified as malicious using traditional data analytics. Unless the context shows that there is an attempt to map all active ports on that machine, such an activity should be considered benign. For this reason, relationships between events are important. In fact, a substantial amount of contextual information on security is organized in ontologies or taxonomies. Extracting such background knowledge is necessary to provide an intelligent view of the events analyzed by IDSs.
• Availability of public intrusion detection datasets: There are very few public intrusion detection datasets. Sharing network trace data is a very sensitive issue for any organization, as everyone prefers to access real (not synthetic) network trace datasets to evaluate their research approaches, but nobody wants to reveal internal and possibly sensitive information to the public. A recent survey by [44] shows that about 95% of data mining techniques utilize the DARPA 1998, DARPA 1999, DARPA 2000, or KDD 1999 datasets. There have been several concerns about the validity of research that has been conducted using the DARPA dataset since it is fairly large and outdated. In the same vein, only very few researchers utilize flow-based intrusion detection datasets. The use of flow-based datasets to create attack prediction models has been criticized since a netflow does not contain a rich set of features compared to TCPDUMP data or DARPA datasets. Context-aware approaches are needed to convert raw data and create new labeled intrusion detection datasets. The creation of prediction models that have the capability to learn how attack signatures evolve using contextual relationships between known attacks in a labeled dataset is a challenging task, yet these discovered relationships can then be used to create new labeled synthetic intrusion detection datasets.
• Large set of records and data dimensionality: The efficiency of the intrusion detection process has always been a major challenge due to the high volume of network data, the increasing number of incoming connections, and the large dimensionality of the data (the large number of traffic features) [33]. The dimensionality of data has been addressed using dimensionality reduction techniques such as principal component analysis (PCA) and singular value decomposition (SVD). For instance, SVD has been utilized by [11] to identify unknown attacks. Apart from dimensionality reduction, sampling techniques have been applied at the flow and packet levels to reduce the volume of events analyzed during attack prediction time. The survey in Sperotto et al. [254] categorizes the sampling techniques into systematic and random. In the systematic category, the sampling is determined solely based on the time interval (time-driven sampling) or the sequence of packet arrivals (event-driven sampling).
On the other hand, random sampling is performed based on the probability distribution of the events in the dataset. Contextual relationships between suspicious activities that have been discovered within large amounts of intrusion detection data can be stored in an ontology or taxonomy used for classifying aggregated packets without analyzing each event individually.

The use of contextual information increases the effectiveness of security decisions and the efficiency and flexibility of the resource management process, and it increases the precision of intrusion detection. For instance, Fig. 2 shows that the amount of effort decreases when contextual information is utilized for security analytics. Traditional intrusion detection systems can still work without context; however, the prediction effort increases with time since the amount of data to be analyzed and the complexity of attacks increase.

Fig. 2 Monitoring effort in context-driven security systems versus systems created without context (reproduced by permission from [92]). Axes: level of effort versus time; curves: with context and without context

Let us take a look at traditional network flow event data without context (Table 1) and with context (Table 2). If this is a bank network and the corresponding event is a transaction, then security analysts must decide whether such an event is suspicious or benign. Without contextual information the analysts must answer several questions such as: “is it allowed to conduct this transaction from a device located in Brazil?” Traditional IDSs may not classify this transaction as suspicious. Without adding location information, the IDS may not be able to determine whether such a transaction is fraudulent.

Table 1 Traditional network flow event data (reproduced from [241])

Field | Value
Start time | 2016-01-01 12:30:04
End time | 2016-01-01 12:30:34
Source address | 192.168.1.1
Source port | 12525
Direction | ->
Destination address | 10.0.1.1
Destination port | 80
IP protocol | TCP
Duration | 30
Flags | E
Source packets | 5
Destination packets | 53
Source bytes | 384
Destination bytes | 12453

192.168.0.0/16 is a corporate network

Table 2 Network flow event data with context (reproduced from [241])

Field | Value
Start time | 2016-01-01 12:30:04
End time | 2016-01-01 12:30:34
Source address | 192.168.1.1
Source port | 12525
Source network | Unused: 192.168.1.0-192.168.1.255/Brazil
Direction | ->
Destination address | 10.0.1.1
Destination port | 80
Destination network | Bulgaria
IP protocol | TCP
Duration | 30
Flags | E
Source packets | 5
Destination packets | 53
Source bytes | 384
Destination bytes | 12,453
Alert | Destination address on malware watch list
Asset tags | Unknown

The question, however, remains “how to use contextual information in data mining algorithms to create intrusion detection techniques?” In order to answer this, contextual information needs to be available, and then it should be transformed and processed to create intrusion detection techniques. There are two major sources of contextual information as shown in Fig. 3, internal and external [92]. Internal information identifies the internal characteristics of the computing infrastructure, whereas external information identifies features collected from the surrounding environment. Internal information such as software and application information needs to be transformed into a specific format such as filtering rules and then used with IDSs to decrease false alarms when, for instance, the application has been recently patched. An example is the intrusion detection description language (IDDL), which is proposed by [259] to transform the contextual information into a format that can be recognized by intrusion detection techniques. Using this language, several concepts that represent contextual information are extracted, parsed, and represented as shown in Table 3. Contextual information modeled using IDDL is used to create a reasoning engine which produces a set of prioritized alerts. IDDL has the capability to discover the most severe and complex attacks based on context. IDDL is useful to create intrusion detection techniques that rely

Internal information
• Network addresses
• Device types
• System vulnerabilities
• Privileged users
• Application and information intelligence
• Web application firewalls
• Business environment

External information
• GeoIP data
• White/blacklists
• Known malicious hosts
• Threat intelligence
• Sensor data and alerts
• Attack ontology
• Common vulnerabilities and risk scores
• Existing tools and IDSs

Fig. 3 Sources of contextual information

on logical reasoning. However, in order to detect different types of attacks, one should not only rely on correlations discovered using logical reasoning. To handle such a limitation, the authors presented correlation analysis techniques using Bayesian networks. In comparison with this set of transformations, [13,16] propose several techniques to model both internal and external contextual information. A summary and description of their models are provided in Table 4. Each concept can be modeled using several techniques. For instance, a node $n_i$ which represents a specific type of attack can be represented as a feature vector (i.e., attack profile) or as a node on a graph. The models presented by [13,16] include Semantic Link Networks (SLNs), which are created as graphs and used to discover correlations among alerts. The created prediction models are then utilized at run time on top of an IDS to perform several tasks as shown in Fig. 4, such as expanding the predictions of the IDS using SLNs, followed by filtering out irrelevant predictions using the created attack and host profiles.
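The expansion and filtering steps described above can be sketched as follows. This is a minimal illustration under stated assumptions and not the implementation of [13,16]: the link weights, the `min_weight` cutoff, the attack-to-vulnerability mapping, and the host profile are all hypothetical, and a two-link path is scored with the product of its weights, following the reading that a derived link weight is obtained from the weights of the two links it is inferred from.

```python
# Hypothetical weighted semantic links: SLN[a][b] is the weight of the link a -> b.
SLN = {
    "port_scan": {"warezclient": 0.8, "dos": 0.3},
    "warezclient": {"warezmaster": 0.7},
}

# Hypothetical attack profiles (vulnerability each attack needs) and host profile.
REQUIRES = {"warezclient": "CVE-1999-1386", "warezmaster": "CVE-1999-0513"}
HOST_PATCHED = {"CVE-1999-0513"}


def expand(initial, sln, min_weight=0.5):
    """Expansion: add nodes reachable from the initial prediction over one or two links."""
    scores = {initial: 1.0}
    for mid, w1 in sln.get(initial, {}).items():
        if w1 >= min_weight:
            scores[mid] = max(scores.get(mid, 0.0), w1)
        for far, w2 in sln.get(mid, {}).items():
            derived = w1 * w2  # weight of the inferred link initial -> far
            if derived >= min_weight:
                scores[far] = max(scores.get(far, 0.0), derived)
    return scores


def filter_predictions(scores, requires, patched):
    """Filtering: drop predictions whose required vulnerability is patched on the host."""
    return {node: s for node, s in scores.items() if requires.get(node) not in patched}
```

With these assumed values, expanding the initial prediction `port_scan` adds `warezclient` and `warezmaster`, and the filter then removes `warezmaster` because the host profile lists its required vulnerability as patched.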

Table 3 Representing security concepts using IDDL [259]

Concept | Description | Information source | Representation
Network | A computer network which has an identifier called network address | Internal | Network ∀ netaddress.String = 1 netaddress
Node | Any computer that is connected to the network | Internal | Node ∀ nodeaddress.String = 1 nodeaddress ∀ hasNodeNet.Network
Gateways | Nodes that interconnect more than one network. Each gateway is part of more than one network | Internal | Gateways Node > 1 hasNodeNet. Node ¬ Gateways = 1 hasNodeNet
Software | Each software is described by its name, version, type, and architecture | Internal | Software SoftwareName.String = 1 SoftwareName SoftwareVersion.String = 1 SoftwareVersion SoftwareType.String = 1 SoftwareType SoftwareArchitecture.String = 1 SoftwareArchitecture
Process | A software executed by a user | Internal | Process ∀ hasSoftware.Software = 1 hasProduct ∀ hasUser.User = 1 hasUser
Service | A process listening on one port | Internal | Service ∀ hasProcess.Process = 1 hasProcess ∀ port.Integer = 1 port
System vulnerability | A weakness in a system | Internal | Vulnerability ∀ sevirity {high, medium, low} ∀ requires {remote, local, user} ∀ losstype {confidentiality, integrity, availability, privilege escalation} ∀ published.Date
Alert | A suspicious activity | External | Alert ∀ messageId.String = 1 messageId hasCreateTime.Time = 1 hasCreateTime hasDetectTime.Time ≤ 1 hasDetectTime hasAnalyserTime.Time ≤ 1 hasAnalyserTime hasAnalyser.Analyser = 1 hasAnalyser hasSource.Source hasTarget.Target hasClassification.Classification = 1 hasClassification hasAssessement.Assessment ≤ 1 hasAssessement hasAdditionalData.AdditionalData

Table 4 Representing security concepts using the formal models in [13,16]

Concept | Description | Information source
Node $n_i$ | An alert or a benign activity | External
Computer system | Any computer connected to a network with hosts that run applications, services, processes, and include vulnerabilities | Internal

Model | Representation
Feature vector | $V_{n_i} = [f_1, \ldots, f_m]$
Semantic Link Network schema | A triple denoted by (Node, SemanticLinks, Rules). A weighted semantic link $l_i$ is represented as $l_i : n_i \xrightarrow{\alpha} n_j$. A Rule is a reasoning mechanism on semantic links, denoted by $n_i \xrightarrow{\alpha} n_j,\ n_j \xrightarrow{\beta} n_r \Rightarrow n_i \xrightarrow{\gamma} n_r$, where $\alpha$, $\beta$, $\gamma$ are numerical weights on semantic links and $\alpha \cdot \beta \Rightarrow \gamma$
Attack profile (rule-based profile) | Service_is(Ftpdata), No_of_Shell(0), No_of_files_accessed(0), Flag(SF) Logged_in(1) → Warezclient Attack
Attack profile (linear discriminant function) | $P(\mathrm{Warezclient}) = (-0.82 \cdot f_1) + (0.67 \cdot f_n) + 0.15$, where $f_1$, $f_n$ are the features of incoming connections
Attack profile (host-based filter) | ExistOS(Redhat linux5.0), CausedByVulnerability(CVE-1999-1386) → Warezclient Attack
Host profile (conjunctive normal form) | HasOS(redhat linux5.0) hasApplication(Apache server 1.3.1) Patched(CVE-1999-1199) Patched(CVE-1999-0513)

Fig. 4 Modeling and using contextual information for attack prediction [13,16]. Components: incoming activities, IDS, initial prediction, expansion using Semantic Link Networks, and filtering using attack and host profiles

4 Using contextual features for attack prediction

Context-aware IDSs should minimize the dependency on human experts who need to perform correlation among run-time activities to determine the suspicious events and then react to avoid damages. To achieve this, contextual features need to be extracted, modeled, and used by data mining algorithms as discussed in the following sections.

4.1 Context and contextual features

There have been several definitions of context. Context is a dynamic grouping mechanism that encloses all instances that are related to a particular situation. According to [43] context refers to the location, the identities of objects around an entity, the time of day, season, temperature, etc. [31]. Dey [69] views context from the perspective of obtaining meta information or contextual features (as shown in Fig. 5) that are related to an entity. Schilit et al. [234] consider the neighborhood of an entity, its physical location, and the activities that target it as part of its context. Based on the definitions of context and the categories of contextual information we introduced earlier, context is defined as follows [16].

Definition 1 [Context (ct)]: The context $ct$ is a combination of features $f_1 : d_i, \ldots, f_m : d_j$ that identify the settings or pre-conditions under which one or more consequences $N' = \{n_1, \ldots, n_k\}$ have a high probability of taking place in a particular location, where $N' \subseteq N$, $N$ is the set of all possible consequences, $k < p$, and $p$ is the number of consequences.

Fig. 5 Context and its features (an entity, its context, and contextual features 1 to n)

The definition above describes context as a set of features that can enable one or more consequences, which can be specific types of attacks or benign activities [73]. Each feature $f$ characterizes the context $ct$, and $d_j$ refers to the type of that feature. The type $d_j$ can be activity-, individuality-, time-, or location-based. When one or more of these feature types are used in context-aware systems, they can have one or both of the following two roles: (1) creating prediction models to predict or filter out specific consequences of context, and (2) generating relationships between consequences. Activity features (e.g., the features of network connections) represent the events that can lead to one or more types of attacks. Individuality features describe the current configuration of the target hosts based on which one or more attacks are possible. Activity features are used to measure the similarity between two consequences based on the events that enable them. The similarity between $n_i$ and $n_j$ is a measure of association between $n_i$ and $n_j$ based on their co-occurrence. Several measures have been utilized to discover similarity in the intrusion detection area. Pearson’s correlation and Anderberg similarity measures are representative examples. Each one of them works on different types of feature vectors. A feature vector is an n-dimensional vector of numerical features representing a specific attack or a benign activity. Pearson’s correlation is applied on feature vectors that contain continuous values, and it has been widely used in intrusion detection research [32,217].
Pearson’s coefficient (PC) between two nodes $n_i$ and $n_j$ is defined as the covariance of their feature vectors, $\mathrm{cov}(V_{n_i}, V_{n_j})$, divided by the product of their standard deviations, $\sigma_{V_{n_i}} \times \sigma_{V_{n_j}}$:

$$
PC_{n_i,n_j} = \frac{\mathrm{cov}(V_{n_i}, V_{n_j})}{\sigma_{V_{n_i}} \times \sigma_{V_{n_j}}} = \frac{E\left[(V_{n_i} - \mu_{V_{n_i}})(V_{n_j} - \mu_{V_{n_j}})\right]}{\sigma_{V_{n_i}} \times \sigma_{V_{n_j}}}. \tag{1}
$$
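Equation (1) can be computed directly from two feature vectors. The sketch below is a plain-Python illustration that treats the components of each vector as the samples over which the mean, covariance, and standard deviation are taken; the function name and the example vectors are assumptions made for illustration.

```python
from math import sqrt


def pearson(u, v):
    """Pearson's coefficient between two equal-length feature vectors (Eq. 1)."""
    if len(u) != len(v) or len(u) < 2:
        raise ValueError("vectors must have the same length of at least 2")
    n = len(u)
    mu_u = sum(u) / n
    mu_v = sum(v) / n
    # Covariance: expected product of the deviations from the means.
    cov = sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v)) / n
    sd_u = sqrt(sum((a - mu_u) ** 2 for a in u) / n)
    sd_v = sqrt(sum((b - mu_v) ** 2 for b in v) / n)
    if sd_u == 0 or sd_v == 0:
        raise ValueError("the coefficient is undefined for a constant vector")
    return cov / (sd_u * sd_v)
```

Two nodes whose feature vectors rise and fall together obtain a coefficient close to 1, whereas vectors that move in opposite directions obtain a coefficient close to -1.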

The Anderberg similarity measure is applied on binary feature vectors and yields similarity values between 0 and 1 [37]. The inputs to the Anderberg coefficient are binary vectors of zeroes and ones; each feature vector $V_{n_i}$ can be converted into a binary vector by applying cutoff data transformation techniques [209]. Weller-Fahy et al. [284] survey appropriate distance measures for intrusion detection research. They show that failure to identify a suitable distance measure is a major problem of existing research on intrusion detection. They suggest using specific measures based on the characteristics of the intrusion detection experimental data and the type of the proposed detection techniques as follows:

• Mahalanobis distance measure: used in high-throughput environments for quick computations during heavy loads. The Mahalanobis distance measure is also useful for anomaly-based intrusion detection techniques since it captures the variance within the sampled data, allowing anomalies to be accurately identified.
• Bhattacharyya coefficient: used for feature selection to distinguish one type of attack from others.
• χ²-distance: used to measure the distance between the matrices of an original network dataset and some generated datasets.
• Euclidean distance: used after applying distribution functions such as the discrete Fourier transform on the features of network connections. This transformation is useful to handle some problems with applying the Euclidean distance measure directly to the features of network connections.
• Conditional entropy: used as a measure of an individual flow; as such, it is a good measure for flow-based intrusion detection techniques.

Time and location features are dynamic; thus, they are not widely used as prediction features (e.g., predicting suspicious activities based on the time of the day). In particular, time and location features can be utilized to discover temporal associations between network activities.
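Three of the measures listed above can be sketched for plain numeric vectors. This is an illustration under stated assumptions rather than the formulation used in [284]: the χ²-distance follows one common variant (the sum of squared differences divided by the sum of the two values), the inputs to the Bhattacharyya coefficient are assumed to be discrete probability distributions, and the function names are hypothetical.

```python
from math import sqrt


def euclidean(x, y):
    """Euclidean distance between two equal-length feature vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def chi_square_distance(x, y):
    """Chi-square distance (one common variant) between two non-negative vectors."""
    # Bins where both values are zero contribute nothing and are skipped.
    return sum((a - b) ** 2 / (a + b) for a, b in zip(x, y) if a + b > 0)


def bhattacharyya_coefficient(p, q):
    """Overlap between two discrete distributions: 1 if identical, 0 if disjoint."""
    return sum(sqrt(a * b) for a, b in zip(p, q))
```

A feature whose value distribution under one attack type overlaps little with its distribution under other types (a coefficient near 0) separates that type well, which is the intuition behind using the coefficient for feature selection.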
Temporal features are also used to discover specific types of anomalies called contextual anomalies, which are considered attacks if discovered at a specific time but may not be otherwise [112]. For instance, the flows at the twentieth second in Fig. 6 are a denial of service (DoS) attempt if they target an academic registration database during the weekend, but are considered benign during the days of course registration. Discovering intrusions based on the time aspect of context requires handling time-series data. There are many challenges that need to be addressed when handling time-series data [50]. First, classifying such data when only a small portion of a time window is suspicious while other activities in the same window are benign is a difficult task. Second, the best size of a time window must be identified. Third, the differences between the lengths of time sequences in the training and testing data must be dealt with. Fourth, the similarity measure that leads to the highest attack detection rate when analyzing time-series data must be selected.

Fig. 6 Discovering DoS attacks as contextual anomalies (axes: data transfer rate in Mbps versus time)

Fig. 7 Contextual information processing phases in IDSs: data collection, data pre-processing, contextual information identification and mining, contextual model creation, and contextual model application and usage

Intrusion detection with time-series data requires three types of transformations to be performed in order to create prediction models: First, data aggregation is needed for dimensionality reduction to improve the efficiency of the prediction models to be created [50]. Second, discretization is needed to discover similarity across different time windows [17]. Third, data scaling is needed to avoid data noise problems. Once these transformations are performed, prediction models are created to discover attacks using three major techniques:

• Window-based techniques: the training and testing data are divided into time windows of a specific length, and then an anomaly is discovered using similarity between a testing and a training window [59]. The major challenge in this category of techniques is the selection of the best window size.
• Time-series statistical models: they rely on statistical models which presume that a benign time series is generated from a statistical process, but the anomalous data points do not fit into that process [260,273]. Other techniques model the benign data using Hidden Markov Models and classify any time-series data that does not fit into such models as attacks.
• Frequency-based techniques [221]: correlation coefficients are calculated between successive windows based on the frequency of specific features. A sudden change in the correlation patterns between successive windows raises an alert.
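The window-based technique from the list above can be sketched as follows. The sketch is illustrative and not the method of [59]: it assumes non-overlapping windows, Euclidean distance as the similarity measure, and a fixed distance threshold, all of which are choices made here for brevity.

```python
from math import sqrt


def windows(series, size):
    """Split a series into consecutive non-overlapping windows of a fixed size."""
    return [series[i:i + size] for i in range(0, len(series) - size + 1, size)]


def distance(a, b):
    """Euclidean distance between two windows of equal length."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def anomalous_windows(train, test, size, threshold):
    """Return the indices of test windows that are far from every benign training window."""
    train_windows = windows(train, size)
    if not train_windows:
        raise ValueError("training series is shorter than the window size")
    flagged = []
    for index, window in enumerate(windows(test, size)):
        nearest = min(distance(window, t) for t in train_windows)
        if nearest > threshold:
            flagged.append(index)
    return flagged
```

The `size` parameter makes the main challenge visible: a window that is too long dilutes a short burst of suspicious traffic among benign values, while a window that is too short loses the pattern that distinguishes benign behavior.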

4.2 Contextual information processing in context-aware systems Data mining-based context-aware systems process information in five fundamental phases as shown in Fig. 7. The phases are: data collection, data preprocessing, contextual information identification and mining, contextual model creation, and applying/using these models for prediction tasks. Data collection is carried out automatically using sensors, through activity monitoring, or manually by system administrators. Once the data are collected and documented, data preprocessing tasks are performed, which include selecting the most significant features, discarding the less significant ones, and feature discretization. The next three phases, namely contextual information identification and mining, creating contextual models, and using them, are the major ones in processing contextual information in context-aware systems; they are the main focus of the following sections.

4.2.1 Contextual information identification and mining During this phase, we identify the main contextual features of an entity (e.g., computer systems) at a specific point in time. The contextual features include, but are not limited to, the activities that target an entity, the time and location of these activities, the relationships between them, and the characteristics of the entity itself. Once the contextual information is identified, mining is performed on the collected features to create context-aware prediction


models. There are several existing techniques to create these models including the manual approaches by [39] to perform query answering from multiple databases and [95] to predict user activities. Manual approaches pose some limitations regarding the quality of the created prediction models. Recent approaches have been proposed to automate mining contextual information by utilizing data mining techniques [91]. In particular, mining contextual information is performed using several techniques such as graph mining [233], anomaly detection [253], classification [91,294], clustering [297], and association rules mining [285].

4.2.2 Contextual model creation In order to trigger context actions at run time, the mined information should be represented appropriately. There are several context modeling techniques. Ontologies have been utilized as a model to represent contextual knowledge by [28]. The major advantages of using ontologies in context modeling are their reasoning capabilities and expressive power. In ontology-based models, contextual semantic information is represented using an ontology language, e.g., OWL (Web Ontology Language). OWL has been applied for context modeling in [52,85,219,220]. Reichle et al. [220] propose an ontology-based contextual model with three layers of abstraction: the conceptual layer, the exchange layer, and the functional layer. The context elements such as context artifacts, scope (i.e., object environment and position), and entity representations have been modeled in the conceptual layer using the OWL specification language. Because ontology modeling languages are complex, simple taxonomies are sometimes used for context modeling instead; this approach has been utilized by [85] to identify subjective textual contexts. In some application domains, the relationships between entities have to be captured using graphs to identify contextual relations [198]. Another context modeling technique uses context profiles where the information about entities is represented as feature vectors in an n-dimensional space [253]. Features are then partitioned into environmental and indicator attributes about context. The environmental feature values are used for context profiling, while the indicator features are used to determine if the entity falls inside or “out of that context.” Schilit et al. [234] utilize feature-value context profiles to represent the location context of physical entities. Similarly, [274] utilize key-value contextual profiles to incorporate contextual information from a user’s mobile computing environment into the Web. The contextual information that makes up such profiles has then been used to generate dynamic web content. One variation of this modeling approach is the logic-based models where contextual information is represented as facts, rules, or expression-based profiles. For context aggregation, markup schema languages (e.g., XML) have been used to model contextual information as profiles [94]. Markup models can be generalized to include object-oriented (OO) techniques for context modeling. OO models are mainly useful in encapsulating and reusing contextual information [235]. Additionally, graphical models (e.g., UML profiles) have been used to represent context. For instance, [119] used UML for modeling contextual knowledge of cyber-attacks.
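The context-profile idea (environmental attributes define the context, indicator attributes decide whether an entity falls inside it) can be sketched as follows. This is only an illustration under stated assumptions: the profile is summarized by the mean and standard deviation of a single indicator, and membership is decided by a fixed number of standard deviations. The function names, the `width` parameter, and the records are hypothetical and do not reproduce the method of [253].

```python
from collections import defaultdict
from statistics import mean, pstdev

def build_profiles(records, env_keys, indicator):
    """Group records by their environmental attributes and summarize the indicator."""
    groups = defaultdict(list)
    for record in records:
        context = tuple(record[key] for key in env_keys)
        groups[context].append(record[indicator])
    return {ctx: (mean(vals), pstdev(vals)) for ctx, vals in groups.items()}

def in_context(record, profiles, env_keys, indicator, width=3.0):
    """Return True if the indicator value lies within `width` std devs of its context mean."""
    context = tuple(record[key] for key in env_keys)
    if context not in profiles:
        return False  # assumption: an unseen context counts as out of context
    mu, sigma = profiles[context]
    return abs(record[indicator] - mu) <= width * sigma

# Hypothetical benign records: the service is environmental, src_bytes is the indicator.
benign = [
    {"service": "domain_u", "src_bytes": 29},
    {"service": "domain_u", "src_bytes": 44},
    {"service": "domain_u", "src_bytes": 45},
]
profiles = build_profiles(benign, env_keys=["service"], indicator="src_bytes")

typical = {"service": "domain_u", "src_bytes": 44}
unusual = {"service": "domain_u", "src_bytes": 5000}
print(in_context(typical, profiles, ["service"], "src_bytes"))
print(in_context(unusual, profiles, ["service"], "src_bytes"))
```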

4.2.3 Contextual model application and usage The last phase of the contextual information processing in context-aware systems is the actual application and usage of the created prediction models. The application of these models includes several aspects. First, they can be used to filter out (i.e., restrict) some predictions made by other non-context-aware systems [70] if they are not relevant based on the current settings, such as the current configuration on the target hosts. Second, contextual models


Fig. 8 A research taxonomy for data mining-based context-aware systems

such as graph-based models can be used for expanding the predictions of other systems by retrieving more relevant predictions when they are contextually feasible [39]. Given a specific attack, predicted by a security mechanism or an attack prediction model m, the pre-identified contextual relationships are used to expand that prediction in order to identify additional attacks that are feasible in that same situation (e.g., steps in multi-step attacks). Third, models such as rule-based attack profiles can be used for predicting known and unknown attacks [288]. They can be used to detect known suspicious activities at run time when specific contextual pre-conditions are satisfied. They can also identify unknown attacks using contextual similarity with known attacks represented using such profiles. An attacker might be able to modify the sequence of activities that lead to a specific attack $n_i$ to initiate a new attack that is unknown yet similar to the original one. However, if such a modification is bounded by a specific amount, the unknown attack can still be predicted using the features that describe the known one.
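The first two uses of contextual models (filtering predictions that are irrelevant to the target host, and expanding a prediction with contextually feasible related attacks) can be sketched as below. The alert fields, host configuration, attack names, and relationship table are all hypothetical and serve only to illustrate the two operations.

```python
def filter_alerts(alerts, host_services):
    """Keep only alerts whose required service actually runs on the targeted host."""
    return [alert for alert in alerts
            if alert["requires"] in host_services.get(alert["host"], set())]

def expand(attack, related, feasible):
    """Return the predicted attack plus related attacks that are feasible in this context."""
    return [attack] + [other for other in related.get(attack, []) if feasible(other)]

# Hypothetical alerts raised by a non-context-aware system.
alerts = [
    {"name": "iis_exploit", "host": "10.0.0.5", "requires": "iis"},
    {"name": "ssh_bruteforce", "host": "10.0.0.5", "requires": "sshd"},
]
host_services = {"10.0.0.5": {"sshd"}}  # assumed current host configuration

# Hypothetical pre-identified relationships between attacks and their prerequisites.
related = {"ssh_bruteforce": ["privilege_escalation", "iis_exploit"]}
prerequisite = {"privilege_escalation": "sshd", "iis_exploit": "iis"}

relevant = filter_alerts(alerts, host_services)
expanded = expand(
    "ssh_bruteforce",
    related,
    feasible=lambda attack: prerequisite[attack] in host_services["10.0.0.5"],
)
print([alert["name"] for alert in relevant], expanded)
```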

5 A taxonomy for context-aware IDSs We propose a taxonomy that characterizes how contextual information has been applied in data mining-based context-aware IDSs. This taxonomy, shown in Fig. 8, provides an overview of the major data mining-based intrusion detection approaches that take advantage of contextual information for attack prediction. The taxonomy consists of two dimensions; the first one represents the categories of contextual information discussed earlier, and the second represents the major phases of contextual information processing in context-aware systems along with the major techniques utilized in each phase. In line with this taxonomy, several techniques are used for mining and modeling contextual information. For instance, activity contextual information is modeled using

Table 5 Data formats in Network-based IDSs

Connection features
Protocol   Service    Flag   src_bytes   dst_bytes
tcp        irc        REJ    0           0
tcp        bgp        REJ    0           0
tcp        courier    SH     0           0
udp        domain_u   SF     29          0
udp        domain_u   SF     44          115
udp        domain_u   SF     45          115

Fig. 9 Packets aggregation to create flows (flows shown: Flow 1, Flow 2, Flow 3, Flow 4)

graphs, ontologies, and context profiles. Thus, the dimensions of our taxonomy are orthogonal.

5.1 Data mining-based IDSs The creation of data mining-based intrusion detection techniques consists of several phases during which contextual information may be utilized. In practice, data mining techniques have been extensively applied in creating intrusion detection techniques [26,49,66,153,154]. Our search shows that the process of creating data mining-based IDSs consists of four phases: data collection, data preprocessing, development of the intrusion detection technique, and finally its use and evaluation.
1. Data collection: The network/host data are collected using various tools such as sensors, packet capturing tools, honeypots, etc. The captured data are represented either as system calls or as TCP/IP connections. System calls are used in Host-based Intrusion Detection Systems (HIDSs). TCP/IP connections are used in Network-based Intrusion Detection Systems (NIDSs). In HIDSs, the system calls are represented as processes with their start time. This data format has been utilized in several works [60,114,117,135,156,276]. The TCP/IP data usually take the TCP dump format that is captured by packet sniffing tools as shown in Table 5. This data format has been utilized in several works to create network-based intrusion detection techniques [25,150,247,266,291]. In addition to HIDSs and NIDSs, a recent trend is flow-based IDSs, which investigate the content of IP flows to detect attacks and complement typical packet inspection-based intrusion detection (see [15,102,113,123,252,254,290]). The process depends mainly on analyzing IP flows, which are collections of unidirectional network packets. Each flow consists of packets that share a set of common properties such as their source and target, the input/output interfaces, and time window [73]. Figure 9 shows an example of packets that are aggregated into four flows based on such characteristics. Packets which are close to each other in time, targeting similar locations, are aggregated into the same flow. More details on flow-based IDSs are presented in a review by [256].


2. Data preprocessing: During data preprocessing, several steps may be taken, such as discretizing numerical features and handling missing information. Another major step is feature selection, which is performed to select the most significant features and use them to create the attack prediction technique [34,257]. The main motivation for applying feature selection techniques is to create lightweight IDSs while guaranteeing a high detection rate. Several techniques have been utilized for feature selection, such as principal component analysis (PCA) [133], correlation-based feature selection [104], and entropy-based feature selection [53,88].
3. Development of intrusion detection techniques: The data mining-based intrusion detection approaches that are utilized to create attack prediction models belong to three categories, namely signature (misuse)-based, anomaly-based, and hybrid. These approaches are further categorized by technique into classification, association rule mining, anomaly detection, clustering, and attack graph techniques. Depending on the availability of labeled data, the data mining approaches are also categorized into supervised, unsupervised, and semi-supervised techniques. Supervised intrusion detection techniques are created when labeled training datasets are available [3,210]. Examples of supervised data mining-based intrusion detection techniques are artificial neural network (ANN) classifiers [289], support vector machine (SVM) classifiers [272], decision tree (DT) classifiers [81], Bayesian network (BN) classifiers [237], hidden Markov models (HMMs) [86], random forest classifiers [309], regression-based techniques [141], and k-nearest neighbors (k-NN) [165]. The unsupervised techniques attempt to identify intrusion patterns without relying on labeled data. They assume that the data can be separated into benign and anomalous (attack) regions. Such an assumption, if incorrect, leads to high false alarm rates. k-Means clustering [100], expectation maximization (EM) [304], PCA [281], association rules mining [175], and self-organizing maps (SOM) [166] are examples of unsupervised techniques that have been utilized in intrusion detection. The semi-supervised techniques require datasets that only contain activities labeled as benign. Therefore, such techniques are more applicable to modern IDSs than the supervised techniques. Some examples of these techniques include the one-class anomaly detection algorithms, such as one-class nearest neighbor (1-CNN) [12], one-class support vector machines (1-CSVM), and the relative density algorithms [114].
4. Evaluation of intrusion detection techniques: In order to find the prediction error rate, it is necessary to examine the difference between the predicted output and the ground truth, which is an actual label in a labeled dataset or a decision by a domain expert. Each label specifies a specific type of attack or a benign activity. The most widely used metrics in evaluating intrusion detection systems are the true-positive (TP) and false-positive (FP) rates. The TP rate is the probability that an IDS outputs an attack alert when there is an actual attack (intrusion). The FP rate is the probability that the IDS outputs an attack alert when there is no actual attack. Additionally, other measures have been used in evaluating intrusion detection techniques, such as the false-negative (FN) rate (FN = 1 − TP), the true-negative (TN) rate (TN = 1 − FP), and the accuracy (AC), which is the proportion of predictions that are correct. Some intrusion detection techniques (e.g., anomaly-based techniques) require fine-tuning by using different threshold values to adjust system parameters. In particular, there could be several TP and FP values (each one obtained with a different threshold); thus, the evaluation objective is to find the threshold that gives the best trade-off between a high TP rate and a low FP rate. A popular evaluation measure is the ROC (receiver operating characteristics) curve [93], which is used to plot the different TP and FP values that are associated with different operating points. The ROC curve shows the relationship between TP and FP. It is also used to compare two or more IDSs by displaying multiple curves [14].


Some authors propose other types of metrics such as intrusion detection capability (IDC) [99]. This metric takes into account both the TP and FP rates together. In the subsequent sections, we provide an overview of the data mining-based intrusion detection techniques, along with the contextual aspects that have been utilized in creating those techniques.
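The TP and FP rates and the ROC operating points discussed in the evaluation phase can be computed with a short sketch. The scores, labels, and candidate thresholds below are hypothetical. Selecting the threshold that maximizes TP rate minus FP rate is one common convention (Youden's J statistic) and is an assumption here, not a criterion stated in the surveyed works.

```python
def rates(scores, labels, threshold):
    """Return (TP rate, FP rate) when an alert is raised for score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    positives = sum(labels)
    negatives = len(labels) - positives
    return tp / positives, fp / negatives

# Hypothetical anomaly scores and ground-truth labels (1 = attack, 0 = benign).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

# Each threshold yields one operating point (threshold, TP rate, FP rate) on the ROC curve.
roc = [(t, *rates(scores, labels, t)) for t in (0.2, 0.5, 0.75)]

# Assumed selection rule: maximize TP rate - FP rate.
best_threshold, best_tp, best_fp = max(roc, key=lambda point: point[1] - point[2])
fn_rate = 1 - best_tp  # FN = 1 - TP
tn_rate = 1 - best_fp  # TN = 1 - FP
print(roc, best_threshold)
```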

5.2 Signature-based intrusion detection techniques The signature-based intrusion detection techniques have been created using several data mining, machine learning, and statistical techniques. The approaches used in creating signature-based IDSs are categorized into classification, association rules, and attack graphs.

5.2.1 Classification approaches in signature-based intrusion detection Classification approaches have been widely utilized in devising signature-based intrusion detection techniques. Formally, the algorithm to classify network connections creates an attack prediction model using a training dataset $T$ with $m$ labeled connections ($|T| = m$), a feature space $F = \{f_1, \ldots, f_d\}$, and a set of $l$ pre-identified labels $Y = \{y_1, \ldots, y_l\}$. The classification model is applied on an evaluation dataset $E$ with $n$ connections ($|E| = n$). The classification algorithms perform an optimization procedure that minimizes an objective function of the learning error. The objective function aims at minimizing the false-positive rate and maximizing the true-positive rate in the evaluation set $E$. Next, we discuss the major categories of classification-based intrusion detection techniques. 5.2.1.1 Rule-based classifiers for signature-based intrusion detection Rule-based intrusion detection techniques learn rule models that capture the associations between features and the types of attacks/benign activities in the training data. The rule-based intrusion detection classifiers perform binary classification (i.e., classifying data into benign activities or attacks) or predict attacks with specific signatures. The rule-based intrusion detection techniques are categorized into decision trees and fuzzy rule-based classifiers. The decision tree classifiers recursively partition the instance space to create attack signatures as rules. The tree has one root node, which has no incoming edges; every other node has exactly one parent. Nodes with outgoing edges are called internal nodes, and leaf nodes are called decision nodes. In network intrusion detection systems (NIDSs), internal nodes represent the connection features and leaf nodes represent the type of attack/benign activity. Each internal node splits the instance space into two or more sub-spaces using an optimization function that addresses the correlation between the input features. The instance space is partitioned based on the values of features. If the features are numerical, they can be discretized before splitting. The most widely used splitting (feature selection) function is information gain (IG) [118], which is calculated using entropy measures. IG measures the reduction in uncertainty about attacks when we have knowledge about another variable such as a connection feature. The features with the highest IG are selected to create the decision trees. Once the decision trees are created, the resulting rules are used as signatures to predict attacks and benign activities. Formally, if $N = \{n_1, \ldots, n_k\}$ is a set of attacks with a probability $p(n_i)$ for each attack, and $F = \{f_1, \ldots, f_m\}$ is a feature vector with a set of values, where attacks are identified using the features in $F$, the conditional entropy of $n_i$ given $F$, denoted by $H(n_i \mid F)$, is the average conditional entropy over all entries of $F$:

$$H(n_i \mid F) = \sum_{1 \le j \le m} p(F = f_j)\, H(n_i \mid F = f_j) \qquad (2)$$

The IG can then be calculated as:

$$IG(n_i \mid F) = H(n_i) - H(n_i \mid F) \qquad (3)$$
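As an illustration of how a splitting feature is chosen, the following sketch computes the entropy of the class labels, the conditional entropy given a feature as a probability-weighted average, and the resulting information gain. The feature values are taken from Table 5; the attack/benign labels attached to them are hypothetical and are used only to make the example concrete.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """H(labels) minus the average entropy of the labels within each feature value."""
    n = len(labels)
    conditional = 0.0
    for value in set(feature_values):
        subset = [y for f, y in zip(feature_values, labels) if f == value]
        conditional += (len(subset) / n) * entropy(subset)
    return entropy(labels) - conditional

# Feature columns from Table 5; the labels are an assumption made for this example.
protocol = ["tcp", "tcp", "tcp", "udp", "udp", "udp"]
flag = ["REJ", "REJ", "SH", "SF", "SF", "SF"]
labels = ["attack", "attack", "benign", "benign", "benign", "benign"]

ig_protocol = information_gain(protocol, labels)
ig_flag = information_gain(flag, labels)
print(ig_protocol, ig_flag)
```

With these assumed labels, the flag feature separates the classes completely and therefore has the higher gain, so a decision tree learner would split on it first.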

Decision trees have been utilized to create signature-based intrusion detection techniques [67,81,139,187]. Decision trees have been combined with SVM classifiers [193,207] and with PCA [40] to detect known attacks. The scalability of decision tree approaches in signature-based IDSs has been studied by [160,249]. In addition, decision tree approaches have been used to create host-based intrusion detection techniques [230,248]. 5.2.1.2 Bayesian classifiers for signature-based intrusion detection Bayesian networks (BNs) are directed acyclic graphs that represent the probabilistic relationship between nodes. The nodes in BNs represent variables of a specific type and the edges indicate the relationships between them. The strength of those relationships indicates the probability of one node given another. For instance, a relationship may represent the probability of an attack given a particular connection feature. Several research approaches utilize BNs in signature-based intrusion detection [168,176,204,293]. Scott [236] utilizes BNs to create latent hierarchical models that identify cyberattacks. Qin and Lee [216] create a probabilistic Bayesian-based inference system to analyze attack scenarios from low-level alerts. Burroughs et al. [45] utilize BNs in distributed IDSs to improve the capabilities of early detection of distributed attacks. BN classifiers have been used by [4,142] to improve the aggregation of different intrusion detection model outputs. Bringas [42] utilizes Bayesian Belief Networks to detect known and unknown attacks at run time. Tylman [270] proposes a system called Basset (Bayesian system for intrusion detection), which extends the functionality of Snort, an open-source NIDS, by incorporating BNs for an additional processing stage. BNs have been used to identify several other categories of attacks such as those that target user privacy [21]. Chebrolu et al. [51] utilized both regression trees (CART) and BNs to create a lightweight intrusion detection technique by selecting only important features from network data. Xie et al. [296] build a BN security graph model to capture uncertain relationships between attacks and measure the security of a computer network when specific mitigation techniques are applied. 5.2.1.3 Support vector machines (SVM) for signature-based intrusion detection The SVM classifier has been proposed by [272]. SVM classifiers are mainly utilized in binary classification problems. They allow users to identify the width of the decision boundary area through a user-defined threshold to adjust the trade-off between the true positives and false positives. In intrusion detection, an SVM uses a portion of the intrusion detection data to create a prediction model in which several support vectors define a hyperplane that separates the feature space into two regions corresponding to attacks and benign activities, respectively. There are several approaches which utilize SVM in creating signature-based intrusion detection techniques, such as the weighted voting SVM (WV-SVM), which determines whether a process that targets a particular host is an intrusion [53], and twin support vector machines (TW-SVM), which create a misuse detection system [74]. Ambwani [20] focuses on using multi-class SVM to precisely identify attacks by type. Mukkamala et al. [190,315] compare the performance of neural networks and SVM classifiers for intrusion detection and conclude that SVM techniques are more efficient. SVM classifiers have been combined with genetic algorithms by [87,316] to address the signature-based detection problem, with decision trees by [206] to create hierarchical intelligent system models to detect attacks, and with rough set theory by [54] for misuse detection. Several approaches focus on utilizing SVM classifiers with anomaly detection and clustering techniques to create misuse-based

Fig. 10 Perceptron NN applicability in signature-based IDSs (labels shown: input nodes xi, weights wij, hidden nodes xj, weights wjk, outputs yk: Attack, Benign)

IDS [159,225,261]. SVM classifiers have been used for feature selection by [103], who propose an ad hoc feature selection approach to detect known attacks. Muntean et al. [194] propose an intrusion detection technique using cost-sensitive classification and SVM. Online SVM-based intrusion detection techniques have been introduced by [116,311,312]. 5.2.1.4 Neural networks (NNs) for signature-based intrusion detection Artificial neural networks (ANNs) have recently been considered to be one of the most effective techniques in intrusion detection. The ANN is a machine learning model that transforms a set of inputs to a set of desired outputs through simple processing units (nodes) and connections between them. The neural network (NN) consists of input nodes, output nodes, and other nodes between the input and output forming the hidden layers. The connections between two nodes have specific weights and determine how a node affects another. There are two modes to train a NN, the supervised training mode and the unsupervised one. In the supervised training mode, the network learns the desired output for a given input pattern. In the unsupervised mode, the network learns without specifying the desired output during the network training phase. The most well-known architecture of a supervised NN is the multilayer perceptron (MLP) network, which has been employed for intrusion detection. In signature-based intrusion detection, the NN has to be exposed to a dataset that contains attacks and benign activities to automatically adjust the learning coefficients during the training phase. The testing phase is then performed with real network data that contains real attacks. As shown in Fig. 10, the perceptron NN consists of $k$ input nodes. In signature-based IDSs, each input node $x_i$ takes a connection feature. The network consists of $2k$ hidden nodes, each denoted by $x_j$, and two output nodes. The output is either an attack or a benign activity. Since the training phase consists of multiple rounds, the weights $w_{ij}$ associated with inputs (connection features) are adjusted using the back-propagation approach, with weights being updated after each round. The purpose of updating these weights is to increase the attack detection accuracy at run time. Several signature-based intrusion detection techniques utilize the strengths of NNs [36,46,62,132,157,246]. NNs have been applied by [63,130,226] to detect intrusions based on user activities. A hybrid decision tree and NN approach has been utilized by [203]. The NN model is updated by the decision tree rules that are mined from datasets that contain attacks. Debar et al. [63] propose a hierarchical model using PCA and NNs. Similarly, [310] introduce a hierarchical intrusion detection (HIDE) system, which detects UDP flooding attacks using NNs. Lei and Ghorbani [157] used MLP for intrusion detection based on an offline analysis approach capable of detecting specific types of attacks. NNs have also been utilized for other objectives such as feature selection. For instance, [191] introduce a ranking mechanism to


select the key features for intrusion detection using NNs. Wang et al. [278] propose an approach called FC-ANN, based on ANN and fuzzy clustering to detect attacks. The major advantage of the proposed approach is its capability to detect less frequent attacks. Self-organizing map (SOM) is a type of artificial neural network that works in an unsupervised competitive learning environment. The main purpose of SOM is to reduce the dimensionality of data; it projects and clusters high-dimensional input vectors into a lower-dimensional space [268]. In intrusion detection, SOM is used to discover the hidden clustering structure in the data [308]. Several approaches utilize SOM in anomaly-based intrusion detection. Xiaorong and Shanshan [295] propose a hybrid intrusion technique using PCA and SOM to establish an extensible real-time intrusion model with high detection accuracy. Ippoliti and Xiaobo [122] propose a hierarchical self-organizing map (HSOM) approach to network intrusion detection. Albayrak et al. [10] focus on improving the usage of SOM for anomaly detection by combining the strengths of different SOM algorithms. 5.2.1.5 k-Nearest neighbors (k-NN) for signature-based intrusion detection The k-nearest neighbor (k-NN) is an instance-based learning technique that classifies an incoming connection based on the classes of the closest training connections. k-NN is a type of lazy learning data mining technique where most computations are performed at run time. k-NN does not make any assumptions about the underlying data distribution. The connections examined at run time are compared to a set of pre-identified ones and are given the majority label of the k most similar connections. The similarity between connections is measured using distance metrics such as Euclidean distance. There are several k-NN approaches used in signature-based intrusion detection, such as the approaches by [40,280], where a k-NN classifier has been applied on top of PCA to detect known attacks. Adetunmbi et al. [5] compare the performance of rough set approaches and k-NN. They argue that k-NN has “a better performance in terms of accuracy but consumes more memory and computational time in the detection process.” Li and Guo [163] propose a supervised network intrusion detection method based on the TCM-k-NN (Transductive Confidence Machines for k-nearest neighbors) data mining algorithm. They state that the proposed technique uses far fewer data and features for training in comparison with the traditional supervised intrusion detection methods. k-NN has been used in a variety of other tasks such as feature selection [185] and sub-sequence extraction from network activities [146].
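The k-NN classification step described above can be sketched in a few lines. The feature vectors reuse the (src_bytes, dst_bytes) values from Table 5, while the attack/benign labels and the query connections are hypothetical. Euclidean distance and a simple majority vote are assumed; the surveyed approaches may use other distance metrics or weighting schemes.

```python
import math
from collections import Counter

def knn_predict(training, query, k=3):
    """Label a query connection by majority vote among its k nearest training connections."""
    nearest = sorted(training, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# (src_bytes, dst_bytes) pairs from Table 5 with assumed labels.
training = [
    ((0, 0), "attack"),
    ((0, 0), "attack"),
    ((0, 0), "attack"),
    ((29, 0), "benign"),
    ((44, 115), "benign"),
    ((45, 115), "benign"),
]

print(knn_predict(training, (40, 110)))
print(knn_predict(training, (1, 0)))
```

Because all distance computations happen inside `knn_predict`, the sketch also shows why k-NN is considered a lazy learner: nothing is learned before a query arrives.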

5.2.2 Association rule approaches for signature-based intrusion detection Association rule mining is one of the earliest data mining techniques that has been used to discover the correlation among features. Association rule mining was proposed by [6] to uncover relationships between seemingly unrelated data in a transactional database, relational database, or other information repository. Finding association rules consists of two steps: the first step is discovering the frequent item sets whose support (i.e., the fraction of records in which the item set occurs) is above a pre-defined threshold. The second step is using the frequent item sets to generate association rules whose confidence (i.e., the fraction of records containing the rule antecedent that also contain its consequent) is above a predetermined minimum. The Apriori algorithm by [7] and FP-growth by [106] are the most widely used algorithms for discovering association rules. In NIDSs, the association rules that correspond to suspicious activities are extracted from network audit data and used to investigate connections at run time. For example, Table 6 shows a set of network connections which may lead to the discovery of association rules such as [HTTP_FileTypeURL] → [HTTP_FileTypeLnk], confidence = 0.90, support = 0.05.

Table 6 IDS alerts that correspond to a web-based multi-step attack

Subnet   SRC IP        DST IP        Signature
1        24.9.61.170   182.168.2.4   HTTP_FileTypeURL
1        24.9.61.170   182.168.2.5   HTTP_FileTypeLnk
2        24.9.61.172   182.168.2.1   HTTP_FileTypeURL
2        24.9.61.172   182.168.2.1   HTTP_FileTypeLnk
…        …             …

This rule involves an attacker who is attempting to exploit a vulnerability in the target Web site. The first attempt shows the attacker trying to gain access to the .url file (HTTP_FileTypeURL). The second attempt targets the .lnk file (HTTP_FileTypeLnk). Under some circumstances, an attacker might use such a file to gain access to privileged information about the target system. The rule confidence indicates that 90% of attackers who tried to access a .url file also tried to access the .lnk file, and the rule support shows that this pattern occurs in 5% of the data. Several variations of association rule approaches have been utilized in signature-based intrusion detection techniques [23,29,155,286,300]. Ma [77,178] propose a fuzzy and association rules-based approach to detect known intrusions. Similarly, [301] propose a fuzzy-based association rules mining algorithm to detect intrusions in wireless networks. Association rules have been combined with rough set theory by [282] to improve the detection accuracy and reduce false alarm rates. Association rule approaches have been utilized for several other objectives such as improving the performance of distributed intrusion detection systems [68] and dimensionality reduction for intrusion detection [24].
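The support and confidence of a rule such as [HTTP_FileTypeURL] → [HTTP_FileTypeLnk] can be computed with the sketch below. It uses only the four alerts visible in Table 6 and assumes that all alerts raised by one source IP form a single transaction; that grouping is an assumption made for illustration. Because the elided rows of the table are not available, the resulting values differ from the 0.90 and 0.05 quoted for the full dataset.

```python
from collections import defaultdict

# (source IP, signature) pairs for the four alerts listed in Table 6.
alerts = [
    ("24.9.61.170", "HTTP_FileTypeURL"),
    ("24.9.61.170", "HTTP_FileTypeLnk"),
    ("24.9.61.172", "HTTP_FileTypeURL"),
    ("24.9.61.172", "HTTP_FileTypeLnk"),
]

def rule_stats(alerts, antecedent, consequent):
    """Return (support, confidence) of antecedent -> consequent over per-source transactions."""
    by_source = defaultdict(set)
    for source, signature in alerts:
        by_source[source].add(signature)
    transactions = list(by_source.values())
    with_antecedent = [t for t in transactions if antecedent in t]
    with_both = [t for t in with_antecedent if consequent in t]
    support = len(with_both) / len(transactions)
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence

support, confidence = rule_stats(alerts, "HTTP_FileTypeURL", "HTTP_FileTypeLnk")
print(support, confidence)
```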

5.2.3 Graph-based approaches for signature-based intrusion detection Graph models are very powerful modeling and reasoning tools in intrusion detection. They model system behavior by mapping the relationships between observable events that are related and may form one or more attacks. Formally, a graph $G = (V, E)$ consists of a set of vertices $V = \{v_1, v_2, \ldots, v_{|V|}\}$ and a set of edges $E = \{e_1, e_2, \ldots, e_{|E|}\}$, which indicate the relationships between vertices. Some graphs are weighted, that is, the edges between nodes carry numerical weights; conversely, graphs can be un-weighted, in which case the edges only indicate that some relationship exists between the connected vertices. Several intrusion detection techniques utilize attack graphs to discover correlations or causal relationships between alerts [98,215]. In these approaches, the dependency between nodes identifies relationships between attacks, hosts, exploits, network events, etc. which describe attack scenarios. At run time, the signature-based IDSs correlate these scenarios with run-time events. Research approaches that analyze attack graphs to discover attacks are categorized into:
• Attack graphs without background information (about the targeted environment).
• Attack graphs that maintain background information.
• Ontology-based approaches.
Attack graphs that maintain background information: Approaches in this category create attack scenarios by including host vulnerabilities as background information. The majority of approaches in this category have been introduced by [197–199,222,243] to represent attack graphs based on exploit dependency as shown in Fig. 11. The attack graph is constructed

Fig. 11 An example of attack graphs (Noel et al. [33]) [Figure: exploit nodes labeled with host pairs (host 1, host 2), (host 2, host 3), and (host 3, host 2); annotation: Distance = 2]

before the events occur. An alert is raised at run time if its corresponding events are mapped to adjacent exploits in the attack graph. Events that are mapped to adjacent exploits in the attack graph are considered fully correlated. Events that are mapped to non-adjacent exploits are partially correlated using a specific distance measure. Sequences of highly correlated events represent a possible attack scenario. A similar approach has been applied by [283], in which the authors propose a reasoning framework that infers the roles of suspicious hosts from local observations, identifies groups of strongly correlated hosts, and derives host relationships using attack graphs. Mathew et al. [183] propose a mechanism to predict multistage goal-oriented attacks in real time using attack graphs. The proposed technique calculates the credibility of attack scenarios based on the state of a live intrusion alert stream. Jie and Zhitang [129] use an attack graph approach to predict future attacks. The IDS alerts are correlated to create attack scenarios, which are ranked by their predictability scores. The attack scenarios with high predictability are used as evidence to predict future attacks. In [129,224], a time and space alert analysis technique correlates related alerts with background knowledge on vulnerabilities. Likewise, a cause and effect attack graph approach has been presented by [19], which uses a statistical model to detect intrusions. It relies on an improved knowledge-based model with vulnerability and extensional consequence parameters to provide a manageable-sized graph.

Attack graphs without background information: The approaches in this category apply correlation techniques on network datasets to create attack graphs and use them in the intrusion detection process. The advantage is the automatic creation of attack graphs.
Examples are the works by [110,317] who utilize security parameters to create signature-based detection engines using statistical correlation methods. Similarly, in [170] a three-dimensional visualization technique is proposed to correlate alerts based on what, when, and where attributes of each network activity. Each alert is represented as a connection between two domains that represent the where attribute. The other two dimensions represent the when and what attributes. Each alert is depicted as a straight line from the what–when domain to the where domain. Some other approaches use attack graph visualization to contribute to context (situational awareness). Angelini et al. [22] propose an approach to actively monitor attack paths, perform a reactive analysis, and suggest mitigation actions. As shown in Fig. 12a, black lines represent the logical links that connect nodes. During the attack discovery phase, the computed attack paths are represented using a different color. Furthermore, the instantiated path


Fig. 12 a Example attack graph, b mitigation actions, c evolution of the attack (reproduced from [22])

is shaded using a third color (e.g., the detected attack sequence is represented using red edges connecting compromised nodes indicated by system-raised alerts). A sequence of connected compromised nodes represents an attack. Each edge on the attack path is associated with a mitigation plan depicted as a rectangle containing triggered mitigation actions (Fig. 12b). Figure 12c visualizes the evolution of an attack. The left figure shows node A as the last compromised node, where the mitigation actions at that node are not sufficient to stop the attack; consequently, the attack spreads to node B (middle figure). Although the majority of mitigation actions (b2, b3) are applied successfully, the attack proceeds to node C and finally to node D (right figure).

Ontology-based approaches: The third approach in graph-based intrusion detection techniques is the use of an ontology. An ontology, as defined by [97], is an explicit specification of a conceptualization. It is used to provide a formal specification of the concepts and relationships that exist between entities within a domain; e.g., a relationship can be defined between IDSs, cyber-attacks, and network events. The advantage of using knowledge management techniques in information security has already been addressed; for instance, [124] demonstrates how ontologies play a critical role in building knowledge-intensive event correlation components in intrusion detection systems. Nevertheless, few approaches utilize the strengths of ontologies in signature-based IDSs. Wan and Shengfeng [277] propose a methodology to correlate intrusion alerts with attack scenarios using an ontology. Vorobiev and Jun [2,275] investigate the role of semantic interoperability between IDSs using ontologies. A malware ontology has been presented by [182]. The proposed ontology can be updated to include newly discovered intrusions. Saad and Traore [227] propose an ontology for network forensic analysis. The created ontology is used as a knowledge repository for developing network forensic systems that support chain of reasoning.
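The exploit-distance correlation described earlier in this section for attack graphs (adjacent exploits fully correlated, non-adjacent exploits partially correlated by distance) can be sketched as follows. The graph, the exploit names, and the 1/d decay are assumptions made for illustration; the surveyed approaches only state that a distance measure is used.

```python
from collections import deque
from typing import Dict, List, Optional

# Hypothetical exploit-dependency graph: an edge a -> b means that
# exploit a enables exploit b. Node names are illustrative only.
ATTACK_GRAPH: Dict[str, List[str]] = {
    "exploit_h1_h2": ["exploit_h2_h3"],
    "exploit_h2_h3": ["exploit_h3_h2"],
    "exploit_h3_h2": [],
}


def graph_distance(graph: Dict[str, List[str]], src: str, dst: str) -> Optional[int]:
    """Length of the shortest directed path from src to dst (BFS), or None."""
    if src == dst:
        return 0
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt == dst:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def correlation(graph: Dict[str, List[str]], earlier: str, later: str) -> float:
    """Correlation of two events mapped to exploits `earlier` and `later`.

    Returns 1.0 for adjacent exploits and 1/d for graph distance d > 1.
    Returns 0.0 if `later` is unreachable or both events map to the same
    exploit. The 1/d decay is an assumed distance measure.
    """
    d = graph_distance(graph, earlier, later)
    if d is None or d == 0:
        return 0.0
    return 1.0 / d


full = correlation(ATTACK_GRAPH, "exploit_h1_h2", "exploit_h2_h3")
partial = correlation(ATTACK_GRAPH, "exploit_h1_h2", "exploit_h3_h2")
none = correlation(ATTACK_GRAPH, "exploit_h3_h2", "exploit_h1_h2")
```

A sequence of events whose consecutive correlations stay above a chosen threshold would then be reported as a candidate attack scenario.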

5.2.4 Contextual information in signature-based IDSs The majority of signature-based intrusion detection techniques utilize the activity category of contextual information to create attack prediction models. Additionally, Bayesian, k-NN, and graph-based approaches utilize the relation category to create attack prediction models. A fair number of rule-based classifiers and association rule approaches utilize the time and location categories. Apart from these, the majority of attack graph approaches utilize the individuality category, where the host information of the targeted systems is used for attack prediction tasks.

Decision tree-based intrusion detection techniques mainly utilize the activity category of contextual information [40,67,81,193,207]. Location and individuality have been utilized by [230] to create a profile-based technique that logs the history of system calls on each host. The host resource usage and the file access events are used to create a decision tree classifier, which is applied at run time to detect suspicious system calls. The major context modeling technique in decision tree-based intrusion detection techniques is the rule-based profile, which has been utilized for prediction tasks by [40] and for filtering tasks by [160,249].

Fuzzy rule-based intrusion detection techniques use the activity category [72,77,89,131,174,175,178]. A few fuzzy rule-based systems take into consideration the location category [41,71,238]. Such techniques concentrate on the locations of targeted systems and the source and destination ports in creating fuzzy-based classifiers. The approach in (Dickerson 2000) focuses on utilizing the time category to create a fuzzy rule-based intrusion detection technique. The proposed approach measures the trace features during successive time intervals to predict the occurrence of attacks based on time. An example of the features utilized for attack prediction is the number of successfully established connections in a specific time interval.
The most commonly used context modeling approach in fuzzy rule-based intrusion detection techniques is the rule-based profile. The IDSs utilize such profiles mainly for attack prediction tasks, but a few approaches utilize the time aspect of context in support of filtering operations to further reduce the flow volume in an attempt to improve the efficiency of the attack prediction process (Dickerson 2000).

Probability-based classifiers such as Bayesian-based intrusion detection techniques are divided into two categories: the first category represents techniques that utilize simple BNs to model the relationships between two variables (parent and child nodes). The second category provides more inference capabilities and models the relationships between several variables. The techniques in the first category provide simple inference capabilities. Therefore, the approaches in this category model the relation between parent and child nodes on BNs to discover correlations [4,42,51,142,204,236,296]. The techniques in the second category consider the indirect relationships between variables (e.g., the relationships between attack steps) to identify multiple relations between features, leading to the discovery of multi-step attacks [216,270]. Other categories of contextual information have also been used in creating BNs. Tylman [270] considers the location of attack targets to create attack prediction models using BNs. Xu and Shelton [298] utilize time in Continuous Time Bayesian Networks (CTBNs) to construct a hierarchical CTBN model for network packet traces to detect worms on a real-time basis. In order to model the relations between variables, BN prediction models use graphs with nodes and edges to model contextual information. BNs have been mostly used for attack prediction; however, the approach by [270] uses BNs to expand the IDS alerts by performing multiple inferences on Bayesian-based alert graphs.


Research using SVM classifiers still lacks the use of contextual information fusion, focusing instead on the activity category, although some SVM-based approaches utilize other categories. Naveen [195] utilizes host location and event time to detect intrusions using SVM classifiers. The approach focuses mainly on observing the changes in data patterns collected at different times. Overall, there is no specific context modeling technique for SVM approaches. Support vectors can be represented as linear functions and applied at run time to detect attacks. Finally, SVM approaches have so far been used mainly for attack prediction tasks. Applying SVM to expand or filter the predictions of IDSs is a research topic that has not been undertaken yet.

Similarly, the NN approaches focus mainly on the activity category of contextual information. The approaches by [63,130,226] focus on creating user activity profiles to detect intrusions using NNs. Few NN-based approaches utilize the time category. For instance, [8] propose a host-based intrusion detection technique that uses radial basis function neural networks (RBNN) as profile containers. The system uses system calls made by privileged UNIX processes. The proposed detection algorithm implements a time window to detect intrusions when a window has high correlation with preexisting suspicious time windows. There is no specific context modeling technique for NN intrusion detection approaches; by default, the output of the training phase can be profiled and used for prediction. Typically, it is not straightforward to maintain background knowledge in NN classifiers. Ryan et al. [226] propose an approach that utilizes the individuality category of context to create a knowledge-based technique that relies on known attack scenarios, operating system flaws, and the security policy as defined by the security officer to detect attacks.
Most of the proposed NN intrusion detection techniques utilize contextual information to predict intrusions. Expanding the results of IDSs using NNs has not been addressed by the existing NN intrusion detection techniques.

The k-NN signature-based intrusion detection techniques mainly utilize the relation and activity categories. Other categories of contextual information such as time have been used by [147] to enhance the detection accuracy of k-NN on sequential intrusion detection data. The activity-based profile is the major context modeling approach in k-NN signature-based intrusion detection techniques.

The association rule approaches utilize several categories of contextual information, specifically, activity, time, location, and relation [29,68,77,155,178,300,301]. Apiletti et al. [23] perform a real-time aggregation of the captured network data based on time and location of different activities. The objective of this aggregation mechanism is to detect and identify relations between attacks. The most widely utilized context modeling technique in association rule-based intrusion detection techniques is the rule-based profile, which is used for attack prediction and/or identifying relationships between attacks. The approach utilized by [23] takes advantage of time features to perform filtering of irrelevant events.

Finally, graph- and ontology-based approaches mainly focus on the relation context to identify dependency between host exploits and address the correlation between events [197–199,222,243]. In [136,224], a time and space alert analysis technique is proposed that correlates related alerts with background knowledge about vulnerabilities. In graph-based techniques, relation-based contextual information has been utilized to produce several related alerts through different expansion mechanisms [197–199,222,243].
Table 7 summarizes several characteristics of signature-based intrusion detection techniques, including the algorithms used to create each approach, the datasets used for evaluation, the software packages, and the contextual information categories involved.

Table 7 Signature-based intrusion detection techniques and the contextual information aspects utilized

Article | Category | Algorithm/technique | Information used for | Contextual category | Software packages | Dataset(s)
Sandhya Peddabachigari et al. (2004), [40,67,81,193] | Rule-based | Ripper; J48; tree-based SVMs; PCA | Prediction | Activity | http://weka.sourceforge.net/doc.dev/weka/classifiers/rules/JRip.html; http://weka.sourceforge.net/doc.dev/weka/classifiers/trees/J48.html; http://tfinley.net/software/svmpython2/; http://www.mathworks.com/help/stats/pca.html | KDD 99 dataset
[230] | Rule-based | J48 | Prediction | Activity | NA^a | Yonsei University/host-based dataset
[160,249] | Rule-based | J48; linear discriminant analysis | Filtering | Activity | http://weka.sourceforge.net/doc.dev/weka/classifiers/trees/J48.html; http://www.mathworks.com/help/stats/discriminantanalysis.html | KDD 99 dataset
Dickerson 2000, [1,89,128,131] | Rule-based | Fuzzy rules classifiers | Prediction | Activity | NA; Matlab: https://www.mathworks.com/matlabcentral/fileexchange/47203supervisedfuzzyclustering-fortheidentificationof-fuzzyclassifiers; FIRE; Mscan; NA | DARPA; DARPA; NA; Mississippi State University; DARPA
Dickerson 2000; [238] | Rule-based | Fuzzy rules classifiers | Prediction and expansion | Location | NA | NA; University of New Mexico host-based IDS data
[176,204,293] | Bayesian-based | Naïve Bayes; Bayes Network; Bayesian Network Model Averaging | Prediction | Relation | NA | KDD 99 dataset; Synthetic data; NSL KDD
[298] | Bayesian-based | Continuous Time Bayesian Networks | Prediction | Time | Sequence Time-Delaying Embedding (STIDE) | MAWI and the LBNL/ICSI Internal Enterprise Traffic; 1998 DARPA Dataset
[270] | Bayesian-based | Bayes Network | Prediction and expansion | Activity and location | Basset | NA
Weiming Hu et al. (2014), [311,312] | SVM | Radial basis kernel function (RBF) | Prediction | Activity | RBF function in Matlab: http://www.mathworks.com/help/stats/support-vectormachines-forbinaryclassification.html?requestedDomain=www.mathworks.com | 1998/1999 DARPA BSM dataset
[132,157,246] | NNs | Hybrid evolutionary NN; improved competitive learning NN; feedforward NN | Prediction | Activity | Matlab: http://www.mathworks.com/products/neuralnetwork/ | KDD99 data
[163] | k-NN | Transductive confidence machines for k-NN | Prediction | Activity | Weka: https://sourceforge.net/projects/weka/ | KDD99 data
[23,29,286] | Association Rules | Apriori; LCM v.2 Algorithm; Frequent Episodes | Prediction | Location and activity | ADAM technique for intrusion detection; Network Digest Framework; NA | KDD99 data; Polytechnic University of Turin; KDD99 data
[197–199,222,243] | Graph-based | Attack Graph and Adjacency Matrix Clustering | Prediction and expansion | Location and activity | TVA (Topological Analysis of Network Attack Vulnerability); Mulval; http://people.cs.ksu.edu/~xou/mulval/; NETSPA (A Network Security Planning Architecture) | Vulnerabilities taken from X-Force, Bugtraq, CVE, CERT, Nessus, and Snort information sources

^a NA not available

5.3 Anomaly-based intrusion detection techniques Anomaly detection techniques identify events that fall outside the region of pre-defined sets of benign activities; they have been used in commercial IDSs to detect intrusions. Chandola et al. [48] define anomalies as “patterns in data that do not conform to a well-defined notion of benign behavior.” The semi-supervised anomaly detection techniques are usually data mining algorithms that calculate the similarity of incoming connections with preexisting benign profiles. The deviations from these profiles result in declaring a run-time connection or a set of connections as anomalies. The subsequent sections discuss the existing anomaly detection techniques in the intrusion detection area.

5.3.1 Anomaly-based intrusion detection using classification techniques This section focuses on the application of classification-based anomaly detection techniques for intrusion detection, specifically, the detection of unknown attacks using these techniques. Additionally, it addresses the application of contextual information in classification-based anomaly detection techniques.

5.3.1.1 Anomaly-based intrusion detection using rule-based techniques Rule-based anomaly detection techniques utilize benign profiles to detect attacks as anomalies. Incoming connections that are not similar to these profiles are considered anomalies. Several rule-based approaches have been investigated to detect intrusions as anomalies, such as association rules and probability-based techniques [30,156,179,202,214], decision trees and entropy [79,242], and fuzzy rules classifiers [38,84,90,188].

5.3.1.2 Anomaly-based intrusion detection using probability-based techniques Probability-based models such as BNs find probabilistic relationships between events that target the network in order to identify a connection as an anomaly. A Bayesian classifier can categorize all states into benign or abnormal (anomaly) events using the prior probability of normality in the system and the prior probability of certain network events. Different thresholds can be applied to Bayesian-based classifiers to categorize events as either benign or anomalies. Several minor variants using BNs have been proposed to detect intrusions [42,262,263,287]. [30] propose an anomaly-based technique called pseudo-Bayes estimators to enhance detection of unknown attack types. In [237], Bayesian technology has been utilized to develop an anomaly detection system that can detect attacks from third-party executable code in an active network infrastructure. Active networks enable quick introduction of new services in the current telecommunication infrastructure.
Tylman [269] presents a system called Basset (Bayesian system for intrusion detection). The system extends the functionality of Snort (the open-source NIDS) by incorporating BNs in an additional processing stage. Lu et al. [173] propose a two-stratum BN-based anomaly detection and decision model that integrates a meta-data layer into the intrusion detection process. In [180], a macroprogramming approach has been used with BNs to detect unknown types of intrusions.

5.3.1.3 Anomaly-based intrusion detection using neural networks NNs have been applied in various anomaly-based intrusion detection techniques. They are trained using benign connections to learn one or more patterns of benign activities. The testing connections are provided as input to the NNs, which classify them as benign activities or attacks. Alternatively, an NN might give the testing connections a score that measures


the anomaly. There are several variations of NNs that have been used as anomaly detection techniques, such as Evolutionary NNs [108], Synergetic NNs [107], and Replicator NNs [111]. [152] propose a technique composed of a hierarchy of NNs that functions as a true anomaly detector, where the IDS monitors selected areas of network behavior to detect anomalies. Cha et al. [47] utilize neural learning by back-propagation to improve the performance of anomaly-based intrusion detection techniques using system calls. Mora et al. [189] compare the performance of grid neural networks and self-organizing maps (SOM) for anomaly-based intrusion detection. Song et al. [252] propose a flow-based statistical aggregation technique to detect anomalies in network flows. The proposed technique sets up flow-based statistical feature vectors and trains the NN classifier. The NN classifier uses back-propagation networks to classify each flow. AlSubaie and Zulkernine [9] investigate the role of the sequential relationship between the events of benign and suspicious behaviors in the anomaly detection process. They compare the performance of Hidden Markov Models (HMMs) and multilayer perceptron (MLP) NNs. They show that the detection rate of HMM classifiers exceeds that of MLP classifiers. Similarly, [306] propose a hybrid MLP/CNN (multilayer perceptron/chaotic NN) to identify unknown attacks. Qiu et al. [218] investigate data clustering and NN-based approaches to detect intrusions using anomaly detection. Their approach addresses the situation in which benign operation might exhibit multiple hidden modes. Wavelet-based NNs have also been used to identify attacks. Liu [169] utilizes a wavelet neural network (WNN) with a modified quantum-behaved particle swarm optimization (MQPSO) algorithm to detect network intrusions.
5.3.1.4 Anomaly-based intrusion detection using SVM Anomaly-based SVM intrusion detection approaches rely on one-class classification functions to discover attacks. One-class anomaly detection techniques assume that all training instances belong to one class (i.e., the benign activity class). Originally proposed by [232], the one-class SVM has been utilized in several works as an intrusion detection technique [78,148,156,177]. Attacks are detected by determining which points lie in a sparse region of the feature space. Lazarevic et al. [151,258] evaluate the performance of several existing supervised, semi-supervised, and unsupervised anomaly detection schemes and their variations. The results indicate that some schemes, such as the one-class SVM, appear very promising when applied to detecting novel intrusions. Jun et al. [138] present a network anomaly detection system using a Dissimilarity-based one-class Support Vector Machine (DSVM). The raw data are transformed into a dissimilarity space using dissimilarity representations (DR). Comparisons are performed between traditional one-class classifiers and the DSVM classifiers. The results show that DSVM achieves better performance. In [307], an imbalanced classification anomaly detection algorithm called "I-SVDD" is utilized for detecting botnets. The algorithm combines the one-class SVM classification with the known intrusion behaviors. To identify User to Root (U2R) and Remote to Local (R2L) intrusions, [314] utilize a popular nonlinear dimensionality reduction tool and one-class SVM.

5.3.1.5 Anomaly-based intrusion detection using k-nearest neighbor (k-NN) The k-NN anomaly detection techniques work in a semi-supervised manner, requiring labeled datasets of benign activities. The k-NN anomaly detection techniques for intrusion detection are split into two categories: distance-based and density-based.
Distance-based techniques utilize the distance between each connection and its kth benign nearest neighbor connection to calculate an anomaly score to declare attacks. The major assumption in this category is that the larger the distance between the connection and its nearest neighbor, the higher the likelihood that it is an attack.


Density-based techniques rely on the relative density of the neighborhood of a specific connection to declare that connection as an attack. If the connection is in a region with low density, it is likely that it is not a benign activity. The distance to the kth nearest neighbor of a given connection can be used as a metric to estimate the inverse of the density of that connection in the dataset. Distance-based techniques comprise the majority of research approaches, as they are more efficient when applied at run time. For instance, [144] propose an anomaly-based intrusion detection method using the Combined Strangeness and Isolation k-nearest neighbor (CSIk-NN) algorithm. The proposed approach analyzes different characteristics of network data using two measures: strangeness and isolation. Using both measures, a correlation unit raises alerts that are associated with confidence estimates. Similarly, [162] introduce transductive confidence machines for k-nearest neighbor (TCM-k-NN) to detect unknown attacks. The proposed technique is not sensitive to the curse of dimensionality. Ye and Tong [303] propose an anomaly detection method to detect intrusions from Unix shell commands. The k-NN algorithm is selected as a learning method, and a kernel function is used to calculate the deviation between benign and intrusive activities. TeShun and Yen [82,264] propose a fuzzy k-NN classifier to reduce uncertainty in the detection process. Cheng et al. [55] propose a multi-class k-NN anomaly-based approach with a reasonable computational complexity at both training and run time. In [61], an independent component analysis approach is applied for feature extraction. The selected features are fed into SVM and k-NN classifiers to detect intrusions. Wang and Stolfo [279] present a payload-based anomaly detector that uses the k-NN classifier with the Mahalanobis distance for attack detection.
In [299], a technique that utilizes a kernel-based k-NN algorithm and active defense has been used to detect unknown intrusions. In [208], a k-NN scheme is proposed to reduce the dimensionality of IDS data using a manifold learning nonlinear dimensionality reduction technique. Density-based techniques require the computation of density. Although this is not efficient at run time, some approaches have adopted them. For instance, [78] propose an unsupervised geometric detection framework. The data are mapped to a d-dimensional space to identify intrusion features. Connections are classified as intrusions based on their position in the space. The connections that are in relatively sparse regions of the feature space are declared as attacks.
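The distance-based scoring described in this section can be sketched with the standard library alone. This is a minimal illustration, assuming Euclidean distance, numeric feature vectors, and a hand-picked threshold; none of these choices is taken from the surveyed systems. The same kth-neighbor distance is what the text above describes as an estimate of inverse density.

```python
import math
from typing import List, Sequence


def kth_neighbor_distance(
    point: Sequence[float], benign: List[Sequence[float]], k: int
) -> float:
    """Anomaly score: Euclidean distance from `point` to its kth nearest
    neighbor among the labeled benign connections."""
    if not 1 <= k <= len(benign):
        raise ValueError("k must be between 1 and the number of benign points")
    distances = sorted(math.dist(point, b) for b in benign)
    return distances[k - 1]


def is_attack(
    point: Sequence[float],
    benign: List[Sequence[float]],
    k: int,
    threshold: float,
) -> bool:
    """Declare an attack when the kth-neighbor distance exceeds `threshold`.

    The threshold is an assumed tuning parameter.
    """
    return kth_neighbor_distance(point, benign, k) > threshold


# Hypothetical two-feature benign profile (values are illustrative only).
benign_profile = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]

close_flag = is_attack((0.5, 0.5), benign_profile, k=2, threshold=2.0)
far_flag = is_attack((5.0, 5.0), benign_profile, k=2, threshold=2.0)
far_score = kth_neighbor_distance((5.0, 5.0), benign_profile, k=2)
```

A larger k makes the score less sensitive to a single nearby benign point, at the cost of sorting more distances per connection.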

5.3.2 Anomaly-based intrusion detection using clustering techniques Clustering is a learning process that is used to find the group of objects that share similar features in a collection of unlabeled data. During the cluster creation phase, the similarities between pairs of connections are calculated and a set of clusters is generated as output with similar connections in the same cluster and dissimilar ones in different clusters. Clustering for intrusion detection has been applied in two modes: The first mode is the semi-supervised one where a clustering technique generates one or more clusters for benign activities and then a distance-based metric checks if a testing connection belongs to any of these clusters to declare it as a benign activity. Otherwise, the connection is declared as an attack. The second mode is the unsupervised one, where all training connections are unlabeled. Therefore, the unsupervised algorithm discovers the essential differences between benign connections and abnormal ones to create clusters. During evaluation, an incoming connection takes the label of the cluster to which it has been assigned. The most widely used clustering algorithms for intrusion detection are k-means, DBscan and self-organizing map (SOM). The k-means clustering algorithm splits and groups the data into benign and attack instances. It gives


each cluster a centroid, which is the central vector in that cluster. The centroids are updated iteratively until they converge. Consequently, the initial centroids are not necessarily the final ones. Each connection is expected to be more similar to its own cluster centroid than to the centroids of other clusters. A similarity measure such as Euclidean distance measures the similarity of a connection with the pre-computed clusters. There are several approaches that utilize k-means clustering algorithms in intrusion detection. Meng et al. [184] propose a k-means technique and test it on an audit network intrusion detection dataset. Similarly, [240] apply k-means clustering via Naïve Bayes classification for anomaly-based intrusion detection. In [231], a clustering technique called k-map is proposed. The k-map clustering works in the same way as k-means; however, k-map is applied in a multilayer hierarchical approach. In [100], a clustering algorithm called Y-means is utilized to detect unknown intrusions. The proposed heuristic is based on the k-means algorithm, and it overcomes the number-of-clusters dependency and degeneracy shortcomings of k-means. JiQing et al. [126] modify the k-means clustering algorithm based on the clonal selection algorithm (CSA). Lizhong et al. [171] propose a modification of the k-means clustering algorithm using particle swarm optimization (PSO-KM) to detect new types of cyber-attacks. Several authors propose efficient versions of k-means to deal with the scalability issue, including MapReduce [313], random Fourier features [57], and the global k-means clustering algorithm [167].

DBSCAN is a density-based clustering algorithm that finds clusters of arbitrary shape in a database that contains noise [158]. It assumes that most connections are benign activities that are grouped in a region with high density. Based on this assumption, suspicious data are very different from benign data and exist in regions with low density.
There have been several approaches that utilize DBSCAN for intrusion detection. In [161], a modified IDBSCAN (Improved DBSCAN) is proposed with a different distance calculation formula and cluster merger process. It has been tested using an audit network intrusion dataset. Handra and Ciocarlie [109] demonstrate the importance of combining DBSCAN with filtering and refinement techniques to improve the efficiency of the detection process by dedicating most of the computation resources to the most suspicious connections. Sang Hyun et al. [229] propose a density-based clustering algorithm to identify cyber-attacks. The algorithm works in a similar way to DBSCAN; however, it is designed to continuously model a data stream.

In addition, clustering algorithms such as expectation maximization (EM) have been utilized as an intrusion detection technique by [205] to detect denial-of-service attacks and by [250] to carry out alert correlation. Finally, fuzzy clustering approaches such as Fuzzy C-means have been applied to create anomaly-based intrusion detection techniques [56,239,265,292]. When clustering is completed, a set of cluster centers and a membership partition matrix are generated. Membership scores represent the confidence that a connection belongs to each cluster. Wang et al. [278] propose a fuzzy clustering technique to generate training subsets of network traces. Subsequently, models are trained on the different subsets to create attack prediction models.

A wide variety of techniques have been utilized for visualization in network intrusion detection. In fact, it is common to combine cluster analysis with visualization for additional interpretation [127,228]. For instance, in [83], the NFlowVis system is introduced to analyze data and discover intrusions. The user interface provides a drill-down mechanism that allows users to go from an abstract overview of the network activities to a more aggregated view of the IDS data. The proposed system combines a TreeMap visualization, a clustering algorithm, and hierarchical edge bundles to group flows in a meaningful way.
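The cluster centers and membership partition matrix produced by Fuzzy C-means, as mentioned in the clustering discussion, can be sketched as follows. This is a minimal illustration of the standard Fuzzy C-means update rules, not the implementation of any cited approach; the data points, initial centers, fuzzifier `m`, and iteration count are assumptions.

```python
import math

def memberships(point, centers, m=2.0):
    """Fuzzy C-means membership of one point in each cluster (sums to 1)."""
    dists = [math.dist(point, c) for c in centers]
    if 0.0 in dists:  # the point coincides with at least one center
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]

def update_centers(points, u, m=2.0):
    """Recompute each center as the membership-weighted mean of the points."""
    dims = len(points[0])
    centers = []
    for j in range(len(u[0])):
        weights = [row[j] ** m for row in u]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dims)))
    return centers

def fuzzy_c_means(points, centers, m=2.0, iterations=20):
    """Alternate membership and center updates for a fixed number of rounds."""
    for _ in range(iterations):
        u = [memberships(p, centers, m) for p in points]
        centers = update_centers(points, u, m)
    u = [memberships(p, centers, m) for p in points]
    return centers, u

# Hypothetical two-feature connection records forming two loose groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (9.0, 9.0), (9.0, 10.0), (10.0, 9.0)]
centers, u = fuzzy_c_means(points, [(0.0, 0.0), (10.0, 10.0)])
```

Each row of `u` is the membership vector of one connection; a row close to an even split would indicate a connection that no cluster explains with confidence.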

5.3.3 Other anomaly-based intrusion detection techniques

In this section, we describe four anomaly-based intrusion detection techniques: entropy, hidden Markov model, spectral clustering, and histogram-based techniques.

The entropy-based techniques study the intrinsic characteristics or regularity in audit data to distinguish between benign and abnormal behavior. The process of using entropy measures to create anomaly-based intrusion detection techniques consists of studying the characteristics of data and then selecting a model that best represents these characteristics. Entropy measures, such as the joint and relative entropy, have been used in anomaly-based intrusion detection techniques. The entropy values are lower when the distribution of features between different labels is skewed. [76] utilize an information distance measure to detect zero-day system scans. Based on the entropy difference between packet blocks sent by legitimate and illegitimate users, the proposed approach detects zero-day scans using the Kolmogorov complexity entropy measure. Ukil [271] utilizes the Kolmogorov complexity entropy measure to detect anomalies by investigating the network flow signature. Giseop and Ilkyeun [88] utilize an entropy-based approach to detect distributed denial-of-service attacks. Shokri et al. [244] apply information theory concepts to move the center of cluster results to the most important areas in the domain of the selected network flow features. Among categories of contextual information, the activity category has been mainly used to create context profiles that describe benign behavior [213]; these profiles are used to predict attacks as anomalies [88] and filter out false positives [213].

A hidden Markov model (HMM) is a powerful statistical method that characterizes an observed data sample arranged as a discrete time series. It can model nonlinear and time-variant systems. Several authors utilize HMM to create anomaly-based intrusion detection techniques. [305] propose an anomaly-based intrusion detection method based on HMM. The benign trace of system calls is used to train the HMM and create a sequence of benign state transitions. Markov chains that model event transitions in a benign/usual operating condition of network systems have been utilized by [164,200,213,302]. [125] propose an incremental approach to create HMM and use it for attack prediction. The proposed scheme first divides the long observation sequence into multiple subsets of sequences. Each subset infers one sub-model, which is incrementally merged into a final HMM model for attack prediction. The HMM approaches mainly utilize the time category of contextual information to create time-variant prediction models and state transition sequences. These techniques use graphs to model the system states that change with time and use them mainly for attack prediction [164,200,213,302].

Spectral clustering is an approach that considers clustering as a graph partitioning problem. The main principle of spectral clustering is straightforward: given some data points, we form their similarity matrix; the components of the top eigenvectors are then calculated and used to partition those points into several clusters. Usually, the graph is partitioned such that edge weights between the partitions are minimized and edge weights within the clusters are maximized, grouping similar points together in the same cluster, while dissimilar points gather in different clusters. Few approaches utilize spectral clustering for the intrusion detection problem. Gujral et al. [101] propose a Spectral Graph Transducer and Gaussian Fields approach to detect unknown attacks. In [58,186], an unsupervised spectral clustering-based intrusion detection technique is introduced. The representative clusters are labeled as benign activities or attacks according to an assignment heuristic. The resulting structure is then used at run time for attack identification.

Histogram techniques create several bins using the features and labels extracted from intrusion detection data and then count the number of connections that fall into each bin. For instance, to create attack profiles, the intrusion detection data is discretized into benign and suspicious regions using labeled data. Afterward, the suspicious regions of the discretized features are used in creating attack profiles. There are several variations of histogram-based techniques. In [201], a real-time packet-level intrusion prevention technique is created that “maps the payload histogram onto a pair of features using hypercube hash functions. The two-dimensional feature space is quantized into a binary bitmap representing the benign and suspicious regions.” Kind et al. [80,140] propose feature-based anomaly detection approaches that construct histograms of different traffic features to model histogram patterns and identify deviations from the created models. In [137], a hierarchical, multi-tier, multi-observation-window, histogram-based network intrusion detection technique is proposed. The approach constructs a probability density function for the benign behavior. A similarity metric is used to formulate an anomaly status as a vector that can be classified by a NN classifier. Kulsoom et al. [145] propose a network traffic visualization technique that helps administrators recognize attacks, utilizing a stacked histogram to aggregate port activity in order to detect anomalies at run time. The approach enables the network administrator to drill down and roll up using a visualization model. Similarly, [149] propose a statistical-based method to identify attacks using PCA and a visualization technique. The techniques in this category mainly utilize the trace features to create context profiles. Activity is the major category of context that is modeled as feature-based profiles to create histogram-based intrusion detection techniques.
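Two of the ideas in this section, histogram profiles and entropy measures, can be combined in a small sketch. It assumes a single hypothetical feature (destination port), made-up traffic windows, and Shannon and relative (Kullback-Leibler) entropy as the measures; it does not reproduce any of the cited systems.

```python
import math
from collections import Counter

def histogram(values, bins):
    """Relative frequency of each bin label; assumes every value is in bins."""
    counts = Counter(values)
    total = len(values)
    return [counts.get(b, 0) / total for b in bins]

def shannon_entropy(p):
    """Entropy in bits; low for skewed histograms, high for uniform ones."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def relative_entropy(p, q, eps=1e-9):
    """KL divergence D(p || q) in bits; eps guards against empty bins in q."""
    return sum(x * math.log2(x / max(y, eps)) for x, y in zip(p, q) if x > 0)

# Hypothetical destination ports seen in a benign window and in a scan-like window.
bins = [22, 53, 80, 443]
benign_ports = [80] * 50 + [443] * 40 + [53] * 8 + [22] * 2
scan_ports = [22] * 25 + [53] * 25 + [80] * 25 + [443] * 25

benign_hist = histogram(benign_ports, bins)
scan_hist = histogram(scan_ports, bins)

# A window is flagged when it diverges from the benign profile by more than
# a threshold; the value 0.5 is an arbitrary choice for this illustration.
deviation = relative_entropy(scan_hist, benign_hist)
is_anomalous = deviation > 0.5
```

The benign histogram is skewed toward web ports and therefore has lower entropy than the uniform scan-like window, which matches the observation that entropy drops when a feature distribution is skewed.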

5.3.4 Contextual information in anomaly-based IDSs

Most anomaly-based approaches utilize the time category of contextual information, where anomalies are usually detected based on time windows using a temporal-based analysis of the collected traces. The relation category of context is mainly used in Bayesian, k-nearest neighbors, clustering, and hidden Markov models. The activity category has been utilized in creating rule-based anomaly intrusion detection techniques [30,156,179,202,214]. Other contextual aspects such as time have been utilized to detect contextual anomalies and declare them as intrusions [38,79]. For instance, [38] propose a context-dependent and time-window model to discover intrusions as anomalies using fuzzy rules. The common contextual modeling technique in this category is the feature-based profile. IDSs with rule-based anomaly detection techniques utilize contextual information to filter out the benign activities and keep the suspicious ones.

The majority of NN approaches use the activity category in their implementations, although [9,169] propose a NN aggregation approach that utilizes the time category to detect intrusions from system calls. Some approaches, such as the one by [169], utilize NNs in an anomaly detection mode not only to distinguish between benign activities and attacks but also between attack types based on their contextual pre-conditions. We are not aware of NN anomaly detection techniques that utilize relation context to identify related attacks.

While these approaches utilize mainly the activity category, there are a few approaches that focus on the remaining ones. Ma and Perkins [177] utilize the time category to discover anomalies in time-series data. An algorithm to identify anomalies in time-series data using one-class SVM is proposed. The time series are converted into a set of vectors in the (projected) phase spaces. The anomaly events in time series are detected as outliers of the “benign” distribution of the converted vectors. In the same vein, [245] utilize the temporal relationships between flows of packets during data preprocessing. The temporal relationships between the inputs are used in SVM learning. The major use of contextual information in one-class SVM is to predict attacks. They utilize “Passive TCP/IP Fingerprinting (PTF) in order to reject incomplete network trace that either violates the TCP/IP standards or generation policy inside well-known platforms”.

BN-based anomaly detection techniques utilize several contextual categories. [262,263] combine Bayesian statistical models with a function of time slicing to detect anomalies. The majority of anomaly-based BNs utilize activity and relation categories [42,262,263,287]. Graphs and feature profiles are the common contextual models in Bayesian-based anomaly detection techniques.

In k-NN-based intrusion detection approaches, the relation category of context is very significant [82,264]. In density-based techniques, the connection is compared to other connections in its neighborhood; therefore, its similarity with other k instances decides if that connection is an attack [78]. The contextual information in k-NN-based approaches is usually modeled using several types of profiles such as benign activity [144] and relation-based profiles [279], which are used to model the similarity between nodes that belong to the same neighborhood so that they can be predicted together. Some approaches, such as the one by [279], use k-NN to retrieve relevant predictions that belong to the same neighborhood.

Clustering techniques focus on the relation category. Although relationships exist between the connections that belong to the same cluster, it is important to focus on discovering semantic relationships between connections in different clusters. The types of these latter relationships, their strength, and their effect on the attack prediction process have not been studied by the existing intrusion detection approaches. The activity and time categories have been widely used in clustering approaches discussed in this section. For instance, an eigenspace clustering approach which uses time sequences of graphs is proposed and used by [120] to discover cyber-attacks. The technique utilizes the principal eigenvector to partition the graph, then it derives a probability distribution for an anomaly measure that is defined for time-series data. In general, context profiles which store information about nodes and their clusters are the main context modeling technique in clustering approaches to predict attacks. Few approaches utilize such information in filtering the predictions of IDSs [239,265].

Table 8 summarizes several characteristics of anomaly-based intrusion detection techniques, including the algorithms used to create each approach, the datasets utilized during experiments, the software packages or tools utilized, and the contextual information used to create those approaches.
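The neighborhood comparison used by k-NN-based approaches in this section can be sketched as below. The benign profile, the value of `k`, and the threshold are hypothetical; the score (mean distance to the k nearest benign connections) is one common choice and is an assumption here rather than the measure used by the cited works.

```python
import math

def knn_score(candidate, benign_profile, k=3):
    """Mean distance from the candidate to its k nearest benign connections."""
    dists = sorted(math.dist(candidate, b) for b in benign_profile)
    return sum(dists[:k]) / k

def is_suspicious(candidate, benign_profile, k=3, threshold=1.0):
    """Flag a connection whose neighborhood is far from benign activity."""
    return knn_score(candidate, benign_profile, k) > threshold

# Hypothetical normalized feature vectors of connections known to be benign.
benign_profile = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.0, 1.2)]

near = is_suspicious((1.05, 1.0), benign_profile)  # close to the benign profile
far = is_suspicious((4.0, 4.0), benign_profile)    # far from every benign record
```

The profile must hold at least `k` records for the mean to be meaningful; a relation-based profile would replace the plain distance with a similarity defined over related nodes.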

5.4 Hybrid intrusion detection techniques

This section discusses the major hybrid intrusion detection techniques that utilize anomaly and misuse (signature)-based detection techniques in a hybrid manner to identify intrusions. The hybrid intrusion detection techniques can be categorized into three groups.

First, there are approaches that utilize signature-based techniques on top of anomaly-based ones. In this group, the anomaly detection technique generates an initial prediction based on the nature of incoming activities. If the connection is predicted as benign, no further action is taken. If the connection is predicted as suspicious, it is forwarded to the misuse detection component, which applies similarity measures to find the similarity between the discovered suspicious patterns and known attack signatures. The suspicious pattern is declared an attack if the similarity is high.

Second, there are techniques that utilize anomaly-based intrusion detection techniques on top of signature-based techniques. In this group, the connections are initially processed by a signature-based module. If such a module labels the connection as an attack, it is further processed by an anomaly detection module. An anomaly score is calculated and given to each new connection. The instance is declared an attack when its anomaly score is greater than a specific threshold.

Table 8 Anomaly-based intrusion detection techniques and the contextual information used

Article | Category | Algorithm/technique | Dataset | Software packages/tools used | Contextual category | Information used for
[79,242] | Rule-based techniques | Entropy and sparse Markov transducers; k-means+ID3 | University of New Mexico Data; Lincoln Lab Data DARPA Dataset | NA | Time, location, and activity | Prediction and filtering
[42,263,287] | Probability-based techniques | Bayesian Belief Networks; Naïve Bayes; Dirichlet; Mixture models | Collected at the University of Deusto; DARPA 1999; DARPA 1999 | Snort-Stimulator, Metasploit, Netdude, Ettercap, and Packit tools; NA; NA | Time, location, and activity | Prediction and filtering
[9,169] | Neural networks | Multilayer perceptron (MLP) neural network; modified quantum-behaved particle swarm optimization (MQPSO) algorithm | Computer Immune Systems Project (live login, Synthetic FTP and Synthetic Xlock); DARPA 1999 | Matlab: http://www.mathworks.com/products/neural-network/ | Time and activity | Prediction
[245] | SVM | One-class SVM & self-organized feature map (SOFM) | DARPA 1999 | Matlab: http://www.cis.hut.fi/projects/somtoolbox/ | Time and activity | Filtering
[82,264] | k-NN | Fuzzy Belief k-NN | DARPA 1999 | https://www.mathworks.com/matlabcentral/fileexchange/21326-fuzzy-k-nn | Activity | Prediction

Table 8 continued

Article | Category | Algorithm/technique | Dataset | Software packages/tools used | Information used for
[58,186] | Clustering techniques | Spectral clustering | DARPA 1999 | https://www.mathworks.com/matlabcentral/fileexchange/34412-fast-and-efficient-spectral-clustering | Prediction
[56,239,265,292] | Clustering techniques | Fuzzy C-means | DARPA 1999 | http://www.mathworks.com/help/fuzzy/fcm.html | Prediction
[164,200,213,302] | Markov-based techniques | Hidden Markov model and Markov chain | Dataset of New Mexico, Mill Dataset and a Pascal Dataset MIT Lincoln Lab, Forrest Dataset: https://www.cs.unm.edu/~forrest/publications/ieee-sp96-unix.pdf | https://www.cs.ubc.ca/~murphyk/Software/HMM/hmm.html | Prediction
[80,140] | Histogram-based techniques | Manhattan distance, Mahalanobis distance, normalized Euclidean distance, Hamming distance; Payload histogram using HMM | Netflow packets collected from the campus of IBM Research at Zurich; DARPA 1999 | NA | Prediction
[271] | Entropy-based measures | Kolmogorov complexity | Synthetic data | NA | Prediction

Contextual category (row alignment not recoverable from the source layout; values in source order): Activity; Time; Activity; Activity; Time and Location

Third, there are techniques that apply anomaly-based detection and misuse-based detection techniques in parallel to identify intrusions. In this type of technique, a correlation engine is created and utilized to analyze the suspicious patterns sent by the misuse and anomaly detection techniques.

Several authors utilize hybrid intrusion detection approaches, such as:

1. Combining anomaly and misuse detection techniques by [267]
2. Combining neural networks and fuzzy logic with network profiling to process the network data (Dwen-Ren et al. 2003; Idris and Shanmugam 2005)
3. Combining neural networks with self-organizing maps by [121]
4. Combining Packet Header Anomaly Detection (PHAD) and Network Traffic Anomaly Detection (NETAD) by [27]
5. Combining artificial neural networks (ANNs), support vector machines (SVMs), and multivariate adaptive regression splines (MARS) to create a layered attack detection system by [192]

In summary, Table 9 displays several characteristics of hybrid-based intrusion detection techniques, including the algorithms used to create each approach, the datasets utilized during experiments, the software packages or tools utilized, and the contextual information utilized to create those approaches. In addition, Table 10 lists several current intrusion detection platforms that include signature, anomaly, network, or host-based detection modules. The majority of these IDSs are rule-based systems, and they do not utilize context. Only a few IDSs utilize event correlation as a detection mechanism.
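The first hybrid arrangement described in this section, an anomaly detector followed by a misuse (signature) matcher, can be sketched as follows. Everything concrete here is an assumption made for illustration: the distance-to-centroid anomaly score, the Jaccard similarity, both thresholds, the signature contents, and the `"suspicious"` fallback label for patterns that match no signature (the text does not say how those are handled).

```python
import math

def anomaly_score(conn, benign_centroid):
    """Distance of the connection's features from a benign centroid."""
    return math.dist(conn["features"], benign_centroid)

def jaccard(a, b):
    """Set overlap between observed events and a signature pattern."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def hybrid_detect(conn, benign_centroid, signatures,
                  anomaly_threshold=2.0, similarity_threshold=0.5):
    # Stage 1: anomaly detection. Benign predictions need no further action.
    if anomaly_score(conn, benign_centroid) <= anomaly_threshold:
        return "benign"
    # Stage 2: misuse detection compares the pattern with known signatures.
    best_name, best_sim = None, 0.0
    for name, pattern in signatures.items():
        sim = jaccard(conn["events"], pattern)
        if sim > best_sim:
            best_name, best_sim = name, sim
    if best_sim >= similarity_threshold:
        return "attack:" + best_name
    return "suspicious"  # hypothetical label for unmatched anomalies

# Hypothetical signature base and connections.
signatures = {"syn_flood": {"syn", "no_ack", "high_rate"}}
centroid = (0.0, 0.0)
r1 = hybrid_detect({"features": (0.1, 0.2), "events": []}, centroid, signatures)
r2 = hybrid_detect({"features": (5.0, 5.0), "events": ["syn", "high_rate"]}, centroid, signatures)
r3 = hybrid_detect({"features": (5.0, 5.0), "events": ["dns_query"]}, centroid, signatures)
```

Swapping the two stages gives the second arrangement, and running both stages on every connection and merging their outputs in a correlation step would correspond to the parallel arrangement.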

Table 9 Hybrid intrusion detection techniques and the contextual categories used

Article | Category | Algorithm/technique | Dataset | Software packages/tools used | Contextual category | Information use
[267] | Anomaly and misuse | Rule-based | HTTP traffic of two web servers: an academic one (SUPELEC) and industrial one (FT) | WIDS | Activity | Prediction and filtering
[75,121] | Anomaly and misuse; host-based, anomaly, and misuse | NN & fuzzy inference; fuzzy rules, NN, and SOM | Synthetic data | NA | Activity | Prediction
[27] | Packet Header Anomaly Detection (PHAD) & Network Traffic Anomaly Detection (NETAD) | Rule-based | DARPA 1999 | NA | Activity | Prediction
[192] | Misuse detection | Artificial neural networks (ANNs), (SVMs), and multivariate adaptive regression splines (MARS) | DARPA 1999 | http://www.mathworks.com/help/nnet/index.html | Activity | Prediction

Table 10 Enterprise Network and Host-based IDSs

IDS | Category | Detection technique(s)
Bro | NIDS | Rule-based application-level semantics
Snort | NIDS | Rule-based
BlackIce | HIDS, NIDS | Rule-based
eEye Digital Security | HIDS | Rule-based
McAfee Host Intrusion Prevention | HIDS | Rule-based
Symantec Critical System Protection | HIDS | Rule-based
DefenseWall HIPS | HIDS | Rule-based
Cisco Security Agent | HIDS | Rule-based, event correlation
RealSecure | HIDS | Protocol analysis, behavioral pattern sets, event correlation
Proventia | HIDS | Event filters
OpenDLP | HIDS | Rule-based
eTrust Intrusion Detection | NIDS | Rule-based
McAfee IntruShield | NIDS | User-configured policies, rule-based
Ipolicy Intrusion Prevention | NIDS | Statistical traffic anomaly detection rule-based engine
NetRanger | NIDS | Context-oriented rules
Polycenter | NIDS | Knowledge-based analysis of audit data
MIDAS | HIDS | Statistical profiles
NIDES | HIDS | Statistical algorithms
SIDS | HIDS | Statistical algorithms
CMDS | HIDS | Statistical detection, rule-based
EMERALD | NIDS | Statistical profiles, Bayesian inference
IDES | HIDS | Rule-based
NPatrol | NIDS | Statistical profiles
DeepNines BBX Intrusion Prevention (IPS) | NIDS | Event correlation
CISCO Intrusion Prevention System | NIDS | Event correlation
Suricata | NIDS | Rule-based

Context-aware? (six IDSs are marked √ in the source; the row alignment of the marks is not recoverable from the source layout)

6 Limitations and future directions

The research approaches discussed in previous sections show that the existing context-aware intrusion detection techniques have a number of limitations.

First, although some existing intrusion detection approaches do utilize contextual information in their operations, they do not utilize semantic inference to automatically generate contextual relationships between security incidents. Context with semantics is quite significant for predicting cyber-attacks that are feasible under specific circumstances. Semantic reasoning can lead to the discovery of unseen relationships that cannot be identified either by domain experts, due to the large amount of network data, or by other data mining techniques where no semantic inference is available. While attack graphs incorporate some contextual information such as the relationships between vulnerabilities, they contain too many paths; traversing all such paths to discover possible attacks leads to an inefficient detection process.

Second, the focus of most existing approaches is to use contextual information for a single objective: to predict attacks by analyzing raw data. Few approaches discover relevant attacks based on pre-identified relations and filter out irrelevant ones based on context. Thus, new approaches are needed to identify contextual relationships between attacks, improve the quality of predictions produced by IDSs, and then apply context-based filtering to restrict some of these predictions.

Third, we found a limited number of approaches identifying unknown attacks using the contextual relationships that exist between known attacks. An unknown attack is a sequence of activities where an attacker employs an unknown vulnerability in a computer application to initiate an attack. An unknown attack occurs on day zero of awareness of the vulnerability that causes it. Code developers have no knowledge of zero-day vulnerabilities and therefore cannot patch systems against unknown attacks. Since the approaches that utilize vulnerability databases fail to detect zero-day attacks, a few studies have addressed zero-day attacks using machine learning techniques such as unsupervised anomaly detection techniques [158,211], SVMs [251], or clustering [115,211]. The main assumption is twofold: first, any set of activities that does not conform to a well-defined notion of benign activity is presumed to be an unknown attack; second, any connection that does not match the signatures of known attacks is considered a benign activity. However, there are some shortcomings in these assumptions. For the first one, it is not always possible to generate well-defined benign activity profiles. The second assumption ignores the fact that although unknown attacks have some unique characteristics, some steps in these attacks overlap with one or more steps used to initiate known attacks; thus, there is a degree of similarity between an unknown attack and one or more known attacks. Existing approaches try to discover unknown attacks using a semi-supervised anomaly detection process based on deviation from benign activity profiles. Since it is not easy to define the profiles of benign activities, because there are too many patterns of benign activity, these techniques lead to a high false-positive rate (more than 48%, as reported by [158]).

A recent trend in mitigating unknown attacks is to utilize software defined networks (SDNs). A question is whether one can decrease the risk of zero-day attacks on a network using SDNs and the OpenFlow architecture. One extension to existing research approaches is to integrate such approaches with the OpenFlow model, mainly to decrease the risk of zero-day attacks. The controller of the SDN can be programmed to forward all packets that are part of the same connection and do not trigger a matching rule in the flow table to a host that contains the IDS. The IDS measures the risk of zero-day attacks in the network by calculating a numeric score. Clearly, further research is needed in this direction.
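The SDN idea outlined in this section, where connections that match no flow-table rule are handed to an IDS host, can be sketched with plain dictionaries. This is not OpenFlow or controller API code: the flow-table layout, the connection key, the `"to_ids"` action name, and in particular the risk score (the share of packets that missed the table) are hypothetical choices, since the text does not define how the numeric score is calculated.

```python
def route_packet(flow_table, packet, ids_queue):
    """Return the matched action, or hand the packet to the IDS on a table miss."""
    key = (packet["src"], packet["dst"], packet["dport"])  # connection identity
    action = flow_table.get(key)
    if action is None:
        ids_queue.append(packet)  # every packet of an unmatched connection
        return "to_ids"
    return action

def zero_day_risk(ids_queue, total_packets):
    """Hypothetical score: fraction of traffic that matched no flow rule."""
    return len(ids_queue) / total_packets if total_packets else 0.0

# Hypothetical flow table with one known connection.
flow_table = {("10.0.0.1", "10.0.0.2", 80): "forward:port2"}
packets = [
    {"src": "10.0.0.1", "dst": "10.0.0.2", "dport": 80},
    {"src": "10.0.0.1", "dst": "10.0.0.2", "dport": 80},
    {"src": "10.0.0.9", "dst": "10.0.0.2", "dport": 4444},
    {"src": "10.0.0.9", "dst": "10.0.0.2", "dport": 4444},
]

ids_queue = []
actions = [route_packet(flow_table, p, ids_queue) for p in packets]
risk = zero_day_risk(ids_queue, len(packets))
```

A context-aware IDS would replace the simple ratio with a score derived from the forwarded packets themselves, for example their similarity to known attack steps.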

7 Conclusion

The growth of computer networks has increased the significance of cyber-security in areas ranging from homeland security to personal life, where systems become vulnerable and can be misused. In this review, we provided an outline of research approaches that utilize contextual information in creating misuse (signature)-based and anomaly-based intrusion detection techniques. The taxonomy proposed and the research approaches discussed indicate several limitations of existing techniques that partially utilize contextual information for attack discovery. Our taxonomy shows that there is still a need for automated reasoning of contextual information in cyber-security. Although such techniques exist in other application domains, they are rarely used for intrusion detection. In addition, our taxonomy shows that context-aware attack prediction models require domain knowledge extracted from ontologies to improve the detection rate of cyber-attacks, and it reveals that the majority of existing approaches do not take into account the available contextual information about the targeted systems. As such, there is a need to identify a risk score for each security incident based on the configuration of the target host. If there is strong evidence that the host is patched against attacks, IDS alerts that correspond to such attacks need to be discarded. In summary, we hope that the taxonomy and the discussion on incorporating contextual information in intrusion detection will encourage further research on intrusion detection and prevention.

Acknowledgements This research has been partially funded by grants from the State of Maryland, TEDCO (MII), and Northrop-Grumman Corporation, USA.

References

1. Abadeh MS, Habibi J (2007) Computer intrusion detection using an iterative fuzzy rule learning approach. In: IEEE international fuzzy systems conference, Imperial College, London, UK, 23–26 July 2007, pp 1–6. doi:10.1109/FUZZY.2007.4295375
2. Abdoli F, Kahani M (2009) Ontology-based distributed intrusion detection system. In: 14th international CSI computer conference, Tehran, Iran, 20–21 Oct 2009, pp 65–70. doi:10.1109/CSICC.2009.5349372

3. Abe N, Zadrozny B, Langford J (2006) Outlier detection by active learning. In: Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining, Philadelphia, PA, USA. ACM, pp 504–509. doi:10.1145/1150402.1150459
4. Abouzakhar NS, Gani A, Manson G (2003) Bayesian learning networks approach to cybercrime detection. In: Proceedings of the PostGraduate networking conference (PGNET’03), Liverpool, UK
5. Adetunmbi AO, Falaki SO, Adewale OS, Alese BK (2008) Network intrusion detection based on rough set and k-nearest neighbour. Int J Comput ICT Res 2(1):60–66
6. Agrawal R, Imielinski T, Swami A (1993) Mining association rules between sets of items in large databases. In: Proceedings of the ACM SIGMOD international conference on management of data, Washington, D.C., USA. ACM, pp 207–216. doi:10.1145/170035.170072
7. Agrawal R, Srikant R (1994) Fast algorithms for mining association rules in large databases. In: Proceedings of the 20th international conference on very large data bases, Santiago de Chile, Chile
8. Ahmed U, Masood A (2009) Host based intrusion detection using rbf neural networks. In: International conference on emerging technologies (ICET’09), Islamabad, Pakistan, 19–20 Oct 2009, pp 48–51. doi:10.1109/ICET.2009.5353204
9. Al-Subaie M, Zulkernine M (2006) Efficacy of hidden Markov models over neural networks in anomaly intrusion detection. In: 30th annual international computer software and applications conference (COMPSAC’06), Illinois, USA. IEEE, pp 325–332
10. Albayrak S, Muller A, Scheel C, Milosevic D (2005) Combining self-organizing map algorithms for robust and scalable intrusion detection. In: International conference on computational intelligence for modelling, control, and automation, Vienna, Austria, 28–30 Nov 2005, vol 2, pp 123–130. doi:10.1109/CIMCA.2005.1631456
11. AlEroud A, Karabatis G (2013a) A contextual anomaly detection approach to discover zero-day attacks. In: ASE international conference on cyber security, Washington, D.C., USA, pp 40–45
12. AlEroud A, Karabatis G (2013b) A contextual anomaly detection approach to discover zero-day attacks. In: ASE international conference on cyber security, Washington, D.C., USA, pp 386–388
13. AlEroud A, Karabatis G (2013c) A system for cyber attack detection using contextual semantics. In: 7th international conference on knowledge management in organizations: service and cloud computing, vol 172 (Advances in Intelligent Systems and Computing). Springer, Berlin, pp 431–442
14. AlEroud A, Karabatis G (2013d) Toward zero-day attack identification using linear data transformation techniques. In: IEEE 7th international conference on software security and reliability (SERE’13), Washington, D.C., 18–20 June 2013, pp 159–168. doi:10.1109/SERE.2013.16
15. AlEroud A, Karabatis G (2014a) Context infusion in semantic link networks to detect cyber-attacks: a flow-based detection approach. In: IEEE international conference on semantic computing (ICSC), LA, California, 16–18 June 2014, pp 175–182. doi:10.1109/ICSC.2014.29
16. AlEroud A, Karabatis G (2014b) Context infusion in semantic link networks to detect cyber-attacks: a flow-based detection approach. In: Eighth IEEE international conference on semantic computing, Newport Beach, California, USA. IEEE
17. AlEroud A, Karabatis G (2016) Queryable semantics for the detection of cyber-attacks: a flow-based detection approach. IEEE transactions on systems, man, and cybernetics: systems
18. AlEroud A, Karabatis G, Sharma P, He P (2014) Context and semantics for detection of cyber attacks. Int J Inf Comput Secur 6(1):63–92. doi:10.1504/ijics.2014.059791
19. Alserhani F, Akhlaq M, Awan IU, Cullen AJ, Mirchandani P (2010) MARS: multi-stage attack recognition system. In: 24th IEEE international conference on advanced information networking and applications (AINA’10), Perth, Australia, 20–23 April 2010, pp 753–759. doi:10.1109/AINA.2010.57
20. Ambwani T (2003) Multi class support vector machine implementation to intrusion detection. In: Proceedings of the international joint conference on neural networks, Portland, vol 3. IEEE, pp 2300–2305
21. An X, Jutla D, Cercone N (2006) Privacy intrusion detection using dynamic Bayesian networks. In: Proceedings of the 8th international conference on electronic commerce, Fredericton, New Brunswick, Canada. ACM, pp 208–215. doi:10.1145/1151454.1151493
22. Angelini M, Prigent N, Santucci G (2015) PERCIVAL: proactive and reactive attack and response assessment for cyber incidents using visual analytics. In: IEEE symposium on visualization for cyber security (VizSec), 25–25 Oct 2015, pp 1–8. doi:10.1109/VIZSEC.2015.7312764
23. Apiletti D, Baralis E, Cerquitelli T, D’Elia V (2008) Network digest analysis by means of association rules. In: 4th international IEEE conference on intelligent systems (IS’08), Varna, 6–8 Sept 2008, vol 2, pp 11–32. doi:10.1109/is.2008.4670505
24. Arya A, Kumar S (2014) Information theoretic feature extraction to reduce dimensionality of Genetic Network Programming based intrusion detection model. In: Issues and challenges in intelligent computing techniques (ICICT). IEEE, pp 34–37

25. Atallah M, Szpankowski W, Gwadera R (2004) Detection of significant sets of episodes in event sequences. In: Fourth IEEE international conference on data mining (ICDM’04), Brighton, UK. IEEE, pp 3–10
26. Axelsson S (2000) Intrusion detection systems: a survey and taxonomy. Accessed (2000)
27. Aydın MA, Zaim AH, Ceylan K (2009) A hybrid intrusion detection system design for computer network security. Comput Electr Eng 35(3):517–526. doi:10.1016/j.compeleceng.2008.12.005
28. Baldauf M, Dustdar S, Rosenberg F (2007) A survey on context-aware systems. Int J Ad Hoc Ubiquitous Comput 2(4):263–277. doi:10.1504/ijahuc.2007.014070
29. Barbará D, Couto J, Jajodia S, Wu N (2001) ADAM: a testbed for exploring the use of data mining in intrusion detection. SIGMOD Rec 30(4):15–24. doi:10.1145/604264.604268
30. Barbará D, Wu N, Jajodia S (2001) Detecting novel network intrusions using Bayes estimators. In: First SIAM conference on data mining, Chicago IL, Citeseer, pp 1–17
31. Bazire M, Brézillon P (2005) Understanding context before using it. In: Proceedings of the 5th international conference on modeling and using context, Paris, France, pp 113–192
32. Beauquier J, Hu Y (2007) Intrusion detection based on distance combination. In: World Academy of Science and Engineering (CESSE’07), Venice, Italy
33. Bloedorn E, Christiansen AD, Hill W, Skorupka C, Talbot LM, Tivel J (2001) Data mining for network intrusion detection: how to get started. Accessed (2001)
34. Blum AL, Langley P (1997) Selection of relevant features and examples in machine learning. Artif Intell 97(1):245–271
35. Böhmer M, Bauer G, Krüger A (2011) Context tags: exploiting user-given contextual cues for disambiguation. In: Proceedings of the 13th international conference on human computer interaction with mobile devices and services, Stockholm, Sweden. ACM, pp 611–616, 2037469. doi:10.1145/2037373.2037469
36. Bonifacio JM Jr, Cansian AM, de Carvalho A, Moreira ES (1998) Neural networks applied in intrusion detection systems. In: The IEEE international joint conference on neural networks, Anchorage, AK, 4–8 May 1998, vol 1, pp 205–210. doi:10.1109/IJCNN.1998.682263
37. Boriah S, Chandola V, Kumar V (2008) Similarity measures for categorical data: a comparative evaluation. In: Proceedings of the eighth SIAM international conference on data mining, Atlanta, Georgia
38. Botha M, von Solms R (2003) Utilising fuzzy logic and trend analysis for effective intrusion detection. Comput Secur 22(5):423–434. doi:10.1016/S0167-4048(03)00511-X
39. Bouramoul A, Kholladi MK, Doan BL (2011) Using context to improve the evaluation of information retrieval systems. Int J Database Manag Syst (IJDMS) 3(2):22–39
40. Bouzida Y, Cuppens F, Cuppens-Boulahia N, Gombault S (2004) Intrusion detection using principal component analysis. In: Proceedings of the 7th world multiconference on systemics, cybernetics and informatics, Orlando, USA
41. Bridges SM, Vaughn RB (2000) Fuzzy data mining and genetic algorithms applied to intrusion detection. In: Proceedings of the national information systems security conference (NISSC), Baltimore, MD
42. Bringas PG (2007) Intensive use of Bayesian belief networks for the unified, flexible and adaptable analysis of misuses and anomalies in network intrusion detection and prevention systems. In: 18th international workshop on database and expert systems applications (DEXA ’07), Regensburg, Germany, 3–7 Sept 2007, pp 365–371. doi:10.1109/DEXA.2007.38
43. Brown PJ, Bovey JD, Chen X (1997) Context-aware applications: from the laboratory to the marketplace. IEEE Pers Commun 4(5):58–64
44. Buczak AL, Guven E (2015) A survey of data mining and machine learning methods for cyber security intrusion detection. IEEE Commun Surv Tutor 18(2):1153–1176
45. Burroughs DJ, Wilson LF, Cybenko GV (2002) Analysis of distributed intrusion detection systems using Bayesian methods. In: 21st IEEE international performance, computing, and communications conference, Austin, Texas, USA, pp 329–334. doi:10.1109/IPCCC.2002.995166
46. Cannady J (1998) Artificial neural networks for misuse detection. In: National information systems security conference, Crystal City Arlington, Virginia, USA, pp 368–381
47. Cha BR, Vaidya B, Han S (2005) Anomaly intrusion detection for system call using the soundex algorithm and neural networks. In: 10th IEEE symposium on computers and communications (ISCC’05), Cartagena, Spain. IEEE, pp 427–433
48. Chandola V, Banerjee A, Kumar V (2009) Anomaly detection: a survey. ACM Comput Surv 41(3):1–58. doi:10.1145/1541880.1541882
49. Chandola V, Eilertson E, Ertoz L, Simon G, Kumar V (2006) Data mining for cyber security, book chapter in data warehousing and data mining techniques for computer security, 1st edn. Springer, Berlin
50. Cheboli D (2010) Anomaly detection of time series. PhD Thesis, University of Minnesota


51. Chebrolu S, Abraham A, Thomas JP (2005) Feature deduction and ensemble design of intrusion detection systems. Comput Secur 24(4):295–307. doi:10.1016/j.cose.2004.09.008
52. Chen H, Finin T, Joshi A (2003) An ontology for context-aware pervasive computing environments. Knowl Eng Rev 18(3):197–207. doi:10.1017/s0269888904000025
53. Chen RC, Chen SP (2008) Intrusion Detection Using a Hybrid Support Vector Machine Based on Entropy and TF-IDF. Int J Innov Comput Inf Control 4(2):413–424
54. Chen RC, Cheng KF, Chen YH, Hsieh CF (2009) Using rough set and support vector machine for network intrusion detection system. In: First Asian conference on intelligent information and database systems (ACIIDS’09), Quang Binh, Vietnam. IEEE, pp 465–470
55. Cheng X, Liu B-x, Li K, Yan J (2009) Intrusion detection system based on KNN-MARS. In: WRI world congress on software engineering (WCSE ’09), Xiamen, China, 19–21 May 2009, vol 1, pp 392–396. doi:10.1109/WCSE.2009.79
56. Chimphlee W, Abdullah AH, Noor Md Sap M, Srinoy S, Chimphlee S (2006) Anomaly-based intrusion detection using fuzzy rough clustering. In: International conference on hybrid information technology (ICHIT ’06), Jeju Island, Korea, 9–11 Nov 2006, vol 1, pp 329–334. doi:10.1109/ICHIT.2006.253508
57. Chitta R, Jin R, Jain AK (2012) Efficient kernel clustering using random Fourier features. In: IEEE 12th international conference on data mining, IEEE, pp 161–170
58. Chuanliang C, Yunchao G, Yingjie T (2008) Semi-supervised learning methods for network intrusion detection. In: IEEE international conference on systems, man and cybernetics (SMC’08), Seoul, Korea, 12–15 Oct 2008, pp 2603–2608. doi:10.1109/ICSMC.2008.4811688
59. Dasgupta D, González F (2002) An immunity-based technique to characterize intrusions in computer networks. IEEE Trans Evol Comput 6(3):281–291
60. Dasgupta D, Nino F (2000) A comparison of negative and positive selection algorithms in novel pattern detection. In: IEEE international conference on systems, man, and cybernetics, Nashville, TN, vol 1. IEEE, pp 125–130
61. Dayu Y, Hairong Q (2008) A network intrusion detection method using independent component analysis. In: 19th international conference on pattern recognition (ICPR’08), Tampa, Florida, USA, 8–11 Dec 2008, pp 1–4. doi:10.1109/ICPR.2008.4761087
62. de Lima IVM, Degaspari JA, Sobral JBM (2008) Intrusion detection through artificial neural networks. In: IEEE network operations and management symposium (NOMS’08), Bahia, Brazil, 7–11 April 2008, pp 867–870. doi:10.1109/NOMS.2008.4575234
63. Debar H, Becker M, Siboni D (1992) A neural network component for an intrusion detection system. In: IEEE computer society symposium on research in security and privacy, Oakland, California, 4–6 May 1992, pp 240–250. doi:10.1109/RISP.1992.213257
64. Debar H, Dacier M, Wespi A (1999) Towards a taxonomy of intrusion-detection systems. Comput Netw 31(8):805–822
65. Debar H, Dacier M, Wespi A (2000) A revised taxonomy for intrusion-detection systems. Ann Telecommun 55(7):361–378
66. Denning DE (1987) An intrusion-detection model. IEEE Trans Software Eng 13(2):222–232
67. Depren O, Topallar M, Anarim E, Ciliz MK (2005) An intelligent intrusion detection system (IDS) for anomaly and misuse detection in computer networks. Expert Syst Appl 29(4):713–722. doi:10.1016/j.eswa.2005.05.002
68. Desheng F, Shu Z, Ping G (2009) Research on a distributed network intrusion detection system based on association rule mining. In: 1st international conference on information science and engineering (ICISE), Nanjing, 26–28 Dec 2009, pp 1816–1818. doi:10.1109/icise.2009.929
69. Dey AK (2000) Providing architectural support for building context-aware applications. PhD Thesis, Georgia Institute of Technology
70. Dharap C (2001) Context-based and user-profile driven information retrieval. Patent 6,256,633. Google Patents
71. Dickerson JE, Dickerson JA (2000) Fuzzy network profiling for intrusion detection. In: 19th international conference of the North American on Fuzzy Information Processing Society, Atlanta, Georgia, 2000, pp 301–306. doi:10.1109/NAFIPS.2000.877441
72. Dickerson JE, Juslin J, Koukousoula O, Dickerson JA (2001) Fuzzy intrusion detection. In: IFSA (International Fuzzy Systems Association) world congress and 20th NAFIPS (North American Fuzzy Information Processing Society) international conference, Vancouver, British Columbia, vol 3. IEEE, pp 1506–1510
73. Ding T, AlEroud A, Karabatis G (2015) Multi-granular aggregation of network flows for security analysis. In: IEEE international conference on intelligence and security informatics (ISI). IEEE, pp 173–175


74. Ding X, Zhang G, Ke Y, Ma B, Li Z (2008) High efficient intrusion detection methodology with twin support vector machines. In: International symposium on information science and engineering (ISISE’08), Shanghai, China, vol 1. IEEE, pp 560–564
75. Dwen-Ren T, Wen-Pin T, Chi-Fang C (2003) A hybrid intelligent intrusion detection system to recognize novel attacks. In: IEEE 37th Annual international Carnahan conference on security technology, Taipei, Taiwan, 14–16 Oct 2003, pp 428–434. doi:10.1109/CCST.2003.1297598
76. Eiland EE, Liebrock LM (2006) An application of information theory to intrusion detection. In: Fourth IEEE international workshop on information assurance (IWIA’06), Egham, Surrey, UK, 13–14 April 2006, pp 66–81. doi:10.1109/IWIA.2006.3
77. El-Semary A, Edmonds J, Gonzalez-Pino J, Papa M (2006) Applying data mining of fuzzy association rules to network intrusion detection. In: IEEE information assurance workshop, New York, USA, 21–23 June 2006, pp 100–107. doi:10.1109/iaw.2006.1652083
78. Eskin E, Arnold A, Prerau M, Portnoy L, Stolfo S (2002) A geometric framework for unsupervised anomaly detection: detecting intrusions in unlabeled data. In: Proceedings of the conference on applications of data mining in computer security. Kluwer Academics, pp 78–100
79. Eskin E, Lee W, Stolfo SJ (2001) Modeling system calls for intrusion detection with dynamic window sizes. In: Proceedings of DARPA information survivability conference & exposition (DISCEX’01), Anaheim, California, vol 1. IEEE, pp 165–175
80. Estévez-Tapiador JM, García-Teodoro P, Díaz-Verdejo JE (2004) Measuring normality in HTTP traffic for anomaly-based intrusion detection. Comput Netw 45(2):175–193. doi:10.1016/j.comnet.2003.12.016
81. Fan W, Miller M, Stolfo S, Lee W, Chan P (2004) Using artificial anomalies to detect unknown and known network intrusions. Knowl Inf Syst 6(5):507–527
82. Fangfei W, Qingshan J, Lifei C, Zhiling H (2007) Clustering ensemble based on the fuzzy KNN algorithm. In: Eighth ACIS international conference on software engineering, artificial intelligence, networking, and parallel/distributed computing (SNPD’07), Qingdao, July 30 2007–Aug 1 2007, vol 3, pp 1001–1006. doi:10.1109/SNPD.2007.504
83. Fischer F, Mansmann F, Keim DA, Pietzko S, Waldvogel M (2008) Large-scale network monitoring for visual analysis of attacks. In: Visualization for computer security. Springer, pp 111–118
84. Florez G, Bridges S, Vaughn RB (2002) An improved algorithm for fuzzy data mining for intrusion detection. In: Annual meeting of the North American fuzzy information processing society (NAFIPS’02), Ann Arbor, MI. IEEE, pp 457–462
85. Fortu O, Moldovan D (2005) Identification of textual contexts. In: Proceedings of the 5th international conference on modeling and using context, Paris, France. 2136862. Springer, pp 169–182. doi:10.1007/11508373_13
86. Gao B, Ma HY, Yang YH (2002) HMMS (Hidden Markov Models) based on anomaly intrusion detection method. In: International conference on machine learning and cybernetics, Beijing, vol 1. IEEE, pp 381–385
87. Gao M, Tian J, Xia M (2009) Intrusion detection method based on classify support vector machine. In: Second international conference on intelligent computation technology and automation (ICICTA’09), Zhangjiajie, China, vol 2. IEEE, pp 391–394
88. Giseop N, Ilkyeun R (2009) An efficient and reliable DDoS attack detection using a fast entropy computation method. In: 9th international symposium on communications and information technology (ISCIT’09), Icheon, South Korea, 28–30 Sept 2009, pp 1223–1228. doi:10.1109/ISCIT.2009.5341118
89. Gomez J, Dasgupta D (2002) Evolving fuzzy classifiers for intrusion detection. In: Proceedings of the IEEE workshop on information assurance, West Point, NY, vol 6. IEEE Computer Press, New York, vol 3, pp 321–323
90. Gómez J, González F, Dasgupta D (2003) An immuno-fuzzy approach to anomaly detection. In: The 12th IEEE international conference on fuzzy systems (FUZZ’03), St. Louis, MO, USA, vol 2. IEEE, pp 1219–1224
91. Granitzer M, Kroll M, Seifert C, Rath AS, Weber N, Dietzel O, et al (2008) Analysis of machine learning techniques for context extraction. In: Third international conference on digital information management (ICDIM’08), London, UK. IEEE, pp 233–240
92. Gray D, Kraus R (2012) Contextual security provides actionable intelligence. Accessed 2012. Available: https://www.necam.com/docs/?id=36eda3e2-ec01-4117-a7cc-3483db8422e7
93. Green DM, Swets JA (1966) Signal detection theory and psychophysics, vol 1974. Wiley, New York
94. Greenberg S (2001) Context as a dynamic construct. Hum Comput Interact 16(2):257–268. doi:10.1207/s15327051hci16234_09


95. Grobelnik M, Mladenic D, Leban G, Stajner T (2011) Context and semantics for knowledge management: technologies for personal productivity: machine learning techniques for understanding context and process (1st ed). Springer, Berlin, pp 127–145
96. Gross T, Specht M (2001) Awareness in context-aware information systems. In: Mensch & computer conference, Germany, vol 1. Citeseer, pp 173–182
97. Gruber TR (1993) A translation approach to portable ontology specifications. Knowl Acquis 5(2):199–220. doi:10.1006/knac.1993.1008
98. Gruschke B (1998) Integrated event management: event correlation using dependency graphs. In: Proceedings of the 9th IFIP/IEEE international workshop on distributed systems: operations & management (DSOM 98), Newark, DE, USA, pp 130–141
99. Gu G, Fogla P, Dagon D, Lee W, Skorić B (2006) Measuring intrusion detection capability: an information-theoretic approach. In: Proceedings of the ACM symposium on information, computer and communications security, Taipei, Taiwan. ACM, pp 90–101
100. Guan Y, Ghorbani AA, Belacel N (2003) Y-means: a clustering method for intrusion detection. In: IEEE Canadian conference on electrical and computer engineering, Canada; Montreal, 4–7 May 2003, vol 2, pp 1083–1086. doi:10.1109/CCECE.2003.1226084
101. Gujral S, Ortiz E, Syrmos VL (2009) An unsupervised method for intrusion detection using spectral clustering. In: IEEE symposium on computational intelligence in cyber security (CICS ’09), Nashville, TN, USA, March 30 2009–April 2 2009, pp 99–106. doi:10.1109/CICYBS.2009.4925096
102. Guo C, Zhou Y-J, Ping Y, Luo S-S, Lai Y-P, Zhang Z-K (2013) Efficient intrusion detection using representative instances. Comput Secur 39:255–267. doi:10.1016/j.cose.2013.08.003
103. Haijun X, Fang P, Ling W, Hongwei L (2007) Ad hoc-based feature selection and support vector machine classifier for intrusion detection. In: IEEE international conference on grey systems and intelligent services, (GSIS07), Macau, China. IEEE, pp 1117–1121
104. Hall MA (1999) Correlation-based feature selection for machine learning. PhD thesis, the University of Waikato
105. Halme LR (1995) AIN’T misbehaving-A taxonomy of anti-intrusion techniques. Comput Secur 14(7):606–606
106. Han J, Pei J, Yin Y (2000) Mining frequent patterns without candidate generation. SIGMOD Rec 29(2):1–12. doi:10.1145/335191.335372
107. Han SJ, Cho SB (2005) Evolutionary neural networks for anomaly detection based on the behavior of a program. IEEE Trans Syst Man Cybern B Cybern 36(3):559–570
108. Han W, Xiong W, Xiao Y, Ellabidy M, Vasilakos AV, Xiong N (2012) A class of non-statistical traffic anomaly detection in complex network systems. In: 32nd international conference on distributed computing systems workshops (ICDCSW), Macau, China. IEEE, pp 6400–6406
109. Handra SI, Ciocarlie H (2011) Anomaly detection in data mining. Hybrid approach between filtering-and-refinement and DBSCAN. In: 6th IEEE international symposium on applied computational intelligence and informatics (SACI), Timisoara, Romania, 19–21 May 2011, pp 75–83. doi:10.1109/SACI.2011.5872976
110. Hassanzadeh A, Sadeghian B (2008) Intrusion detection with data correlation relation graph. In: Third international conference on availability, reliability and security (ARES’08), Washington, DC, USA, 4–7 March 2008, pp 982–989. doi:10.1109/ARES.2008.119
111. Hawkins S, He H, Williams G, Baxter R (2002) Outlier detection using replicator neural networks. In: 4th international conference on data warehousing and knowledge discovery, Aix-en-Provence, France, pp 113–123
112. Hayes MA, Capretz MA (2014) Contextual anomaly detection in big sensor data. In: 2014 IEEE international congress on big data. IEEE, pp 64–71
113. Hellemons L, Hendriks L, Hofstede R, Sperotto A, Sadre R, Pras A (2012) SSHCure: a flow-based SSH intrusion detection system. In: Sadre R, Novotný J, Čeleda P, Waldburger M, Stiller B (eds) Dependable networks and services, vol 7279 (Lecture Notes in Computer Science), Springer, Berlin, pp 86–97
114. Heller K, Svore K, Keromytis AD, Stolfo S (2003) One class support vector machines for detecting anomalous windows registry accesses. In: Workshop on data mining for computer security (DMSEC), Melbourne, FL, pp 2–9
115. Hendry GR, Yang SJ (2008) Intrusion signature creation via clustering anomalies. In: Proceeding of SPIE, Bellingham, WA, pp 69730–69731
116. Hu W, Gao J, Wang Y, Wu O, Maybank S (2014) Online Adaboost-based parameterized methods for dynamic distributed network intrusion detection. IEEE Trans Cybern 44(1):66–82
117. Hu W, Liao Y, Vemuri VR (2003) Robust anomaly detection using support vector machines. In: Proceedings of the international conference on machine learning, Washington, DC, USA, pp 282–289


118. Hunt EB, Marin J, Stone PJ (1966) Experiments in induction, 1st ed. The University of Michigan, Academic Press, Michigan
119. Hussein M, Zulkernine M (2006) UMLINTR: A UML profile for specifying intrusions. In: Proceedings of the 13th annual IEEE international symposium and workshop on engineering of computer based systems, Potsdam, Germany. 1126211: IEEE Computer Society, pp 279–288. doi:10.1109/ecbs.2006.70
120. Ide T, Kashima H (2004) Eigenspace-based anomaly detection in computer systems. In: Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining, Seattle, WA, USA. 1014102: ACM, pp 440–449. doi:10.1145/1014052.1014102
121. Idris NB, Shanmugam B (2005) Artificial intelligence techniques applied to intrusion detection. In: IEEE India conference Indicon (INDICON’05), Chennai, India, 11–13 Dec 2005, pp 52–55. doi:10.1109/INDCON.2005.1590122
122. Ippoliti D, Xiaobo Z (2010) An adaptive growing hierarchical self organizing map for network intrusion detection. In: Proceedings of 19th international conference on computer communications and networks (ICCCN’10), Zurich, Switzerland, 2–5 Aug 2010, pp 1–7. doi:10.1109/ICCCN.2010.5560165
123. Jadidi Z, Muthukkumarasamy V, Sithirasenan E, Sheikhan M (2013) Flow-based anomaly detection using neural network optimized with GSA algorithm. In: Distributed computing systems workshops (ICDCSW), 2013 IEEE 33rd international conference on, 8–11 July 2013, pp 76–81. doi:10.1109/ICDCSW.2013.40
124. Jakobson G (2003) The technology and practice of integrated multiagent event correlation systems. In: International conference on integration of knowledge intensive multi-agent systems, Boston MA, USA, 30 Sept–4 Oct 2003, pp 568–573. doi:10.1109/KIMAS.2003.1245102
125. Jha S, Tan K, Maxion RA (2001) Markov chains, classifiers, and intrusion detection. In: Proceedings. 14th IEEE Computer Security Foundations, Nova Scotia, Canada, 2001, pp 206–219. doi:10.1109/CSFW.2001.930147
126. Ji-Qing X, Feng-Hua L, Xian-Lun T (2005) A novel intrusion detection method based on clonal selection clustering algorithm. In: Proceedings of international conference on machine learning and cybernetics, Guangzhou, China, 18–21 Aug 2005, vol 6, pp 3905–3910. doi:10.1109/ICMLC.2005.1527620
127. Ji S-Y, Jeong B-K, Choi S, Jeong DH (2016) A multi-level intrusion detection method for abnormal network behaviors. J Netw Comput Appl 62:9–17
128. Jianxiong L, Bridges SM, Vaughn RB Jr (2001) Fuzzy frequent episodes for real-time intrusion detection. In: The 10th IEEE international conference on fuzzy systems, Melbourne, VIC, 2001, vol 1, pp 368–371. doi:10.1109/FUZZ.2001.1007325
129. Jie L, Zhi-tang L (2007) Using network attack graph to predict the future attacks. In: Second international conference on communications and networking in China (CHINACOM ’07), Xi’an, China, 22–24 Aug 2007, pp 403–407. doi:10.1109/CHINACOM.2007.4469413
130. Jing-xin W, Zhi-ying W, Kui D (2004) A network intrusion detection system based on the artificial neural networks. In: Proceedings of the 3rd international conference on information security, Shanghai, China. ACM, pp 166–170
131. Jing Z, Hongjuan W, Yushu L (2011) Intrusion detection using evolving fuzzy classifiers. In: 6th IEEE joint international information technology and artificial intelligence conference (ITAIC’11), Chongqing, 20–22 Aug 2011, vol 1, pp 119–122. doi:10.1109/ITAIC.2011.6030165
132. Jirapummin C, Wattanapongsakorn N, Kanthamanon P (2002) Hybrid neural networks for intrusion detection system. In: International conference on multimedia technology (ICMT), Wuhan, China, pp 928–931
133. Johnson RA, Wichern DW (1992) Applied multivariate statistical analysis, vol 4, 3rd edn. Prentice Hall, Englewood Cliffs
134. Jones AK, Sielken RS (2000) Computer system intrusion detection: a survey. Accessed (2000)
135. Jou YF, Gong F, Sargor C, Wu SF, Cleaveland WR (1997) Architecture design of a scalable intrusion detection system for the emerging network infrastructure. Accessed (1997)
136. Juan W, Feng-Li Z, Jing J, Wei C (2010) Alert analysis and threat evaluation in network situation awareness. In: 2010 international conference on communications, circuits and systems (ICCCAS’10), Chengdu, China, 28–30 July 2010, pp 278–281. doi:10.1109/ICCCAS.2010.5582005
137. Jun L, Manikopoulos C (2003) Early statistical anomaly intrusion detection of DoS attacks using MIB traffic parameters. In: IEEE systems, man and cybernetics society information assurance workshop, West Point, New York, USA, 18–20 June 2003, pp 53–59. doi:10.1109/SMCSIA.2003.1232401
138. Jun M, Guanzhong D, Zhong X (2009) Network anomaly detection using dissimilarity-based one-class SVM classifier. In: International conference on parallel processing workshops (ICPPW ’09), Kaohsiung, 22–25 Sept 2009, pp 409–414. doi:10.1109/ICPPW.2009.6


139. Kim G, Lee S, Kim S (2014) A novel hybrid intrusion detection method integrating anomaly detection with misuse detection. Expert Syst Appl 41(4, Part 2):1690–1700. doi:10.1016/j.eswa.2013.08.066
140. Kind A, Stoecklin MP, Dimitropoulos X (2009) Histogram-based Traffic Anomaly Detection. IEEE Trans Netw Serv Manag 6(2):110–121. doi:10.1109/TNSM.2009.090604
141. Kohavi R, John GH (1995) Automatic parameter selection by minimizing estimated error. In: Proceedings of the twelfth annual international conference on machine learning, Tahoe City, California, USA. Citeseer, pp 304–312
142. Kruegel C, Mutz D, Robertson W, Valeur F (2003) Bayesian event classification for intrusion detection. In: 19th annual computer security applications conference, Las Vegas, NV, USA, 8–12 Dec 2003, pp 14–23. doi:10.1109/CSAC.2003.1254306
143. Kruegel C, Valeur F, Vigna G (2004) Intrusion detection and correlation: challenges and solutions, vol 14. Springer, Berlin
144. Kuang L, Zulkernine M (2008) An anomaly intrusion detection method using the CSI-KNN algorithm. In: Proceedings of the 2008 ACM symposium on applied computing, Fortaleza, Ceara, Brazil. 1363897: ACM, pp 921–926. doi:10.1145/1363686.1363897
145. Kulsoom A, Lee C, Conti G, Copeland JA (2005) Visualizing network data for intrusion detection. In: Proceedings from the sixth annual IEEE SMC information assurance workshop (IAW ’05), West Point, NY, 15–17 June 2005, pp 100–108. doi:10.1109/IAW.2005.1495940
146. Kumar P, Rao M, Krishna P, Bapi R (2005a) Using sub-sequence information with K-NN for classification of sequential data. In: Distributed computing and internet technology, Bhubaneswar, India, pp 1–11
147. Kumar P, Rao M, Krishna P, Bapi R (2005b) Using sub-sequence information with kNN for classification of sequential data. In: Distributed computing and internet technology, Bhubaneswar, India, pp 1–11
148. Kun-Lun L, Hou-Kuan H, Sheng-Feng T, Wei X (2003) Improving one-class SVM for anomaly detection. In: International conference on machine learning and cybernetics, Xi’an, China, 2–5 Nov 2003, vol 5, pp 3077–3081, vol 3075. doi:10.1109/ICMLC.2003.1260106
149. Labib K, Vemuri VR (2006) An application of principal component analysis to the detection and visualization of computer network attacks. Annales des télécommunications 61(1–2):218–234
150. Lakhina A, Crovella M, Diot C (2005) Mining anomalies using traffic feature distributions. In: Proceedings of the conference on applications, technologies, architectures, and protocols for computer communications (SIGCOMM ’05), Philadelphia, PA, USA, vol 35. ACM, pp 217–228, vol 4
151. Lazarevic A, Ertoz L, Kumar V, Ozgur A, Srivastava J (2003) A Comparative study of anomaly detection schemes in network intrusion detection. In: Proceedings of the third SIAM international conference on data mining, San Francisco, CA, USA, vol 3, pp 25–36. Society for Industrial & Applied
152. Lee SC, Heinbuch DV (2001) Training a neural-network based intrusion detector to recognize novel attacks. IEEE Trans Syst Man Cybern Syst Hum 31(4):294–299
153. Lee W, Stolfo SJ (1998a) Data mining approaches for intrusion detection. In: Proceedings of the 7th conference on USENIX security symposium, San Antonio, Texas, pp 6–12. 1267555: USENIX Association
154. Lee W, Stolfo SJ (1998b) Data mining approaches for intrusion detection. In: Usenix security
155. Lee W, Stolfo SJ (2000) A framework for constructing features and models for intrusion detection systems. ACM Trans Inf Syst Secur (TISSEC) 3(4):227–261
156. Lee W, Stolfo SJ, Mok KW (2000) Adaptive intrusion detection: a data mining approach. Artif Intell Rev 14(6):533–567
157. Lei JZ, Ghorbani A (2004) Network intrusion detection using an improved competitive learning neural network. In: Second annual conference on communication networks and services research, Fredericton, N.B., Canada, 19–21 May 2004, pp 190–197. doi:10.1109/DNSR.2004.1344728
158. Leung K, Leckie C (2005) Unsupervised anomaly detection in network intrusion detection using clusters. In: Proceedings of the twenty-eighth Australasian conference on computer science, Newcastle, NSW, Australia. Australian Computer Society, Inc, pp 333–342
159. Li H, Guan XH, Zan X, Han CZ (2003) Network intrusion detection based on support vector machine. J Comput Res Dev 6(1):799–807
160. Li X-B (2005) A scalable decision tree system and its application in pattern recognition and intrusion detection. Decis Support Syst 41(1):112–130. doi:10.1016/j.dss.2004.06.016
161. Li XY, Gao GH, Sun JX (2010) A new intrusion detection method based on improved DBSCAN. In: WASE international conference on information engineering (ICIE), Beidaihe, 14–15 Aug 2010, vol 2, pp 117–120. doi:10.1109/ICIE.2010.123
162. Li Y, Fang B, Guo L, Chen Y (2007) Network anomaly detection based on TCM-KNN algorithm. In: Proceedings of the 2nd ACM symposium on information, computer and communications security, Singapore. 1229292: ACM, pp 13–19. doi:10.1145/1229285.1229292


163. Li Y, Guo L (2007) An active learning based TCM-KNN algorithm for supervised network intrusion detection. Comput Secur 26(7):459–467
164. Liang Y, Wang HQ, Cai HB, He YJ (2008) A novel stochastic modeling method for network security situational awareness. In: 3rd IEEE conference on industrial electronics and applications (ICIEA’08), Singapore, 3–5 June 2008, pp 2422–2426. doi:10.1109/ICIEA.2008.4582951
165. Liao Y, Vemuri VR (2002) Use of K-nearest neighbor classifier for intrusion detection. Comput Secur 21(5):439–448
166. Lichodzijewski P, Nur Zincir-Heywood A, Heywood MI (2002) Host-based intrusion detection using self-organizing maps. In: Proceedings of the international joint conference on neural networks (IJCNN’02), Honolulu, Hawaii, vol 2. IEEE, pp 1714–1719
167. Likas A, Vlassis N, Verbeek JJ (2003) The global k-means clustering algorithm. Pattern Recognit 36(2):451–461
168. Liu G, Yi Z, Yang S (2007) A hierarchical intrusion detection model based on the PCA neural networks. Neurocomputing 70(7–9):1561–1568. doi:10.1016/j.neucom.2006.10.146
169. Liu L, Liu Y (2009) MQPSO based on wavelet neural network for network anomaly detection. In: 5th international conference on wireless communications (WiCom’09), Beijing, China. IEEE, pp 1–5
170. Livnat Y, Agutter J, Moon S, Erbacher RF, Foresti S (2005) A visualization paradigm for network intrusion detection. In: Proceedings from the sixth annual IEEE SMC information assurance workshop. IEEE, pp 92–99
171. Lizhong X, Zhiqing S, Gang L (2006) K-means algorithm based on particle swarm optimization algorithm for anomaly intrusion detection. In: The sixth world congress on intelligent control and automation (WCICA’06), Dalian, China, vol 2, pp 5854–5858. doi:10.1109/WCICA.2006.1714200
172. Lopes CT (2009) Context features and their use in information retrieval. Paper presented at the proceedings of the third BCS-IRSG conference on Future directions in information access, Padua, Italy
173. Lu H, Chen J, Wei W (2008) Two stratum Bayesian network based anomaly detection model for intrusion detection system. In: International symposium on electronic commerce and security, Guangzhou, China 3–5:482–487. doi:10.1109/ISECS.2008.178
174. Lu N, Mabu S, Wang T, Hirasawa K (2012) Integrated fuzzy GNP rule mining with distance-based classification for intrusion detection system. In: IEEE international conference on systems, man, and cybernetics (SMC). Seoul, Korea, 14–17 Oct 2012, pp 1569–1574. doi:10.1109/ICSMC.2012.6377960
175. Luo J, Bridges SM (2000) Mining fuzzy association rules and fuzzy frequency episodes for intrusion detection. Int J Intell Syst 15(8):687–703
176. Mehdi MSZ, Bensebti AAaM (2007) A Bayesian networks in intrusion detection systems. J Comput Sci 3(5):259–265
177. Ma J, Perkins S (2003) Time-series novelty detection using one-class support vector machines. In: Proceedings of the international joint conference on neural networks, Portland, 20–24 July 2003, vol 3, pp 1741–1745, vol 1743. doi:10.1109/IJCNN.2003.1223670
178. Ma Y (2010) The intrusion detection system based on fuzzy association rules mining. In: 2nd international conference on computer engineering and technology (ICCET), Chengdu, China, 16–18 April 2010, vol 7, pp V7-667–V7-672. doi:10.1109/iccet.2010.5485674
179. Mahoney MV, Chan PK (2002) Learning nonstationary models of normal network traffic for detecting novel attacks. Paper presented at the proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, Edmonton, Alberta, Canada
180. Mamei M, Nagpal R (2007) Macro programming through Bayesian networks: distributed inference and anomaly detection. In: Fifth annual IEEE international conference on pervasive computing and communications (PerCom ’07). White Plains, New York, USA, 19–23 March 2007, pp 87–96. doi:10.1109/PERCOM.2007.19
181. Manganaris S, Christensen M, Zerkle D, Hermiz K (2000) A data mining analysis of RTID alarms. Comput Netw 34(4):571–577
182. Martinez CA, Echeverri GI, Sanz AGC (2010) Malware detection based on cloud computing integrating intrusion ontology representation. In: IEEE Latin-American conference on communications (LATINCOM’10), Belem, Brazil, 15–17 Sept 2010, pp 1–6. doi:10.1109/LATINCOM.2010.5641013
183. Mathew S, Shah C, Upadhyaya S (2005) An alert fusion framework for situation awareness of coordinated multistage attacks. In: Third IEEE international workshop on information assurance, College Park, MD, USA, 23–24 March 2005, pp 95–104. doi:10.1109/IWIA.2005.3
184. Meng J, Shang H, Bian L (2009) The application on intrusion detection based on K-means cluster algorithm. In: International forum on information technology and applications (IFITA ’09), Chengdu, China, 15–17 May 2009, vol 1, pp 150–152. doi:10.1109/IFITA.2009.34
185. Middlemiss M, Dick G (2003) Feature selection of intrusion detection data using a hybrid genetic algorithm/KNN approach. Design Appl Hybrid Intell Syst 3(1):519–527

186. Min L, Xiaohong L, Shouhe X (2008) An intrusion detection research based on spectral clustering. In: 4th international conference on wireless communications, networking and mobile computing (WiCOM ’08), Dalian, China, 12–14 Oct 2008, pp 1–4. doi:10.1109/WiCom.2008.1100 187. Mitrokotsa A, Dimitrakakis C (2013) Intrusion detection in MANET using classification algorithms: the effects of cost and model selection. Ad Hoc Netw 11(1):226–237. doi:10.1016/j.adhoc.2012.05.006 188. Mohajerani M, Moeini A, Kianie M (2003) NFIDS: a neuro-fuzzy intrusion detection system. In: 10th IEEE international conference on electronics, circuits and systems (ICECS’03), Sharjah, United Arab Emirates, vol 1. IEEE, pp 348–351 189. Mora FJ, Macia F, Garcia JM, Ramos H (2006) Intrusion detection system based on growing grid neural network. In: IEEE Mediterranean electrotechnical conference (MELECON’06), Malaga, Spain. IEEE, pp 839–842 190. Mukkamala S, Janoski G, Sung A (2002) Intrusion detection using neural networks and support vector machines. In: Proceedings of the international joint conference on neural networks (IJCNN’02), Honolulu, Hawaii, vol 2. IEEE, pp 1702–1707 191. Mukkamala S, Sung AH (2002) Identifying key features for intrusion detection using neural networks. In: Proceedings of the 15th international conference on computer communication, Maharashtra, India. 838234: International Council for Computer Communication, pp 1132–1138 192. Mukkamala S, Sung AH, Abraham A (2005) Intrusion detection using an ensemble of intelligent paradigms. J Netw Comput Appl 28(2):167–182. doi:10.1016/j.jnca.2004.01.003 193. Mulay SA, Devale PR, Garje GV (2010) Decision tree based support vector machine for intrusion detection. In: International conference on networking and information technology (ICNIT), Manila, Philippines, 11–12 June 2010, pp 59–63. doi:10.1109/icnit.2010.5508557 194.
Muntean M, Valean H, Miclea L, Incze A (2010) A novel intrusion detection method based on support vector machines. In: 11th international symposium on computational intelligence and informatics (CINTI’11), Hungary. IEEE, pp 47–52 195. Naveen N (2012) Application of relevance vector machines in real time intrusion detection. Int J Adv Comput Sci Appl 3(9):48–53 196. Niu W, Li G, Zhao Z, Tang H, Shi Z (2011) Multi-granularity context model for dynamic Web service composition. J Netw Comput Appl 34(1):312–326. doi:10.1016/j.jnca.2010.07.014 197. Noel S, Jajodia S (2005) Understanding complex network attack graphs through clustered adjacency matrices. In: 21st annual computer security applications conference, AZ, USA, 5–9 Dec 2005, pp 159– 169. doi:10.1109/CSAC.2005.58 198. Noel S, Robertson E, Jajodia S (2004) Correlating intrusion events and building attack scenarios through attack graph distances. In: 20th annual computer security applications conference, Tucson, AZ, USA, 2004, pp 350–359. doi:10.1109/CSAC.2004.11 199. Noel S, Sushil J, O’Berry B, Jacobs M (2003) Efficient minimum-cost network hardening via exploit dependency graphs. In: Proceedings 19th annual computer security applications conference, Orlando, FL USA, 8–12 Dec 2003, pp 86–95. doi:10.1109/CSAC.2003.1254313 200. Nong Y, Yebin Z, Borror CM (2004) Robustness of the Markov-Chain model for Cyber-Attack Detection. IEEE Trans Reliab 53(1):116–123. doi:10.1109/TR.2004.823851 201. Nwanze N, Summerville D (2008) Detection of anomalous network packets using lightweight stateless payload inspection. In: 33rd IEEE conference on local computer networks (LCN’08), Montreal, Que, 14–17 Oct 2008, pp 911–918. doi:10.1109/LCN.2008.4664303 202. Otey M, Parthasarathy S, Ghoting A, Li G, Narravula S, Panda D (2003) Towards NIC-based intrusion detection. In: Proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining, Washington, D.C. 956847: ACM, pp 723–728. 
doi:10.1145/956750.956847 203. Pan ZS, Chen SC, Hu GB, Zhang DQ (2003) Hybrid neural network and C4. 5 for misuse detection. In: International conference on machine learning and cybernetics, Xi’an, China, vol 4. IEEE, pp 2463–2467 204. Panda M, Patra MR (2007) Network intrusion detection using Naïve Bayes. IJCSNS Int J Comput Sci Netw Secur 7(12):259–263 205. Patcha A, Park JM (2005) Detecting denial-of-service attacks with incomplete audit data. In: Proceedings of 14th international conference on computer communications and networks ( ICCCN’05), Washington, DC, USA, 17–19 Oct 2005, pp 263–268. doi:10.1109/ICCCN.2005.1523864 206. Peddabachigari S, Abraham A, Grosan C, Thomas J (2007) Modeling intrusion detection system using hybrid intelligent systems. J Netw Comput Appl 30(1):114–132 207. Peddabachigari S, Abraham A, Thomas J (2004) Intrusion detection systems using decision trees and support vector machines. Int J Appl Sci Comput 2:18–134 208. Peng T, Chen X, Liu H, Chen K (2010) Data reduction for network forensics using manifold learning. In: 2nd international workshop on database technology and applications (DBTA), Wuhan, Hubei, China, 27–28 Nov 2010, pp 1–5. doi:10.1109/DBTA.2010.5659004

209. Pensa RG, Leschi C, Besson J, Boulicaut JF (2004) Assessment of discretization techniques for relevant pattern discovery from gene expression data. In: Proceedings of ACM BIOKDD, Seattle, Washington, USA, vol 4, pp 24–30 210. Phua C, Alahakoon D, Lee V (2004) Minority report in Fraud detection: classification of Skewed Data. ACM SIGKDD Explor Newsl 6(1):50–59 211. Portnoy L (2001) Intrusion detection with unlabeled data using clustering, Accessed (2001) 212. Powell D, Stroud R (2001) Malicious-and accidental-fault tolerance for internet applications conceptual model and architecture. Accessed (2001) 213. Qiao Y, Xin XW, Bin Y, Ge S (2002) Anomaly intrusion detection method based on HMM. Electron Lett 38(13):663–664. doi:10.1049/el:20020467 214. Qin M, Hwang K (2004) Frequent episode rules for intrusive anomaly detection with internet datamining. In: USENIX security symposium, San Diego, CA 215. Qin X (2005) A probabilistic-based framework for Infosec alert correlation, PhD thesis. Georgia Institute of Technology 216. Qin X, Lee W (2004) Attack plan recognition and prediction using causal networks. In: 20th annual computer security applications conference, Tucson, AZ, USA, 6–10 Dec 2004, pp 370–379. doi:10.1109/CSAC.2004.7 217. Qishi W, Ferebee D, Yunyue L, Dasgupta D (2009) An integrated cyber security monitoring system using correlation-based techniques. In: IEEE international conference on system of systems engineering, Albuquerque, NM, May 30 2009–June 3 2009, pp 1–6 218. Qiu H, Eklund N, Hu X, Yan W, Iyer N (2008) Anomaly detection using data clustering and neural networks. In: IEEE international joint conference on neural networks, Hong Kong, China. IEEE, pp 3627–3633 219. Ranganathan A, Campbell RH (2003) A middleware for context-aware agents in ubiquitous computing environments. In: Proceedings of the ACM/IFIP/USENIX international conference on middleware, Rio de Janeiro, Brazil. 1515926: Springer, New York, pp 143–161 220.
Reichle R, Wagner M, Khan MU, Geihs K, Lorenzo J, Valla M, et al. (2008) A comprehensive context modeling framework for pervasive computing systems. In: Proceedings of the 8th IFIP WG 6.1 international conference on distributed applications and interoperable systems, Oslo, Norway. 1789105: Springer, pp 281–295 221. Ren P, Gao Y, Li Z, Chen Y, Watson B (2005) IDGraphs: intrusion detection and analysis using histographs. In: IEEE workshop on visualization for computer security, 2005 (VizSEC 05). IEEE, pp 39–46 222. Ritchey R, O’Berry B, Noel S (2002) Representing TCP/IP connectivity for topological analysis of network security. In: Proceedings of the 18th annual computer security applications conference, Las Vegas, Nevada, 2002, pp 25–31. doi:10.1109/CSAC.2002.1176275 223. Roesch M Snort intrusion detection system. http://www.snort.org. Accessed 22 Dec 2013 224. Roschke S, Feng C, Meinel C (2010) Using vulnerability information and attack graphs for intrusion detection. In: Sixth international conference on information assurance and security (IAS), GA, USA, 23–25 Aug 2010, pp 68–73. doi:10.1109/ISIAS.2010.5604041 225. Rui Z, Yongquan Y, Mingjun C (2009) An intrusion detection algorithm model based on extension clustering support vector machine. In: International conference on artificial intelligence and computational intelligence (AICI’09), Shanghai, China, vol 1. IEEE, pp 15–18 226. Ryan J, Lin MJ, Miikkulainen R (1998) Intrusion detection with neural networks. In: Proceedings of advances in neural information processing systems, Denver, Colorado, USA. Morgan Kaufmann Publishers, pp 943–949 227. Saad S, Traore I (2010) Method ontology for intelligent network forensics analysis. In: Eighth annual international conference on privacy security and trust (PST’10), Ottawa, Ontario, Canada, 17–19 Aug 2010, pp 7–14. doi:10.1109/PST.2010.5593235 228. Sánchez R, Herrero Á, Corchado E (2013) Visualization and clustering for SNMP intrusion detection. 
Cybern Syst 44(6–7):505–532 229. Sang-Hyun O, Jin-Suk K, Yung-Cheol B, Gyung-Leen P, Sang-Yong B (2005) Intrusion detection based on clustering a data stream. In: Third ACIS international conference on software engineering research, management and applications, Michigan, USA, 11–13 Aug 2005, pp 220–227. doi:10.1109/SERA.2005. 49 230. Sang JH, Cho SB (2003) Combining multiple host-based detectors using decision tree. In: Gedeon T, Fung L (eds) Proceedings of 16th Australian conferenceon artificial intelligence, Perth, Australia, 2003/01/01 (vol 2903, Lecture Notes in Computer Science). Springer Berlin, pp 208–220. doi:10.1007/ 978-3-540-24581-0_18 231. Sarasamma ST, Zhu QA, Huff J (2005) Hierarchical Kohonenen net for anomaly detection in network security. IEEE Trans Syst Man Cybern B Cybern 35(2):302–312. doi:10.1109/TSMCB.2005.843274

232. Schölkopf B, Platt JC, Shawe-Taylor JC, Smola AJ, Williamson RC (2001) Estimating the support of a high-dimensional distribution. Neural Comput 13(7):1443–1471. doi:10.1162/089976601750264965 233. Schifanella C, Sapino ML, Sel K, Candan U (2012) On context-aware co-clustering with metadata support. J Intell Inf Syst 38(1):209–239. doi:10.1007/s10844-011-0151-x 234. Schilit B, Adams N, Want R (1994) Context-aware computing applications. In: First workshop on mobile computing systems and applications (WMCSA’94). Santa Cruz, CA, USA. IEEE, pp 85–90 235. Schmidt A, Beigl M, Gellersen H-W (1999) There is more to context than location. Comput Graph 23(6):893–901. doi:10.1016/S0097-8493(99)00120-X 236. Scott SL (2004) A Bayesian paradigm for designing intrusion detection systems. Comput Stat Data Anal 45(1):69–83. doi:10.1016/S0167-9473(03)00177-4 237. Sebyala AA, Olukemi T, Sacks L (2002) Active platform security through intrusion detection using Naive Bayesian network for anomaly detection. In: The London communications symposium. Citeseer, London 238. Sekeh MA, bin Maarof MA (2009) Fuzzy intrusion detection system via data mining technique with sequences of system calls. In: Fifth international conference on information assurance and security (IAS ’09), Xi’An, China, 18–20 Aug 2009, vol 1, pp 154–157. doi:10.1109/IAS.2009.32 239. Shah H, Undercoffer J, Joshi A (2003) Fuzzy clustering for intrusion detection. In: The 12th IEEE international conference on fuzzy systems (FUZZ ’03), St Louis, MO, USA, 25–28 May 2003, vol 2, pp 1274–1278. doi:10.1109/FUZZ.2003.1206614 240. Sharma SK, Pandey P, Tiwari SK, Sisodia MS (2012) An improved network intrusion detection technique based on K-means clustering via Naive Bayes classification. In: International conference on advances in engineering, science and management (ICAESM), EGS Pillay Engineering College, Nagapattinam, 30–31 March 2012, pp 417–422 241.
Shaw DG (2011) Reducing false-positives and false-negatives in security event data using context. https://www.nasa.gov/ppt/583349main_2011_Present_NASA_IT_Summit_Shaw_Reducing_ False_Positives_(2).ppt. Accessed 2011 242. Shekhar RG, Vir VP, Kiran SB (2007) K-Means+ID3: a novel method for supervised anomaly detection by Cascading K-Means clustering and ID3 decision tree learning methods. IEEE Trans Knowl Data Eng 19(3):345–354. doi:10.1109/TKDE.2007.44 243. Sheyner O, Haines J, Jha S, Lippmann R, Wing JM (2002) Automated Generation and Analysis of Attack Graphs. In: IEEE symposium on security and privacy, Oakland, California, USA 2002:273–284. doi:10. 1109/SECPRI.2002.1004377 244. Shokri R, Oroumchian F, Yazdani N (2005) CLUSID: a clustering scheme for intrusion detection improved by information theory. In: 13th IEEE international conference on networks, 16–18 Nov 2005, pp 553–558. doi:10.1109/ICON.2005.1635546 245. Shon T, Moon J (2007) A hybrid machine learning approach to network anomaly detection. Inf Sci 177(18):3799–3821 246. Shun J, Malki HA (2008) Network intrusion detection system using neural networks. In: Fourth international conference on natural computation (ICNC’08), Jinan, China, vol. 5. IEEE, pp 242–246 247. Shyu ML, Chen SC, Sarinnapakorn K, Chang LW (2003) A novel anomaly detection scheme based on principal component classifier. In: Third IEEE international conference on data mining (ICDM’03), Melbourne, Florida, USA, pp 172–179 248. Sinclair C, Pierce L, Matzner S (1999) An application of machine learning to network intrusion detection. In: 15th annual computer security applications conference (ACSAC ’99), Phoenix, AZ, USA, pp 371– 377. doi:10.1109/csac.1999.816048 249. Sindhu S, Geetha S, Kannan A (2012) Decision tree based light weight intrusion detection using a wrapper approach. Expert Syst Appl 39(1):129–141. doi:10.1016/j.eswa.2011.06.013 250. 
Siraj MM, Maarof MA, Hashim SZM (2009) Intelligent clustering with PCA and unsupervised learning algorithm in intrusion alert correlation. In: Fifth international conference on information assurance and security ( IAS ’09), Xi’an, China, 18–20 Aug 2009, vol 1, pp 679–682. doi:10.1109/IAS.2009.261 251. Song J, Takakura H, Kwon Y (2008) A generalized feature extraction scheme to detect 0-Day attacks via IDS alerts. In: Proceedings of the 2008 international symposium on applications and the internet, Urku, Finland, 1442004. IEEE Computer Society, pp 55–61. doi:10.1109/saint.2008.85 252. Song S, Ling L, Manikopoulo C (2006) Flow-based statistical aggregation schemes for network anomaly detection. In: Proceedings of the IEEE international conference on networking, sensing and control (ICNSC’06), Hainan, China. IEEE, pp 786–791 253. Song X, Wu M, Jermaine C, Ranka S (2007) Conditional anomaly detection. IEEE Trans Knowl Data Eng 19(5):631–645. doi:10.1109/tkde.2007.1009 254. Sperotto A, Sadre R, Vliet F, Pras A (2009) A labeled data set for flow-based intrusion detection. In: Nunzi G, Scoglio C, Li X (eds) 9th IEEE international workshop on IP operations and management

(IPOM’09), Venice, Italy, 2009/01/01, vol 5843. Lecture Notes in Computer Science, pp 39–50. doi:10.1007/978-3-642-04968-2_4 255. Sperotto A, Schaffrath G, Sadre R, Morariu C, Pras A, Stiller B An overview of IP flow-based intrusion detection. IEEE Commun Surv Tutor 12(3):343–356 256. Sperotto A, Schaffrath G, Sadre R, Morariu C, Pras A, Stiller B (2010) An overview of IP flow-based intrusion detection. Commun Surv Tutor IEEE 12(3):343–356. doi:10.1109/SURV.2010.032210.00054 257. Stein G, Chen B, Wu AS, Hua KA (2005) Decision tree classifier for network intrusion detection with GA-based feature selection. In: Proceedings of the 43rd annual Southeast Regional Conference, Kennesaw, GA, USA. ACM, pp 136–141 258. Steinwart I, Hush D, Scovel C (2006) A classification framework for anomaly detection. J Mach Learn Res 6(1):211–232 259. Tabia K, Benferhat S, Leray P, Mé L (2011) Alert correlation in intrusion detection: combining AI-based approaches for exploiting security operators’ knowledge and preferences. In: Security and artificial intelligence (SecArt) 260. Takeuchi J-I, Yamanishi K (2006) A unifying framework for detecting outliers and change points from time series. IEEE Trans Knowl Data Eng 18(4):482–492 261. Tang P, Jiang R, Zhao M (2010) Feature selection and design of intrusion detection system based on Kmeans and triangle area support vector machine. In: Second international conference on future networks (ICFN’10), Hainan, China. IEEE, pp 144–148 262. Tao L, Ai-ling Q, Yuan-bin H, Xin-tan C (2008a) Method for anomaly detection based on classifier with time function. In: IEEE international conference on industrial technology (ICIT’08). Chengdu, China, 21–24 April 2008, pp 1–4. doi:10.1109/ICIT.2008.4608512 263. Tao L, Ailing Q, Yuanbin H, Xintan C (2008b) Method for network anomaly detection based on bayesian statistical model with time slicing. In: 7th world congress on intelligent control and automation (WCICA’08), Chongqing, China, 25–27 June 2008, pp 3359–3362. doi:10.1109/WCICA.2008.4593458 264. Te-Shun C, Yen KK (2007) Fuzzy belief k-nearest neighbors anomaly detection of user to root and remote to local attacks. In: IEEE SMC information assurance and security workshop (IAW ’07), West Point, New York, 20–22 June 2007, pp 207–213. doi:10.1109/IAW.2007.381934 265. Te-Shun C, Yen KK, Pissinou N, Makki K (2007) Fuzzy belief reasoning for intrusion detection design. In: Third international conference on intelligent information hiding and multimedia signal processing (IIHMSP’07), Kaohsiung, Taiwan, 26–28 Nov 2007, pp 621–624. doi:10.1109/IIHMSP.2007.4457786 266. Thottan M, Ji C (2003) Anomaly detection in IP networks. IEEE Trans Signal Process 51(8):2191–2204 267. Tombini E, Debar H, Me L, Ducasse M (2004) A serial combination of anomaly and misuse IDSs applied to HTTP traffic. In: Proceedings of the 20th annual computer security applications conference, Tucson, Arizona, USA. 1038335: IEEE Computer Society, pp 428–437. doi:10.1109/csac.2004.4 268. Tsai CF, Hsu YF, Lin CY, Lin WY (2009) Intrusion detection by machine learning: a review. Expert Syst Appl 36(10):11994–12000 269. Tylman W (2008a) Anomaly-based intrusion detection using bayesian networks. In: Third international conference on dependability of computer systems (DepCos-RELCOMEX ’08), Szklarska Poreba, Poland, 26–28 June 2008, pp 211–218. doi:10.1109/DepCoS-RELCOMEX.2008.52 270. Tylman W (2008b) Misuse-based intrusion detection using bayesian networks. In: International conference on dependability of computer systems, Szklarska Poreba, Poland, pp 203–210 271. Ukil A (2010) Application of Kolmogorov complexity in anomaly detection. In: 16th Asia-Pacific conference on communications (APCC), Auckland, New Zealand, Oct 31 2010–Nov 3 2010, pp 141–146. doi:10.1109/APCC.2010.5679753 272. Vapnik V (1999) The nature of statistical learning theory, 2nd edn. Springer, New York 273. Viinikka J, Debar H, Mé L, Lehikoinen A, Tarvainen M (2009) Processing intrusion detection alert aggregates with time series modeling. Inf Fusion 10(4):312–324 274. Voelker GM, Bershad BN (1994) Mobisaic: an information system for a mobile wireless computing environment. In: Workshop on mobile computing systems and applications, California, USA, pp 185–190. doi:10.1109/mcsa.1994.513481 275. Vorobiev A, Jun H (2006) Security attack ontology for Web services. In: Second international conference on semantics, knowledge and grid (SKG ’06), Guangxi, China, 1–3 Nov 2006, pp 42–48. doi:10.1109/SKG.2006.85 276. Wagner D, Soto P (2002) Mimicry attacks on host-based intrusion detection systems. In: Proceedings of the 9th ACM conference on computer and communications security, Berlin, German. ACM, pp 255–264 277. Wan L, Shengfeng T (2009) Preprocessor of intrusion alerts correlation based on ontology. In: WRI international conference on communications and mobile computing (CMC ’09), Yunnan, China, 6–8 Jan 2009, pp 460–464. doi:10.1109/CMC.2009.63

278. Wang G, Hao J, Ma J, Huang L (2010) A new approach to intrusion detection using artificial neural networks and fuzzy clustering. Expert Syst Appl 37(9):6225–6232. doi:10.1016/j.eswa.2010.02.102 279. Wang K, Stolfo S (2004) Anomalous payload-based network intrusion detection. In: Recent advances in intrusion detection, Sophia Antipolis, France. Springer, pp 203–222 280. Wang W, Battiti R (2006) Identifying intrusions in computer networks with principal component analysis. In: The first international conference on availability, reliability and security, Vienna, Austria. IEEE, pp 8–15 281. Wang W, Guan X, Zhang X (2004) A novel intrusion detection method based on principle component analysis in computer security. In: IEEE international symposium on neural networks in computer security, Dalian, China. IEEE, pp 88–89 282. Wang X, He F (2006) Improving intrusion detection performance using rough set theory and association rule mining. In: International conference on hybrid information technology (ICHIT ’06), Jeju Island, Korea, 9–11 Nov. 2006, vol 2, pp 114–119. doi:10.1109/ichit.2006.253599 283. Wei W, Daniels TE (2005) Building evidence graphs for network forensics analysis. In: 21st Annual computer security applications conference, AZ, USA, 5–9 Dec 2005, p 11, 266. doi:10.1109/CSAC.2005.14 284. Weller-Fahy DJ, Borghetti BJ, Sodemann AA (2015) A survey of distance and similarity measures used within network intrusion anomaly detection. IEEE Commun Surv Tutor 17(1):70–91 285. Wenge R, Kecheng L, Lin L (2008) Association rule based context modeling for web service discovery. In: 10th IEEE conference on e-commerce technology, Washington, DC, 21–24 July 2008, pp 299–304. doi:10.1109/CECandEEE.2008.137 286. Wenke L, Stolfo SJ, Mok KW (1999) A data mining framework for building intrusion detection models. In: Proceedings of the IEEE symposium on security and privacy, Oakland, California 1999:120–132.
doi:10.1109/secpri.1999.766909 287. Wentao F, Bouguila N, Ziou D (2011) Unsupervised anomaly intrusion detection via localized Bayesian feature selection. In: IEEE 11th international conference on data mining (ICDM’11), Vancouver, Canada, 11–14 Dec 2011, pp 1032–1037. doi:10.1109/ICDM.2011.152 288. White RW, Bailey P, Chen L (2009) Predicting user interests from contextual information. In: Proceedings of the 32nd international ACM SIGIR conference on research and development in information retrieval. ACM, pp 363–370 289. Williams G, Baxter R, He H, Hawkins S, Gu L (2002) A comparative study of RNN for outlier detection in data mining. In: Proceedings of IEEE international conference on data mining (ICDM’02), Maebashi City, Japan. IEEE, pp 709–712 290. Winter P, Hermann E, Zeilinger M (2011) Inductive intrusion detection in flow-based network data using one-class support vector machines. In: 4th IFIP international conference on new technologies, mobility and security (NTMS ’11), Paris, France. IEEE, pp 1–5 291. Wu N, Zhang J (2003) Factor analysis based anomaly detection. In: IEEE systems, man and cybernetics society information assurance workshop, West Point, New York, USA. IEEE, pp 108–115 292. Wuling R, Jinzhu C, Xianjie W (2009) Application of network intrusion detection based on fuzzy Cmeans clustering algorithm. In: Third international symposium on intelligent information technology application (IITA’09), Nanchang, China, 21–22 Nov 2009, vol 3, pp 19–22. doi:10.1109/IITA.2009.269 293. Xiao L, Chen Y, Chang CK (2014) Bayesian model averaging of Bayesian network classifiers for intrusion detection. In: 9th IEEE international workshop on security, trust, and privacy for software applications”, pp 21–15 294. Xiaolin W, Chou PA, Xiaohui X (2000) Minimum conditional entropy context quantization. In: IEEE international symposium on information theory, Sorrento, Italy, 2000, p 43. doi:10.1109/isit.2000. 866333 295. 
Xiaorong C, Shanshan W (2010) A real-time hybrid intrusion detection system based on principle component analysis and self organizing maps. In: Sixth international conference on natural computation (ICNC’10), Shandong, China, 10–12 Aug 2010, vol 3, pp 1182–1185. doi:10.1109/ICNC.2010.5583654 296. Xie P, Li JH, Ou X, Liu P, Levy R (2010) Using Bayesian networks for cyber security analysis. In: IEEE/IFIP international conference on dependable systems and networks (DSN), Chicago, IL, pp 211– 220 297. Xu J, Croft WB (2000) Improving the effectiveness of information retrieval with local context analysis. ACM Trans Inf Syst (TOIS) 18(1):79–112 298. Xu J, Shelton CR (2010) Intrusion detection using continuous time bayesian networks. J Artif Intell Res 39(1):745–774 299. Xuedou Y (2009) Research on active defence technology with host intrusion based on K-nearest neighbor algorithm of kernel. In: Fifth international conference on information assurance and security (IAS’09), Xi’an, China, 18–20 Aug 2009, vol 1, pp 411–414. doi:10.1109/IAS.2009.255

300. Ye C, Wei N, Wang T, Zhang Q, Zhu X (2009a) The research on the application of association rules mining algorithm in network intrusion detection. In: First international workshop on education technology and computer science (ETCS ’09), Wuhan, China, 7–8 March 2009, vol 2, pp 849–852. doi:10.1109/etcs.2009.451 301. Ye C, Zhang Q, Zhou J, Wei N, Zhu X, Wang T (2009b) Improvement of association rules mining algorithm in wireless network intrusion detection. In: International conference on computational intelligence and natural computing, Wuhan, China, 6–7 June 2009, vol 2, pp 413–416. doi:10.1109/cinc.2009.19 302. Ye D, Huiqiang W, Yonggang P (2004) A hidden markov models-based anomaly intrusion detection method. In: Fifth world congress on intelligent control and automation (WCICA’04), Hangzhou, China, 15–19 June 2004, vol 5, pp 4348–4351. doi:10.1109/WCICA.2004.1342334 303. Ye D, Tong W (2008) An anomaly intrusion detection method based on shell commands. In: IEEE international symposium on knowledge acquisition and modeling workshop (KAM’08), Wuhan, China, 21–22 Dec 2008, pp 798–801. doi:10.1109/KAMW.2008.4810611 304. Yeung DY, Ding Y (2003) Host-based intrusion detection using dynamic and static behavioral models. Pattern Recognit 36(1):229–243 305. Yoshida K (2003) Entropy based Intrusion Detection. In: IEEE Pacific RIM Conference on Communications, Computers and Signal Processing (PACRIM’03), Victoria, B.C., Canada, 28–30 Aug 2003, vol 2, pp 840–843. doi:10.1109/PACRIM.2003.1235912 306. Yu Y, Wei Y, Fu-Xiang G, Ge Y (2006) Anomaly Intrusion Detection Approach Using Hybrid MLP/CNN Neural Network. In: Kong H (ed) Sixth international conference on intelligent systems design and applications (ISDA’06), Wroclaw, Poland. IEEE, pp 1095–1102 307. Yun Y, Guyu H, Shize G, Jun L (2010) Imbalanced classification algorithm in Botnet detection.
In: First international conference on pervasive computing signal processing and applications (PCSPA’10), Gjøvik, Norway, 17–19 Sept 2010, pp 116–119. doi:10.1109/PCSPA.2010.37 308. Zanero S, Savaresi SM (2004) Unsupervised learning techniques for an intrusion detection system. In: Proceedings of the 2004 ACM symposium on applied computing.ACM, pp 412–419 309. Zhang J, Zulkernine M (2006) A hybrid network intrusion detection technique using random forests. In: The first international conference on availability, reliability and security (ARES’06), Vienna University of Technology, Austria. IEEE, pp 262–269 310. Zhang Z, Li J, Manikopoulos C, Jorgenson J, Ucles J (2001) HIDE: a hierarchical network intrusion detection system using statistical preprocessing and neural network classification. In: IEEE workshop on information assurance and security, West Point, NY, pp 85–90 311. Zhang Z, Shen H (2004) Online training of SVMs for real-time intrusion detection. In: 18th international conference on advanced information networking and applications(AINA’04), Fukuoka, Japan, vol 1. IEEE, pp 568–573 312. Zhang Z, Shen H (2005) Application of online-training SVMs for real-time intrusion detection with different considerations. Comput Commun 28(12):1428–1442 313. Zhao W, Ma H, He Q (2009) Parallel k-means clustering based on mapreduce. In: IEEE international conference on cloud computing. Springer, pp 674–679 314. Zheng K, Qian X, Zhou Y, Jia L (2009) Intrusion detection using ISOMAP and support vector machine. In: International conference on artificial intelligence and computational intelligence (AICI’09), Shanghai, China, vol 3. IEEE, pp 235–239 315. Zhong LL, Ming ZY, Bin ZY (2010) Network intrusion detection method by least squares support vector machine classifier. In: 3rd IEEE international conference on computer science and information technology (ICCSIT’10), Beijing, China, vol 2. IEEE, pp 295–297 316. 
Zhou H, Meng X, Zhang L (2007) Application of support vector machine and genetic algorithms to network intrusion detection. In: International conference on wireless communications, networking and mobile computing (WiCom 07), Shanghai, China. IEEE, pp 2267–2269 317. Zhou M, Huang H, Wang Q (2012) A graph-based clustering algorithm for anomaly intrusion detection. In: 7th international conference on computer science & education (ICCSE’12), Melbourne, Australia, 14–17 July 2012, pp 1311–1314. doi:10.1109/ICCSE.2012.6295306 318. Zimmermann A, Lorenz A, Oppermann R (2007) An operational definition of context. In: Proceedings of the 6th international and interdisciplinary conference on modeling and using context (Context’07), Roskilde University, Denmark, pp 558–571

Ahmed Aleroud is an Assistant Professor of Computer Information Systems at Yarmouk University in Jordan. He holds degrees in Information Systems (Ph.D. and M.S.) from the University of Maryland, Baltimore County, and Software Engineering (B.S.) from Hashemite University in Jordan. He was a Visiting Associate Research Scientist at the University of Maryland, Baltimore County, working on cyber-security research projects. His research focuses on cyber-security, data mining for privacy-preserving network data analytics, and detection of social engineering attacks.

George Karabatis is an Associate Professor of Information Systems at the University of Maryland, Baltimore County (UMBC). He holds degrees in Computer Science (Ph.D. and M.S.) and Mathematics (B.S.). Before joining UMBC, he was a Research Scientist at Telcordia Technologies (formerly Bellcore). His research interests are in cyber-security, specifically in intrusion detection utilizing intelligent manipulation of information through semantics and context, semantic information integration, workflow systems, big data, multi-database systems, and concurrency control.
