Int J Syst Assur Eng Manag DOI 10.1007/s13198-015-0363-5
ORIGINAL ARTICLE
A novel requirements engineering approach for designing data warehouses Manoj Kumar1 • Anjana Gosain2 • Yogesh Singh3
Received: 7 November 2013 The Society for Reliability Engineering, Quality and Operations Management (SREQOM), India and The Division of Operation and Maintenance, Lulea University of Technology, Sweden 2015
Abstract Most requirements engineering (RE) approaches for data warehouses (DW) do not distinguish between the early and late RE phases, unlike recent RE approaches for transactional systems. They capture information requirements rather than decision requirements, which are the main focus of this article. In this paper we present a novel RE approach for DW consisting of three phases, namely (i) early RE, (ii) late RE, and (iii) conceptual design. The early RE phase captures the ‘whys’ that underlie decision requirements, and the late RE phase captures ‘what’ the DW system should do. The conceptual design evolves from the early and late requirements. All the models produced (early requirements model, late requirements model and multidimensional conceptual model) are interlinked and thus support traceability among each other. Finally, the proposed approach is demonstrated through a case study of a typical Indian public sector bank and is supported by a CASE tool.
Keywords Agent · Early requirements engineering · Late requirements engineering · Conceptual design · Data warehouse
Manoj Kumar: [email protected]
Anjana Gosain: [email protected]
Yogesh Singh: [email protected]
1 Ambedkar Institute of Advanced Communication Technologies & Research, Delhi, India
2 Guru Gobind Singh Indraprastha University, Delhi, India
3 Netaji Subhas Institute of Technology, Delhi, India
1 Introduction
A data warehouse (DW) provides managers with many years of historical information for decision making and forms the core of most current decision-support systems (Jeusfield et al. 1998). Studies from the Standish Group show that one third of DW projects fail to meet their objectives. This has been attributed to DW design approaches that deal with different kinds of data models (conceptual, logical and physical) without taking requirements into account explicitly (Salinesi and Gam 2009). Other researchers (Cabibbo and Torlone 1998; Lehner et al. 1998; Tryfona et al. 1999; Vassiliadis 2000) likewise attribute the failure of DW projects to poor consideration of the requirements engineering (RE) phase. Thus, considerable RE effort and planning is essential for a successful DW implementation (Shiefer et al. 2002). In the recent past, various methodologies for DW (Boehnlein and Vom Ende 2000; Bonifati et al. 2001; Frendi and Salinesi 2003; Mazon et al. 2005, 2007; Paim and Castro 2003; Shiefer et al. 2002; Winter and Strauch 2003, 2004) have been proposed that give prime importance to the RE phase. These research trends have established RE as the starting phase of the DW system development life cycle (SDLC) (Golfarelli and Rizzi 1999; Husemann et al. 2000; Prakash and Gosain 2003).
For transactional systems, the RE phase has recently been divided into two phases, namely early RE and late RE (Yu 1995, 1997). According to Yu, the early RE phase aims to model and analyze stakeholders’ interests and how they might be addressed, or compromised, by various system and environment alternatives. The emphasis in the early RE phase is on understanding the ‘whys’ that underlie system requirements (Yu and Mylopoulos 1994), i.e.
early requirements, whereas the late RE phase focuses on the precise and detailed specification of ‘what’ the system should do, i.e. late requirements. However, most of the RE approaches for DW (Boehnlein and Vom Ende 2000; Bonifati et al. 2001; Frendi and Salinesi 2003; Paim and Castro 2003; Shiefer et al. 2002; Winter and Strauch 2003, 2004) did not distinguish between the early and late RE phases. They mainly focused on specifying ‘what’ the DW system should do, i.e. they captured the information to be maintained in the DW. Very few goal-driven and model-driven approaches for DW (Giorgini et al. 2007; Mazon et al. 2005, 2007) have focused on the ‘whys’ that underlie DW information requirements, i.e. captured the rationale of information. However, they did not capture decision requirements (Prakash and Gosain 2008; Prakash et al. 2010). Prakash and Gosain developed the notion of a decision requirement as the pair (decision, information), where ‘information’ is what the decision maker requires to assess whether the ‘decision’ is to be taken or not. This motivates us to capture decision requirements for DW during the RE phase. Having reviewed these approaches, the following points are notable:
(1) Decision requirements were captured without modeling stakeholders and their dependencies (Prakash and Gosain 2008; Prakash et al. 2010), i.e. early requirements were not modeled explicitly.
(2) The early RE phase captured information requirements rather than decision requirements. Besides, the information requirements models were integrated with the conceptual design phase (Giorgini et al. 2007; Mazon et al. 2005, 2007) without modeling late requirements explicitly.
(3) The transition path from the early RE phase to the conceptual design phase through the late RE phase is missing.
(4) CASE tool support for the RE phase is missing too.
We therefore present in this paper a novel RE approach for DW consisting of three phases, namely (i) early RE, (ii) late RE, and (iii) conceptual design, supported by a CASE tool, as enumerated below.
(1) During the early RE phase, organization modeling and decision modeling activities shall be carried out to capture the ‘whys’ that underlie DW decision requirements, i.e. the rationale of the information to be maintained in the DW.
(2) During the late RE phase, an information modeling activity shall be carried out to capture ‘what’ the DW system should do, i.e. the information to support the decisions.
(3) A transition path shall be evolved to obtain the multidimensional (MD) conceptual model from the early and late requirements models.
(4) A CASE tool shall be developed to support the above modeling propositions.
The paper is organized as follows. Section 2 briefly reviews the relevant literature. Section 3 proposes the DW SDLC adopted by the proposed approach. Section 4 proposes the agent-goal-decision-information (AGDI) model, which underlies our approach. Section 5 discusses the proposed novel RE approach for DW. In Sect. 6, the proposed approach is applied to a bank. Section 7 describes a prototype CASE tool developed to support the proposed approach. Section 8 compares our approach with prominent approaches for DW design. Finally, Sect. 9 presents the conclusion.
2 Relevant literature survey
Early DW development did not emphasize RE. Inmon (1996) argued that, unlike in the classical SDLC, the requirements of a DW are usually the last thing to be discovered. In recent years, this view has changed. An SDLC for DW that gives primacy to RE was proposed by Golfarelli and Rizzi (1999). There, the conceptual schema was designed by producing a fact schema for each fact, and the fact schemata could be derived from an E/R schema using an algorithmic procedure. Another SDLC for DW was proposed by Husemann et al. (2000), in which requirements analysis was carried out together with domain experts, and the global operational E/R schema was analyzed to determine interesting measures, dimensions, and initial OLAP queries. The conceptual design phase subsequently operated on the results of requirements analysis, determined functional dependencies and produced a graphical MD schema. The logical design phase converted the conceptual schema into a logical one with respect to the targeted logical data model. Finally, the DW design process ended in a physical implementation of the logical schemata with respect to the individual properties of the target system. In yet another SDLC for DW (Prakash and Gosain 2003), it was assumed that DW development must be rooted in the real work, i.e. the decisions of interest that the organization wants to support. Consequently, RE discovered the information contents of a DW through a creative process that itself discovered the decisions of interest. It was assumed that these decisions were themselves based on discovered organizational goals. The set of goals, decisions and information was organized in a Goal-Decision-Information (GDI) schema. This approach de-emphasized the role of the ER schema in the RE phase, and the ER schema was produced as an output of the RE phase. However, in that SDLC, the relevant stakeholders of the organization were not all explicitly modeled. This motivated us to propose an improved SDLC for DW, which we present in the next section.
In the recent past, several approaches to DW design have been suggested that give prime importance to the RE phase. Some of these RE approaches elicited requirements from business processes or users. Boehnlein and Vom Ende (2000) presented an approach in which DW requirements were elicited from business process models. In this approach, DW measures and dimensions were determined from an initial study of the goals and services of the organization. It was similar to the data-driven approach (Inmon 1996), as it focused on the subject of monitoring rather than on the reasons for which monitoring was required. Paim and Castro (2003) proposed the DW requirements definition (DWARF) approach, which adapts a traditional RE process for information requirements definition and management and is supported by a CASE tool. In (Frendi and Salinesi 2003), DW requirements could be elicited using business process requirements and strategic decision processes. In this approach, DW models were produced using a combination of DW requirements and as-is data models. Once produced, DW data models could also be used to elicit new requirements. Another four-step approach based on business processes was proposed by Kimball and Ross (2002). They determined facts and dimensions from the initial choice of a business process, defined as a major operational process in the organization that is supported by a transactional system. Winter and Strauch (2003, 2004) proposed a user-driven, bottom-up approach that started from the information requirements of different business users. These requirements were then integrated and made consistent to obtain a unique set of multidimensional (MD) schemata. These approaches did not capture decision requirements, i.e. decisions were not discovered and related to the information. However, they pointed out that a detailed analysis of business processes is not always a good starting point for DW design.
To overcome these problems, goal-driven RE approaches have been reported in the literature, as described in the following. Bonifati et al. (2001) presented a goal-driven approach to extract data marts from the enterprise-wide information system. It adopted the top-down Goal-Question-Metric approach for determining goals. The goals discovered were aggregated and refined in abstraction sheets, from which ideal star schemata were extracted. The ideal star schemata were then matched with the candidate star schemata extracted from the ER schemata of the operational sources. Shiefer et al. (2002) proposed a holistic approach for managing the requirements of DW systems. This approach was based on goal modeling at several levels of abstraction. However, they did not spell out any guidelines for properly specifying the DW requirements for the onward phases of DW design. All these RE approaches for DW captured information requirements rather than decision requirements. They did not distinguish between the early and late RE phases, unlike a few recent RE approaches for DW (Giorgini et al. 2007; Mazon et al. 2007; Prakash and Gosain 2008), as described in the following.
Giorgini et al. (2007) presented a goal-oriented scheme for requirements analysis based on early requirements modeling (Bresciani et al. 2004). They derived an MD conceptual schema semi-automatically from early information requirements models. Mazon et al. (2005) defined requirements with three types of hierarchical goals: strategic, decision and information. Strategic goals were high-level goals, decision goals indicated how to achieve strategic goals, and information goals defined the necessary sources of information. They presented several guidelines to support the specification of i*-based (Yu 1995, 1997) strategic dependency and strategic rationale models for DW and the derivation of the MD conceptual model. Further, Mazon et al. (2007) integrated their approach with a model driven architecture based approach (Mazon and Trujillo 2006) for the development of DW. They specified information requirements as a computation independent model and transformed it into an MD conceptual model as a platform independent model. Their approach supports traceability from the information requirements model to the conceptual MD model. However, in these approaches decision requirements were not captured, i.e. decisions were not discovered and related to the goals and the information to be maintained in the DW. Prakash and Gosain (2008) suggested another RE approach for DW to capture decision requirements. They started with the decisional goals and worked their way through to the decisional information needed to fulfill these goals. The aim of a decisional goal was to identify a set of relevant decisions. Their emphasis was on discovering the DW information to support the set of relevant decisions, i.e. late requirements were captured, but stakeholders and their dependencies, i.e. early requirements, were not modeled.
They did not provide any guidance for obtaining the MD conceptual model from decision requirements organized as a GDI schema. This motivated us to propose a novel RE approach for DW that models decision requirements in terms of early and late requirements models and then proceeds towards the conceptual design phase. This approach is presented in Sect. 5.
Finally, we review the CASE tools that have been developed to support DW design methods. In ADAPT (Bulos 1999) and in GOLD (Lujan-Mora et al. 2002), the conceptual schema for DW was drawn by the designers using a demand-driven approach, and no support for RE was given. In WAND (Golfarelli et al. 2002), the conceptual schema was derived semi-automatically from the source schemata; thus a data-driven approach to DW design was supported. In the recent past, some RE approaches for DW have been supported by CASE tools (Paim and Castro 2003; Giorgini et al. 2007) for modeling information requirements. To the best of our knowledge, only the RE approaches in (Prakash and Gosain 2008, 2010) capture decision requirements and are supported by CASE tools. This motivated us to develop a prototype CASE tool to support the novel RE approach for DW proposed in this paper. A hands-on session with our CASE tool is presented in Sect. 7.
3 The proposed new SDLC for DW
In the following we describe a new SDLC for DW that consists of five phases, namely early RE, late RE, conceptual design, logical design and physical design, as shown in Table 1. The early RE phase identifies the various agents, their goals and the decisions for achieving those goals. The late RE phase focuses on eliciting the relevant information to support the decisions identified in the early RE phase. The output of early RE is represented as organization models and decision models, whereas the output of late RE is represented as an information model, as shown in Table 1. The conceptual design phase aims to obtain the MD conceptual model, i.e. the conceptual schema. The early and late requirements models are used in the conceptual design phase to determine facts, dimensions and measures. These facts, dimensions and measures are organized in the form of an MD conceptual model. The early and late requirements models are based on our AGDI model, which is detailed in the next section. The logical design and physical design phases are beyond the scope of this work.
4 The proposed AGDI model for requirements engineering of data warehouse
To capture the stakeholders of an organization, we propose an AGDI model that introduces the notion of agent into the existing GDI model (Prakash and Gosain 2003). This AGDI model is presented below.
4.1 A relevant new classification of agents
Agents have been used to model the organization for which a computer-based system is being developed (Yu 1995; Donzelli 2000). RE approaches based on the fundamental notions of agency, i.e. agent, goal, and intentional dependency, have been recognized as leading towards a more homogeneous and natural software engineering process, ranging from high-level organizational needs to system deployment (Bresciani et al. 2004). In the present work, an agent models a stakeholder who may play one or more roles in the decision making activities of the organization. In the following, a possible new classification of agents is therefore described to support the approach of this paper. An agent may be classified as a simple agent or a complex agent. Simple agents capture those stakeholders who may play various roles. Complex agents capture those stakeholders that may contain other stakeholders. For instance, the Bank for which the DW is to be developed may be modeled as a complex agent. This complex agent may further contain complex agents representing various departments (finance, operations, marketing, etc.) and simple agents representing employees, customers, etc., as shown through the ‘contains’ relationship in Fig. 1. The refinement of complex agents continues until all complex agents have been refined into simple agents. Simple agents may also be classified on the basis of the following roles they play.
(1) Goal-refining agent converts complex goals into simple goals, shown as the ‘refines’ relationship in Fig. 1,
(2) Decision-suggesting agent identifies relevant decisions for achieving a simple goal, shown as the ‘suggests’ relationship in Fig. 1,
(3) Decision-refining agent decomposes a complex decision into simple decisions, shown as the ‘decomposes’ relationship in Fig. 1,
Table 1 The proposed new SDLC for DW
Phase               Output
Early RE            Organization model, refined organization model and decision model showing dependencies among agents, i.e. early requirements models
Late RE             Information model showing dependencies among agents, i.e. late requirements model
Conceptual design   DW multidimensional conceptual schema to be derived from early and late requirements models
Logical design      DW logical schema from DW multidimensional conceptual model
Physical design     DW physical model from DW logical model
Fig. 1 The proposed agent-goal-decision-information (AGDI) model for requirements engineering of a data warehouse
(4) Information identification agent explores information to support the decisions, shown as the ‘identifies’ relationship in Fig. 1, and
(5) Information-sources identification agent discovers the sources that may provide the information, shown as the ‘provides’ relationship in Fig. 1.
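The simple/complex classification and the five roles listed above can be illustrated with a minimal Python sketch. The names Agent, Role and simple_agents, as well as the agents ‘Finance’, ‘Finance Officer’ and ‘Customer’ used in the example, are assumptions made for illustration only; they are not taken from the CASE tool of this paper.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    """Roles of simple agents; values are the relationship names of Fig. 1."""
    GOAL_REFINING = "refines"
    DECISION_SUGGESTING = "suggests"
    DECISION_REFINING = "decomposes"
    INFORMATION_IDENTIFICATION = "identifies"
    INFORMATION_SOURCES_IDENTIFICATION = "provides"


@dataclass
class Agent:
    name: str
    roles: set = field(default_factory=set)
    contains: list = field(default_factory=list)  # non-empty for a complex agent

    @property
    def is_complex(self) -> bool:
        return bool(self.contains)


def simple_agents(agent):
    """Refine a complex agent top-down until only simple agents remain."""
    if not agent.is_complex:
        return [agent]
    leaves = []
    for child in agent.contains:
        leaves.extend(simple_agents(child))
    return leaves


# Hypothetical example: a bank containing one department and one customer.
officer = Agent("Finance Officer", {Role.DECISION_SUGGESTING})
finance = Agent("Finance", contains=[officer])
bank = Agent("Bank", contains=[finance, Agent("Customer")])
leaf_names = [a.name for a in simple_agents(bank)]
```

Here a complex agent is simply an agent whose ‘contains’ list is non-empty, so the refinement stops exactly when every remaining agent is simple.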
Agents may also be classified as supporting agents and opposing agents, to model stakeholders who may have supporting or conflicting goals. For instance, a bank as an agent may have the goal ‘Increase Revenue’, and a regulatory agency may have the goal ‘Provide Banking Facilities in Rural Areas’ as a social objective. Providing banking facilities in rural areas may generate less revenue than in urban areas. These two agents oppose each other, in that achieving the goal of one agent discourages achievement of the goal of the other. The agents may interact with each other to achieve the conflicting goals in a balanced manner. A further classification of agents is into internal agents and external agents. An internal agent models an internal stakeholder of the organization, and an external agent models an external stakeholder of the organization. For instance, a bank, being a complex agent, may contain simple agents representing employees, which are internal agents, whereas regulatory agencies like the Reserve Bank of India may be external agents with respect to the bank under consideration. The external agents may provide various financial and informational resources to the internal agents to control and regulate the decisional activities of the organization. We now present the anticipated dependencies among agents.
4.2 The anticipated dependencies
Agents may achieve their goals by themselves, as shown through the ‘achieves’ relationship in Fig. 1. An agent may also delegate the responsibility of achieving a goal to another agent. This dependency is shown through the ‘delegate’ relationship in Fig. 1. On the basis of the roles played by the agents, this delegate dependency among agents takes the following forms:
(a) The goal dependency, in which one agent may delegate its complex goal to a goal-refining agent.
(b) The decision dependency, in which one agent may delegate its simple goal to a decision-suggesting agent or a decision-refining agent.
(c) The information dependency, in which one agent may delegate its simple decision to an information identification agent or an information providing agent.
In the following section we present our RE approach based on the AGDI model.
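The three delegate dependencies of Sect. 4.2 amount to a lookup from what is delegated and the role of the receiving agent to the kind of dependency. A minimal sketch follows, assuming plain strings for the delegated items and roles; the table name DELEGATION_RULES and the function classify_delegation are hypothetical names introduced here for illustration.

```python
# (delegated item, role of the agent it is delegated to) -> dependency kind,
# transcribed from items (a), (b) and (c) of Sect. 4.2.
DELEGATION_RULES = {
    ("complex goal", "goal-refining"): "goal dependency",
    ("simple goal", "decision-suggesting"): "decision dependency",
    ("simple goal", "decision-refining"): "decision dependency",
    ("simple decision", "information identification"): "information dependency",
    ("simple decision", "information providing"): "information dependency",
}


def classify_delegation(delegated_item: str, delegatee_role: str) -> str:
    """Return the dependency kind, or raise ValueError if none is anticipated."""
    try:
        return DELEGATION_RULES[(delegated_item, delegatee_role)]
    except KeyError:
        raise ValueError(
            f"no anticipated dependency for {delegated_item!r} "
            f"delegated to a {delegatee_role!r} agent"
        ) from None


kind = classify_delegation("simple goal", "decision-suggesting")
```

Combinations that are not listed in Sect. 4.2, such as delegating a complex goal to a decision-suggesting agent, are rejected in this sketch rather than guessed.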
5 The proposed novel requirements engineering approach for DW
In this section we present our proposed RE approach systematically; in the next section we apply it to a case study of a typical Indian public sector bank. The legends of the proposed approach are shown graphically in Fig. 2. All the legends except fact, dimension and measure are based on the AGDI model proposed above. Our novel RE approach for DW consists of three phases, (i) early RE, (ii) late RE and (iii) conceptual design, as shown schematically in Fig. 3 and discussed in the following sub-sections.
Fig. 2 Legends of a novel requirements engineering approach for data warehouse (legend entries: complex agent, simple agent, complex goal, simple goal, complex decision, simple decision, information, fact, dimension, measure, goal dependency, decision dependency, information dependency)
Fig. 3 A novel requirements engineering (RE) approach for a DW
5.1 The early RE phase
Here, two types of modeling activities, namely organization modeling and decision modeling, are carried out to capture the ‘whys’ that underlie decision requirements, as presented in the following sub-sections.
5.1.1 The organization modeling
It consists of two phases, (i) agent analysis and (ii) goal analysis, which are presented below.
• Agent analysis
Here agents and their goals are identified for the organization. Further, the goal dependencies among agents are identified. The agents and their goal dependencies are shown as the organization model. An identified agent may be a simple/complex agent or an internal/external agent. A complex agent is further refined into complex/simple agents, thereby producing a hierarchy of agents. This agent analysis is repeated until all the complex agents are refined into simple agents. The identified goals of simple agents may also be simple or complex goals. The complex goals are refined in the goal analysis, presented in the following.
• Goal analysis
Here the goal-refining agent may refine a complex goal into complex/simple goals and produce a hierarchy of goals. This is repeated until all the complex goals are refined into simple goals. Further, the agents may delegate these simple goals to other agents for further action. Here, the organization model produced during agent analysis is further refined. The refined organization model shows the hierarchy of goals and the goal dependencies among agents. Thereafter, in the decision modeling activity, the agents may suggest the decisions necessary to achieve their simple goals, as discussed in the next sub-section.
5.1.2 The decision modeling
The output of organization modeling drives the decision modeling phase. It involves (i) decision identification, in which agents suggest various relevant decisions to achieve their goals, and (ii) decision analysis, in which complex decisions are refined into simple decisions. This decision modeling activity continues until all simple goals of the refined organization model are exhausted. The output is produced as the decision model. The organization model, refined organization model and decision model thus produced become the rationale for modeling the information to be maintained in the DW, i.e. they capture the ‘whys’ that underlie decision requirements.
5.2 Late RE phase
The output of decision modeling drives this phase. Here, the information modeling activity is carried out, in which agents explore the necessary information, and the sources that provide it, to support the decisions under consideration. Here we capture ‘what’ the DW system should do, i.e. the information to support the decisions.
5.2.1 The information modeling
It consists of two phases: (i) information identification, in which an agent may itself identify the relevant information to support given decisions or may depend on other agents, and (ii) information-resource identification, in which various internal/external agents (persons, systems, organizations, etc.) are identified to provide the information identified in phase (i). The output is produced as the information model.
5.3 Conceptual design phase
The addition of certain MD concepts to the requirements analysis for DWs (Giorgini et al. 2007; Mazon et al. 2007) becomes the basis for the conceptual design phase. Here the purpose is to identify facts, dimensions and measures in an implementation-independent manner. The formal definitions of these concepts can be found in (Golfarelli 2008). In our proposal, the early and late requirements models are used to identify facts, dimensions and measures, which may be organized as an MD conceptual schema as discussed in the following.
5.3.1 Facts identification
A fact is a focus of interest for the DW conceptual design and is not known in advance. The refined organization model produced in Sect. 5.1.1 is used for the identification of facts. Fact identification follows a top-down approach that starts from the complex goal and moves towards simple goals. The fact identified from a complex goal consists of sub-facts identified from the refined complex/simple goals. Fact identification continues until all goals are considered. Thus, a set of facts is obtained. All the possible directions in which a fact may be analyzed are called dimensions, as discussed in the following sub-section.
5.3.2 Dimension identification
A dimension is a fact property that describes a possible coordinate of analysis, i.e. a possible perspective for looking at a fact. A dimension represents a decisional event that may happen for achieving goals in the organization. The decision model produced in Sect. 5.1.2 is used for the identification of dimensions for analyzing a fact. Dimension identification also follows a top-down approach that starts from the complex decision and moves towards simple decisions. The hierarchy of decisions in the decision model may lead to a dimension hierarchy. Dimension identification continues until all the suggested decisions are considered. Here both the refined organization model and the decision model are referred to while exploring the relationship between the identified facts and dimensions.
5.3.3 Measure identification
A measure is an attribute of a fact that needs to be viewed in various contexts for supporting the decisions. These contexts may also be used as dimensions and will augment the dimensions already identified in the preceding sub-section. The measures and their contexts of analysis, i.e. dimensions, are identified from the information model produced in Sect. 5.2.1. In the following section we present a case study of a bank based on the above.
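The top-down identification steps of Sect. 5.3 can be sketched as three small tree walks that also record where each candidate came from, which is the traceability the approach relies on. This is a sketch under stated assumptions, not the algorithm of the CASE tool: the classes Goal and Decision and the three functions are hypothetical, choosing the final names of facts, dimensions and measures remains an analyst's judgment, and the information item ‘ATM transaction count’ is invented for the example (the goals and decisions are those of the bank case study in Sect. 6).

```python
from dataclasses import dataclass, field


@dataclass
class Decision:
    name: str
    subdecisions: list = field(default_factory=list)
    information: list = field(default_factory=list)  # from the information model


@dataclass
class Goal:
    name: str
    subgoals: list = field(default_factory=list)
    decisions: list = field(default_factory=list)  # suggested for simple goals


def identify_facts(goal):
    """One candidate fact per goal, with sub-facts for the refined sub-goals."""
    return {
        "fact_from_goal": goal.name,
        "sub_facts": [identify_facts(g) for g in goal.subgoals],
    }


def identify_dimensions(decision, parent=None):
    """(dimension, parent dimension) pairs; the parents give the hierarchy."""
    pairs = [(decision.name, parent)]
    for sub in decision.subdecisions:
        pairs.extend(identify_dimensions(sub, decision.name))
    return pairs


def identify_measures(decision):
    """(measure, supported decision) pairs collected from the information model."""
    pairs = [(info, decision.name) for info in decision.information]
    for sub in decision.subdecisions:
        pairs.extend(identify_measures(sub))
    return pairs


# 'ATM transaction count' is a hypothetical information item.
atm = Decision("Introduce ATM", information=["ATM transaction count"])
services = Decision("Provide good customer services", subdecisions=[atm])
customers = Goal("Increase customers-base", decisions=[services])
revenue = Goal("Increase total revenue",
               subgoals=[customers, Goal("Deploy cash-in-hand")])

facts = identify_facts(revenue)
dimensions = [p for d in customers.decisions for p in identify_dimensions(d)]
measures = [p for d in customers.decisions for p in identify_measures(d)]
```

Keeping the source goal or decision next to every candidate means a fact, dimension or measure in the MD conceptual model can be traced back to the early or late requirements model that motivated it.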
6 Case study of a bank: an application of the proposed novel RE approach for DW
We consider a public sector bank operating in the university where one of the authors of this article has been working. The Bank wishes to achieve goals such as increasing total revenue. At the same time, a regulatory agency like the Reserve Bank of India may set some goals to be achieved by the Bank, for example supporting social objectives and implementing regulations. The Bank may need to arrange adequate funds for achieving its goals. The Bank wants to develop a DW system that can provide relevant information to support the various decisions for achieving its goals. We capture decision requirements by carrying out the organization modeling activities, in which agent analysis and goal analysis are performed. The agent analysis for the above requirements is done as follows.
6.1 Agent analysis
The agents, their goals and the dependencies among agents for the above Bank are identified as follows.
• Agents identified:
(a) Bank (complex agent)
(b) Regulatory agency (external agent)
• Goals of the agent ‘Bank’ identified:
(a) Increase total revenue
(b) Arrange funds
• Goals of the agent ‘Regulatory Agency’ identified:
(a) Support social objective
(b) Implement financial regulations
Having identified the agents, their dependencies are now identified as below.
The agent ‘Regulatory Agency’ is dependent on the agent ‘Bank’ for achieving its goals ‘Support Social Objective’ and ‘Implement Regulation’, whereas ‘Bank’ is dependent on ‘Regulatory Agency’ for achieving its goal ‘Arrange Funds’, as shown through the delegate dependency in Fig. 4. The Bank has to achieve its goal ‘Increase Total Revenue’ keeping in view the constraints imposed by the agent ‘Regulatory Agency’ in the form of various goals, as shown in Fig. 4. The Bank, being a complex agent, may contain the agents ‘Executive Director’, ‘General Manager Finance’ and ‘Operations Division’. The first two are simple agents, whereas the third is a complex agent, as shown in Fig. 5. The complex agent ‘Operations Division’ is further refined into the simple agents General Manager Operations, Deputy General Manager Operations, and Regional Managers, as shown in Fig. 5. The Bank, being a complex agent, may delegate its goal ‘Increase Total Revenue’ to the simple agent ‘Executive Director’, who may achieve this goal or may delegate it further to another agent. If the goal of an agent is a complex one, it needs to be refined during goal analysis, as discussed in the next sub-section.
6.2 Goal analysis
The goal analysis further refines the organization model. The agent ‘Executive Director’ views the goal ‘Increase Total Revenue’ as a complex goal and refines it into the following sub-goals:
(a) Increase customers-base
(b) Deploy cash-in-hand
The sub-goals (a) and (b) are simple goals. They may be delegated to the agents ‘GM Operations’ and ‘GM Finance’ respectively for their achievement, as shown in Fig. 6. Also, the agent ‘Executive Director’ views the goal ‘Support Social Objective’ of the agent ‘Regulatory Agency’ as a simple goal, which may be delegated to the agent ‘General Manager Operations’ for its achievement. Further, these agents will explore all the decisions (possibilities) for achieving the goals (a) and (b) delegated to them by the agent ‘Executive Director’, as discussed in the next sub-section on decision modeling.
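The refined organization model described above can be written down as plain data and checked mechanically. The sketch below is illustrative only: the structures refinement and delegations and the two helper functions are hypothetical, and the delegator recorded for ‘Support Social Objective’ (the Executive Director) is our reading of the text rather than something it states explicitly.

```python
# Goal refinement from Sect. 6.2: complex goal -> its sub-goals.
refinement = {
    "Increase Total Revenue": ["Increase Customers-Base", "Deploy Cash-in-hand"],
}

# (delegator, goal, delegatee) triples, in the order the text introduces them.
delegations = [
    ("Regulatory Agency", "Support Social Objective", "Bank"),
    ("Bank", "Increase Total Revenue", "Executive Director"),
    ("Executive Director", "Increase Customers-Base", "GM Operations"),
    ("Executive Director", "Deploy Cash-in-hand", "GM Finance"),
    # Delegator assumed; the text only says the goal "may be delegated".
    ("Executive Director", "Support Social Objective", "GM Operations"),
]


def simple_goals(goal):
    """Refine a goal top-down until only simple (unrefined) goals remain."""
    subgoals = refinement.get(goal)
    if not subgoals:
        return [goal]
    leaves = []
    for sub in subgoals:
        leaves.extend(simple_goals(sub))
    return leaves


def responsible_agent(goal):
    """Return the agent the goal was most recently delegated to, or None."""
    agent = None
    for _delegator, delegated_goal, delegatee in delegations:
        if delegated_goal == goal:
            agent = delegatee
    return agent


leaves = simple_goals("Increase Total Revenue")
# Goal analysis is complete when every simple goal has a responsible agent.
complete = all(responsible_agent(g) is not None for g in leaves)
```

A check of this kind shows whether goal analysis has finished, i.e. whether every simple goal is held by an agent who can go on to suggest decisions for it.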
Fig. 4 Organization model for a bank showing agents and their goal dependencies (agents: Bank, Regulatory Agency; goals: Support Social Objective, Implement Regulation, Increase Total Revenue, Arrange Funds; relationships: delegate, achieve)
Fig. 5 Agent analysis: refining complex agents into simple agents (Bank contains Executive Director, General Manager (GM) Finance and Operations Division (complex agent); Operations Division contains GM Operations, Deputy GM Operations and Regional Managers)
Fig. 6 Refined organization model showing agents and their goal dependencies (Bank delegates Increase Total Revenue to Executive Director, who refines it into Increase Customers-Base and Deploy Cash-in-hand; these are delegated to GM Operations and GM Finance)
6.3 Decision modeling
The refined organization model of Fig. 6 is the input to decision modeling. Consider the goal ‘Increase Customers-Base’ of the agent ‘General Manager Operations’ in Fig. 6. The agent ‘General Manager Operations’ may suggest the following decisions in order to achieve the goal delegated to him, as shown in Fig. 7:
(a) Provide good customer services
(b) Offer new customer schemes
(c) Provide better interest rates on various accounts
The decisions (a) ‘Provide Good Customer Services’ and (b) ‘Offer New Customer Schemes’ are complex decisions, whereas the decision (c) ‘Provide better interest rates on various accounts’ is a simple decision that may be delegated to the agent ‘General Manager Finance’, as shown in Fig. 7. The agent ‘General Manager Operations’ refines the complex decision (a) ‘Provide Good Customer Services’ into the following decisions:
(a.1) Increase customer contact hours of a branch
(a.2) Provide on-line services
(a.3) Introduce ATM
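The refinement of decisions can be viewed as a tree whose leaves are the simple decisions. The sketch below is a minimal illustration under that assumption; the names `REFINES`, `GOAL_DECISIONS`, `is_complex`, `simple_decisions` and `show` are hypothetical and are not part of the paper's CASE tool. The refinement of decision (b) is taken from the remainder of this subsection.

```python
from typing import Dict, List

# Decisions suggested for the goal 'Increase Customers-Base' (wording from the case study).
GOAL_DECISIONS: List[str] = [
    "Provide good customer services",
    "Offer new customer schemes",
    "Provide better interest rates on various accounts",
]

# A decision that appears as a key with a non-empty list is complex;
# the list holds the decisions it is refined into.
REFINES: Dict[str, List[str]] = {
    "Provide good customer services": [
        "Increase customer contact hours of a branch",
        "Provide on-line services",
        "Introduce ATM",
    ],
    "Offer new customer schemes": [
        "Provide new loan schemes",
        "Offer depositing different dues",
    ],
}


def is_complex(decision: str) -> bool:
    """A decision is complex if it has been refined into further decisions."""
    return bool(REFINES.get(decision))


def simple_decisions(decision: str) -> List[str]:
    """Collect, depth first, the simple decisions found under `decision`."""
    if not is_complex(decision):
        return [decision]
    leaves: List[str] = []
    for child in REFINES[decision]:
        leaves.extend(simple_decisions(child))
    return leaves


def show(decision: str, depth: int = 0) -> None:
    """Print the refinement tree with one indentation level per refinement."""
    kind = "complex" if is_complex(decision) else "simple"
    print("  " * depth + f"{decision} [{kind}]")
    for child in REFINES.get(decision, []):
        show(child, depth + 1)


for d in GOAL_DECISIONS:
    show(d)
```

Only the simple decisions returned by `simple_decisions` would be candidates for delegation and, later, for information modeling.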
The decisions (a.1), (a.2) and (a.3) are simple decisions and may be delegated to the agent ‘Deputy GM Operations’, as shown in Fig. 7. Similarly, the agent ‘General Manager Operations’ refines the complex decision (b) ‘Offer New Customer Schemes’ into the following decisions:
(b.1) Provide new loan schemes
(b.2) Offer depositing different dues
The decisions (b.1) and (b.2) are simple decisions and may be delegated to the agent ‘Deputy General Manager Operations’. Further, the agent ‘General Manager Finance’ may suggest the following decisions to achieve the goal (b) ‘Deploy cash-in-hand’ of the agent ‘Executive Director’:
(a) Invest in other schemes
(b) Provide new loan schemes
The decisions (a) and (b) are simple decisions and may be delegated to the agent ‘Deputy General Manager Finance’. Finally, the agents ‘Deputy General Manager Operations’, ‘General Manager Finance’ and ‘Deputy General Manager Finance’ will explore the information necessary to support the various decisions delegated to them for achieving the goals (a) ‘Increase Customers-Base’ and (b) ‘Deploy cash-in-hand’ of the agent ‘Executive Director’, as discussed in the next subsection on information modeling.
Fig. 7 Decision modeling showing agents and their decision dependencies considering a goal ‘Increase Customers-Base’
6.4 Information modeling
The agent ‘Deputy General Manager Operations’ identifies the following information required to support the simple decision ‘Increase customer contact hours of a branch’, as shown in Fig. 8:
(a) No. of customers visiting branch wise, day wise, city wise etc.
(b) Existing contact timings branch wise, day wise etc.
The agent ‘Deputy General Manager Operations’ delegates to the agent ‘Regional Managers’ the task of providing the information identified in (a) and (b), as shown in Fig. 8. Similarly, the agent ‘Deputy General Manager Operations’ may identify the following information to support the simple decision ‘Provide on line Banking Services’, as shown in Fig. 8:
(a) Services in demand branch wise
(b) No. of potential customers services wise
Fig. 8 Information modeling showing agents and their information dependencies considering the decisions ‘Increase customer contact hours’ and ‘Provide on line Banking Services’ respectively
The agent ‘Deputy General Manager Operations’ may also delegate to the agent ‘Regional Managers’ the task of providing the information identified in (a) and (b), as shown in Fig. 8. Similarly, the agent ‘Deputy General Manager Operations’ may identify the following information to support the simple decision ‘Introduce ATM’:
(a) No. of transactions day wise, branch wise
(b) No. of potential customers of other banks region wise, branch wise
The agent ‘Deputy General Manager Operations’ delegates to the agents ‘Regional Managers’ and ‘Other Banks’ for providing the identified information as mentioned in (a) and (b) respectively.
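The information dependencies of this subsection can be recorded as (decision, information, provider) triples. The sketch below is a minimal illustration; the class `InformationRequirement` and the function `by_provider` are hypothetical names introduced here, and the entries are copied from the case study. Grouping by provider shows what each agent is asked to supply.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class InformationRequirement:
    decision: str     # simple decision that the information supports
    information: str  # information item as worded in the case study
    provider: str     # agent asked to provide the information


REQUIREMENTS: List[InformationRequirement] = [
    InformationRequirement(
        "Increase customer contact hours of a branch",
        "No. of customers visiting branch wise, day wise, city wise",
        "Regional Managers",
    ),
    InformationRequirement(
        "Increase customer contact hours of a branch",
        "Existing contact timings branch wise, day wise",
        "Regional Managers",
    ),
    InformationRequirement(
        "Provide on line Banking Services",
        "Services in demand branch wise",
        "Regional Managers",
    ),
    InformationRequirement(
        "Provide on line Banking Services",
        "No. of potential customers services wise",
        "Regional Managers",
    ),
    InformationRequirement(
        "Introduce ATM",
        "No. of transactions day wise, branch wise",
        "Regional Managers",
    ),
    InformationRequirement(
        "Introduce ATM",
        "No. of potential customers of other banks region wise, branch wise",
        "Other Banks",
    ),
]


def by_provider(reqs: List[InformationRequirement]) -> Dict[str, List[str]]:
    """Group the information items by the agent that has to provide them."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for r in reqs:
        grouped[r.provider].append(r.information)
    return dict(grouped)


grouped = by_provider(REQUIREMENTS)
for provider, items in grouped.items():
    print(provider)
    for item in items:
        print("  -", item)
```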
6.5 Fact identification
The refined organization model produced in Sect. 6.2 becomes the input for fact identification. From the refined organization model of Fig. 6 we identify the fact ‘revenue’ from the goal ‘Increase total revenue’ of the agent ‘Executive Director’, whereas ‘customer’ and ‘cash-in-hand’ are identified as sub-facts from the simple goals ‘Increase customers base’ and ‘Deploy cash in hand’ respectively, as shown in Fig. 9. Similarly, we may identify the fact ‘social objective’ from the goal ‘Support Social Objective’ of the agent ‘Executive Director’. The identified facts are analyzed in different dimensions for achieving the goals, as discussed in the next subsection on dimension identification.
6.6 Dimension identification
The decision model produced in Sect. 6.3 is used to identify the dimensions for analyzing a fact. From the decision model of Fig. 7 we identify ‘customer services’ as a dimension from the complex decision ‘Provide good customer services’, and identify the further dimensions ‘customer branch’, ‘online service’ and ‘ATM service’ from the simple decisions ‘Increase customer contact hours of branch’, ‘Provide on line services’ and ‘Introduce ATM service’ respectively, as shown in Fig. 9. Customer services may be analyzed through branch, ATM and online service, which builds a dimension hierarchy. The dimension ‘customer services’ is associated with the identified fact ‘customer’.
6.7 Measure identification
We identify measures and their contexts of analysis from the information model produced in Sect. 6.4. The identified contexts of analysis are designated as dimensions. From the information model of Fig. 8 we identify the measures ‘No. of customers visiting’ and ‘existing contact timings’ from the information ‘No. of customers visiting branch wise, day wise, city wise etc.’ and ‘Existing contact timings branch wise, day wise etc.’ respectively, as shown in Fig. 9. The former measure may be analyzed in the context of branch, day and city, whereas the latter may be analyzed in the context of branch and day, as shown in Fig. 9. Similarly, we identify the measures ‘services in demand’ and ‘No. of potential customers’ from the information ‘services in demand branch wise’ and ‘No. of potential customers service wise, branch wise for last years’, as shown in Fig. 9. The measure ‘services in demand’ may be analyzed in the context of branch, whereas the measure ‘No. of potential customers’ may be analyzed in the context of service, branch and year, as shown in Fig. 9. Similarly, we identify the measures ‘No. of transactions’ and ‘No. of staff members’ from the information ‘No. of transactions in a day branch wise for last years’ and ‘No. of staff members branch wise’ respectively. The former measure may be analyzed in the context of branch and year, whereas the latter may be analyzed in the context of branch, as shown in Fig. 9.
Fig. 9 Early and late requirements model extended with fact, dimension and measures
The facts, dimensions and measures thus identified may be organized in the form of an MD conceptual model, i.e. a schema, as shown in Fig. 10. Here, we have considered only one goal of an agent and arrived at an MD conceptual schema showing only one fact and its associated dimensions and measures as an illustration of our approach. Similarly, the remaining goals of all the agents can be analyzed, and subsequently more facts, dimensions and measures can be identified. Finally, the set of all identified facts along with their dimensions and associated measures may be represented as a complete MD schema of a DW. In the next section we present a brief overview of a prototype CASE tool developed in Java to support the proposed RE approach.
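The measure identification of Sect. 6.7 separates each information item into a measure and its contexts of analysis. In the paper this is done by the analyst; the sketch below only approximates it with a string heuristic that treats every "<word> wise" phrase as a context. The function names `split_information` and `build_schema` are hypothetical, and attaching the measures to the fact ‘customer’ is an assumption made for illustration.

```python
import re
from typing import Dict, List, Tuple

# A context of analysis written as "<word> wise", e.g. "branch wise".
CONTEXT = re.compile(r"(\w+)\s+wise\b")


def split_information(info: str) -> Tuple[str, List[str]]:
    """Split an information item into (measure, contexts of analysis)."""
    matches = list(CONTEXT.finditer(info))
    if not matches:
        return info.strip(), []
    # Everything before the first "<word> wise" phrase is taken as the measure.
    measure = info[: matches[0].start()].strip(" ,")
    contexts = [m.group(1) for m in matches]
    return measure, contexts


def build_schema(fact: str, information: List[str]) -> Dict[str, object]:
    """Collect measures and the union of their contexts (dimensions) for one fact."""
    measures: List[str] = []
    dimensions: List[str] = []
    for item in information:
        measure, contexts = split_information(item)
        measures.append(measure)
        for context in contexts:
            if context not in dimensions:  # keep first-seen order, no duplicates
                dimensions.append(context)
    return {"fact": fact, "dimensions": dimensions, "measures": measures}


information_items = [
    "No. of customers visiting branch wise, day wise, city wise",
    "Existing contact timings branch wise, day wise",
    "Services in demand branch wise",
]
schema = build_schema("customer", information_items)
print(schema)
```

A heuristic of this kind would only work for information items phrased in the "X wise" style used in the case study; items such as ‘for last years’ would still need the analyst to name the context (year) explicitly.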
Fig. 10 Multidimensional schema: fact, dimensions and measures identified during conceptual modeling
7 A CASE tool supporting proposed novel RE approach for DW
The CASE tool is developed in Java and supports the following activities through a GUI for the designer.
7.1 Early RE phase
The CASE tool assists the analyst during agents’ analysis and goals’ analysis as part of organization modeling. The organization model and the refined organization model are then produced automatically by the CASE tool. Similarly, in decision modeling, the various decisions suggested by the agents for achieving the goals are maintained, complex decisions are further refined, and the decision model is produced automatically. For instance, Screen shot 1 and Screen shot 2 of the tool show the organization model and the decision model respectively.
Screen shot 1 Organization model of bank showing goal dependencies among agents
7.2 Late RE phase
The CASE tool also supports the information modeling activity as part of the late RE phase, where information and its sources are identified by the agents to support the various decisions identified in the decision modeling activity, as shown in Screen shot 3. The automatically produced information model captures the ‘what’ of the decision requirements.
7.3 Conceptual design phase
During the conceptual design phase, facts, dimensions and measures are identified from the early and late requirements models. The CASE tool supports conceptual modeling activities by producing models showing the identified facts, dimensions and measures, and the MD conceptual schema is produced automatically. For instance, Screen shot 4 of the tool shows the MD conceptual schema of a DW. All models produced by the CASE tool may be rearranged or altered by the analyst, if the need arises. In the next section we compare our work with the previous relevant work.
Screen shot 2 Decision model showing decision dependencies among agents
Screen shot 3 Information model showing information dependencies among agents
Screen shot 4 MD conceptual model of identified facts, dimensions and measures
Table 2 Comparison of our RE approach with other RE approaches for DW
Criterion | Giorgini et al. (2007) | Mazon et al. (2007) | Prakash and Gosain (2008); Prakash et al. (2010) | Proposed RE approach of present paper
Focus | Goal-driven | Model and goal-driven | Goal-driven | Agent-driven
1. Early RE phase captures ‘why’ that underlies requirements and models the following | | | |
1.1 Stakeholders | Yes | Yes | No | Yes
1.2 Goals | Yes | Yes | Yes | Yes
1.3 Decisions | No | No | Yes | Yes
1.4 Dependencies among stakeholders | | | |
1.4.1 Goal dependency | Yes | Yes | No | Yes
1.4.2 Decision dependency | No | No | No | Yes
2. Late RE phase captures ‘what’ the DW system should do and models the following | | | |
2.1 Information to support decisions rather than goals | No | No | Yes | Yes
2.2 Information dependency among stakeholders | Yes | Yes | No | Yes
3. Output of RE phase | Information requirements model | CIM | GDI model |
3.1 Early RE phase | | | | Organization models and decision models, i.e. early requirements
3.2 Late RE phase | | | | Information model, i.e. late requirements model
4. Conceptual design produces MD conceptual model from RE models | Yes | Yes | No support | Yes
5. Traceability between RE and conceptual models | Yes | Yes | No support | Yes
6. CASE tool support | Yes | No | Yes | Yes
8 Comparison with previous relevant work
An RE approach must model the ‘whys’ that underlie DW decision requirements and ‘what’ the DW system should do, i.e. early and late requirements. Only a few goal- and model-driven approaches (Giorgini et al. 2007; Mazon et al. 2005, 2007) model the ‘why’, and they do so for information requirements rather than decision requirements, from which they derive an MD conceptual schema. The goal-driven RE approaches (Prakash and Gosain 2008; Prakash et al. 2010) for DW do model decision requirements, but they model only late requirements and do not model early requirements explicitly. In contrast, our proposed approach captures decision requirements in terms of both early and late requirements. We therefore find it appropriate to compare our approach with the RE approaches (Giorgini et al. 2007; Mazon et al. 2007; Prakash and Gosain 2008; Prakash et al. 2010) that model early or late requirements, as shown in Table 2. In the next section we present the conclusions.
9 Conclusion and future work
We have presented a novel RE approach for DW, supported by a case study of a bank. The following points summarize the work:
(a) An improved SDLC for DW has been proposed, in which the RE phase is divided into an early RE phase and a late RE phase.
(b) The new AGDI model has been suggested as an extension of the existing GDI model, used for the early and late RE phases.
(c) The early RE phase captures the ‘why’ that underlies decision requirements, and the late RE phase captures ‘what’ the DW should do, i.e. the information related to the decisions.
(d) All of these early and late requirements models are used in the conceptual design phase.
(e) The early requirements model, late requirements model and MD conceptual model are interlinked with each other and thus support traceability.
(f) The CASE tool has been developed in Java to provide automated modeling support for our proposed approach.
(g) Future efforts are along two main directions, as follows:
(h) The matching of decision requirements with data sources.
(i) The transition from the conceptual model to the logical model of DW.
References
Boehnlein M, vom Ende U (2000) A business process oriented development of data warehouse structures. In: Proceedings of data warehousing, Physica Verlag, Heidelberg
Bonifati A, Cattaneo F, Ceri S, Fuggetta A, Paraboschi S (2001) Designing data marts for data warehouses. ACM Trans Softw Eng Methodol 10(4):452–483
Bresciani P, Giorgini P, Mylopoulos J, Perini A (2004) TROPOS: an agent oriented software development methodology. Auton Agent Multi-Agent Syst 8:203–236
Bulos D (1999) Designing OLAP with ADAPT. Technical Reports, Atos Origin
Cabibbo L, Torlone R (1998) A logical approach to multidimensional databases. In: Proceedings of 6th international conference on extending database technology, LNCS, vol 1377. Springer, Heidelberg, pp 183–197
Donzelli P, Moulding M (2000) Developments in application domain modelling for the verification and validation of synthetic environments: a formal requirements engineering framework. In: Proceedings of the spring 99 simulation interoperability workshop, LNCS, Springer, Orlando
Frendi M, Salinesi C (2003) Requirements engineering for data warehousing. In: Proceedings of REFSQ Workshop
Giorgini P, Rizzi S, Garzetti M (2007) GRAnD: a goal-oriented approach to requirement analysis in data warehouses. Decis Support Syst 45:4–21
Golfarelli M, Rizzi S (1999) Designing the data warehouse: key steps and crucial issues. J Comput Sci Inf Manag 2(3):88–100
Golfarelli M, Rizzi S, Saltarelli E (2002) WAND: a case tool for workload-based design of a data mart. In: Proceedings of SEBD, Italy, pp 422–426
Golfarelli M (2008) The DFM: a conceptual model for data warehouse. In: Wang John (ed) Encyclopedia of data warehousing and mining, 2nd edn. IGI Global, Hershey
Husemann B, Lechtenborger J, Vossen G (2000) Conceptual data warehouse design. In: Proceedings of the international workshop on design and management
Inmon WH (1996) Building the data warehouse. Wiley, New York
Jeusfield M, Quix C, Jarke M (1998) Design and analysis of quality information for data warehouses. In: Proceedings of 17th international conference on conceptual modeling
Kimball R, Ross M (2002) The data warehouse toolkit, 2nd edn. Wiley, Hoboken
Lehner W, Albrecht J, Wedekind H (1998) Normal forms for multidimensional databases. In: Proceedings of 8th international conference on statistical and scientific database management, IEEE Computer Society, Washington
Lujan-Mora S, Trujillo J, Song IY (2002) The gold model case tool: an environment for designing OLAP applications. In: Proceedings of ICEIS, pp 699–707
Mazon JN, Trujillo J, Serrano M, Piattini M (2005) Designing data warehouses: from business requirements analysis to multidimensional modeling. In: Proceedings of REBNITA’05, Paris
Mazon JN, Trujillo J (2006) An MDA approach for the development of data warehouses. Decis Support Syst. doi:10.1016/j.dss.2006.12.003
Mazon JN, Pardillo J, Trujillo J (2007) A model-driven goal-oriented requirements engineering approach for data warehouses. In: Proceedings of ER workshop, LNCS, vol 4802. Springer, Heidelberg, pp 255–264
Paim FR, Castro JB (2003) DWARF: an approach for requirements definition and management of data warehouse systems. In: Proceedings of the 11th IEEE international requirements engineering conference, pp 1090–1099
Prakash N, Gosain A (2003) Requirements driven data warehouse development. In: Proceedings of CAiSE 03 short paper, pp 13–17
Prakash N, Gosain A (2008) An approach to engineering the requirements of data warehouses. Requirements Eng J 13(1):49
Prakash N, Prakash D, Gupta D (2010) Decision and decision requirements for data warehouse systems. In: Proceedings of CAiSE’ forum, lecture notes in business information processing, vol 72. Springer, Heidelberg, pp 92–107
Salinesi C, Gam I (2009) How specific requirements engineering be in the context of decisional information systems. In: Proceedings of third international conference on research challenges in information science, IEEE, New York city
Shiefer J, List B, Bruckner RM (2002) A holistic approach for managing requirements of data warehouse systems. In: Proceedings of 8th Americas conference on information systems, pp 77–87
Tryfona N, Busborg F, Christiansen J (1999) Star ER: a conceptual model for data warehouse design. In: Proceedings of 2nd international workshop on data warehousing and OLAP, ACM, New York city, pp 3–8
Vassiliadis P (2000) Gulliver in the land of data warehousing: practical experiences and observations of a researcher. In: Proceedings of 2nd international workshop on design and management of data warehouses, pp 12.1–12.16
Winter R, Strauch B (2003) A method for demand-driven information requirements analysis in data warehousing projects. In: Proceedings of the 36th Hawaii international conference on system sciences
Winter R, Strauch B (2004) Information requirements engineering for data warehouse systems. In: Proceedings of ACM symposium on applied computing, Nicosia
Yu E (1995) Modeling strategic relationships for process reengineering. Ph.D. thesis, Department of Computer Science, University of Toronto, Toronto
Yu E (1997) Towards modeling and reasoning support for early-phase requirements engineering. In: Proceedings of IEEE international symposium on requirements engineering, pp 226–235
Yu E, Mylopoulos J (1994) Understanding why in requirements engineering-with an example. In: Proceedings of workshop on system requirements: analysis, management and exploitation, Germany, pp 4–7 Oct.