Comparative Analysis of Semantic Web Service Selection Methods

The objective is to compare semantic web service selection methods based on Quality of Service (QoS) attributes. The investigation is carried out in three phases: preprocessing, discovery and QoS based selection. The preprocessing phase deals with the registration of services in the repository. The discovery phase retrieves functionally similar services using matchmaking techniques. The third phase performs QoS based selection using three methods: Analytic Hierarchy Process (AHP), Logical Scoring Preference with Ordered Weighted Averaging (LSP and OWA), and Fuzzy with Technique for Order Preference by Similarity to an Ideal Solution (Fuzzy Topsis). The paper further analyzes the selection methods over a range of user preference criteria and compares them using Analysis of Variance (ANOVA), a statistical hypothesis test used for making decisions from data. Another objective of this paper is to propose a new approach that improves the performance of service selection using Iterative MapReduce. The QWS dataset has been used for analyzing the three methods. Experimental results show that the Fuzzy Topsis scheme outperforms the other web service selection methods.


Introduction
The main challenge in the semantic web is the process of service discovery. An efficient discovery process [1,2,3,8] decreases the time taken to discover relevant services. An ontology-based flexible discovery of web services is described in [1,9]. It shows how a user's query for a web service that meets certain selection criteria can be transformed into queries that can be processed by the matchmaking based selection phase [2,21]. The basic matchmaking algorithm considers only the inputs and outputs of the requested and advertised services for comparison, and it can be tailored by using it along with other algorithms. Based on this algorithm, the inputs and outputs of the services are considered in the Statistical Refining module to discover a suitable service that satisfies the user requirements [3]. An exhaustive matchmaking algorithm has also been proposed based on the concept of bipartite graphs. The algorithm involves two steps: constructing a bipartite graph and defining the match criteria. It is combined with the Hungarian algorithm, which computes a matching of the bipartite graph such that the sum of the weights of the edges in the matching is minimized. Lexemic search is needed to eliminate concept redundancy: it makes little sense to repeat the same definition for the same concept when each concept should add new information to the ontology, and such repetition gives rise to ambiguities. Lexemic search is also used to find the relations between the concepts described in the ontology. It includes two components: WordNet and Concept Match. Services are specified in OWL-S format in order to match the inputs and outputs (and the precondition and effect as well) [1,4,10,13]. Service selection is the process of choosing the most suitable services from the list of matching services. It also aims to speed up the discovery process and to retrieve the relevant services that satisfy the Quality of Service factors as well [5].
Web services can be selected using methods such as Logical Scoring Preference and Ordered Weighted Averaging. After discovery is done, Quality of Service factors are considered to rank the services [6]. A new dynamic replication algorithm has been proposed to increase the availability of the service and decrease network delays [7]. The Analytic Hierarchy Process (AHP) is also used to rank web services. The knowledge gained on Quality of Service factors has been applied in the QoS based selection phase, and the matchmaking concept has been used in the semantic service discovery phase of the framework. The MapReduce runtime [17] used in this work supports Iterative MapReduce computations efficiently, and MapReduce programming is now an important model for parallel computation. In Iterative MapReduce, each map task is responsible for performing the service selection method individually. The ANOVA [11,19] testing tool is used to bring out the performance of each of these selection methods. The application considered for web services is an E-shopping system. The rest of this paper is organized as follows. Section 2 discusses semantic based web service selection with QoS. Section 3 explains the prototype implementation details. Section 4 presents the performance evaluation of the selection methods using ANOVA. Conclusions are discussed in Section 5.

Semantic Based Web Service Selection with QoS
The objective of semantic based web service selection with QoS is to select the best service. The system is designed to get inputs from the user, and the input parameters obtained are then given to the concerned phases for processing. A list of domains is given to the user to choose from. Users are also asked to provide the concepts they want to have in the OWL [9] file for concept comparison. Then, to discover services, they are asked to give input(s), output(s), precondition and effect values along with the Quality of Service related parameters used to filter the available services, as shown in Figure 1. The proposed framework consists of a periodical pre-processing phase, a semantic service discovery phase and a QoS based selection phase. The periodical pre-processing phase consists of the pre-processing agent. The service provider submits the different web services and the description of each web service to the pre-processing agent. Based on the service description, the pre-processing agent builds the OWL-S and OWL-Q representations of the submitted web services. The semantic service registry consists of the various web services along with their OWL-S and OWL-Q representations. The semantic service discovery phase consists of the discovery agent. The user requests a service by providing various functional and non-functional requirements. The functional requirements are taken up by the discovery agent, which performs a lexemic search over the services present in the semantic service registry. The discovery agent finally provides a set of services that functionally match the user's request [16]. The QoS based selection phase consists of the QoS agent and the ranking agent. The QoS agent takes up the non-functional requirements of the user. The ranking agent evaluates the hybrid ranking algorithm using Analysis of Variance (ANOVA) and finally provides the best service to the user.
Based on the QoS weights submitted to Iterative MapReduce using Twister, the system performs three different ranking methods: Analytic Hierarchy Process (AHP), Logical Scoring Preference and Ordered Weighted Averaging (LSP and OWA), and Fuzzy with Technique for Order Preference by Similarity to an Ideal Solution (Fuzzy Topsis) [12,15]. These methods are run individually, and the QoS values are divided into equal sub-populations. The sub-population values are then given to the map function. The outputs of the map tasks are given to the reduce task, as shown in Figure 4. Figure 2 depicts the sequence diagram for the functional aspects and Figure 3 depicts the sequence diagram for the non-functional aspects.

Using Iterative MapReduce
This module performs the Iterative MapReduce [17] step of the semantic based web service selection model with QoS. Figure 4 shows the Iterative MapReduce. The input QoS values are divided into equal sub-populations. Each map task is responsible for performing the service selection method individually: it runs the ranking algorithm on its sub-population and computes a rank list for all services in it. The outputs of the map tasks are given to the reduce task, which collects the outputs from all the map tasks. The system has to run in the Iterative MapReduce environment. The number of iterations can be changed according to the weights given by the service consumer.
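The split, map and reduce flow described here can be sketched in plain Python. This is a minimal illustration under stated assumptions and not the Twister implementation used in the paper: the helper names (`split_equal`, `map_task`, `reduce_task`, `select_services`), the weighted-sum scoring function and the sample services are all hypothetical, and the iteration loop is not modelled.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical data layout: (service name, list of QoS values).
Service = Tuple[str, List[float]]


def split_equal(services: Sequence[Service], parts: int) -> List[List[Service]]:
    """Divide the services into `parts` sub-populations of near-equal size."""
    return [list(services[i::parts]) for i in range(parts)]


def weighted_sum(qos: Sequence[float], weights: Sequence[float]) -> float:
    """Placeholder scoring function; any of the three ranking methods could be used."""
    return sum(q * w for q, w in zip(qos, weights))


def map_task(sub_population: Sequence[Service], weights: Sequence[float],
             score: Callable = weighted_sum) -> List[Tuple[float, str]]:
    """Rank one sub-population and return (score, name) pairs, best first."""
    return sorted(((score(q, weights), name) for name, q in sub_population),
                  reverse=True)


def reduce_task(partial_rankings: Sequence[Sequence[Tuple[float, str]]],
                top_k: int) -> List[str]:
    """Collect the outputs of all map tasks and keep the top_k services."""
    merged = [item for ranking in partial_rankings for item in ranking]
    merged.sort(reverse=True)
    return [name for _, name in merged[:top_k]]


def select_services(services: Sequence[Service], weights: Sequence[float],
                    parts: int = 2, top_k: int = 2) -> List[str]:
    """One map/reduce round over all sub-populations."""
    partials = [map_task(sub, weights) for sub in split_equal(services, parts)]
    return reduce_task(partials, top_k)


# Hypothetical sample: four services with two QoS values each.
sample = [("s1", [0.9, 0.1]), ("s2", [0.5, 0.5]),
          ("s3", [0.2, 0.9]), ("s4", [0.8, 0.8])]
best = select_services(sample, weights=[0.5, 0.5])
```

Merging the partial rank lists in the reduce step only works because the placeholder score of a service does not depend on the other services in its sub-population; a method that normalises against the whole population would need an extra step.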

AHP:
Step 1: Construct the Service Relative Ranking Matrix (M) for all Quality of Service parameters.
Step 2: Construct the Service Relative Ranking Vector (V) for each Quality of Service parameter.
Step 3: Construct the Service Relative Ranking Matrix for the Quality of Service parameters.
Step 4: The overall service ranking is obtained by augmenting the group vectors (V) and multiplying with the weights. The service with the highest obtained value of V is ranked as the best service in the AHP method.
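The AHP steps above can be illustrated with a minimal Python sketch. It assumes that every QoS parameter is benefit-type (higher is better), that the relative ranking of two services is the ratio of their QoS values, and that the ranking vector is approximated by column normalisation followed by row averaging. The function names and sample numbers are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def pairwise_matrix(values: Sequence[float]) -> List[List[float]]:
    """Relative ranking matrix M for one QoS parameter: M[i][j] = v_i / v_j."""
    return [[vi / vj for vj in values] for vi in values]


def priority_vector(matrix: Sequence[Sequence[float]]) -> List[float]:
    """Relative ranking vector V: normalise each column, then average each row."""
    n = len(matrix)
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [sum(matrix[i][j] / col_sums[j] for j in range(n)) / n
            for i in range(n)]


def ahp_scores(qos_table: Sequence[Sequence[float]],
               weights: Sequence[float]) -> List[float]:
    """Overall score per service: weighted sum of the per-parameter vectors.

    qos_table[k] holds the values of QoS parameter k for every service.
    """
    vectors = [priority_vector(pairwise_matrix(column)) for column in qos_table]
    n_services = len(qos_table[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors))
            for i in range(n_services)]


# Hypothetical sample: two QoS parameters, three services.
scores = ahp_scores([[2.0, 1.0, 1.0], [1.0, 1.0, 2.0]], weights=[0.75, 0.25])
best_service = scores.index(max(scores))  # index of the top-ranked service
```

In this sketch the QoS weights are supplied directly; deriving them from a pairwise comparison matrix over the QoS parameters would reuse `priority_vector` on that matrix.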

LSP and OWA:
Step 1: Construct the evaluation function (E) for each QoS parameter and multiply it by the weights.
Step 2: Construct the logical relation value (r) and find the orness degree.
Step 3: Obtain the overall service ranking values using LSP and OWA, which give the ranking of the relevant web services.
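The building blocks named in these steps can be sketched as follows. The text does not spell out how the paper combines LSP with OWA, so this sketch only shows the commonly used definitions: the weighted power mean with exponent r for LSP, the OWA operator, and the orness degree of an OWA weight vector. The function names and the assumption that evaluations are positive values with r not equal to 0 are illustrative.

```python
from typing import Sequence


def lsp(evaluations: Sequence[float], weights: Sequence[float], r: float) -> float:
    """Weighted power mean: (sum_i w_i * e_i**r) ** (1/r), assuming r != 0."""
    return sum(w * e ** r for w, e in zip(weights, evaluations)) ** (1.0 / r)


def owa(values: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered weighted average: weights apply to values sorted in descending order."""
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def orness(weights: Sequence[float]) -> float:
    """Orness degree of an OWA weight vector (needs at least two weights).

    1.0 behaves like max (pure OR), 0.0 like min (pure AND).
    """
    n = len(weights)
    return sum((n - i) * w for i, w in enumerate(weights, start=1)) / (n - 1)


# Hypothetical sample: three QoS evaluations of one service.
evals = [0.2, 0.9, 0.5]
or_like = owa(evals, [1.0, 0.0, 0.0])    # picks the largest evaluation
and_like = owa(evals, [0.0, 0.0, 1.0])   # picks the smallest evaluation
```

Larger orness values make the aggregation more tolerant of a weak QoS value, while smaller ones require all values to be good.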

Fuzzy Topsis:
Step 1: Construct the decision matrix (E) and multiply it by the weights (W).
Step 2: Measure the distance of each alternative from the Positive and Negative Ideal Solutions (PIS, NIS) using the Euclidean distance.
Step 3: Calculate the relative closeness coefficient and rank the preference order. The service with the highest closeness coefficient is the best service: it is closest to the Fuzzy PIS and farthest from the Fuzzy NIS.
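The three steps can be sketched in Python with crisp numbers. Note that this is plain TOPSIS rather than the fuzzy variant used in the paper: fuzzy membership values are replaced by ordinary floats, the decision matrix is assumed to be already normalised, and all criteria are assumed to be benefit-type. The function name and sample data are hypothetical.

```python
import math
from typing import List, Sequence


def topsis_closeness(matrix: Sequence[Sequence[float]],
                     weights: Sequence[float]) -> List[float]:
    """Return the relative closeness coefficient of each alternative (row)."""
    # Step 1: weighted decision matrix.
    weighted = [[w * x for w, x in zip(weights, row)] for row in matrix]

    # Ideal solutions: best and worst weighted value per criterion.
    columns = list(zip(*weighted))
    pis = [max(col) for col in columns]
    nis = [min(col) for col in columns]

    def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    closeness = []
    for row in weighted:
        # Step 2: distances to the positive and negative ideal solutions.
        d_pos = euclidean(row, pis)
        d_neg = euclidean(row, nis)
        # Step 3: closeness coefficient; 0.0 if all alternatives are identical.
        total = d_pos + d_neg
        closeness.append(d_neg / total if total else 0.0)
    return closeness


# Hypothetical sample: three services, two normalised QoS criteria.
cc = topsis_closeness([[1.0, 1.0], [0.0, 0.0], [0.5, 0.5]], weights=[0.5, 0.5])
ranking = sorted(range(len(cc)), key=lambda i: cc[i], reverse=True)
```

The service at the front of `ranking` has the highest closeness coefficient, which corresponds to the best service in Step 3.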

Prototype Implementation
An E-Shopping application has been taken as a case study to select the best service using three selection methods: Analytic Hierarchy Process (AHP), Logical Scoring Preference with Ordered Weighted Averaging (LSP and OWA), and Fuzzy Topsis. A registry is needed for storing the service descriptions and their OWL-S and OWL-Q representations. The web services created for the implementation are listed in Tables 1 and 2. The QoS parameters considered for the non-functional aspects and their QoS values are shown in Tables 6 and 7. This paper presents the implementation details of the three phases of the system: periodical preprocessing, semantic service discovery and QoS based selection.

Preprocessing Phase
In the preprocessing phase, the service provider creates the set of services, which are stored in the semantic service registry in the form of OWL-S and OWL-Q, as shown in Tables 1 and 2.

Semantic Service Discovery Phase
The semantic service discovery phase gets inputs from the user. The input parameters obtained are then given to the concerned modules for processing. WordNet is used to find the synonyms of the given domain. The output names are then compared with the domains stored in the registry. Concept match involves retrieving concepts from the OWL files and comparing them with the input concepts. In this stage, the list produced by the lexemic search gets refined by searching each OWL file in the list for concepts matching the required concepts. To rank the OWL files, four possible alternatives of concept-to-concept relationships are considered. Services are then discovered based on the matchmaking algorithm shown in Table 3. For example, consider (Figure 4) that the input domain given by the user is material. If the input doesn't match any of the available domains, WordNet is invoked to get synonyms; here it yields 'book' as the output. Next, only the OWL files that are related to the domain 'book' are retrieved and passed on as the input to the concept match phase. Thus, filtering domains helps to choose only the relevant OWL files among the numerous OWL files available. Let the concepts given by the user be location (expected to be the super concept), with city and country as two of its sub concepts. The concepts retrieved from each OWL file are compared with location, city and country, and the relation is identified. An OWL file that has the same structure, i.e. location as super concept and city and country as its sub concepts, is considered Identical and is ranked 1.

QoS Based Selection Phase
In the QoS based selection phase, the preferences of service providers are taken into account in order to filter the services. Each service provider gives its own values for the QoS factors. While ranking the discovered services (Table 4), the input values given by the user are compared with the values specified by the service provider [15,16]. In this phase the service consumer contributes weights for the QoS parameters, and these values are applied to Iterative MapReduce on Twister. The output of each map task is the new generation. All the new generations are collected by the reduce task and the combine task. Ranking is done using the hybrid ranking algorithm, as shown in Table 5.
The service selection methods (in Section 2.2) rank the functionally discovered services. In this paper, 100 services are considered and the top 5 ranked services are selected using the hybrid ranking algorithm, as shown in Tables 6 and 7. The result is verified using Analysis of Variance (ANOVA), which is used to analyze the performance of each service selection method.

Else if ((C1 parent class of C2 AND C2 parent class of C3) OR (C1 parent class of C3 AND C3 parent class of C2)) then
    Relation-type ← Sub
    Rank ← 3
Else
    Relation-type ← Super
    Rank ← 2
End if
End for

Degree of Match algorithm
UR: service requestor; AD: service provider

Exact: If advertisement AD and request UR are equivalent concepts, the match is called Exact (AD = UR).
Plug-in: If request UR is a super-concept of advertisement AD, the match is called Plug-in (AD ⊃ UR).
Subsume: If request UR is a sub-concept of advertisement AD, the match is called Subsume (AD ⊂ UR).
Fail: If advertisement AD and request UR are not equivalent concepts, the match is called Fail (AD ≠ UR).
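The four degrees of match can be expressed as a small Python function. This is a sketch under stated assumptions: the ontology is reduced to a hypothetical child-to-parent dictionary, identical concept names stand in for "equivalent concepts", and Fail is returned whenever none of the other three cases applies. The concept names come from the location example used earlier; the function names are illustrative.

```python
from typing import Dict


def is_super_concept(candidate: str, concept: str, parent_of: Dict[str, str]) -> bool:
    """True if `candidate` is a strict ancestor of `concept` in the hierarchy."""
    visited = set()
    current = parent_of.get(concept)
    while current is not None and current not in visited:
        if current == candidate:
            return True
        visited.add(current)  # guards against accidental cycles
        current = parent_of.get(current)
    return False


def degree_of_match(ad: str, ur: str, parent_of: Dict[str, str]) -> str:
    """Classify advertisement concept `ad` against request concept `ur`."""
    if ad == ur:
        return "Exact"
    if is_super_concept(ur, ad, parent_of):   # UR is a super-concept of AD
        return "Plug-in"
    if is_super_concept(ad, ur, parent_of):   # UR is a sub-concept of AD
        return "Subsume"
    return "Fail"


# Hypothetical ontology: city and country are sub-concepts of location.
ontology = {"city": "location", "country": "location"}
result = degree_of_match("city", "location", ontology)
```

A real matchmaker would obtain the subsumption relations from an OWL reasoner rather than from a hand-written dictionary.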

Performance Evaluation of Selection Methods using ANOVA
This section assesses three selection methods available for ranking web services based on quality of service parameters. Specifically, three algorithms, namely AHP, LSP and OWA, and Fuzzy Topsis, were executed, the top five services were selected, and the results were tested using ANOVA. One important question is whether any of these result sets is better than any of the others. To address this issue, it was first checked whether there is variation among the results produced by the three algorithms. Next, a T-test for paired means was conducted to see which algorithms are equally good. The results of the two-way ANOVA test are shown in Table 8. If the F value is less than the F critical value, there is no significant difference in the quality of services produced by the methods. If the F value is greater than the F critical value, as in this case (24.9 > 4.45), there is a significant difference between the three ranking methods. A paired T-test is then needed to test each pair of means, as shown in Table 9.
According to the post-hoc test result in Table 9, AHP and the LSP and OWA method are not equally good; the LSP and OWA method and Fuzzy Topsis are equally good; AHP and Fuzzy Topsis are equally good. Since Fuzzy Topsis is equally good in every comparison, it is taken to outperform the other two selection methods.
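The F test used in this section can be sketched in Python. The sketch assumes a two-factor layout without replication, with the selected services as rows and the three selection methods as columns, and it returns the F value for the method factor. This layout is an assumption; the paper's actual data are in Table 8 and are not reproduced here. The function names and sample scores are hypothetical, and the F critical value is passed in as a parameter rather than computed.

```python
from typing import Sequence


def two_way_anova_f(table: Sequence[Sequence[float]]) -> float:
    """F value for the column factor; table[i][j] = score of service i under method j.

    Assumes the residual variation is non-zero.
    """
    r, c = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (r * c)
    row_means = [sum(row) / c for row in table]
    col_means = [sum(table[i][j] for i in range(r)) / r for j in range(c)]

    # Sum of squares between columns (methods) and residual sum of squares.
    ss_cols = r * sum((m - grand) ** 2 for m in col_means)
    ss_error = sum((table[i][j] - row_means[i] - col_means[j] + grand) ** 2
                   for i in range(r) for j in range(c))

    ms_cols = ss_cols / (c - 1)
    ms_error = ss_error / ((r - 1) * (c - 1))
    return ms_cols / ms_error


def methods_differ(f_value: float, f_critical: float) -> bool:
    """Decision rule from the text: a significant difference exists if F > F critical."""
    return f_value > f_critical


# Hypothetical scores: three services (rows) under three methods (columns).
f_value = two_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 5.0], [3.0, 4.0, 4.0]])
```

With the values reported in the text, `methods_differ(24.9, 4.45)` is true, which is the case where a pairwise follow-up test is needed.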
The above experimentation was done with QoS values taken from the QWS dataset provided by Al-Masri [20]. Comparing the time taken by the selection methods, Fuzzy Topsis takes much less time than LSP and OWA or AHP. This is shown graphically in Figures 5 and 6. For example, Fuzzy Topsis gives better results but takes around 10.5 to 11 seconds to select among 100 services; when Iterative MapReduce is applied to it, the time taken is reduced to 0.15 to 0.18 seconds and the result is even better. The map tasks running in parallel can share their results for better performance. The X-axis denotes the number of services, with sets of 20, 40, 60, 80 and 100 services, and the Y-axis denotes the execution time of the selection methods.
Figures 5 and 6 show that Fuzzy Topsis has a lower execution time (in seconds) than AHP and the LSP and OWA method.

Conclusion
With the proliferation of functionally similar services, determining the most suitable service is important. QoS based selection methods play a significant role in further filtering the discovered services. In this work, efforts have been taken to compare the selection methods AHP, LSP and OWA, and Fuzzy Topsis. The selection methods are compared using ANOVA, a statistical model used to analyze the differences between group means and their associated procedures. The methods were also assessed by applying Iterative MapReduce. Service QoS values are divided into equal sub-populations.

Table 6. Sample set of services