Objectives: To develop a highly accurate intrusion detection model that classifies both network-based and host-based intrusions without complexity issues. Method: An optimized Deep Learning (DL) based IDS model is presented in the form of a Hyper-Heuristic Firefly Algorithm based Convolutional Neural Network (HHFA-CNN). The proposed HHFA-CNN reduces false alarms and improves accuracy without increasing complexity. Findings: The proposed HHFA-CNN system is evaluated on two network traffic datasets: NSL-KDD and ISCX-IDS. The outcomes demonstrate that the proposed HHFA-CNN model gives superior performance compared to the other existing models. Novelty: The proposed model employs a novel Hyper-Heuristic Firefly Algorithm for optimizing the hyperparameters of the CNN. The model maintains the standard guidelines of the firefly algorithm and applies a high-level technique for controlling the exploration and selection of low-level heuristics.
With the introduction of advanced technologies in recent years, big data analytics has attained significant interest in various application domains such as medicine, healthcare, education, smart cities, environment analytics, business analytics, data processing and cyber security.
The most common duty of Big Data Cyber Security Analytics (BDCA) is to monitor network and internet traffic to analyze intrusions. Intrusion detection is considered a fundamental security solution, as intrusions pave the way for other malicious events. Malicious cyberattacks lead to serious security degradation, and hence the research community has insisted on the requirement for a novel, adaptive and reliable IDS. Depending upon the detected intrusion behaviors, IDS are classified as network-based IDS (NIDS) and host-based IDS (HIDS).
This paper suggests the use of an optimized deep learning algorithm for accurately identifying attacks in network flow data with a low false positive rate and low complexity. Previously, Hyper-Heuristic Improved Particle Swarm Optimization based Support Vector Machines (HHIPSO-SVM) was proposed for this purpose.
The contributions of this paper are summarized as follows:
Natural Language Processing (NLP) text representation methods are used to process the log files to determine the host-level events. As NLP-based text representation methods identify contextual and semantic similarity from a large amount of unstructured and fragmented text, they enhance the detection accuracy of the IDS model.
A scalable IDS framework has been developed using an effective deep learning approach, HHFA-CNN, to handle the deep characteristics of network-level and host-level events. The collaborative combination of NIDS and HIDS increases complexity, and hence the proposed deep learning HHFA-CNN is introduced in this paper.
The proposed HHFA-CNN based IDS model is applied to benchmark datasets of NIDS and HIDS for conducting the experimental comparisons.
Recent studies have employed different types of deep learning algorithms and ensemble approaches for big data analytics-based intrusion detection. To compete with such IDS approaches, machine learning algorithms combined with optimization algorithms were predominantly employed. Sabar et al.
Due to the limitations of machine learning approaches, including the ELM, researchers have started employing deep learning algorithms for big data cyber security models. Lopez-Martin et al.
Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) neural networks have achieved maximum exposure and increased classification accuracy in IDS models. Xiao et al.
Almiani et al.
Irrespective of their advantages, CNN and LSTM have limitations in jointly learning the spatial and temporal features. Hence, some studies have combined them to increase their effectiveness. Khan et al.
Although the combination of CNN and LSTM has provided better classification accuracy, its main drawback is computational complexity. In some studies, the class imbalance problem is also cited as a limitation. From the literature, it has been found that the optimized CNN provides significantly better accuracy with less complexity. Hence this study focuses on exploring the optimized CNN and suggests the use of advanced optimization algorithms to overcome the limitations of the Genetic Algorithm (GA) based search process.
The proposed HHFA-CNN methodology includes hyper-heuristic modelling of the firefly algorithm for tuning the hyperparameters of the CNN to attain the best structural design of the CNN.
A CNN comprises four main operators, namely the convolution layer (CL), the pooling layer (PL), the fully connected layer (FCL) and the non-linear activation (NLA) function.
The convolution layer forms the major core of the CNN that analyses and extracts the desired features. The convolution task preserves the spatial connections within the input data by extracting features through the kernel function. The outcome of the CL is the convolved feature map. The kernel weights are updated automatically based on the optimal structure configuration. The size of the feature map depends on the depth of the layers.
After the convolution operation, an additional non-linear function is applied before the creation of the feature maps. The NLA can be tanh, sigmoid or the Rectified Linear Unit (ReLU). The NLA acts as an element-wise operation that suppresses the negative values in the features. In most cases, sigmoid or ReLU provides better performance.
Spatial pooling is the sub-sampling or down-sampling process in the CNN, performed to reduce the dimensionality of the feature maps. It is similar to a feature reduction process that removes the less important data while retaining the vital information. The kinds of pooling are average, max, stochastic and sum pooling, denoted by the pooling numbers 1-4. In most cases, max pooling retains the most important features.
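As an illustration (not the paper's implementation), a non-overlapping 2*2 max-pooling step over a feature map can be sketched in a few lines of NumPy:

```python
import numpy as np

def max_pool2d(fmap, size=2):
    """Non-overlapping size x size max pooling over a 2-D feature map."""
    h, w = fmap.shape
    h, w = h - h % size, w - w % size               # crop to a multiple of the window
    blocks = fmap[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))                  # max within each pooling window

fmap = np.array([[1., 3., 2., 0.],
                 [4., 2., 1., 5.],
                 [0., 1., 3., 2.],
                 [2., 6., 1., 1.]])
pooled = max_pool2d(fmap)
# pooled -> [[4., 5.], [6., 3.]]
```

Average or sum pooling follow the same pattern with `mean` or `sum` in place of `max`.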
The fully connected layer is a conventional multi-level neural layer employing a SoftMax activation function in the output layer. In the FCL, the nodes of the preceding layer are interlinked with the nodes of the succeeding layer. The complex features yielded by the CL and PL are used by the FCL for labelling the data into classes using the previously learned knowledge.
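The FCL stage can be sketched as a dense layer feeding a SoftMax; this toy NumPy version (with made-up sizes, not the paper's network) shows how the flattened features are mapped to class probabilities:

```python
import numpy as np

def fully_connected_softmax(features, W, b):
    """Dense layer followed by a numerically stable SoftMax over class scores."""
    logits = features @ W + b
    logits -= logits.max()                 # stabilise the exponentials
    exp = np.exp(logits)
    return exp / exp.sum()

rng = np.random.default_rng(0)
features = rng.normal(size=4)              # flattened CL/PL features (toy size)
W = rng.normal(size=(4, 2))                # two classes: normal vs. attack
b = np.zeros(2)
probs = fully_connected_softmax(features, W, b)
# probs is a valid probability distribution over the two classes
```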
Combining all these operators forms the CNN. The hyperparameters used in the CNN are listed in the table below.
Hyperparameter        Range        Difference
Number of CL          1-4          1
Number of PL          1            1
Number of FCL         1-5          1
Hidden units/layer    256-1024     256
Pooling type          1-4          1
Kernel size           1-8          1
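For illustration, the bounded integer search space of the table above can be encoded as a small dictionary and sampled at random; the field names here are ours, not from the paper:

```python
import random

# Search space from the hyperparameter table: (lower bound, upper bound, step).
SEARCH_SPACE = {
    "num_cl":       (1, 4, 1),        # number of convolution layers
    "num_pl":       (1, 1, 1),        # number of pooling layers
    "num_fcl":      (1, 5, 1),        # number of fully connected layers
    "hidden_units": (256, 1024, 256), # hidden units per layer
    "pooling_type": (1, 4, 1),        # 1=average, 2=max, 3=stochastic, 4=sum
    "kernel_size":  (1, 8, 1),
}

def random_configuration(rng=random):
    """Sample one candidate CNN configuration from the bounded integer space."""
    return {name: rng.randrange(lo, hi + 1, step)
            for name, (lo, hi, step) in SEARCH_SPACE.items()}

cfg = random_configuration()
```

Each firefly in the HHFA would carry one such configuration as its position.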
The HHFA is developed by fusing hyper-heuristics with the multi-objective firefly algorithm. The hyper-heuristic framework consists of two strategies, namely the high-level and low-level strategies, for enhancing the optimization function of the firefly algorithm. The low-level strategy explores the problem and forms the rules to select the solutions. Then one or more solutions are considered, combined or modified to form a new set of solutions and generate better options. The high-level strategy initiates the heuristic search process to select solutions from the set of possible solutions based on the rules generated by the low-level strategy.
The low-level heuristics contain the set of problem-related rules generated to provide solutions for each selected problem instance. They form a new set of solutions by considering one or more solutions and transforming or combining them using different search processes. In this study, the FA-based search process is used as one of the search processes to generate new solutions. Once the new solutions are formed, the high-level strategy initiates the selection process. The high-level strategy automatically performs heuristic selection by choosing the heuristics one by one and applying them to the solutions. From the existing set of heuristics formed by the rules generated by the low-level strategy, the heuristics are selected through an online heuristic selection mechanism. The empirical reward and the confidence level are the main variables for measuring the efficiency of the heuristics. The rewards obtained from past performance are called the empirical reward, while the frequency of utilization of the heuristic denotes the confidence level. Using these two variables, a heuristic is deemed fit or unfit for the current state of operation. The selected heuristics are then applied to the solutions through the firefly foraging process.
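The text does not spell out the exact online selection formula, but one common way to combine an empirical reward with a confidence level derived from usage frequency is a UCB-style score; the following sketch is an assumption along those lines, not the paper's mechanism:

```python
import math

def select_heuristic(rewards, uses, total_uses, c=1.4):
    """Pick the low-level heuristic with the best reward/confidence trade-off.

    rewards[i]: cumulative empirical reward of heuristic i (past performance)
    uses[i]:    how often heuristic i was applied (its confidence level)
    """
    def score(i):
        if uses[i] == 0:
            return float("inf")            # always try an unused heuristic once
        exploit = rewards[i] / uses[i]     # average empirical reward
        explore = c * math.sqrt(math.log(total_uses) / uses[i])
        return exploit + explore
    return max(range(len(rewards)), key=score)

# Heuristic 1 has the best average reward, so it is selected.
idx = select_heuristic(rewards=[3.0, 9.0, 2.0], uses=[4, 5, 4], total_uses=13)
# idx -> 1
```

Rarely used heuristics get an exploration bonus, so the selector keeps re-testing heuristics whose confidence level is still low.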
The heuristics are initialized as the population of fireflies xᵢ (i = 1, 2, …, n). The light intensity I of a firefly decreases with the distance r from its source and is modelled as

I(r) = I₀ e^(−γr²)

Here I₀ is the light intensity at the source and γ is the light absorption coefficient.
The attractiveness β of a firefly is proportional to the light intensity seen by adjacent fireflies and is defined as

β(r) = β₀ e^(−γr²)

Where β₀ is the attractiveness at distance r = 0.
In CNN optimization, the computational complexity must be reduced, which means the resource utilization must be low. So, the attractiveness expression is modified into a computationally cheaper form for the practical application, as given below:

β(r) = β₀ / (1 + γr²)
The distance between any two fireflies (nodes) i and j positioned at xᵢ and xⱼ is the Cartesian distance

rᵢⱼ = ‖xᵢ − xⱼ‖ = √( Σₖ (xᵢₖ − xⱼₖ)² )

Where xᵢₖ is the k-th component of the position xᵢ and the sum runs over the d dimensions of the problem.
The firefly moves towards the brighter firefly, and its location is updated after each iteration using the following equation:

xᵢ = xᵢ + β₀ e^(−γrᵢⱼ²) (xⱼ − xᵢ) + α εᵢ

Here α is the randomization parameter and εᵢ is a vector of random numbers drawn from a Gaussian or uniform distribution.
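A single movement step following the standard firefly update can be sketched in NumPy (the parameter values are illustrative, not the paper's settings):

```python
import numpy as np

def move_firefly(x_i, x_j, beta0=1.0, gamma=1.0, alpha=0.2, rng=None):
    """One firefly movement step: firefly i is pulled towards brighter firefly j.

    Implements x_i <- x_i + beta0*exp(-gamma*r^2)*(x_j - x_i) + alpha*eps,
    the standard firefly update.
    """
    rng = rng or np.random.default_rng(0)
    r = np.linalg.norm(x_i - x_j)                  # Cartesian distance r_ij
    beta = beta0 * np.exp(-gamma * r ** 2)         # attractiveness at distance r
    eps = rng.uniform(-0.5, 0.5, size=x_i.shape)   # randomisation term
    return x_i + beta * (x_j - x_i) + alpha * eps

x_i = np.array([0.0, 0.0])
x_j = np.array([0.5, 0.5])                         # the brighter firefly
x_new = move_firefly(x_i, x_j)
# x_new lies closer to x_j than x_i did
```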
The heuristic is applied to each of the solutions obtained by the firefly based on the light intensity and the attractiveness of the firefly algorithm. The firefly that is returned as the global best solution contains the solution to be applied. The heuristic is applied to the selected solution to form a new set of solutions. In this stage, serial scheduling and double justification are used. Serial scheduling is used to select the solutions without interleaving the feasible solutions. Likewise, double justification is a simple local search technique that searches the solutions with exact shifting to control the search quality. The new solutions are compared and then analysed by their properties. This analysis of the configurations determines whether to include them in the existing set of solutions or terminate them to accommodate newer solutions in the next iterations.
After the formation of new solutions by the low-level heuristics and the selection by the high-level strategy, the solutions are saved in the non-dominated set of solutions in the archive. A non-dominated sorting procedure is used to classify the archive into several levels for saving the newer solutions. The first level is given to the solutions with the highest priority, the next level to the second-best priority, and so on. The HHFA selects the solutions from this archive based on the Pareto front and returns the best configuration as the final solution. Algorithm 1 summarizes the steps involved in HHFA.
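A minimal sketch of non-dominated sorting for levelling the archive, assuming two minimised objectives (error rate plus a hypothetical complexity measure; the paper does not name the second objective):

```python
def dominates(a, b):
    """a dominates b if a is no worse in every objective and better in at least
    one (all objectives are minimised)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated_sort(solutions):
    """Split solutions into priority levels: level 0 is the Pareto front."""
    levels, remaining = [], list(solutions)
    while remaining:
        front = [s for s in remaining
                 if not any(dominates(o, s) for o in remaining if o is not s)]
        levels.append(front)
        remaining = [s for s in remaining if s not in front]
    return levels

# (error rate, relative complexity) for four candidate configurations
archive = [(16.3, 0.4), (16.7, 0.3), (17.1, 0.9), (16.6, 0.8)]
levels = non_dominated_sort(archive)
# levels[0] -> [(16.3, 0.4), (16.7, 0.3)]  (the Pareto front)
```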
Algorithm 1. HHFA

Begin
  Initialize the population of fireflies
  Assign heuristics as fireflies
  The light intensity Iᵢ at xᵢ is determined by f(xᵢ)
  Set the light absorption coefficient γ
  Evaluate the fireflies to determine the fitness
  While (m < Max_Generation)
    For i = 1 : n (all n fireflies)
      For j = 1 : i (all n fireflies)
        Call the j-th low-level heuristic of the firefly search space
        Apply serial scheduling and double justification
        If (Iⱼ > Iᵢ)
          Move firefly i towards j in d dimensions
        End if
        Estimate new solutions and update the light intensity
        Update the locations of the fireflies
      End for j
    End for i
    Check the stopping criteria
    Update the firefly ranking list to determine the current best
  End while
  Return the best firefly
End
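Stripping away the hyper-heuristic layer (no serial scheduling or double justification), the core firefly loop of Algorithm 1 can be sketched on a toy objective; this is an illustration only, with assumed parameter values:

```python
import numpy as np

def firefly_minimise(f, dim=2, n=15, generations=40,
                     beta0=1.0, gamma=1.0, alpha=0.1, seed=1):
    """Plain firefly algorithm on a toy objective (lower f = brighter firefly)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-2, 2, size=(n, dim))          # firefly positions
    fitness = np.array([f(x) for x in X])          # brightness ~ inverse fitness
    for _ in range(generations):
        for i in range(n):
            for j in range(n):
                if fitness[j] < fitness[i]:        # j is brighter: move i towards j
                    r2 = np.sum((X[i] - X[j]) ** 2)
                    beta = beta0 * np.exp(-gamma * r2)
                    X[i] += beta * (X[j] - X[i]) + alpha * rng.uniform(-0.5, 0.5, dim)
                    fitness[i] = f(X[i])
        alpha *= 0.97                              # gradually damp the random walk
    best = int(np.argmin(fitness))
    return X[best], fitness[best]

x_best, f_best = firefly_minimise(lambda x: np.sum(x ** 2))   # sphere function
# f_best should be far below a typical random guess in the domain
```

Note that the current best firefly never moves, so the best fitness found is non-increasing across generations.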
The CNN architecture is represented as a vector of its structural hyperparameters: the numbers of CL, PL and FCL layers, the hidden units per layer, the pooling type and the kernel size.
For the optimal selection of the CNN hyperparameters, each solution is made up of the problem parameters subject to optimization by the firefly search process of exploitation (intensification) and exploration (diversification). The exploitation in HHFA is controlled by the values assigned to the attractiveness parameter β₀ and the absorption coefficient γ, while the randomization parameter α drives the exploration.
The hyperparameters problem is encoded as a solution vector

x = (x₁, x₂, x₃, x₄, x₅, x₆)

Here x₁ to x₆ denote the number of CLs, the number of PLs, the number of FCLs, the hidden units per layer, the pooling type and the kernel size, respectively, each bounded by the ranges in the hyperparameter table.
In this study, hyperparameters like the dropout rate, the learning rate, etc. are not optimized, as they mostly take real values. Only the hyperparameters that take integer values are optimized using HHFA. As the upper and lower bounds for each parameter are set high, i.e. greater than 1, the equations (1) to (7) depicted in HHFA can be adaptively used for the CNN optimization problem. The classification error rate is used as the fitness function. The objective is to minimize the error rate while calculating the fitness for the i-th solution, which can be expressed as

f(xᵢ) = (number of misclassified samples / total number of samples) × 100

Here f(xᵢ) denotes the fitness (classification error rate) of the i-th candidate CNN configuration.
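The error-rate fitness can be sketched directly:

```python
def classification_error_rate(y_true, y_pred):
    """Fitness of a candidate CNN: percentage of misclassified samples.
    HHFA minimises this value when ranking configurations."""
    wrong = sum(t != p for t, p in zip(y_true, y_pred))
    return 100.0 * wrong / len(y_true)

# 2 wrong predictions out of 8 -> 25.0% error rate
err = classification_error_rate([0, 1, 1, 0, 1, 0, 0, 1],
                                [0, 1, 0, 0, 1, 1, 0, 1])
# err -> 25.0
```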
Layer type            Configuration                              Kernel size   Error rate
CNN configuration 1   CL: 2 layers; PL: max pooling;             1*1           16.3
                      FCL: 3 layers, 512 units                   2*2           16.7
                                                                 3*3           16.6
CNN configuration 2   CL: 2 layers; PL: max pooling;             1*1           16.7
                      FCL: 3 layers, 256 units                   2*2           17.3
                                                                 3*3           17.2
CNN configuration 3   CL: 2 layers; PL: max pooling;             1*1           16.8
                      FCL: 3 layers, 512 units                   2*2           17.1
                                                                 3*3           16.9
CNN configuration 4   CL: 3 layers; PL: max pooling;             1*1           17.1
                      FCL: 2 layers, 1024 units                  2*2           17.8
                                                                 3*3           17.5
The configurations are obtained such that the CL, PL and FCL settings are fixed while the kernel size is varied to obtain three different error rates. The CNN can extract the spatial features by setting many kernels of varying sizes. The most common kernels are the 1*1, 2*2 and 3*3 convolution kernels, among which the 2*2 and 3*3 kernels learn the features accurately while the 1*1 kernel helps in increasing the learning rate. Considering the configurations in the above table, the CNN configuration with the lowest classification error is chosen by the HHFA. The best performance was obtained after only 13 to 18 iterations in all conducted HHFA runs. In this case, CNN configuration 1 has the lowest classification error of 16.3 when the 1*1 kernel is used, and hence it becomes the optimal CNN architecture. This optimal CNN improves the classification accuracy on the intrusion datasets.
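Selecting the optimal architecture then reduces to taking the minimum over the table entries; a sketch using the error rates above:

```python
# Error rates from the table above: configuration -> {kernel size: error rate}
ERROR_RATES = {
    "config 1": {"1*1": 16.3, "2*2": 16.7, "3*3": 16.6},
    "config 2": {"1*1": 16.7, "2*2": 17.3, "3*3": 17.2},
    "config 3": {"1*1": 16.8, "2*2": 17.1, "3*3": 16.9},
    "config 4": {"1*1": 17.1, "2*2": 17.8, "3*3": 17.5},
}

def best_configuration(table):
    """Return the (configuration, kernel, error) triple with the lowest error."""
    return min(((cfg, kernel, err)
                for cfg, kernels in table.items()
                for kernel, err in kernels.items()),
               key=lambda t: t[2])

best_cfg, best_kernel, best_err = best_configuration(ERROR_RATES)
# -> ("config 1", "1*1", 16.3), matching the architecture chosen by HHFA
```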
The assessment of the suggested HHFA-CNN prototype is carried out using two benchmark cases of cyber security problems, the NSL-KDD and ISCX-IDS datasets. The tests are performed in MATLAB R2016b on a 64-bit Windows machine with an Intel Core i5-3470 3.2 GHz processor, 4 GB DDR3 RAM and 500 GB Intel SSD storage. The two benchmark instances are collected from https://www.unb.ca/cic/datasets/index.html.
The NSL-KDD dataset consists of training, testing, 20% training and 20% testing data files. It also contains a subset file with difficulty levels. NSL-KDD is an improved version of the popular KDD CUP 99 dataset. The NSL-KDD problem instance consists of 311,027 training samples and 77,289 testing samples, which are classified as either normal or malicious. ISCX-IDS was created by monitoring network activity for 7 days, from Friday 11/6/2010 to Thursday 17/6/2010. It consists of records of normal traffic, HTTP Denial of Service attacks, brute force attacks and infiltration activities. Around 208,667 training samples and 78,400 testing samples, classified as either normal or attack activities, are used for this evaluation.
The proposed HHFA-CNN is implemented along with the existing HHIPSO-SVM, HHSVM and other baseline models for comparison on the NSL-KDD dataset.
Algorithm      Accuracy (%)   Precision (%)   Recall (%)   F-measure (%)   Time (seconds)
HHSVM          89.76          67.10           62.81        62.22           4.65
HHIPSO-SVM     93.33          73.99           64.29        68.37           2.55
HHFA-CNN       96.6667        93.9394         74           82.7860         1.38
DT             80.14          72.33           61.25        85.12           5.62
FC             82.98          74              60.28        61.35           6.58
GNBT           80             69              70.23        76.52           5.35
It can be seen that the performance values of HHFA-CNN are higher than those of HHIPSO-SVM and HHSVM. HHFA-CNN has 96.6667% accuracy, which is 3.3% and 6.9% higher than HHIPSO-SVM and HHSVM, respectively. Likewise, HHFA-CNN has outperformed both HHIPSO-SVM and HHSVM in terms of precision, recall and F-measure. HHFA-CNN has 20% and 26.9% higher precision, 9.7% and 11.2% higher recall, and 14.5% and 20.5% higher F-measure than the HHIPSO-SVM and HHSVM models, respectively. The execution time taken by HHFA-CNN is also less than that of HHIPSO-SVM and HHSVM.
Algorithm      Accuracy (%)   Precision (%)   Recall (%)   F-measure (%)   Time (seconds)
HHSVM          86.6           63.3            60.0         56.19           126
HHIPSO-SVM     92.4           69.65           61.1         59.82           49.5
HHFA-CNN       93.33          99.7            93.33        96.55           48.2
Similar to NSL-KDD, the performance obtained on ISCX-IDS shows that HHFA-CNN has outperformed the HHIPSO-SVM and HHSVM models. HHFA-CNN has 0.97% and 6.7% higher accuracy, 30% and 36.3% higher precision, 32.2% and 33.3% higher recall, and 36.7% and 40.4% higher F-measure than the HHIPSO-SVM and HHSVM models, respectively. HHFA-CNN also consumes 1.3 seconds and 77.8 seconds less time than the HHIPSO-SVM and HHSVM models, respectively, for executing the ISCX-IDS data.
The performance of the proposed HHFA-CNN is also compared with other popular algorithms from the literature that were tested on the NSL-KDD dataset, namely HHSVM, SVM-IBGWO, DRL, Multi-CNN, GA-CNN, DRNN, D-LSTM, CNN-LSTM and HHIPSO-SVM. Their accuracy values are compared in the table below.
Algorithm      Accuracy (%)
HHSVM          89.76
SVM-IBGWO      96
DRL            89.78
Multi-CNN      86.95
GA-CNN         98.2
DRNN           92.18
D-LSTM         86.99
CNN-LSTM       96.47
HHIPSO-SVM     93.33
HHFA-CNN       96.6667
From the above comparison, it can be observed that the proposed HHFA-CNN achieves accuracy competitive with the best of the compared models.
In this study, a hyper-heuristic firefly optimization is presented for improving the CNN design to solve big data intrusion detection problems. In the first part, the CNN design problem is modelled as a multi-objective optimization problem based on the hyperparameters. This problem is addressed by adopting the proposed HHFA structure, which applies the high-level strategy and low-level heuristics of the hyper-heuristic methodology to the standard firefly optimization. The proposed HHFA-CNN system was assessed on two network traffic datasets: NSL-KDD and ISCX-IDS. The outcomes demonstrated that the proposed HHFA-CNN model gives superior performance compared to the other existing models. In the future, the proposed hyper-heuristic system can be used for multi-class attack detection. Also, other cyber security instances such as UNSW-NB15 will be tested. Moreover, the impact of feature dimension reduction techniques will also be investigated.