This is a preprint version that has not undergone peer review or any post-submission improvement or corrections. The copyright lies with Springer. The final publication is published in Applied Reconfigurable Computing: Architectures, Tools and Applications (ARC2024) and is available online at Springer at https://doi.org/10.1007/978-3-031-55673-9_4

# Reconfigurable Edge Hardware for Intelligent IDS: Systematic Approach \*

Wadid Foudhaili\*<sup>[0009-0004-3544-6536]</sup>, Anouar Nechi<sup>[0000-0001-9680-6145]</sup>, Celine Thermann<sup>[0009-0004-2354-1616]</sup>, Mohammad Al Johmani<sup>[0009-0009-9126-0919]</sup>, Rainer Buchty<sup>[0009-0004-9413-2078]</sup>, Mladen Berekovic<sup>[0000-0003-1911-756X]</sup>, and Saleh Mulhem<sup>[0000-0001-7380-5270]</sup>

Institute of Computer Engineering (ITI), Universität zu Lübeck, Lübeck, Germany

Corresponding author: wadid.foudhaili@uni-luebeck.de

**Abstract.** Intrusion detection systems (IDS) are crucial measures for enforcing network security. Their task is to detect anomalies in network communication and identify, if not thwart, possibly malicious behavior. Recently, machine learning has been deployed to construct intelligent IDS. This approach, however, is quite challenging, particularly in distributed, highly dynamic, yet resource-constrained systems like Edge setups. In this paper, we tackle this issue from multiple angles by analyzing the concept of intelligent IDS (I-IDS) while addressing the specific requirements of Edge devices with a special focus on reconfigurability. Then, we introduce a systematic approach to constructing the I-IDS on reconfigurable Edge hardware. For this, we implemented our proposed IDS on state-of-the-art Field Programmable Gate Array (FPGA) technology as (1) a purely FPGA-based dataflow processor (DFP) and (2) a co-designed approach featuring a RISC-V soft-core as an FPGA-based soft-core processor (SCP).
We complete our paper with a comparison to the state of the art (SoA) in this domain. The results show that DFP and SCP are both suitable for Edge applications from hardware resource and energy efficiency perspectives. Our proposed DFP solution clearly outperforms the SoA and demonstrates that the required high performance can be achieved without prohibitively high hardware costs. This makes our proposed DFP suitable for Edge-based high-speed applications like modern communication technology.

**Keywords:** Intrusion detection system $\cdot$ Reconfigurable Hardware $\cdot$ Edge Device $\cdot$ FPGA-based RISC-V Soft-Core.

<sup>\*</sup> This work has been partially funded by the German Ministry of Education and Research (BMBF) via project RILKOSAN (16KISR010K) and partially via project SILGENTAS (16KIS1837).

#### 1 Introduction

An Intrusion Detection System (IDS) is crucial in fortifying network security against threats such as, but not limited to, Distributed Denial-of-Service (DDoS) attacks. Basically, an IDS acts as an additional defense layer, detecting and responding to potential threats that may elude preemptive measures. It is also defined [32,12] as a security tool that constantly monitors host or network traffic, or both, to detect any suspicious behavior that violates the security policy and compromises confidentiality, integrity, and availability. The typical outcome of the system is to generate alerts about detected malicious behavior to the host or network administrators. A successful DDoS attack that was reported in 2016 [8] leaves us with the following conclusion: "If there was a distributed intrusion detection system, it might have been able to detect the attack at its early stage and limit the loss caused by the attack." [29]. This capability of distributed IDS encompasses identifying malware, phishing attacks, and other cyber threats in an interactive manner.

Fig. 1. (a) Conventional IDS vs (b) Distributed IDS on the Edge

Fig.
1 shows two IDS deployment scenarios, which we call *conventional* and *distributed Edge-based*. The distributed IDS should satisfy special requirements to meet the hardware and power constraints of the edge level. However, it should be noted that a conventional IDS leveraging reconfigurable hardware dramatically improves the detection system's performance [14]. Therefore, reconfigurable hardware such as field-programmable gate arrays (FPGAs) has become one of the foundations for IDS on the Edge as well.

#### 1.1 Machine Learning-based IDS on the Edge

Machine learning (ML) models, particularly deep neural networks (DNN), have shown potential to enhance the performance of intrusion detection mechanisms [13]. For instance, support vector machines (SVM) [11] and Hidden Naïve Bayes (HNB) [17] were proposed to enhance the accuracy and speed of the detection capability. The primary goal of ML-based IDS is to increase the number of correct predictions [13], including for not-yet-known attacks (zero-day attacks), which makes it more efficient than signature-based methods. ML model quality can be evaluated using metrics, notably accuracy and F1 score [13].

Three main technical obstacles stall the building of ML-based IDS on the edge: (1) the considerable size of such a system, which renders implementation at the edge level a technical challenge, (2) the required inference throughput on resource-limited Edge-node hardware, and (3) updates of the ML-based IDS, which require re-initiating the whole system. Several approaches have been proposed to overcome these challenges, especially for the Edge-based deployment scenario. Therefore, a need exists for a clear methodology and criteria to build an ML-based IDS relying on reconfigurable Edge hardware.

#### 1.2 Paper Contribution

In this paper, we present a systematic selection methodology to construct a machine learning-based intrusion detection system targeting reconfigurable Edge hardware.
In particular, we first investigate the pros and cons of reconfigurable Edge hardware in Section 3. Two hardware configurations are selected: an FPGA-based dataflow processor and a RISC-V soft-core as an FPGA-based soft-core processor. Further, we establish hardware/software performance evaluation criteria for ML-based IDS (Intelligent IDS) on the Edge in Section 4. Then, we construct several machine-learning models to serve as an Intelligent IDS in Section 5. Finally, we validate the established criteria against the detection systems running on the two proposed hardware configurations and evaluate their performance results in Section 5.3.

Our approach aims at constructing an intelligent IDS relying on reconfigurable Edge hardware and providing high inference throughput to serve in high-performance Edge applications such as the future generation of high-speed communication technology. To the best of our knowledge, this is the first work that establishes a systematic methodology for selecting a highly accurate ML-based IDS realized on two different configurations of reconfigurable Edge hardware.

#### 2 Related Work

In the following, we highlight the main approaches that leverage reconfigurable hardware to build an ML-based IDS.

#### 2.1 FPGA-based Dataflow Processor for ML-based IDS

While the demand for FPGA-based dataflow processors (FPGA-based DFP) to accelerate ML and DNN algorithms grows, research on IDS designs in this area remains limited. FPGA-based DFP for ML-based IDS has been proposed in a few works, such as [26] and [31]. For example, a multilayer perceptron (MLP) was implemented on a Xilinx Virtex-5 FPGA in [26]. The proposed network is a small model with two hidden layers, trained on only six features from the NSL-KDD dataset [36]. This MLP achieved a maximum throughput of 9.86 Gbps with packets containing 1500 bytes, a speedup of 11.6× compared to a GPU.
In [31], LogicNets, "a methodology that allows trained quantized networks to be directly converted to an equivalent hardware" [31], was deployed to map a quantized MLP to hardware building blocks. The resulting DFP achieves a highly efficient acceleration rate. In [20], a convolutional neural network (CNN) topology was implemented on a PYNQ-Z2, exploring 8-, 4-, and 2-bit quantization. Extra pre-processing steps were also applied to reshape the raw data as an image. The experimentation used the CICIDS2017 dataset to detect one of 13 possible attack categories. The demonstrated DFP achieved a throughput of 9635 inferences/s at 100 MHz with 99.4% accuracy for the 2-bit quantized design.

#### 2.2 FPGA-based Soft-Core Processor for ML-based IDS

The use of RISC-V in intrusion detection and IoT security is the subject of recent research. A RISC-V SoC was proposed in [9] as a platform to build a test environment for a man-in-the-middle attack simulation. In [21], a new RISC-V SoC was built based on the previous RISC-V SoC [9] to construct a rule-based intrusion detection engine. The system runs Linux and uses Snort [2] to capture network packets. If a match with the rules is found, an alarm is triggered, and the event is written into a log file. Several RISC-V soft-cores to be used as an SCP have been proposed for performance-demanding and accelerated applications on the edge [10,35,16]. In [7], an SCP (RISC-V CV32E41P) was synthesized to run at around 65 MHz. The core is coupled to an on-FPGA tracer and arbiter to build a host-based IDS [4]. Moreover, a different IDS implementation [7] traces the hardware performance counters of the processor event values to detect any buffer overflow in the stack or heap in the Long-Range Wide Area Network (LoRaWAN) protocol stack.

Following the state of the art, we developed our own approach towards FPGA-based DFP and SCP for use in ML-based IDS targeting reconfigurable edge hardware.
We hence first discuss the advantages of reconfigurable hardware for IDS on the edge in the following section.

#### 3 Reconfigurable Edge Hardware for IDS: Pros & Cons

Besides the relatively lower cost of hardware design deployment on reconfigurable hardware (RHW) compared to other technologies, RHW enables tuning the hardware to current application needs, offering flexible update and extension of an implemented design. This feature also reduces development and reengineering costs. FPGAs exhibit several advantages, such as low cost in silicon chip area, high performance, and low power consumption [19,5]. However, RHW has several limitations, most notably temporal or operational granularity. In the following, we highlight the pros and cons of FPGA-based DFP and SCP.

**FPGA-based DFP** feature both a high level of parallelism and a need for reconfigurability. Their design offers high performance and low energy consumption, as highlighted by benchmarking studies [6], particularly in DNN acceleration, comparing favorably with CPU and GPU platforms. However, the reconfigurability of FPGAs, while advantageous for computational acceleration, presents challenges due to the time- and power-consuming nature of the reconfiguration process. A trade-off has to be made between the static (running) phase and the reconfiguration phases. Despite the long-standing proposal of reconfigurable computing architectures, they have not gained widespread popularity. One reason for this is the requirement to use hardware design languages and dedicated design environments, adding complexity and costs for developers [33]. This is, however, mitigated by the ability to perform a complete parallelization, allowing true parallel execution of operations without sacrificing inference accuracy. Parallelization and reducing an IDS's computational complexity are hence the most prominent techniques used in an FPGA-based DFP.
**FPGA-based SCP**, being software-programmable by nature, are more easily accessible to software programmers. They, however, also come at some cost that must be considered when choosing an IDS deployment platform. An FPGA-based SCP implementation offers:

- **Flexibility** FPGA-based SCP can execute an IDS based on different computation precisions offered by the employed soft-core, such as Float32, INT8, or INT4. Orthogonally, soft-cores can be adjusted, enhanced, and extended, meeting new IDS requirements whenever needed.
- **Execution efficiency (performance)** With the availability of vector extensions to exploit data-parallel workloads [10], very efficient intrusion detection capability can be offered.
- **Portability** FPGA-based SCP can be implemented using cheaper FPGA resources, reducing overall system cost. Also, the code designed to run on a soft-core might take advantage of high-level programming languages and libraries, thus making the developed code easily portable to other platforms.

On the other hand, development complexity and limited availability are examples of the disadvantages of FPGA-based SCP.

Table 1 shows a comparison between FPGA-based DFP and SCP based on characteristics natively supported by the hardware, namely computation precision, ML topology update, ML parameter update, and required update time, that can deliver the best achievable performance and allow reconfigurability.

Table 1. Comparison of FPGA-based DFP and SCP.

| Reconfigurable Hardware | Computation Precision | ML Topology Update | ML Parameters Update |
|-------------------------|-----------------------|--------------------|----------------------|
| FPGA-based DFP          | Fixed                 | Not on the fly     | Partially            |
| FPGA-based SCP          | Flexible              | On the fly         | Flexible             |

FPGA-based DFP can easily accommodate ML hardware designs with floating point (FP), fixed point (FxP), and integer (INT) precision thanks to their reconfigurability. However, once an ML hardware design is programmed on the FPGA, it cannot be updated easily on the fly.
Here, dynamic partial reconfiguration could be a promising solution that allows a limited, predefined part of the ML design on an FPGA to be reconfigured while the other parts continue working. Like any other CPU, FPGA-based SCP can easily accept any computation precision and update the ML topology. In contrast, FPGA-based DFP requires repeating the process of generating a new hardware design to update the ML topology. The same goes for updating trained parameters: FPGA-based DFP require a particular mechanism for external parameter loading. This makes FPGA-based DFP only partially able to update trained parameters. In the case of FPGA-based SCP, updating the ML topology or its trained parameters is comparatively less complicated, more straightforward, and less time-consuming.

#### 4 Performance Evaluation Criteria for IDS on the Edge

To evaluate the performance of an IDS on the Edge, specific acceleration criteria must be considered on both levels, i.e., algorithm and reconfigurable hardware. We detail this in the following two sections.

#### 4.1 IDS Algorithm Evaluation Criteria

The IDS should be accurate from a software perspective, i.e., it should detect an intrusion with high accuracy and negligible false alarms. In the case of *intelligent IDS*, several metrics can be used to evaluate how well the ML model performs:

- **Precision (P)** This metric is fundamental, as one goal of an intelligent IDS is to minimize false positives. It measures how many of the positive predictions made are correct (true positives) [24].
- **Recall (R)** It measures how many positive cases the classifier correctly predicted out of all the positive cases in the data. This metric is also important because an IDS aims to detect as many attacks as possible [22].
- **F1 Score (F1)** It is the harmonic mean of *Precision* and *Recall*, with both contributing equally to the score [24].
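The three metrics above can be written in terms of true positives (TP), false positives (FP), and false negatives (FN): $P = TP/(TP+FP)$, $R = TP/(TP+FN)$, and $F1 = 2PR/(P+R)$. The following minimal Python sketch illustrates these standard definitions for a single positive class; the function name and the example counts are hypothetical and not taken from the paper's experiments.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) from confusion-matrix counts.

    Undefined ratios (zero denominator) are reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for illustration only.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"P={p:.2f} R={r:.2f} F1={f1:.2f}")  # prints: P=0.90 R=0.75 F1=0.82
```

In this example, 10 false alarms lower precision to 0.90, while 30 missed attacks lower recall to 0.75; F1 sits between the two and is pulled toward the lower value.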
Additionally, the IDS should satisfy the following criterion: even though there are several types of intrusion with different occurrence frequencies, an IDS should stay accurate when a bias toward one attack over the others occurs.

#### 4.2 Reconfigurable Hardware Evaluation Criteria

In addition to the algorithmic requirements, the following hardware criteria must be met:

- **Hardware Resource Utilization** An ideal IDS for edge devices should consume as few resources as possible, especially DSP components, which are the most power-demanding units. This criterion significantly impacts the other hardware criteria, mainly computational density.
- **Inference Throughput** [25] This criterion measures how many packets are processed by the intrusion detection system in a given amount of time. The IDS throughput is measured in packets/sec. It should be noted that the network capacity limits this metric.
- **Energy Efficiency** [25] This criterion can be expressed as the inference throughput over energy consumption. For instance, the energy efficiency of the ML model for an IDS is evaluated in packets/sec/Watt.
- **Computational Density** [25] Computational density is a metric used in FPGA design, referring to the ratio of computations performed by a particular design over the number of resources utilized. In other words, this criterion indicates whether the hardware design suffers from resource underutilization. For instance, when two different accelerators deliver the same inference throughput, the one with the lower DSP usage is considered better regarding computational density. The computational density is expressed as Throughput/#DSP or Throughput/#LUT.
- **Flexibility** This criterion is used to measure and compare the complexity of development, maintenance, and implementation of new features, as well as maintainability and adaptability to new ML models and to new network conditions.

Some of these criteria directly impact the others.
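How the derived criteria depend on throughput, power, and resource counts can be sketched in a few lines of Python. This is only an illustration of the ratios defined in the list above; the function names are hypothetical, and the input figures are the ones this paper reports for the Attack MLP (Tables 3 and 5, and the 2.34 W power draw stated in Section 5.4).

```python
def energy_efficiency(throughput_pps: float, power_w: float) -> float:
    """Packets/sec/W: inference throughput divided by power draw."""
    return throughput_pps / power_w


def computational_density(throughput_pps: float, resource_count: int) -> float:
    """Packets/sec per resource unit (e.g., per LUT or per DSP)."""
    return throughput_pps / resource_count


# Figures as reported in this paper for the Attack MLP.
dfp_throughput = 1_166_861  # packets/sec, FPGA-based DFP (Table 3)
dfp_luts = 47_514           # LUT usage, FPGA-based DFP (Table 5)
scp_throughput = 202_849    # packets/sec, RISC-V SCP (Table 3)
scp_power_w = 2.34          # W, RISC-V SCP (Section 5.4)

print(computational_density(dfp_throughput, dfp_luts))
print(energy_efficiency(scp_throughput, scp_power_w))
```

Both results land close to the corresponding entries in Table 3 (24.55 packets/sec/LUT and 86650 packets/sec/W); small differences are to be expected because the reported power value is rounded.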
For instance, the computational density is directly derived from throughput and resource utilization. Likewise, resource utilization may indirectly decrease energy efficiency if the resources are too power-demanding compared to the achieved throughput.

#### 5 Proposed IDS Design Methodology

The previous discussion is applied in a 4-step approach to design, implement, and evaluate FPGA-based DFP and SCP approaches. We (1) construct several IDSs based on state-of-the-art algorithms in the ML domain, (2) select ML models with high precision (P) and F1 scores and smaller model sizes in bytes, (3) implement the chosen ML models for IDS on FPGA-based DFP and SCP, and finally (4) analyze these implementations regarding the proposed hardware criteria and, following a dedicated edge use-case, choose the IDS implementation that matches the high-speed requirements of modern communication technology.

#### 5.1 Step 1: Intelligent IDS

In this section, we outline the use and customization of several well-known ML models and MLP, on which our IDS is based. Choosing and adjusting the right model is paramount for both detection quality and hardware use. As shown in the evaluation section, the resulting *Intelligent IDS* can offer high performance at minimal hardware cost, leveraging the capability of NNs to discern intricate patterns in extensive datasets.

**The BOT-IoT Dataset** Bot-IoT [18] was developed within a testbed environment, employing a constellation of virtual machines featuring diverse operating systems, network firewalls, network taps, and the Node-RED and Argus tools [27,28]. The Bot-IoT dataset is characterized by multiple sets and subsets, each distinguished by file format, size, and feature count variations. Fig. 2 shows the dataset balance for each attack category and subcategory, reflecting the whole dataset's general imbalance. The BOT-IoT dataset includes several attack scenarios.
From these, we select a subset that covers the following attacks: DoS (TCP, HTTP), reconnaissance (service scan and OS fingerprinting), theft (keylogging and data extraction), and intrusion-free.

Fig. 2. Attack categories and subcategories distribution of the BOT-IoT dataset

**Intelligent IDS Construction** This step involves partitioning the preprocessed data into an 80% training set and a 20% set for testing and evaluation. We first train XGBoost (XGB), Support Vector Machine (SVM), Naive Bayes (NB), Random Forest Classifier (RFC), and Decision Tree (DT) models. Additionally, three Multi-Layer Perceptron (MLP) models are trained, each tailored to a distinct classification target: attacks, categories, and subcategories. All three MLP models share a nearly identical topology, featuring an input layer with 24 inputs, followed by two hidden layers of sizes 32 and 64, respectively, and Rectified Linear Unit (ReLU) activation functions. The sole distinction among the MLP models resides in the configuration of the final layer, i.e., the classification layer. The Attack model is designed to discern the presence or absence of an attack; hence, its last layer has a size of 2, followed by a Softmax activation. Analogously, the Category and Subcategory models' classification layers have sizes of 4 and 7, respectively, aligning with their distinct classification objectives.

#### 5.2 Step 2: Intelligent IDS Selection

According to the introduced algorithm evaluation criteria, a performance comparison for each ML model is made. Table 2 compares model detection accuracy and size. NB exhibits a very low F1 Score. Therefore, it is eliminated, and we focus on ML models with a high F1 Score. Overall, XGBoost and MLP outperform the other models.
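To make the MLP topology from Section 5.1 concrete, the following Python sketch counts the trainable parameters of the three models and their raw Float32 footprint. It is a rough estimate under stated assumptions: the layers are taken to be fully connected with one bias per output neuron (the text does not state this explicitly), and the helper name `dense_param_count` is hypothetical.

```python
from typing import List


def dense_param_count(layer_sizes: List[int]) -> int:
    """Weights plus biases of a fully connected network.

    layer_sizes lists the width of every layer, input first.
    Assumes one bias per output neuron in each layer.
    """
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


# Topology from Section 5.1: 24 inputs, hidden layers of 32 and 64,
# and 2, 4, or 7 outputs depending on the classification target.
OUTPUT_SIZES = {"Attack": 2, "Category": 4, "Subcategory": 7}
BYTES_PER_FLOAT32 = 4

for name, n_out in OUTPUT_SIZES.items():
    params = dense_param_count([24, 32, 64, n_out])
    print(f"{name}: {params} parameters, {params * BYTES_PER_FLOAT32} bytes")
# prints:
# Attack: 3042 parameters, 12168 bytes
# Category: 3172 parameters, 12688 bytes
# Subcategory: 3367 parameters, 13468 bytes
```

Under these assumptions, the raw parameter footprint (roughly 11.9 KB to 13.2 KB) is somewhat below the MLP sizes listed in Table 2 (15.2 KB to 16.6 KB). The gap would be consistent with storage overhead of the saved model files, although the paper does not break this down.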
Table 2. Evaluation of ML algorithms for IDS: ML Metrics vs Size. Column groups: attack detection, category classification, and subcategory classification.

| Algorithm | Attack P | Attack R | Attack $F_1$ | Attack size | Category P | Category R | Category $F_1$ | Category size | Subcategory P | Subcategory R | Subcategory $F_1$ | Subcategory size |
|-----------|------|------|------|---------|------|------|------|---------|------|------|------|---------|
| XGB       | 1.00 | 1.00 | 1.00 | 0.38 MB | 1.00 | 1.00 | 1.00 | 1.15 MB | 0.99 | 0.99 | 0.99 | 2.15 MB |
| SVM       | 0.99 | 0.99 | 0.99 | 164 KB  | 1.00 | 0.99 | 1.00 | 288 KB  | 0.97 | 0.89 | 0.92 | 5.7 MB  |
| NB        | 0.57 | 0.96 | 0.62 | 1.35 KB | 0.78 | 0.94 | 0.78 | 2.23 KB | 0.78 | 0.70 | 0.60 | 3.62 KB |
| RFC       | 1.00 | 0.97 | 0.98 | 123 KB  | 1.00 | 0.95 | 0.97 | 157 KB  | 0.70 | 0.59 | 0.62 | 0.2 MB  |
| DT        | 0.99 | 0.99 | 0.99 | 2.23 KB | 1.00 | 0.99 | 1.00 | 3.25 KB | 0.83 | 0.81 | 0.82 | 4.35 KB |
| MLP       | 1.00 | 1.00 | 1.00 | 15.2 KB | 1.00 | 1.00 | 1.00 | 15.8 KB | 0.99 | 0.93 | 0.96 | 16.6 KB |

#### 5.3 Step 3: FPGA-based Intelligent IDS Implementation

Here, we describe the experimental setup of the FPGA-based DFP and the RISC-V soft-core as FPGA-based SCP. Both experiments are evaluated using the Xilinx ZCU104 FPGA platform.

**Experimental Setup** The experimental setup, illustrated in Fig. 3, includes the FPGA-based RISC-V SCP and the DFP experimental process. For the SCP, we opt for a 64-bit Rocket core [3], configured through the Chipyard framework [1]. The Rocket core is a 5-stage single-issue in-order processor executing the 64-bit scalar RISC-V ISA [34]. This core can accommodate operating systems and features an optional IEEE 754-2008-compliant FPU for single- and double-precision floating-point operations, including fused multiply-accumulate.
MLP models are saved as ONNX models, transformed into C code for seamless porting onto a RISC-V soft-core, and compiled using the appropriate RISC-V GNU toolchain and flags<sup>1</sup> for bare-metal execution. In contrast, the approach for the FPGA-based DFP involves converting trained models to HLS projects using hls4ml. Notably, hls4ml lacked inherent support for Float32 conversion, prompting manual intervention to adjust data types for different layers and activation functions. The Softmax function is also modified to accommodate Float32 operations, ensuring a fair comparison between the FPGA-based DFP and SCP. Subsequently, IPs for the various MLP models are generated and integrated into a corresponding FPGA design, and the resulting bitstreams are deployed on the FPGA platform for benchmarking.

Fig. 3. IDS experimental setup: RISC-V soft-core and FPGA-based DFP.

#### 5.4 Step 4: Systematic Implementation Comparison

**Hardware Usage Comparison** Fig. 4-(a) shows the hardware resources required to implement the FPGA-based DFP as three individual MLP IPs. All the IPs exhibit identical Block RAM (BRAM) utilization ratios for FPGA-based DFP. This uniformity can be attributed to the shared topology among the MLP models, except for the last layer. The shared structure comprises two layers, constituting the most memory-intensive part because they hold most of the model parameters. However, slight variations in other resources arise from the intentional partitioning of parameters and result arrays in the last Softmax layer, mapped as Look-Up Tables (LUTs) and First-In-First-Out (FIFO) structures. The size of the Softmax layer accounts for the marginal fluctuations in the use of the Digital Signal Processors (DSP). To compare FPGA-based DFP and RISC-V SCP, we implement them to operate at the same frequency of 100 MHz. Their respective hardware utilization ratios are illustrated in Fig. 4-(b).
The high parallelization of the FPGA-based DFPs requires more resources than the FPGA-based RISC-V SCP, except for the BRAM units, which seem to be used more by the soft-core for its caches. In contrast, FPGA-based DFPs require more LUTs, FIFOs, and DSPs for parallelized processing.

<sup>1</sup> `$ riscv64-unknown-elf-gcc -std=gnu99 -O2 -Wall -lm -fno-common -fno-builtin-printf -specs=htifnano.specs` and `$ riscv64-unknown-elf-gcc -static -T riscv64-unknown-elf/lib/htif.ld -lm`

**Fig. 4.** Hardware Utilization of (a) our 3 MLPs on FPGA and (b) FPGA-based RISC-V SCP vs. Overall MLPs as FPGA-based DFP.

**Comparison of Throughput, Energy Efficiency & Logic Density** We analyze and compare the two designs based on the earlier-defined criteria. The FPGA-based DFP is configured so that every compute unit executes only four multiply-accumulate operations sequentially, resulting in higher parallelism. A design parameter, namely the Reuse factor, controls this parallelism mechanism. Additionally, the last Softmax layer was fully unrolled to compensate for the extra latency overhead caused by using Float32 arithmetic. As a result, FPGA-based DFP exhibits $\approx 6\times$ higher throughput than the FPGA-based RISC-V SCP, as shown in Table 3. The FPGA-based RISC-V SCP, in turn, draws only 2.34 W, almost half the power of the FPGA-based DFP. However, the throughput superiority of the DFP makes it $\approx 3\times$ more energy efficient than the FPGA-based RISC-V SCP. This is also why it exhibits between 5× and 6× higher logic density. These measures can be expected to be even higher with low-precision arithmetic such as fixed-point and integer, especially for FPGA-based DFP.

**Flexibility Comparison** Both processing systems have been evaluated based on their flexibility, as detailed in Table 4. The flexibility comparison is dedicated to the implemented processors and is based on the above-mentioned aspects: precision, topology, and parameter updates.
Additionally, we investigate the required time to update. The proposed FPGA-based DFP is very flexible regarding computation precision, such as floating point (FP), fixed point (FxP), and integer (INT), due to FPGA reconfigurability. The chosen FPGA-based RISC-V SCP, in turn, has a fixed data-path, which limits its precision capability to FP and a specific set of integers, such as INT8, INT16, and INT32. Consequently, it offers fewer options to optimize the ML-based IDS through quantization. However, updating the ML topology or its trained parameters is significantly less complicated and, therefore, more straightforward in the case of FPGA-based RISC-V SCP; it only requires a new source-code compilation.

Table 3. FPGA-based DFP vs RISC-V SCP with Float32 Precision

| MLP Model | Throughput (Packets/sec) | Energy Efficiency (Packets/sec/W) | Logic Density (Packets/sec/LUT) |
|-----------|--------------------------|-----------------------------------|---------------------------------|
| **FPGA-based DFP** | | | |
| Attack      | 1166861 (1.16 M) | 265799 (265 K) | 24.55 |
| Category    | 1135073 (1.13 M) | 255589 (255 K) | 23.44 |
| Subcategory | 1118568 (1.11 M) | 249346 (249 K) | 20.11 |
| **RISC-V SCP - Optimized Baremetal** | | | |
| Attack      | 202849 (202 K) | 86650 (86 K) | 4.157 |
| Category    | 197500 (197 K) | 84365 (84 K) | 4.047 |
| Subcategory | 197342 (197 K) | 84298 (84 K) | 4.44 |

Table 4. Flexibility Comparison of FPGA-based DFP and RISC-V SCP.

| Processor | Precision: FP | Precision: FxP | Precision: INT | Topology update | Parameters update | Update time |
|-----------|---------------|----------------|----------------|-----------------|-------------------|-------------|
| FPGA-based DFP        | yes | yes | yes              | no  | no<sup>1</sup> | longer  |
| FPGA-based RISC-V SCP | yes | no  | yes<sup>2</sup> | yes | yes            | shorter |

<sup>1</sup> Only possible if the design includes an external weights loading mechanism.

<sup>2</sup> Supports only a subset of integers, such as INT8/16/32.

**Proposed FPGA-based DFP Compared to the State of the Art** Table 5 compares our proposed FPGA-based DFP for ML-based IDS and the state of the art in this domain. The results show that our proposed intelligent IDS distinguishes 2, 4, and 7 classes with its three models (13 output classes in total), and its implementation as FPGA-based DFP exhibits very high throughput at low hardware resource usage. This makes it suitable for application at the edge level and clearly demonstrates that such a high-performance solution does not necessarily come at prohibitively high hardware costs.

Table 5. State of the Art FPGA-based DFP for ML-based IDS.

| References | [20] | [31] | [26] | [15] | This work (Attack) | This work (Category) | This work (Subcategory) |
|------------|------|------|------|------|--------------------|----------------------|-------------------------|
| FPGA | xc7Z020 | xc7Z020 | xc5vtx | xc7Z020 | xczu7ev | xczu7ev | xczu7ev |
| Frequency (MHz) | 100 | 471 | 104 | 76 | 100 | 100 | 100 |
| Dataset | CICIDS2017 [30] | UNSW-NB15 [23] | NSL-KDD [36] | NSL-KDD | BOT-IoT [18] | BOT-IoT [18] | BOT-IoT [18] |
| ML topology | CNN | MLP | MLP | MLP | MLP | MLP | MLP |
| Number of layers | 4×Conv + 2×FC | 5×FC | 2×FC | 3×FC | 3×FC | 3×FC | 3×FC |
| Intrusion Classes | 13 | 2 | 2 | 2 | 2 | 4 | 7 |
| Accuracy (%) | 99.4 | 91.3 | 87.3 | 80.52 | 99.9 | 99.9 | 99.9 |
| Throughput (Packets/sec) | 9635 | 754292 | 821667 | 217074 | 1.16 M | 1.13 M | 1.11 M |
| LUT usage | 24635 | 15494 | 117082 | 26463 | 47514 | 48413 | 55627 |
| Usage ratio (%) | 46.3 | 29.12 | 78.2 | 50 | 20.6 | 21 | 24.1 |

#### 6 Conclusion

Current approaches to intrusion detection systems (IDS) feature the use of machine learning (ML).
However, ML-based IDSs still face technical obstacles such as their considerable size, and their update requires re-initiating the whole IDS. In this paper, we investigate ML-based IDS targeting the edge level, featuring reconfigurable edge nodes. Here, typically high throughput is required in order to keep up with the real-time data transmissions, yet node resource use is constrained. Orthogonally, intrusion detection in a reconfigurable system also demands an equally flexible adaptability with respect to detection itself. We hence construct a systematic approach to ML-based intrusion detection on the edge, leading to the proposed Intelligent IDS. We discuss two possible FPGA-based implementation scenarios, one plain hardware implementation (FPGA-based dataflow processor, DFP) and one featuring a RISC-V softcore. Both implementations are evaluated and compared to each other and the state of the art. The results clearly demonstrate that the high performance of a hardware implementation does not necessarily come at prohibitively high hardware cost, with our solution exhibiting higher throughput, better energy efficiency, and better logic density in addition to an overall better configurability. Our proposed DFP hence can be employed in high-performance Edge-based applications like modern communication technology. #### Disclosure of Interests The authors have no competing interests to declare that are relevant to the content of this article. ## References - Amid, A., Biancolin, D., Gonzalez, A., Grubb, D., Karandikar, S., Liew, H., Magyar, A., Mao, H., Ou, A., Pemberton, N., Rigge, P., Schmidt, C., Wright, J., Zhao, J., Shao, Y.S., Asanović, K., Nikolić, B.: Chipyard: Integrated design, simulation, and implementation framework for custom socs. IEEE Micro 40(4), 10–21 (2020). https://doi.org/10.1109/MM.2020.2996616 - 2. Amon, C., Shinder, T.W., Carasik-Henmi, A.: Chapter 29 introducing snort. In: The Best Damn Firewall Book Period, pp. 1183–1208. 
Syngress, Burlington (2003). https://doi.org/10.1016/B978-193183690-6/50070-4
3. Asanović, K., Avizienis, R., Bachrach, J., Beamer, S., Biancolin, D., Celio, C., Cook, H., Dabbelt, D., Hauser, J., Izraelevitz, A., Karandikar, S., Keller, B., Kim, D., Koenig, J., Lee, Y., Love, E., Maas, M., Magyar, A., Mao, H., Moreto, M., Ou, A., Patterson, D.A., Richards, B., Schmidt, C., Twigg, S., Vo, H., Waterman, A.: The rocket chip generator. Tech. Rep. UCB/EECS-2016-17, EECS Department, University of California, Berkeley (Apr 2016)
4. Azad, T.B.: Chapter 7 locking down your XenApp server. In: Azad, T.B. (ed.) Securing Citrix Presentation Server in the Enterprise, pp. 487–555. Syngress, Burlington (2008). https://doi.org/10.1016/B978-1-59749-281-2.00007-X
5. Babu, P., Parthasarathy, E.: Reconfigurable FPGA architectures: A survey and applications. Journal of The Institution of Engineers: Series B **102**, 143–156 (2021)
6. Blott, M., Kath, J., Halder, L., Umuroglu, Y., Fraser, N., Gambardella, G., Leeser, M., Doyle, L.: Evaluation of optimized CNNs on FPGA and non-FPGA based accelerators using a novel benchmarking approach. In: Proceedings of the 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. p. 317. FPGA '20, Association for Computing Machinery, New York, NY, USA (2020). https://doi.org/10.1145/3373087.3375348
7. Bouazzati, M.E., Tessier, R., Tanguy, P., Gogniat, G.: A lightweight intrusion detection system against IoT memory corruption attacks. In: 2023 26th International Symposium on Design and Diagnostics of Electronic Circuits and Systems (DDECS). pp. 118–123 (2023). https://doi.org/10.1109/DDECS57882.2023.10139718
8. Brewster, T.: How hacked cameras are helping launch the biggest attacks the internet has ever seen. Forbes (2016), https://www.forbes.com/sites/thomasbrewster/2016/09/25/brian-krebs-overwatch-ovh-smashed-by-largest-ddos-attacks-ever/
9. Cai, B., Xie, S., Liang, Q., Lu, W.: Research on penetration testing of IoT gateway based on RISC-V. In: 2022 International Symposium on Advances in Informatics, Electronics and Education (ISAIEE). pp. 422–425 (2022). https://doi.org/10.1109/ISAIEE57420.2022.00093
10. Chander, V.N., Varghese, K.: A soft RISC-V vector processor for edge-AI. In: 2022 35th International Conference on VLSI Design and 2022 21st International Conference on Embedded Systems (VLSID). pp. 263–268 (2022). https://doi.org/10.1109/VLSID2022.2022.00058
11. Chen, W.H., Hsu, S.H., Shen, H.P.: Application of SVM and ANN for intrusion detection. Computers & Operations Research **32**(10), 2617–2634 (2005)
12. Denning, D.E.: An intrusion-detection model. IEEE Transactions on Software Engineering **SE-13**(2), 222–232 (1987)
13. Disha, R.A., Waheed, S.: Performance analysis of machine learning models for intrusion detection system using Gini impurity-based weighted random forest (GIWRF) feature selection technique. Cybersecurity **5**(1), 1 (2022)
14. Hutchings, B., Franklin, R., Carver, D.: Assisting network intrusion detection with reconfigurable hardware. In: Proceedings. 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines. pp. 111–120 (2002). https://doi.org/10.1109/FPGA.2002.1106666
15. Ioannou, L., Fahmy, S.A.: Network intrusion detection using neural networks on FPGA SoCs. In: 2019 29th International Conference on Field Programmable Logic and Applications (FPL). pp. 232–238. IEEE (2019)
16. Kimura, Y., Ootsu, K., Tsuchiya, T., Yokota, T.: Development of RISC-V based soft-core processor with scalable vector extension for embedded system. In: Proceedings of the 8th International Virtual Conference on Applied Computing & Information Technology. pp. 13–18. ACIT '21, Association for Computing Machinery, New York, NY, USA (2021).
https://doi.org/10.1145/3468081.3471061
17. Koc, L., Mazzuchi, T.A., Sarkani, S.: A network intrusion detection system based on a hidden naïve Bayes multiclass classifier. Expert Systems with Applications **39**(18), 13492–13500 (2012)
18. Koroniotis, N., Moustafa, N., Sitnikova, E., Turnbull, B.: Towards the development of realistic botnet dataset in the internet of things for network forensic analytics: Bot-IoT dataset. Future Generation Computer Systems **100**, 779–796 (2019)
19. Kuon, I., Rose, J.: Measuring the gap between FPGAs and ASICs. In: Proceedings of the 2006 ACM/SIGDA 14th International Symposium on Field Programmable Gate Arrays. pp. 21–30 (2006)
20. Le Jeune, L., Goedemé, T., Mentens, N.: Towards real-time deep learning-based network intrusion detection on FPGA. In: Applied Cryptography and Network Security Workshops. pp. 133–150. Springer International Publishing, Cham (2021)
21. Liang, Q., Xie, S., Cai, B.: Intelligent home IoT intrusion detection system based on RISC-V. In: 2023 IEEE 3rd International Conference on Power, Electronics and Computer Applications (ICPECA). pp. 296–300 (2023). https://doi.org/10.1109/ICPECA56706.2023.10076248
22. Mishra, A.: Evaluating Machine Learning Models, chap. 5, pp. 115–132. John Wiley and Sons, Ltd (2019). https://doi.org/10.1002/9781119556749.ch5
23. Moustafa, N., Slay, J.: UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In: 2015 Military Communications and Information Systems Conference (MilCIS). pp. 1–6 (2015). https://doi.org/10.1109/MilCIS.2015.7348942
24. Müller, A.C., Guido, S.: Introduction to machine learning with Python: a guide for data scientists. O'Reilly Media, Inc. (2016)
25. Nechi, A., Groth, L., Mulhem, S., Merchant, F., Buchty, R., Berekovic, M.: FPGA-based deep learning inference accelerators: Where are we standing? ACM Trans. Reconfigurable Technol. Syst. **16**(4) (Oct 2023). https://doi.org/10.1145/3613963
26. Ngo, D.M., Tran-Thanh, B., Dang, T., Tran, T., Thinh, T.N., Pham-Quoc, C.: High-throughput machine learning approaches for network attacks detection on FPGA. In: Vinh, P.C., Rakib, A. (eds.) Context-Aware Systems and Applications, and Nature of Computation and Communication. pp. 47–60. Springer International Publishing, Cham (2019)
27. Node-RED: Low-code programming for event-driven applications (2021), https://nodered.org/
28. QOSIENT, L.: Argus (2023), https://openargus.org/
29. Sha, K., Yang, T.A., Wei, W., Davari, S.: A survey of edge computing-based designs for IoT security. Digital Communications and Networks **6**(2), 195–202 (2020)
30. Sharafaldin, I., Lashkari, A.H., Ghorbani, A.A.: Toward generating a new intrusion detection dataset and intrusion traffic characterization. In: 4th International Conference on Information Systems Security and Privacy (ICISSP). Portugal (2018)
31. Umuroglu, Y., Akhauri, Y., Fraser, N.J., Blott, M.: LogicNets: Co-designed neural networks and circuits for extreme-throughput applications. In: 2020 30th International Conference on Field-Programmable Logic and Applications (FPL). pp. 291–297 (2020). https://doi.org/10.1109/FPL50879.2020.00055
32. Vasilomanolakis, E., Karuppayah, S., Mühlhäuser, M., Fischer, M.: Taxonomy and survey of collaborative intrusion detection. ACM Computing Surveys (CSUR) **47**(4), 1–33 (2015)
33. Wang, T., Wang, C., Zhou, X., Chen, H.: An overview of FPGA based deep learning accelerators: Challenges and opportunities. In: 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS). pp. 1674–1681 (2019). https://doi.org/10.1109/HPCC/SmartCity/DSS.2019.00229
34. Waterman, A.: Design of the RISC-V Instruction Set Architecture. Ph.D.
thesis, EECS Department, University of California, Berkeley (Jan 2016), http://www2.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-1.html
35. Yiannacouras, P., Steffan, J.G., Rose, J.: VESPA: Portable, scalable, and flexible FPGA-based vector processors. In: Proceedings of the 2008 International Conference on Compilers, Architectures and Synthesis for Embedded Systems. pp. 61–70. CASES '08, Association for Computing Machinery, New York, NY, USA (2008). https://doi.org/10.1145/1450095.1450107
36. Zhao, R.: NSL-KDD (2022). https://doi.org/10.21227/8rpg-qt98