Given the resource constraints characteristic of ad hoc networks, deploying complex security systems that demand substantial computing resources is impractical in most circumstances. The use of anomaly and intrusion detection systems has therefore drawn considerable attention. Detection systems are implemented either as host-based (running on each node) or as network-based (running on a cluster head), and each approach has advantages and drawbacks. For example, a purely cluster-based system loses protection whenever nodes are delayed in electing or replacing a cluster head. Although various heuristics have been introduced, there is still room for improvement.
This dissertation proposes a detection system that can run either host-based or cluster-based to detect routing misbehavior. The detection operates on datasets built using the proposed routing information-sharing algorithms. When a previous network status or an exploratory network is available, the system trains with supervised learning; otherwise, it uses unsupervised learning. The testbed is extended to evaluate the effects of mobility and network size, and the simulation results show promising performance even under these limiting factors. Because the performance of the proposed detection system relies on communication among neighboring nodes, it can be heavily affected at the data-link layer by the presence of a jammer. In this dissertation, we analyze the efficiency of a single-task reinforcement learning algorithm that mitigates jamming attacks through a frequency-hopping strategy. Our findings show that single-task learning implementations do not always guarantee a higher cumulative reward, which motivates the use of multi-task learning instead.
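The single-task frequency-hopping idea can be sketched as follows. This is a minimal toy, not the dissertation's exact setup: the jammer model (channel 0 jammed with probability 0.8), the channel count, and the learning parameters are all illustrative assumptions, and the learner is a stateless tabular Q-learner rather than whatever function approximator the dissertation evaluates.

```python
import random

# Illustrative single-task sketch (assumed model, not the dissertation's):
# a transmitter uses stateless epsilon-greedy Q-learning to pick a hopping
# channel while a jammer blocks channel 0 most of the time.
N_CHANNELS = 5
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

def run(steps=5000, seed=0):
    rng = random.Random(seed)
    q = [0.0] * N_CHANNELS                 # one Q-value per channel
    total = 0.0
    for _ in range(steps):
        # Assumed jammer: channel 0 w.p. 0.8, otherwise a random other one.
        jammed = 0 if rng.random() < 0.8 else rng.randrange(1, N_CHANNELS)
        if rng.random() < EPSILON:
            a = rng.randrange(N_CHANNELS)                  # explore
        else:
            a = max(range(N_CHANNELS), key=q.__getitem__)  # exploit
        r = 1.0 if a != jammed else 0.0    # reward: transmission succeeded
        q[a] += ALPHA * (r + GAMMA * max(q) - q[a])
        total += r
    return total / steps                   # average per-step reward
```

Against this fixed jammer the learner beats a uniform-random hopper (whose expected per-step reward is 0.8), but because it is single-task and stateless it cannot jointly adapt sensing and transmission decisions, which is where the cumulative reward can fall short.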
Multi-task reinforcement learning offers room for performance improvement over single-task learning when the tasks are related and learned with mutual information. Therefore, to improve communication despite a jammer's presence, we propose deep multi-task conditional and sequential learning (DMCSL), a multi-task learning algorithm that develops a transition policy to resolve conditional and sequential tasks: sensing-time selection and transmission-channel selection. DMCSL is a composite of state-of-the-art reinforcement learning algorithms, namely a multi-armed bandit and an extended deep Q-network. To ensure convergence and an optimal cumulative reward, DMCSL is proposed with a continuous update algorithm for the sensing-time action space. The simulation results show that DMCSL with a logarithmically increased action space guarantees better performance than single-task learning.
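The sensing-time component can be illustrated with a bandit whose action space grows logarithmically with the round count. Everything concrete here is an assumption for illustration: the UCB1 selection rule, the reward model (throughput peaking at a hypothetical optimal sensing time of 0.6 s), and the growth schedule |A| ≈ 2 + log₂(t) are stand-ins, not DMCSL's actual update algorithm.

```python
import math
import random

# Sketch of sensing-time selection as a UCB1 multi-armed bandit whose
# action space of candidate sensing times is grown logarithmically with
# the round count. Reward model, grid bounds, and growth schedule are
# illustrative assumptions, not the dissertation's exact algorithm.

def _candidates(lo=0.2, hi=1.0, depth=5):
    """Sensing-time grid, refined breadth-first: coarse points first."""
    pts = [lo, hi]
    for d in range(1, depth + 1):
        step = (hi - lo) / 2 ** d
        pts.extend(lo + k * step for k in range(1, 2 ** d, 2))
    return pts

def ucb_sensing(rounds=4000, optimal=0.6, seed=0):
    rng = random.Random(seed)
    grid = _candidates()
    counts = [0] * len(grid)
    values = [0.0] * len(grid)
    n_arms = 2                       # start with the two coarse endpoints
    for t in range(1, rounds + 1):
        # Logarithmically increased action space: |A| ~ 2 + log2(t).
        n_arms = min(len(grid), 2 + int(math.log2(t + 1)))
        unplayed = [i for i in range(n_arms) if counts[i] == 0]
        if unplayed:                 # play each newly added arm once
            a = unplayed[0]
        else:                        # UCB1 upper-confidence selection
            a = max(range(n_arms),
                    key=lambda i: values[i]
                    + math.sqrt(2 * math.log(t) / counts[i]))
        # Assumed reward: throughput peaks at the optimal sensing time.
        r = max(0.0, 1.0 - abs(grid[a] - optimal)) + rng.gauss(0, 0.05)
        counts[a] += 1
        values[a] += (r - values[a]) / counts[a]   # incremental mean
    best = max(range(n_arms), key=lambda i: values[i])
    return grid[best]                # best sensing time found
```

Starting coarse and refining the grid logarithmically keeps early exploration cheap, while the later, finer arms let the bandit home in on the optimum; this mirrors why the logarithmically increased action space outperforms a fixed one in the reported results.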