Abstract:
In an era where privacy on the Internet has become fundamental, protocols that promote encryption, such as DNS over HTTPS (DoH) and DNS over TLS (DoT), have become popular. Although these protocols were introduced to overcome the drawbacks of the DNS protocol, DoH itself has security issues that must be addressed to prevent misuse. Herein, we implemented deep learning models to classify DoH traffic and identified the most efficient one with respect to time complexity and computational requirements. Previous studies have used a variety of features from datasets to identify malicious activities. Although machine learning and deep learning models are commonly used, many of them require considerable human intervention and are computationally demanding, because the model and its parameters must be tuned to obtain accurate results. In comparison, some deep learning models are more efficient, as they work well with little human intervention and can tune their parameters themselves. In this work, we used the CIRA-CIC-DoHBrw-2020 dataset and performed data imbalance handling, one-hot encoding, and feature selection to create a model that can be used in a more general environment. We implemented long short-term memory (LSTM), bidirectional LSTM (BiLSTM), and gated recurrent unit (GRU) models to classify DoH traffic with high accuracy. Although all three models achieved good accuracy, the BiLSTM model outperformed the LSTM model in both prediction time and accuracy, and the GRU model outperformed both the LSTM and BiLSTM models in accuracy, computation time, and computational complexity. Hence, the GRU model is the most efficient of the three.
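One way to see why a GRU can be computationally lighter than an LSTM or BiLSTM is to count the trainable parameters in a single recurrent layer. The sketch below is only an illustration under stated assumptions, not the models evaluated in this work: the function names, the input dimension, and the hidden size are hypothetical, and the formulas are the standard textbook ones (some frameworks add an extra bias vector per gate, which changes the totals slightly).

```python
def lstm_params(input_dim: int, hidden: int) -> int:
    """Trainable parameters in one textbook LSTM layer."""
    # Four weight blocks (input gate, forget gate, cell candidate, output gate),
    # each with input weights, recurrent weights, and one bias vector.
    return 4 * (hidden * (input_dim + hidden) + hidden)


def bilstm_params(input_dim: int, hidden: int) -> int:
    """A bidirectional LSTM runs two independent LSTMs (forward and backward)."""
    return 2 * lstm_params(input_dim, hidden)


def gru_params(input_dim: int, hidden: int) -> int:
    """Trainable parameters in one textbook GRU layer."""
    # Three weight blocks (update gate, reset gate, candidate state).
    return 3 * (hidden * (input_dim + hidden) + hidden)


if __name__ == "__main__":
    # Hypothetical sizes chosen for illustration only.
    input_dim, hidden = 28, 64
    for name, count in [
        ("LSTM", lstm_params(input_dim, hidden)),
        ("BiLSTM", bilstm_params(input_dim, hidden)),
        ("GRU", gru_params(input_dim, hidden)),
    ]:
        print(f"{name:>6}: {count} parameters")
```

For equal layer sizes, the GRU layer has three quarters of the LSTM's parameters and three eighths of the BiLSTM's. Parameter count is only a rough proxy for cost, so it does not by itself explain the measured prediction times.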