Accepted Papers

Analysis of Augmentation Methods for RF Fingerprinting under Impaired Channels
Ceren Comert1, Michel Kulhandjian1, Omer Melih Gul1, Azzedine Touazi2, Cliff Ellement2, Burak Kantarci1, and Claude D'Amours1
1 University of Ottawa
2 ThinkRF

Cyber-physical systems such as autonomous vehicle networks are considered critical infrastructure in various applications. However, their mission-critical deployment makes them prone to cyber-attacks. Radio frequency (RF) fingerprinting is a promising security solution that paves the way for “security by design” in critical infrastructure. With this in mind, this paper leverages deep learning methods to analyze the unique fingerprints of transmitters so as to discriminate between legitimate and malicious unmanned vehicles. As RF fingerprinting models are sensitive to varying environmental and channel conditions, these factors should be taken into consideration when deep learning models are employed. Acquiring additional data could be considered as an alternative; however, collecting training samples under many different circumstances is often infeasible. To address these aspects of RF fingerprinting, this paper applies various augmentation methods, namely additive noise, generative models, and channel profiling. Among the studied augmentation methods, our results indicate that tapped delay line and clustered delay line (TDL/CDL) models are the most viable solution, as transmitter-recognition accuracy increases significantly, from 74% to 87.94%, on unobserved data.
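
As an illustration of the core idea (not the paper's exact parameters), a minimal sketch of TDL-style channel augmentation: each captured I/Q burst is convolved with a random sparse impulse response whose taps follow an exponentially decaying power delay profile. Tap count, maximum delay, and the fading model are all assumptions here.

```python
import numpy as np

def tdl_augment(iq, num_taps=5, max_delay=8, rng=None):
    """Augment complex baseband I/Q samples with a random tapped-delay-line
    channel: Rayleigh-fading (complex Gaussian) tap gains drawn from an
    exponentially decaying power delay profile."""
    rng = rng or np.random.default_rng()
    delays = np.sort(rng.choice(max_delay, size=num_taps, replace=False))
    powers = np.exp(-delays / max_delay)              # exponential PDP
    powers /= powers.sum()
    gains = np.sqrt(powers / 2) * (rng.standard_normal(num_taps)
                                   + 1j * rng.standard_normal(num_taps))
    h = np.zeros(max_delay, dtype=complex)
    h[delays] = gains                                  # sparse impulse response
    return np.convolve(iq, h, mode="same")

# Example: augment one captured burst before adding it to the training set.
burst = np.exp(1j * 2 * np.pi * 0.1 * np.arange(1024))  # placeholder I/Q
augmented = tdl_augment(burst)
```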

Digital Twin Virtualization with Machine Learning for IoT and Beyond 5G Networks: Research Directions for Security and Optimal Control
Jithin Jagannath1, Keyvan Ramezanpour1, and Anu Jagannath1
1 Marconi-Rosenblatt AI/ML Innovation Lab, ANDRO Computational Solutions LLC

Digital twin (DT) technologies have emerged as a solution for real-time, data-driven modeling of cyber-physical systems (CPS) using the vast amount of data made available by Internet of Things (IoT) networks. In this position paper, we elucidate the unique characteristics and capabilities of a DT framework that enables the realization of such promises as online learning of a physical environment, real-time monitoring of assets, Monte Carlo heuristic search for predictive prevention, and on-policy and off-policy reinforcement learning in real time. We establish a conceptual layered architecture for a DT framework with decentralized implementation on cloud computing, enabled by artificial intelligence (AI) services for modeling and decision-making processes. The DT framework separates the control functions, deployed as a system of logically centralized processes, from the physical devices under control, much like software-defined networking (SDN) in fifth-generation (5G) wireless networks. To clarify the significance of DT in lowering the risk of developing and deploying innovative technologies on existing systems, we discuss implementing a zero-trust architecture (ZTA) as a necessary security framework in future data-driven communication networks.
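
For illustration only, a minimal sketch of the SDN-like separation the abstract describes: physical devices report state into a logically centralized twin layer, and control services read the twin rather than the devices. All class and field names here are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class DeviceState:
    """Telemetry reported by a physical IoT device."""
    device_id: str
    readings: dict

class DigitalTwin:
    """Logically centralized twin layer: keeps a live model of each device
    and exposes it to AI/control services for monitoring and decisions."""
    def __init__(self):
        self.states: dict[str, DeviceState] = {}

    def ingest(self, state: DeviceState):
        self.states[state.device_id] = state   # real-time state sync

    def query(self, device_id: str) -> DeviceState:
        return self.states[device_id]          # read path for services

class ControlService:
    """Control function decoupled from devices, SDN-style: it reads the
    twin, decides, and would push commands back to the physical layer."""
    def __init__(self, twin: DigitalTwin):
        self.twin = twin

    def decide(self, device_id: str) -> str:
        temp = self.twin.query(device_id).readings.get("temp", 0)
        return "throttle" if temp > 70 else "normal"
```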

Systems View to Designing RF Fingerprinting for Real-World Operations
Scott Kuzdeba1, Josh Robinson1, Joseph Carmack1, and David Couto1
1 BAE Systems

Great progress has been made recently in radio frequency (RF) machine learning (ML), including RF fingerprinting. Much of this work to date, however, has been limited in scope to proof-of-concept demonstrations, or narrowly defined and tested under circumstances that address only part of the actual operational considerations. In this paper, we broaden this scope for an RF fingerprinting application. In doing so, we build on our previous work developing the RiftNet™ deep learning classifier to consider realistic operational and systems aspects, ensuring the solution is robust to real-world signal environments and compatible with the other tasks of an RF receiver. In particular, we show new results on how to handle difficult cases of signal interference, how to efficiently improve performance using geolocation information, and ways to further reduce computational requirements for edge operation. In summary, this work moves beyond initial success in RF ML toward a solution ready for eventual edge operation in a larger system.

Online Stream Sampling for Low-Memory On-Device Edge Training for WiFi Sensing
Steven M Hernandez1 and Eyuphan Bulut1
1 Virginia Commonwealth University

Deploying machine learning models on-board edge devices allows for low-latency model inference and preserves data privacy by keeping sensor data local to the computation rather than at a central server. However, typical TinyML systems train a single global model that is duplicated across all edge devices. This leads to a model that is generalized to the training data but not specialized to the unique physical environment where the device is deployed. In this work, we evaluate how machine learning models can be trained on-board low-memory edge devices with streams of incoming data. On these low-memory devices, storage space is at a premium; as such, representative data samples must be captured from the data stream to ensure that the models can improve even with a limited set of available training samples. We propose the Variable Low/High Loss sampling method for selecting representative data samples from a data stream and demonstrate that our method increases the accuracy of the machine learning model compared to state-of-the-art methods. We demonstrate the applicability of our proposed method for WiFi sensing-based human activity detection, where WiFi signals are used to predict human activities in a given environment without requiring on-body sensors.
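
For illustration, a minimal sketch of loss-based stream sampling into a fixed-size on-device buffer. The quantile thresholds and eviction rule below are assumptions for exposition, not the paper's exact Variable Low/High Loss rule.

```python
import numpy as np

class LossBasedBuffer:
    """Fixed-size training buffer for on-device learning: keep samples whose
    loss is unusually high (hard/novel) or unusually low (prototypical),
    evicting the most 'average' sample when full."""
    def __init__(self, capacity=64):
        self.capacity = capacity
        self.samples, self.losses = [], []

    def offer(self, x, loss, low_q=0.1, high_q=0.9):
        if len(self.samples) < self.capacity:
            self.samples.append(x); self.losses.append(loss)
            return True
        lo, hi = np.quantile(self.losses, [low_q, high_q])
        if loss < lo or loss > hi:                      # keep loss extremes
            mid = int(np.argmin(np.abs(np.array(self.losses)
                                       - np.median(self.losses))))
            self.samples[mid], self.losses[mid] = x, loss
            return True
        return False                                    # discard ordinary sample
```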

Beam Pattern Fingerprinting with Missing Features for Spoofing Attack Detection in Millimeter-Wave Networks
Ya Jiang1, Long Jiao1, Liang Zhao2, and Kai Zeng1
1 George Mason University
2 Emory University

As one of the key enabling technologies of 5G wireless communication, millimeter-wave (mmWave) technology unlocks the ultra-wide bandwidth opportunity to support high-throughput (e.g., multi-Gbps) and ultra-low-latency applications at a much lower cost per bit. However, due to the broadcast nature of the wireless medium, mmWave communication, like sub-6 GHz communication, is still subject to various attacks, such as identity spoofing attacks. Recently, beam pattern fingerprinting using the signal-to-noise ratio (SNR) traces obtained during the beam sweeping process has been proposed to detect spoofing attacks in mmWave networks. However, a complete beam sweep that tests all the tx-rx beam pairs is not always applied in practice: to save link initialization or maintenance overhead, efficient beam management schemes usually probe only a subset of tx-rx beam pairs. Therefore, SNR traces can randomly miss some features, which poses challenges for fingerprint training as well as online prediction. In this work, we build a machine learning model that achieves fast and highly accurate identity spoofing detection under missing features. Experimental results show that our proposed approach can achieve a detection accuracy of almost 100% with short detection delay under various feature-missing patterns. In addition, it improves the performance (i.e., accuracy, precision, recall, F1-score) of co-located attack detection by over 18% under missing features, compared to a model trained only on data with complete features.
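
One generic way to make a fingerprint classifier tolerate missing beam-pair SNRs is to train on randomly masked copies of the feature vectors. The sketch below uses that idea with synthetic data and a random forest; feature dimensions, masking rates, and the classifier choice are all assumptions, not the paper's model.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def mask_features(X, miss_prob=0.3, fill=0.0, rng=None):
    """Randomly zero out beam-pair SNR features to mimic partial beam sweeps."""
    rng = rng or np.random.default_rng()
    mask = rng.random(X.shape) < miss_prob
    X = X.copy()
    X[mask] = fill
    return X

# Hypothetical data: 64 tx-rx beam-pair SNRs per trace, binary spoofing label.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 64))
y_train = rng.integers(0, 2, size=1000)

# Train on several randomly masked copies so the model sees missing patterns.
X_aug = np.vstack([mask_features(X_train, p, rng=rng) for p in (0.0, 0.2, 0.4)])
y_aug = np.tile(y_train, 3)
clf = RandomForestClassifier(n_estimators=100).fit(X_aug, y_aug)
```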

Can You Hear It? Backdoor Attacks via Ultrasonic Triggers
Stefanos Koffas1, Jing Xu1, Mauro Conti2, and Stjepan Picek3,1
1 Delft University of Technology
2 University of Padua
3 Radboud University

This work explores backdoor attacks on automatic speech recognition systems in which we inject inaudible triggers. By doing so, we make the backdoor attack challenging to detect for legitimate users and, consequently, potentially more dangerous. We conduct experiments on two versions of a speech dataset and three neural networks, and explore the performance of our attack with respect to the duration, position, and type of the trigger.

Our results indicate that less than 1% of poisoned data is sufficient to deploy a backdoor attack and reach a 100% attack success rate. We observe that short, non-continuous triggers result in highly successful attacks. Still, since our trigger is inaudible, it can be arbitrarily long without raising suspicion, making the attack more effective. Finally, we conduct our attack on actual hardware and show that an adversary can manipulate inference in an Android application by playing the inaudible trigger over the air.
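
For illustration, a minimal sketch of embedding a near-ultrasonic trigger into a waveform before poisoning a training clip. The trigger frequency, amplitude, and placement below are assumptions, not the paper's exact settings.

```python
import numpy as np

def add_ultrasonic_trigger(audio, sr=44100, freq=21000, amp=0.05,
                           start=0.0, dur=0.5):
    """Superimpose a near-ultrasonic sine (inaudible to most listeners but
    present in the model's input features) onto a mono waveform in [-1, 1]."""
    audio = audio.astype(np.float32).copy()
    i0 = int(start * sr)
    n = min(int(dur * sr), len(audio) - i0)
    t = np.arange(n) / sr
    audio[i0:i0 + n] += amp * np.sin(2 * np.pi * freq * t).astype(np.float32)
    return np.clip(audio, -1.0, 1.0)

# Poison a small fraction (<1%) of training clips and relabel them to the
# attacker's target class before training the speech model.
```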

Automatic Machine Learning for Multi-Receiver CNN Technology Classifiers
Amir-Hossein Yazdani-Abyaneh1 and Marwan Krunz1
1 The University of Arizona

Convolutional Neural Networks (CNNs) are one of the most studied families of deep learning models for signal classification, including modulation, technology, detection, and identification. In this work, we focus on technology classification based on raw I/Q samples collected from multiple synchronized receivers. As an example use case, we study protocol identification of Wi-Fi, LTE-LAA, and 5G NR-U technologies that coexist over the 5 GHz Unlicensed National Information Infrastructure (U-NII) bands. Designing and training accurate CNN classifiers involves significant time and effort spent fine-tuning a model's architectural settings (e.g., the number of convolutional layers and their filter sizes) and determining appropriate hyperparameter configurations, such as the learning rate and batch size. We tackle the former by defining the architectural settings themselves as hyperparameters. We automatically optimize these architectural parameters, along with other preprocessing (e.g., the number of I/Q samples within each classifier input) and learning hyperparameters, by forming a Hyperparameter Optimization (HyperOpt) problem, which we solve in a near-optimal fashion using the Hyperband algorithm. The resulting near-optimal CNN (OCNN) classifier is then used to study classification accuracy for over-the-air (OTA) as well as simulated datasets, considering various SNR values. We show that using a larger number of receivers to construct multi-channel inputs for CNNs does not necessarily improve classification accuracy; instead, this number should be defined as a preprocessing hyperparameter to be optimized via Hyperband. OTA results reveal that our OCNN classifiers improve classification accuracy by 24.58% compared to manually tuned CNNs. We also study the effect of min-max normalization of the I/Q samples within each classifier input on generalization accuracy over simulated datasets with SNRs other than the training set's SNR, and show an average improvement of 108.05% when the I/Q samples are normalized.
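
As a sketch of the approach (not the paper's exact search space), architectural settings can be exposed as hyperparameters and searched with Hyperband, here via Keras Tuner. Input shape, class count, and all search ranges below are assumptions.

```python
import keras_tuner as kt
import tensorflow as tf

def build_model(hp):
    """Treat architectural settings (conv depth, filter count, kernel size)
    and the learning rate as one Hyperband search space."""
    inputs = tf.keras.Input(shape=(1024, 2))          # I/Q as two channels
    x = inputs
    for i in range(hp.Int("conv_layers", 1, 4)):
        x = tf.keras.layers.Conv1D(
            filters=hp.Choice(f"filters_{i}", [16, 32, 64]),
            kernel_size=hp.Choice(f"kernel_{i}", [3, 5, 7]),
            activation="relu")(x)
        x = tf.keras.layers.MaxPooling1D(2)(x)
    x = tf.keras.layers.GlobalAveragePooling1D()(x)
    outputs = tf.keras.layers.Dense(3, activation="softmax")(x)  # 3 technologies
    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(
            hp.Float("lr", 1e-4, 1e-2, sampling="log")),
        loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    return model

tuner = kt.Hyperband(build_model, objective="val_accuracy",
                     max_epochs=30, factor=3)
# tuner.search(x_train, y_train, validation_data=(x_val, y_val))
```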

Undermining Deep Learning Based Channel Estimation via Adversarial Wireless Signal Fabrication
Tao Hou1, Tao Wang2, Zhuo Lu1, Yao Liu1, and Yalin Sagduyu3
1 University of South Florida
2 New Mexico State University
3 Virginia Tech

Channel estimation is a crucial step in wireless communications. The estimator identifies the distortions a wireless channel imposes during signal propagation, and this information is then used for data precoding and decoding. Recent studies have shown that deep learning techniques can enhance the accuracy of conventional channel estimation algorithms. However, the reliability and security of these deep learning algorithms have not yet been well investigated in the context of wireless communications. Like other deep learning systems, channel estimation based on deep learning may be vulnerable to adversarial machine learning attacks. However, close examination shows that traditional adversarial learning mechanisms cannot simply be adapted to effectively manipulate channel estimation. In this paper, we propose a novel attack strategy that crafts a perturbation to fool the receiver into producing wrong channel estimation results. This attack is launched without knowledge of the current input signals and requires only a loose form of time synchronization. Through over-the-air experiments with software-defined radios in our multi-user MIMO testbed, we show that the proposed strategy can effectively degrade the performance of deep learning-based channel estimation. We also demonstrate that the proposed attack can hardly be detected, with a detection rate of 8% or lower.
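
To illustrate the input-agnostic flavor of such an attack (this is a generic universal-perturbation sketch, not the paper's algorithm), one can optimize a single perturbation that raises the estimator's error across a whole batch of received signals. The estimator model, tensor shapes, and power bound are hypothetical.

```python
import torch

def craft_universal_perturbation(estimator, pilot_batch, true_channels,
                                 eps=0.05, steps=200, lr=1e-2):
    """Optimize one perturbation that maximizes the estimator's channel MSE
    for any input in the batch, so the attacker needs no knowledge of the
    victim's current signal."""
    delta = torch.zeros_like(pilot_batch[0], requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        est = estimator(pilot_batch + delta)           # broadcast over batch
        loss = -torch.mean((est - true_channels) ** 2) # ascend estimation error
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():                          # keep attack power bounded
            delta.clamp_(-eps, eps)
    return delta.detach()
```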

KNEW: Key Generation using NEural Networks from Wireless Channels
Xue Wei1 and Dola Saha1
1 University at Albany, SUNY

Secret keys can be generated from reciprocal channels for use in shared secret key encryption. In practice, however, measurements of reciprocal channels become non-reciprocal due to changing channel conditions, hardware inaccuracies, and estimation errors, resulting in a low key generation rate (KGR) and a high key disagreement rate (KDR). To combat these practical issues, we propose KNEW (Key Generation using NEural Networks from Wireless Channels), which extracts the implicit features of the channel in a compressed form to derive keys with a high agreement rate. Two neural networks (NNs) are trained simultaneously to map each party's channel estimates to a different domain, the latent space, which remains inaccessible to adversaries. The model also minimizes the distance between the latent vectors generated by the trusted pair of nodes, thus reducing the KDR. Our simulation results demonstrate that the latent vectors of the legitimate parties are highly correlated, yielding a high KGR (≈64 bits per measurement) and a low KDR (<0.05 in most cases). Our experiments with over-the-air signals show that the model can adapt to realistic channels and hardware inaccuracies, yielding over 32 bits of key per channel estimation without any mismatch.
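
A minimal sketch of the twin-encoder idea: two networks embed each party's noisy estimate of the same reciprocal channel, trained to minimize latent distance, after which the latent vector is quantized to key bits. The architecture, dimensions, and sign-quantizer below are assumptions, not the paper's design.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps a channel-estimate vector to a compact latent vector."""
    def __init__(self, dim_in=128, dim_latent=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim_in, 256), nn.ReLU(),
                                 nn.Linear(256, dim_latent), nn.Tanh())
    def forward(self, h):
        return self.net(h)

enc_a, enc_b = Encoder(), Encoder()
opt = torch.optim.Adam(list(enc_a.parameters()) + list(enc_b.parameters()),
                       lr=1e-3)

def train_step(h_alice, h_bob):
    """h_alice/h_bob: noisy estimates of the same reciprocal channel.
    Minimizing latent distance drives the key disagreement rate down."""
    za, zb = enc_a(h_alice), enc_b(h_bob)
    loss = torch.mean((za - zb) ** 2)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

def latent_to_key(z):
    """Sign-quantize the latent vector into key bits (one simple choice)."""
    return (z > 0).int()
```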

A Machine Learning-Driven Analysis of Phantom E911 Calls
Batoul Taki1, Yang Hu1, Waheed Bajwa1, Manoop Talasila2, Mukesh Mantan2, and Anwar Aftab2
1 Inspire Lab, Rutgers University
2 AT&T Research Lab

Phantom Enhanced 911 (E911) calls, automatically generated two-second calls, are a serious concern on cellular networks because they consume critical resources. As networks become increasingly complex, detecting and troubleshooting the causes of phantom E911 calls is becoming increasingly difficult. In this paper, machine learning (ML) tools are used to analyze anonymized call detail record data collected by a major US telecom network service provider. The data is carefully pre-processed and encoded using an efficient encoding method. The classification algorithms k-nearest neighbors (KNN) and decision trees (DTs) are then applied to study correlations between device- and network-level features and a mobile device's propensity to initiate phantom calls. Based on the results, this work also suggests policy changes that may help network operators decrease the high volume of phantom E911 calls or alleviate their pressure on a cellular network.
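
For illustration, a minimal scikit-learn sketch of the classification step. The file name, feature columns, and label are hypothetical stand-ins for the encoded call-detail-record data described above.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Hypothetical numerically encoded call-detail records; names illustrative.
df = pd.read_csv("e911_cdr_encoded.csv")
X = df[["device_model", "os_version", "rat_type", "cell_id", "call_hour"]]
y = df["is_phantom"]          # 1 if the record is a phantom two-second call

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
dt = DecisionTreeClassifier(max_depth=8).fit(X_tr, y_tr)
print("KNN accuracy:", knn.score(X_te, y_te))
print("DT accuracy: ", dt.score(X_te, y_te))

# Tree feature importances hint at which device/network features correlate
# most with phantom-call initiation.
print(dict(zip(X.columns, dt.feature_importances_)))
```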

Deep Learning for Spectrum Awareness and Covert Communications via Unintended RF Emanations
Michael Hegarty1, Yalin E. Sagduyu2, Tugba Erpek2, and Yi Shi2
1 Intelligent Automation, A BlueHalo Company
2 Virginia Tech

We present a deep learning-based spectrum sensing and covert communication framework for unintended (side-channel) electromagnetic emanations. Electronic devices release unintentional RF emissions (without using any RF transmitter) depending on their processing activities, and these emissions can be captured at the clock frequency and its harmonics. First, we train a deep neural network, in particular a convolutional neural network (CNN), to detect RF emanations from a microcontroller (specifically, an Arduino Uno R3). Through over-the-air (OTA) experiments, we show that the CNN, trained on signals received at a software-defined radio (SDR), can reliably detect RF emanations at a range of up to 10 feet, while the performance of a conventional energy detector remains limited. Second, we demonstrate how to encode RF emanations from different programs running on a microcontroller and generate frequency shift keying (FSK)-modulated signals without using any RF transmitter. The conventional scheme of quadrature demodulation can decode signals communicated at a range of up to 9 inches. In contrast, we show that a CNN trained on RF emanation data collected with an SDR can decode the signals encoded over RF emanations with higher reliability and extend the communication range to up to 4 feet. These capabilities are promising for emerging systems such as the Internet of Things (IoT), with novel applications including covert communications, low-power device monitoring and sensing, and energy-efficient communications.
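
For reference, a minimal sketch of the conventional quadrature-demodulation baseline mentioned above: the phase difference between consecutive complex samples estimates instantaneous frequency, and its sign per symbol period selects the FSK bit. Symbol rate and tone offsets below are illustrative.

```python
import numpy as np

def fsk_quadrature_demod(iq, sps=100):
    """Classic quadrature demodulation of binary FSK: the mean sign of the
    per-sample frequency estimate over each symbol period selects the bit.
    sps = samples per symbol."""
    inst_freq = np.angle(iq[1:] * np.conj(iq[:-1]))
    inst_freq = np.concatenate(([inst_freq[0]], inst_freq))  # keep length
    n_sym = len(inst_freq) // sps
    sym = inst_freq[:n_sym * sps].reshape(n_sym, sps).mean(axis=1)
    return (sym > 0).astype(int)

# Example: mark tone above DC, space tone below, 100 samples per symbol.
sps, bits = 100, np.array([1, 0, 1, 1, 0])
t = np.arange(sps) / sps
tx = np.concatenate([np.exp(2j * np.pi * (5 if b else -5) * t) for b in bits])
assert np.array_equal(fsk_quadrature_demod(tx, sps), bits)
```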

Voice Fingerprinting for Indoor Localization with a Single Microphone Array and Deep Learning
Shivenkumar Parmar1, Xuyu Wang1, Chao Yang2, and Shiwen Mao2
1 California State University, Sacramento
2 Auburn University

With the fast development of the Internet of Things (IoT), smart speakers for voice assistance have become increasingly important in smart homes, offering a new type of human-machine interaction interface. Voice localization with microphone arrays can improve a smart speaker's performance and enable many new IoT applications. To address the challenges of complex indoor environments, such as non-line-of-sight (NLOS) and multipath propagation, we propose voice fingerprinting for indoor localization using a single microphone array. The proposed system consists of a ReSpeaker 6-mic circular array kit connected to a Raspberry Pi and a deep learning model, and operates in offline training and online test stages. In the offline stage, the models are trained on spectrogram images obtained from audio data using the short-time Fourier transform (STFT); transfer learning is used to speed up the training process. In the online stage, a top-K probabilistic method is used for location estimation. Our experimental results demonstrate that the Inception-ResNet-v2 model achieves satisfactory localization performance with small location errors in two typical home environments.
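
For illustration, a minimal sketch of the pipeline under stated assumptions: STFT magnitude spectrograms fed to an ImageNet-pretrained InceptionResNetV2 via transfer learning, with a probability-weighted top-K location estimate online. The sample rate, image size, location count, and weighting rule are assumptions, not the paper's settings.

```python
import numpy as np
import tensorflow as tf
from scipy.signal import stft

def audio_to_spectrogram(audio, sr=16000, nperseg=256, size=(299, 299)):
    """Log-magnitude STFT resized to the network's input resolution."""
    _, _, Z = stft(audio, fs=sr, nperseg=nperseg)
    img = np.log1p(np.abs(Z))
    img = np.stack([img] * 3, axis=-1)                 # grayscale -> 3 channels
    return tf.image.resize(img, size).numpy()

# Transfer learning: freeze ImageNet features, train a small location head.
base = tf.keras.applications.InceptionResNetV2(
    include_top=False, weights="imagenet", input_shape=(299, 299, 3))
base.trainable = False
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(16, activation="softmax"),   # 16 candidate locations
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

def top_k_location(probs, coords, k=3):
    """Online stage: probability-weighted average of the top-k locations."""
    idx = np.argsort(probs)[-k:]
    w = probs[idx] / probs[idx].sum()
    return (coords[idx] * w[:, None]).sum(axis=0)
```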

Deep Learning-based Localization in Limited Data Regimes
Frost Mitchell1, Aniqua Baset1, Neal Patwari2, Sneha Kumar Kasera1, and Aditya Bhaskara1
1 University of Utah
2 Washington University in St. Louis

As demand for radio spectrum increases with the widespread use of wireless devices, effective spectrum allocation requires more flexibility in terms of time, space, and frequency. To protect users in next-generation wireless networks from interference, spectrum managers must be able to efficiently and accurately locate transmitters. We present TL;DL, a practical deep learning-based technique for multiple-transmitter localization on crowdsourced data, where all sensors and transmitters may be mobile and may transmit with unknown power. We map sensor readings to an image representing the sensor locations, then use a convolutional neural network to generate a target image of transmitter locations. We also introduce a novel data-augmentation technique that drastically improves generalization and enables accurate localization with limited data. In our evaluation, TL;DL outperforms previous approaches on small real-world datasets with low sensor density, in terms of both localization accuracy and transmitter detection.
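
For illustration, a minimal sketch of the image-mapping step: sensor readings are rasterized onto an input grid and transmitter locations onto a target grid for the CNN. Grid size, spatial extent, and the two-channel encoding are assumptions, not the paper's exact representation.

```python
import numpy as np

def readings_to_image(sensor_xy, rssi, grid=64, extent=1000.0):
    """Rasterize (x, y, RSSI) readings onto a grid: one channel for measured
    power, one binary channel marking where sensors actually are, so the CNN
    can tell 'no sensor here' apart from 'low reading here'."""
    img = np.zeros((grid, grid, 2), dtype=np.float32)
    for (x, y), p in zip(sensor_xy, rssi):
        i = min(int(y / extent * grid), grid - 1)
        j = min(int(x / extent * grid), grid - 1)
        img[i, j, 0] = p                                # normalized RSSI
        img[i, j, 1] = 1.0                              # sensor-presence mask
    return img

def targets_to_image(tx_xy, grid=64, extent=1000.0):
    """Target image: a peak at each transmitter location for the CNN to learn."""
    img = np.zeros((grid, grid), dtype=np.float32)
    for x, y in tx_xy:
        img[min(int(y / extent * grid), grid - 1),
            min(int(x / extent * grid), grid - 1)] = 1.0
    return img
```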

MR-iNet Gym: Framework for Edge Deployment of Deep Reinforcement Learning on Embedded Software Defined Radio
Jithin Jagannath1, Kian Hamedani1, Collin Farquhar1, Keyvan Ramezanpour1, and Anu Jagannath1
1 Marconi-Rosenblatt AI/ML Innovation Lab, ANDRO Computational Solutions LLC

Dynamic resource allocation plays a critical role in the next generation of intelligent wireless communication systems, and machine learning has been leveraged as a powerful tool to make strides in this domain. In most cases, however, the progress has been limited to simulations due to the challenging nature of deploying these solutions on hardware. In this paper, for the first time, we design and deploy deep reinforcement learning (DRL)-based power control agents on GPU-embedded software-defined radios (SDRs). To this end, we propose an end-to-end framework (MR-iNet Gym) in which the simulation suite and the embedded SDR development work cohesively to overcome real-world implementation hurdles. To prove feasibility, we consider the problem of distributed power control for direct-sequence code-division multiple access (DS-CDMA)-based LPI/D transceivers. We first build a DS-CDMA ns-3 module that interacts with the OpenAI Gym environment. We then train the power control DRL agents in this ns3-gym simulation environment in a scenario that replicates our hardware testbed. Next, for edge (embedded on-device) deployment, the trained models are optimized for real-time operation without loss of performance. Hardware-based evaluation verifies the efficiency of the DRL agents over the traditional distributed constrained power control (DCPC) algorithm. Most significantly, this is the first work to establish the feasibility of deploying DRL for optimized distributed resource allocation on the next generation of GPU-embedded radios.
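
For illustration, a minimal sketch of the Gym-style interaction loop over an ns-3 backed environment, assuming the ns3-gym bridge is installed, the ns-3 script is already running on the default port, and the action space is a discrete set of power levels. The placeholder agent below is a stand-in for a real DRL agent (e.g., DQN), not the paper's implementation.

```python
import numpy as np
from ns3gym import ns3env   # assumes the ns3-gym bridge is installed

# Connect to the running ns-3 simulation of the DS-CDMA scenario
# (5555 is the ns3-gym default port).
env = ns3env.Ns3Env(port=5555)
obs = env.reset()
n_actions = env.action_space.n      # assumed discrete power levels

def choose_action(obs, eps=0.1):
    """Epsilon-greedy placeholder; a trained agent would return the argmax
    over learned Q-values instead of a fixed action."""
    if np.random.random() < eps:
        return np.random.randint(n_actions)
    return 0

for step in range(1000):
    action = choose_action(obs)
    obs, reward, done, info = env.step(action)   # ns-3 advances the scenario
    if done:
        obs = env.reset()
env.close()
```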