A Scientific Integrity Framework for Open-Set IoT Intrusion Detection with Device-Disjoint Splits

Chikezie, Chekwas; Usman, Abraham Usman; David, Michael; Zubair, Sulieman; Ohize, Henry Ohiani; Ojeniyi, Joseph

Please use this identifier to cite or link to this item: http://irepo.futminna.edu.ng:8080/jspui/handle/123456789/31544

Title:	A Scientific Integrity Framework for Open-Set IoT Intrusion Detection with Device-Disjoint Splits
Authors:	Chikezie, Chekwas; Usman, Abraham Usman; David, Michael; Zubair, Sulieman; Ohize, Henry Ohiani; Ojeniyi, Joseph
Keywords:	Internet of Things security; information forensics; open-set recognition; dataset preprocessing; leakage control; reproducible machine learning
Issue Date:	27-May-2026
Publisher:	MDPI - FUTURE INTERNET
Abstract:	Machine-learning-based intrusion detection for Internet of Things systems has often been evaluated through model-centered pipelines that use weakly governed partitioning, limited leakage auditing, and closed-set assumptions. Consequently, reported performance could reflect data-handling artifacts rather than reliable security intelligence. This paper introduces a scientific integrity framework that treats preprocessing as a primary research object for open-set Internet of Things intrusion detection. The framework integrated device-disjoint split governance, feasibility-aware zero-day isolation, quantified leakage control, train-only preprocessing, shared-safe feature selection, diagnostic-harness verification, baseline split comparison, and auditable artifact generation. Applied to the CICIoT-DIAD 2024 corpus with Institute of Electrical and Electronics Engineers Organizationally Unique Identifier-based vendor enrichment, the protocol locked 28 canonical classes, eight semantic attack families, and five policy labels before constructing a device-disjoint, vendor-aware grouped split. When strict device-level zero-day holdout was infeasible, the framework activated an audited row-level fallback that preserved contamination-free holdout isolation without claiming strict device-novel zero-day evaluation. On 35,672,407 flows from 180 files, the accepted run achieved zero device overlap, zero flow-signature Jaccard leakage risk, 100 percent zero-day purity, a Feature Distribution Stability Score of 0.00518, a Device-Feature Dependency Index of 0.00000, an Attack Invariance Score of 0.92964, and an Attack Semantic Consistency Score of 0.90714. The diagnostic harness produced zero hard failures and zero warnings, while baseline comparison showed stronger preprocessing integrity than random stratified and simple device-disjoint splitting. This study did not claim downstream classifier superiority; rather, it established an auditable preprocessing substrate for later classifier-level experiments.
URI:	http://irepo.futminna.edu.ng:8080/jspui/handle/123456789/31544
Appears in Collections:	Telecommunication Engineering
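
The abstract reports zero device overlap and zero flow-signature Jaccard leakage risk for a device-disjoint split. The sketch below illustrates what those two checks could look like on toy data. It is not the authors' implementation: the record fields (`device`, `signature`), the helper names, and the use of a plain shuffled device list (rather than the paper's vendor-aware grouping) are all assumptions made for illustration.

```python
import random


def device_disjoint_split(rows, test_fraction=0.25, seed=0):
    """Assign every row of a given device to exactly one side of the split.

    `rows` is a list of dicts with a hypothetical "device" key.
    """
    devices = sorted({r["device"] for r in rows})
    rng = random.Random(seed)
    rng.shuffle(devices)
    n_test = max(1, round(len(devices) * test_fraction))
    test_devices = set(devices[:n_test])
    train = [r for r in rows if r["device"] not in test_devices]
    test = [r for r in rows if r["device"] in test_devices]
    return train, test


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B|; defined as 0.0 for two empty sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy flows; "signature" stands in for a hypothetical hash of flow features.
rows = [
    {"device": "cam-1", "signature": "a1"},
    {"device": "cam-1", "signature": "a2"},
    {"device": "plug-7", "signature": "b1"},
    {"device": "hub-3", "signature": "c1"},
    {"device": "lock-9", "signature": "d1"},
]

train, test = device_disjoint_split(rows, test_fraction=0.25, seed=0)
device_overlap = {r["device"] for r in train} & {r["device"] for r in test}
leakage = jaccard(
    (r["signature"] for r in train),
    (r["signature"] for r in test),
)
print("device overlap:", len(device_overlap))
print("flow-signature Jaccard:", leakage)
```

Splitting on device identifiers rather than on rows is what guarantees the zero device overlap. The Jaccard check is a separate audit, because two different devices can still emit identical flow signatures.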

Files in This Item:

File	Description	Size	Format
futureinternet-18-00287.pdf		2.13 MB	Adobe PDF