Da Luo, Paweł Korus, Jiwu Huang
IEEE Transactions on Information Forensics and Security, Vol. 13, Issue 9, 2018
Digital audio recordings are one of the key types of evidence used in law enforcement proceedings. As a result, the development of reliable techniques for forensic analysis of such recordings is of primary importance. One of the main problems in forensic analysis is source attribution, i.e., verifying whether a certain recording was acquired with a given device. While this problem has been widely studied for other types of multimedia signals, there are very few techniques for audio recordings. Moreover, reported evaluation results were obtained on very small datasets, on the order of a dozen devices. This paper proposes a new feature set, the band energy difference (BED) descriptor, for source attribution of digital speech recordings. We demonstrate that a frequency response curve extracted from sample recordings can serve as a robust fingerprint that carries significant discriminative power and can characterize the recording device. We study two sub-problems of source attribution: (a) identification of a recording device among a list of possible candidates (device identification); and (b) confirmation that a suspected device has indeed been used to acquire the recording in question (device verification). For our evaluation, we prepared two novel datasets: a controlled-conditions dataset with 31 devices and an uncontrolled-conditions dataset with 141 devices. Our experimental evaluation demonstrates that the proposed BED descriptor is effective for both device identification and verification. In the former task, we reached an accuracy of over 96%. In the latter, we obtained a true positive rate of 89% at a fixed false positive rate of 1%.
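The verification operating point reported above (a true positive rate measured at a fixed 1% false positive rate) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the score convention (a higher score means the recording is more likely to come from the suspected device), and the synthetic scores are all assumptions made for illustration.

```python
import math


def threshold_at_fpr(impostor_scores, target_fpr):
    """Pick a threshold so that at most target_fpr of impostor scores exceed it.

    Assumed decision rule (hypothetical): accept when score > threshold.
    """
    ranked = sorted(impostor_scores, reverse=True)
    # Number of impostor scores that may lie strictly above the threshold.
    allowed = math.floor(target_fpr * len(ranked))
    if allowed >= len(ranked):
        return -math.inf  # every impostor score may be accepted
    return ranked[allowed]


def acceptance_rate(scores, threshold):
    """Fraction of scores accepted under the rule score > threshold."""
    return sum(s > threshold for s in scores) / len(scores)


# Synthetic scores, invented purely for illustration.
impostor = [i / 100 for i in range(100)]   # recordings from other devices
genuine = [0.5, 0.985, 0.99, 1.2]          # recordings from the suspected device

thr = threshold_at_fpr(impostor, target_fpr=0.01)
fpr = acceptance_rate(impostor, thr)       # false positive rate at thr
tpr = acceptance_rate(genuine, thr)        # true positive rate at thr
```

The threshold is set from impostor scores only, so the false positive rate is held fixed first and the true positive rate is whatever the genuine scores then yield at that threshold.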