Paweł Korus and Jiwu Huang
IEEE Signal Processing Letters, Vol. 23, Issue 1, 2016
2016-spl-merw.pdf (0 MB)
https://github.com/pkorus/merw
In this paper we propose to use a maximal entropy random walk on a graph for tampering localization in digital image forensics. Our approach serves as an additional post-processing step after conventional sliding-window analysis with a forensic detector. The strong localization property of this random walk highlights important regions and attenuates the background, even for noisy response maps. Our evaluation shows that the proposed method can significantly outperform both the commonly used threshold-based decision and the recently proposed optimization-based approach with a Markovian prior.
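To illustrate the localization property mentioned in the abstract, here is a minimal sketch of a generic maximal entropy random walk (MERW) on a small undirected graph. It is not the authors' implementation and does not reproduce how the paper builds its graph from a response map. It relies only on the standard result that, for a connected graph with symmetric non-negative weights, the MERW stationary distribution is the squared principal eigenvector of the weight matrix. The function names, the power-iteration solver, and the 5-node path graph are illustrative assumptions.

```python
from typing import List


def merw_stationary(weights: List[List[float]], iterations: int = 500) -> List[float]:
    """Stationary distribution of a MERW on a connected, symmetric, non-negative graph.

    Uses power iteration to approximate the principal eigenvector psi of the
    weight matrix; the stationary probability of node i is psi[i] ** 2.
    """
    n = len(weights)
    psi = [1.0] * n
    for _ in range(iterations):
        # Multiply by (W + I). The identity shift keeps the eigenvectors but
        # avoids the oscillation plain power iteration shows on bipartite graphs.
        nxt = [psi[i] + sum(weights[i][j] * psi[j] for j in range(n)) for i in range(n)]
        norm = sum(v * v for v in nxt) ** 0.5
        psi = [v / norm for v in nxt]
    # psi has unit Euclidean norm, so the squared entries already sum to 1.
    return [v * v for v in psi]


def grw_stationary(weights: List[List[float]]) -> List[float]:
    """Stationary distribution of the generic random walk: proportional to node degree."""
    degrees = [sum(row) for row in weights]
    total = sum(degrees)
    return [d / total for d in degrees]


# Hypothetical toy graph: a path of 5 nodes with unit edge weights.
path = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(5)] for i in range(5)]
merw = merw_stationary(path)
grw = grw_stationary(path)
```

On this toy graph the MERW places more probability on the central node than the generic random walk does, which is the kind of concentration the abstract refers to.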
Copyright © 2015 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to pubs-permissions@ieee.org.
Supplementary materials for this paper include:
The set contains a total of 2,000 response maps: 1,000 maps for each of the two analysis windows (64 px and 128 px). The naming convention is as follows (printf convention):
p%d - tampering pattern number (shown below)
_%d - tampered image number
s or d - denotes whether double compression artifacts are present in the spliced area (d) or in the background (s)
For example, p4_1970s corresponds to a forgery performed according to pattern 4 (triangle), with double JPEG compression of the background, and single JPEG compression inside the spliced shape.
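The naming convention above can be decoded with a short helper. This is a sketch under assumptions: the names `MapName` and `parse_map_name` are hypothetical, and the input is taken to be the bare name without any directory or file extension.

```python
import re
from typing import NamedTuple


class MapName(NamedTuple):
    pattern: int                # tampering pattern number
    image: int                  # tampered image number
    double_compressed: str      # region with double compression artifacts


# p<pattern>_<image><s|d>, e.g. "p4_1970s"
_NAME_RE = re.compile(r"^p(\d+)_(\d+)([sd])$")


def parse_map_name(name: str) -> MapName:
    """Split a response map name into its three components."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected response map name: {name!r}")
    pattern, image, flag = match.groups()
    # "d": double compression in the spliced area; "s": in the background.
    region = "spliced area" if flag == "d" else "background"
    return MapName(int(pattern), int(image), region)
```

For the example above, `parse_map_name("p4_1970s")` yields pattern 4, image 1970, and double compression in the background.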
The detector's response maps are 64 x 64 arrays. They are provided in two formats: YAML and PNG.
The YAML files contain the following fields:
map_count - number of response maps (always 2)
ground_truth - the ground truth tampering map
candidate_1 - response map for the 64 px window
candidate_2 - response map for the 128 px window
The PNG files are organized into separate directories for ground truth, 64 px response maps, and 128 px response maps. Although there should not be significant differences between the results obtained from the two formats, we did not verify this thoroughly. The results presented in the paper were obtained from the real-valued maps.
The data set was obtained by sliding-window analysis of synthetic JPEG splicing forgeries with an SVM-based detector (mode-based first digit features). The forgeries involved replacing a fragment of the image (according to a randomly placed, predefined pattern) with the same content but a different compression history. The first compression level Q1 was chosen randomly from {50, 51, ..., 99} and the second, Q2, from {Q1 + 1, ..., 100}. Hence, the data set contains response maps with highly varied reliability (which deteriorates for high compression rates and when Q2 approaches Q1).
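The selection of the two compression levels can be sketched as follows. The text only says the levels were chosen randomly; uniform sampling and the helper name `sample_quality_levels` are assumptions made for illustration.

```python
import random
from typing import Tuple


def sample_quality_levels(rng: random.Random) -> Tuple[int, int]:
    """Draw (Q1, Q2) with Q1 in {50, ..., 99} and Q2 in {Q1 + 1, ..., 100}."""
    q1 = rng.randint(50, 99)        # randint includes both endpoints
    q2 = rng.randint(q1 + 1, 100)   # Q2 is always strictly greater than Q1
    return q1, q2


rng = random.Random(0)              # fixed seed only to make the sketch repeatable
pairs = [sample_quality_levels(rng) for _ in range(1000)]
```

Pairs with a small gap between Q2 and Q1 correspond to the less reliable response maps described above.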
To facilitate research reproducibility, we provide a reference implementation of the proposed method (C++) via github.com. The software is provided as is, without warranties of any kind, and can be used free of charge for educational and research purposes only. For more information, see the included license.txt file.