FRE: A Fast Method For Anomaly Detection And Segmentation

Ibrahima J. Ndiour, Nilesh A. Ahuja, Utku Genc and Omesh Tickoo
Intel Labs

Introduction

We present a fast and principled approach to address the challenge of visual anomaly detection and segmentation. Our method operates under the assumption of having access solely to anomaly-free training data while aiming to identify anomalies of an arbitrary nature on test data. We build upon prior research and present a generalized approach that utilizes a shallow linear autoencoder to perform out-of-distribution detection on the intermediate features generated by a pre-trained deep neural network. More specifically, we compute the feature reconstruction error (FRE) and establish it as a principled measure of uncertainty. We rigorously connect our technique to the theory of linear auto-associative networks to provide a solid theoretical foundation and to offer multiple practical implementation strategies. Furthermore, extending the FRE concept to convolutional layers, we derive FRE maps that provide precise pixel-level spatial localization of the anomalies within images, effectively achieving segmentation. Extensive experimentation demonstrates that our method outperforms the current state of the art. It excels in speed and robustness, and is remarkably insensitive to parameterization. We make our code available on GitHub.

Proposed Approach

Pipeline
Block diagram of our proposed approach.

During training: we pass the training data through the backbone network in a single forward pass and collect the intermediate (training) features at a given layer k. We then train a (tied) shallow linear autoencoder on these features, as sketched below.
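As a rough illustration of this training step (not the authors' released implementation), the sketch below fits a tied shallow linear autoencoder in PyTorch on pre-extracted layer-k features. The class name `TiedLinearAE`, the latent dimension, and the training hyper-parameters are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TiedLinearAE(nn.Module):
    """Shallow linear autoencoder whose decoder reuses the encoder weights (tied)."""
    def __init__(self, in_dim: int, latent_dim: int):
        super().__init__()
        self.W = nn.Parameter(torch.empty(latent_dim, in_dim))
        nn.init.orthogonal_(self.W)

    def forward(self, u: torch.Tensor) -> torch.Tensor:
        z = u @ self.W.t()   # encode: project features to the latent space
        return z @ self.W    # decode: reconstruct with the transposed (tied) weights

def train_fre_model(feats: torch.Tensor, latent_dim: int = 128,
                    epochs: int = 50, lr: float = 1e-3) -> TiedLinearAE:
    """feats: (N, d) layer-k features collected in one forward pass of the
    frozen backbone over the anomaly-free training set."""
    model = TiedLinearAE(feats.shape[1], latent_dim)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(feats), feats)  # reconstruction loss
        loss.backward()
        opt.step()
    return model
```

Because the network is linear with tied weights, the same subspace can also be obtained in closed form (e.g., via PCA), which is one way to realize the multiple practical implementation strategies mentioned in the introduction.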

During inference: the autoencoder is applied to a test feature to obtain a reduced-dimension embedding and its reconstruction back into the original feature space. The feature reconstruction error (FRE) is then computed and used as an uncertainty score:

$$\begin{equation} \mathbf{e} \triangleq \mathbf{u}-\mathbf{\hat{u}} = \mathbf{u}-(\mathcal{T}^{inv} \circ \mathcal{T})(\mathbf{u}), \qquad FRE(\mathbf{u}, \mathcal{T}) \triangleq \|\mathbf{e}\|_2^2. \end{equation}$$

This detection score is highly effective at discriminating between normal and anomalous samples.
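A minimal sketch of the corresponding scoring step, assuming the trained model above and a batch of test features `u` of shape (N, d); the function name `fre_score` is ours, not from the paper:

```python
import torch

@torch.no_grad()
def fre_score(model: torch.nn.Module, u: torch.Tensor) -> torch.Tensor:
    """Squared L2 norm of the feature reconstruction error, one score per sample."""
    e = u - model(u)                       # e = u - (T_inv ∘ T)(u)
    return e.pow(2).flatten(1).sum(dim=1)  # higher score => more anomalous
```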

To derive the segmentation map, we re-arrange the reconstruction error e at a convolutional layer k into a tensor of size C_k × H_k × W_k and average it along the channel dimension. This produces a single-channel FRE anomaly map M_k, where higher-intensity regions correspond to anomalies:

$$\begin{equation*} \mathbf{M}_k(i,j) = \frac{1}{C_k}\sum_{c=1}^{C_k} \mathbf{e}(c,i,j), \quad i\in \{1, \dots, H_k\},\ j\in \{1, \dots, W_k\}. \end{equation*}$$
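A minimal sketch of this map computation, assuming the error has already been re-arranged into a tensor `e` of shape (N, C_k, H_k, W_k); the bilinear upsampling to the input-image resolution is our assumption for pixel-level evaluation and is not prescribed by the equation above:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def fre_anomaly_map(e: torch.Tensor, image_size: tuple) -> torch.Tensor:
    """Channel-wise average of the FRE error tensor -> single-channel anomaly map.

    e: (N, C_k, H_k, W_k) reconstruction error re-arranged as a tensor.
    Returns an (N, H, W) map at the requested image resolution.
    """
    m = e.mean(dim=1, keepdim=True)  # average the errors along the channel dimension
    # (A common variant accumulates squared errors instead, so signed components
    #  do not cancel; that choice is an assumption, not stated in the text above.)
    m = F.interpolate(m, size=image_size, mode="bilinear", align_corners=False)
    return m.squeeze(1)
```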

Results

Detection results

Detection table 1
MVTec dataset (AUROC metric ↑).
Detection table 2
Magnetic Tile dataset (AUROC metric ↑).
Detection table 3
MVTec dataset for FRE across backbones (AUROC metric ↑).

Segmentation results

Segmentation table 1
Pixel-AUROC and PRO metrics on MVTec dataset.
Segmentation results
From left to right, each set of four images comprises: original image, ground truth segmentation mask, anomaly heatmap using FRE (our method) from a single layer, anomaly heatmap using FRE from three layers.
Segmentation table 2
Pixel-wise AUROC and PRO metrics on MVTec for FRE across backbones.