What is the model looking at?
Grad-CAM (Selvaraju et al. 2017) weights the feature maps of the last convolutional layer by the gradient of the class score and upsamples the resulting heatmap onto the image. We compute it for the predicted class on a balanced sample of the test set: 30 images per subgroup.
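As a rough illustration of the Grad-CAM step, the sketch below computes the heatmap from activations and gradients that have already been captured at the last convolutional layer. It is not the code in src/gradcam_analysis.py; the function name `grad_cam` and the `(C, H, W)` array layout are assumptions made for this example, and upsampling to image resolution is left out.

```python
import numpy as np


def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Grad-CAM heatmap for one image (hypothetical helper, not the repo's API).

    activations: (C, H, W) feature maps of the last conv layer.
    gradients:   (C, H, W) gradient of the predicted-class score
                 with respect to those feature maps.
    """
    # Channel weights: global-average-pool the gradients over space.
    weights = gradients.mean(axis=(1, 2))              # shape (C,)
    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(weights, activations, axes=1)   # shape (H, W)
    # Keep only evidence that raises the class score.
    cam = np.maximum(cam, 0.0)
    # Scale to [0, 1]; an all-zero map is returned unchanged.
    peak = cam.max()
    return cam / peak if peak > 0 else cam
```

The returned map has the spatial size of the feature maps, so it would still need to be resized to the input image before being drawn as an overlay.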
Per-subgroup attention bias
| Subgroup | Foreground share | Background share (bias) | Accuracy |
|---|---|---|---|
| Landbird on land (majority) | 51.6% | 48.4% | 93.3% |
| Landbird on water (conflict) | 59.3% | 40.7% | 60.0% |
| Waterbird on land (conflict) | 59.3% | 40.7% | 70.0% |
| Waterbird on water (majority) | 65.2% | 34.8% | 96.7% |
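Depending on the subgroup, roughly 35–48% of the model's Grad-CAM attention falls outside the center crop that stands in for the bird. The landbird-on-land majority group carries the highest background share (48.4%) while still reaching 93.3% accuracy, which is consistent with the background shortcut the brief asks us to detect. Waterbird-on-water, by contrast, has the lowest background share (34.8%), so the pattern does not hold for both majority groups.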
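The foreground and background shares in the table can be reproduced in spirit with a few lines of NumPy. The sketch below is an assumption-laden illustration rather than the project's implementation: the name `center_crop_shares` is hypothetical, the heatmap is assumed to be non-negative, and "60% center crop" is read as 60% of each side length (it could also mean 60% of the area).

```python
import numpy as np


def center_crop_shares(cam: np.ndarray, crop_frac: float = 0.6) -> tuple[float, float]:
    """Split saliency mass into (foreground, background) shares.

    cam:       (H, W) non-negative Grad-CAM heatmap.
    crop_frac: side-length fraction of the centered box treated as
               foreground (assumed reading of the 60% heuristic).
    """
    h, w = cam.shape
    crop_h, crop_w = round(h * crop_frac), round(w * crop_frac)
    top, left = (h - crop_h) // 2, (w - crop_w) // 2

    total = float(cam.sum())
    if total == 0.0:
        # No saliency at all: report zero for both shares.
        return 0.0, 0.0

    inside = float(cam[top:top + crop_h, left:left + crop_w].sum())
    foreground = inside / total
    return foreground, 1.0 - foreground


# Per-subgroup figure: average the background share over that subgroup's heatmaps.
def mean_background_share(cams: list[np.ndarray]) -> float:
    return float(np.mean([center_crop_shares(c)[1] for c in cams]))
```

Under this side-length reading, a perfectly uniform heatmap would put 36% of its mass in the foreground box, which gives a rough reference point for interpreting the shares.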
Visual gallery
Each image is a three-panel plate produced by src/gradcam_analysis.py: the original image, the Grad-CAM overlay for the predicted class, and a white box marking the 60% center crop used as the foreground heuristic. Shortcut failures are highlighted in red.

Majority case: waterbird on water. Saliency overlaps the bird, and the background agrees with the label.

Majority case — model is confident and attention spans bird + water.

Another majority example, same pattern.

Conflict group success: a waterbird on land where the model still found the bird.

Shortcut failure: a waterbird placed on land. The model misclassifies it as landbird — saliency leaks onto the land background.

Another waterbird-on-land failure — strong evidence the background is steering the prediction.

Easy majority case for landbirds.

Landbird on land — accurate prediction.

Landbird majority case.

Conflict success: landbird placed on water but still classified correctly.

Shortcut failure: landbird on water becomes 'waterbird'. Saliency drifts toward the water background.

Another landbird-on-water failure — the background pattern dominates.
Caveat: the heuristic is coarse
Birds are not always centered, so the 60% center crop is only a coarse foreground proxy. We return to this in Limitations. What matters here is the relative change in background attention across subgroups and across interventions.