We generated stimuli whose activations within an artificial neural network match those of a natural stimulus. Inspired by previous work in human color perception and visual crowding, we call these stimuli “Model Metamers.”
🧵3/N
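For concreteness, here is a minimal sketch of how such a metamer can be synthesized in PyTorch, assuming a hypothetical `get_stage_activations(model, x)` helper that returns activations at the stage of interest (an illustration, not the paper's released code):

```python
import torch

def generate_metamer(model, get_stage_activations, reference, n_steps=1000, lr=0.01):
    """Synthesize an input whose stage activations match those of `reference`."""
    model.eval()
    with torch.no_grad():
        target = get_stage_activations(model, reference)       # activations to reproduce
    metamer = torch.rand_like(reference, requires_grad=True)   # start from noise
    optimizer = torch.optim.Adam([metamer], lr=lr)
    for _ in range(n_steps):
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(
            get_stage_activations(model, metamer), target)     # drive activations together
        loss.backward()
        optimizer.step()
    return metamer.detach()
```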
Invariances can be described in terms of sets in the stimulus space. For a given reference stimulus, a set of stimuli will evoke the same classification judgment as the reference. A subset of these stimuli (metamers) produce the same activations as the reference.
🧵4/N
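One compact way to write the nesting (our notation, not the paper's): for a reference stimulus x, classifier output f, and stage-k activations a_k,

```latex
D(x)   = \{\, x' : f(x') = f(x) \,\}       % stimuli that evoke the same classification judgment
M_k(x) = \{\, x' : a_k(x') = a_k(x) \,\}   % model metamers of x at stage k
M_k(x) \subseteq D(x)                      % identical stage-k activations force an identical judgment
```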
Humans performed a classification task on metamers generated from different stages of a model. We investigated both *audio* and *visual* models. If model invariances are shared by humans, humans should be able to classify model metamers as the reference stimulus class.
🧵5/N
Successive stages of a model may build up invariance, producing successively larger sets of model metamers. Do these metamers remain recognizable to humans for commonly used computational models, as they would in a “correct” model?
🧵6/N
We tested various supervised neural network models, including convolutional architectures, transformers, and models trained on large datasets. In all cases, model metamers generated from the final stages appeared unnatural and were generally unrecognizable to humans.
🧵7/N
This general method has previously been used for model visualization in computer science papers, but the significance for models of human perception has gone mostly unnoticed.
🧵8/N
We quantified these observations with human behavioral experiments. For metamers from the final stages of the tested models, humans were nearly at chance on the task, even though the model's representations of these metamers were identical to those of the natural stimuli (and the model recognized them as such).
🧵9/N
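As an illustration of the model-side check (hypothetical helper, not the paper's code), one can verify that the model assigns a metamer the same label as its natural reference:

```python
import torch

@torch.no_grad()
def model_recognizes_metamer(model, metamer, reference):
    """True if the model gives the metamer the same label as its natural reference."""
    pred_metamer = model(metamer.unsqueeze(0)).argmax(dim=-1)
    pred_reference = model(reference.unsqueeze(0)).argmax(dim=-1)
    return bool((pred_metamer == pred_reference).item())
```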
Could these misaligned invariances be due to the supervised task? To get at this, we tested visual self-supervised models. Although some models had slightly more recognizable metamers at intermediate stages, human recognition was still low in absolute terms.
🧵10/N
Another discrepancy between current models and humans is the tendency for models to base their judgments on texture rather than shape. However, we found that models trained to reduce this texture bias had metamers that were similarly unrecognizable to humans.
🧵11/N
Can we fix this human-model discrepancy? We found that humans were better able to recognize metamers from models trained with *adversarial training*. Adversarial examples were generated online and models were trained to associate them with the correct label.
🧵12/N
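A rough sketch of that training loop, using a generic PGD-style attack (illustrative hyperparameters, not the ones used in the paper):

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=5):
    """Generate perturbed inputs within an eps-ball that increase the classification loss."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = (x_adv + alpha * grad.sign()).detach()
        x_adv = x + torch.clamp(x_adv - x, -eps, eps)   # project back into the eps-ball
        x_adv = torch.clamp(x_adv, 0, 1)
    return x_adv

def adversarial_training_step(model, optimizer, x, y):
    x_adv = pgd_attack(model, x, y)             # adversarial examples generated online
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)     # train to assign the *correct* label
    loss.backward()
    optimizer.step()
    return loss.item()
```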
Metamers from adversarially trained models appeared more natural and were more recognizable to humans. But note that at late stages the metamers were still not fully recognizable; the training does not fully eliminate the discrepancy with humans.
🧵13/N
We trained audio models with adversarial training and found the same result! These models also had more human-recognizable model metamers compared to their standard-trained counterparts.
🧵14/N
Is this just another test to assess adversarial vulnerability? NO! Even though adversarial training improved human-recognizability of model metamers, within adversarially trained models, metamer recognizability was not predicted by adversarial robustness.
🧵15/N
This result suggests that something about the adversarial training procedure aligns model invariances with those of humans, but robustness itself does not drive the effect.
🧵16/N
We also examined other sources of adversarial robustness: architectural changes to reduce aliasing (the “Lowpass” model) and a V1-inspired front-end (the “VOne” model). Although both yielded similar robustness, the Lowpass architecture had more recognizable metamers.
🧵17/N
Metamer recognizability also dissociated from other forms of robustness, such as susceptibility to class-preserving image corruptions.
🧵18/N
So what is happening with these model representations to cause them to be misaligned with humans? To get at this, we tested how well a model’s metamers were recognized by other models.
🧵19/N
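Schematically, the cross-model test looks like this (hypothetical data structures; not the released evaluation code):

```python
import torch

@torch.no_grad()
def other_model_recognition(metamers, reference_labels, other_models):
    """For metamers generated from one model, measure how often other models
    assign the reference label. `metamers`: (N, ...) tensor; `reference_labels`: (N,) tensor."""
    rates = {}
    for name, model in other_models.items():
        preds = model(metamers).argmax(dim=-1)
        rates[name] = (preds == reference_labels).float().mean().item()
    return rates   # recognition rate per model, to compare against human recognizability
```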
In my favorite result of the paper, we found that human recognizability was well correlated with other-model recognizability. Thus, the discrepant metamers are due to the models having *idiosyncratic invariances* that are not shared with other models or human observers!
🧵20/N
Might humans analogously have invariances that are specific to an individual? This is hard to test definitively given that we do not have analogous access to human perceptual systems, and cannot generate human metamers at will.
🧵21/N
If idiosyncratic invariances were also present in humans, the phenomenon we describe might not represent a human–model discrepancy and could instead be a common property of recognition systems.
🧵22/N
The main argument against this interpretation is that several model modifications (adversarial training & architectural tweaks to reduce aliasing) reduced the idiosyncratic invariances present in the models, suggesting that they are not unavoidable in a recognition system.
🧵23/N
Lots more can be found in the paper, including experiments with brain predictions, regularization, and “classic” models. We also released code to generate metamers from your favorite PyTorch model and to run the human recognition experiments:
github.com/jenellefeath...
🧵24/N
Finally, a very big *thank you* to our reviewers for this article! Their feedback improved our paper. It also would not have been possible without the support during my PhD from MIT Brain and Cog and the DOE CSGF. Thanks for reading!
Link again:
www.nature.com/articles/s41...
🧵25/N