Topic Segmentation
In this analysis, we project a segmentation map corresponding to an input text onto the image.
The segmentation map is computed using different heads to illustrate the property that each head characterizes.
The heatmap is shown in blue and highlights the image regions that match the input text description.
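The projection described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual implementation: it assumes that the contribution of a single attention head to each image patch is already available as a vector in the joint image-text space, and the names `head_heatmap`, `patch_contribs`, `text_embedding` and `grid_size` are hypothetical.

```python
import numpy as np

def head_heatmap(patch_contribs, text_embedding, grid_size):
    """Score each image patch against a text embedding for one head.

    patch_contribs: (grid_size * grid_size, dim) array, assumed to hold the
        per-patch contribution of one attention head (hypothetical input).
    text_embedding: (dim,) array for the input text description.
    Returns a (grid_size, grid_size) heatmap scaled to [0, 1].
    """
    patch_contribs = np.asarray(patch_contribs, dtype=np.float64)
    text_embedding = np.asarray(text_embedding, dtype=np.float64)

    # Normalize the text direction so scores are comparable across prompts.
    text_embedding = text_embedding / np.linalg.norm(text_embedding)

    # One similarity score per patch: dot product with the text direction.
    scores = patch_contribs @ text_embedding

    # Arrange the scores on the patch grid.
    heatmap = scores.reshape(grid_size, grid_size)

    # Min-max scale for display; a constant map becomes all zeros.
    lo, hi = heatmap.min(), heatmap.max()
    if hi > lo:
        return (heatmap - lo) / (hi - lo)
    return np.zeros_like(heatmap)

# Toy example: a 2x2 patch grid with 3-dimensional embeddings.
toy_contribs = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]]
toy_text = [2, 0, 0]
toy_map = head_heatmap(toy_contribs, toy_text, grid_size=2)
```

The resulting grid would then be upsampled to the image resolution and overlaid in blue. The grid size depends on the model's patch size; for a 224x224 input, for example, a patch size of 32 gives a 7x7 grid.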
Topic Segmentation results for Layer 22, Head 13 (a "geolocation" head). The model used is ViT-L-14 (LAION-2B).
The blue regions are focused on the "Eiffel Tower", "Christ", "Statue of Liberty" and "Taj Mahal", which are in France, Brazil, New York
and India respectively, as given in the text input. Note that no explicit information is provided,
such as that the Eiffel Tower is in Paris, France. Layer 22, Head 13 has geolocation properties and identifies the location implicitly.
Topic Segmentation results for Layer 11, Head 3 (an "environment/weather" head). The model used is ViT-B-16 (LAION-2B).
In the first image (left), the heatmap (blue) is focused on the "flowers", matching the text description.
In the second image (middle), the heatmap (blue) is concentrated on the "tornado", matching the text description.
In the last image, the heatmap (blue) is focused on the "sun", matching the description "Hot Summer".
Topic Segmentation results for Layer 10, Head 6 (an "emotion" head). The model used is ViT-B-32 (OpenAI-400M).
In the first image (left), the heatmap (blue) is most pronounced on the "smile" on a child's face, which
suits the text description. In the middle image, the heatmap is focused on the "fear" emotion in an image from the Conjuring
movie. Note that no explicit information is provided that the picture corresponds to the
"fear" emotion. In the last image, the heatmap is centered on the sad expression of "Thanos" from Marvel.