Assignment 3 Visual AI and Culture S26 (optional)
Final Guidelines for the Assignment:
In this assignment we will continue to work with materials and algorithms discussed in class to explore a custom-made corpus of digital images. We will use DV Explorer to explore this corpus and then write a small critical essay based on our findings. This can be worked on in pairs, or alone.
This assignment is optional. You can receive up to 5 extra credit points for your grade. All students in the course will already receive full credit for the assignment.
Part One:
The very first thing you have to do is to build an image dataset, a corpus. Your corpus should contain at least 150 images.
Guidelines for the Corpus building exercise:
- a dataset of images from a cultural domain is required; one about which you know a little or a lot is welcome.
- it does not have to be “high” culture or from the GLAM sector. I do not want you to take a pre-made dataset from another site, such as Kaggle.
- it does not have to be designed with computation in mind.
- it should address something that you suspect is not close to Microsoft COCO or Google’s Inception training.
- its collection can be automated (with Image Downloader, for example), but does not have to be.
- it can come from multiple accounts, sites, or a search engine, but the process of collection should be explained.
- it should not include images that show people's faces.
Tips on automating the collection:
You could think about automating this process by using certain extensions such as Image Downloader for Chrome, or by learning how to scrape images. Remember that part of building a corpus will be the organization of those images into categories, and potentially the renaming of them. You will want to keep track of where you get them and explain that as part of your corpus building exercise.
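If you do decide to rename the images with code, one way to keep track of where they came from is sketched below. This is only an illustration, not a required method: the folder names, the `img` prefix, and the three manifest columns are assumptions made up for the example, and you would adapt them to your own corpus.

```python
import csv
import shutil
from pathlib import Path

# File extensions treated as images (an assumption; extend as needed).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


def build_manifest(src_dir, dest_dir, source_note, prefix="img"):
    """Copy images from src_dir into dest_dir under sequential names.

    Returns one dictionary per image recording the new name, the
    original name, and a free-text note about where it was collected.
    """
    src, dest = Path(src_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    images = sorted(
        p for p in src.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    rows = []
    for i, path in enumerate(images, start=1):
        new_name = f"{prefix}_{i:04d}{path.suffix.lower()}"
        shutil.copy2(path, dest / new_name)  # originals are left untouched
        rows.append({
            "new_name": new_name,
            "original_name": path.name,
            "source": source_note,
        })
    return rows


def write_manifest(rows, csv_path):
    """Save the manifest rows as a CSV file you can cite in your write-up."""
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(
            f, fieldnames=["new_name", "original_name", "source"]
        )
        writer.writeheader()
        writer.writerows(rows)
```

A call such as `write_manifest(build_manifest("raw_downloads", "corpus", "search engine results, collected by hand"), "manifest.csv")` would produce a renamed copy of the images plus a table you can use when explaining your collection process; both folder names here are hypothetical.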
Some prior knowledge of the subject is important, since in the second part you will classify the images into subtypes. Choosing a dataset of images you are not very interested in, or one that you do not know well enough to categorize, will make this difficult.
You may end up collecting some of the images and then removing or adding others later, particularly in the move between part two and parts three and four.
Part Two:
The first computational exercise is a clustering exercise, of the “let-the-data-group-itself” sort.
I suggest you use this part of the assignment to do some exploratory data analysis (EDA). Using the pre-set Orange Data Mining workflow I distributed (images2.ows), you can address the following questions:
- How does one built-in algorithm, such as Inception v3 in Orange Data Mining, cluster the data?
- Do other algorithms give you different results?
- What features seem to be most characteristic of the different quadrants of the image plot?
- Using hierarchical clustering, isolate specific clusters to look more closely. What are the groups that it suggests? Do these correspond to the way that you would group them?
- Pick a group that is not so clear to you and propose a few possibilities.
- How “in reading a corpus of visual culture through a neural network, [are you] always also doing the reverse?” (Impett and Offert, 2023)
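For those curious about what hierarchical clustering does behind the interface, the idea can be sketched in a few lines. This is a simplified illustration using average linkage and cosine distance, not Orange's own implementation; the file names and the three-number "embeddings" are invented for the example (real image embeddings have hundreds or thousands of dimensions), and the vectors are assumed to be non-zero.

```python
import math


def cosine_distance(a, b):
    """1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def agglomerative(vectors, n_clusters):
    """Average-linkage agglomerative clustering.

    Starts with every item in its own cluster and repeatedly merges the
    two clusters whose members are, on average, closest to each other.
    Returns a list of clusters, each a sorted list of item indices.
    """
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        best = None  # (average distance, index i, index j)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                dists = [
                    cosine_distance(vectors[a], vectors[b])
                    for a in clusters[i]
                    for b in clusters[j]
                ]
                avg = sum(dists) / len(dists)
                if best is None or avg < best[0]:
                    best = (avg, i, j)
        _, i, j = best
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return clusters


# Invented toy data: each image is reduced to three made-up numbers.
toy = {
    "poster_a.jpg": [0.9, 0.1, 0.0],
    "poster_b.jpg": [0.8, 0.2, 0.1],
    "map_a.jpg": [0.1, 0.9, 0.2],
    "map_b.jpg": [0.0, 0.8, 0.3],
}
names = list(toy)
groups = agglomerative([toy[n] for n in names], n_clusters=2)
for group in groups:
    print([names[i] for i in group])
```

Note that the algorithm only sees the numbers, never the file names or your own sense of what the images depict, which is one reason its groups may not match yours.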
Make screenshots of the main image plot you generate and annotate them with labels (typed on manually with a program like Preview) that illustrate, in your opinion, what combination of features is represented in each of the four quadrants of the plot. Caption each image with the relevant information (e.g. Image Plot generated with Orange Data Mining and the algorithm of your choice). Also include image plots of specific clusters that can help you understand how the out-of-the-box algorithms group the images (think of the Monet/Manet video).
Be sure to comment on how differently you see the images.
Part Three:
In the second computational exercise you will categorize the images using Orange Data Mining into at least two different groupings, although more than two are suggested. The categorization is carried out by placing the images into subfolders (each bearing a descriptive title). This means that for the purposes of your assignment, you should have two or three copies of the same images, organized in different folders according to the categories you have chosen (e.g. size, shape, color, pixelization OR origin, quantity, size). For best results, your classification system should have a roughly equal number of images in every category.
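If you want a quick way to check how evenly your images are spread across the subfolders of one grouping, a small script can count them for you. The sketch below is optional and makes assumptions: the folder name in the usage note is hypothetical, and the list of image extensions may need adjusting for your files.

```python
from pathlib import Path

# File extensions treated as images (an assumption; extend as needed).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


def count_by_category(grouping_dir):
    """Return {subfolder name: number of image files} for one grouping."""
    counts = {}
    for sub in sorted(Path(grouping_dir).iterdir()):
        if sub.is_dir():
            counts[sub.name] = sum(
                1 for p in sub.iterdir()
                if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


def imbalance_ratio(counts):
    """Largest category size divided by the smallest.

    1.0 means perfectly even categories; an empty category gives infinity.
    """
    smallest = min(counts.values())
    if smallest == 0:
        return float("inf")
    return max(counts.values()) / smallest
```

For example, `count_by_category("by_color")` would return something like a dictionary of folder names and counts, and `imbalance_ratio` on that result tells you how lopsided the grouping is. There is no fixed threshold for "too uneven"; the number is only a prompt to look again at your categories.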
For each of the sets of images, please generate a confusion matrix and assess the ability of the chosen algorithm (Inception, VGG, OpenFace, etc.) to predict the categories you have chosen. Remember that you can click on a box where there has been misclassification and visualize the misclassified objects in the viewer. Discussing what the model got wrong can be very interesting (perhaps more interesting than what it got right).
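To make clear what a confusion matrix actually counts, here is a small sketch that builds one by hand from a list of your labels and a list of predicted labels. Orange produces this for you, so the code is purely explanatory; the file names and labels are invented for the example.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Return (labels, matrix), where matrix[i][j] counts the images whose
    actual label is labels[i] and whose predicted label is labels[j]."""
    labels = sorted(set(actual) | set(predicted))
    counts = Counter(zip(actual, predicted))
    matrix = [[counts[(a, p)] for p in labels] for a in labels]
    return labels, matrix


def misclassified(names, actual, predicted):
    """List (file name, actual label, predicted label) for every miss."""
    return [
        (n, a, p)
        for n, a, p in zip(names, actual, predicted)
        if a != p
    ]


def accuracy(actual, predicted):
    """Share of images whose predicted label matches the actual label."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


# Invented toy data: five images, two categories.
names = ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"]
actual = ["map", "map", "poster", "poster", "poster"]
predicted = ["map", "poster", "poster", "poster", "map"]

labels, matrix = confusion_matrix(actual, predicted)
print(labels)
for label, row in zip(labels, matrix):
    print(label, row)
print(misclassified(names, actual, predicted))
print(accuracy(actual, predicted))
```

The off-diagonal cells of the matrix are the misclassifications, and the list of missed file names is the equivalent of clicking on one of those cells in Orange to see which images ended up there.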
Guiding questions for this part of the assignment:
- How well did any of the built-in algorithms you chose predict the categories you established?
- Did you go back, adjust any of your categories, and retry? What was the rationale for adjusting your categories?
- When you isolate the cases of “false positives”, i.e. mis-predictions, can you understand why the algorithm got them wrong? (Think about the Monet/Manet mis-prediction case we saw in class).
- If you could train your own algorithm to look at specific features or deal with specific kinds of data, what would you aim to teach it?
In your write-up, you can include insights and ideas from the book Distant Viewing (Arnold & Tilton), ch. 1, on the importance of labeling.
Make sure that you include images from your analysis.
What to include in your assignment
Since this is a visual assignment, you are not limited in the number of images you can use. As in previous assignments, screenshots of key points that match your analysis will be graded the highest. If you depart from the exercise at all and use code, or other means for acquiring or manipulating the image data, please explain and include that code (if appropriate). If you feel comfortable sharing the images in GitHub, you can do so in the assets folder. If not, make sure that you make the images available for me to look at in Drive, with a link shared with me. Be sure to connect your writing with any links to resources that are important for your reader. Also, connect your analysis to anything from the book Distant Viewing or the other articles from the third section of class.
Due Date for this assignment: 12 May 2026, 11:59pm, 1500-2000 words maximum.