3 minute read

DRAFT Guidelines for the Assignment:

The Spatial Data Assignment is an assignment in one step. It builds upon the work we did in class since 25 February.

This assignment must be done in pairs or groups of three. It can be written together, but should be posted on each of the team member’s course site.

We will work with the Zanzibar Gazette pdfs (in Drive) on a project entitled “Thick Mapping Zanzibar (1892-1919): Visualizing the Zanzibar Gazette.” Each team will work on a longitudinal dataset, ie a dataset that collects information from over time.

The datasets that you generate will be all put together by the instructor after we are done to attempt some thick mapping of Zanzibar from the perspective of the British gazette.

This exercise has five main elements: (1) choosing an element of the source material (2) identifying the spatial elements of any humanities source material (3) modeling a spatial dataset based on those elements, (4) enriching the spatial data with any other relevant metadata, (5) visualizing the data on a map, discussion and interpretion.

Step 1: Choice of topic that you would like to work on from within the Zanzibar Gazette pdfs. You should aim to create a dataset which has about 100 rows of data per person on your team. Please check with the instructor if you use one which is not included below.

  • Imports into Zanzibar
  • Exports from Zanzibar
  • Shipping routes
  • Licenses Issued (for commercial purposes)
  • African Produce imported
  • other

Your choice of topic will need to be after research about that topic across the newspaper. Remember that after 1913, the Gazette is more formalized.

Step 2: Modeling the data. You should identify what data is available from the source.

For this exercise, you will semi-automate the process of data collection and you will need an account at openai.com or Gemini (the latter is automatic when logged in with your NYU netID).

Step 3: Possible enriching the data. Sorting and combining the data: choosing the most important elements. Geocoding.

When you are working with the data one of the things you will want to do is to examine what kinds of other information can be associated with the main points. You will need to add on columns which achieve that. These will be different according to the source.

This might be the case if you have data over time. You can bring together different tables, by the addition of a date column.

Finally, you can geocode by hand or automate that process using Geocode by Awesome Table so that you have latitude and longitude columns.

Step 4: Visualizing the data. Discussion and Interpretation.

Your choices for visualizing the spatial data are numerous. Perhaps the easiest is kepler.gl or UMAP.

You may also choose to use some visualization from within Google Sheets itself.

Insert the following line where you want the image to show up in the body of the .md file of your assignment:

<img src="/assets/{filename}.png" style="zoom:50%"/>

Use the zoom percentage to size the file. An example of the code of one of your instructor’s research projects can be found here

Guiding questions for this assignment:

  • What kind of source is it?
  • What kinds of data does that source provide?
  • Was the data directly or indirectly communicated? In other words, did you have to assume anything to create new columns?
  • How did you create a prompt for GPT or Gemini to automate the extraction of the information?
  • How good was the automation? What did you have to change?
  • When you mapped the data, did you see any interesting patterns? Were there areas of concentration of points? Did mapping the additional columns (profession, gender, etc) lead to any visible clusters?
  • If you were to do this over many more pages then you did for this assignment, what kind of results do you think the different scale would give?
  • If you were to do this assignment with a source not included here, what would your ideal source be?

Your assignment should be about 1500 words long, with several images, embedded links. Use a markdown cheatsheet such as this one to stylize your post, adding different layout features and embedded links if needed. You can refer to the readings if you want to, but this is not necessary for this assignment. Due date: 30 March.