Digitizing The New York Times archive with Google Cloud

First New York City subway, 1904. And there you have it. The morgue is what makes The Times The Times. There’s six hundred cabinets, a few thousand drawers. Six to eight million photographs dating from the late 1800s on till the 1990s. This is the Flying Hunters. This ran in 1930. George Washington Bridge. France’s biggest naval ship. American soldier greeting his mom. Christmas at Penn Station. I mean, there’s pretty much anything and everything. The history of the world through the eyes of The New York Times.

Allan: I didn’t think too much about it at first. Like, “Yeah, sure, we have photo archives.” And then I learned more about it, and there were, like, millions and millions of photos down there that, except for, like, one person, nobody really knows what’s hiding down there.

For every picture that we were able to publish, many never saw the light of day. The more that I work down in the morgue, the more the real value and the real importance of having access to it dawned on me.

The exciting thing about this project is The New York Times has over a hundred years of photos locked in a basement, and this project will allow people to see photos that have never been seen before and make them accessible to the newsroom at The New York Times.

Allan: When I first heard that there was a possibility of us digitizing this, I was really excited.

So Google and The New York Times have had a partnership for many years. And we think it’s actually an ideal partnership to leverage the power of Google Cloud’s technology. What Google is bringing to the table is a lot of the infrastructure that The New York Times needs as well as providing the, sort of, platform-level services on the Vision APIs.

Allan: The first part of the digitization process is obviously the physical scan of the photos. Getting them out of folders, getting them out of all these boxes and physically scanning them.

Jeff: So here’s the front of the picture, but the back of the picture is just as interesting.

Allan: The stamps, handwritten notes, etc.
That tells us something about the photo, who took it, etc. That is the data we need to extract.

Nancy: These markers all over the back are the clues for where the picture was used. So here it was published in the newspaper at least twice. Here we can see the captions that were taped on the back indicating publication, and along this top edge there’s a number.
photograph lives inside the morgue. Samuel: We’ll upload them into tools, which will allow the photo editors to search the
archive and bring up the images they need. Allan: So once we’re done with this,
it will enable the newsroom to immediately access our entire archive from their desktop. Jeff: Once the pictures are
digitized, I mean, everything old is new again. Cornelius: We get the sense
that covering current events is talking about what just happened. But having this resource available
to reporters and editors gives them the ability to draw in
all the context of what preceded it, the wider world that led to
this contemporary event. Nancy: There is nothing else,
no other way of reporting what goes on
in the universe that can do that the way
a still photograph can. Cornelius: The idea of telling stories in pictures is
how society works now. Samuel: For Google, our job
is to make the world’s information universally
accessible and useful. And in this project, we’re helping The New York Times
with their data to be able to do that.
