Metadata Cataloging, Storage, and Retrieval of Multilingual Motion Picture Subtitles: An XML Digital Library
Abstract:
The popularity of motion pictures in digital form has seen a dramatic
increase in recent years, and the global entertainment market has driven
demand for subtitles in multiple languages. This paper investigates the
informational potential of aggregating a corpus of multilingual subtitles
for a digital library. Subtitles are extracted from commercial DVD
releases and downloaded from the internet. These subtitles and their
bibliographic metadata are then incorporated in an XML-based database
structure. A prototype digital library is developed to provide full-text
search and browse of the subtitle text with single- or parallel-language
displays. The resulting product includes a set of tools for subtitle
acquisition and a web browser-based digital library prototype that is
portable, extensible, and interoperable across computing platforms. The
functionalities of this prototype are discussed in comparison to another
subtitle corpus created for computational linguistics studies. Several
informational potentials of this digital library
prototype are identified: as an educational tool for language learning, as
a finding aid for citations, and as a gateway for additional temporal
access points for video retrieval.
Keywords: metadata, subtitles, digital library, cataloging, XML,
SRT, motion pictures
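The step of incorporating SRT subtitles into an XML structure can be illustrated with a minimal sketch. This is not the project's actual tooling or schema: the element and attribute names (`film`, `subtitles`, `cue`, `lang`, `start`, `end`), the function names, and the sample subtitle text are all assumptions made for illustration. It uses only the Python standard library.

```python
import re
import xml.etree.ElementTree as ET

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,500".
TIMING = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2},\d{3})")


def parse_srt(text):
    """Return a list of (index, start, end, text) tuples from SRT content."""
    cues = []
    # SRT cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if len(lines) < 2 or not lines[0].isdigit():
            continue  # skip malformed blocks
        match = TIMING.match(lines[1])
        if not match:
            continue
        # Join multi-line subtitle text into a single string.
        cues.append((int(lines[0]), match.group(1), match.group(2), " ".join(lines[2:])))
    return cues


def srt_to_xml(title, language, text):
    """Wrap parsed cues in a hypothetical XML layout (not the project's schema)."""
    root = ET.Element("film", {"title": title})
    track = ET.SubElement(root, "subtitles", {"lang": language})
    for index, start, end, line in parse_srt(text):
        cue = ET.SubElement(track, "cue", {"n": str(index), "start": start, "end": end})
        cue.text = line
    return root


# Made-up sample data, for illustration only.
SAMPLE = """1
00:00:01,000 --> 00:00:03,500
Hello, world.

2
00:00:04,000 --> 00:00:06,000
This is a
two-line subtitle.
"""

if __name__ == "__main__":
    print(ET.tostring(srt_to_xml("Example Film", "en", SAMPLE), encoding="unicode"))
```

Keeping the start and end timecodes as attributes on each cue is one way to preserve the temporal access points mentioned in the abstract; a real schema would also need to carry the bibliographic metadata for each film.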
Welcome to the Multilingual Motion Picture Subtitle Database
This is a prototype created by Kimmy Szeto and Helena Marvin for their thesis project for the Graduate School of Library and Information Studies at Queens College, City University of New York.
The database includes subtitles for 10 motion pictures in English, Spanish and French. Users are able to browse, perform full-text search, and display subtitles in parallel languages.
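As a rough illustration of how full-text search and parallel-language display could fit together, the sketch below searches one language track and then looks up the cues in a second track whose time ranges overlap each hit. This is an assumed approach, not a description of how the prototype works; the function names, the tuple layout `(start, end, text)`, and the sample lines are hypothetical.

```python
def to_ms(stamp):
    """Convert an SRT timestamp 'HH:MM:SS,mmm' to milliseconds."""
    clock, millis = stamp.split(",")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + int(millis)


def search(track, query):
    """Case-insensitive substring search over the text of each cue."""
    needle = query.casefold()
    return [cue for cue in track if needle in cue[2].casefold()]


def parallel(cue, other_track):
    """Return cues from another language track that overlap `cue` in time."""
    start, end = to_ms(cue[0]), to_ms(cue[1])
    return [o for o in other_track if to_ms(o[0]) < end and to_ms(o[1]) > start]


# Made-up sample tracks: (start, end, text). Not taken from the database.
EN = [
    ("00:00:01,000", "00:00:03,000", "Good morning."),
    ("00:00:04,000", "00:00:06,000", "Where are you going?"),
]
ES = [
    ("00:00:01,100", "00:00:03,100", "Buenos días."),
    ("00:00:04,050", "00:00:06,100", "¿Adónde vas?"),
]

if __name__ == "__main__":
    for hit in search(EN, "morning"):
        print(hit[2], "|", " ".join(o[2] for o in parallel(hit, ES)))
```

Matching by time overlap rather than by cue number is a deliberate choice in this sketch, since subtitle tracks in different languages often split the same dialogue into different numbers of cues.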
Click here to experience the prototype
View or Download the PDF of the paper Metadata Cataloging, Storage, and Retrieval of Multilingual Motion Picture Subtitles: An XML Digital Library
View a flowchart (as a PDF) that explains our XML database creation process
View the XML Database we created
Download related software
With special thanks to:
- Rob Deary
- Alan Jessen
- Arthur Peters
Tuesday, December 15th, 2009