Metadata Cataloging, Storage, and Retrieval of Multilingual Motion Picture Subtitles: An XML Digital Library

Abstract:
The popularity of motion pictures in digital form has increased dramatically in recent years, and the global entertainment market has driven demand for subtitles in multiple languages. This paper investigates the informational potential of aggregating a corpus of multilingual subtitles for a digital library. Subtitles are extracted from commercial DVD releases and downloaded from the internet. These subtitles and their bibliographic metadata are then incorporated in an XML-based database structure. A prototype digital library is developed to provide full-text search and browse of the subtitle text with single- or parallel-language displays. The resulting product includes a set of tools for subtitle acquisition and a web browser-based digital library prototype that is portable, extensible, and interoperable across computing platforms. The functionalities of this prototype are discussed in comparison to another subtitle corpus created for computational linguistics studies. Several potential uses of this digital library prototype are identified: as an educational tool for language learning, as a finding aid for citations, and as a gateway for additional temporal access points for video retrieval.
Keywords: metadata, subtitles, digital library, cataloging, XML, SRT, motion pictures
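
As a rough illustration of how SRT subtitle text and a small amount of bibliographic metadata might be combined in one XML record, the sketch below parses SRT cues and wraps them in an XML tree. The element and attribute names (film, title, subtitles, cue, n, start, end, lang) and the function name srt_to_xml are assumptions made for this example only; they are not the schema or the tools described in the paper.

```python
import re
import xml.etree.ElementTree as ET

# Matches an SRT timing line such as "00:01:02,500 --> 00:01:04,000".
TIMING = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2},\d{3})")


def srt_to_xml(srt_text, title, lang):
    """Convert SRT text into a hypothetical <film> XML tree (illustrative schema)."""
    film = ET.Element("film")
    ET.SubElement(film, "title").text = title
    subs = ET.SubElement(film, "subtitles", {"lang": lang})

    # Normalize line endings; SRT cues are separated by blank lines.
    normalized = srt_text.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.strip().splitlines()
        if len(lines) < 2:
            continue  # not a complete cue
        match = TIMING.search(lines[1])
        if not match:
            continue  # skip blocks without a valid timing line
        cue = ET.SubElement(
            subs,
            "cue",
            {"n": lines[0].strip(), "start": match.group(1), "end": match.group(2)},
        )
        # Join multi-line cue text into a single string.
        cue.text = " ".join(line.strip() for line in lines[2:])
    return film


if __name__ == "__main__":
    sample = (
        "1\n00:00:01,000 --> 00:00:02,500\nHello there.\n\n"
        "2\n00:00:03,000 --> 00:00:04,000\nHow are\nyou?\n"
    )
    print(ET.tostring(srt_to_xml(sample, "Example Film", "en"), encoding="unicode"))
```

Keeping the start and end times as attributes on each cue is one way to preserve the temporal access points mentioned in the abstract; a real schema would likely carry fuller bibliographic metadata than a single title.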

Welcome to the Multilingual Motion Picture Subtitle Database

This is a prototype created by Kimmy Szeto and Helena Marvin for their thesis project for the Graduate School of Library and Information Studies at Queens College, City University of New York.


The database includes subtitles for 10 motion pictures in English, Spanish and French. Users are able to browse, perform full-text search, and display subtitles in parallel languages.
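
One plausible way to build a parallel-language display is to pair cues from two languages by how much their time ranges overlap. The sketch below is a minimal version of that idea and is an assumption for illustration: this page does not describe how the prototype aligns languages, and the names to_ms and align_by_overlap are hypothetical.

```python
def to_ms(stamp):
    """Convert an SRT timestamp such as '00:01:02,500' to milliseconds."""
    hours, minutes, rest = stamp.split(":")
    seconds, millis = rest.split(",")
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def align_by_overlap(cues_a, cues_b):
    """Pair each cue in cues_a with the cue in cues_b that overlaps it most in time.

    Each cue is a (start, end, text) tuple with SRT-style timestamps.
    A cue with no overlapping counterpart is paired with None.
    """
    pairs = []
    for start_a, end_a, text_a in cues_a:
        best_text, best_overlap = None, 0
        for start_b, end_b, text_b in cues_b:
            # Overlap is the length of the shared interval; it is negative
            # when the two cues do not intersect.
            overlap = min(to_ms(end_a), to_ms(end_b)) - max(to_ms(start_a), to_ms(start_b))
            if overlap > best_overlap:
                best_text, best_overlap = text_b, overlap
        pairs.append((text_a, best_text))
    return pairs


if __name__ == "__main__":
    english = [
        ("00:00:01,000", "00:00:02,500", "Hello."),
        ("00:00:03,000", "00:00:04,000", "Goodbye."),
    ]
    spanish = [
        ("00:00:01,100", "00:00:02,400", "Hola."),
        ("00:00:05,000", "00:00:06,000", "Gracias."),
    ]
    for left, right in align_by_overlap(english, spanish):
        print(left, "|", right)
```

This nested loop is quadratic in the number of cues, which is acceptable for a single film but would need a sorted sweep for a large corpus. Subtitle tracks in different languages often split lines differently, so a one-to-one pairing like this is only an approximation.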

Click here to experience the prototype

View or Download the PDF of the paper Metadata Cataloging, Storage, and Retrieval of Multilingual Motion Picture Subtitles: An XML Digital Library

View a flowchart (as a PDF) that explains our XML database creation process

View the XML Database we created

Download related software

With special thanks to:

Tuesday, December 15th, 2009