Introduction to the TimeBank 1.2 Dataset, Catalog No. LDC2006T08

TimeBank 1.2

Item Name: TimeBank 1.2
Author(s): James Pustejovsky, Marc Verhagen, Roser Sauri, Jessica Littman, Robert Gaizauskas, Graham Katz, Inderjeet Mani, Robert Knippen, Andrea Setzer
LDC Catalog No.: LDC2006T08
ISBN: 1-58563-386-0
ISLRN: 717-712-373-266-4
DOI:
Release Date: April 17, 2006
Member Year(s): 2006
DCMI Type(s): Text
Data Source(s): newswire
Application(s): information extraction, temporal analysis
Language(s): English
Language ID(s): eng
License(s): LDC User Agreement for Non-Members
Online Documentation: LDC2006T08 Documents
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Pustejovsky, James, et al. TimeBank 1.2 LDC2006T08. Web Download. Philadelphia: Linguistic Data Consortium, 2006.
Related Works: hasAnnotation: LDC2009T23 FactBank 1.0; isSimilarWith: LDC2012T01 ModeS TimeBank 1.0, LDC2012T10 Catalan TimeBank 1.0, LDC2012T12 Spanish TimeBank 1.0; relatesTo: LDC2009T07 Unified Linguistic Annotation Text Collection

Introduction
TimeBank 1.2 was developed by Brandeis University and contains 183 English news articles annotated with events, temporal expressions, and the temporal links between them, for a total of over 27,000 annotations. The annotation follows the TimeML 1.2.1 specification.

Data
TimeML aims to capture and represent temporal information. This is accomplished using four primary tag types: TIMEX3 for temporal expressions, EVENT for temporal events, SIGNAL for temporal signals, and LINK for representing relationships. For a detailed description of TimeML, see the TimeML 1.2.1 Specification and Guidelines included in the corpus package documentation.
Here are descriptions for each tag:
TIMEX3 - Captures dates, times, durations, and sets of dates and times.
EVENT - Annotates those elements in a text that mark the semantic events described by it.
MAKEINSTANCE - Creates tags for events that include information about a particular instance of the event. When an event participates in a relationship, it is actually the event instance that is referenced.
SIGNAL - Annotates temporal function words such as "after," "during," and "when."
The following three tags are link tags. They capture temporal, subordination, and aspectual relationships found in the text. These tags do not consume any actual text, but they do relate the four tag types above to each other.
TLINK - Temporally relates two temporal expressions, two event instances, or a temporal expression and an event instance.
SLINK - Captures subordination relationships that involve event modality, evidentiality, and factuality.
ALINK - Captures an aspectual connection between two event instances.
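To make the relationship between these tags concrete, here is a minimal Python sketch that parses a small TimeML-style fragment and resolves a TLINK back to the text it connects. The sample sentence is hand-written for illustration and is not taken from the corpus; the attribute names (eid, eiid, eventID, tid, relType, eventInstanceID, relatedToTime) reflect my understanding of TimeML 1.2.1 markup and should be checked against the specification shipped with the corpus.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hand-written illustrative fragment, NOT an excerpt from TimeBank.
# Attribute names are assumed to follow TimeML 1.2.1 conventions.
SAMPLE = """<TimeML>
The company <EVENT eid="e1" class="OCCURRENCE">announced</EVENT> the merger
<SIGNAL sid="s1">on</SIGNAL> <TIMEX3 tid="t1" type="DATE" value="1998-02-13">February 13, 1998</TIMEX3>.
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE" polarity="POS" pos="VERB"/>
<TLINK lid="l1" relType="IS_INCLUDED" eventInstanceID="ei1" relatedToTime="t1" signalID="s1"/>
</TimeML>"""

root = ET.fromstring(SAMPLE)

# Count every annotation tag below the root element.
tag_counts = Counter(el.tag for el in root.iter() if el is not root)

# Lookup tables: event id -> text, instance id -> event id, time id -> value.
events = {e.get("eid"): "".join(e.itertext()) for e in root.iter("EVENT")}
instances = {m.get("eiid"): m.get("eventID") for m in root.iter("MAKEINSTANCE")}
times = {t.get("tid"): t.get("value") for t in root.iter("TIMEX3")}

# A TLINK references the event *instance*, so resolve it through
# MAKEINSTANCE to reach the EVENT text. Only event-to-time links are
# handled here; event-to-event links would need relatedToEventInstance.
relations = []
for link in root.iter("TLINK"):
    event_text = events[instances[link.get("eventInstanceID")]]
    time_value = times[link.get("relatedToTime")]
    relations.append((event_text, link.get("relType"), time_value))

print(tag_counts)
print(relations)
```

The indirection through MAKEINSTANCE mirrors the point made above: links refer to event instances rather than to EVENT tags directly.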
TimeBank 1.2 contains 183 articles with just over 61,000 non-punctuation tokens. The count for each TimeML tag is listed below:

Tag            Count
EVENT          7,935
MAKEINSTANCE   7,940
TIMEX3         1,414
SIGNAL         688
ALINK          265
SLINK          2,932
TLINK          6,418
Total          27,592
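
As a quick sanity check, the per-tag counts listed above can be summed to confirm the stated total. This is a small sketch using only the figures given here; the variable names are arbitrary.

```python
# Per-tag counts as listed for TimeBank 1.2.
tag_counts = {
    "EVENT": 7935,
    "MAKEINSTANCE": 7940,
    "TIMEX3": 1414,
    "SIGNAL": 688,
    "ALINK": 265,
    "SLINK": 2932,
    "TLINK": 6418,
}

total = sum(tag_counts.values())

# The three link tags (ALINK, SLINK, TLINK) taken together.
link_total = sum(tag_counts[t] for t in ("ALINK", "SLINK", "TLINK"))

print(f"total annotations: {total:,}")
print(f"link annotations:  {link_total:,}")
```

The sum matches the listed total of 27,592, of which 9,615 are link tags.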