ELAN and Praat

Overview

ELAN and Praat are two of the most widely used tools for time-aligned linguistic annotation. ELAN provides a multi-tier annotation model for transcription, gesture coding, sign language analysis, and multimodal annotation. Praat specializes in acoustic-phonetic analysis and stores its annotations in TextGrid files made up of interval and point tiers. Both anchor annotations primarily to time offsets in a media file.

ELAN (EAF)

Tier Architecture

| ELAN Concept | Layers Equivalent | Notes |
| --- | --- | --- |
| Annotation Document (.eaf file) | pub.layers.expression + pub.layers.media + annotation layers | The EAF file is a collection of tiers over a media file. In Layers, the media file is a media record, the document context is an expression, and each tier is an annotationLayer. |
| Tier | pub.layers.annotation#annotationLayer with kind="tier" | Each ELAN tier maps to an annotation layer with kind="tier". The subkind (via subkindUri) can specify the tier's semantic type (e.g., "transcription", "gesture", "translation"). |
| Tier @LINGUISTIC_TYPE_REF | annotationLayer.ontologyRef or subkindUri | Each ELAN linguistic type carries a stereotype (Symbolic_Association, Time_Subdivision, Included_In, etc.) that constrains how a tier relates to its parent. |
| Tier @PARENT_REF | annotationLayer.parentLayerRef | ELAN's parent-child tier relationships map directly to parentLayerRef. |
| Tier @PARTICIPANT | annotationLayer.metadata.personaRef or features.participant | Speaker/participant association. |
| Controlled Vocabulary | pub.layers.ontology | ELAN's controlled vocabularies map to Layers ontology typeDef entries. |
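
As a rough illustration of the tier mapping above, the following Python sketch reads TIER elements from an EAF document and emits one layer description per tier. The element and attribute names (TIER, TIER_ID, LINGUISTIC_TYPE_REF, PARENT_REF, PARTICIPANT, LINGUISTIC_TYPE, CONSTRAINTS) come from the EAF format. The output dictionary shape, the `elanConstraint` feature, and the use of the ELAN tier ID as a stand-in for a layer reference are assumptions made for illustration, not the Layers schema.

```python
import xml.etree.ElementTree as ET

# Minimal hand-written EAF fragment; real files carry more attributes.
EAF_SAMPLE = """<ANNOTATION_DOCUMENT>
  <TIER TIER_ID="utt" LINGUISTIC_TYPE_REF="utterance" PARTICIPANT="spk1"/>
  <TIER TIER_ID="tr" LINGUISTIC_TYPE_REF="translation" PARENT_REF="utt" PARTICIPANT="spk1"/>
  <LINGUISTIC_TYPE LINGUISTIC_TYPE_ID="utterance" TIME_ALIGNABLE="true"/>
  <LINGUISTIC_TYPE LINGUISTIC_TYPE_ID="translation" TIME_ALIGNABLE="false"
                   CONSTRAINTS="Symbolic_Association"/>
</ANNOTATION_DOCUMENT>"""


def tiers_to_layers(eaf_xml: str) -> list[dict]:
    """Turn each ELAN tier into a layer-like dict (hypothetical shape)."""
    root = ET.fromstring(eaf_xml)

    # Linguistic type ID -> stereotype name (None for top-level types).
    constraints = {
        lt.get("LINGUISTIC_TYPE_ID"): lt.get("CONSTRAINTS")
        for lt in root.iter("LINGUISTIC_TYPE")
    }

    layers = []
    for tier in root.iter("TIER"):
        type_id = tier.get("LINGUISTIC_TYPE_REF")
        layers.append({
            "id": tier.get("TIER_ID"),                 # assumption: tier ID reused as layer ID
            "kind": "tier",
            "subkind": type_id,                        # assumption: type ID used as subkind
            "parentLayerRef": tier.get("PARENT_REF"),  # None for independent tiers
            "features": {
                "participant": tier.get("PARTICIPANT"),
                "elanConstraint": constraints.get(type_id),
            },
        })
    return layers


layers = tiers_to_layers(EAF_SAMPLE)
```

A real importer would replace the tier IDs with whatever reference form Layers records use and would resolve the linguistic type against an ontology instead of copying its ID.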

Annotation Types

| ELAN Annotation Type | Layers Equivalent | Notes |
| --- | --- | --- |
| ALIGNABLE_ANNOTATION | annotation with anchor.temporalSpan | Time-aligned annotations with start/end time slots. TIME_SLOT_REF1/TIME_SLOT_REF2 → temporalSpan.start/temporalSpan.ending. |
| REF_ANNOTATION | annotation with anchor.tokenRef or parent reference | Annotations that reference parent tier annotations (via ANNOTATION_REF) rather than time directly. Represented via features linking to the parent annotation's UUID. |
| Time slots (TIME_ORDER) | Converted to millisecond values in temporalSpan | ELAN refers to times indirectly through time slot IDs. Layers stores millisecond values directly. |
| Annotation value | annotation.value or annotation.label | The text content of the annotation. |
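
The time slot indirection described above can be sketched in Python as follows. The EAF names (TIME_ORDER, TIME_SLOT, TIME_VALUE, ALIGNABLE_ANNOTATION, REF_ANNOTATION, ANNOTATION_REF, ANNOTATION_VALUE) are those of the EAF format. The output dictionaries are only an assumed approximation of a Layers annotation, and the `parentAnnotation` feature name is hypothetical.

```python
import xml.etree.ElementTree as ET

EAF_SAMPLE = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1500"/>
  </TIME_ORDER>
  <TIER TIER_ID="utt" LINGUISTIC_TYPE_REF="utterance">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
                            TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>hello world</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
  <TIER TIER_ID="tr" LINGUISTIC_TYPE_REF="translation" PARENT_REF="utt">
    <ANNOTATION>
      <REF_ANNOTATION ANNOTATION_ID="a2" ANNOTATION_REF="a1">
        <ANNOTATION_VALUE>hallo wereld</ANNOTATION_VALUE>
      </REF_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""


def convert_annotations(eaf_xml: str) -> list[dict]:
    root = ET.fromstring(eaf_xml)

    # Resolve slot IDs to milliseconds. TIME_VALUE is optional in EAF,
    # so unaligned slots are kept as None rather than guessed.
    slots = {}
    for slot in root.iter("TIME_SLOT"):
        value = slot.get("TIME_VALUE")
        slots[slot.get("TIME_SLOT_ID")] = int(value) if value is not None else None

    out = []
    for tier in root.iter("TIER"):
        tier_id = tier.get("TIER_ID")
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            out.append({
                "id": ann.get("ANNOTATION_ID"),
                "layer": tier_id,
                "anchor": {"temporalSpan": {
                    "start": slots[ann.get("TIME_SLOT_REF1")],
                    "ending": slots[ann.get("TIME_SLOT_REF2")],
                }},
                "value": ann.findtext("ANNOTATION_VALUE", default=""),
            })
        for ann in tier.iter("REF_ANNOTATION"):
            out.append({
                "id": ann.get("ANNOTATION_ID"),
                "layer": tier_id,
                # Hypothetical feature name for the parent link.
                "features": {"parentAnnotation": ann.get("ANNOTATION_REF")},
                "value": ann.findtext("ANNOTATION_VALUE", default=""),
            })
    return out


annotations = convert_annotations(EAF_SAMPLE)
```

The sketch keeps ELAN's annotation IDs for the parent link. A conversion that assigns fresh UUIDs would need a lookup table from ELAN ID to UUID before writing the REF_ANNOTATION links.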

ELAN Linguistic Types

| ELAN Linguistic Type | Layers Representation | Notes |
| --- | --- | --- |
| Time_Subdivision | Child annotationLayer with parentLayerRef, each annotation has its own temporalSpan | Subdivides parent annotation's time interval. |
| Symbolic_Subdivision | Child annotationLayer with parentLayerRef, annotations indexed by position | Subdivides parent without independent time alignment. |
| Symbolic_Association | Child annotationLayer with parentLayerRef, 1:1 annotation mapping | One annotation per parent annotation (e.g., translation of each utterance). |
| Included_In | Child annotationLayer with parentLayerRef, annotations within parent time bounds | Annotations contained within parent's time interval. |
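
The two time-aligned stereotypes differ in whether gaps are allowed: Included_In children only need to fall inside the parent's interval, while Time_Subdivision children must cover it without gaps. A converter might check this before writing child layers. The sketch below is one possible check over (start, ending) millisecond pairs; the function names are hypothetical, and treating a parent with no children as valid is an assumption.

```python
Span = tuple[int, int]  # (start_ms, ending_ms)


def within_parent(parent: Span, children: list[Span]) -> bool:
    """Included_In: every child lies inside the parent; gaps are fine."""
    return all(parent[0] <= start and ending <= parent[1]
               for start, ending in children)


def tiles_parent(parent: Span, children: list[Span]) -> bool:
    """Time_Subdivision: children cover the parent exactly, with no gaps."""
    if not children:
        return True  # assumption: an unsubdivided parent is acceptable
    ordered = sorted(children)
    if ordered[0][0] != parent[0] or ordered[-1][1] != parent[1]:
        return False
    # Each child must start exactly where the previous one ends.
    return all(prev[1] == nxt[0] for prev, nxt in zip(ordered, ordered[1:]))


parent = (0, 1500)
subdivision_ok = tiles_parent(parent, [(0, 700), (700, 1500)])
subdivision_gap = tiles_parent(parent, [(0, 600), (700, 1500)])
included_ok = within_parent(parent, [(100, 600), (700, 1400)])
included_overflow = within_parent(parent, [(100, 1600)])
```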

ELAN Metadata

| ELAN Feature | Layers Equivalent | Notes |
| --- | --- | --- |
| Media descriptors | pub.layers.media record | Audio/video file references. ELAN's MEDIA_DESCRIPTOR → media.externalUri or media.blob. Audio metadata (sampleRate, channels, bitDepth, codec, speakerCount) is stored as first-class fields. transcriptRef links to the expression containing the transcript; segmentationRef links to its segmentation. |
| Linked files | pub.layers.expression.features or sourceRef | Additional linked resources. |
| Author/date | pub.layers.defs#annotationMetadata | Creation metadata. |
| License | pub.layers.corpus.license (at corpus level) | Licensing information. |

Praat TextGrid

TextGrid Structure

| Praat Concept | Layers Equivalent | Notes |
| --- | --- | --- |
| TextGrid (file) | pub.layers.expression + pub.layers.media + annotation layers | A TextGrid is a collection of tiers over a sound file. |
| IntervalTier | pub.layers.annotation#annotationLayer with kind="tier" | Each interval tier is an annotation layer. Intervals map to annotations with anchor.temporalSpan. |
| PointTier (TextTier) | pub.layers.annotation#annotationLayer with kind="tier" | Point tiers have annotations with a single time point. Represented as temporalSpan where start == ending, or via features storing the point time. |
| Interval | pub.layers.annotation#annotation with anchor.temporalSpan | xmin/xmax → temporalSpan.start/temporalSpan.ending (converted from seconds to milliseconds). text → annotation.value. |
| Point | pub.layers.annotation#annotation with anchor.temporalSpan | time → temporalSpan.start (with ending = start). mark → annotation.value. |
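
The seconds-to-milliseconds conversion for intervals and points might look like the following Python sketch. It takes already-parsed (xmin, xmax, text) and (time, mark) tuples, so no TextGrid parser is assumed. Rounding to integer milliseconds and dropping empty-label intervals are illustrative choices, not requirements stated by Layers, and the function names are hypothetical.

```python
def seconds_to_ms(seconds: float) -> int:
    """Convert a Praat time in seconds to integer milliseconds."""
    return round(seconds * 1000)  # assumption: sub-millisecond detail is discarded


def interval_to_annotation(xmin: float, xmax: float, text: str) -> dict:
    return {
        "anchor": {"temporalSpan": {
            "start": seconds_to_ms(xmin),
            "ending": seconds_to_ms(xmax),
        }},
        "value": text,
    }


def point_to_annotation(time: float, mark: str) -> dict:
    ms = seconds_to_ms(time)
    # A point is represented as a zero-length span: start == ending.
    return {"anchor": {"temporalSpan": {"start": ms, "ending": ms}},
            "value": mark}


def interval_tier_to_annotations(intervals, skip_empty: bool = True) -> list[dict]:
    """Praat interval tiers cover the whole sound, so unlabeled stretches
    appear as empty-text intervals; optionally leave them out."""
    return [
        interval_to_annotation(xmin, xmax, text)
        for xmin, xmax, text in intervals
        if text.strip() or not skip_empty
    ]


phones = interval_tier_to_annotations([(0.0, 0.25, ""), (0.25, 1.5, "hello")])
tone = point_to_annotation(0.5, "H*")
```

Keeping empty intervals (skip_empty=False) preserves the original segmentation exactly, which matters if the Layers data is ever exported back to a TextGrid.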

Common Praat Tier Types

| Praat Tier Usage | Layers Representation | Notes |
| --- | --- | --- |
| Phoneme tier | annotationLayer(kind="tier", subkind="phonetic") | IPA segments with time boundaries. |
| Word tier | annotationLayer(kind="tier") + pub.layers.expression tokenization | Word-level intervals can also populate a tokenization. |
| Syllable tier | annotationLayer(kind="tier") with custom subkind | Syllable boundaries. |
| ToBI tone tier | annotationLayer(kind="tier", subkind="tobi") or annotationLayer(kind="token-tag", subkind="tobi") | Intonation annotations. |
| Break index tier | annotationLayer(kind="tier") with custom subkind | Prosodic break indices. |
| Pitch/formant tiers | annotationLayer(kind="tier") with acoustic values in features | Acoustic measurements. |

Praat ↔ ELAN Correspondence

Both ELAN and Praat tiers map to the same Layers representation (annotationLayer with kind="tier"), making it straightforward to combine annotations from both tools on the same expression.

Interlinear Glossing in ELAN

ELAN is commonly used for interlinear glossed text in language documentation. The standard tier structure maps to Layers as:

| ELAN IGT Tier | Layers Equivalent | Notes |
| --- | --- | --- |
| Transcription tier (utterance) | annotationLayer(kind="tier") for utterance text + tokenization(kind="whitespace") for words | Time-aligned transcription. |
| Morpheme break tier | tokenization(kind="morphological") | Morpheme-level tokenization (child of word tokenization via pub.layers.alignment). |
| Gloss tier | annotationLayer(kind="token-tag", subkind="gloss") on morphological tokenization | Leipzig-style glosses on morphemes. |
| POS tier | annotationLayer(kind="token-tag", subkind="pos") | Part-of-speech tags. |
| Free translation tier | pub.layers.alignment(kind="parallel-text") or annotation layer | Sentence-level translation. |

The word-to-morpheme correspondence uses pub.layers.alignment(kind="interlinear", subkind="word-to-morpheme").
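
A minimal Python sketch of deriving that correspondence from a morpheme break tier follows. It assumes the break tier marks boundaries with "-" (affixes) and "=" (clitics), as in the Leipzig Glossing Rules. The `links`, `wordIndex`, and `morphemeIndex` field names are hypothetical; only the kind and subkind values come from the text above.

```python
import re


def build_word_to_morpheme(segmented_words: list[str]) -> tuple[list[str], dict]:
    """Split each segmented word into morphemes and record which word
    every morpheme belongs to."""
    morphemes: list[str] = []
    links: list[dict] = []
    for word_index, segmented in enumerate(segmented_words):
        for morph in re.split(r"[-=]", segmented):
            if not morph:
                continue  # ignore stray leading or trailing delimiters
            links.append({"wordIndex": word_index,
                          "morphemeIndex": len(morphemes)})
            morphemes.append(morph)
    alignment = {
        "kind": "interlinear",
        "subkind": "word-to-morpheme",
        "links": links,  # hypothetical field: one entry per morpheme
    }
    return morphemes, alignment


morphemes, alignment = build_word_to_morpheme(["dog-s", "bark-ed"])
```

Glosses from the gloss tier can then be attached by morpheme index, since both tiers follow the same morpheme order in a well-formed IGT annotation.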