PAULA / Salt / ANNIS

Overview

Salt is a graph-based meta-model for linguistic annotation that serves as the common intermediate representation for the Pepper converter framework and the ANNIS corpus search system. PAULA is a standoff XML format whose markables, structures, and relations correspond closely to Salt's node and edge types. Salt models annotations as a directed, labeled graph over shared primary data (text or audio), with explicit support for multiple concurrent annotation layers, timeline-based alignment for spoken data, and hierarchical and relational structures.

Salt Meta-Model Mapping

Core Graph Model

| Salt Concept | Layers Equivalent | Notes |
| --- | --- | --- |
| SDocument | pub.layers.expression | Root document container. |
| SDocumentGraph | All annotation layers + segmentation for an expression | The set of all annotations over a document. |
| SCorpus | pub.layers.corpus | Corpus container. |
| SCorpusGraph | Corpus membership records | Corpus hierarchy. |
| STextualDS (textual data source) | pub.layers.expression.text | Primary text data. The SofA. |
| SMedialDS (media data source) | pub.layers.media | Audio/video primary data. |
| STimeline | Implicit in pub.layers.defs#temporalSpan | Salt's timeline for spoken data alignment. Time points map to millisecond values. |
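
As a rough illustration of the STimeline row, the sketch below converts a pair of timeline points into a millisecond span. The TemporalSpan class, its start/ending field names, and the point_to_ms lookup are assumptions made for this example; they are not taken from the Salt API or the Layers lexicon.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemporalSpan:
    """Hypothetical stand-in for pub.layers.defs#temporalSpan (field names assumed)."""
    start: int   # milliseconds
    ending: int  # milliseconds


def to_temporal_span(start_point: int, end_point: int, point_to_ms: list[int]) -> TemporalSpan:
    """Map a pair of timeline point indices to millisecond offsets."""
    if not 0 <= start_point <= end_point < len(point_to_ms):
        raise ValueError("timeline points out of range or out of order")
    return TemporalSpan(point_to_ms[start_point], point_to_ms[end_point])


# Three points of time on a timeline, each aligned to a media offset in ms.
timeline_ms = [0, 420, 980]
first_token = to_temporal_span(0, 1, timeline_ms)
second_token = to_temporal_span(1, 2, timeline_ms)
```

Keeping the point-to-millisecond table separate from the spans mirrors the "implicit" mapping in the table: the timeline itself disappears, and only the resolved offsets are stored on each anchor.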

Annotation Nodes

| Salt Node Type | Layers Equivalent | Notes |
| --- | --- | --- |
| SToken | pub.layers.expression (kind: token) | Token node with text span. |
| SSpan | pub.layers.annotation#annotation with anchor.tokenRefSequence | Span over tokens (e.g., NP, entity mention). |
| SStructure | pub.layers.annotation#annotation with parentId/childIds | Hierarchical node (constituency tree node, discourse unit). |
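
The token and span rows can be sketched as follows: tokens carry character offsets into the primary text, and a span annotation refers to tokens by index. The Token class, the whitespace tokenizer, and the shape of tokenRefSequence as a plain list of token indices are assumptions for illustration, not the actual record schema.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """Hypothetical token record with character offsets into the primary text."""
    token_index: int
    start: int   # inclusive character offset
    ending: int  # exclusive character offset


def whitespace_tokens(text: str) -> list[Token]:
    """Split on whitespace and record where each token sits in the text."""
    tokens = []
    pos = 0
    for i, word in enumerate(text.split()):
        start = text.index(word, pos)
        pos = start + len(word)
        tokens.append(Token(i, start, pos))
    return tokens


def span_text(text: str, tokens: list[Token], token_ref_sequence: list[int]) -> str:
    """Recover the surface string covered by a span annotation."""
    return " ".join(text[tokens[i].start:tokens[i].ending] for i in token_ref_sequence)


text = "The old man slept"
tokens = whitespace_tokens(text)
# An SSpan-like annotation anchored on the first three tokens.
np_span = {"label": "NP", "anchor": {"tokenRefSequence": [0, 1, 2]}}
covered = span_text(text, tokens, np_span["anchor"]["tokenRefSequence"])
```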

Annotation Edges

| Salt Edge Type | Layers Equivalent | Notes |
| --- | --- | --- |
| STextualRelation | token.textSpan | Token-to-text anchoring. sStart/sEnd → span.start/span.ending. |
| STimelineRelation | Temporal anchoring via annotation.anchor.temporalSpan | Token-to-timeline anchoring for spoken data. |
| SSpanningRelation | annotation.anchor.tokenRefSequence | Span-to-token membership. |
| SDominanceRelation | annotation.parentId/annotation.childIds | Parent-child edges in hierarchical structures (constituency trees). |
| SPointingRelation | pub.layers.graph#graphEdge or annotation.headIndex/argumentRef | Directed edge between nodes (dependency arcs, coreference links, discourse relations). |
| SOrderRelation | Token ordering via tokenIndex | Sequential ordering of tokens. |
| SMedialRelation | annotation.anchor.temporalSpan | Node-to-media timeline anchoring. |
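
To make the SDominanceRelation row concrete, here is a minimal sketch that folds a list of dominance edges into per-node parentId/childIds fields. The (parent, child) tuple input and the dictionary output are assumed shapes for this example, and the sketch assumes a strict tree, rejecting any node with more than one parent.

```python
from collections import defaultdict


def dominance_to_parent_child(edges: list[tuple[str, str]]) -> dict[str, dict]:
    """Turn (parent_id, child_id) dominance edges into parentId/childIds records."""
    parent_of = {}
    children_of = defaultdict(list)
    for parent, child in edges:
        if child in parent_of:
            raise ValueError(f"{child!r} has more than one parent")
        parent_of[child] = parent
        children_of[parent].append(child)
    # Preserve first-seen order of node ids across all edges.
    node_ids = list(dict.fromkeys(n for edge in edges for n in edge))
    return {
        n: {"parentId": parent_of.get(n), "childIds": list(children_of.get(n, []))}
        for n in node_ids
    }


edges = [("S", "NP"), ("S", "VP"), ("NP", "tok0"), ("VP", "tok1")]
records = dominance_to_parent_child(edges)
```

Storing both directions is redundant but keeps upward and downward traversal cheap; an importer that sees multiply-dominated nodes would need a different representation than this one.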

Annotations on Nodes/Edges

| Salt Feature | Layers Equivalent | Notes |
| --- | --- | --- |
| SAnnotation (on node) | annotation.label, annotation.value, or annotation.features | Key-value annotations on nodes. |
| SAnnotation (on edge) | graphEdge.properties or annotation.label (for dependency labels) | Key-value annotations on edges. |
| SMetaAnnotation | pub.layers.defs#annotationMetadata + featureMap | Document and corpus-level metadata. |
| SLayer | pub.layers.annotation#annotationLayer | Named annotation layers grouping nodes and edges. Salt layers map directly to Layers annotation layers. |

Multi-Layer Architecture

Salt explicitly supports multiple annotation layers over the same primary data, which is also the core of the Layers design:

| Salt Pattern | Layers Pattern | Notes |
| --- | --- | --- |
| Multiple SLayer over same STextualDS | Multiple annotationLayer records referencing same expression | Independent annotation layers from different sources. |
| Cross-layer references | argumentRef.layerRef + argumentRef.objectId | Salt allows edges between nodes in different layers; Layers supports this via cross-layer argument references. |
| Layer-specific node types | annotationLayer.kind/subkind | Salt layers can contain different node types; Layers discriminates by kind/subkind. |
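
The cross-layer reference row can be illustrated with a small resolver that follows layerRef and objectId to the target annotation. The in-memory layout of the layers dictionary, the layer identifiers, and the "annotations" and "id" keys are hypothetical; only layerRef, objectId, kind, and argumentRef come from the table above.

```python
def resolve_argument(argument_ref: dict, layers: dict) -> dict:
    """Follow a cross-layer reference to the annotation it points at."""
    layer = layers[argument_ref["layerRef"]]  # KeyError if the layer is unknown
    for annotation in layer["annotations"]:
        if annotation["id"] == argument_ref["objectId"]:
            return annotation
    raise KeyError(
        f"no annotation {argument_ref['objectId']!r} in layer {argument_ref['layerRef']!r}"
    )


# Two independent layers over the same expression; the second points into the first.
layers = {
    "layer:ner": {"kind": "span", "annotations": [{"id": "e1", "label": "PERSON"}]},
    "layer:coref": {
        "kind": "relation",
        "annotations": [
            {"id": "c1", "label": "coref",
             "argumentRef": {"layerRef": "layer:ner", "objectId": "e1"}},
        ],
    },
}
target = resolve_argument(layers["layer:coref"]["annotations"][0]["argumentRef"], layers)
```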

PAULA XML Elements

| PAULA Element | Layers Equivalent | Notes |
| --- | --- | --- |
| <paula> (document) | pub.layers.expression | Document root. |
| <body> (primary data) | expression.text | Primary text. |
| <markList> (token/span markables) | pub.layers.expression tokens + annotation spans | Token and span definitions. |
| <mark> | token or annotation | Individual markable. |
| <structList> | annotationLayer with hierarchical annotations | Structural annotations (trees). |
| <struct> | annotation with parentId/childIds | Structural node. |
| <relList> | annotationLayer with kind="relation" or pub.layers.graph | Relation annotations. |
| <rel> | graphEdge or annotation with headIndex | Individual relation. |
| <featList> | featureMap | Feature annotations. |
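
A minimal sketch of reading token markables from a markList, using only the Python standard library. The sample file is hand-written and simplified for this example. The pointer form string-range(//body,'',start,length) with a 1-based start is an assumption about how token markables address the primary text and should be checked against the PAULA specification; the output dictionary keys are likewise illustrative.

```python
import re
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"
# Assumed pointer form: string-range(//body,'',<1-based start>,<length>)
STRING_RANGE = re.compile(r"string-range\(//body,\s*'',\s*(\d+),\s*(\d+)\)")

PRIMARY_TEXT = "The old man"
TOKEN_FILE = """\
<paula version="1.1">
  <header paula_id="doc.tok"/>
  <markList xmlns:xlink="http://www.w3.org/1999/xlink" type="tok" xml:base="doc.text.xml">
    <mark id="tok_1" xlink:href="#xpointer(string-range(//body,'',1,3))"/>
    <mark id="tok_2" xlink:href="#xpointer(string-range(//body,'',5,3))"/>
  </markList>
</paula>
"""


def read_token_marks(xml_text: str) -> list[dict]:
    """Convert <mark> elements into 0-based, end-exclusive character spans."""
    root = ET.fromstring(xml_text)
    tokens = []
    for mark in root.iter("mark"):
        match = STRING_RANGE.search(mark.get(XLINK_HREF, ""))
        if match is None:
            continue  # not a character-range markable; ignored in this sketch
        first, length = int(match.group(1)), int(match.group(2))
        start = first - 1
        tokens.append({"id": mark.get("id"), "start": start, "ending": start + length})
    return tokens


tokens = read_token_marks(TOKEN_FILE)
surface = [PRIMARY_TEXT[t["start"]:t["ending"]] for t in tokens]
```

Span markables, which point at other markables rather than at character ranges, are skipped here; a fuller importer would resolve those into token reference sequences.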

ANNIS Query Compatibility

ANNIS provides AQL (ANNIS Query Language) for searching across annotation layers. The Layers appview can provide equivalent query capabilities by indexing annotation layers in Elasticsearch and PostgreSQL:

| ANNIS Feature | Layers Appview Equivalent | Notes |
| --- | --- | --- |
| Token search | Elasticsearch full-text + token index | Text and token-level search. |
| Span search | Elasticsearch on annotation layers | Span annotation queries. |
| Tree queries (dominance) | PostgreSQL recursive queries on parentId/childIds | Tree structure traversal. |
| Pointing relation queries | PostgreSQL/Neo4j on graph relations | Relation queries. |
| Cross-layer queries | JOIN across annotation layer tables | Multi-layer query composition. |
| Frequency analysis | Elasticsearch aggregations | Statistical analysis over annotations. |
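
The tree-query row can be sketched with a recursive common table expression. To keep the example self-contained it runs against an in-memory SQLite database as a stand-in for PostgreSQL, which accepts the same WITH RECURSIVE form (with a different parameter placeholder). The annotation table and its columns are hypothetical, not an actual appview schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE annotation (id TEXT PRIMARY KEY, parent_id TEXT, label TEXT)")
conn.executemany(
    "INSERT INTO annotation VALUES (?, ?, ?)",
    [
        ("n1", None, "S"),
        ("n2", "n1", "NP"),
        ("n3", "n1", "VP"),
        ("n4", "n3", "NP"),
    ],
)

# Walk parent_id links downward from one node, tracking depth.
DESCENDANTS = """
WITH RECURSIVE descendants(id, label, depth) AS (
    SELECT id, label, 1 FROM annotation WHERE parent_id = ?
    UNION ALL
    SELECT a.id, a.label, d.depth + 1
    FROM annotation AS a
    JOIN descendants AS d ON a.parent_id = d.id
)
SELECT id, label, depth FROM descendants ORDER BY depth, id
"""


def dominated_by(connection: sqlite3.Connection, node_id: str) -> list[tuple]:
    """Return every node directly or indirectly dominated by node_id."""
    return connection.execute(DESCENDANTS, (node_id,)).fetchall()


rows = dominated_by(conn, "n1")
noun_phrases = [node for node, label, _ in rows if label == "NP"]
```

Filtering the result by label, as in the last line, approximates an indirect-dominance query such as "an S that dominates an NP at any depth"; restricting to depth 1 would give direct dominance instead.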