bead (FACTS.lab)

Overview

bead provides four core capabilities: (1) a unified interface abstraction over syntactic and semantic frame ontology resources (PropBank, FrameNet, VerbNet, etc.), (2) a template system for constructing experimental stimuli with typed slot constraints, (3) a framework for deploying and analyzing large-scale linguistic judgment experiments, and (4) active learning workflows for efficient annotation. The companion glazing package provides unified Pydantic models for FrameNet, PropBank, VerbNet, and WordNet with cross-reference resolution.

Component 1: Unified Frame Ontology Interface

bead's Approach

bead defines abstract interfaces that unify heterogeneous frame ontology resources:

  • Frame: An event/situation template with named role slots (e.g., PropBank's buy.01, FrameNet's Commerce_buy)
  • Role/Argument Slot: A participant slot in a frame (e.g., ARG0/Buyer, ARG1/Goods)
  • Type Constraints: Restrictions on what can fill a role slot
  • Inheritance: Frame hierarchies (FrameNet's frame relations, VerbNet's verb class tree)
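
The four concepts above can be pictured with a small data model. The following is a minimal sketch and not bead's actual class hierarchy: the names `RoleSlot`, `Frame`, and `all_roles`, and the example frames `Transfer` and `Purchase`, are hypothetical and chosen only to show how role slots, type constraints, and inheritance fit together.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class RoleSlot:
    """A participant slot in a frame (hypothetical model, not bead's API)."""
    name: str
    filler_types: tuple[str, ...] = ()  # type constraints on the filler
    required: bool = True


@dataclass
class Frame:
    """An event/situation template with named role slots."""
    frame_id: str
    resource: str                      # e.g. "propbank", "framenet", "verbnet"
    roles: list[RoleSlot] = field(default_factory=list)
    parent: Frame | None = None        # inheritance link

    def all_roles(self) -> list[RoleSlot]:
        """Inherited roles first; a role redefined locally overrides the parent's."""
        inherited = self.parent.all_roles() if self.parent else []
        own_names = {r.name for r in self.roles}
        return [r for r in inherited if r.name not in own_names] + self.roles


# Illustrative frames only; these are not taken from any real resource.
transfer = Frame("Transfer", "example", [RoleSlot("Agent"), RoleSlot("Theme")])
purchase = Frame(
    "Purchase", "example", [RoleSlot("Theme", ("Entity",))], parent=transfer
)
print([r.name for r in purchase.all_roles()])  # ['Agent', 'Theme']
```

Because the parent link is recursive, the same lookup works for hierarchies of any depth.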

Layers Mapping

bead Concept | Layers Equivalent | Notes
Frame ontology resource | pub.layers.ontology (record) | A named, versioned ontology with domain, personaRef, and knowledgeRefs. The domainUri field allows community-defined domains.
Frame definition | pub.layers.ontology#typeDef with typeKind="FRAME_TYPE" | bead's unified frame interface maps to Layers's typeDef with typeKind set to FRAME_TYPE. The allowedRoles array contains roleSlot references.
Event type | pub.layers.ontology#typeDef with typeKind="EVENT_TYPE" | bead treats events as a subtype of frames. Layers's typeKindUri allows community extension.
Role/argument slot | pub.layers.ontology#roleSlot | roleName → role label (ARG0, Agent, Theme); fillerTypeRefs → type constraints; required → obligatoriness; constraints → declarative selectional restrictions via pub.layers.defs#constraint.
Entity type | pub.layers.ontology#typeDef with typeKind="ENTITY_TYPE" | Filler type definitions for role constraints.
Type hierarchy | typeDef.parentTypeRef | Recursive parent reference enables arbitrary inheritance depth.
Knowledge base link | typeDef.knowledgeRefs[] | Links to FrameNet, PropBank, VerbNet, Wikidata via pub.layers.defs#knowledgeRef. The sourceUri field allows community-defined knowledge bases.

Frame Annotation (Instance Level)

bead Concept | Layers Equivalent | Notes
Frame instance (SRL) | pub.layers.annotation#annotationLayer with kind="span", subkind="frame" | Frame instances annotated in text. annotation.label holds the frame name; annotation.ontologyTypeRef points to the typeDef.
Role filler | pub.layers.annotation#argumentRef | argumentRef.role is the role label; argumentRef.annotationId points to the filler span annotation.
PropBank-style SRL | formalism="PropBank" on the annotationLayer | annotation.label = roleset ID (e.g., buy.01); arguments use ARG0, ARG1, etc.
FrameNet-style SRL | formalism="FrameNet" on the annotationLayer | annotation.label = frame name; arguments use frame element names.
VerbNet-style | formalism="VerbNet" on the annotationLayer | annotation.label = verb class; arguments use thematic role names.
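
A frame instance and its role fillers could be represented as below. This is a minimal sketch with plain Python dicts. The names kind, subkind, formalism, label, ontologyTypeRef, role, and annotationId come from the table above; the `id`, `start`, `end`, `annotations`, and `arguments` keys, the placeholder AT-URI, and the helper `describe_frames` are assumptions made for illustration and may not match the real schema.

```python
text = "Mary bought a book"

layer = {
    "kind": "span",
    "subkind": "frame",
    "formalism": "PropBank",
    "annotations": [
        {"id": "a1", "start": 0, "end": 4},     # "Mary"
        {"id": "a2", "start": 12, "end": 18},   # "a book"
        {
            "id": "a3", "start": 5, "end": 11,  # "bought"
            "label": "buy.01",
            # Placeholder AT-URI; the real record path is not given in the text.
            "ontologyTypeRef": "at://did:example:lab/typeDef/buy-01",
            "arguments": [
                {"role": "ARG0", "annotationId": "a1"},
                {"role": "ARG1", "annotationId": "a2"},
            ],
        },
    ],
}


def describe_frames(layer, text):
    """Render each frame instance as label(ROLE='filler text', ...)."""
    by_id = {ann["id"]: ann for ann in layer["annotations"]}
    described = []
    for ann in layer["annotations"]:
        if "label" not in ann:
            continue  # plain filler span, not a frame instance
        parts = []
        for arg in ann.get("arguments", []):
            filler = by_id[arg["annotationId"]]
            filler_text = text[filler["start"]:filler["end"]]
            parts.append(f"{arg['role']}={filler_text!r}")
        described.append(f"{ann['label']}({', '.join(parts)})")
    return described


print(describe_frames(layer, text))  # ["buy.01(ARG0='Mary', ARG1='a book')"]
```

Switching to FrameNet-style annotation would change only formalism, label, and the role names; the structure stays the same.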

Unified Access Pattern

bead's key contribution is a single interface that abstracts over PropBank, FrameNet, and VerbNet. In Layers, this unification is achieved by:

  1. All frame ontologies use the same pub.layers.ontology record structure (typeDef + roleSlot)
  2. The formalism field on annotationLayer identifies which tradition the annotation follows
  3. The ontologyRef field links to the specific ontology definition
  4. knowledgeRefs on typeDef cross-link equivalent frames across resources (e.g., PropBank's buy.01 linked to FrameNet's Commerce_buy via Wikidata or direct URI)
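
Point 4 can be sketched as a lookup over shared knowledge references. This is an illustration under stated assumptions: the records are simplified dicts, the `name` and `identifier` keys are hypothetical, and `equivalents` is not part of bead, glazing, or Layers. Only typeKind and knowledgeRefs are names taken from the text.

```python
# Simplified typeDef-like records (hypothetical shape).
type_defs = [
    {
        "name": "buy.01",
        "typeKind": "FRAME_TYPE",
        "knowledgeRefs": [
            {"source": "propbank", "identifier": "buy.01"},
            {"source": "framenet", "identifier": "Commerce_buy"},
        ],
    },
    {
        "name": "Commerce_buy",
        "typeKind": "FRAME_TYPE",
        "knowledgeRefs": [{"source": "framenet", "identifier": "Commerce_buy"}],
    },
    {
        "name": "sell.01",
        "typeKind": "FRAME_TYPE",
        "knowledgeRefs": [{"source": "propbank", "identifier": "sell.01"}],
    },
]


def ref_keys(type_def):
    """The set of (source, identifier) pairs a typeDef links to."""
    return {(r["source"], r["identifier"]) for r in type_def["knowledgeRefs"]}


def equivalents(name, type_defs):
    """Names of other typeDefs sharing at least one knowledge reference."""
    by_name = {t["name"]: t for t in type_defs}
    target = ref_keys(by_name[name])
    return [
        t["name"] for t in type_defs
        if t["name"] != name and ref_keys(t) & target
    ]


print(equivalents("buy.01", type_defs))   # ['Commerce_buy']
print(equivalents("sell.01", type_defs))  # []
```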

glazing (Unified Frame Inventory)

The glazing companion package provides unified Pydantic models for FrameNet 1.7, PropBank 3.4, VerbNet 3.4, and WordNet 3.1 with cross-reference resolution.

glazing Concept | Layers Equivalent | Notes
FrameNet Frame | pub.layers.ontology#typeDef with typeKind="FRAME_TYPE" + knowledgeRefs source "framenet" | Frame definition with roles, frame elements, relations.
FrameNet LexicalUnit | pub.layers.resource#entry with knowledgeRefs source "framenet" | Lexical unit within a frame.
PropBank Roleset | pub.layers.ontology#typeDef with typeKind="FRAME_TYPE" + knowledgeRefs source "propbank" | PropBank predicate-argument structure.
VerbNet VerbClass | pub.layers.ontology#typeDef with hierarchy via parentTypeRef + knowledgeRefs source "verbnet" | VerbNet class with inheritance.
WordNet Synset | pub.layers.resource#entry with knowledgeRefs source "wordnet" | WordNet sense/synset reference.
Frame inventory (e.g., FrameNet 1.7) | pub.layers.resource#collection with kind="frame-inventory" and version="1.7" | A versioned collection of frame definitions.
Cross-resource mapping | knowledgeRefs arrays + pub.layers.graph#graphEdge | Multiple knowledgeRefs on a single typeDef cross-link equivalent items across resources.

Component 2: Template and Stimulus Construction

bead's Approach

bead provides a rich system for constructing experimental stimuli:

  • Template: A text pattern with named {variable} slots, per-slot constraints, cross-slot constraints, and a language tag
  • Slot: A named variable position with a constraint expression (DSL), required flag, default value, and reference to a lexicon of allowed fillers
  • Constraint: A DSL expression that can operate at slot level (self.pos == "VERB"), template level (subject.features.number == verb.features.number), or cross-template level
  • LexicalItem: A vocabulary entry with lemma, form, language, and features (POS, morphology, frequency, etc.)
  • Lexicon: A named collection of LexicalItems
  • FilledTemplate: A template with all slots mapped to specific fillers, plus the rendered text and filling strategy
  • Filling strategies: Exhaustive, Random, Stratified, MLM (masked language model), CSP (constraint satisfaction), Mixed
  • Item: A constructed stimulus with labeled spans, span relations, and optionally model outputs
  • ItemTemplate: An experimental item structure with judgment type, task type, and element definitions
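
To make the interplay of templates, slots, constraints, and the Exhaustive filling strategy concrete, here is a minimal sketch. It does not use bead's constraint DSL or API: constraints are written as ordinary Python callables, and the names `fill_exhaustive`, `slot_constraints`, and `number_agreement`, as well as the toy lexicon, are assumptions made for illustration.

```python
from itertools import product

# Toy lexicon: each item has a lemma, a surface form, and features.
lexicon = [
    {"lemma": "dog", "form": "dogs", "pos": "NOUN", "number": "plural"},
    {"lemma": "dog", "form": "dog", "pos": "NOUN", "number": "singular"},
    {"lemma": "bark", "form": "barks", "pos": "VERB", "number": "singular"},
    {"lemma": "bark", "form": "bark", "pos": "VERB", "number": "plural"},
]

template = "The {subject} {verb}."

# Slot-level constraints, analogous to self.pos == "VERB" in the DSL.
slot_constraints = {
    "subject": lambda item: item["pos"] == "NOUN",
    "verb": lambda item: item["pos"] == "VERB",
}


def number_agreement(filling):
    """Template-level constraint across two slots."""
    return filling["subject"]["number"] == filling["verb"]["number"]


def fill_exhaustive(template, slot_constraints, template_constraints, lexicon):
    """Enumerate every slot assignment that passes all constraints."""
    names = list(slot_constraints)
    candidates = [
        [item for item in lexicon if slot_constraints[name](item)]
        for name in names
    ]
    rendered = []
    for combo in product(*candidates):
        filling = dict(zip(names, combo))
        if all(check(filling) for check in template_constraints):
            forms = {name: item["form"] for name, item in filling.items()}
            rendered.append(template.format(**forms))
    return rendered


print(fill_exhaustive(template, slot_constraints, [number_agreement], lexicon))
# ['The dogs bark.', 'The dog barks.']
```

Exhaustive enumeration grows with the product of the candidate sets, which is one reason a system like this also offers sampling-based and constraint-solving strategies.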

Layers Mapping

bead Concept | Layers Equivalent | Notes
Template | pub.layers.resource#template | Text with {slotName} placeholders. slots[] contains slot definitions. constraints[] holds cross-slot constraints. experimentRef links to the experiment.
Slot | pub.layers.resource#slot | name → slot name; required → obligatoriness; defaultValue → default filler; collectionRef → lexicon of allowed fillers; ontologyTypeRef → type constraint; constraints[] → slot-level constraint expressions.
Constraint (slot-level) | pub.layers.defs#constraint with scope="slot" | expression holds the DSL string (e.g., self.pos == "VERB"); expressionFormat identifies the DSL (e.g., "python-expr").
Constraint (template-level) | pub.layers.defs#constraint with scope="template" | expression holds cross-slot constraints (e.g., subject.features.number == verb.features.number); context lists the slot names involved.
Constraint (cross-template) | pub.layers.defs#constraint with scope="cross-template" | For constraints spanning multiple templates in a multi-template experiment design.
LexicalItem | pub.layers.resource#entry | lemma → citation form; form → surface form; language → BCP-47 tag; features → POS, morphology, frequency, register, etc.; ontologyTypeRef → type classification.
Lexicon | pub.layers.resource#collection with kind="lexicon" | Named collection of entries. language, version, ontologyRef provide metadata.
Lexicon membership | pub.layers.resource#collectionMembership | Links an entry to a collection with optional ordering. Many-to-many: an entry can belong to multiple lexicons.
FilledTemplate | pub.layers.resource#filling | templateRef → the template; slotFillings[] → slot-to-filler mappings; renderedText → the result; strategy → filling strategy; expressionRef → materialized text for annotation.
SlotFilling | pub.layers.resource#slotFilling | slotName → which slot; entryRef → AT-URI of the entry filling this slot; literalValue → literal string override; renderedForm → surface form after inflection.
Filling strategy | filling.strategy / filling.strategyUri | URI+slug pattern. knownValues: exhaustive, random, stratified, mlm, csp, mixed, manual, custom. Strategy parameters go in features.
Item (constructed stimulus) | pub.layers.expression + pub.layers.resource#filling | The filling's renderedText becomes an expression's text. The expressionRef on the filling links them. Annotations on the expression represent labeled spans.
Item spans | pub.layers.annotation#annotationLayer on the expression | Labeled spans on the materialized stimulus text.
Item span relations | pub.layers.annotation#argumentRef or pub.layers.graph#graphEdge | Directed typed relations between spans, using the standard annotation argument mechanism or graph edges.
ItemTemplate | pub.layers.resource#template + pub.layers.judgment#experimentDef | The structural template (text, slots) lives in resource#template; experiment-level metadata (judgmentType, taskType, guidelines, labels, scale) lives in experimentDef. The experimentRef on template links them. Item-specific metadata (judgmentType, category) can go in template.features.
Model outputs on items | pub.layers.annotation#annotationLayer with metadata.tool = model name | Model predictions stored as annotation layers on the materialized expression, with provenance tracking.

Template-to-Experiment Pipeline

bead's stimulus construction pipeline maps to Layers as a chain of AT-URI references:

  1. Define lexicons: Create resource#collection records with kind="lexicon", populate via resource#entry + resource#collectionMembership
  2. Define templates: Create resource#template records with {slotName} text, slot definitions referencing collections, and constraints
  3. Generate fillings: Create resource#filling records with slot-to-filler mappings, rendered text, and filling strategy
  4. Materialize items: Create expression records from rendered text; set filling.expressionRef to link back
  5. Annotate items: Create annotation#annotationLayer records on the expressions (labeled spans, span relations)
  6. Run experiment: Create judgment#experimentDef with templateRefs pointing to the templates and collectionRefs to the filler pools
  7. Collect judgments: Annotators create judgment#judgmentSet records; each judgment has itemRef → expression and fillingRef → filling
  8. Analyze: judgment#agreementReport summarizes inter-annotator agreement

Every step is a separate ATProto record linked by AT-URI, enabling full provenance tracing from judgment back through filling to template to lexicon to entry.
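
The provenance chain can be followed by dereferencing one AT-URI at a time. The sketch below stands in for a record store with an in-memory dict. The field names fillingRef, templateRef, slotFillings, entryRef, slots, and collectionRef come from the mapping above; the URIs are shortened placeholders rather than real collection NSIDs, and `trace_provenance` is a hypothetical helper.

```python
LAB = "at://did:example:lab"  # placeholder repository authority

# In-memory stand-in for records fetched by AT-URI.
store = {
    f"{LAB}/entry/dogs": {"lemma": "dog", "form": "dogs"},
    f"{LAB}/collection/nouns": {"kind": "lexicon", "name": "nouns"},
    f"{LAB}/template/t1": {
        "text": "The {subject} bark.",
        "slots": [
            {"name": "subject", "collectionRef": f"{LAB}/collection/nouns"}
        ],
    },
    f"{LAB}/filling/f1": {
        "templateRef": f"{LAB}/template/t1",
        "slotFillings": [
            {"slotName": "subject", "entryRef": f"{LAB}/entry/dogs"}
        ],
        "renderedText": "The dogs bark.",
        "strategy": "exhaustive",
    },
}

judgment = {"fillingRef": f"{LAB}/filling/f1", "scalarValue": 6}


def trace_provenance(judgment, store):
    """Walk judgment -> filling -> template -> lexicons, plus filler entries."""
    filling_uri = judgment["fillingRef"]
    filling = store[filling_uri]
    template_uri = filling["templateRef"]
    template = store[template_uri]
    return {
        "filling": filling_uri,
        "template": template_uri,
        "lexicons": [
            s["collectionRef"] for s in template["slots"] if "collectionRef" in s
        ],
        "entries": [
            sf["entryRef"] for sf in filling["slotFillings"] if "entryRef" in sf
        ],
    }


chain = trace_provenance(judgment, store)
print(chain["template"])  # at://did:example:lab/template/t1
```

In a real deployment each lookup would be a network fetch against the repository that owns the record, but the traversal logic would be the same.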

Component 3: Linguistic Judgment Experiment Framework

bead's Approach

bead provides a structured framework for:

  • Defining annotation/judgment tasks with various response types
  • Collecting judgments from multiple annotators
  • Computing inter-annotator agreement
  • Supporting active learning (selecting maximally informative items for annotation)
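
As an example of the agreement computation, the sketch below implements Cohen's kappa for two annotators labeling the same items, using only the standard library. It illustrates the metric itself and is not bead's implementation; the function name `cohens_kappa` is an assumption.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items in order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label distribution.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(
        counts_a[label] * counts_b[label]
        for label in counts_a.keys() | counts_b.keys()
    ) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


a = ["yes", "yes", "no", "no"]
b = ["yes", "no", "no", "no"]
print(cohens_kappa(a, b))  # 0.5
```

Here the annotators agree on 3 of 4 items (0.75) while chance agreement is 0.5, which gives (0.75 - 0.5) / (1 - 0.5) = 0.5.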

Layers Mapping

bead Concept | Layers Equivalent | Notes
Experiment definition | pub.layers.judgment#experimentDef | Direct mapping. taskType covers all bead task types: CATEGORICAL, ORDINAL, SCALAR, RANKING, SPAN_SELECTION, FREETEXT, PAIRWISE_COMPARISON, BEST_WORST_SCALING, ACCEPTABILITY. The taskTypeUri field allows community-defined task types. templateRefs links to stimulus templates; collectionRefs links to filler pools.
Annotation guidelines | experimentDef.guidelines | Free-text guidelines (up to 100K chars). ontologyRef links to the formal type system.
Scale definition | experimentDef.scaleMin / experimentDef.scaleMax | For scalar and ordinal tasks.
Label set | experimentDef.labels[] | For categorical tasks.
Annotator | pub.layers.judgment#judgmentSet.annotatorDid / annotatorId | ATProto DID for decentralized identity; fallback string ID for anonymous annotators.
Single judgment | pub.layers.judgment#judgment | Supports all bead response types: categoricalValue, scalarValue, rankValue, textSpan, freeText. Adds responseTimeMs, confidence, and fillingRef.
Judgment batch | pub.layers.judgment#judgmentSet | Groups judgments from a single annotator for an experiment.
Agreement metric | pub.layers.judgment#agreementReport | metric covers standard measures: Cohen's kappa, Fleiss' kappa, Krippendorff's alpha, percent agreement, correlation, F1. The metricUri field allows community-defined metrics.
Active learning signal | judgment.responseTimeMs + judgment.confidence | Response time and confidence scores enable active learning item selection. Additional signals can be stored in features.
Behavioral analytics | pub.layers.judgment#judgment.behavioralData | Mouse movements, keystroke patterns, eye tracking data, and other behavioral signals stored as a featureMap.
List constraints | pub.layers.judgment#experimentDesign.listConstraints | Latin square balancing, no-adjacent-same-condition, balanced frequency, and minimum distance constraints. Each listConstraint has kind, targetProperty, parameters, and an optional formal constraint expression.
Distribution strategy | pub.layers.judgment#experimentDesign.distributionStrategy | How items are distributed to annotators: latin-square, random, blocked, stratified, or community-defined via distributionStrategyUri.
Presentation mode | pub.layers.judgment#experimentDesign.presentationMode | How items are ordered within a list: random-order, fixed-order, blocked, adaptive, or community-defined via presentationModeUri.
Template sequence | pub.layers.resource#templateComposition with compositionType="sequence" | bead's TemplateSequence (ordered list of templates for multi-part stimuli) maps to a template composition with ordered templateMember entries.
Template tree | pub.layers.resource#templateComposition with compositionType="tree" | bead's TemplateTree (hierarchical template structures) maps to a template composition where members can reference nested compositions via compositionRef.
Multi-word expression | pub.layers.resource#entry.components + entry.mweKind | MWE entries have components (array of mweComponent with form, lemma, position, isHead) and mweKind (compound, phrasal-verb, idiom, etc.).
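
The latin-square distribution strategy in the table can be illustrated with the standard rotation scheme: list number l presents item i in condition (i + l) mod k. This is a generic sketch of the technique, assuming items are identified by index; `latin_square_lists` is a hypothetical helper and says nothing about how bead or Layers implement list construction.

```python
def latin_square_lists(n_items, conditions):
    """Build one presentation list per condition using rotation.

    Every list contains each item exactly once, and across the lists
    every item appears in every condition exactly once.
    """
    k = len(conditions)
    return [
        [(item, conditions[(item + lst) % k]) for item in range(n_items)]
        for lst in range(k)
    ]


lists = latin_square_lists(4, ["grammatical", "ungrammatical"])
for presentation_list in lists:
    print(presentation_list)
# [(0, 'grammatical'), (1, 'ungrammatical'), (2, 'grammatical'), (3, 'ungrammatical')]
# [(0, 'ungrammatical'), (1, 'grammatical'), (2, 'ungrammatical'), (3, 'grammatical')]
```

Conditions are balanced within each list when the number of items is a multiple of the number of conditions.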

Experiment Lifecycle

bead's experiment lifecycle maps to Layers records as follows:

  1. Design: Create experimentDef with task type, guidelines, labels/scale, ontology reference, templateRefs, and collectionRefs
  2. Generate: Use templates and collections to produce fillings; materialize the rendered text as expression records
  3. Deploy: The corpusRef on experimentDef identifies the item pool; personaRef identifies the annotator role
  4. Collect: Each annotator creates a judgmentSet record in their own PDS containing their judgments
  5. Analyze: An agreementReport record summarizes inter-annotator agreement across judgment sets
  6. Iterate: Active learning workflows use response times and confidence scores to select new items; new experiment rounds create new experimentDef records
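
One simple way the Iterate step could use these signals is uncertainty-based selection: rank items by how unsure annotators were and send the top ones to the next round. The sketch below is a heuristic chosen for illustration, not bead's selection algorithm. It assumes confidence lies in [0, 1] and uses the judgment fields itemRef, confidence, and responseTimeMs named in the mapping; `select_uncertain` is a hypothetical helper.

```python
from statistics import mean


def select_uncertain(judgments, k):
    """Return the k items with the lowest mean confidence.

    Ties are broken by longer mean response time, on the assumption
    that slower responses indicate harder items.
    """
    by_item = {}
    for judgment in judgments:
        by_item.setdefault(judgment["itemRef"], []).append(judgment)

    def uncertainty(item):
        group = by_item[item]
        return (
            mean(1 - j["confidence"] for j in group),
            mean(j["responseTimeMs"] for j in group),
        )

    return sorted(by_item, key=uncertainty, reverse=True)[:k]


judgments = [
    {"itemRef": "x", "confidence": 0.9, "responseTimeMs": 800},
    {"itemRef": "x", "confidence": 0.8, "responseTimeMs": 900},
    {"itemRef": "y", "confidence": 0.4, "responseTimeMs": 2500},
    {"itemRef": "z", "confidence": 0.6, "responseTimeMs": 1200},
]
print(select_uncertain(judgments, 2))  # ['y', 'z']
```

The selected items would then feed a new experimentDef for the next round, as described in step 6.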

Decentralized Advantage

In bead, experiments are typically run on centralized platforms. In Layers:

  • Annotators own their judgment data in their own PDSes
  • Templates, lexicons, and fillings are independently publishable and reusable
  • Multiple independent experimenters can create overlapping experiments on the same corpus
  • Agreement reports can be computed by any party with access to the judgment sets
  • The full provenance chain (judgment → filling → template → lexicon → entry) is traceable via AT-URIs
  • Experiments are reproducible because all data is versioned and content-addressed via ATProto