
Video Annotation

Define spatial environments and regions by annotating video frames directly using an interactive napari interface.

Overview

The video annotation workflow allows you to:

  1. Draw environment boundaries from video frames (e.g., arena walls)
  2. Define holes/obstacles within the environment (excluded areas)
  3. Create named regions for behavioral analysis (e.g., reward zones, nest areas)
  4. Apply calibration to convert pixel coordinates to real-world units (cm)

This is particularly useful when you have tracking data and want to define the spatial structure from the same video used for tracking.

Quick Start

from neurospatial.annotation import annotate_video

# Launch interactive annotation
result = annotate_video("experiment.mp4", bin_size=2.0)

# Access results
env = result.environment  # Discretized Environment
regions = result.regions   # Named Regions

Complete Workflow

Step 1: Launch Annotation

from neurospatial.annotation import annotate_video

# Basic annotation (pixel coordinates)
result = annotate_video(
    "experiment.mp4",
    frame_index=0,        # Which frame to display
    bin_size=2.0,         # Grid resolution for environment
    mode="both",          # Annotate boundary + regions
)

Step 2: Draw Annotations

When the napari viewer opens:

  1. Draw the environment boundary (cyan polygon):
     • Click points to define vertices
     • Press Enter to complete the polygon
     • This defines the spatial extent of your environment

  2. Press M to cycle to hole mode (red polygon):
     • Draw any obstacles or excluded areas inside the boundary
     • Holes are subtracted from the environment

  3. Press M again for region mode (yellow polygon):
     • Enter a name for each region before drawing
     • Draw polygons for reward zones, nest areas, etc.

  4. Press Escape or click "Save and Close" to return results

Step 3: Use the Results

# The environment is ready to use
env = result.environment
print(f"Environment has {env.n_bins} bins")

# Access regions
for name, region in result.regions.items():
    print(f"Region '{name}': {region.kind}")

# Use with tracking data
bin_indices = env.bin_sequence(positions)
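To test tracked positions against an annotated zone, Shapely's point-in-polygon test works directly on the polygon geometry. A standalone sketch (the polygon here is hypothetical, standing in for a region drawn during annotation):

```python
import numpy as np
from shapely.geometry import Point, Polygon

# Hypothetical reward zone in cm (would normally come from annotation)
reward_zone = Polygon([(10, 10), (20, 10), (20, 20), (10, 20)])

# Tracked positions as an (n, 2) array of (x, y) in the same units
positions = np.array([[12.0, 15.0], [30.0, 5.0], [19.0, 11.0]])

# Boolean mask: which samples fall inside the zone
inside = np.array([reward_zone.contains(Point(x, y)) for x, y in positions])
print(inside)  # → [ True False  True]
```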

Keyboard Shortcuts

| Key | Action |
| --- | --- |
| M | Cycle annotation mode (environment → hole → region) |
| 3 | Move shape mode |
| 4 | Edit vertices mode |
| Delete | Remove selected shape |
| Escape | Save and close viewer |
| Ctrl+Z | Undo last action |

Adding Calibration

Convert pixel coordinates to real-world units (e.g., centimeters):

Using Scale Bar

from neurospatial.transforms import VideoCalibration, calibrate_from_scale_bar

# Two points on a known length in the video
point1_px = (100, 200)  # Start of scale bar (pixels)
point2_px = (300, 200)  # End of scale bar (pixels)
known_length_cm = 50.0  # Known length in cm
frame_size = (640, 480) # Video dimensions

transform = calibrate_from_scale_bar(
    point1_px, point2_px, known_length_cm, frame_size
)
calibration = VideoCalibration(transform, frame_size)

# Annotate with calibration
result = annotate_video(
    "experiment.mp4",
    calibration=calibration,
    bin_size=2.0,  # Now in cm!
)
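Conceptually, scale-bar calibration reduces to a single cm-per-pixel factor. A minimal sketch of that arithmetic (not the library implementation, which also builds the full transform):

```python
import numpy as np

point1_px = np.array([100.0, 200.0])  # start of scale bar (pixels)
point2_px = np.array([300.0, 200.0])  # end of scale bar (pixels)
known_length_cm = 50.0

# Pixel distance between the two endpoints of the scale bar
pixel_length = np.linalg.norm(point2_px - point1_px)  # 200 px

# Conversion factor: how many cm one pixel spans
cm_per_px = known_length_cm / pixel_length
print(cm_per_px)  # → 0.25
```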

Using Landmark Correspondences

from neurospatial.transforms import calibrate_from_landmarks, VideoCalibration
import numpy as np

# Known correspondences: video pixels → environment cm
landmarks_px = np.array([
    [50, 50],    # Top-left corner in pixels
    [590, 50],   # Top-right corner in pixels
    [590, 430],  # Bottom-right corner in pixels
    [50, 430],   # Bottom-left corner in pixels
])
landmarks_cm = np.array([
    [0, 0],      # Top-left in cm
    [100, 0],    # Top-right in cm
    [100, 80],   # Bottom-right in cm
    [0, 80],     # Bottom-left in cm
])

transform = calibrate_from_landmarks(
    landmarks_px, landmarks_cm, frame_size_px=(640, 480)
)
calibration = VideoCalibration(transform, (640, 480))

result = annotate_video(
    "experiment.mp4",
    calibration=calibration,
    bin_size=2.0,
)
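Before calibrating, it can be worth checking that the landmark correspondences are mutually consistent. A pure-NumPy sanity check, independent of the library: fit an affine map by least squares and inspect the residuals.

```python
import numpy as np

landmarks_px = np.array([[50, 50], [590, 50], [590, 430], [50, 430]], dtype=float)
landmarks_cm = np.array([[0, 0], [100, 0], [100, 80], [0, 80]], dtype=float)

# Augment pixel coordinates with a constant column: [x_px, y_px, 1]
A = np.hstack([landmarks_px, np.ones((len(landmarks_px), 1))])

# Least-squares affine fit: A @ M ≈ landmarks_cm, with M of shape (3, 2)
M, *_ = np.linalg.lstsq(A, landmarks_cm, rcond=None)

# Per-landmark residuals in cm; a large value flags a mislabeled landmark
residuals = np.linalg.norm(A @ M - landmarks_cm, axis=1)
print(residuals.max())  # ~0 here: these four points are exactly affine-consistent
```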

Simplifying Hand-Drawn Polygons

Hand-drawn polygons often have jagged edges. Use simplify_tolerance to smooth them:

result = annotate_video(
    "experiment.mp4",
    bin_size=2.0,
    simplify_tolerance=1.0,  # Douglas-Peucker tolerance in output units
)

Higher values produce smoother polygons but may lose detail.
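You can preview what a given tolerance does with Shapely's own simplify method (Douglas-Peucker based), using a toy jagged polygon:

```python
from shapely.geometry import Polygon

# Jagged hand-drawn square: extra vertices deviating slightly from the edges
jagged = Polygon(
    [(0, 0), (5, 0.3), (10, 0), (10.2, 5), (10, 10), (5, 9.8), (0, 10), (0, 5)]
)

# Remove vertices deviating less than 1.0 unit from the simplified outline
smooth = jagged.simplify(tolerance=1.0)
print(len(jagged.exterior.coords), "->", len(smooth.exterior.coords))
```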

Annotation Modes

Control what to annotate with the mode parameter:

# Both boundary and regions (default)
result = annotate_video("video.mp4", mode="both", bin_size=2.0)

# Only environment boundary
result = annotate_video("video.mp4", mode="environment", bin_size=2.0)

# Only regions (no environment created)
result = annotate_video("video.mp4", mode="regions")
# Note: bin_size not required for regions-only mode

Handling Multiple Boundaries

If you accidentally draw multiple environment boundaries, control the behavior:

# Use the last drawn boundary (default)
result = annotate_video(
    "video.mp4",
    bin_size=2.0,
    multiple_boundaries="last",
)

# Use the first drawn boundary
result = annotate_video(
    "video.mp4",
    bin_size=2.0,
    multiple_boundaries="first",
)

# Raise error if multiple boundaries (strict mode)
result = annotate_video(
    "video.mp4",
    bin_size=2.0,
    multiple_boundaries="error",
)

Editing Existing Regions

Provide existing regions to edit them:

from neurospatial.regions import Regions, Region
from shapely.geometry import Polygon

# Create initial regions
initial = Regions([
    Region("reward_zone", "polygon", Polygon([(10, 10), (20, 10), (20, 20), (10, 20)])),
])

# Edit them interactively
result = annotate_video(
    "video.mp4",
    initial_regions=initial,
    bin_size=2.0,
)

Importing from External Tools

Import annotations created in other tools:

LabelMe

from neurospatial.annotation import regions_from_labelme

# Without calibration (pixel coordinates)
regions = regions_from_labelme("annotations.json")

# With calibration (cm coordinates)
regions = regions_from_labelme("annotations.json", calibration=calibration)

CVAT

from neurospatial.annotation import regions_from_cvat

regions = regions_from_cvat("cvat_export.xml", calibration=calibration)

Role Types

Shapes are assigned roles that determine their purpose:

| Role | Color | Purpose |
| --- | --- | --- |
| environment | Cyan | Primary boundary defining spatial extent |
| hole | Red | Excluded areas within the boundary |
| region | Yellow | Named regions of interest |

Access the role type in your code:

from neurospatial.annotation import Role

# Type hint for role parameters
def process_annotation(role: Role) -> None:
    if role == "environment":
        # Handle boundary
        ...

API Reference

annotate_video

Launch interactive napari annotation on a video frame.

Opens a napari viewer with the specified video frame. Users can draw polygons to define an environment boundary and/or named regions. After closing the viewer, annotations are converted to Regions and optionally an Environment.

Parameters:

  • video_path (str or Path, required): Path to video file (any format supported by OpenCV).
  • config (AnnotationConfig, optional): Configuration for annotation UI settings. Groups frame_index, simplify_tolerance, multiple_boundaries, and show_positions. Individual parameters override config values if both are provided.
  • initial_regions (Regions, optional): Pre-existing regions to display for editing.
  • calibration (VideoCalibration, optional): Pixel-to-cm transform. If provided, output coordinates are in cm; if None, coordinates remain in pixels.
  • mode ({"environment", "regions", "both"}, default "both"): What to annotate. "environment": only the environment boundary; "regions": only named regions; "both": boundary and regions.
  • bin_size (float, optional): Bin size for environment discretization. Required if mode is "environment" or "both".
  • initial_boundary (Polygon or NDArray, optional): Pre-drawn boundary for editing. A Shapely Polygon is used directly as the boundary; an NDArray of shape (n, 2) is treated as position data to infer a boundary from. If None, the user draws the boundary manually.
  • boundary_config (BoundaryConfig, optional): Configuration for boundary inference when initial_boundary is an array. If None, uses BoundaryConfig defaults (convex_hull, 2% buffer, 1% simplify).
  • frame_index (int, optional): Which frame to display for annotation. Overrides config.frame_index. Default is 0 (first frame).
  • simplify_tolerance (float, optional): Tolerance for polygon simplification with the Douglas-Peucker algorithm; vertices that deviate less than this distance from the simplified line are removed. Overrides config.simplify_tolerance. Units depend on calibration: environment units (typically cm) with calibration, pixels without. Recommended values: 1.0-2.0 for cm (removes hand-drawn jitter), 2.0-5.0 for pixels.
  • multiple_boundaries ({"last", "first", "error"}, default "last"): How to handle multiple environment boundaries. Overrides config.multiple_boundaries. "last": use the last drawn boundary (a warning is emitted); "first": use the first drawn boundary (a warning is emitted); "error": raise ValueError if multiple boundaries are drawn.
  • show_positions (bool, optional): If True and initial_boundary is an array, show positions as a Points layer for reference while editing. Overrides config.show_positions. Default is False.

Returns:

  • AnnotationResult: Named tuple containing environment (Environment or None) and regions (Regions collection).

Raises:

  • ValueError: If bin_size is not provided when mode requires environment creation, or if multiple_boundaries="error" and multiple environment boundaries are drawn.
  • ImportError: If napari is not installed.

Examples:

>>> from neurospatial.annotation import annotate_video
>>> # Simple annotation (pixel coordinates)
>>> result = annotate_video("experiment.mp4", bin_size=10.0)
>>> print(result.environment)  # Environment from boundary
>>> print(result.regions)  # Named regions
>>> # With calibration (cm coordinates)
>>> from neurospatial.transforms import VideoCalibration, calibrate_from_scale_bar
>>> transform = calibrate_from_scale_bar((0, 0), (200, 0), 100.0, (640, 480))
>>> calib = VideoCalibration(transform, (640, 480))
>>> result = annotate_video("experiment.mp4", calibration=calib, bin_size=2.0)
Notes

This function blocks until the napari viewer is closed. The viewer runs in the same Python process, and the function returns only after the user closes it (via the "Save and Close" button, Escape key, or window close).

If multiple environment boundaries are drawn, only the last one is used and a warning is emitted.

Environments with Holes

Users can draw "hole" polygons inside the environment boundary to create excluded areas. Press M to cycle to hole mode (red) after drawing the boundary. Holes are subtracted from the boundary using Shapely's difference operation before creating the Environment.

Coordinate Systems

  • Napari shapes: (row, col) with origin at top-left
  • Video pixels: (x, y) with origin at top-left
  • Environment: (x, y) with origin at bottom-left (if calibrated)
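To build intuition for these conventions, here is a hypothetical helper that converts napari (row, col) vertices to (x, y) with a bottom-left origin; the library performs the equivalent conversion internally.

```python
import numpy as np

def napari_to_xy(vertices_rc: np.ndarray, frame_height: int) -> np.ndarray:
    """Convert napari (row, col) vertices to (x, y) with origin at bottom-left.

    Illustrative helper only; assumes uncalibrated pixel coordinates.
    """
    rows, cols = vertices_rc[:, 0], vertices_rc[:, 1]
    x = cols                   # column index is the horizontal position
    y = frame_height - rows    # flip vertical axis: top-left -> bottom-left
    return np.column_stack([x, y])

# Top-left pixel (row=0, col=0) of a 480-row frame maps to (0, 480)
print(napari_to_xy(np.array([[0.0, 0.0], [479.0, 639.0]]), 480))
```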

See Also

  • regions_from_labelme : Import from LabelMe JSON
  • regions_from_cvat : Import from CVAT XML
  • AnnotationConfig : Configuration dataclass for annotation settings

Source code in src/neurospatial/annotation/core.py
def annotate_video(
    video_path: str | Path,
    *,
    config: AnnotationConfig | None = None,
    initial_regions: Regions | None = None,
    calibration: VideoCalibration | None = None,
    mode: Literal["environment", "regions", "both"] = "both",
    bin_size: float | None = None,
    initial_boundary: Polygon | NDArray[np.float64] | None = None,
    boundary_config: BoundaryConfig | None = None,
    # Individual params (can be overridden by config)
    frame_index: int | None = None,
    simplify_tolerance: float | None = None,
    multiple_boundaries: MultipleBoundaryStrategy | None = None,
    show_positions: bool | None = None,
) -> AnnotationResult:
    """Launch interactive napari annotation on a video frame.

    Opens a napari viewer with the specified video frame. Users can draw
    polygons to define an environment boundary and/or named regions.
    After closing the viewer, annotations are converted to Regions and
    optionally an Environment.

    Parameters
    ----------
    video_path : str or Path
        Path to video file (any format supported by OpenCV).
    config : AnnotationConfig, optional
        Configuration for annotation UI settings. Groups frame_index,
        simplify_tolerance, multiple_boundaries, and show_positions.
        Individual parameters override config values if both provided.
    initial_regions : Regions, optional
        Pre-existing regions to display for editing.
    calibration : VideoCalibration, optional
        Pixel-to-cm transform. If provided, output coordinates are in cm.
        If None, coordinates remain in pixels.
    mode : {"environment", "regions", "both"}, default="both"
        What to annotate:
        - "environment": Only expect environment boundary
        - "regions": Only expect named regions
        - "both": Expect both boundary and regions
    bin_size : float, optional
        Bin size for environment discretization. Required if mode is
        "environment" or "both".
    initial_boundary : Polygon or NDArray, optional
        Pre-drawn boundary for editing. Can be:

        - Shapely Polygon: Used directly as boundary
        - NDArray (n, 2): Position data to infer boundary from

        If None, user draws boundary manually.
    boundary_config : BoundaryConfig, optional
        Configuration for boundary inference when initial_boundary is an array.
        If None, uses BoundaryConfig defaults (convex_hull, 2% buffer, 1% simplify).
    frame_index : int, optional
        Which frame to display for annotation. Overrides config.frame_index.
        Default is 0 (first frame).
    simplify_tolerance : float, optional
        Tolerance for polygon simplification using Douglas-Peucker algorithm.
        Removes vertices that deviate less than this distance from the simplified line.
        Overrides config.simplify_tolerance.

        Units depend on calibration:
        - With calibration: environment units (typically cm)
        - Without calibration: pixels

        Recommended values:
        - For cm: 1.0-2.0 (removes hand-drawn jitter)
        - For pixels: 2.0-5.0
    multiple_boundaries : {"last", "first", "error"}, optional
        How to handle multiple environment boundaries.
        Overrides config.multiple_boundaries. Default is "last".

        - "last": Use the last drawn boundary (default). A warning is emitted.
        - "first": Use the first drawn boundary. A warning is emitted.
        - "error": Raise ValueError if multiple boundaries are drawn.
    show_positions : bool, optional
        If True and initial_boundary is an array, show positions as a
        Points layer for reference while editing.
        Overrides config.show_positions. Default is False.

    Returns
    -------
    AnnotationResult
        Named tuple containing:
        - environment: Environment or None
        - regions: Regions collection

    Raises
    ------
    ValueError
        If bin_size is not provided when mode requires environment creation,
        or if ``multiple_boundaries="error"`` and multiple environment
        boundaries are drawn.
    ImportError
        If napari is not installed.

    Examples
    --------
    >>> from neurospatial.annotation import annotate_video
    >>> # Simple annotation (pixel coordinates)
    >>> result = annotate_video("experiment.mp4", bin_size=10.0)
    >>> print(result.environment)  # Environment from boundary
    >>> print(result.regions)  # Named regions

    >>> # With calibration (cm coordinates)
    >>> from neurospatial.transforms import VideoCalibration, calibrate_from_scale_bar
    >>> transform = calibrate_from_scale_bar((0, 0), (200, 0), 100.0, (640, 480))
    >>> calib = VideoCalibration(transform, (640, 480))
    >>> result = annotate_video("experiment.mp4", calibration=calib, bin_size=2.0)

    Notes
    -----
    This function blocks until the napari viewer is closed. The viewer runs
    in the same Python process, and the function returns only after the user
    closes it (via the "Save and Close" button, Escape key, or window close).

    If multiple environment boundaries are drawn, only the last one is used
    and a warning is emitted.

    Environments with Holes
    ^^^^^^^^^^^^^^^^^^^^^^^
    Users can draw "hole" polygons inside the environment boundary to create
    excluded areas. Press M to cycle to hole mode (red) after drawing the
    boundary. Holes are subtracted from the boundary using Shapely's
    difference operation before creating the Environment.

    Coordinate Systems
    ^^^^^^^^^^^^^^^^^^
    - Napari shapes: (row, col) with origin at top-left
    - Video pixels: (x, y) with origin at top-left
    - Environment: (x, y) with origin at bottom-left (if calibrated)

    See Also
    --------
    regions_from_labelme : Import from LabelMe JSON
    regions_from_cvat : Import from CVAT XML
    AnnotationConfig : Configuration dataclass for annotation settings.

    """
    # Resolve config with individual parameter overrides
    resolved = _resolve_config_params(
        config,
        frame_index,
        simplify_tolerance,
        multiple_boundaries,
        show_positions,
    )
    logger.debug(
        "Resolved config: frame_index=%d, simplify_tolerance=%s, "
        "multiple_boundaries=%s, show_positions=%s",
        resolved.frame_index,
        resolved.simplify_tolerance,
        resolved.multiple_boundaries,
        resolved.show_positions,
    )

    # Validate parameters early (before expensive imports)
    _validate_annotate_params(mode, bin_size)

    try:
        import napari
    except ImportError as e:
        raise ImportError(
            "napari is required for interactive annotation. "
            "Install with: pip install napari[all]",
        ) from e

    # Convert to Path and load video frame
    video_path = Path(video_path)
    logger.debug("Loading video frame %d from %s", resolved.frame_index, video_path)
    frame = _load_video_frame(video_path, resolved.frame_index)
    logger.debug("Loaded frame with shape %s", frame.shape)

    # Process initial boundary (from polygon or positions)
    boundary_polygon, positions_for_display = _process_initial_boundary(
        initial_boundary,
        boundary_config,
        resolved.show_positions,
    )

    # Handle conflict: initial_boundary takes precedence over env regions in initial_regions
    if boundary_polygon is not None and initial_regions is not None:
        initial_regions = _filter_environment_regions(initial_regions)

    # Setup napari viewer with all layers and widgets
    _viewer, shapes = _setup_annotation_viewer(
        video_path,
        frame,
        mode,
        boundary_polygon,
        positions_for_display,
        initial_regions,
        calibration,
    )

    # Run napari (blocking until viewer closes)
    napari.run()

    # Convert annotations to result
    return _process_annotation_results(
        shapes,
        mode,
        bin_size,
        calibration,
        resolved.simplify_tolerance,
        resolved.multiple_boundaries,
    )

AnnotationResult

Bases: NamedTuple

Result from an annotation session.

Attributes:

  • environment (Environment or None): Discretized environment if boundary was annotated, else None.
  • regions (Regions): All annotated regions (excluding environment boundary).

regions_from_labelme

Load regions from LabelMe JSON with optional calibration.

Parameters:

  • json_path (str or Path, required): Path to LabelMe JSON file.
  • calibration (VideoCalibration, optional): If provided, transforms pixel coordinates to world coordinates (cm).
  • label_key (str, default "label"): Key in JSON for region name.
  • points_key (str, default "points"): Key in JSON for polygon vertices.

Returns:

  • Regions: Loaded regions with coordinates in cm (if calibrated) or pixels.

See Also

neurospatial.regions.io.load_labelme_json : Underlying implementation.

Examples:

>>> from neurospatial.annotation import regions_from_labelme
>>> from neurospatial.transforms import VideoCalibration, calibrate_from_scale_bar
>>> # Without calibration (pixel coordinates)
>>> regions = regions_from_labelme("annotations.json")
>>> # With calibration (cm coordinates)
>>> transform = calibrate_from_scale_bar((0, 0), (100, 0), 50.0, (640, 480))
>>> calib = VideoCalibration(transform, (640, 480))
>>> regions = regions_from_labelme("annotations.json", calibration=calib)
Source code in src/neurospatial/annotation/io.py
def regions_from_labelme(
    json_path: str | Path,
    calibration: VideoCalibration | None = None,
    *,
    label_key: str = "label",
    points_key: str = "points",
) -> Regions:
    """Load regions from LabelMe JSON with optional calibration.

    Parameters
    ----------
    json_path : str or Path
        Path to LabelMe JSON file.
    calibration : VideoCalibration, optional
        If provided, transforms pixel coordinates to world coordinates (cm).
    label_key : str, default="label"
        Key in JSON for region name.
    points_key : str, default="points"
        Key in JSON for polygon vertices.

    Returns
    -------
    Regions
        Loaded regions with coordinates in cm (if calibrated) or pixels.

    See Also
    --------
    neurospatial.regions.io.load_labelme_json : Underlying implementation.

    Examples
    --------
    >>> from neurospatial.annotation import regions_from_labelme
    >>> from neurospatial.transforms import VideoCalibration, calibrate_from_scale_bar
    >>> # Without calibration (pixel coordinates)
    >>> regions = regions_from_labelme("annotations.json")
    >>> # With calibration (cm coordinates)
    >>> transform = calibrate_from_scale_bar((0, 0), (100, 0), 50.0, (640, 480))
    >>> calib = VideoCalibration(transform, (640, 480))
    >>> regions = regions_from_labelme("annotations.json", calibration=calib)

    """
    from neurospatial.regions.io import load_labelme_json

    pixel_to_world = calibration.transform_px_to_cm if calibration else None
    return load_labelme_json(
        json_path,
        pixel_to_world=pixel_to_world,
        label_key=label_key,
        points_key=points_key,
    )

regions_from_cvat

Load regions from CVAT XML with optional calibration.

Parameters:

  • xml_path (str or Path, required): Path to CVAT XML export file.
  • calibration (VideoCalibration, optional): If provided, transforms pixel coordinates to world coordinates (cm).

Returns:

  • Regions: Loaded regions with coordinates in cm (if calibrated) or pixels.

See Also

neurospatial.regions.io.load_cvat_xml : Underlying implementation.

Examples:

>>> from neurospatial.annotation import regions_from_cvat
>>> regions = regions_from_cvat("cvat_export.xml")
Source code in src/neurospatial/annotation/io.py
def regions_from_cvat(
    xml_path: str | Path,
    calibration: VideoCalibration | None = None,
) -> Regions:
    """Load regions from CVAT XML with optional calibration.

    Parameters
    ----------
    xml_path : str or Path
        Path to CVAT XML export file.
    calibration : VideoCalibration, optional
        If provided, transforms pixel coordinates to world coordinates (cm).

    Returns
    -------
    Regions
        Loaded regions with coordinates in cm (if calibrated) or pixels.

    See Also
    --------
    neurospatial.regions.io.load_cvat_xml : Underlying implementation.

    Examples
    --------
    >>> from neurospatial.annotation import regions_from_cvat
    >>> regions = regions_from_cvat("cvat_export.xml")

    """
    from neurospatial.regions.io import load_cvat_xml

    pixel_to_world = calibration.transform_px_to_cm if calibration else None
    return load_cvat_xml(xml_path, pixel_to_world=pixel_to_world)

Common Issues

"napari is not installed"

Install napari with:

pip install napari[all]

Shapes don't appear

Ensure the shapes layer is in polygon drawing mode (the default). Keys 3 and 4 switch to move and vertex-edit modes, which do not create new shapes; switch back to drawing mode to add polygons.

Calibration appears wrong

Check that:

  • Y-axis convention matches your data (scientific data typically uses Y-up)
  • Scale factor is correct (verify by measuring known distances)
  • Landmark correspondences are accurate

Viewer closes unexpectedly

The viewer blocks until closed. If it closes without saving, check for Python errors in the console.

See Also