MaskArchive

MaskArchive bundles multiple segmentation masks into a single archive file. Every mask in an archive must share the same space (shape, spacing, origin), which makes the format a good fit for large, multi-level organ sets such as the lung: lobes, segments, lesions, and the whole-lung mask.
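
The shared-space requirement can be pictured as an equality check on grid metadata. The sketch below uses a hypothetical `GridSpace` dataclass as a stand-in for `spacetransformer.Space`, and a hypothetical `same_space` helper; neither is the library's own validation logic, which this document does not show.

```python
from dataclasses import dataclass

# Hypothetical stand-in for spacetransformer.Space; only the three fields
# named in the text (shape, spacing, origin) are modeled.
@dataclass(frozen=True)
class GridSpace:
    shape: tuple
    spacing: tuple
    origin: tuple

def same_space(a: GridSpace, b: GridSpace, tol: float = 1e-6) -> bool:
    """Return True when two grids agree on shape, spacing, and origin."""
    if a.shape != b.shape:
        return False

    def close(x, y):
        # Compare float tuples element by element within a small tolerance.
        return len(x) == len(y) and all(abs(p - q) <= tol for p, q in zip(x, y))

    return close(a.spacing, b.spacing) and close(a.origin, b.origin)

ct_grid = GridSpace((1, 64, 64), (1.0, 1.0, 1.0), (0.0, 0.0, 0.0))
resampled = GridSpace((1, 64, 64), (2.0, 2.0, 2.0), (0.0, 0.0, 0.0))

print(same_space(ct_grid, ct_grid))    # True
print(same_space(ct_grid, resampled))  # False
```

Masks that fail a check like this belong in separate files or separate archives.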

1. Scenario: Multi-Level Lung Segmentation

  • 5 lobes (mutually exclusive)
  • 18 segments (mutually exclusive with each other, but each overlaps a lobe)
  • N lesions (may overlap any lobe or segment)
  • Whole lung (overlaps all of the above)

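Traditional approach: store 5+18+N+1 separate files, one per structure. Even with one multi-label file per level, as in the comparison in section 4, that still leaves four files to keep in sync.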
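
The levels cannot simply be merged into one label array, because each voxel holds a single label: writing a lesion into a lobe map erases the lobe label at those voxels. A minimal NumPy sketch with toy values (the array size and label numbers are illustrative, not taken from the dataset below):

```python
import numpy as np

# One label per voxel: a single array can only encode mutually exclusive structures.
combined = np.zeros((8, 8), dtype=np.uint8)
combined[1:7, 1:7] = 1      # a "lobe" with label 1
lobe_voxels_before = int((combined == 1).sum())

combined[3:5, 3:5] = 9      # a "lesion" written into the same array
lobe_voxels_after = int((combined == 1).sum())

print(lobe_voxels_before, lobe_voxels_after)  # 36 32

# Keeping one array per level preserves both structures in full.
lobes = np.zeros((8, 8), dtype=np.uint8)
lesions = np.zeros((8, 8), dtype=np.uint8)
lobes[1:7, 1:7] = 1
lesions[3:5, 3:5] = 1
overlap = int(((lobes == 1) & (lesions == 1)).sum())
print(overlap)  # 4
```

This is why each level needs its own label array, whether those arrays end up in separate files or in one archive.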

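2. Synthetic Data Setup

The arrays below are a reduced stand-in for the scenario: 5 lobes, 10 segments, 3 lesions, and one whole-lung mask on a single 64×64 slice.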

import time
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
from medmask import SegmentationMask, MaskArchive
from spacetransformer import Space

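# Single-slice volume so the masks are easy to plot in 2D.
shape = (1, 64, 64)
space = Space(shape=shape, spacing=(1.0, 1.0, 1.0), origin=(0.0, 0.0, 0.0))

# Five lobes, one integer label each (0 is left as background).
lobe_mask = np.zeros(shape, dtype=np.uint8)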
lobe_mask[0,10:30,10:25] = 1
lobe_mask[0,35:55,10:25] = 2
lobe_mask[0,10:25,40:55] = 3
lobe_mask[0,30:45,40:55] = 4
lobe_mask[0,50:60,40:55] = 5
lobe_mapping = {
    "left_upper_lobe":1,
    "left_lower_lobe":2,
    "right_upper_lobe":3,
    "right_middle_lobe":4,
    "right_lower_lobe":5
}

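# Ten segments. Where boxes overlap (e.g. labels 2 and 3), the later
# assignment wins, so each voxel still carries exactly one label.
segment_mask = np.zeros(shape, dtype=np.uint8)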
segment_mask[0,10:18,10:18] = 1
segment_mask[0,18:25,12:20] = 2
segment_mask[0,22:30,17:25] = 3
segment_mask[0,35:42,10:18] = 4
segment_mask[0,42:50,12:20] = 5
segment_mask[0,48:55,17:25] = 6
segment_mask[0,10:18,40:48] = 7
segment_mask[0,18:25,42:50] = 8
segment_mask[0,30:38,40:48] = 9
segment_mask[0,38:45,42:50] = 10
segment_mapping = {
    "LUL_S1":1,"LUL_S2":2,"LUL_S3":3,
    "LLL_S4":4,"LLL_S5":5,"LLL_S6":6,
    "RUL_S1":7,"RUL_S2":8,
    "RML_S4":9,"RML_S5":10
}

lesion_mask = np.zeros(shape, dtype=np.uint8)
lesion_mask[0,15:20,15:20] = 1
lesion_mask[0,40:45,15:20] = 2
lesion_mask[0,25:30,45:50] = 3
lesion_mapping = {"nodule_1":1,"nodule_2":2,"mass_1":3}

whole_lung_mask = np.zeros(shape, dtype=np.uint8)
whole_lung_mask[0,8:62,8:57] = 1
whole_lung_mapping = {"whole_lung":1}

print("Setup complete")
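
Before wrapping arrays like these, it can help to confirm that each label array and its name mapping agree. The helper below, `check_mapping`, is a hypothetical utility written for this tutorial and not part of medmask; it also assumes that label 0 means background, which matches the zero-initialized arrays above.

```python
import numpy as np

def check_mapping(mask: np.ndarray, mapping: dict) -> list:
    """Return a list of problems; an empty list means mask and mapping agree."""
    problems = []
    present = set(np.unique(mask).tolist()) - {0}   # assumption: 0 is background
    declared = set(mapping.values())
    if len(declared) != len(mapping):
        problems.append("duplicate label values in mapping")
    for label in sorted(present - declared):
        problems.append(f"label {label} in mask has no name")
    for label in sorted(declared - present):
        problems.append(f"label {label} is named but absent from mask")
    return problems

demo_mask = np.zeros((1, 8, 8), dtype=np.uint8)
demo_mask[0, 0:4, 0:4] = 1
demo_mask[0, 4:8, 4:8] = 2

print(check_mapping(demo_mask, {"a": 1, "b": 2}))   # []
print(check_mapping(demo_mask, {"a": 1, "c": 3}))   # two problems reported
```

The same call works on any of the label arrays and mappings defined in the setup, for example `check_mapping(lobe_mask, lobe_mapping)`.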

3. Visualize Masks

fig, axes = plt.subplots(2,2, figsize=(12,10))
plots = [
    (lobe_mask[0], "Lobes", "Set3"),
    (segment_mask[0], "Segments", "Set3"),
    (lesion_mask[0], "Lesions", "Set3"),
    (whole_lung_mask[0], "Whole lung", "Set3")
]
for ax, (mask, title, cmap) in zip(axes.flatten(), plots):
    ax.imshow(mask, cmap=cmap, alpha=0.8)
    ax.set_title(title)
    ax.axis('off')
plt.tight_layout()
plt.show()

4. Traditional: Separate Files

print("=== Separate files ===")
# Build the SegmentationMask objects before starting the timer so the
# measurement covers only file writes; the archive timing in section 5
# reuses these objects and likewise excludes their construction.
lobe_segmask = SegmentationMask(lobe_mask, lobe_mapping, space=space)
segment_segmask = SegmentationMask(segment_mask, segment_mapping, space=space)
lesion_segmask = SegmentationMask(lesion_mask, lesion_mapping, space=space)
whole_segmask = SegmentationMask(whole_lung_mask, whole_lung_mapping, space=space)

start = time.perf_counter()
lobe_segmask.save("lung_lobes.msk")
segment_segmask.save("lung_segments.msk")
lesion_segmask.save("lung_lesions.msk")
whole_segmask.save("whole_lung.msk")
separate_time = time.perf_counter() - start
files = ["lung_lobes.msk","lung_segments.msk","lung_lesions.msk","whole_lung.msk"]
size_total = sum(Path(f).stat().st_size for f in files)
print(f"Time: {separate_time:.3f}s, files: {len(files)}, size: {size_total/1024:.1f} KB")

5. MaskArchive

print("\n=== MaskArchive ===")
# Reuses the SegmentationMask objects built in section 4.
start = time.perf_counter()
archive = MaskArchive("lung_analysis.mska", mode="w", space=space)
archive.add_segmask(lobe_segmask, "lobes")
archive.add_segmask(segment_segmask, "segments")
archive.add_segmask(lesion_segmask, "lesions")
archive.add_segmask(whole_segmask, "whole_lung")
archive_time = time.perf_counter() - start
archive_size = Path("lung_analysis.mska").stat().st_size
print(f"Time: {archive_time:.3f}s, files: 1, size: {archive_size/1024:.1f} KB")
print("Masks in archive:", archive.all_names())
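# Ratio of total separate-file size to archive size (values above 1 favor the archive).
ratio = f"{size_total / archive_size:.2f}x" if archive_size else "N/A"
print("Size ratio (separate files / archive):", ratio)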
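
The sizes printed above depend on medmask's on-disk formats, which are not described here. For intuition about single-file bundling only, the sketch below swaps in NumPy's `.npz` container as a stand-in. It is a different format from `.mska` and says nothing about MaskArchive's actual compression; the toy arrays are illustrative.

```python
import tempfile
from pathlib import Path

import numpy as np

# Toy label arrays on one shared grid (values are illustrative).
grid = (1, 64, 64)
levels = {}
for i, name in enumerate(["lobes", "segments", "lesions", "whole_lung"], start=1):
    arr = np.zeros(grid, dtype=np.uint8)
    arr[0, 8 * i : 8 * i + 16, 8:40] = i
    levels[name] = arr

with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)

    # One compressed file per level.
    for name, arr in levels.items():
        np.savez_compressed(tmp / f"{name}.npz", data=arr)
    separate_bytes = sum((tmp / f"{name}.npz").stat().st_size for name in levels)

    # One compressed file holding every level under its own key.
    bundle_path = tmp / "bundle.npz"
    np.savez_compressed(bundle_path, **levels)
    bundle_bytes = bundle_path.stat().st_size

    # Read the bundle back and confirm every level survives unchanged.
    with np.load(bundle_path) as bundle:
        bundled_names = sorted(bundle.files)
        round_trip_ok = all(np.array_equal(bundle[n], levels[n]) for n in levels)

print(f"separate: {separate_bytes} B in {len(levels)} files")
print(f"bundle:   {bundle_bytes} B in 1 file")
print(bundled_names, round_trip_ok)
```

Whatever the byte counts turn out to be on a given machine, the structural point is the same: one artifact, several named masks, each retrievable by key.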

6. Access + Integrity

reader = MaskArchive("lung_analysis.mska", mode="r")
print("Names:", reader.all_names())

loaded_lobes = reader.load_segmask("lobes")
loaded_lesions = reader.load_segmask("lesions")
print("Data match:", np.array_equal(lobe_mask, loaded_lobes.data))

left_upper = loaded_lobes.get_binary_mask_by_names("left_upper_lobe")
nodule = loaded_lesions.get_binary_mask_by_names("nodule_1")
print("Left upper lobe voxels:", left_upper.sum())
print("Nodule 1 voxels:", nodule.sum())
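
The two printed counts can be checked by hand from the setup: label 1 in the lobe array fills rows 10 to 29 and columns 10 to 24 (20 × 15 = 300 voxels), and nodule_1 fills a 5 × 5 patch (25 voxels). The NumPy-only sketch below reproduces those counts with a plain `mask == label` comparison. The return type of `get_binary_mask_by_names` is not stated here, so treat the comparison as an illustration rather than the library's implementation.

```python
import numpy as np

shape = (1, 64, 64)

lobe_mask = np.zeros(shape, dtype=np.uint8)
lobe_mask[0, 10:30, 10:25] = 1          # left_upper_lobe, as in the setup

lesion_mask = np.zeros(shape, dtype=np.uint8)
lesion_mask[0, 15:20, 15:20] = 1        # nodule_1, as in the setup

left_upper = lobe_mask == 1
nodule = lesion_mask == 1

print(int(left_upper.sum()))   # 300
print(int(nodule.sum()))       # 25

# The nodule lies entirely inside the left upper lobe in this synthetic data.
nodule_inside_lobe = bool((nodule & ~left_upper).sum() == 0)
print(nodule_inside_lobe)      # True
```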

7. Recommendations

Use MaskArchive when:

  • Many masks share an identical grid (organ atlases, multi-stage annotations)
  • You need a single artifact for storage, transfer, or versioning
  • You want label names to travel with the data via embedded mappings

Keep separate files when:

  • Masks come from differing grids/spaces
  • Legacy systems require individual formats

MaskArchive reduces file sprawl while preserving full metadata and semantics.