Validation
Learn how to validate your dataset structure to catch errors early.
Basic Validation
from dummyxarray import DummyDataset
import numpy as np
ds = DummyDataset()
ds.add_dim("time", 10)
data = np.random.rand(10)
ds.add_variable("temperature", dims=["time"], data=data)
# Validate the dataset
ds.validate() # Raises ValueError if validation fails
What Gets Validated
Validation checks for:
- Unknown dimensions: a variable references a dimension that has not been added to the dataset
- Shape mismatches: a variable's data shape doesn't match the sizes of its declared dimensions
- Missing coordinates (optional, only with strict_coords=True): a variable uses a dimension that has no corresponding coordinate
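The three checks listed above can be pictured with a small standalone sketch. This is not the dummyxarray implementation: the function name `find_structure_errors`, its arguments, and the wording of the shape message are assumptions made for illustration, and it works on plain dictionaries rather than a `DummyDataset`.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical variable description: (dimension names, data shape or None).
VarSpec = Tuple[Sequence[str], Optional[Tuple[int, ...]]]


def find_structure_errors(
    dims: Dict[str, int],
    variables: Dict[str, VarSpec],
    coords: Sequence[str] = (),
    strict_coords: bool = False,
) -> List[str]:
    """Return a list of human-readable problems (empty if none are found)."""
    errors: List[str] = []
    for name, (var_dims, shape) in variables.items():
        # Check 1: every dimension a variable uses must be declared.
        unknown = [d for d in var_dims if d not in dims]
        for d in unknown:
            errors.append(f"{name}: Unknown dimension '{d}'.")

        # Check 2: if data is attached and all dimensions are known,
        # its shape must equal the declared dimension sizes.
        if shape is not None and not unknown:
            expected = tuple(dims[d] for d in var_dims)
            if tuple(shape) != expected:
                errors.append(
                    f"{name}: Shape {tuple(shape)} does not match expected {expected}."
                )

        # Check 3 (optional): each known dimension needs a coordinate.
        if strict_coords:
            for d in var_dims:
                if d in dims and d not in coords:
                    errors.append(f"{name}: Missing coordinate for dimension '{d}'.")
    return errors


if __name__ == "__main__":
    problems = find_structure_errors(
        dims={"time": 10},
        variables={"temp": (["time"], (10,))},
        strict_coords=True,
    )
    print(problems)
```

Collecting every problem into a list before reporting, rather than stopping at the first one, lets a single validation pass surface all issues at once.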
Dimension Checks
ds = DummyDataset()
ds.add_variable("temp", dims=["unknown_dim"])
try:
    ds.validate()
except ValueError as e:
    print(f"Error: {e}")
    # Error: Dataset validation failed:
    #   temp: Unknown dimension 'unknown_dim'.
Shape Validation
Shape mismatches are caught as soon as a variable is added, before validate() is called:
ds = DummyDataset()
ds.add_dim("time", 10)
ds.add_dim("lat", 5)
data = np.random.rand(10, 6)  # Wrong shape: "lat" has size 5, not 6
try:
    # This raises an error immediately
    ds.add_variable("test", dims=["time", "lat"], data=data)
except ValueError as e:
    print(f"Error: {e}")
    # Error: Dimension mismatch for 'lat': existing=5 new=6
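As a rough illustration of why the error can surface at add time, the sketch below compares each axis of the incoming data against the size already registered for that dimension. The helper `check_shape_against_dims` is hypothetical and only mirrors the message format shown above; how dummyxarray performs this check internally is not stated here.

```python
from typing import Dict, Sequence


def check_shape_against_dims(
    dims: Dict[str, int],
    var_dims: Sequence[str],
    shape: Sequence[int],
) -> None:
    """Raise ValueError if `shape` conflicts with already-declared dimension sizes."""
    if len(var_dims) != len(shape):
        raise ValueError(
            f"Expected {len(var_dims)} axes for dims {list(var_dims)}, got {len(shape)}"
        )
    for dim, size in zip(var_dims, shape):
        existing = dims.get(dim)
        # Dimensions that are not declared yet are skipped in this sketch.
        if existing is not None and existing != size:
            raise ValueError(
                f"Dimension mismatch for '{dim}': existing={existing} new={size}"
            )


if __name__ == "__main__":
    declared = {"time": 10, "lat": 5}
    try:
        check_shape_against_dims(declared, ["time", "lat"], (10, 6))
    except ValueError as e:
        print(f"Error: {e}")
```

Failing at the point where the variable is added keeps the error close to the line that introduced the bad data.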
Strict Coordinate Validation
Enable strict mode to require coordinates for all dimensions:
ds = DummyDataset()
ds.add_dim("time", 10)
data = np.random.rand(10)
ds.add_variable("temp", dims=["time"], data=data)
# Strict validation requires coordinates
try:
    ds.validate(strict_coords=True)
except ValueError as e:
    print(f"Error: {e}")
    # Error: temp: Missing coordinate for dimension 'time'.
Validation in Workflows
Validation is automatically called before conversion: