CF Standards with cf_xarray¶
dummyxarray uses cf_xarray to apply community-agreed CF standards to your datasets.
Overview¶
dummyxarray integrates cf_xarray to provide:
- Community standards - Based on official CF conventions
- Automatic detection - Uses criteria from MetPy and Iris
- Ecosystem integration - Consistent with xarray tools
- Comprehensive validation - Beyond basic checks
Why cf_xarray?¶
Instead of creating our own interpretation of CF standards, we use cf_xarray,
which implements community-agreed criteria for:
- Axis detection (X, Y, Z, T)
- Coordinate identification
- Attribute requirements
- Standard names
This ensures your datasets are compatible with the broader scientific Python ecosystem.
Installation¶
cf_xarray is installed automatically as a dependency when you install dummyxarray.
Basic Usage¶
Apply CF Standards¶
from dummyxarray import DummyDataset
ds = DummyDataset()
ds.add_dim("time", 12)
ds.add_coord(
    "time",
    dims=["time"],
    attrs={"units": "days since 2000-01-01"}
)
# Apply CF standards using cf_xarray (no data needed!)
result = ds.apply_cf_standards()
# Check what was detected
print(result['axes_detected']) # {'time': 'T'}
print(result['attrs_added']) # {'time': {'axis': 'T', ...}}
# Coordinate now has proper CF attributes
print(ds.coords['time'].attrs)
# {'units': 'days since 2000-01-01', 'axis': 'T', 'standard_name': 'time'}
Validate CF Metadata¶
# Validate against CF standards (no data needed!)
result = ds.validate_cf_metadata()
print(f"Valid: {result['valid']}")
print(f"Errors: {result['errors']}")
print(f"Warnings: {result['warnings']}")
print(f"Suggestions: {result['suggestions']}")
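The result dictionary can also be turned into a short report. The sketch below assumes only the four keys printed above (`valid`, `errors`, `warnings`, `suggestions`) and that the last three hold lists of strings. `format_cf_report` is a hypothetical helper, not part of dummyxarray, and the example result is hand-written rather than real library output.

```python
def format_cf_report(result):
    """Render a validation result dict as a multi-line string.

    Assumes the keys shown above; missing keys are treated as empty.
    """
    status = "valid" if result.get("valid", False) else "invalid"
    lines = [f"CF metadata: {status}"]
    for key in ("errors", "warnings", "suggestions"):
        for message in result.get(key, []):
            # key[:-1] turns "errors" into "error", and so on
            lines.append(f"  [{key[:-1]}] {message}")
    return "\n".join(lines)


# Hand-written example result (an assumption, not real output)
example = {
    "valid": False,
    "errors": ["lat: missing units"],
    "warnings": ["time: missing standard_name"],
    "suggestions": [],
}
print(format_cf_report(example))
```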
Works Without Data! 🎉¶
Great news: Both apply_cf_standards() and validate_cf_metadata() now work
without requiring data to be populated!
How?¶
The functions automatically create temporary dummy arrays (zeros) just for cf_xarray processing, then discard them - keeping only the detected metadata. Your actual dataset remains metadata-only.
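The placeholder idea can be illustrated without any dependencies. This is a minimal sketch of the pattern only, not dummyxarray's implementation: plain lists stand in for arrays, and `detect_with_placeholders` and `toy_detector` are hypothetical names introduced for illustration.

```python
def detect_with_placeholders(dims, coords, detector):
    """Run `detector` on temporary zero-filled values and keep only its result.

    dims:     mapping of dimension name -> size
    coords:   mapping of coordinate name -> {"dims": [...], "attrs": {...}}
    detector: callable (name, values, attrs) -> axis letter or None
    """
    detected = {}
    for name, spec in coords.items():
        # Total number of elements implied by the coordinate's dimensions
        size = 1
        for dim in spec["dims"]:
            size *= dims[dim]
        placeholder = [0.0] * size  # temporary zeros, never stored on the spec
        axis = detector(name, placeholder, spec["attrs"])
        if axis is not None:
            detected[name] = axis
    return detected


def toy_detector(name, values, attrs):
    # Stand-in for the real detection step: only looks at the units string
    return "T" if " since " in attrs.get("units", "") else None


dims = {"time": 12}
coords = {"time": {"dims": ["time"], "attrs": {"units": "days since 2000-01-01"}}}
print(detect_with_placeholders(dims, coords, toy_detector))  # {'time': 'T'}
```

The input mappings are never modified, which mirrors the point above: only the detected metadata survives the call.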
# Works perfectly without data!
ds = DummyDataset()
ds.add_coord("time", dims=["time"], attrs={"units": "days since 2000-01-01"})
ds.add_variable("temperature", dims=["time"], attrs={"units": "K"})
# No data needed - metadata-only!
result = ds.apply_cf_standards()
print(result['axes_detected']) # {'time': 'T'}
# Data is still None
print(ds.coords['time'].data) # None
Optional: Use Real Data¶
If you already have data or want to populate it, that works too:
# Optional: populate with data
ds.populate_with_random_data(seed=42)
# Works the same way
ds.apply_cf_standards()
Complete Example¶
from dummyxarray import DummyDataset
# Create dataset with minimal metadata
ds = DummyDataset()
ds.assign_attrs(Conventions="CF-1.8", title="Climate Data")
# Add coordinates with basic attributes
ds.add_dim("time", 365)
ds.add_dim("lat", 64)
ds.add_dim("lon", 128)
ds.add_coord("time", dims=["time"],
             attrs={"units": "days since 2000-01-01"})
ds.add_coord("lat", dims=["lat"],
             attrs={"units": "degrees_north"})
ds.add_coord("lon", dims=["lon"],
             attrs={"units": "degrees_east"})
ds.add_variable("temperature", dims=["time", "lat", "lon"],
                attrs={"standard_name": "air_temperature", "units": "K"})
# Apply CF standards
if ds.check_cf_standards_available():
    result = ds.apply_cf_standards()
    print(f"Detected axes: {result['axes_detected']}")
    # Output: {'time': 'T', 'lat': 'Y', 'lon': 'X'}
    # Validate
    validation = ds.validate_cf_metadata()
    if validation['valid']:
        print("✓ CF compliant!")
else:
    print("Install cf_xarray for CF standards support")
What Gets Added¶
cf_xarray automatically adds appropriate attributes based on detection:
Time Coordinates¶
# Before
attrs = {"units": "days since 2000-01-01"}
# After apply_cf_standards()
attrs = {
    "units": "days since 2000-01-01",
    "axis": "T",
    "standard_name": "time"
}
Latitude Coordinates¶
# Before
attrs = {"units": "degrees_north"}
# After
attrs = {
    "units": "degrees_north",
    "axis": "Y",
    "standard_name": "latitude"
}
Longitude Coordinates¶
# Before
attrs = {"units": "degrees_east"}
# After
attrs = {
    "units": "degrees_east",
    "axis": "X",
    "standard_name": "longitude"
}
Vertical Coordinates¶
# Before
attrs = {"units": "hPa", "positive": "down"}
# After
attrs = {
    "units": "hPa",
    "positive": "down",
    "axis": "Z",
    "standard_name": "air_pressure"
}
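In each before/after pair above, the existing attributes are kept and new ones are added alongside them. A minimal sketch of that kind of merge follows. It assumes that attributes already present are never overwritten, which the examples suggest but do not confirm, and `merge_detected_attrs` is a hypothetical helper rather than a dummyxarray function.

```python
def merge_detected_attrs(existing, detected):
    """Add detected attributes without overwriting ones already set.

    Returns (merged, added), where `added` holds only the newly set keys.
    """
    merged = dict(existing)  # copy so the caller's dict is left untouched
    added = {}
    for key, value in detected.items():
        if key not in merged:  # assumption: user-provided values win
            merged[key] = value
            added[key] = value
    return merged, added


before = {"units": "hPa", "positive": "down"}
detected = {"axis": "Z", "standard_name": "air_pressure"}
after, added = merge_detected_attrs(before, detected)
print(after)
print(added)
```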
Detection Criteria¶
cf_xarray detects coordinates using four kinds of criteria, described below: units, standard names, the axis attribute, and name patterns.
By Units¶
- degrees_north, degree_north, degrees_N → Latitude (Y)
- degrees_east, degree_east, degrees_E → Longitude (X)
- Time units like days since YYYY-MM-DD → Time (T)
- Pressure units like hPa, Pa, mbar → Vertical (Z)
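The unit rules above can be sketched as a small lookup. This is a simplified illustration covering only the units listed here, not cf_xarray's actual criteria. `axis_from_units` and the regular expression for time units are assumptions made for the example.

```python
import re

LATITUDE_UNITS = {"degrees_north", "degree_north", "degrees_N"}
LONGITUDE_UNITS = {"degrees_east", "degree_east", "degrees_E"}
PRESSURE_UNITS = {"hPa", "Pa", "mbar"}
# e.g. "days since 2000-01-01": a unit word, "since", then a reference date
TIME_UNITS = re.compile(r"^\s*\w+\s+since\s+\S+")


def axis_from_units(units):
    """Guess an axis letter from a units string, or return None."""
    if units in LATITUDE_UNITS:
        return "Y"
    if units in LONGITUDE_UNITS:
        return "X"
    if units in PRESSURE_UNITS:
        return "Z"
    if TIME_UNITS.match(units):
        return "T"
    return None


for units in ("degrees_north", "days since 2000-01-01", "hPa", "K"):
    print(units, "->", axis_from_units(units))
```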
By Standard Name¶
- latitude, grid_latitude → Y axis
- longitude, grid_longitude → X axis
- time → T axis
- air_pressure, altitude, height → Z axis
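Standard-name detection amounts to a dictionary lookup. The sketch below covers only the names listed above. `STANDARD_NAME_TO_AXIS` and `axis_from_standard_name` are hypothetical names, and cf_xarray's own table may differ.

```python
STANDARD_NAME_TO_AXIS = {
    "latitude": "Y",
    "grid_latitude": "Y",
    "longitude": "X",
    "grid_longitude": "X",
    "time": "T",
    "air_pressure": "Z",
    "altitude": "Z",
    "height": "Z",
}


def axis_from_standard_name(attrs):
    """Return the axis for attrs['standard_name'], or None if absent or unknown."""
    return STANDARD_NAME_TO_AXIS.get(attrs.get("standard_name"))


print(axis_from_standard_name({"standard_name": "grid_longitude"}))
print(axis_from_standard_name({"units": "K"}))
```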
By Axis Attribute¶
- axis="X" → X axis
- axis="Y" → Y axis
- axis="Z" → Z axis
- axis="T" → T axis
By Name Pattern¶
- Names like lat, latitude, y → Y axis
- Names like lon, longitude, x → X axis
- Names like time, t → T axis
- Names like lev, level, z, height → Z axis
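Name-based detection can be sketched with a table of regular expressions. This version matches only the exact names listed above, ignoring case; the patterns cf_xarray actually uses are likely broader, so treat `NAME_PATTERNS` and `axis_from_name` as illustrative assumptions.

```python
import re

# Each entry pairs an axis letter with a whole-name, case-insensitive pattern
NAME_PATTERNS = [
    ("Y", re.compile(r"^(lat|latitude|y)$", re.IGNORECASE)),
    ("X", re.compile(r"^(lon|longitude|x)$", re.IGNORECASE)),
    ("T", re.compile(r"^(time|t)$", re.IGNORECASE)),
    ("Z", re.compile(r"^(lev|level|z|height)$", re.IGNORECASE)),
]


def axis_from_name(name):
    """Guess an axis letter from a coordinate name, or return None."""
    for axis, pattern in NAME_PATTERNS:
        if pattern.match(name):
            return axis
    return None


for name in ("lat", "Longitude", "t", "height", "temperature"):
    print(name, "->", axis_from_name(name))
```

Anchoring each pattern with `^` and `$` keeps a variable such as temperature from being mistaken for a time axis just because its name starts with t.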
Built-in vs cf_xarray¶
| Feature | Built-in | cf_xarray |
|---|---|---|
| Dependencies | None | cf_xarray |
| Speed | Fast | Moderate |
| Standards | Basic | Community-agreed |
| Detection | Simple patterns | Comprehensive criteria |
| Validation | Essential checks | Thorough |
| Ecosystem | Standalone | xarray-compatible |
When to Use Built-in¶
- Quick prototyping
- No external dependencies needed
- Simple datasets
- Fast iteration
When to Use cf_xarray¶
- Production datasets
- Publishing data
- Ecosystem compatibility
- Comprehensive validation
- Following community standards
Workflow Recommendations¶
Development Workflow¶
# During development: use built-in for speed
ds.infer_axis()
ds.set_axis_attributes()
result = ds.validate_cf()
Production Workflow¶
# For production: use cf_xarray for standards
if ds.check_cf_standards_available():
    ds.apply_cf_standards()
    result = ds.validate_cf_metadata(strict=True)
    if not result['valid']:
        print("Fix these issues:")
        for error in result['errors']:
            print(f" - {error}")
else:
    # Fallback to built-in
    ds.infer_axis()
    ds.set_axis_attributes()
Advanced Usage¶
Verbose Mode¶
Strict Validation¶
Check Availability¶
# Check before using
if ds.check_cf_standards_available():
    ds.apply_cf_standards()
else:
    print("Install: pip install cf_xarray")
Integration with Other Tools¶
cf_xarray makes your datasets compatible with:
- xarray - Native integration
- MetPy - Meteorological calculations
- Iris - Climate data analysis
- Cartopy - Map projections
- Dask - Parallel computing
Troubleshooting¶
cf_xarray Not Available¶
# Check if installed
import importlib.util
if importlib.util.find_spec("cf_xarray") is None:
    print("Install: pip install cf_xarray")
Attributes Not Detected¶
If cf_xarray doesn't detect your coordinates:
- Check units are CF-compliant
- Add standard_name attribute
- Use recognized coordinate names
- Add axis attribute explicitly
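A quick way to work through this checklist is to list which of the three attribute hints a coordinate is missing. `missing_hints` below is a hypothetical helper operating on a plain attribute dictionary. It does not check whether the values themselves are CF-compliant.

```python
DETECTION_HINTS = ("units", "standard_name", "axis")


def missing_hints(attrs):
    """Return the detection-related attributes that are absent or empty."""
    return [key for key in DETECTION_HINTS if not attrs.get(key)]


print(missing_hints({"units": "degrees_north"}))
```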
Validation Warnings¶
Common warnings and fixes:
# Warning: Missing standard_name
ds.coords['time'].attrs['standard_name'] = 'time'
# Warning: Missing axis
ds.coords['lat'].attrs['axis'] = 'Y'
# Warning: Non-standard units
ds.coords['lat'].attrs['units'] = 'degrees_north' # Not 'deg N'
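Non-standard unit spellings can be normalized before validation. The alias table in this sketch is an assumption: only 'deg N' comes from the example above, and the other entries are plausible variants added for illustration. `normalize_units` is a hypothetical helper.

```python
# Hypothetical alias table: keys are lower-cased with single spaces
UNIT_ALIASES = {
    "deg n": "degrees_north",
    "degrees n": "degrees_north",
    "deg e": "degrees_east",
    "degrees e": "degrees_east",
}


def normalize_units(units):
    """Map a known alias to its CF spelling; return other strings unchanged."""
    key = " ".join(units.lower().split())  # lower-case and collapse whitespace
    return UNIT_ALIASES.get(key, units)


print(normalize_units("deg N"))
print(normalize_units("K"))
```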
Best Practices¶
- Start minimal - Add basic attributes, let cf_xarray fill the rest
- Use standard names - Follow CF standard name table
- Validate early - Check compliance during development
- Document choices - Use history tracking for decisions
- Test compatibility - Verify with xarray and other tools
Resources¶
See Also¶
- CF Compliance - Built-in CF features
- Validation - Dataset validation
- Examples - More CF standards examples