Loading SACC Data with TwoPointFactory

Authors

Marc Paterno

Sandro Vitenti

Purpose of this Document

This tutorial demonstrates how to load data from SACC files and construct TwoPoint objects using the TwoPointFactory.

For an overview of the factory system, see Two-Point Factory Basics.

Working with SACC Objects

A SACC object provides all components needed for a statistical analysis in Firecrown:

  • Metadata: Layout, data types, binning, tracer names.
  • Calibration data: Redshift distributions \(\mathrm{d}n/\mathrm{d}z\) for each bin.
  • Data: Measurements (e.g., power spectra).
  • Covariance: Uncertainties and correlations.
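To make the relationship between these four components concrete, here is a minimal sketch in plain Python. It does not use the sacc library; ToyTracer, ToyDataPoint, and ToySacc are hypothetical stand-ins invented for illustration, and the numbers are made up. The point is only that each measurement carries metadata naming its tracers, each tracer carries a redshift distribution, and the covariance must be square with one row per measurement.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyTracer:
    """Calibration data for one bin: a sampled redshift distribution dn/dz."""

    name: str
    z: list[float]
    nz: list[float]


@dataclass
class ToyDataPoint:
    """One measurement together with the metadata that locates it."""

    data_type: str  # e.g. "galaxy_shear_xi_plus"
    tracers: tuple[str, str]  # names of the two bins being correlated
    theta: float  # angular separation (binning metadata)
    value: float  # the measured correlation


@dataclass
class ToySacc:
    """Hypothetical container mirroring the four components listed above."""

    tracers: dict[str, ToyTracer] = field(default_factory=dict)
    data: list[ToyDataPoint] = field(default_factory=list)
    covariance: list[list[float]] = field(default_factory=list)

    def is_consistent(self) -> bool:
        """True if all tracers are known and the covariance is N x N."""
        n = len(self.data)
        known = all(t in self.tracers for p in self.data for t in p.tracers)
        square = len(self.covariance) == n and all(
            len(row) == n for row in self.covariance
        )
        return known and square


# Illustrative values only; they are not taken from any real SACC file.
toy = ToySacc()
toy.tracers["src0"] = ToyTracer("src0", z=[0.2, 0.4, 0.6], nz=[0.5, 1.0, 0.5])
toy.data.append(ToyDataPoint("galaxy_shear_xi_plus", ("src0", "src0"), 10.0, 1.2e-5))
toy.data.append(ToyDataPoint("galaxy_shear_xi_plus", ("src0", "src0"), 20.0, 0.8e-5))
toy.covariance = [[1e-12, 0.0], [0.0, 1e-12]]

print(toy.is_consistent())
```

A real SACC object enforces far more than this, but the same consistency requirement (data vector length matching the covariance dimensions) applies.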

Firecrown supports two workflows: the recommended full extraction approach, and a legacy indices-only approach, now deprecated.

Deprecated: Indices-Only Extraction

This approach was used in Firecrown \(\leq 1.7\). Users needed to know the structure of the SACC file a priori and create TwoPoint objects manually.

To reduce this burden, Firecrown introduced a helper to extract tracer pairs and data types from a SACC file:

from firecrown.metadata_functions import extract_all_real_metadata_indices
from firecrown.likelihood.factories import load_sacc_data

# Load the SACC file
sacc_data = load_sacc_data("../tests/sacc_data.hdf5")
# Extract all metadata indices
all_meta = extract_all_real_metadata_indices(sacc_data)

The extracted metadata describes the following two-point correlations:

import pandas as pd
from IPython.display import Markdown

all_meta_table = [
    {
        "bin-x": str(meta["tracer_names"].name1),
        "bin-y": str(meta["tracer_names"].name2),
        "SACC data-type": meta["data_type"],
    }
    for meta in all_meta
]

df = pd.DataFrame(all_meta_table)
Markdown(df.to_markdown(index=False))

bin-x  bin-y  SACC data-type
lens0  lens0  galaxy_density_xi
lens1  lens1  galaxy_density_xi
lens2  lens2  galaxy_density_xi
lens3  lens3  galaxy_density_xi
lens4  lens4  galaxy_density_xi
src0   lens0  galaxy_shearDensity_xi_t
src1   lens0  galaxy_shearDensity_xi_t
src2   lens0  galaxy_shearDensity_xi_t
src3   lens0  galaxy_shearDensity_xi_t
src0   lens1  galaxy_shearDensity_xi_t
src1   lens1  galaxy_shearDensity_xi_t
src2   lens1  galaxy_shearDensity_xi_t
src3   lens1  galaxy_shearDensity_xi_t
src0   lens2  galaxy_shearDensity_xi_t
src1   lens2  galaxy_shearDensity_xi_t
src2   lens2  galaxy_shearDensity_xi_t
src3   lens2  galaxy_shearDensity_xi_t
src0   lens3  galaxy_shearDensity_xi_t
src1   lens3  galaxy_shearDensity_xi_t
src2   lens3  galaxy_shearDensity_xi_t
src3   lens3  galaxy_shearDensity_xi_t
src0   lens4  galaxy_shearDensity_xi_t
src1   lens4  galaxy_shearDensity_xi_t
src2   lens4  galaxy_shearDensity_xi_t
src3   lens4  galaxy_shearDensity_xi_t
src0   src0   galaxy_shear_xi_minus
src0   src1   galaxy_shear_xi_minus
src0   src2   galaxy_shear_xi_minus
src0   src3   galaxy_shear_xi_minus
src1   src1   galaxy_shear_xi_minus
src1   src2   galaxy_shear_xi_minus
src1   src3   galaxy_shear_xi_minus
src2   src2   galaxy_shear_xi_minus
src2   src3   galaxy_shear_xi_minus
src3   src3   galaxy_shear_xi_minus
src0   src0   galaxy_shear_xi_plus
src0   src1   galaxy_shear_xi_plus
src0   src2   galaxy_shear_xi_plus
src0   src3   galaxy_shear_xi_plus
src1   src1   galaxy_shear_xi_plus
src1   src2   galaxy_shear_xi_plus
src1   src3   galaxy_shear_xi_plus
src2   src2   galaxy_shear_xi_plus
src2   src3   galaxy_shear_xi_plus
src3   src3   galaxy_shear_xi_plus
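The layout listed above follows a regular pattern: lens auto-correlations only, every source bin paired with every lens bin, and every unordered source pair for each shear statistic. The sketch below rebuilds that pattern in plain Python as a sanity check on the entry count. It assumes five lens bins and four source bins, as in the listing, and uses a hypothetical ToyTracerNames tuple that mimics the name1 and name2 attributes accessed in the code above. It is not a Firecrown API.

```python
from collections import Counter
from itertools import combinations_with_replacement
from typing import NamedTuple


class ToyTracerNames(NamedTuple):
    """Hypothetical stand-in exposing name1/name2 like the extracted entries."""

    name1: str
    name2: str


lenses = [f"lens{i}" for i in range(5)]
sources = [f"src{i}" for i in range(4)]

layout = []

# Galaxy clustering: each lens bin correlated with itself only.
layout += [
    {"data_type": "galaxy_density_xi", "tracer_names": ToyTracerNames(l, l)}
    for l in lenses
]

# Galaxy-galaxy lensing: every source bin paired with every lens bin.
layout += [
    {"data_type": "galaxy_shearDensity_xi_t", "tracer_names": ToyTracerNames(s, l)}
    for l in lenses
    for s in sources
]

# Cosmic shear: every unordered source pair, for xi_minus and then xi_plus.
for data_type in ("galaxy_shear_xi_minus", "galaxy_shear_xi_plus"):
    layout += [
        {"data_type": data_type, "tracer_names": ToyTracerNames(a, b)}
        for a, b in combinations_with_replacement(sources, 2)
    ]

counts = Counter(entry["data_type"] for entry in layout)
for data_type, n in counts.items():
    print(f"{data_type}: {n}")
print("total:", len(layout))
```

The counts come out as 5, 20, 10, and 10 entries for the four data types, 45 in total, matching the number of rows in the listing.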

Construct the TwoPoint objects using the extracted layout and the factory:

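# Build the factory from its YAML configuration. `two_point_yaml` is assumed
# to hold that configuration string; it is not defined in this section.
tp_factory = base_model_from_yaml(TwoPointFactory, two_point_yaml)
# Create one TwoPoint object per extracted tracer pair and data type
two_point_list = TwoPoint.from_metadata_index(all_meta, tp_factory)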

At this stage, the TwoPoint objects contain only the structural layout (tracer combinations and data types). They are not yet in the ready state, because the remaining metadata (such as binning) and the measurement data have not been attached. To complete the construction, call Statistic.read on each object. Alternatively, if the TwoPoint objects belong to a Likelihood instance, calling Likelihood.read propagates the call to each contained statistic:

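# Wrap the statistics in a Gaussian likelihood with a constant covariance
likelihood = ConstGaussian(two_point_list)
# Read from the SACC object; this also calls read on each TwoPoint
likelihood.read(sacc_data)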

TwoPoint is a subclass of Statistic, so every TwoPoint object is a Statistic, and Likelihood.read delegates to Statistic.read for each of its components.
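The two-phase pattern described here (construct from layout first, attach data later through a delegated read call) can be sketched with toy classes. ToyStatistic and ToyLikelihood below are hypothetical and far simpler than Firecrown's Statistic and Likelihood; a plain dictionary stands in for the SACC object, and the numbers are invented.

```python
class ToyStatistic:
    """Knows its layout at construction; becomes ready only after read()."""

    def __init__(self, data_type, tracers):
        self.data_type = data_type
        self.tracers = tracers
        self.data = None
        self.ready = False

    def read(self, toy_sacc):
        # Look up this statistic's measurements by (data type, tracer pair).
        self.data = toy_sacc[(self.data_type, self.tracers)]
        self.ready = True


class ToyLikelihood:
    """Holds statistics and forwards read() to each of them."""

    def __init__(self, statistics):
        self.statistics = list(statistics)

    def read(self, toy_sacc):
        for stat in self.statistics:
            stat.read(toy_sacc)


# A dictionary stands in for the SACC object; values are illustrative only.
toy_sacc = {
    ("galaxy_density_xi", ("lens0", "lens0")): [0.12, 0.07],
    ("galaxy_shear_xi_plus", ("src0", "src1")): [1.1e-5, 0.6e-5],
}

stats = [ToyStatistic(data_type, tracers) for data_type, tracers in toy_sacc]
ready_before = [s.ready for s in stats]

toy_likelihood = ToyLikelihood(stats)
toy_likelihood.read(toy_sacc)  # one call readies every contained statistic
ready_after = [s.ready for s in stats]

print(ready_before, ready_after)
```

Reading through the likelihood keeps the statistics and the data source in step: no statistic can be left unread by accident.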

Note: This indices-only method is deprecated and kept only for compatibility with older code. For new projects, prefer the full extraction interface based on extract_all_real_data().

Summary

You’ve learned two methods for extracting data from SACC files:

  1. Full Extraction (Recommended): Extract complete measurements with extract_all_real_data() and construct ready-state TwoPoint objects directly
  2. Indices-Only (Deprecated): Extract metadata indices with extract_all_real_metadata_indices(), construct TwoPoint objects, then load data separately

Both produce the same fundamental components:

  • Metadata layouts describing measurement structure
  • TwoPoint objects (via TwoPointFactory) ready for analysis
  • Likelihood objects combining predictions with data

The key outputs from this tutorial are:

  • two_point_reals: Measurements extracted from SACC
  • two_points_ready: TwoPoint objects in ready state
  • likelihood_ready: Complete likelihood with covariance

Next Steps

To compute predictions and evaluate likelihoods:

  • Bin Pair Selectors: Learn how to filter which bin pair combinations are extracted (optional)
  • Factory Basics: Detailed examples of setting up parameters, computing theory vectors, and evaluating likelihoods
  • Scale Cuts: Apply physical scale cuts to your data
  • Integration Methods: Control computation accuracy and speed