def doc_theme():
return theme_minimal() + theme(
panel_grid_minor=element_line(color="gray", linetype="--"),
)Using Firecrown Factories to Initialize Two-Point Objects
version 1.10.0
Purpose of this Document
In the tutorial Two Point Generators we show how to use generators to create the necessary metadata to instantiate TwoPoint objects. In addition to generators, the user has also the option to use data already saved in a SACC object (or file) to initialize the TwoPoint objects. In this tutorial we show how to extract the metadata and data from SACC object and use the metadata and a factory to initialize the objects.
SACC objects
The SACC objects are used to store all necessary information for a statistical analysis. In practice, it stores metadata (data layout, data type, bins, dependent variables, etc), calibration data (redshift distribution per bin \(\mathrm{d}n/\mathrm{d}z\)) and the covariance between all measurements. In the tutorials InferredGalaxyZDist, InferredGalaxyZDist Generators and InferredGalaxyZDist Serialization we show how to use, generate and serialize the InferredGalaxyZDist objects describing the redshift distributions. In this document we extract these components from a SACC object.
Metadata only SACC
Up to the version 1.7 Firecrown relied on a two phase construction for its likelihoods. During the first phase, the Likelihood, Statistics and Systematics objects are created with metadata only. Then, the user must call the read method passing a SACC object to finish the object construction.
One shortcoming of this methodology is that the user must know the SACC object struct beforehand, so they can create the necessary objects. Then, a matching SACC object must be passed in the read phase. To simplify this, we introduced functions to extract the metadata from the SACC file, such that it can be used toguether with the factories to create the objects.
In the code below we extract the metadata indices from the examples/des_y1_3x2pt SACC file.
from firecrown.metadata_functions import extract_all_real_metadata_indices
import sacc
sacc_data = sacc.Sacc.load_fits("../examples/des_y1_3x2pt/sacc_data.fits")
all_meta = extract_all_real_metadata_indices(sacc_data)The metadata can be seen below:
Code
import yaml
from IPython.display import Markdown
all_meta_prune = [
{
"data_type": meta["data_type"],
"tracer1": str(meta["tracer_names"].name1),
"tracer2": str(meta["tracer_names"].name2),
}
for meta in all_meta
]
all_meta_yaml = yaml.safe_dump(all_meta_prune, default_flow_style=False)
Markdown(f"```yaml\n{all_meta_yaml}\n```")- data_type: galaxy_density_xi
tracer1: lens0
tracer2: lens0
- data_type: galaxy_density_xi
tracer1: lens1
tracer2: lens1
- data_type: galaxy_density_xi
tracer1: lens2
tracer2: lens2
- data_type: galaxy_density_xi
tracer1: lens3
tracer2: lens3
- data_type: galaxy_density_xi
tracer1: lens4
tracer2: lens4
- data_type: galaxy_shearDensity_xi_t
tracer1: lens0
tracer2: src0
- data_type: galaxy_shearDensity_xi_t
tracer1: lens0
tracer2: src1
- data_type: galaxy_shearDensity_xi_t
tracer1: lens0
tracer2: src2
- data_type: galaxy_shearDensity_xi_t
tracer1: lens0
tracer2: src3
- data_type: galaxy_shearDensity_xi_t
tracer1: lens1
tracer2: src0
- data_type: galaxy_shearDensity_xi_t
tracer1: lens1
tracer2: src1
- data_type: galaxy_shearDensity_xi_t
tracer1: lens1
tracer2: src2
- data_type: galaxy_shearDensity_xi_t
tracer1: lens1
tracer2: src3
- data_type: galaxy_shearDensity_xi_t
tracer1: lens2
tracer2: src0
- data_type: galaxy_shearDensity_xi_t
tracer1: lens2
tracer2: src1
- data_type: galaxy_shearDensity_xi_t
tracer1: lens2
tracer2: src2
- data_type: galaxy_shearDensity_xi_t
tracer1: lens2
tracer2: src3
- data_type: galaxy_shearDensity_xi_t
tracer1: lens3
tracer2: src0
- data_type: galaxy_shearDensity_xi_t
tracer1: lens3
tracer2: src1
- data_type: galaxy_shearDensity_xi_t
tracer1: lens3
tracer2: src2
- data_type: galaxy_shearDensity_xi_t
tracer1: lens3
tracer2: src3
- data_type: galaxy_shearDensity_xi_t
tracer1: lens4
tracer2: src0
- data_type: galaxy_shearDensity_xi_t
tracer1: lens4
tracer2: src1
- data_type: galaxy_shearDensity_xi_t
tracer1: lens4
tracer2: src2
- data_type: galaxy_shearDensity_xi_t
tracer1: lens4
tracer2: src3
- data_type: galaxy_shear_xi_minus
tracer1: src0
tracer2: src0
- data_type: galaxy_shear_xi_minus
tracer1: src0
tracer2: src1
- data_type: galaxy_shear_xi_minus
tracer1: src0
tracer2: src2
- data_type: galaxy_shear_xi_minus
tracer1: src0
tracer2: src3
- data_type: galaxy_shear_xi_minus
tracer1: src1
tracer2: src1
- data_type: galaxy_shear_xi_minus
tracer1: src1
tracer2: src2
- data_type: galaxy_shear_xi_minus
tracer1: src1
tracer2: src3
- data_type: galaxy_shear_xi_minus
tracer1: src2
tracer2: src2
- data_type: galaxy_shear_xi_minus
tracer1: src2
tracer2: src3
- data_type: galaxy_shear_xi_minus
tracer1: src3
tracer2: src3
- data_type: galaxy_shear_xi_plus
tracer1: src0
tracer2: src0
- data_type: galaxy_shear_xi_plus
tracer1: src0
tracer2: src1
- data_type: galaxy_shear_xi_plus
tracer1: src0
tracer2: src2
- data_type: galaxy_shear_xi_plus
tracer1: src0
tracer2: src3
- data_type: galaxy_shear_xi_plus
tracer1: src1
tracer2: src1
- data_type: galaxy_shear_xi_plus
tracer1: src1
tracer2: src2
- data_type: galaxy_shear_xi_plus
tracer1: src1
tracer2: src3
- data_type: galaxy_shear_xi_plus
tracer1: src2
tracer2: src2
- data_type: galaxy_shear_xi_plus
tracer1: src2
tracer2: src3
- data_type: galaxy_shear_xi_plus
tracer1: src3
tracer2: src3The tracer names above are ones from the SACC object. The user needs to know the tracer names to create the TwoPoint objects. However, now the user can use the metadata to create the objects.
Factories
The factories are used to create the TwoPoint objects. In Firecrown, we write factories using Pydantic, a library to validate and parse data. This gives us the advantage of having a schema for the data, and the user can use the factories to validate the data before creating the objects.
For example, we can organize the factories in a YAML file:
from firecrown.likelihood.two_point import TwoPoint
from firecrown.likelihood.weak_lensing import WeakLensingFactory
from firecrown.likelihood.number_counts import NumberCountsFactory
from firecrown.utils import base_model_from_yaml
weak_lensing_yaml = """
per_bin_systematics:
- type: MultiplicativeShearBiasFactory
- type: PhotoZShiftFactory
global_systematics:
- type: LinearAlignmentSystematicFactory
alphag: 1.0
"""
number_counts_yaml = """
per_bin_systematics:
- type: PhotoZShiftFactory
global_systematics: []
"""
weak_lensing_factory = base_model_from_yaml(WeakLensingFactory, weak_lensing_yaml)
number_counts_factory = base_model_from_yaml(NumberCountsFactory, number_counts_yaml)
two_point_list = TwoPoint.from_metadata_index(
metadata_indices=all_meta,
wl_factory=weak_lensing_factory,
nc_factory=number_counts_factory,
)Creating a Likelihood object
Now we can create a Likelihood object using the TwoPoint objects. Since we are using the metadata only, we need to pass the SACC object to the read method.
from firecrown.likelihood.gaussian import ConstGaussian
likelihood = ConstGaussian(two_point_list)
likelihood.read(sacc_data)Extracting the Metadata and Data
In the previous section, we demonstrated how to extract metadata from a SACC object. Now, we will show how to extract all components from the SACC object, including the metadata, data, and calibration data.
In the upcoming interface for Firecrown, likelihood objects will be created using the complete data. This data may be extracted from a SACC object, generated by a custom generator, or provided in any other manner by the user.
In the code below we extract all components from the examples/des_y1_3x2pt SACC file.
from firecrown.data_functions import (
extract_all_real_data,
check_two_point_consistence_real,
)
from pprint import pprint
two_point_reals = extract_all_real_data(sacc_data)
check_two_point_consistence_real(two_point_reals)The list two_point_reals contains all information about real-space two point functions contained in the SACC object. Therefore, we need to use a different constructor to obtain the two-point objects already in the ready state.
from firecrown.likelihood.two_point import TwoPoint
weak_lensing_factory = base_model_from_yaml(WeakLensingFactory, weak_lensing_yaml)
number_counts_factory = base_model_from_yaml(NumberCountsFactory, number_counts_yaml)
two_points_ready = TwoPoint.from_measurement(
two_point_reals,
wl_factory=weak_lensing_factory,
nc_factory=number_counts_factory,
)Note that we used the same factories to instantiate our TwoPoint objects.
Now, to create the likelihood we need to use a different constructor that support creating likelihoods in ready state.
from firecrown.likelihood.gaussian import ConstGaussian
likelihood_ready = ConstGaussian.create_ready(
two_points_ready, sacc_data.covariance.dense
)Note that, since we are creating a Likelihood in ready state, the constructor requires the statistics covariance.
Comparing results
We can now compare whether the two methods yield the same results. Before we do that, we need to obtain our required parameters.
from firecrown.parameters import ParamsMap
req_params = likelihood.required_parameters()
req_params_ready = likelihood_ready.required_parameters()
assert req_params_ready == req_params
default_values = req_params.get_default_values()
params = ParamsMap(default_values)Note that the required parameters are the same for both likelihoods. Before generating the two-point statistics, we use the following default values:
Code
import yaml
default_values_yaml = yaml.dump(default_values, default_flow_style=False)
Markdown(f"```yaml\n{default_values_yaml}\n```")alphaz: 0.0
ia_bias: 0.5
lens0_bias: 1.5
lens0_delta_z: 0.0
lens1_bias: 1.5
lens1_delta_z: 0.0
lens2_bias: 1.5
lens2_delta_z: 0.0
lens3_bias: 1.5
lens3_delta_z: 0.0
lens4_bias: 1.5
lens4_delta_z: 0.0
src0_delta_z: 0.0
src0_mult_bias: 1.0
src1_delta_z: 0.0
src1_mult_bias: 1.0
src2_delta_z: 0.0
src2_mult_bias: 1.0
src3_delta_z: 0.0
src3_mult_bias: 1.0
z_piv: 0.5Finally, we can prepare both likelihoods and compare the loglike values.
from firecrown.modeling_tools import ModelingTools
from firecrown.ccl_factory import CCLFactory
from firecrown.updatable import get_default_params_map
from firecrown.parameters import ParamsMap
from firecrown.utils import base_model_to_yaml
from firecrown.data_functions import TwoPointBinFilterCollection, TwoPointBinFilter
from firecrown.metadata_types import Galaxies
tools = ModelingTools(ccl_factory=CCLFactory(require_nonlinear_pk=True))
params = get_default_params_map(tools, likelihood)
tools = ModelingTools()
tools.update(params)
tools.prepare()
likelihood.update(params)
likelihood_ready.update(params)Code
print(f"Loglike from metadata only: {likelihood.compute_loglike(tools)}")
print(f"Loglike from ready state: {likelihood_ready.compute_loglike(tools)}")Loglike from metadata only: -2742.739024737394
Loglike from ready state: -2742.739024737394
Filtering Data: Scale-cuts
Real analyses use only a subset of the measured two-points statistics, where the utilized data is typically limited my the accuracy of the models used to fit the data. It is then useful to define the physical scales (corresponding to the data) that should be analyzed in a given likelihood evaluation of two-point statistics. Firecrown can implement this feature though its factories, notably by defining a TwoPointBinFilterCollection object. This object is a collection of TwoPointBinFilter objects, which define the valid data analysis range for a given combination of two-point tracers. For instance, we can define the filtered range of galaxy clustering auto-correlations as follows:
tp_collection = TwoPointBinFilterCollection(
filters=[
TwoPointBinFilter.from_args(
name1=f"lens{i}",
measurement1=Galaxies.COUNTS,
name2=f"lens{i}",
measurement2=Galaxies.COUNTS,
lower=2,
upper=300,
)
for i in range(5)
],
require_filter_for_all=True,
allow_empty=True,
)
Markdown(f"```yaml\n{base_model_to_yaml(tp_collection)}\n```")require_filter_for_all: true
allow_empty: true
filters:
- spec:
- name: lens0
measurement: {subject: Galaxies, property: COUNTS}
- name: lens0
measurement: {subject: Galaxies, property: COUNTS}
interval: [2.0, 300.0]
- spec:
- name: lens1
measurement: {subject: Galaxies, property: COUNTS}
- name: lens1
measurement: {subject: Galaxies, property: COUNTS}
interval: [2.0, 300.0]
- spec:
- name: lens2
measurement: {subject: Galaxies, property: COUNTS}
- name: lens2
measurement: {subject: Galaxies, property: COUNTS}
interval: [2.0, 300.0]
- spec:
- name: lens3
measurement: {subject: Galaxies, property: COUNTS}
- name: lens3
measurement: {subject: Galaxies, property: COUNTS}
interval: [2.0, 300.0]
- spec:
- name: lens4
measurement: {subject: Galaxies, property: COUNTS}
- name: lens4
measurement: {subject: Galaxies, property: COUNTS}
interval: [2.0, 300.0]Equivalently, we may reduce the complexity of the code slightly and specify the use of auto-correlations only:
tp_collection = TwoPointBinFilterCollection(
filters=[
TwoPointBinFilter.from_args_auto(
name=f"lens{i}",
measurement=Galaxies.COUNTS,
lower=2,
upper=300,
)
for i in range(5)
],
require_filter_for_all=True,
allow_empty=True,
)
Markdown(f"```yaml\n{base_model_to_yaml(tp_collection)}\n```")require_filter_for_all: true
allow_empty: true
filters:
- spec:
- name: lens0
measurement: {subject: Galaxies, property: COUNTS}
interval: [2.0, 300.0]
- spec:
- name: lens1
measurement: {subject: Galaxies, property: COUNTS}
interval: [2.0, 300.0]
- spec:
- name: lens2
measurement: {subject: Galaxies, property: COUNTS}
interval: [2.0, 300.0]
- spec:
- name: lens3
measurement: {subject: Galaxies, property: COUNTS}
interval: [2.0, 300.0]
- spec:
- name: lens4
measurement: {subject: Galaxies, property: COUNTS}
interval: [2.0, 300.0]One may alternatively define the tracers directly (instead of from arguments) as TwoPointTracerSpec objects.
A TwoPointExperiment object is able to keep track of the relevant Factory instances to generate the two-point configurations of the analysis (either in configuration or harmonic space) and the scale-cut/data filtering choices to evaluate a defined likelihood. The interpretation of the filtered lower and upper limits of the data depend on the definition of the TwoPointExperiment factories in either configuration or harmonic space.
With this formalism, we are able to evaluate the likelihood exactly as the previous section by defining filters to be very wide. Alternatively, by setting a restrictively small filtered range, we can remove data from the analysis and do so in the example below by filtering-out all galaxy clustering data.
from firecrown.likelihood.factories import (
DataSourceSacc,
TwoPointCorrelationSpace,
TwoPointExperiment,
TwoPointFactory,
)
tpf = TwoPointFactory(
correlation_space=TwoPointCorrelationSpace.REAL,
weak_lensing_factory=weak_lensing_factory,
number_counts_factory=number_counts_factory,
)
two_point_experiment = TwoPointExperiment(
two_point_factory=tpf,
data_source=DataSourceSacc(
sacc_data_file="../examples/des_y1_3x2pt/sacc_data.fits",
filters=TwoPointBinFilterCollection(
require_filter_for_all=False,
allow_empty=True,
filters=[
TwoPointBinFilter.from_args_auto(
name=f"lens{i}",
measurement=Galaxies.COUNTS,
lower=0.5,
upper=300,
)
for i in range(5)
],
),
),
)
two_point_experiment_filtered = TwoPointExperiment(
two_point_factory=tpf,
data_source=DataSourceSacc(
sacc_data_file="../examples/des_y1_3x2pt/sacc_data.fits",
filters=TwoPointBinFilterCollection(
require_filter_for_all=False,
allow_empty=True,
filters=[
TwoPointBinFilter.from_args_auto(
name=f"lens{i}",
measurement=Galaxies.COUNTS,
lower=2999,
upper=3000,
)
for i in range(5)
],
),
),
)The TwoPointExperiment objects can also be used to create likelihoods in the ready state. Additionally, they can be serialized into a yaml file, making it easier to share specific analysis choices with other users and collaborators.
The yaml below shows the first experiment.
Code
Markdown(f"```yaml\n{base_model_to_yaml(two_point_experiment)}\n```")two_point_factory:
correlation_space: REAL
weak_lensing_factory:
per_bin_systematics:
- {type: MultiplicativeShearBiasFactory}
- {type: PhotoZShiftFactory}
global_systematics:
- {type: LinearAlignmentSystematicFactory, alphag: 1.0}
number_counts_factory:
per_bin_systematics:
- {type: PhotoZShiftFactory}
global_systematics: []
include_rsd: false
data_source:
sacc_data_file: ../examples/des_y1_3x2pt/sacc_data.fits
filters:
require_filter_for_all: false
allow_empty: true
filters:
- spec:
- name: lens0
measurement: {subject: Galaxies, property: COUNTS}
interval: [0.5, 300.0]
- spec:
- name: lens1
measurement: {subject: Galaxies, property: COUNTS}
interval: [0.5, 300.0]
- spec:
- name: lens2
measurement: {subject: Galaxies, property: COUNTS}
interval: [0.5, 300.0]
- spec:
- name: lens3
measurement: {subject: Galaxies, property: COUNTS}
interval: [0.5, 300.0]
- spec:
- name: lens4
measurement: {subject: Galaxies, property: COUNTS}
interval: [0.5, 300.0]
ccl_factory: {require_nonlinear_pk: false, amplitude_parameter: SIGMA8, mass_split: NORMAL,
creation_mode: DEFAULT, camb_extra_params: null, ccl_spline_params: null, parameter_prefix: null}The yaml below shows the second experiment.
Code
Markdown(f"```yaml\n{base_model_to_yaml(two_point_experiment_filtered)}\n```")two_point_factory:
correlation_space: REAL
weak_lensing_factory:
per_bin_systematics:
- {type: MultiplicativeShearBiasFactory}
- {type: PhotoZShiftFactory}
global_systematics:
- {type: LinearAlignmentSystematicFactory, alphag: 1.0}
number_counts_factory:
per_bin_systematics:
- {type: PhotoZShiftFactory}
global_systematics: []
include_rsd: false
data_source:
sacc_data_file: ../examples/des_y1_3x2pt/sacc_data.fits
filters:
require_filter_for_all: false
allow_empty: true
filters:
- spec:
- name: lens0
measurement: {subject: Galaxies, property: COUNTS}
interval: [2999.0, 3000.0]
- spec:
- name: lens1
measurement: {subject: Galaxies, property: COUNTS}
interval: [2999.0, 3000.0]
- spec:
- name: lens2
measurement: {subject: Galaxies, property: COUNTS}
interval: [2999.0, 3000.0]
- spec:
- name: lens3
measurement: {subject: Galaxies, property: COUNTS}
interval: [2999.0, 3000.0]
- spec:
- name: lens4
measurement: {subject: Galaxies, property: COUNTS}
interval: [2999.0, 3000.0]
ccl_factory: {require_nonlinear_pk: false, amplitude_parameter: SIGMA8, mass_split: NORMAL,
creation_mode: DEFAULT, camb_extra_params: null, ccl_spline_params: null, parameter_prefix: null}Next, we can create likelihoods from the TwoPointExperiment objects and compare the loglike values.
likelihood_tpe = two_point_experiment.make_likelihood()
params = get_default_params_map(tools, likelihood_tpe)
tools = ModelingTools()
tools.update(params)
tools.prepare()
likelihood_tpe.update(params)
likelihood_tpe_filtered = two_point_experiment_filtered.make_likelihood()
params = get_default_params_map(tools, likelihood_tpe_filtered)
tools = ModelingTools()
tools.update(params)
tools.prepare()
likelihood_tpe_filtered.update(params)Code
print(f"Loglike from metadata only: {likelihood.compute_loglike(tools)}")
print(f"Loglike from ready state: {likelihood_ready.compute_loglike(tools)}")
print(f"Loglike from TwoPointExperiment: {likelihood_tpe.compute_loglike(tools)}")
print(f"Loglike from filtered TwoPointExperiment: {likelihood_tpe_filtered.compute_loglike(tools)}")Loglike from metadata only: -2742.739024737394
Loglike from ready state: -2742.739024737394
Loglike from TwoPointExperiment: -2742.739024737394
Loglike from filtered TwoPointExperiment: -2579.948781562013