Scale Cuts and Data Filtering
Purpose of this Document
This tutorial demonstrates how to apply physical scale cuts to two-point statistics using TwoPointBinFilterCollection. Scale cuts are essential for limiting analyses to scales where theoretical models are accurate.
For background on the factory system, including how to define systematics factories, see Two-Point Factory Basics. For loading SACC data, see Loading SACC Data.
Filtering Data: Scale Cuts
Real analyses use only a subset of the measured two-point statistics, typically the scales on which the models used to fit the data are accurate. It is therefore useful to define the physical scales that a given likelihood evaluation of two-point statistics should include. Firecrown implements this feature through its factories, notably by defining a TwoPointBinFilterCollection object.
This object is a collection of TwoPointBinFilter objects, each of which defines the valid data analysis range for a given combination of two-point tracers. For instance, we can define the filtered range of galaxy clustering auto-correlations as follows:
from firecrown.data_functions import TwoPointBinFilterCollection, TwoPointBinFilter
from firecrown.metadata_types import Galaxies
from firecrown.utils import base_model_to_yaml
from IPython.display import Markdown

tp_collection = TwoPointBinFilterCollection(
    filters=[
        TwoPointBinFilter.from_args(
            name1=f"lens{i}",
            measurement1=Galaxies.COUNTS,
            name2=f"lens{i}",
            measurement2=Galaxies.COUNTS,
            lower=2,
            upper=300,
        )
        for i in range(5)
    ],
    require_filter_for_all=True,
    allow_empty=True,
)
Markdown(f"```yaml\n{base_model_to_yaml(tp_collection)}\n```")

require_filter_for_all: true
allow_empty: true
filters:
- spec:
  - name: lens0
    measurement: {subject: Galaxies, property: COUNTS}
  - name: lens0
    measurement: {subject: Galaxies, property: COUNTS}
  interval: [2.0, 300.0]
  method: support
- spec:
  - name: lens1
    measurement: {subject: Galaxies, property: COUNTS}
  - name: lens1
    measurement: {subject: Galaxies, property: COUNTS}
  interval: [2.0, 300.0]
  method: support
- spec:
  - name: lens2
    measurement: {subject: Galaxies, property: COUNTS}
  - name: lens2
    measurement: {subject: Galaxies, property: COUNTS}
  interval: [2.0, 300.0]
  method: support
- spec:
  - name: lens3
    measurement: {subject: Galaxies, property: COUNTS}
  - name: lens3
    measurement: {subject: Galaxies, property: COUNTS}
  interval: [2.0, 300.0]
  method: support
- spec:
  - name: lens4
    measurement: {subject: Galaxies, property: COUNTS}
  - name: lens4
    measurement: {subject: Galaxies, property: COUNTS}
  interval: [2.0, 300.0]
  method: support

Equivalently, we may reduce the complexity of the code slightly and specify the use of auto-correlations only:
tp_collection = TwoPointBinFilterCollection(
    filters=[
        TwoPointBinFilter.from_args_auto(
            name=f"lens{i}",
            measurement=Galaxies.COUNTS,
            lower=2,
            upper=300,
        )
        for i in range(5)
    ],
    require_filter_for_all=True,
    allow_empty=True,
)
Markdown(f"```yaml\n{base_model_to_yaml(tp_collection)}\n```")

require_filter_for_all: true
allow_empty: true
filters:
- spec:
  - name: lens0
    measurement: {subject: Galaxies, property: COUNTS}
  interval: [2.0, 300.0]
  method: support
- spec:
  - name: lens1
    measurement: {subject: Galaxies, property: COUNTS}
  interval: [2.0, 300.0]
  method: support
- spec:
  - name: lens2
    measurement: {subject: Galaxies, property: COUNTS}
  interval: [2.0, 300.0]
  method: support
- spec:
  - name: lens3
    measurement: {subject: Galaxies, property: COUNTS}
  interval: [2.0, 300.0]
  method: support
- spec:
  - name: lens4
    measurement: {subject: Galaxies, property: COUNTS}
  interval: [2.0, 300.0]
  method: support

One may alternatively define the tracers directly (instead of from arguments) as TwoPointTracerSpec objects.
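To make the roles of the interval and of the two collection flags more concrete, here is a minimal pure-Python sketch of interval filtering. It does not use Firecrown: the function `apply_filters`, the dictionary layout, the inclusive bounds, and the flag semantics it implements are assumptions chosen for illustration. In this sketch a bin without a matching filter is kept unless `require_filter_for_all` is set (in which case an error is raised), and a bin left empty by its filter is dropped only when `allow_empty` is set. The Firecrown API reference remains the authority on the actual behavior.

```python
def apply_filters(bins, filters, require_filter_for_all=False, allow_empty=False):
    """Hypothetical stand-in for a filter collection (not the Firecrown API).

    bins:    maps a tracer pair to a list of (scale, value) points.
    filters: maps a tracer pair to an inclusive (lower, upper) interval.
    """
    kept = {}
    for pair, points in bins.items():
        if pair not in filters:
            if require_filter_for_all:
                raise ValueError(f"no filter defined for {pair}")
            kept[pair] = list(points)  # bins without a filter pass through
            continue
        lower, upper = filters[pair]
        selected = [(x, v) for x, v in points if lower <= x <= upper]
        if not selected:
            if not allow_empty:
                raise ValueError(f"filter removed every point of {pair}")
            continue  # drop the bin that the filter emptied
        kept[pair] = selected
    return kept


# Made-up data: two tracer pairs, each with a few (scale, value) points.
bins = {
    ("lens0", "lens0"): [(1.0, 0.9), (50.0, 0.4), (500.0, 0.1)],
    ("src0", "src0"): [(10.0, 0.2), (100.0, 0.1)],
}
filters = {("lens0", "lens0"): (2.0, 300.0)}

result = apply_filters(bins, filters, allow_empty=True)
print(result[("lens0", "lens0")])  # only the point at scale 50.0 survives
```

Replacing the interval with `(2999.0, 3000.0)` removes every lens point in this sketch, so the `("lens0", "lens0")` entry disappears from the result while the unfiltered `("src0", "src0")` entry is untouched.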
Using TwoPointExperiment
A TwoPointExperiment object keeps track of the Factory instances that generate the two-point configurations of the analysis (in either configuration or harmonic space) and of the scale cuts (data filters) applied when evaluating the likelihood. The interpretation of the lower and upper filter limits depends on whether the TwoPointExperiment factories are defined in configuration or harmonic space.
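As a rough illustration of why the choice of space matters, the sketch below applies the same numeric interval to a list of multipoles and to a list of angular separations. It is plain Python, not Firecrown code; the helper `select`, the sample values, the inclusive bounds, and the arcminute unit for the separations are all assumptions made for this example.

```python
def select(scales, lower, upper):
    """Keep the scales inside the inclusive interval [lower, upper]."""
    return [s for s in scales if lower <= s <= upper]


# Harmonic space: the interval is read as a range of multipoles.
ells = [10, 50, 100, 500, 1000]
# Configuration space: the interval is read as a range of angular
# separations (assumed here to be in arcminutes).
thetas = [0.5, 2.5, 10.0, 60.0, 250.0, 400.0]

print(select(ells, 2, 300))    # [10, 50, 100]
print(select(thetas, 2, 300))  # [2.5, 10.0, 60.0, 250.0]
```

Because large multipoles correspond to small angular separations, an upper limit removes the smallest angular scales in harmonic space but the largest ones in configuration space.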
With this formalism, we can evaluate the likelihood exactly as in the Loading SACC Data tutorial by making the filters very wide. Conversely, a restrictively small filtered range removes data from the analysis. The example below builds two experiments: one with a wide range (0.5 to 300), and one that filters out all galaxy clustering data by using an extremely narrow range (2999 to 3000).
from firecrown.likelihood import TwoPointFactory
from firecrown.likelihood.factories import (
    DataSourceSacc,
    TwoPointExperiment,
)
from firecrown.utils import base_model_from_yaml

two_point_yaml = """
correlation_space: real
weak_lensing_factories:
  - type_source: default
    per_bin_systematics:
      - type: MultiplicativeShearBiasFactory
      - type: PhotoZShiftFactory
    global_systematics:
      - type: LinearAlignmentSystematicFactory
        alphag: 1.0
number_counts_factories:
  - type_source: default
    per_bin_systematics:
      - type: PhotoZShiftFactory
    global_systematics: []
"""
tpf = base_model_from_yaml(TwoPointFactory, two_point_yaml)

# First experiment: wide range (0.5 to 300) for the lens auto-correlations.
two_point_experiment = TwoPointExperiment(
    two_point_factory=tpf,
    data_source=DataSourceSacc(
        sacc_data_file="../tests/sacc_data.hdf5",
        filters=TwoPointBinFilterCollection(
            require_filter_for_all=False,
            allow_empty=True,
            filters=[
                TwoPointBinFilter.from_args_auto(
                    name=f"lens{i}",
                    measurement=Galaxies.COUNTS,
                    lower=0.5,
                    upper=300,
                )
                for i in range(5)
            ],
        ),
    ),
)

# Second experiment: extremely narrow range (2999 to 3000).
two_point_experiment_filtered = TwoPointExperiment(
    two_point_factory=tpf,
    data_source=DataSourceSacc(
        sacc_data_file="../tests/sacc_data.hdf5",
        filters=TwoPointBinFilterCollection(
            require_filter_for_all=False,
            allow_empty=True,
            filters=[
                TwoPointBinFilter.from_args_auto(
                    name=f"lens{i}",
                    measurement=Galaxies.COUNTS,
                    lower=2999,
                    upper=3000,
                )
                for i in range(5)
            ],
        ),
    ),
)

Serializing TwoPointExperiment
The TwoPointExperiment objects can also be used to create likelihoods in the ready state. Additionally, they can be serialized into a YAML file, making it easier to share specific analysis choices with other users and collaborators.
The YAML below shows the first experiment:
Markdown(f"```yaml\n{base_model_to_yaml(two_point_experiment)}\n```")

two_point_factory:
correlation_space: real
weak_lensing_factories:
- type_source: default
per_bin_systematics:
- {type: MultiplicativeShearBiasFactory}
- {type: PhotoZShiftFactory}
global_systematics:
- {type: LinearAlignmentSystematicFactory, alphag: 1.0}
number_counts_factories:
- type_source: default
per_bin_systematics:
- {type: PhotoZShiftFactory}
global_systematics: []
include_rsd: false
cmb_factories: []
int_options: null
data_source:
sacc_data_file: ../tests/sacc_data.hdf5
filters:
require_filter_for_all: false
allow_empty: true
filters:
- spec:
- name: lens0
measurement: {subject: Galaxies, property: COUNTS}
interval: [0.5, 300.0]
method: support
- spec:
- name: lens1
measurement: {subject: Galaxies, property: COUNTS}
interval: [0.5, 300.0]
method: support
- spec:
- name: lens2
measurement: {subject: Galaxies, property: COUNTS}
interval: [0.5, 300.0]
method: support
- spec:
- name: lens3
measurement: {subject: Galaxies, property: COUNTS}
interval: [0.5, 300.0]
method: support
- spec:
- name: lens4
measurement: {subject: Galaxies, property: COUNTS}
interval: [0.5, 300.0]
method: support
normalize_window: true
ccl_factory: {require_nonlinear_pk: false, amplitude_parameter: sigma8, mass_split: normal,
num_neutrino_masses: null, creation_mode: default, pure_ccl_transfer_function: boltzmann_camb,
use_camb_hm_sampling: false, allow_multiple_camb_instances: false, camb_extra_params: null,
ccl_spline_params: null, parameter_prefix: null}

The YAML below shows the second experiment:
Markdown(f"```yaml\n{base_model_to_yaml(two_point_experiment_filtered)}\n```")

two_point_factory:
correlation_space: real
weak_lensing_factories:
- type_source: default
per_bin_systematics:
- {type: MultiplicativeShearBiasFactory}
- {type: PhotoZShiftFactory}
global_systematics:
- {type: LinearAlignmentSystematicFactory, alphag: 1.0}
number_counts_factories:
- type_source: default
per_bin_systematics:
- {type: PhotoZShiftFactory}
global_systematics: []
include_rsd: false
cmb_factories: []
int_options: null
data_source:
sacc_data_file: ../tests/sacc_data.hdf5
filters:
require_filter_for_all: false
allow_empty: true
filters:
- spec:
- name: lens0
measurement: {subject: Galaxies, property: COUNTS}
interval: [2999.0, 3000.0]
method: support
- spec:
- name: lens1
measurement: {subject: Galaxies, property: COUNTS}
interval: [2999.0, 3000.0]
method: support
- spec:
- name: lens2
measurement: {subject: Galaxies, property: COUNTS}
interval: [2999.0, 3000.0]
method: support
- spec:
- name: lens3
measurement: {subject: Galaxies, property: COUNTS}
interval: [2999.0, 3000.0]
method: support
- spec:
- name: lens4
measurement: {subject: Galaxies, property: COUNTS}
interval: [2999.0, 3000.0]
method: support
normalize_window: true
ccl_factory: {require_nonlinear_pk: false, amplitude_parameter: sigma8, mass_split: normal,
num_neutrino_masses: null, creation_mode: default, pure_ccl_transfer_function: boltzmann_camb,
use_camb_hm_sampling: false, allow_multiple_camb_instances: false, camb_extra_params: null,
ccl_spline_params: null, parameter_prefix: null}

Creating and Comparing Likelihoods
Next, we create likelihoods from the TwoPointExperiment objects and compare the loglike values.
from firecrown.modeling_tools import ModelingTools
from firecrown.updatable import get_default_params_map

# Unfiltered experiment: build the likelihood and update it with default parameters.
tools = ModelingTools()
likelihood_tpe = two_point_experiment.make_likelihood()
params = get_default_params_map(tools, likelihood_tpe)
likelihood_tpe.update(params)
tools.update(params)
tools.prepare()

# Filtered experiment: same steps with a fresh ModelingTools instance.
# This rebinds `tools`, so the instance created here is the one passed to
# compute_loglike for both likelihoods.
likelihood_tpe_filtered = two_point_experiment_filtered.make_likelihood()
tools = ModelingTools()
params = get_default_params_map(tools, likelihood_tpe_filtered)
tools.update(params)
tools.prepare()
likelihood_tpe_filtered.update(params)

Compare the log-likelihood values:
print(f"Loglike from TwoPointExperiment: {likelihood_tpe.compute_loglike(tools)}")
print(
    f"Loglike from filtered TwoPointExperiment: {likelihood_tpe_filtered.compute_loglike(tools)}"
)

Loglike from TwoPointExperiment: -2742.739024737394
Loglike from filtered TwoPointExperiment: -2579.948781562013
The filtered likelihood gives a different value because the excluded data points no longer contribute to it.
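The effect of removing data points on a Gaussian log-likelihood can be sketched in a few lines. The example below is a toy model under stated assumptions, not the Firecrown computation: it assumes uncorrelated data points (a diagonal covariance), and the function `gaussian_loglike`, the helper `cut`, and all numbers are made up for illustration.

```python
def gaussian_loglike(data, theory, sigma):
    """-chi^2 / 2 for uncorrelated data points (diagonal covariance assumed)."""
    return -0.5 * sum(((d - t) / s) ** 2 for d, t, s in zip(data, theory, sigma))


def cut(values, scales, lower, upper):
    """Keep the entries whose scale lies inside the inclusive interval."""
    return [v for v, x in zip(values, scales) if lower <= x <= upper]


# Made-up data vector, model prediction, and uncertainties at three scales.
scales = [1.0, 50.0, 500.0]
data = [1.0, 0.75, 0.25]
theory = [0.5, 0.5, 0.0]
sigma = [0.25, 0.25, 0.25]

full = gaussian_loglike(data, theory, sigma)
filtered = gaussian_loglike(
    cut(data, scales, 2.0, 300.0),
    cut(theory, scales, 2.0, 300.0),
    cut(sigma, scales, 2.0, 300.0),
)
print(full)      # -3.0
print(filtered)  # -0.5
```

In this toy case each removed point drops a non-negative term from the chi-square, so the filtered log-likelihood is closer to zero. With a full covariance matrix the change depends on the correlations as well.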
Next Steps
- Integration Methods: Learn how to control integration methods for two-point functions