autofit.Nautilus#

class Nautilus[source]#

Bases: AbstractNest

A Nautilus non-linear search.

Nautilus is an optional requirement and must be installed manually via the command pip install nautilus-sampler. It is optional as it has certain dependencies which are generally straightforward to install (e.g. Cython).

For a full description of Nautilus check out its GitHub and documentation webpages:

https://github.com/johannesulf/nautilus

https://nautilus-sampler.readthedocs.io/en/stable/index.html

Parameters:
  • name (Optional[str]) – The name of the search, which controls the final folder that results are output to.

  • path_prefix (Optional[str]) – The path of folders prefixing the name folder where results are output.

  • unique_tag (Optional[str]) – The name of a unique tag for this model-fit, which will be given a unique entry in the sqlite database and also acts as the folder after the path prefix and before the search name.

  • iterations_per_update (Optional[int]) – The number of iterations performed between updates (e.g. outputting the latest model to hard-disk, visualization).

  • number_of_cores (Optional[int]) – The number of cores over which sampling is parallelized using a Python multiprocessing Pool instance.

  • session (Optional[Session]) – An SQLalchemy session instance so the results of the model-fit are written to an SQLite database.
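The name, path_prefix and unique_tag inputs above compose the folder structure that results are written to. A minimal sketch of that composition (a hypothetical helper for illustration; the real path handling lives in autofit's Paths objects):

```python
def output_path(path_prefix, unique_tag, name):
    """Sketch: results land in output/<path_prefix>/<unique_tag>/<name>,
    with any component that is None simply omitted."""
    parts = [part for part in (path_prefix, unique_tag, name) if part is not None]
    return "/".join(["output"] + parts)

print(output_path("searches", "dataset_1", "nautilus_fit"))
# output/searches/dataset_1/nautilus_fit
print(output_path(None, None, "nautilus_fit"))
# output/nautilus_fit
```

The unique_tag sits between the path prefix and the search name, matching its description above.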

Methods

call_search

The x1 CPU and multiprocessing searches both call this function to perform the non-linear search.

check_model

config_dict_test_mode_from

Returns a configuration dictionary for test mode meaning that the sampler terminates as quickly as possible.

copy_with_paths

exact_fit

rtype:

Tuple[MeanField, Status]

fit

Fit a model, M, with some function f that takes instances of the class represented by model M and returns a score for their fitness.

fit_mpi

Perform the non-linear search, using MPI to distribute the model-fit across multiple computing nodes.

fit_multiprocessing

Perform the non-linear search, using multiple CPU cores parallelized via Python's multiprocessing module.

fit_sequential

Fit multiple analyses contained within the analysis sequentially.

fit_x1_cpu

Perform the non-linear search, using one CPU core.

iterations_from

Returns the next number of iterations that a Nautilus call will use and the total number of iterations that have been performed so far.

make_pool

Make the pool instance used to parallelize a NonLinearSearch alongside a set of unique ids for every process in the pool.

make_sneakier_pool

rtype:

SneakierPool

make_sneaky_pool

Create a pool for multiprocessing that uses sleight-of-hand to avoid copying the fitness function between processes multiple times.

optimise

Perform optimisation for expectation propagation.

output_search_internal

Output the sampler results to hard-disk in their internal format.

perform_update

Perform an update of the non-linear search's model-fitting results.

perform_visualization

Perform visualization of the non-linear search's model-fitting results.

plot_results

post_fit_output

Cleans up the output folders after a completed non-linear search.

pre_fit_output

Outputs attributes of fit before the non-linear search begins.

remove_state_files

result_via_completed_fit

Returns the result of the non-linear search of a completed model-fit.

samples_from

Loads the samples of a non-linear search from its output files.

samples_info_from

samples_via_csv_from

Returns a Samples object from the samples.csv and samples_info.json files.

samples_via_internal_from

Returns a Samples object from the Nautilus internal results.

start_resume_fit

Start a non-linear search from scratch, or resume one which was previously terminated mid-way through.

Attributes

checkpoint_file

The path to the file used for checkpointing.

config_dict

config_dict_run

A property that is only computed once per instance and then replaces itself with an ordinary attribute.

config_dict_search

A property that is only computed once per instance and then replaces itself with an ordinary attribute.

config_dict_settings

rtype:

Dict

config_type

logger

Log 'msg % args' with severity 'DEBUG'.

name

paths

rtype:

Optional[AbstractPaths]

plotter_cls

sampler_cls

samples_cls

timer

Returns the timer of the search, which is used to output information such as how long the search took and how much parallelization sped up the search time.

using_mpi

Whether the search is being performed using MPI for parallelisation or not.
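Several of the config_dict_* attributes above are described as "a property that is only computed once per instance and then replaces itself with an ordinary attribute". That is the standard memoizing-descriptor pattern; a minimal, self-contained sketch of how it works (not autofit's actual implementation, and the config values are made up):

```python
class cached_property:
    """A non-data descriptor: the wrapped method runs once, and its result
    then replaces the descriptor in the instance's __dict__ as an ordinary
    attribute, so later accesses never call the method again."""

    def __init__(self, func):
        self.func = func

    def __get__(self, obj, cls=None):
        if obj is None:
            return self
        value = self.func(obj)
        obj.__dict__[self.func.__name__] = value  # shadows the descriptor
        return value


class Search:
    calls = 0  # counts how many times the property body actually runs

    @cached_property
    def config_dict_run(self):
        Search.calls += 1
        return {"n_live": 200}  # hypothetical config entry


s = Search()
s.config_dict_run
s.config_dict_run
print(Search.calls)  # 1 — the body ran only once
```

Because the descriptor defines only __get__, the instance attribute written on first access takes precedence on every subsequent lookup.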

property checkpoint_file#

The path to the file used for checkpointing.

If autofit is not outputting results to hard-disk (e.g. paths is NullPaths), this function is bypassed.

fit_x1_cpu(fitness, model, analysis)[source]#

Perform the non-linear search, using one CPU core.

This is used if the likelihood function calls external libraries that cannot be parallelized or use threading in a way that conflicts with the parallelization of the non-linear search.

Parameters:
  • fitness – The function which takes a model instance and returns its log likelihood via the Analysis class

  • model – The model which maps parameters chosen via the non-linear search (e.g. via the priors or sampling) to instances of the model, which are passed to the fitness function.

  • analysis – Contains the data and the log likelihood function which fits an instance of the model to the data, returning the log likelihood the search maximizes.

fit_multiprocessing(fitness, model, analysis)[source]#

Perform the non-linear search, using multiple CPU cores parallelized via Python’s multiprocessing module.

This uses PyAutoFit’s sneaky pool class, which allows us to use the multiprocessing module in a way that plays nicely with the non-linear search (e.g. exception handling, keyboard interrupts, etc.).

Multiprocessing parallelization can only parallelize across multiple cores on a single device, it cannot be distributed across multiple devices or computing nodes. For that, use the fit_mpi method.

Parameters:
  • fitness – The function which takes a model instance and returns its log likelihood via the Analysis class

  • model – The model which maps parameters chosen via the non-linear search (e.g. via the priors or sampling) to instances of the model, which are passed to the fitness function.

  • analysis – Contains the data and the log likelihood function which fits an instance of the model to the data, returning the log likelihood the search maximizes.

call_search(search_internal, model, analysis)[source]#

The x1 CPU and multiprocessing searches both call this function to perform the non-linear search.

This function calls the search a reduced number of times, corresponding to the iterations_per_update of the search. This allows the search to output results on-the-fly, for example writing to the hard-disk the latest model and samples.

It tracks how often to do this update alongside the maximum number of iterations the search will perform. This ensures that on-the-fly output is performed at regular intervals and that the search does not perform more iterations than the n_like_max input variable.

Parameters:
  • search_internal – The single CPU or multiprocessing search which is run and performs nested sampling.

  • model – The model which maps parameters chosen via the non-linear search (e.g. via the priors or sampling) to instances of the model, which are passed to the fitness function.

  • analysis – Contains the data and the log likelihood function which fits an instance of the model to the data, returning the log likelihood the search maximizes.
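The batching behaviour described above can be sketched as a simple loop (hypothetical, simplified): the sampler is run in chunks of iterations_per_update, on-the-fly output happens after each chunk, and the total never exceeds n_like_max.

```python
def run_in_batches(iterations_per_update, n_like_max):
    """Sketch of the update loop: returns the sequence of batch sizes the
    search would run, each followed by on-the-fly output."""
    iterations_performed = 0
    batches = []
    while iterations_performed < n_like_max:
        # Never overshoot the n_like_max cap on likelihood evaluations.
        iterations = min(iterations_per_update, n_like_max - iterations_performed)
        # ... here the real code would call the internal sampler, then
        # output the latest model and samples to hard-disk ...
        iterations_performed += iterations
        batches.append(iterations)
    return batches


print(run_in_batches(iterations_per_update=500, n_like_max=1200))
# [500, 500, 200]
```

The final, shorter batch is what keeps the search from performing more iterations than the n_like_max input variable.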

fit_mpi(fitness, model, analysis, checkpoint_exists)[source]#

Perform the non-linear search, using MPI to distribute the model-fit across multiple computing nodes.

This uses PyAutoFit’s sneaky pool class, which allows us to use the multiprocessing module in a way that plays nicely with the non-linear search (e.g. exception handling, keyboard interrupts, etc.).

MPI parallelization can be distributed across multiple devices or computing nodes.

Parameters:
  • fitness – The function which takes a model instance and returns its log likelihood via the Analysis class

  • model – The model which maps parameters chosen via the non-linear search (e.g. via the priors or sampling) to instances of the model, which are passed to the fitness function.

  • analysis – Contains the data and the log likelihood function which fits an instance of the model to the data, returning the log likelihood the search maximizes.

  • checkpoint_exists (bool) – Does the checkpoint file corresponding to a previous run of this search exist?

iterations_from(search_internal)[source]#

Returns the next number of iterations that a Nautilus call will use and the total number of iterations that have been performed so far.

This is used so that the iterations_per_update input leads to on-the-fly output of results.

It also ensures the sampler does not perform more samples than the n_like_max input variable.

Parameters:

search_internal – The Nautilus sampler which is run and performs nested sampling.

Return type:

Tuple[int, int]

Returns:

The next number of iterations that the sampler will perform and the total number of iterations it has performed so far.

output_search_internal(search_internal)[source]#

Output the sampler results to hard-disk in their internal format.

The multiprocessing Pool object cannot be pickled and thus the sampler cannot be saved to hard-disk. This function therefore extracts the necessary information from the sampler and saves it to hard-disk.

Parameters:

search_internal – The Nautilus sampler object containing the results of the model-fit.
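The strip-then-serialize step described above can be sketched in plain Python, using a toy sampler class and a lambda as a stand-in for the unpicklable Pool (none of these names are autofit's):

```python
import copy
import pickle


class FakeSampler:
    """Toy stand-in for the internal sampler: holds an unpicklable pool
    alongside picklable results."""

    def __init__(self):
        self.pool = lambda: None  # lambdas cannot be pickled, like a Pool
        self.results = {"log_evidence": -42.0}  # made-up result


def output_search_internal(sampler):
    # Shallow-copy and strip the unpicklable pool before serialising,
    # so the remaining results can be written to hard-disk.
    stripped = copy.copy(sampler)
    stripped.pool = None
    return pickle.dumps(stripped)


restored = pickle.loads(output_search_internal(FakeSampler()))
print(restored.results)  # {'log_evidence': -42.0}
```

Working on a copy leaves the live sampler (and its pool) untouched while the saved version round-trips through pickle cleanly.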

samples_via_internal_from(model, search_internal=None)[source]#

Returns a Samples object from the Nautilus internal results.

The samples contain all information on the parameter space sampling (e.g. the parameters, log likelihoods, etc.).

The internal search results are converted from the native format used by the search to lists of values (e.g. parameter_lists, log_likelihood_list).

Parameters:

model (AbstractPriorModel) – Maps input vectors of unit parameter values to physical values and model instances via priors.
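The native-format-to-lists conversion described above amounts to flattening the sampler's arrays into the plain lists a Samples object is built from. A sketch with made-up field names (the real internal keys may differ):

```python
# Hypothetical internal results: unit-cube sample points and their
# log likelihoods, as the sampler might store them natively.
internal_results = {
    "points": [[0.1, 0.9], [0.4, 0.6]],
    "log_l": [-3.2, -1.1],
}

# Convert to the list-of-lists / flat-list form used downstream
# (e.g. parameter_lists, log_likelihood_list).
parameter_lists = [list(point) for point in internal_results["points"]]
log_likelihood_list = list(internal_results["log_l"])

print(parameter_lists)      # [[0.1, 0.9], [0.4, 0.6]]
print(log_likelihood_list)  # [-3.2, -1.1]
```

The model then maps each unit-cube vector to physical parameter values via its priors, per the model parameter's description above.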

config_dict_test_mode_from(config_dict)[source]#

Returns a configuration dictionary for test mode meaning that the sampler terminates as quickly as possible.

Entries which set the total number of samples of the sampler (e.g. maximum calls, maximum likelihood evaluations) are reduced to low values, meaning it terminates almost immediately.

Parameters:

config_dict (Dict) – The original configuration dictionary for this sampler which includes entries controlling how fast the sampler terminates.

Return type:

Dict

Returns:

A configuration dictionary where settings which control the sampler’s number of samples are reduced so it terminates as quickly as possible.
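The override behaviour described above can be sketched as a dictionary merge (the key names here are hypothetical, chosen only to illustrate; the real config keys come from Nautilus's settings):

```python
def config_dict_test_mode_from(config_dict):
    """Sketch: entries that control the total number of samples are
    forced to tiny values so the sampler terminates almost immediately;
    all other entries pass through unchanged."""
    test_mode_overrides = {"n_like_max": 1}  # hypothetical key
    return {
        key: test_mode_overrides.get(key, value)
        for key, value in config_dict.items()
    }


config_dict = {"n_live": 1000, "n_like_max": 1000000}
print(config_dict_test_mode_from(config_dict))
# {'n_live': 1000, 'n_like_max': 1}
```

Only keys present in the original dictionary are overridden, so the test-mode dictionary keeps the same shape as the one the sampler normally receives.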