Package Documentation
xrsdkit: a package for data-driven scattering and diffraction analysis
xrsdkit.definitions
This module defines the settings and parameters handled by xrsdkit.
TODO: document the following module attributes:
- xrsdkit.definitions.setting_descriptions
- xrsdkit.definitions.parameter_descriptions
- xrsdkit.definitions.parameter_units
- xrsdkit.definitions.structures
- xrsdkit.definitions.structure_settings
- xrsdkit.definitions.modelable_structure_settings
- xrsdkit.definitions.form_factors
- xrsdkit.definitions.form_settings
- xrsdkit.definitions.modelable_form_factor_settings
- xrsdkit.definitions.form_factor_params
- xrsdkit.definitions.noise_models
- xrsdkit.definitions.noise_params
- xrsdkit.definitions.crystal_systems
- xrsdkit.definitions.crystal_point_groups
- xrsdkit.definitions.bravais_lattices
- xrsdkit.definitions.lattice_space_groups
- xrsdkit.definitions.sg_point_groups
-
xrsdkit.definitions.reciprocal_lattice_vectors(lat1, lat2, lat3, crystallographic=True)
Compute the reciprocal lattice vectors.
If crystallographic is False, the computation includes the factor of 2*pi that is common in solid state physics.
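The following is a minimal sketch of the standard reciprocal lattice construction, written independently of xrsdkit; the helper names (cross, dot, reciprocal_vectors_sketch) are hypothetical, and the library's own implementation may differ in details such as array types.

```python
import math

def cross(u, v):
    """Cross product of two 3-element sequences."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def reciprocal_vectors_sketch(lat1, lat2, lat3, crystallographic=True):
    """Hypothetical stand-in: textbook reciprocal lattice vectors."""
    # Signed cell volume: lat1 . (lat2 x lat3)
    volume = dot(lat1, cross(lat2, lat3))
    # The solid state physics convention carries an extra factor of 2*pi
    scale = 1.0 if crystallographic else 2.0 * math.pi
    rlat1 = tuple(scale * c / volume for c in cross(lat2, lat3))
    rlat2 = tuple(scale * c / volume for c in cross(lat3, lat1))
    rlat3 = tuple(scale * c / volume for c in cross(lat1, lat2))
    return rlat1, rlat2, rlat3

# Cubic cell with a = 4 Angstrom
a1, a2, a3 = (4.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 4.0)
b1, b2, b3 = reciprocal_vectors_sketch(a1, a2, a3)
k1, k2, k3 = reciprocal_vectors_sketch(a1, a2, a3, crystallographic=False)
```

In the crystallographic convention each reciprocal vector satisfies b_i . a_i = 1; in the other convention the same product is 2*pi.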
-
xrsdkit.definitions.secondary_settings(structure, form, primary_settings)
Return secondary settings, along with sensible default values.
Secondary settings depend on the structure, form, and possibly primary setting values. Primary settings are defined by xrsdkit.definitions.structure_settings and xrsdkit.definitions.form_factor_settings.
Parameters: - structure (str) – Population structure designation, for fetching valid structure settings
- form (str) – Population form factor designation, for fetching valid form factor settings
- primary_settings (dict) – Dict of primary settings
Returns: sec_stgs – Dict of secondary settings along with sensible default values
Return type: OrderedDict
xrsdkit.system
-
xrsdkit.system.fit(sys, q, I, dI=None, error_weighted=None, logI_weighted=None, q_range=None)
Fit the I(q) pattern and return a System with optimized parameters.
Parameters: - sys (xrsdkit.system.System) – System object defining populations and species, as well as settings and bounds/constraints for parameters.
- q (array of float) – 1d array of scattering vector magnitudes (1/Angstrom)
- I (array of float) – 1d array of intensities corresponding to q values
- dI (array of float) – 1d array of intensity error estimates for each I value
- error_weighted (bool) – Flag for weighting the objective with the I(q) error estimates.
- logI_weighted (bool) – Flag for evaluating the objective on log(I(q)) instead of I(q)
- q_range (list) – Two floats indicating the lower and upper q-limits for objective evaluation
Returns: sys_opt – Similar to input sys, but with fit-optimized parameters.
Return type: xrsdkit.system.System
xrsdkit.models
-
xrsdkit.models.load_model_from_files(yml_file, pickle_file, model_type)
Build an xrsdkit.models.xrsd_model.XRSDModel from serialized model data.
Parameters: - yml_file (str) – absolute path to yml file generated by yaml.dump(XRSDModel.collect_model_data())
- pickle_file (str) – absolute path to pickle file generated by pickle.dump(XRSDModel.model)
- model_type (str) – either ‘classifier’ or ‘regressor’
Returns: modl – Either an xrsdkit Classifier or an xrsdkit Regressor, depending on the inputs provided
Return type: xrsdkit.models.xrsd_model.XRSDModel
xrsdkit.tools
-
xrsdkit.tools.Rsquared(y1, y2)
Compute the coefficient of determination.
Parameters: - y1 (array) – an array of floats
- y2 (array) – an array of floats
Returns: Rsquared – coefficient of determination between y1 and y2
Return type: float
-
xrsdkit.tools.compute_Rsquared(y1, y2)
Compute the coefficient of determination.
Parameters: - y1 (array) – an array of floats
- y2 (array) – an array of floats
Returns: Rsquared – coefficient of determination between y1 and y2
Return type: float
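As a point of reference, the usual definition of the coefficient of determination can be sketched as below. This is not taken from the xrsdkit source: the function name rsquared_sketch is hypothetical, and treating the first array as the reference data is an assumption, since the definition is not symmetric in its two arguments.

```python
def rsquared_sketch(y_ref, y_model):
    """Hypothetical stand-in: R^2 = 1 - SS_res / SS_tot.

    Assumes y_ref holds the reference values and y_model the values
    being compared against them (an assumption, not the documented API).
    """
    mean_ref = sum(y_ref) / len(y_ref)
    # Residual sum of squares between the two arrays
    ss_res = sum((a - b) ** 2 for a, b in zip(y_ref, y_model))
    # Total sum of squares of the reference about its mean
    ss_tot = sum((a - mean_ref) ** 2 for a in y_ref)
    return 1.0 - ss_res / ss_tot

perfect = rsquared_sketch([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only = rsquared_sketch([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

A perfect match gives 1, and a model that only reproduces the mean of the reference gives 0.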
-
xrsdkit.tools.compute_chi2(y1, y2, weights=None)
Compute the sum of squared differences between two arrays.
Parameters: - y1 (array) – an array of floats
- y2 (array) – an array of floats
- weights (array) – array of weights to multiply each element of (y2-y1)**2
Returns: chi2 – sum of squared differences between y1 and y2, weighted if weights are provided.
Return type: float
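The weighted sum described above can be sketched in a few lines. The function name chi2_sketch is hypothetical, and the choice of unit weights when weights is None is an assumption about the default behavior.

```python
def chi2_sketch(y1, y2, weights=None):
    """Hypothetical stand-in: sum over elements of weights * (y2 - y1)**2."""
    if weights is None:
        # Assumption: no weights means every element counts equally
        weights = [1.0] * len(y1)
    return sum(w * (b - a) ** 2 for a, b, w in zip(y1, y2, weights))

unweighted = chi2_sketch([1.0, 2.0, 3.0], [1.0, 3.0, 5.0])
weighted = chi2_sketch([1.0, 2.0, 3.0], [1.0, 3.0, 5.0], weights=[1.0, 0.5, 0.25])
```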
-
xrsdkit.tools.compute_pearson(y1, y2)
Compute the Pearson correlation coefficient.
Parameters: - y1 (array) – an array of floats
- y2 (array) – an array of floats
Returns: pearson_r – Pearson’s correlation coefficient between y1 and y2
Return type: float
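For orientation, the textbook Pearson coefficient is the covariance of the two arrays divided by the product of their standard deviations. The sketch below is written without reference to the xrsdkit source, and the name pearson_sketch is hypothetical.

```python
import math

def pearson_sketch(y1, y2):
    """Hypothetical stand-in: textbook Pearson correlation coefficient."""
    n = len(y1)
    mean1 = sum(y1) / n
    mean2 = sum(y2) / n
    # Unnormalized covariance and variances (the 1/n factors cancel)
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(y1, y2))
    var1 = sum((a - mean1) ** 2 for a in y1)
    var2 = sum((b - mean2) ** 2 for b in y2)
    return cov / math.sqrt(var1 * var2)

r_up = pearson_sketch([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
r_down = pearson_sketch([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

Perfectly correlated arrays give +1 and perfectly anti-correlated arrays give -1; a constant array has zero variance and would raise ZeroDivisionError in this sketch.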
-
xrsdkit.tools.g_of_r(q_I)
Compute g(r) and the maximum characteristic scatterer length.
Parameters: q_I (array) – n-by-2 array of q values and intensities
Returns: - g_of_r (array) – n-by-2 array of r values and g(r) magnitudes
- r_max (float) – maximum scatterer length: the integral of g(r) from zero to r_max is 0.99 times the full integral of g(r)
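The r_max criterion can be illustrated with a cumulative trapezoid integral. This sketch covers only the 0.99 cutoff, not the computation of g(r) itself; the name r_max_sketch is hypothetical, and it assumes r is sorted in ascending order and g(r) is non-negative.

```python
def r_max_sketch(r, g, fraction=0.99):
    """Hypothetical stand-in: smallest sampled r at which the running
    integral of g reaches `fraction` of the full integral."""
    # Cumulative trapezoid integral of g over r
    cumulative = [0.0]
    for i in range(1, len(r)):
        area = 0.5 * (g[i] + g[i - 1]) * (r[i] - r[i - 1])
        cumulative.append(cumulative[-1] + area)
    target = fraction * cumulative[-1]
    for r_i, c_i in zip(r, cumulative):
        if c_i >= target:
            return r_i
    return r[-1]

r_vals = [0.0, 1.0, 2.0, 3.0, 4.0]
g_vals = [1.0, 1.0, 0.0, 0.0, 0.0]
r_cut = r_max_sketch(r_vals, g_vals)
```

Because the result is restricted to sampled r values, the resolution of r_max in this sketch is limited by the spacing of the r grid.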
xrsdkit.scattering
-
xrsdkit.scattering.guinier_porod_intensity(q, rg, porod_exponent)
Compute a Guinier-Porod scattering intensity.
Returned array of intensities is normalized such that I(0)=1.
Parameters: - q (array) – array of q values
- rg (float) – radius of gyration
- porod_exponent (float) – high-q Porod’s law exponent
Returns: I (array) – Array of intensities for all q
Reference: B. Hammouda, J. Appl. Cryst. (2010). 43, 716-719.
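A minimal sketch of the Guinier-Porod form from the cited reference, for the case without a low-q dimensionality parameter, is given below. It is written from the published equations rather than from the xrsdkit source, so the name guinier_porod_sketch and the handling of the crossover point are assumptions.

```python
import math

def guinier_porod_sketch(q, rg, porod_exponent):
    """Hypothetical stand-in: Guinier region below q1, Porod power law above."""
    d = porod_exponent
    # Crossover point where the two regions join smoothly
    q1 = math.sqrt(1.5 * d) / rg
    # Porod prefactor chosen so that the intensity is continuous at q1,
    # with the Guinier prefactor fixed to 1 so that I(0) = 1
    porod_scale = math.exp(-0.5 * d) * (1.5 * d) ** (0.5 * d) / rg ** d
    intensities = []
    for q_i in q:
        if q_i <= q1:
            intensities.append(math.exp(-(q_i * rg) ** 2 / 3.0))
        else:
            intensities.append(porod_scale / q_i ** d)
    return intensities

q_cross = math.sqrt(6.0) / 10.0  # crossover for rg = 10, exponent = 4
vals = guinier_porod_sketch([0.0, q_cross, q_cross * (1.0 + 1e-9)], 10.0, 4.0)
```

The prefactor and crossover are fixed by requiring that the intensity and its slope both match where the two regions meet, which is why no extra fitting parameter appears.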
-
xrsdkit.scattering.integrated_isotropic_diffraction_intensity(q, source_wavelength, lattice, latparams, coords, ff_funcs, pk_func, occupancies=None, q_min=0.0, q_max=None, space_group='', sf_mode='local', polz_correction=True, lorentz_correction=True, use_symmetry=True)
Compute integrated diffraction pattern for an isotropic (powder-like) system.
Parameters: - q (numpy.array) – Vector of q-values where intensity will be computed
- lattice (str) – Lattice specification (one of xrsdkit.scattering.space_groups.all_lattices).
- latparams (dict) – Dict defining lattice parameters in Angstroms and degrees (dict keys: [‘a’,’b’,’c’,’alpha’,’beta’,’gamma’])
- coords (list) – List of 3-element iterables defining fractional coordinates of species positions on the lattice vector basis
- ff_funcs (list) – List of functions that compute form factors for all species at any q, in order corresponding to coords
- pk_func (callable) – Function that yields a peak profile, as pk_func(q,q_center)
- source_wavelength (float) – Light source wavelength in Angstroms
- q_min (float) – Minimum q-value (>0) for reciprocal space integration
- q_max (float) – Maximum q-value (> q_min) for reciprocal space integration; if not provided, it is set automatically to the highest value in q.
- space_group (str) – Space group designation used for symmetrizing the reciprocal space summation, should be one of xrsdkit.scattering.space_groups.lattice_space_groups[lattice]
- sf_mode (str) – Either ‘local’ or ‘radial’. If ‘local’, for each reciprocal lattice point, the crystal structure factor is computed exactly at the lattice point. If ‘radial’, for each reciprocal lattice point, the crystal structure factor is computed along a line from the reciprocal space origin through the lattice point. The ‘radial’ mode is meant to capture the effects of form factors that vary considerably within peak widths.
- polz_correction (bool) – If True, a polarization correction of (1+cos^2(2*theta))/2 is applied, where lambda*q = 4*pi*sin(theta), for all input q values.
- lorentz_correction (bool) – If True, the Lorentz correction of 1/(sin(theta)*sin(2*theta)) is applied separately to each peak. This is not applied for all q values because it diverges at q=0.
- use_symmetry (bool) – If True, the summation over reciprocal space is reduced by applying the symmetry operations of the point group associated with the space_group, and multiplicity factors are collected and applied accordingly
Returns: Diffracted intensity, normalized such that I(q=0) is equal to 1.
Return type: numpy.array
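The two correction factors described for polz_correction and lorentz_correction can be evaluated directly from q and the wavelength. The sketch below only illustrates those formulas for a single q value; the function name correction_factors_sketch is hypothetical, and how xrsdkit applies the factors internally is not shown here.

```python
import math

def correction_factors_sketch(q, source_wavelength):
    """Hypothetical stand-in: polarization and Lorentz factors at one q > 0."""
    # Bragg angle from lambda * q = 4 * pi * sin(theta)
    sin_theta = source_wavelength * q / (4.0 * math.pi)
    theta = math.asin(sin_theta)
    polz = 0.5 * (1.0 + math.cos(2.0 * theta) ** 2)
    # Diverges as q -> 0, so this must only be called with q > 0
    lorentz = 1.0 / (math.sin(theta) * math.sin(2.0 * theta))
    return polz, lorentz

# With a wavelength of 1 Angstrom, q = 2*pi corresponds to theta = 30 degrees
polz, lorentz = correction_factors_sketch(2.0 * math.pi, 1.0)
```

The sketch also requires lambda*q <= 4*pi, since larger values do not correspond to a real scattering angle.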
-
xrsdkit.scattering.spherical_normal_intensity(q, r0, sigma, sampling_width=3.5, sampling_step=0.05)
Compute the form factor for a normally-distributed sphere population.
The returned form factor is normalized such that its value at q=0 is 1. The current version samples the distribution from r0*(1-sampling_width*sigma) to r0*(1+sampling_width*sigma) in steps of sampling_step*sigma*r0. Additional information about sampling_width and sampling_step: https://github.com/scattering-central/saxskit/examples/spherical_normal_saxs_benchmark.ipynb
Parameters: - q (array) – array of scattering vector magnitudes
- r0 (float) – mean radius of the sphere population
- sigma (float) – fractional standard deviation of the sphere population radii
- sampling_width (float) – sampling width in units of sigma. Samples are taken below and above the mean, unless this would require sampling negative values, in which case the region below zero is truncated.
- sampling_step (float) – spacing between samples in units of sigma
Returns: I – Array of intensity values for all q
Return type: array
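The sampling scheme described above can be sketched as follows. The radius grid follows the documented limits and step; the weighting of each sample by a normal density times the squared sphere volume (r**6) is an assumption made for this illustration, and the function names are hypothetical.

```python
import math

def sphere_form_factor(q, r):
    """Normalized form factor (squared amplitude) of a single sphere."""
    x = q * r
    if x < 1e-6:
        return 1.0  # limit as q*r -> 0
    return (3.0 * (math.sin(x) - x * math.cos(x)) / x ** 3) ** 2

def spherical_normal_sketch(q, r0, sigma, sampling_width=3.5, sampling_step=0.05):
    """Hypothetical stand-in: average sphere form factors over sampled radii."""
    if sigma == 0:
        return [sphere_form_factor(q_i, r0) for q_i in q]
    dr = sampling_step * sigma * r0
    n_half = int(round(sampling_width / sampling_step))
    # Radii from r0*(1 - width*sigma) to r0*(1 + width*sigma), truncated at zero
    radii = [r0 + k * dr for k in range(-n_half, n_half + 1)]
    radii = [r for r in radii if r > 0.0]
    # Assumption: normal density times squared volume as the sample weight
    weights = [math.exp(-0.5 * ((r - r0) / (sigma * r0)) ** 2) * r ** 6
               for r in radii]
    total = sum(weights)
    return [sum(w * sphere_form_factor(q_i, r) for w, r in zip(weights, radii)) / total
            for q_i in q]

intensity = spherical_normal_sketch([0.0, 0.1, 0.5], 20.0, 0.1)
```

Dividing by the sum of the weights is what enforces the normalization to 1 at q=0, independent of the weighting assumption.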
xrsdkit.visualization
-
xrsdkit.visualization.visualize_dataframe(data, labels=['system_class'], features=['Imax_over_Imean', 'Ilowq_over_Imean', 'Imax_sharpness', 'I_fluctuation', 'logI_fluctuation', 'logI_max_over_std', 'r_fftIcentroid', 'q_Icentroid', 'q_logIcentroid', 'pearson_q', 'pearson_q2', 'pearson_expq', 'pearson_invexpq', 'q_best_hump', 'q_best_trough', 'best_hump_qwidth', 'best_trough_qwidth', 'q_best_hump_log', 'q_best_trough_log', 'best_hump_qwidth_log', 'best_trough_qwidth_log'], use_pca=True, pca_comp_to_use=[0, 1], show_plots=False)
Make a labeled scatterplot of data.
If use_pca is True, PCA is applied to data[features], and the principal components specified in pca_comp_to_use are used for plotting. If use_pca is False, the first two features are used for plotting.
Parameters: - data (pandas.DataFrame) – dataframe containing features and labels
- labels (list of str) – names of the columns to use for labeling scatterplot points
- features (list of str) – If use_pca is True, these are the column names used to evaluate the PCA. If use_pca is False, these column names are used directly as axis labels (in this case, only the first two entries are used).
- use_pca (bool) – if True, PCA will be applied to features
- pca_comp_to_use (list of int) – if use_pca is True, this is a list of two indices indicating which PCs to use as plot axes. Each index must be less than the total number of features
- show_plots (bool) – whether or not to show the plots on the display