API

This page contains a comprehensive list of all classes and functions in predeval.

ContinuousEvaluator

Library of classes for evaluating continuous model outputs.

class predeval.continuous.ContinuousEvaluator(ref_data, assertions=None, verbose=True, **kwargs)

Bases: predeval.parent.ParentPredEval

Evaluator for continuous model outputs (e.g., regression models).

By default, this will run the tests listed in the assertions attribute ([‘min’, ‘max’, ‘mean’, ‘std’, ‘ks_test’]). You can change the tests that will run by listing the desired tests in the assertions parameter.

The available tests are min, max, mean, std, and ks_test.

…

Parameters:

ref_data (list of int or float or np.array) – This is the reference data for all tests. All future data will be compared to this data.
assertions (list of str, optional) – These are the assertion tests that will be created. Default is [‘min’, ‘max’, ‘mean’, ‘std’, ‘ks_test’].
verbose (bool, optional) – Whether tests should print their output. Default is True.

Variables:

assertion_params (dict) –
dictionary of test names and values defining these tests.
- minimum : float
  
  Expected minimum.
- maximum : float
  
  Expected maximum.
- mean : float
  
  Expected mean.
- std : float
  
  Expected standard-deviation.
- ks_stat: float
  
  ks-test-statistic threshold. When this value is exceeded, the test fails.
- ks_test : func
  
  Partially evaluated ks test.
assertions (list of str) – This list of strings describes the tests that will be run on comparison data. Defaults to [‘min’, ‘max’, ‘mean’, ‘std’, ‘ks_test’]

check_data(test_data)

Check whether test_data is as expected.

Run through all tests in assertions and return whether the data passed these tests.

Parameters:	test_data (list or np.array) – This is the data that will be compared to the reference data.
Returns:	output – Each tuple has a string and a boolean. The string describes the test. The boolean describes the outcome. True is a pass and False is a fail.
Return type:	list of tuples
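The collection step described above can be pictured as a loop over test names. The following is a minimal standalone sketch, not predeval's implementation: the `run_checks` helper, the `checks` mapping, and the two toy checks are hypothetical names introduced only for illustration.

```python
def run_checks(assertions, checks, test_data):
    """Run each named check on test_data and collect (name, passed) tuples."""
    return [checks[name](test_data) for name in assertions]

# Hypothetical checks that follow the documented (string, bool) return shape.
checks = {
    "min": lambda data: ("min", min(data) >= 0),
    "max": lambda data: ("max", max(data) <= 10),
}

output = run_checks(["min", "max"], checks, [1, 5, 12])
print(output)  # [('min', True), ('max', False)]
```

A list of this shape is what evaluate_tests expects as its first argument.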

check_ks(test_data)

Test whether test_data is similar to reference data.

If the returned ks-test-statistic is greater than the threshold (default 0.2), the test fails.

The threshold is set by assertion_params[‘ks_stat’].

Uses Kolmogorov-Smirnov test from scipy.

Parameters:	test_data (list or np.array) – This is the data that will be compared to the reference data.
Returns:	2 item tuple with test name and boolean expressing whether passed test.
Return type:	(string, bool)
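To make the threshold rule concrete, here is a pure-Python sketch of a two-sample Kolmogorov-Smirnov statistic and the pass/fail comparison. It assumes a two-sample comparison of reference and test data; the function names and the "ks" label are hypothetical, and predeval itself relies on scipy rather than this code.

```python
from bisect import bisect_right

def ks_statistic(ref_data, test_data):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    ref_sorted, test_sorted = sorted(ref_data), sorted(test_data)
    largest_gap = 0.0
    for x in ref_sorted + test_sorted:
        cdf_ref = bisect_right(ref_sorted, x) / len(ref_sorted)
        cdf_test = bisect_right(test_sorted, x) / len(test_sorted)
        largest_gap = max(largest_gap, abs(cdf_ref - cdf_test))
    return largest_gap

def check_ks_sketch(ref_data, test_data, threshold=0.2):
    """Pass when the statistic does not exceed the threshold."""
    return ("ks", ks_statistic(ref_data, test_data) <= threshold)

reference = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
same = check_ks_sketch(reference, reference)                       # ('ks', True)
shifted = check_ks_sketch(reference, [x + 20 for x in reference])  # ('ks', False)
```

Identical samples give a statistic of 0.0, while fully separated samples give 1.0, so the default threshold of 0.2 tolerates only modest drift.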

check_max(test_data)

Check whether test_data has any larger values than expected.

The expected max is controlled by assertion_params[‘max’].

Parameters:	test_data (list or np.array) – This is the data that will be compared to the reference data.
Returns:	2 item tuple with test name and boolean expressing whether passed test.
Return type:	(string, bool)

check_mean(test_data)

Check whether test_data has a different mean than expected.

If the observed mean is more than 2 standard deviations from the expected mean, the test fails.

The expected mean is controlled by assertion_params[‘mean’].

The expected standard deviation is controlled by assertion_params[‘std’].

Parameters:	test_data (list or np.array) – This is the data that will be compared to the reference data.
Returns:	2 item tuple with test name and boolean expressing whether passed test.
Return type:	(string, bool)
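The two-standard-deviation rule can be written out in a few lines. This is a standalone sketch of the rule as documented, not the library's code; `check_mean_sketch` and the "mean" label are hypothetical, and treating a difference of exactly 2 standard deviations as a pass is an assumption drawn from the wording "more than 2".

```python
from statistics import mean

def check_mean_sketch(test_data, expected_mean, expected_std):
    """Pass when the observed mean is within 2 expected standard deviations."""
    passed = abs(mean(test_data) - expected_mean) <= 2 * expected_std
    return ("mean", passed)

close = check_mean_sketch([9, 10, 11], expected_mean=10, expected_std=1)
far = check_mean_sketch([19, 20, 21], expected_mean=10, expected_std=1)
print(close, far)  # ('mean', True) ('mean', False)
```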

check_min(test_data)

Check whether test_data has any smaller values than expected.

The expected min is controlled by assertion_params[‘min’].

Parameters:	test_data (list or np.array) – This is the data that will be compared to the reference data.
Returns:	2 item tuple with test name and boolean expressing whether passed test.
Return type:	(string, bool)
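The range checks described for check_max and check_min amount to comparing the extremes of the new data with the stored bounds. A minimal sketch under that reading follows; the function names and labels are hypothetical, and values equal to the bound are assumed to pass.

```python
def check_min_sketch(test_data, expected_min):
    """Pass when no value is smaller than the expected minimum."""
    return ("min", min(test_data) >= expected_min)

def check_max_sketch(test_data, expected_max):
    """Pass when no value is larger than the expected maximum."""
    return ("max", max(test_data) <= expected_max)

new_predictions = [2, 5, 9]
min_result = check_min_sketch(new_predictions, expected_min=1)  # ('min', True)
max_result = check_max_sketch(new_predictions, expected_max=8)  # ('max', False)
```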

check_std(test_data)

Check whether test_data has a different standard deviation than expected.

If the observed standard deviation is less than 1/2 the expected std or greater than 1.5 times the expected std, then the test fails.

The expected standard deviation is controlled by assertion_params[‘std’].

Parameters:	test_data (list or np.array) – This is the data that will be compared to the reference data.
Returns:	2 item tuple with test name and boolean expressing whether passed test.
Return type:	(string, bool)
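The accepted band for the standard deviation, from 0.5 to 1.5 times the expected value, can be sketched as below. This is an illustration of the documented rule only; `check_std_sketch` is a hypothetical name, and using the population standard deviation (`statistics.pstdev`) is an assumption about how the observed value is computed.

```python
from statistics import pstdev

def check_std_sketch(test_data, expected_std):
    """Pass when the observed std lies between 0.5x and 1.5x the expected std."""
    observed = pstdev(test_data)
    passed = 0.5 * expected_std <= observed <= 1.5 * expected_std
    return ("std", passed)

data = [1, 2, 3, 4, 5]  # population std is sqrt(2), about 1.41
in_band = check_std_sketch(data, expected_std=1.5)   # ('std', True)
too_narrow = check_std_sketch(data, expected_std=4)  # ('std', False)
```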

update_ks_test(input_data)

Create partially evaluated ks_test.

Uses Kolmogorov-Smirnov test from scipy.

Parameters:	input_data (list or np.array) – This is the reference data for the ks-test. All future data will be compared to this data.
Returns:
Return type:	None
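"Partially evaluated" here presumably means that the reference data is bound to the test function ahead of time, so that later calls only need the new data. The sketch below shows that idea with `functools.partial`; `mean_gap` is a hypothetical stand-in for a two-sample test, and whether predeval uses `functools.partial` internally is an assumption.

```python
from functools import partial

def mean_gap(ref_data, test_data):
    """Hypothetical stand-in for a two-sample test: absolute difference of means."""
    return abs(sum(ref_data) / len(ref_data) - sum(test_data) / len(test_data))

reference = [1.0, 2.0, 3.0]

# Bind the reference data once; the resulting callable takes only new data.
bound_test = partial(mean_gap, reference)

gap = bound_test([2.0, 3.0, 4.0])
print(gap)  # 1.0
```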

update_max(input_data)

Find max of input data.

Parameters:	input_data (list or np.array) – This is the reference data for the max-test. All future data will be compared to this data.
Returns:
Return type:	None

update_mean(input_data)

Find mean of input data.

Parameters:	input_data (list or np.array) – This is the reference data for the mean-test. All future data will be compared to this data.
Returns:
Return type:	None

update_min(input_data)

Find min of input_data.

Parameters:	input_data (list or np.array) – This is the reference data for the min-test. All future data will be compared to this data.
Returns:
Return type:	None

update_param(param_key, param_value)

Update value in assertion param dictionary attribute.

Parameters:
param_key (string) – This is the assertion param that we want to update.
param_value (real number or partially evaluated test) – This is the updated value.
Returns:
Return type:	None

update_std(input_data)

Find standard deviation of input data.

Parameters:	input_data (list or np.array) – This is the reference data for the std-test. All future data will be compared to this data.
Returns:
Return type:	None

CategoricalEvaluator

Library of classes for evaluating categorical model outputs.

class predeval.categorical.CategoricalEvaluator(ref_data, assertions=None, verbose=True, **kwargs)

Bases: predeval.parent.ParentPredEval

Evaluator for categorical model outputs (e.g., classification models).

By default, this will run the tests listed in the assertions attribute ([‘chi2_test’, ‘exist’]). You can change the tests that will run by listing the desired tests in the assertions parameter.

The available tests are chi2_test and exist.

…

Parameters:

ref_data (list of int or float or np.array) – This is the reference data for all tests. All future data will be compared to this data.
assertions (list of str, optional) – These are the assertion tests that will be created. Default is [‘chi2_test’, ‘exist’].
verbose (bool, optional) – Whether tests should print their output. Default is True.

Variables:

assertion_params (dict) –
dictionary of test names and values defining these tests.
- chi2_stat : float
  
  Chi2-test-statistic threshold. When this value is exceeded, the test fails.
- chi2_test : func
  
  Partially evaluated chi2 test.
- cat_exists : list of int or str
  
  This is a list of the expected model outputs.
assertions (list of str) – This list of strings describes the tests that will be run on comparison data. Defaults to [‘chi2_test’, ‘exist’]

check_chi2(test_data)

Test whether test_data is similar to reference data.

If the returned chi2-test-statistic is greater than the threshold (default 2), the test fails.

The threshold is set by assertion_params[‘chi2_stat’].

Uses chi2_contingency test from scipy.

Parameters:	test_data (list or np.array) – This is the data that will be compared to the reference data.
Returns:	2 item tuple with test name and boolean expressing whether passed test.
Return type:	(string, bool)
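As a rough illustration of the statistic being thresholded, the sketch below builds a 2 x k table of category counts (reference row, test row) and computes the uncorrected Pearson chi-square statistic by hand. How predeval actually builds its contingency table is an assumption here, and all names are hypothetical. Note also that scipy's `chi2_contingency` applies Yates' continuity correction by default when there is one degree of freedom, so its value can differ from this uncorrected one.

```python
from collections import Counter

def chi2_statistic(ref_data, test_data):
    """Uncorrected Pearson chi-square statistic for a 2 x k table of counts."""
    ref_counts, test_counts = Counter(ref_data), Counter(test_data)
    categories = sorted(set(ref_counts) | set(test_counts))
    table = [[ref_counts[c] for c in categories],
             [test_counts[c] for c in categories]]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic

def check_chi2_sketch(ref_data, test_data, threshold=2):
    """Pass when the statistic does not exceed the threshold."""
    return ("chi2", chi2_statistic(ref_data, test_data) <= threshold)

reference = ["a"] * 10 + ["b"] * 10
balanced = check_chi2_sketch(reference, ["a"] * 10 + ["b"] * 10)  # ('chi2', True)
collapsed = check_chi2_sketch(reference, ["a"] * 20)              # ('chi2', False)
```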

check_data(test_data)

Check whether test_data is as expected.

Run through all tests in assertions and return whether the data passed these tests.

Parameters:	test_data (list or np.array) – This is the data that will be compared to the reference data.
Returns:	output – Each tuple has a string and a boolean. The string describes the test. The boolean describes the outcome. True is a pass and False is a fail.
Return type:	list of tuples

check_exist(test_data)

Check that all expected distinct values are present in test_data.

If any values are missing, the function returns False (rather than True).

The expected values are controlled by assertion_params[‘cat_exists’].

Parameters:	test_data (list or np.array) – This is the data that will be compared to the reference data.
Returns:	2 item tuple with test name and boolean expressing whether passed test.
Return type:	(string, bool)
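The existence check reduces to a set difference between the expected categories and the categories observed in the new data. A minimal sketch of that reading, with hypothetical names, might look like this:

```python
def check_exist_sketch(test_data, expected_values):
    """Pass when every expected category appears at least once in test_data."""
    missing = set(expected_values) - set(test_data)
    return ("exist", not missing)

all_present = check_exist_sketch([0, 1, 1, 0], expected_values=[0, 1])  # ('exist', True)
one_missing = check_exist_sketch([1, 1, 1], expected_values=[0, 1])     # ('exist', False)
```

Under this reading, extra categories in test_data that are not in the expected list do not cause a failure; whether predeval treats them the same way is not stated in the description above.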

update_chi2_test(input_data)

Create partially evaluated chi2 contingency test.

Uses chi2_contingency test from scipy.

Parameters:	input_data (list or np.array) – This is the reference data for the chi2-test. All future data will be compared to this data.
Returns:
Return type:	None

update_exist(input_data)

Create input data for test checking whether all categorical outputs exist.

Parameters:	input_data (list or np.array) – This is the reference data for check_exist. All future data will be compared to it.
Returns:
Return type:	None

update_param(param_key, param_value)

Update value in assertion param dictionary attribute.

Parameters:
param_key (string) – This is the assertion param that we want to update.
param_value (real number or partially evaluated test) – This is the updated value.
Returns:
Return type:	None

Utilities

Helper functions for the predeval module.

predeval.utilities.evaluate_tests(test_ouputs, assert_test=False, verbose=True)

Check whether the data passed evaluation tests.

Parameters:
test_ouputs (list of tuples) – Each tuple has a string and a boolean. The string describes the test. The boolean describes the outcome. True is a pass and False is a fail. This is the output of the check_data method.
assert_test (bool) – Whether to assert that each test passed. Default is False.
verbose (bool) – Whether to print the outcome of each test. Default is True.
Returns:
Return type:	None
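The behaviour described above can be approximated with a short loop over the (string, bool) tuples. This is a sketch based only on the parameter descriptions, not the library's source: the function name, the printed message format, and the assertion message are all hypothetical.

```python
def evaluate_tests_sketch(test_outputs, assert_test=False, verbose=True):
    """Report each (name, passed) tuple and optionally assert that it passed."""
    for name, passed in test_outputs:
        if verbose:
            print("{}: {}".format(name, "passed" if passed else "failed"))
        if assert_test:
            assert passed, "{} test failed".format(name)

results = [("min", True), ("max", False)]

# Reporting only: prints one line per test and returns None.
evaluate_tests_sketch(results)

# With assert_test=True, the first failing test raises AssertionError.
try:
    evaluate_tests_sketch(results, assert_test=True, verbose=False)
    raised = False
except AssertionError:
    raised = True
```

Asserting is convenient in a pipeline or test suite where a failed check should stop execution, while the default reporting mode suits interactive inspection.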