API
This page contains a comprehensive list of all classes and functions in predeval.
ContinuousEvaluator
Library of classes for evaluating continuous model outputs.
class predeval.continuous.ContinuousEvaluator(ref_data, assertions=None, verbose=True, **kwargs)
Bases: predeval.parent.ParentPredEval
Evaluator for continuous model outputs (e.g., regression models).
By default, this will run the tests listed in the assertions attribute (['min', 'max', 'mean', 'std', 'ks_test']). You can change the tests that will run by listing the desired tests in the assertions parameter.
The available tests are min, max, mean, std, and ks_test.
…
Parameters: - ref_data (list of int or float or np.array) – This is the reference data for all tests. All future data will be compared to this data.
- assertions (list of str, optional) – These are the assertion tests that will be created. Defaults to ['min', 'max', 'mean', 'std', 'ks_test'].
- verbose (bool, optional) – Whether tests should print their output. Default is True.
Variables: - assertion_params (dict) – Dictionary of test names and values defining these tests.
  - minimum (float) – Expected minimum.
  - maximum (float) – Expected maximum.
  - mean (float) – Expected mean.
  - std (float) – Expected standard deviation.
  - ks_stat (float) – KS test statistic threshold. When this value is exceeded, the test fails.
  - ks_test (func) – Partially evaluated KS test.
- assertions (list of str) – This list of strings describes the tests that will be run on comparison data. Defaults to ['min', 'max', 'mean', 'std', 'ks_test'].
check_data(test_data)
Check whether test_data is as expected.
Run through all tests in assertions and return whether the data passed these tests.
Parameters: test_data (list or np.array) – This is the data that will be compared to the reference data.
Returns: output – Each tuple has a string and a boolean. The string describes the test. The boolean describes the outcome: True is a pass and False is a fail.
Return type: list of tuples
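The dispatch described for check_data can be sketched in plain Python. This is an illustration only, not predeval's implementation: TinyEvaluator and its two checks are hypothetical stand-ins, and the test names in the returned tuples are assumptions.

```python
class TinyEvaluator:
    """Hypothetical stand-in for illustration; not predeval's class."""

    def __init__(self, ref_data, assertions=None):
        # Names of the tests to run, in order.
        self.assertions = assertions if assertions is not None else ["min", "max"]
        # Expected values learned from the reference data.
        self.assertion_params = {"minimum": min(ref_data), "maximum": max(ref_data)}

    def check_min(self, test_data):
        # Passes when no value is smaller than the expected minimum.
        return ("min", min(test_data) >= self.assertion_params["minimum"])

    def check_max(self, test_data):
        # Passes when no value is larger than the expected maximum.
        return ("max", max(test_data) <= self.assertion_params["maximum"])

    def check_data(self, test_data):
        # Look up check_<name> for every requested test and collect the tuples.
        return [getattr(self, "check_" + name)(test_data) for name in self.assertions]


evaluator = TinyEvaluator(ref_data=[0.0, 2.5, 10.0])
results = evaluator.check_data([1.0, 4.5, 12.0])
print(results)  # [('min', True), ('max', False)]
```

Each entry pairs a test name with its outcome, which is the list-of-tuples shape described above.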
check_ks(test_data)
Test whether test_data is similar to the reference data.
If the returned KS test statistic is greater than the threshold (default 0.2), the test fails.
The threshold is set by assertion_params['ks_stat'].
Uses the Kolmogorov-Smirnov test from scipy.
Parameters: test_data (list or np.array) – This is the data that will be compared to the reference data.
Returns: 2-item tuple with the test name and a boolean expressing whether the test passed.
Return type: (string, bool)
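To make the thresholding concrete, here is a minimal standard-library sketch of a two-sample KS statistic combined with a partially evaluated test. It is not predeval's code, which relies on scipy; ks_statistic, KS_THRESHOLD, and the "ks" test name are hypothetical names chosen for this example.

```python
from bisect import bisect_right
from functools import partial


def ks_statistic(ref_data, test_data):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    ref_sorted = sorted(ref_data)
    test_sorted = sorted(test_data)
    largest_gap = 0.0
    # The largest gap between two step functions occurs at an observed value.
    for x in ref_sorted + test_sorted:
        cdf_ref = bisect_right(ref_sorted, x) / len(ref_sorted)
        cdf_test = bisect_right(test_sorted, x) / len(test_sorted)
        largest_gap = max(largest_gap, abs(cdf_ref - cdf_test))
    return largest_gap


reference = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]

# "Partially evaluated" test: the reference data is bound once up front.
ks_test = partial(ks_statistic, reference)
KS_THRESHOLD = 0.2  # default threshold quoted above


def check_ks(test_data):
    # Fails when the statistic exceeds the threshold.
    return ("ks", ks_test(test_data) <= KS_THRESHOLD)


similar = check_ks(reference)
shifted = check_ks([x + 5 for x in reference])
print(similar, shifted)  # ('ks', True) ('ks', False)
```

Binding the reference data with functools.partial means later calls only need the new data, which is the idea behind storing a partially evaluated test.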
check_max(test_data)
Check whether test_data has any larger values than expected.
The expected max is controlled by assertion_params['maximum'].
Parameters: test_data (list or np.array) – This is the data that will be compared to the reference data.
Returns: 2-item tuple with the test name and a boolean expressing whether the test passed.
Return type: (string, bool)
check_mean(test_data)
Check whether test_data has a different mean than expected.
If the observed mean is more than 2 standard deviations from the expected mean, the test fails.
The expected mean is controlled by assertion_params['mean'].
The expected standard deviation is controlled by assertion_params['std'].
Parameters: test_data (list or np.array) – This is the data that will be compared to the reference data.
Returns: 2-item tuple with the test name and a boolean expressing whether the test passed.
Return type: (string, bool)
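The two-standard-deviation rule can be sketched as follows. This is a plain-Python illustration under the assumption that "standard deviations" refers to the expected standard deviation; the function signature and the "mean" test name are hypothetical, not predeval's API.

```python
from statistics import mean


def check_mean(test_data, expected_mean, expected_std):
    """Pass when the observed mean is within 2 expected stds of the expected mean."""
    observed = mean(test_data)
    passed = abs(observed - expected_mean) <= 2 * expected_std
    return ("mean", passed)


close = check_mean([9, 10, 11, 12], expected_mean=10, expected_std=2)
far = check_mean([20, 21, 22], expected_mean=10, expected_std=2)
print(close, far)  # ('mean', True) ('mean', False)
```

In the second call the observed mean is 21, which is 11 away from the expected mean and therefore outside the allowed band of 4.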
check_min(test_data)
Check whether test_data has any smaller values than expected.
The expected min is controlled by assertion_params['minimum'].
Parameters: test_data (list or np.array) – This is the data that will be compared to the reference data.
Returns: 2-item tuple with the test name and a boolean expressing whether the test passed.
Return type: (string, bool)
check_std(test_data)
Check whether test_data has a different standard deviation than expected.
If the observed standard deviation is less than 0.5 times the expected std or greater than 1.5 times the expected std, the test fails.
The expected standard deviation is controlled by assertion_params['std'].
Parameters: test_data (list or np.array) – This is the data that will be compared to the reference data.
Returns: 2-item tuple with the test name and a boolean expressing whether the test passed.
Return type: (string, bool)
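A small sketch of the 0.5x to 1.5x band follows. It assumes the population standard deviation (statistics.pstdev); whether predeval uses the population or the sample estimate is not stated here, and the function signature and "std" test name are hypothetical.

```python
from statistics import pstdev


def check_std(test_data, expected_std):
    """Pass when the observed std lies between 0.5x and 1.5x the expected std."""
    observed = pstdev(test_data)  # assumption: population standard deviation
    passed = 0.5 * expected_std <= observed <= 1.5 * expected_std
    return ("std", passed)


in_band = check_std([8, 10, 12], expected_std=2.0)       # observed is about 1.63
too_narrow = check_std([9.9, 10, 10.1], expected_std=2.0)  # observed is about 0.08
too_wide = check_std([0, 10, 20], expected_std=2.0)       # observed is about 8.16
print(in_band, too_narrow, too_wide)
```

Only the first call passes: the other two fall below 1.0 and above 3.0 respectively, the edges of the band for an expected std of 2.0.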
update_ks_test(input_data)
Create a partially evaluated KS test.
Uses the Kolmogorov-Smirnov test from scipy.
Parameters: input_data (list or np.array) – This is the reference data for the KS test. All future data will be compared to this data.
Return type: None
update_max(input_data)
Find the max of input_data.
Parameters: input_data (list or np.array) – This is the reference data for the max test. All future data will be compared to this data.
Return type: None
update_mean(input_data)
Find the mean of input_data.
Parameters: input_data (list or np.array) – This is the reference data for the mean test. All future data will be compared to this data.
Return type: None
update_min(input_data)
Find the min of input_data.
Parameters: input_data (list or np.array) – This is the reference data for the min test. All future data will be compared to this data.
Return type: None
update_param(param_key, param_value)
Update a value in the assertion_params dictionary attribute.
Parameters: - param_key (string) – This is the assertion param that we want to update.
- param_value (real number or partially evaluated test) – This is the updated value.
Return type: None
CategoricalEvaluator
Library of classes for evaluating categorical model outputs.
class predeval.categorical.CategoricalEvaluator(ref_data, assertions=None, verbose=True, **kwargs)
Bases: predeval.parent.ParentPredEval
Evaluator for categorical model outputs (e.g., classification models).
By default, this will run the tests listed in the assertions attribute (['chi2_test', 'exist']). You can change the tests that will run by listing the desired tests in the assertions parameter.
The available tests are chi2_test and exist.
…
Parameters: - ref_data (list of int or float or np.array) – This is the reference data for all tests. All future data will be compared to this data.
- assertions (list of str, optional) – These are the assertion tests that will be created. Defaults to ['chi2_test', 'exist'].
- verbose (bool, optional) – Whether tests should print their output. Default is True.
Variables: - assertion_params (dict) – Dictionary of test names and values defining these tests.
  - chi2_stat (float) – Chi2 test statistic threshold. When this value is exceeded, the test fails.
  - chi2_test (func) – Partially evaluated chi2 test.
  - cat_exists (list of int or str) – This is a list of the expected model outputs.
- assertions (list of str) – This list of strings describes the tests that will be run on comparison data. Defaults to ['chi2_test', 'exist'].
check_chi2(test_data)
Test whether test_data is similar to the reference data.
If the returned chi2 test statistic is greater than the threshold (default 2), the test fails.
The threshold is set by assertion_params['chi2_stat'].
Uses the chi2_contingency test from scipy.
Parameters: test_data (list or np.array) – This is the data that will be compared to the reference data.
Returns: 2-item tuple with the test name and a boolean expressing whether the test passed.
Return type: (string, bool)
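For intuition about the statistic being thresholded, the sketch below computes a plain Pearson chi-square statistic on a 2 x K table of category counts (reference row, test row). How predeval builds its contingency table is an assumption here, and no continuity correction is applied, so values can differ from scipy's chi2_contingency defaults for two-category data. All names are hypothetical.

```python
from collections import Counter


def chi2_statistic(ref_data, test_data):
    """Pearson chi-square statistic for reference vs. test category counts."""
    ref_counts = Counter(ref_data)
    test_counts = Counter(test_data)
    n_ref, n_test = len(ref_data), len(test_data)
    total = n_ref + n_test
    stat = 0.0
    for category in set(ref_counts) | set(test_counts):
        column_total = ref_counts[category] + test_counts[category]
        for observed, row_total in ((ref_counts[category], n_ref),
                                    (test_counts[category], n_test)):
            # Expected count if both rows shared the same category proportions.
            expected = row_total * column_total / total
            stat += (observed - expected) ** 2 / expected
    return stat


CHI2_THRESHOLD = 2  # default threshold quoted above


def check_chi2(ref_data, test_data):
    # Fails when the statistic exceeds the threshold.
    return ("chi2", chi2_statistic(ref_data, test_data) <= CHI2_THRESHOLD)


reference = ["a"] * 50 + ["b"] * 50
same_mix = check_chi2(reference, ["a"] * 50 + ["b"] * 50)  # statistic is 0
skewed = check_chi2(reference, ["a"] * 90 + ["b"] * 10)    # statistic is about 38.1
print(same_mix, skewed)  # ('chi2', True) ('chi2', False)
```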
check_data(test_data)
Check whether test_data is as expected.
Run through all tests in assertions and return whether the data passed these tests.
Parameters: test_data (list or np.array) – This is the data that will be compared to the reference data.
Returns: output – Each tuple has a string and a boolean. The string describes the test. The boolean describes the outcome: True is a pass and False is a fail.
Return type: list of tuples
check_exist(test_data)
Check that all distinct expected values are present in test_data.
If any values are missing, the function returns False rather than True.
The expected values are controlled by assertion_params['cat_exists'].
Parameters: test_data (list or np.array) – This is the data that will be compared to the reference data.
Returns: 2-item tuple with the test name and a boolean expressing whether the test passed.
Return type: (string, bool)
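The existence check amounts to a set comparison. A minimal sketch, with a hypothetical signature and test name rather than predeval's own:

```python
def check_exist(test_data, expected_values):
    """Pass when every expected category appears at least once in test_data."""
    missing = set(expected_values) - set(test_data)
    return ("exist", not missing)


all_present = check_exist([0, 1, 1, 2], expected_values=[0, 1, 2])
one_missing = check_exist([0, 0, 1], expected_values=[0, 1, 2])
print(all_present, one_missing)  # ('exist', True) ('exist', False)
```

Note that this sketch only looks for missing categories; unexpected extra categories in test_data do not cause a failure here.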
update_chi2_test(input_data)
Create a partially evaluated chi2 contingency test.
Uses the chi2_contingency test from scipy.
Parameters: input_data (list or np.array) – This is the reference data for the chi2 test. All future data will be compared to this data.
Return type: None
update_exist(input_data)
Create input data for the test checking whether all categorical outputs exist.
Parameters: input_data (list or np.array) – This is the reference data for check_exist. All future data will be compared to it.
Return type: None
update_param(param_key, param_value)
Update a value in the assertion_params dictionary attribute.
Parameters: - param_key (string) – This is the assertion param that we want to update.
- param_value (real number or partially evaluated test) – This is the updated value.
Return type: None
Utilities
Helper functions for the predeval module.
predeval.utilities.evaluate_tests(test_ouputs, assert_test=False, verbose=True)
Check whether the data passed evaluation tests.
Parameters: - test_ouputs (list of tuples) – Each tuple has a string and a boolean. The string describes the test. The boolean describes the outcome: True is a pass and False is a fail. This is the output of the check_data method.
- assert_test (bool) – Whether to assert that the tests passed. Default is False.
- verbose (bool) – Whether to print whether each test passed. Default is True.
Return type: None
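The behaviour described for evaluate_tests can be approximated with the sketch below. report_results is a hypothetical stand-in written for this example; the printed wording and the exception raised by the real function are assumptions.

```python
def report_results(test_outputs, assert_test=False, verbose=True):
    """Print each (name, passed) outcome and optionally raise on a failure."""
    for name, passed in test_outputs:
        if verbose:
            print(f"{'Passed' if passed else 'Failed'} {name} check.")
        if assert_test and not passed:
            raise AssertionError(f"{name} check failed")


# Output shaped like the list of tuples returned by check_data.
outputs = [("min", True), ("max", False)]

# Reporting only: prints both outcomes and returns None.
report_results(outputs)

# Asserting: stops at the first failed test.
try:
    report_results(outputs, assert_test=True, verbose=False)
    raised = False
except AssertionError:
    raised = True
print(raised)  # True
```

Keeping reporting and asserting behind separate flags lets the same results be logged in production and enforced in a test suite.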