Skip to content

Project Headache #194

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 91 commits into from
Mar 15, 2018
Merged
Show file tree
Hide file tree
Changes from 47 commits
Commits
Show all changes
91 commits
Select commit Hold shift + click to select a range
3728e6b
first commit to new statistical test structure
Feb 19, 2018
45a1dbc
small fix - newline
Feb 19, 2018
d68a897
Add formula field of derived kpi
shansfolder Feb 19, 2018
67eaa05
Add docstring + small update
shansfolder Feb 19, 2018
9e986ee
Add check for input type
shansfolder Feb 20, 2018
99ebea3
Add KPI class
shansfolder Feb 20, 2018
b77b6ce
Merge pull request #193 from zalando/results_data_structure
shansfolder Feb 21, 2018
179723b
Make result classes json serializable
shansfolder Feb 21, 2018
5470c64
Fix double confidence interval fields
shansfolder Feb 21, 2018
90d58ab
Merge pull request #195 from zalando/json_serializable
shansfolder Feb 21, 2018
625a21b
First steps toward the biggest refactoring
shansfolder Feb 21, 2018
53d3a52
refactoring experiment, delta, features checks
Feb 22, 2018
cc04de8
Another round of implementation
shansfolder Feb 22, 2018
e91b2c8
One more round of implementation
shansfolder Feb 22, 2018
4f1254b
Improve docstrings
shansfolder Feb 22, 2018
3f4e0ab
fix typos in wordings
Feb 22, 2018
cb86b8c
simplified formula for derived kpi
Feb 23, 2018
a4e514e
methods apply to data for KPI and FeatureFilter
Feb 23, 2018
c8d0cd7
fix naming
Feb 23, 2018
9953c9a
Small update
shansfolder Feb 23, 2018
79cdafe
Small update
shansfolder Feb 23, 2018
39c4c68
Fix get_variant method
shansfolder Feb 23, 2018
febcbaf
added checks for get_variant result and fix for data subset
Feb 23, 2018
63e2d6d
Merge pull request #196 from zalando/adapt_experiment_module
shansfolder Feb 26, 2018
f83a604
Small update based on comments
shansfolder Feb 26, 2018
2a3b7ee
adopting statistics to new result structure
Feb 26, 2018
fed46b3
removed test from statistical test to experiment test
Feb 26, 2018
bf212fb
Cleanup statistics.py
shansfolder Feb 26, 2018
86e8ea5
Adapt test for statistics
shansfolder Feb 26, 2018
0df29f5
Adapt all tests in statistics.py
shansfolder Feb 27, 2018
adfd52f
adapted early_stopping results to the new result structure
Feb 27, 2018
42ae926
Implement computation of p value
shansfolder Feb 27, 2018
fcee28e
fixed docstring formatting
Feb 27, 2018
3b9a49d
Add helper function for computing p value
shansfolder Feb 27, 2018
3daa439
Add unit test for computing p value
shansfolder Feb 27, 2018
0eadc8b
Merge branch 'adopt_statistics' of github.com:zalando/expan into adap…
Feb 27, 2018
65806dc
fixed p value calculation call and adapt unit tests
Feb 27, 2018
ae72e66
fixes by comments, added check for np.ndarray
Feb 27, 2018
12b330a
removed list conversion for dummy random samples
Feb 27, 2018
e2b8f74
Add warning to logger
shansfolder Feb 28, 2018
f9e71ed
added docstring, checks and logger, small fixes
Feb 28, 2018
7a38c96
small typo
Feb 28, 2018
32884c2
Merge remote-tracking branch 'origin/adopt_statistics' into adapt_ear…
shansfolder Feb 28, 2018
d8699ab
adapted util module and util unit tests
Feb 28, 2018
d065756
Merge pull request #199 from zalando/adapt_util
shansfolder Mar 1, 2018
8d0db33
Merge pull request #198 from zalando/adapt_early_stopping
shansfolder Mar 1, 2018
005cffa
Merge pull request #197 from zalando/adopt_statistics
shansfolder Mar 1, 2018
ee89919
implementing correction module
Mar 2, 2018
6ab897a
added small fixes from comments
Mar 2, 2018
717c0a6
Merge remote-tracking branch 'origin/headache' into correction_module
shansfolder Mar 2, 2018
a09bed6
Remove duplicated check
shansfolder Mar 2, 2018
1e1b269
Use pd.unique instead of numpy
shansfolder Mar 5, 2018
7998bef
Merge branch 'headache' of github.com:zalando/expan into correction_m…
Mar 5, 2018
6076e00
Use enum for correction method
shansfolder Mar 5, 2018
c63767f
Merge branch 'headache' into correction_module
shansfolder Mar 5, 2018
f6bdd7e
Merge branch 'correction_module' of github.com:zalando/expan into cor…
Mar 5, 2018
c50ef62
correction implementation start
Mar 5, 2018
1cb449d
Remove separate type check method
shansfolder Mar 5, 2018
4647e74
Start impl. correction module
shansfolder Mar 5, 2018
ed68813
added unit tests for multiple correction
Mar 6, 2018
b51b28c
small renamings
Mar 6, 2018
9cf5007
adopted unit tests for correction
Mar 6, 2018
8a29d03
added unit test for correction, refactored correction methods
Mar 6, 2018
0a4c983
Small fix
shansfolder Mar 6, 2018
f0c06c0
adapted unit tests for experiment module
Mar 6, 2018
bc8ebd5
fixed csv_fetcher class, fixed unit tests
Mar 6, 2018
fe1590a
added helped methods unit tests
Mar 6, 2018
e2abd26
added unit tests for the reweigtening trick
Mar 6, 2018
a3dc800
adapted unit tests for re-weighting trick
daryadedik Mar 7, 2018
85392ec
refactored util for test_core, added docstring
daryadedik Mar 7, 2018
f78116b
Intermediate step of test_experiment
shansfolder Mar 7, 2018
d0a94fa
Another intermediate step
shansfolder Mar 7, 2018
72dfe3f
Finish test_experiment
shansfolder Mar 7, 2018
2d5becf
Merge pull request #201 from zalando/correction_module
shansfolder Mar 7, 2018
d7a01bf
added logger and replaced assertNumericalEqual with assertAlmostEqual
daryadedik Mar 8, 2018
5d80948
added docstrings where missing
daryadedik Mar 12, 2018
296ed7a
removed a quote, minor
daryadedik Mar 12, 2018
24ef447
Improve docstring in sphinx
shansfolder Mar 12, 2018
85defea
Improve and fix docstrings
shansfolder Mar 14, 2018
3accd73
Improve and fix docstrings
shansfolder Mar 14, 2018
cefa11c
Merge branch 'documentation' of github.com:zalando/expan into documen…
shansfolder Mar 14, 2018
06a2dc5
Update tutorial.rst (half way done)
shansfolder Mar 14, 2018
5fa1a7b
Update tutorial.rst further
shansfolder Mar 14, 2018
406c39b
Merge remote-tracking branch 'origin/master' into headache
shansfolder Mar 15, 2018
4021929
Merge remote-tracking branch 'origin/headache' into documentation
shansfolder Mar 15, 2018
3a78229
Update changelog
shansfolder Mar 15, 2018
38382e7
Update contributing page
shansfolder Mar 15, 2018
c649c8d
Finish doc for the new version
shansfolder Mar 15, 2018
9c15a89
Merge remote-tracking branch 'origin/master' into headache
shansfolder Mar 15, 2018
823506e
Merge remote-tracking branch 'origin/headache' into documentation
shansfolder Mar 15, 2018
fd60447
Merge pull request #204 from zalando/documentation
shansfolder Mar 15, 2018
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 3 additions & 3 deletions expan/core/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,9 +3,9 @@

from __future__ import absolute_import

# __all__ = ["binning", "experiment", "experimentdata", "results", "statistics", "util", "version"]
__all__ = ["binning", "experiment", "statistics", "util", "version"]

from expan.core.version import __version__, version

__all__ = ["binning", "early_stopping", "experiment", "statistics", "util",
"version", "results", "correction", "statistical_test"]

print(('ExpAn core init: {}'.format(version())))
6 changes: 4 additions & 2 deletions expan/core/binning.py
Original file line number Diff line number Diff line change
@@ -1,10 +1,12 @@
# TODO: This module is deprecated

import logging
import warnings
from heapq import heapify, heappush, heappop

import numpy as np

from expan.core.util import is_number_and_nan
from expan.core.util import is_nan

logger = logging.getLogger(__name__)

Expand Down Expand Up @@ -174,7 +176,7 @@ def create_bins(data, n_bins):
raise ValueError('Less than one bin makes no sense.')

insufficient_distinct = False
n_unique_values = len(np.unique([value for value in data if not is_number_and_nan(value)]))
n_unique_values = len(np.unique([value for value in data if not is_nan(value)]))
if n_unique_values < n_bins:
insufficient_distinct = True
warnings.warn("Insufficient unique values for requested number of bins. " +
Expand Down
Empty file added expan/core/correction.py
Empty file.
399 changes: 221 additions & 178 deletions expan/core/early_stopping.py

Large diffs are not rendered by default.

402 changes: 172 additions & 230 deletions expan/core/experiment.py

Large diffs are not rendered by default.

117 changes: 117 additions & 0 deletions expan/core/results.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,117 @@
from expan.core.util import JsonSerializable


# --------- Below are the data structure of statistics --------- #
class BaseTestStatistics(JsonSerializable):
""" Holds only statistics for the control and treatment group.

:param control_statistics: statistics within the control group
:type control_statistics: SampleStatistics
:param treatment_statistics: statistics within the treatment group
:type treatment_statistics: SampleStatistics
"""
def __init__(self, control_statistics, treatment_statistics):
self.control_statistics = control_statistics
self.treatment_statistics = treatment_statistics


class SampleStatistics(JsonSerializable):
""" This class holds sample size, mean and variance.

:type sample_size: int
:type mean: float
:type variance: float
"""
def __init__(self, sample_size, mean, variance):
self.sample_size = sample_size
self.mean = mean
self.variance = variance


class SimpleTestStatistics(BaseTestStatistics):
""" Additionally to BaseTestStatistics, holds delta, confidence interval, statistical power, and p value.

:type control_statistics: SampleStatistics
:type treatment_statistics: SampleStatistics
:type delta: float
:type p: float
:type statistical_power: float
:param ci: a dict where keys are percentiles and values are the corresponding value for the statistic.
:type ci: dict
"""
def __init__(self, control_statistics, treatment_statistics, delta, ci, p, statistical_power):
super(SimpleTestStatistics, self).__init__(control_statistics, treatment_statistics)
self.delta = delta
self.p = p
self.statistical_power = statistical_power
# TODO: think of structure {p: v} (the same as ci)
self.confidence_interval = [{'percentile': p, 'value': v} for (p, v) in ci.items()]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't we have a class for ConfidenceInterval?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had a class previously, but then we found that it leads to two times of the fields:

"confidence_interval": {
    "confidence_interval" : [
        {"percentile": 2.5, "value": -1},
        {"percentile": 97.5.5, "value": 1}
    ]
}

Copy link
Contributor Author

@shansfolder shansfolder Mar 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TODO here is to think of whether we want:

    "confidence_interval" : [
        {"percentile": 2.5, "value": -1},
        {"percentile": 97.5.5, "value": 1}
    ]

or

    "confidence_interval" : [
        {"2.5": -1},
        {"97.5", 1}
    ]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thank you for clarifying this one for me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the problem I see here with structure {'2.5': 0.5, '97.5': 0.4} is that everywhere where we need to access those fields we always need to know those numbers (2.5 and 97.5). If they are fixed, then it's ok-ish, but things can change if we decide to have another percentiles (for example) or add some intermediate (we allowed that before as far as I remember).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are right. Then I think we can stay with {"percentile": 2.5, "value": -1}.



class EarlyStoppingTestStatistics(SimpleTestStatistics):
""" Additionally to SimpleTestStatistics, holds boolean flag for early stopping.

:type control_statistics: SampleStatistics
:type treatment_statistics: SampleStatistics
:type delta: float
:type p: float
:type statistical_power: float
:param ci: a dict where keys are percentiles and values are the corresponding value for the statistic.
:type ci: dict
:type stop: bool
"""
def __init__(self, control_statistics, treatment_statistics, delta, ci, p, statistical_power, stop):
super(EarlyStoppingTestStatistics, self).__init__(control_statistics, treatment_statistics, delta, ci, p, statistical_power)
self.stop = stop


class CorrectedTestStatistics(JsonSerializable):
""" Holds original and corrected statistics. This class should be used to hold statistics for multiple testing.
original_test_statistics and corrected_test_statistics should have the same type.

:param original_test_statistics: test result before correction
:type original_test_statistics: SimpleTestStatistics or EarlyStoppingTestStatistics
:param corrected_test_statistics: test result after correction
:type corrected_test_statistics: SimpleTestStatistics or EarlyStoppingTestStatistics
"""
def __init__(self, original_test_statistics, corrected_test_statistics):
type1 = type(original_test_statistics)
type2 = type(corrected_test_statistics)
if type1 != type2:
raise RuntimeError("Type mismatch for type " + str(type1) + " and " + str(type2))
if not isinstance(original_test_statistics, BaseTestStatistics):
raise RuntimeError("Input should be instances of BaseTestStatistics or its subclass")
self.original_test_statistics = original_test_statistics
self.corrected_test_statistics = corrected_test_statistics


# --------- Below are the data structure of test results --------- #
class StatisticalTestResult(JsonSerializable):
""" This class holds the results of a single statistical test.

:param test: information about the statistical test
:type test: StatisticalTest
:param result: result of this statistical test
:type result: BaseTestStatistics or its subclasses or CorrectedTestStatistics #TODO: better approach?
"""
def __init__(self, test, result):
self.test = test
self.result = result


class MultipleTestSuiteResult(JsonSerializable):
""" This class holds the results of a MultipleTestSuite.

:param statistical_test_results: test results for all statistical testing unit
:type statistical_test_results: list[StatisticalTestResult]
:param correction_method: method used for multiple testing correction. Possible values are:
"none": no correction
"bh": benjamini hochberg correction
"bf": bonferroni correction
:type correction_method: str
"""
def __init__(self, statistical_test_results, correction_method="none"):
self.statistical_test_results = statistical_test_results
if correction_method not in ["none", "bh", "bf"]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not stick to something like an Enum?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, for analysis service it's ok to have parameters for correction as 0, 1 or 2. We wanted to leave at least a short names for correction for somebody who can use expan from outside Zalando. I also don't think we need to have a class here. Or you have some advantages in mind for doing it as Enum?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each time you add a new correction type or change an existing one, you'll have to find all occurences of, say, 'bh' in the code. A dedicated Enum-type class would be the unique source of truth.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gbordyugov yes I agree. One more question: do you also think we need to create a Enum for the delta methods? We currently maintain a lookup table for "group_sequential", "bayes_factor", "fixed_horizon", etc

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that would be another application of this strategy, totally agree!

raise ValueError('Correction method is not implemented. We support "none", "bh", and "bf".')
self.correction_method = correction_method
114 changes: 114 additions & 0 deletions expan/core/statistical_test.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,114 @@
import pandas as pd
from expan.core.util import JsonSerializable


class StatisticalTest(JsonSerializable):
""" This class describes what has to be tested against what and represent a unit of statistical testing.

:param kpi: the kpi to perform on
:type kpi: KPI or its subclass
:param features: list of features used for subgroups
:type features: list[FeatureFilter]
:param variants: variant column name and their values
:type variants: Variants
"""
def __init__(self, kpi, features, variants):
if not isinstance(features, list):
raise TypeError("Features should be a list.")
if not all(isinstance(n, FeatureFilter) for n in features):
raise TypeError("Some features are not of the type FeatureFilter.")
self.kpi = kpi
self.features = features
self.variants = variants


class KPI(JsonSerializable):
""" This class represents a basic kpi.
:param name: name of the kpi
:type name: str
"""
def __init__(self, name):
self.name = name


class DerivedKPI(KPI):
""" This class represents a derived KPI which is a ratio of two columns.
Names of the the two columns are passed as numerator and denominator.

:param name: name of the kpi
:type name: str
:param numerator: the numerator for the derived KPI
:type numerator: str
:param denominator: the denominator for the derived KPI
:type denominator: str
"""
def __init__(self, name, numerator, denominator):
super(DerivedKPI, self).__init__(name)
self.numerator = numerator
self.denominator = denominator

def make_derived_kpi(self, data):
""" Create the derived kpi column if it is not yet created. """
if self.name not in data.columns:
data.loc[:, self.name] = data[self.numerator]/data[self.denominator].astype(float)


class StatisticalTestSuite(JsonSerializable):
""" This class consists of a number of tests plus choice of the correction method.

:param tests: list of statistical tests in the suite
:type tests: list[StatisticalTest]
:param correction_method: method used for multiple testing correction. Possible values are:
"none": no correction
"bh": benjamini hochberg correction
"bf": bonferroni correction
:type correction_method: str
"""
def __init__(self, tests, correction_method="none"):
self.tests = tests
if correction_method not in ["none", "bh", "bf"]:
raise ValueError('Correction method is not implemented. We support "none", "bh", and "bf".')
self.correction_method = correction_method

@property
def size(self):
return len(self.tests)


class FeatureFilter(JsonSerializable):
""" This class represents a filter, restricting a DataFrame to rows with column_value in column_name.

It can be used to specify subgroup conditions.
:param column_name: name of the column to perform filter on
:type column_name: str
:param column_value: value of the column to perform filter on
:type column_value: str
"""
def __init__(self, column_name, column_value):
self.column_name = column_name
self.column_value = column_value

def apply_to_data(self, data):
return data[data[self.column_name] == self.column_value]


class Variants(JsonSerializable):
""" This class represents information of variants.

:param variant_column_name: name of the column that represents variant
:type variant_column_name: str
:param control_name: value of the variant that represents control group
:type control_name: str
:param treatment_name: value of the variant that represents control group
:type treatment_name: str
"""
def __init__(self, variant_column_name, control_name, treatment_name):
self.variant_column_name = variant_column_name
self.control_name = control_name
self.treatment_name = treatment_name

def get_variant(self, data, variant_name):
result = data[data[self.variant_column_name] == variant_name]
if not isinstance(result, pd.DataFrame):
result = pd.DataFrame([result])
return result
Loading