A/B test with Python

First, I will write a class that generates our data and emulates, as well as it can, a real data collection service.

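```python
import numpy as np


class DataGenerator:
    def __init__(self, p1, p2):
        self.p1 = p1  # click probability for group 1
        self.p2 = p2  # click probability for group 2

    def next(self):
        # Simulate one visitor per group: 1 = click, 0 = no click
        click1 = 1 if np.random.random() < self.p1 else 0
        click2 = 1 if np.random.random() < self.p2 else 0
        return click1, click2
```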

p1 and p2 are the click probabilities for group 1 and group 2, respectively.
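As a quick sanity check, the generator can be sampled many times to see whether the observed click rates land near p1 and p2. This is only a minimal sketch: the seed, the sample count, and the helper name `empirical_rates` are assumptions chosen for illustration, and the class is repeated so the snippet runs on its own.

```python
import numpy as np


class DataGenerator:
    def __init__(self, p1, p2):
        self.p1 = p1
        self.p2 = p2

    def next(self):
        click1 = 1 if np.random.random() < self.p1 else 0
        click2 = 1 if np.random.random() < self.p2 else 0
        return click1, click2


def empirical_rates(p1, p2, n):
    # Hypothetical helper: draw n visitors per group and average the clicks
    data = DataGenerator(p1, p2)
    clicks = np.array([data.next() for _ in range(n)])
    return clicks.mean(axis=0)


np.random.seed(0)  # arbitrary seed, only for repeatability
rate1, rate2 = empirical_rates(0.1, 0.11, 20000)
print(rate1, rate2)  # both should be close to 0.1 and 0.11
```

With only a few hundred samples the two rates can easily swap order, which is part of why the experiment further down needs a fairly large number of samples.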

Next, I will write a function for obtaining the p-value.

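```python
from scipy.stats import chi2


def get_p_value(T):
    # Chi-square statistic for a 2x2 table, then its p-value (1 degree of freedom)
    det = T[0, 0] * T[1, 1] - T[0, 1] * T[1, 0]
    c2 = float(det) / T[0].sum() * det / T[1].sum() * T.sum() / T[:, 0].sum() / T[:, 1].sum()
    p = 1 - chi2.cdf(x=c2, df=1)
    return p
```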

I will explain the p-value in a separate article.
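In the meantime, one way to gain some confidence in a hand-written p-value function is to compare it with SciPy's `chi2_contingency`. The sketch below assumes a version of get_p_value that divides by both row sums and both column sums, and it uses made-up counts in `T_example`. Passing `correction=False` turns off Yates' continuity correction so that the two results are comparable.

```python
import numpy as np
from scipy.stats import chi2, chi2_contingency


def get_p_value(T):
    # Chi-square statistic for a 2x2 table, no continuity correction
    det = T[0, 0] * T[1, 1] - T[0, 1] * T[1, 0]
    c2 = float(det) / T[0].sum() * det / T[1].sum() * T.sum() / T[:, 0].sum() / T[:, 1].sum()
    return 1 - chi2.cdf(x=c2, df=1)


# Made-up counts: rows are groups, columns are [no click, click]
T_example = np.array([[90.0, 10.0],
                      [80.0, 20.0]])

p_manual = get_p_value(T_example)
# chi2_contingency returns (statistic, p-value, dof, expected frequencies)
_, p_scipy, _, _ = chi2_contingency(T_example, correction=False)

print(p_manual, p_scipy)  # the two values should agree closely
```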

Next, I will write a function for running an experiment.
It takes the click probabilities for group 1 and group 2 and the number of samples.
In this case, I will use 2500 samples.

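```python
import matplotlib.pyplot as plt


def run_experiment(p1, p2, N):
    data = DataGenerator(p1, p2)
    p_values = np.empty(N)
    # 2x2 contingency table: rows are groups, columns are [no click, click]
    T = np.zeros((2, 2)).astype(np.float32)
    for i in range(N):
        c1, c2 = data.next()
        T[0, c1] += 1
        T[1, c2] += 1
        if i < 10:
            # Too few samples for a meaningful p-value
            p_values[i] = np.nan
        else:
            p_values[i] = get_p_value(T)
    plt.plot(p_values)
    plt.plot(np.ones(N) * 0.05)  # reference line at 0.05
    plt.show()


run_experiment(0.1, 0.11, 2500)
```

```python
    data = DataGenerator(p1, p2)
```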

The line above creates an instance of the data generator.

```python
        if i < 10:
            p_values[i] = np.nan
        else:
            p_values[i] = get_p_value(T)
```

We have to skip the p-value for the first few samples, because if we try to calculate it too early, the formula can break.
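To make "break" concrete, the sketch below evaluates the statistic on a hypothetical table from the first few visitors, before anyone has clicked. The counts in `T_early` are made up for illustration, and the function is a self-contained copy that divides by both row sums and both column sums. NumPy warnings are silenced with `np.errstate` so that the result can be inspected directly.

```python
import numpy as np
from scipy.stats import chi2


def get_p_value(T):
    det = T[0, 0] * T[1, 1] - T[0, 1] * T[1, 0]
    c2 = float(det) / T[0].sum() * det / T[1].sum() * T.sum() / T[:, 0].sum() / T[:, 1].sum()
    return 1 - chi2.cdf(x=c2, df=1)


# Hypothetical early table: 3 visitors per group, no clicks yet
# rows are groups, columns are [no click, click]
T_early = np.array([[3.0, 0.0],
                    [3.0, 0.0]])

# The click column sums to 0, so the last division is 0 / 0
with np.errstate(divide="ignore", invalid="ignore"):
    p_early = get_p_value(T_early)

print(np.isnan(p_early))  # True means the p-value is undefined (NaN)
```

Skipping a fixed number of samples makes this less likely but does not rule it out, so a stricter version might check the column sums before calling the function.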

```python
    c2 = float(det) / T[0].sum() * det / T[1].sum() * T.sum() / T[:, 0].sum() / T[:, 1].sum()
```

The formula divides by the row sums and the column sums, so if any of them is 0, the statistic cannot be calculated.
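For reference, this is the standard chi-square statistic for a 2x2 table without continuity correction, written out in full. The symbols a, b, c, d and n are introduced here only for illustration: they stand for T[0,0], T[0,1], T[1,0], T[1,1] and T.sum().

```latex
\chi^2 = \frac{n\,(ad - bc)^2}{(a+b)(c+d)(a+c)(b+d)}, \qquad n = a + b + c + d
```

The numerator contains the square of det, and the four factors in the denominator are the two row sums and the two column sums, so a single zero sum leaves the whole expression undefined.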