A/B Testing Python Project: Testing whether a Smart Ad increases consumer response rates

April 21, 2022 10 minute read

1) Problem Scenario

As part of its operations, a business shows its customers an ad, the purpose of which is to encourage the customer to fill out a ‘BIO questionnaire’. Customers may see this questionnaire and choose to either:

fill it out with the option of clicking on ‘Yes’
reject filling it out with the option of clicking ‘No’
Ignore the questionnaire entirely by clicking neither ‘Yes’ or ‘No’

The ad the business is currently using is called a ‘dummy ad’; it is plain and not interactive. The response rate is a measure of effectiveness of the ad, and is equal to the number of ‘Yes’ or ‘No’ responses collectively divided by the total number of customers the ad is shown to.

The business wants to come up with a new ad that increases this response rate. It calls this the ‘Smart Ad’ or the ‘Creative Ad’, an interactive, colourful cousin of its blander counterpart. It wishes to see whether or not showing this Smart Ad to customers will actually increase the response rate

1.2) Business Objective and Effect Size

The company aims to see a minimum detectable effect of a 2% increase in response rates as a result of the new Creative Ad

1.3) The question we want to answer and our hypotheses:

“Is there a statistically significant difference in response rates to the BIO questionnaire between the control and exposed groups?”

We will see shortly that in the column descriptions, the description of the auction_id column says that:

“The user may see the BIO questionnaire but choose not to respond. In that case both the yes and no columns are zero.”

In this context, we are not trying to find out which group saw more people choose ‘Yes’ and fill out the questionnaire. We are simply finding which group elicited more people to respond, whether their response was ‘Yes’ or ‘No’

So, to create our outcome variable, we need to find out the proportions of respondents (either ‘yes’ = 1 or ‘no’ = 1) to the total number of people, for both the control and exposed groups

1.31) Null and Alternative Hypotheses

We hypothesise that our ‘exposed’ group will have a higher response rate compared to our ‘control’ group. So we will conduct a one-tailed test.

If p_c is the response rate of the control group and p_e is the response rate of the exposed group, then:

H0: SmartAd recipients that receive the Creative Ad will have the same response rate as the response rate of recipients who receive the dummy ad

p_c = p_e

H1: SmartAd recipients that receive the Creative Ad will have a higher response rate than the response rate of recipients who receive the dummy ad

p_c < p_e

1.4) Limitations

These primarily result from the existence of other independent variables we do not account for, as they might have effects on the outcome. For example, the day of the week or hour of the day that the creative ad is shown to individuals might influence whether or not they respond to it.

One way to deal with this is to fix these values. This would mean, for example, we only use those observations from the dataset where the date or hour value is the same. That is to say, we may only look at those individuals who saw the creative ad on the same day or even within the same hour.

However, this will lead to a substantial reduction in sample size, which will directly and negatively impact our hypothesis testing. Therefore, we do not use this approach.

Another way to minimise these effects is to have truly random sampling. This is done to a. Distribute co-variates evenly and b. eliminate statistical bias as much as possible.

Since we did not sample this data ourselves, we cannot be 100% certain about the randomness of the sampling

1.5) Data Source

The dataset is obtained from Kaggle and was uploaded by user Osuolale Emmanuel

2) Loading and Pre-Processing Data

# imports

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.style as style
import seaborn as sns

%matplotlib inline

# visualisation presets

style.use('fivethirtyeight')
preferred_font = {'fontname':'Helvetica'}

# read in data

data = pd.read_csv('/Users/alitaimurshabbir/Desktop/AdSmartABdata - AdSmartABdata.csv')
print('size of dataset: {0} and number of columns: {1}'.format(len(data), len(data.columns)))

# data types

print('------------')
print('column types:')

for column, dtype in data.dtypes.iteritems():
    print(column, dtype)

size of dataset: 8077 and number of columns: 9
------------
column types:
auction_id object
experiment object
date object
hour int64
device_make object
platform_os int64
browser object
yes int64
no int64

2.1) Description of Columns

Column	Description
auction_id	the unique id of the online user who has been presented the BIO. In standard terminologies this is called an impression id. The user may see the BIO questionnaire but choose not to respond. In that case both the yes and no columns are zero
experiment	which group the user belongs to - control or exposed.
date	the date in YYYY-MM-DD format
hour	the hour of the day in HH format.
device_make	the name of the type of device the user has e.g. Samsung
platform_os	the id of the OS the user has.
browser	the name of the browser the user uses to see the BIO questionnaire.
yes	1 if the user chooses the “Yes” radio button for the BIO questionnaire.
no	1 if the user chooses the “No” radio button for the BIO questionnaire.

2.2) Check for duplicates and null values

data[data.duplicated(['auction_id'], keep=False)]

	auction_id	experiment	date	hour	device_make	platform_os	browser	yes	no

There are no duplicates

data.isnull().sum().sum()

There are no null values

2.3) Get data in the required format

# create outcome variable 'response'. If either 'yes' or 'no' = 1, then response = 1
# if both 'yes' and 'no' = 0, then response = 0

conditions = [
    (data['yes'] == 0) & (data['no'] == 0),
    (data['yes'] == 1) & (data['no'] == 0),
    (data['yes'] == 0) & (data['no'] == 1),
]

outputs = [0, 1, 1]

data['response'] = np.select(conditions, outputs, 99)

# create a crosstab to understand relevant variables

# responders in control group
control = data[data['experiment'] == 'control']['response']

# responders in exposed group
exposed = data[data['experiment'] == 'exposed']['response']

cross_tab = pd.DataFrame(pd.crosstab(data['experiment'], data['response']))

# add response rate for each group
cross_tab['response_rate'] = cross_tab[1]/(cross_tab[0] + cross_tab[1])

# create dataframes that hold data for either group, then concatenate

control_sample = data[data['experiment'] == 'control']

exposed_sample = data[data['experiment'] == 'exposed']

ab_test = pd.concat([control_sample, exposed_sample], axis = 0)

ab_test

	auction_id	experiment	date	hour	device_make	platform_os	browser	yes	no	response
3	00187412-2932-4542-a8ef-3633901c98d9	control	2020-07-03	15	Samsung SM-A705FN	6	Facebook	0	0	0
4	001a7785-d3fe-4e11-a344-c8735acacc2c	control	2020-07-03	15	Generic Smartphone	6	Chrome Mobile	0	0	0
5	0027ce48-d3c6-4935-bb12-dfb5d5627857	control	2020-07-03	15	Samsung SM-G960F	6	Facebook	0	0	0
6	002e308b-1a07-49d6-8560-0fbcdcd71e4b	control	2020-07-03	15	Generic Smartphone	6	Chrome Mobile	0	0	0
7	00393fb9-ca32-40c0-bfcb-1bd83f319820	control	2020-07-09	5	Samsung SM-G973F	6	Facebook	0	0	0
...	...	...	...	...	...	...	...	...	...	...
8065	ffbc02cb-628a-4de5-87fc-5d76b7d796e5	exposed	2020-07-09	17	Generic Smartphone	6	Chrome Mobile	0	0	0
8067	ffc594ef-756c-4d24-a310-0d8eb4e11eb7	exposed	2020-07-05	1	Samsung SM-G950F	6	Chrome Mobile WebView	0	0	0
8071	ffdfdc09-48c7-4bfb-80f8-ec1eb633602b	exposed	2020-07-03	4	Generic Smartphone	6	Chrome Mobile	0	1	1
8072	ffea24ec-cec1-43fb-b1d1-8f93828c2be2	exposed	2020-07-05	7	Generic Smartphone	6	Chrome Mobile	0	0	0
8075	ffeeed62-3f7c-4a6e-8ba7-95d303d40969	exposed	2020-07-05	15	Samsung SM-A515F	6	Samsung Internet	0	0	0

8077 rows × 10 columns

# find standard deviation for each group and display crosstab

std_deviation = ab_test.groupby('experiment')['response'].std()
cross_tab = pd.concat([cross_tab, std_deviation], axis = 1)
cross_tab.rename({'response':'std_deviation'}, axis = 1, inplace = True)
cross_tab

	0	1	response_rate	std_deviation
experiment
control	3485	586	0.143945	0.351077
exposed	3349	657	0.164004	0.370325

It seems that the response rate for the exposed group is about 2% greater than in the control group, which is what we expected to see
Standard deviation across both groups is also very similar

3) EDA

3.1) Number of Observations in Each Sample

colours = ['#564787', '#F2FDFF']
experiment_countValues = ab_test['experiment'].value_counts()
experiment_countValues.plot(kind = 'pie', colors = colours, figsize = (6, 6),
            title = 'Sizes of Control and Exposed Groups - % of Total Observsations',
            fontsize = 13, autopct='%1.1f%%', shadow = True)

<AxesSubplot:title={'center':'Sizes of Control and Exposed Groups - % of Total Observsations'}, ylabel='experiment'>

None

There is thankfully an equal split of observations in our control and exposed groups

3.2) Response Rates of Both Groups

plot = pd.DataFrame(ab_test.groupby('experiment')['response'].mean()).reset_index()
plt.figure(figsize = (6, 6))
plt.bar(plot['experiment'], plot['response'], color = 'slateblue')
plt.title('Overall Response Rates of control vs exposed group', **preferred_font, fontsize = 16)
plt.xlabel('Group', **preferred_font, fontsize = 12)
plt.ylabel('Response Rate', **preferred_font, fontsize = 12)

Text(0, 0.5, 'Response Rate')

None

As shown in the crosstab previously, the exposed group has about a 2% higher response rate

3.3) Number of Responses By Time of Day

plt.figure(figsize = (8, 4))
plt.bar(ab_test.groupby('hour')['response'].sum().index,
        ab_test.groupby('hour')['response'].sum(), color = '#9AD4D6')
plt.title('Number of Responses by Hour - both groups combined', **preferred_font, fontsize = 16)
plt.xlabel('Hour of Day', **preferred_font, fontsize = 12)
plt.ylabel('Number of Responses', **preferred_font, fontsize = 12)

Text(0, 0.5, 'Number of Responses')

None

The data is dominated for both groups by the time slot of 3 pm. In other words, customers in both the control and exposed groups responded to the dummy and Creative Ad respectively to a much greater extent when they saw the ad at 3 pm.
This may simply be because most people check their computers/phones at around this hour, perhaps just before their lunch break finishes. However, this is purely theoretical on my part

3.4) Total Number of Individuals Who Received Either Ad

plt.figure(figsize = (8, 4))
plt.bar(ab_test.groupby('hour')['auction_id'].count().index,
        ab_test.groupby('hour')['auction_id'].count(), color = 'slateblue')
plt.title('Total Number of Individuals Who Received an Ad, by Hour - both groups combined',
          **preferred_font, fontsize = 14)
plt.xlabel('Hour of Day', **preferred_font, fontsize = 11)
plt.ylabel('Number of Individuals', **preferred_font, fontsize = 11)

Text(0, 0.5, 'Number of Individuals')

None

We see the same phenomenon as above, only this time for all ads received, whether they were responded to or not

3.5) Response Rates by Date and Group

# sort dataframe by date and make a copy

ab_test_date_sorted = ab_test.sort_values(by = ['experiment', 'date'])

# Draw a grouped barplot by date and group

sns.catplot(data = ab_test_date_sorted, kind = "bar", x="date",
            y = "response", hue = "experiment", palette="mako",
            alpha = 0.9, ci = None, aspect = 18/9)

plt.title('Response Rates by Date and Group', size = 18)

Text(0.5, 1.0, 'Response Rates by Date and Group')

None

The response rate for the exposed group is greater for all days except for the very last date in the experiment.
This is interesting as it could point to the existence of a ‘novelty effect’; perhaps customers are only responding to the Creative Ad (exposed group) to a greater extent compared to the dummy ad (control group) simply because the Creative Ad is ‘new’

4) Hypothesis Testing

4.1) What test should we use?

Given that we are working with proportions, 2 samples and a fairly large sample size (n > 4000 for either group), we will use a Z-test for proportions

4.2) Conduct the test

Reiterating our hypotheses:

H0: SmartAd recipients that receive the Creative Ad will have the same response rate as the response rate of recipients who receive the dummy ad

p_c = p_e

H1: SmartAd recipients that receive the Creative Ad will have a higher response rate than the response rate of recipients who receive the dummy ad

p_c < p_e

from statsmodels.stats.proportion import proportions_ztest, proportion_confint

# initialise responses for either group into separate variables

control_results = ab_test[ab_test['experiment'] == 'control']['response']
exposed_results = ab_test[ab_test['experiment'] == 'exposed']['response']

# find the number of observations for each group/sample

control_n = control_results.count()
exposed_n = exposed_results.count()

# find the number of successes for each group/sample

successes = [control_results.sum(), exposed_results.sum()]

n_observations = [control_n, exposed_n]

# conduct the test

z_stat, p_val = proportions_ztest(successes, nobs = n_observations)

(lower_con, lower_treat),(upper_con, upper_treat) = proportion_confint(successes,
                                                                       nobs = n_observations,
                                                                       alpha=0.05)

print('z-statistic: {}'.format(z_stat))
print('p-value: {}'.format(p_val))
print(f'95% Confidence Interval for control group: [{lower_con:.3f}, {upper_con:.3f}]')
print(f'95% Confidence Interval for exposed group: [{lower_treat:.3f}, {upper_treat:.3f}]')

z-statistic: -2.4978575502090976
p-value: 0.012494639102152035
95% Confidence Interval for control group: [0.133, 0.155]
95% Confidence Interval for exposed group: [0.153, 0.175]

4.3) Test and Business Conclusions

We see that our p-value < 0.05. Therefore, we can reject the null hypothesis. It is highly likely that the response rate of the exposed group which sees the creative ad is greater than the response rate of the control group which sees the dummy ad.
Moreover, the confidence intervals tell us a similar story. The CI for the exposed group i) does not include the control group response rate of 0.143945 and ii) does include the 2% increase the business aimed to see (0.143945 + 0.02 = 0.163945)
This means that the business can be highly confident that the Creative Ad elicits more response rates from its customers, and that it should from now on show the Creative Ad to all of its customers (the population)

Taimur Shabbir

A/B Testing Python Project: Testing whether a Smart Ad increases consumer response rates

1) Problem Scenario

1.2) Business Objective and Effect Size

1.3) The question we want to answer and our hypotheses:

1.31) Null and Alternative Hypotheses

1.4) Limitations

1.5) Data Source

2) Loading and Pre-Processing Data

2.1) Description of Columns

2.2) Check for duplicates and null values

2.3) Get data in the required format

3) EDA

3.1) Number of Observations in Each Sample

3.2) Response Rates of Both Groups

3.3) Number of Responses By Time of Day

3.4) Total Number of Individuals Who Received Either Ad

3.5) Response Rates by Date and Group

4) Hypothesis Testing

4.1) What test should we use?

4.2) Conduct the test

4.3) Test and Business Conclusions

You may also enjoy

SQL Ongoing Post: SQL Problems & Solutions to LeetCode and DataLemur Problems - ‘Medium’ & ‘Hard’ Difficulties

Time Series Forecast: Predicting Daily Sales for Walmart

Python Project: Analysing Customer Purchasing Behaviour in Retail

Python Project: Evaluating Marketing Campaign and Performing Customer Clustering