in thermodynamics)
Fundamental principles (rather than brute-force ML)
Effective for implementation/business logic/research (this work is based on a real-life story)
Programmatic buying of ad inventory on ad exchanges through real-time auctions
Choose the right places to buy ad inventory related to the campaign objectives
Profiling, tracking and targeting users potentially in affinity with the campaign
client goals, metrics (CPM / CPA)
Advertising perspective: marketing funnel, ad pressure
Different goals (CPM, CPA, best sites, best users), which seem contradictory, and only one variable to control: the bid.
How to optimize this process?
user-per-user. This is quite complex.
So, as the old computer science proverb says: What to do when we don't know what to do? Try machine learning!
...or we can use a top-down approach:
1. Fundamental principles (money in/conversions out).
2. Classify on broad categories and refine progressively.
3. Mathematically friendly, provides intuitions, no black box.
$\theta^{(k)}_{n+1} = \theta^{(k)}_n + \gamma_{n+1}\left(\mathbf{1}_{\{K_{n+1}=k\}} - \theta^{(k)}_n\right)$
Stochastic approximation guarantees convergence/infallibility for a learning rate $\gamma_n = O(n^{-1})$.
(Ref: Lamberton, Pagès, Tarrès. When can the two-armed bandit algorithm be trusted?)
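The update rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that K_{n+1} is the index of the arm credited with the (n+1)-th observed event, and the names `update_allocation` and `run`, the uniform starting allocation and the specific choice gamma_n = 1/(n+1) are all hypothetical.

```python
def update_allocation(theta, k_observed, gamma):
    """One step of theta[k] <- theta[k] + gamma * (1{k == k_observed} - theta[k])."""
    return [
        t + gamma * ((1.0 if k == k_observed else 0.0) - t)
        for k, t in enumerate(theta)
    ]


def run(events, n_arms):
    """Apply the update once per observed event, starting from a uniform allocation.

    `events` is a sequence of arm indices (assumption: one index per event).
    The learning rate 1/(n+1) is one possible choice satisfying gamma_n = O(1/n).
    """
    theta = [1.0 / n_arms] * n_arms
    for n, k_observed in enumerate(events, start=1):
        gamma = 1.0 / (n + 1)
        theta = update_allocation(theta, k_observed, gamma)
    return theta


if __name__ == "__main__":
    print(run([0, 0, 1], n_arms=2))
```

Because every component moves toward an indicator that sums to one across arms, the allocation vector keeps summing to one after each step.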
event time
If conversions are observed right after the spend, then the algorithm converges and is infallible.
If there is a delay between spend and conversions, the mathematical analysis is not trivial (open problem).
(Ref: Fernandez-Tapia, Monzani. Stochastic Multi-Armed Bandit Algorithm for Optimal Budget Allocation in Programmatic Advertising. Preprint)
of conversions.
Examples: post-view visit, post-click visit, filling in a form, adding items to the basket, clicking inside the homepage.
Questions: Is there a correlation with final conversions? Are these events informative or not? How to 'value' each type of event?
we are not certain it is a good 'event' for optimization
Final conversions are the right event to optimize; however, they occur only a few times per day.
How to measure the impact of intermediate conversions on final conversions?
An idea: Hawkes processes
non-homogeneous intensity. In our case, the intensity leading to the observation of final conversions is excited by an exogenous process. The intensity evolves by an equation of the form
$\lambda_t = \lambda + \int_0^t \alpha e^{-\beta(t-s)}\,dN_s$
This means that each event of the process $N$ increases the probability of observing events of our process of final conversions. (Problem: non-trivial parameter estimation)
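Since N is a counting process, the integral in the intensity equation reduces to a sum of exponentially decaying kicks, one per past event of N. The following sketch evaluates that sum directly. The function name `intensity` and its parameter names are hypothetical, and counting only events strictly before t is a convention chosen here, not something stated in the slides.

```python
import math


def intensity(t, event_times, baseline, alpha, beta):
    """Evaluate lambda_t = baseline + sum over past events s of alpha * exp(-beta * (t - s)).

    `event_times` are the jump times of the exciting process N
    (for example, timestamps of intermediate conversions).
    Only events strictly before t contribute (assumed convention).
    """
    excitation = sum(
        alpha * math.exp(-beta * (t - s)) for s in event_times if s < t
    )
    return baseline + excitation


if __name__ == "__main__":
    # One exciting event at time 1.0, intensity evaluated one time unit later.
    print(intensity(2.0, [1.0], baseline=0.2, alpha=0.5, beta=1.0))
```

With no past events the intensity equals the baseline, and each event adds at most alpha, which then fades at rate beta.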
optimize with a finer granularity, we would like to stop spending on domains and placements which are under-performing.
Because the campaign evolves, and so the data set keeps growing, we would like to apply a sequential alternative (instead of classical hypothesis testing or a likelihood-ratio test).
Solution: Sequential Probability Ratio Test (SPRT).
H0 and H1, the SPRT decides by studying the stopping time at which the following process crosses a barrier:
$T_n = \frac{\prod_{k=1}^{n} p_1(x_k)}{\prod_{k=1}^{n} p_0(x_k)}$
To test whether the intensity of observing conversions is $\lambda_0$ or $\lambda_1$, we track when the log-likelihood ratio
$\log T_n = N_n \log\frac{\lambda_1}{\lambda_0} - (\lambda_1 - \lambda_0)\,n$
crosses some given barriers.
(Ref: Peskir, Shiryaev. Sequential Testing Problems for Poisson Processes)
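As an illustration, the Poisson test statistic can be tracked period by period as follows. This is a sketch under stated assumptions: conversions are observed as counts per unit period, both intensities are known, and the barriers are Wald's classical approximations log((1 - beta) / alpha) and log(beta / (1 - alpha)), which the slides do not specify. The function name `sprt_poisson` and its parameters are hypothetical.

```python
import math


def sprt_poisson(counts, lam0, lam1, alpha=0.05, beta=0.05):
    """Sequential test of intensity lam0 (H0) against lam1 (H1).

    `counts` holds the number of conversions observed in each unit period.
    Returns (decision, number_of_periods_used), where decision is
    "H0", "H1" or "undecided".
    """
    upper = math.log((1.0 - beta) / alpha)   # accept H1 at or above this (Wald approximation)
    lower = math.log(beta / (1.0 - alpha))   # accept H0 at or below this (Wald approximation)
    log_ratio = math.log(lam1 / lam0)

    total = 0
    for n, c in enumerate(counts, start=1):
        total += c
        # Log-likelihood ratio: N_n * log(lam1 / lam0) - (lam1 - lam0) * n
        llr = total * log_ratio - (lam1 - lam0) * n
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    return "undecided", len(counts)


if __name__ == "__main__":
    print(sprt_poisson([0, 0, 0, 0], lam0=1.0, lam1=2.0))
```

In a blacklist setting, one would run such a test per domain or placement and stop spending once the test settles on the under-performing intensity.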
know the parameters $\lambda_0$ and $\lambda_1$. In practice, we are estimating $\lambda_0$ and $\lambda_1$ at the same time as we perform the test. The solution of this problem is non-trivial (work in progress: Fernandez-Tapia/Monzani).
Strategy vs. Tactics: Allocation, types of conversion and blacklists are 'strategic problems' (focused on CPA). Once we fix a context, we need to optimize how to interact with exchanges (i.e. tactical issues, focused on CPM).
∈ R+, CPM-optimally
Dynamic model:
(imps) $I(t) = w(t)\,B(t)$
(spend) $S(t) = w(t)\,B(t)\,p(t)$
Def:
w: win rate (pct. of auctions won)
p: average price per impression
B: bid requests per unit of time
to min. budget with fixed impressions)
$\min S(T) = \int_0^T I(t)\,p(t)\,dt$ s.t. $I_0 = 0$, $I_T = I$
Euler-Lagrange equation: the solution of $\min \int f(t, x, \dot{x})\,dt$ with $x(0) = a$ and $x(T) = b$ satisfies
$\frac{\partial f}{\partial x} - \frac{d}{dt}\frac{\partial f}{\partial \dot{x}} = 0$
(in terms of CPM) is, at each instant, to spend proportionally to the number of bid requests per unit of time, i.e. $S(t) = C\,B(t)$
(Notice that no hypothesis about the functional relation between bid, price and win rate is needed.)
Proof: Solve the Euler-Lagrange equation.
Ref. (in a more general setting): Fernandez-Tapia. Optimal Budget-Pacing for Real-Time Bidding. Preprint.
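In discrete time, the rule S(t) = C B(t) amounts to splitting the spend across time slots in proportion to the bid-request volume of each slot. The sketch below assumes a per-slot request forecast is available and that the constant C is pinned down by a given total budget. Both are assumptions made for illustration, and the name `pacing_schedule` is hypothetical.

```python
def pacing_schedule(requests_per_slot, total_budget):
    """Return the spend per slot, S_t = C * B_t, with C = total_budget / sum(B_t).

    `requests_per_slot` is an assumed forecast of bid requests B_t per slot.
    """
    total_requests = sum(requests_per_slot)
    if total_requests <= 0:
        raise ValueError("the request forecast must contain at least one bid request")
    c = total_budget / total_requests  # the proportionality constant C
    return [c * b for b in requests_per_slot]


if __name__ == "__main__":
    # Three slots with increasing traffic; the spend follows the traffic profile.
    print(pacing_schedule([100, 300, 600], total_budget=50.0))
```

The schedule only needs the traffic profile B(t), in line with the remark that no model of the relation between bid, price and win rate is required.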
conversion)
Homogeneous segments
Homogeneous contexts
The optimal strategy (in terms of CPM) is to bid the same everywhere (regardless of statistical differences across auctions!)
Proof: Classical convex optimization.
at SSRN 2576212.
Fernandez-Tapia, J., & Monzani, C. (2015). Stochastic Multi-Armed Bandit Algorithm for Optimal Budget Allocation in Programmatic Advertising. Available at SSRN 2600473.
Narendra, K. S., & Thathachar, M. A. (2012). Learning automata: an introduction. Courier Corporation.
Lamberton, D., Pagès, G., & Tarrès, P. (2004). When can the two-armed bandit algorithm be trusted? Annals of Applied Probability, 1424-1454.
Peskir, G., & Shiryaev, A. N. (2000). Sequential testing problems for Poisson processes. Annals of Statistics, 837-859.