Simplifying time-series forecasting and real-time personalization

AWS Innovate – ML & AI Edition, October 17th, 2019

https://aws.amazon.com/events/aws-innovate-2019/emea-machine-learning-edition/

Personalization and forecasting have long been very complex problems for organizations to solve. In this session, we show you how to use Amazon Personalize and Amazon Forecast, two new services that enable you to create individualized recommendations for customers and deliver highly accurate forecasts. Both run on fully managed infrastructure and provide easy-to-use recipes that deliver high-quality models even if you have little machine learning experience.


Danilo Poccia

October 17, 2019

Transcript

  1. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
     Simplifying time-series forecasting and real-time personalization
     Danilo Poccia, Principal Evangelist, Serverless, Amazon Web Services (@danilop)
  2. Agenda
     - Time-series forecasting
     - Amazon Forecast – introduction & demo
     - Real-time personalization & recommendation
     - Amazon Personalize – introduction & demo
     - Takeaways
  3. The AWS machine learning stack
     - AI services: Amazon Rekognition Image & Video (vision), Polly, Transcribe (speech), Translate, Comprehend & Comprehend Medical (language), Lex (chatbots), Textract, Forecast (forecasting), and Personalize (recommendations)
     - ML services: Amazon SageMaker to build (notebooks, pre-built algorithms, data labeling with Ground Truth, algorithms & models from the AWS Marketplace for Machine Learning), train (one-click model training & tuning, hyperparameter optimization, reinforcement learning), and deploy (one-click deployment & hosting, auto-scaling, Virtual Private Cloud, PrivateLink, Elastic Inference integration, Neo optimization)
     - ML frameworks & infrastructure: frameworks and interfaces running on EC2 P3 & P3dn, EC2 C5, FPGAs, AWS Greengrass, Elastic Inference, and Inferentia
  4.
  5. Forecasting
     - Product demand planning
     - Financial planning
     - Resource planning
  6. Amazon Forecast
  7. Amazon Forecast workflow
     1. Create related datasets and a dataset group
     2. Get training data: import historical data to the dataset group
     3. Train a predictor (trained model) using an algorithm or AutoML
     4. Evaluate the predictor version using metrics
     5. Create a forecast (for every item in the dataset group)
     6. Retrieve forecasts for users
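The six steps above map onto a handful of Forecast API calls. A minimal sketch with boto3 (resource names, the S3 path, and the IAM role are placeholders; every `create_*` call is asynchronous, so a real script would poll each resource until it is ACTIVE before moving to the next step):

```python
def forecast_workflow(bucket_path, role_arn):
    """Sketch of the six workflow steps for a RETAIL daily-demand dataset."""
    import boto3  # imported here so the sketch can be loaded without AWS configured
    forecast = boto3.client('forecast')

    # 1. Create a dataset and a dataset group
    dataset_arn = forecast.create_dataset(
        DatasetName='item_demand', Domain='RETAIL',
        DatasetType='TARGET_TIME_SERIES', DataFrequency='D',
        Schema={'Attributes': [
            {'AttributeName': 'timestamp', 'AttributeType': 'timestamp'},
            {'AttributeName': 'item_id', 'AttributeType': 'string'},
            {'AttributeName': 'demand', 'AttributeType': 'float'}]}
    )['DatasetArn']
    group_arn = forecast.create_dataset_group(
        DatasetGroupName='retail_demo', Domain='RETAIL',
        DatasetArns=[dataset_arn])['DatasetGroupArn']

    # 2. Import historical data from S3
    forecast.create_dataset_import_job(
        DatasetImportJobName='initial_import', DatasetArn=dataset_arn,
        DataSource={'S3Config': {'Path': bucket_path, 'RoleArn': role_arn}},
        TimestampFormat='yyyy-MM-dd')

    # 3. Train a predictor with AutoML (or pass AlgorithmArn to pick an algorithm)
    predictor_arn = forecast.create_predictor(
        PredictorName='demo_predictor', PerformAutoML=True,
        ForecastHorizon=30,
        InputDataConfig={'DatasetGroupArn': group_arn},
        FeaturizationConfig={'ForecastFrequency': 'D'})['PredictorArn']

    # 4. Evaluate the predictor (wQuantileLoss, RMSE, ...)
    metrics = forecast.get_accuracy_metrics(PredictorArn=predictor_arn)

    # 5. Create a forecast for every item in the dataset group
    forecast_arn = forecast.create_forecast(
        ForecastName='demo_forecast', PredictorArn=predictor_arn)['ForecastArn']

    # 6. Retrieve the forecast for one item
    query = boto3.client('forecastquery')
    return metrics, query.query_forecast(
        ForecastArn=forecast_arn, Filters={'item_id': 'socks'})
```

Note that training resources are managed through the `forecast` client, while the step-6 lookups go through the separate `forecastquery` client.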
  8. How Amazon Forecast works
     - Dataset Groups
     - Datasets
       - TARGET_TIME_SERIES: (item_id, timestamp, demand) – demand is required
       - RELATED_TIME_SERIES: (item_id, timestamp, price) – no demand
       - ITEM_METADATA: (item_id, color, location, genre, category, …)
     - Predictors
     - Forecasts
  9. Dataset domains
     Domain             | For
     RETAIL             | retail demand forecasting
     INVENTORY_PLANNING | supply chain and inventory planning
     EC2_CAPACITY       | forecasting Amazon EC2 capacity
     WORK_FORCE         | work force planning
     WEB_TRAFFIC        | estimating future web traffic
     METRICS            | forecasting metrics, such as revenue and cash flow
     CUSTOM             | all other types of time-series forecasting
  10. TARGET_TIME_SERIES dataset
      timestamp   item_id  store  demand
      2019-01-01  socks    NYC    25
      2019-01-05  socks    SFO    45
      2019-02-01  shoes    ORD    10
      ...
      2019-06-01  socks    NYC    100
      2019-06-05  socks    SFO    5
      2019-07-01  shoes    ORD    50
  11. Dataset schema (timestamps use the "YYYY-MM-DD hh:mm:ss" format)
      {
        "attributes": [
          { "attributeName": "timestamp", "attributeType": "timestamp" },
          { "attributeName": "item_id",   "attributeType": "string" },
          { "attributeName": "store",     "attributeType": "string" },
          { "attributeName": "demand",    "attributeType": "float" }
        ]
      }
  12. Data alignment – data is automatically aggregated by forecast frequency, for example, hourly, daily, or weekly.
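For intuition, this hypothetical helper shows the kind of roll-up Forecast performs for you when the raw records are finer-grained than the forecast frequency (here, summing intra-day sales into daily data points):

```python
from collections import defaultdict

def aggregate(records, freq_key):
    """Roll raw records up to a forecast frequency by summing demand.
    `records` are (timestamp, item_id, demand) tuples; `freq_key` maps a
    timestamp string to its bucket (e.g. the date part for daily frequency)."""
    totals = defaultdict(float)
    for ts, item, demand in records:
        totals[(freq_key(ts), item)] += demand
    return dict(totals)

# Two intra-day sales on 2019-01-01 collapse into one daily data point:
raw = [('2019-01-01 09:00:00', 'socks', 10),
       ('2019-01-01 15:30:00', 'socks', 5),
       ('2019-01-02 11:00:00', 'socks', 7)]
daily = aggregate(raw, freq_key=lambda ts: ts[:10])
# daily == {('2019-01-01', 'socks'): 15.0, ('2019-01-02', 'socks'): 7.0}
```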
  13. RELATED_TIME_SERIES dataset
      timestamp   item_id  store  price
      2019-01-01  socks    NYC    10
      2019-01-02  socks    NYC    10
      2019-01-03  socks    NYC    15
      ...
      2019-01-05  socks    SFO    45
      2019-06-05  socks    SFO    10
      2019-07-11  socks    SFO    30
      ...
      2019-02-01  shoes    ORD    50
      2019-07-01  shoes    ORD    75
      2019-07-11  shoes    ORD    60
  14. Algorithms
      ARIMA   – Autoregressive Integrated Moving Average, a commonly used local statistical algorithm for time-series forecasting.
      DeepAR+ – a supervised learning algorithm for forecasting scalar (one-dimensional) time series using recurrent neural networks (RNNs). Supports hyperparameter optimization (HPO).
      ETS     – Exponential Smoothing, a commonly used local statistical algorithm for time-series forecasting.
      NPTS    – Non-Parametric Time Series, a scalable, probabilistic baseline forecaster. Especially useful when the time series is intermittent (or sparse, containing many 0s) and bursty.
      Prophet – a popular local Bayesian structural time series model.
  15. DeepAR algorithm
      "DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks"
      David Salinas, Valentin Flunkert, Jan Gasthaus (Amazon Research Germany)
      From the abstract: probabilistic forecasting, i.e. estimating the probability distribution of a time series' future given its past, is a key enabler for optimizing business processes. DeepAR produces accurate probabilistic forecasts by training an autoregressive recurrent network model on a large number of related time series, overcoming many of the challenges faced by widely used classical approaches and improving accuracy by around 15% over state-of-the-art methods on several real-world data sets.
      From the paper's Figure 2 (summary of the model): during training, at each time step the network receives the covariates, the previous target value, and the previous network output, and computes the parameters of the likelihood used to train the model; for prediction, the observed history is fed in, then samples are drawn from the likelihood and fed back step by step until the end of the prediction range, and repeating this process yields many sample traces representing the joint predicted distribution.
      https://arxiv.org/abs/1704.04110
  16. Training using a BackTestWindow
  17. Training & Testing
  18. Predictor metrics
      - wQuantileLoss[0.5]
      - Mean Absolute Percentage Error (MAPE)
      - Root Mean Square Error (RMSE)
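Forecast computes these metrics for you during backtesting, but they are easy to reproduce by hand for intuition. A sketch using the common definitions (the sample numbers are made up):

```python
import math

def w_quantile_loss(actual, predicted, tau):
    """Weighted quantile loss: penalizes under-prediction with weight tau and
    over-prediction with weight (1 - tau), normalized by total demand."""
    num = sum(2 * (tau * max(y - q, 0) + (1 - tau) * max(q - y, 0))
              for y, q in zip(actual, predicted))
    return num / sum(abs(y) for y in actual)

def mape(actual, predicted):
    """Mean Absolute Percentage Error (undefined when an actual value is 0)."""
    return sum(abs((y - f) / y) for y, f in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Square Error."""
    return math.sqrt(sum((y - f) ** 2
                         for y, f in zip(actual, predicted)) / len(actual))

y, f = [25, 45, 10, 100], [20, 50, 10, 90]
w_quantile_loss(y, f, 0.5)  # at the median this reduces to sum|error| / sum|demand|
mape(y, f)
rmse(y, f)
```

At tau = 0.5 the two penalty terms are symmetric, which is why wQuantileLoss[0.5] behaves like a demand-weighted absolute error; other quantiles weight under- and over-forecasting differently.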
  19. Predictor metrics – Quantiles
  20. Getting a forecast – Interpreting P-numbers
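A P-number is a quantile of the forecast distribution: at P90, the true value is expected to fall below the forecasted value about 90% of the time. A toy illustration, deriving empirical P-numbers from sampled forecast traces with the nearest-rank method (both the samples and the method are simplifications for illustration):

```python
def p_number(samples, p):
    """Empirical P-number via the nearest-rank method: the value below
    which roughly p% of the sampled forecast values fall."""
    s = sorted(samples)
    rank = max(1, min(len(s), round(p / 100 * len(s))))
    return s[rank - 1]

# Ten sampled demand values for one item at one future time step:
samples = [38, 41, 44, 47, 50, 53, 56, 60, 65, 72]
p10, p50, p90 = (p_number(samples, p) for p in (10, 50, 90))
# p10 = 38 (stocking only this much risks under-supply most of the time),
# p50 = 50 (the median forecast), p90 = 65 (demand exceeds this only ~10% of the time)
```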
  21.
  22. Amazon Forecast examples & notebooks: https://github.com/aws-samples/amazon-forecast-samples
  23.
  24. Personalization & Recommendation
      - Personalized recommendations
      - Personalized search
      - Personalized notifications
  25. Amazon Personalize
  26. Amazon Personalize workflow
      1. Create related datasets and a dataset group
      2. Get training data: import historical data to the dataset group, and record live events to the dataset group
      3. Create a solution version (trained model) using a recipe or AutoML
      4. Evaluate the solution version using metrics
      5. Create a campaign (deploy the solution version)
      6. Provide recommendations for users
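As with Forecast, the six steps above map onto a short series of API calls. A minimal sketch with boto3 (resource names, the schema JSON, the S3 path, and the IAM role are placeholders; each `create_*` call is asynchronous, so a real script polls for ACTIVE between steps):

```python
def personalize_workflow(schema_json, s3_path, role_arn):
    """Sketch of the six workflow steps for an Interactions dataset."""
    import boto3  # imported here so the sketch can be loaded without AWS configured
    personalize = boto3.client('personalize')

    # 1. Create a dataset group and an Interactions dataset
    group_arn = personalize.create_dataset_group(
        name='movies-dsgroup')['datasetGroupArn']
    schema_arn = personalize.create_schema(
        name='interactions-schema', schema=schema_json)['schemaArn']
    dataset_arn = personalize.create_dataset(
        name='ratings-dataset', schemaArn=schema_arn,
        datasetGroupArn=group_arn, datasetType='Interactions')['datasetArn']

    # 2. Import historical data from S3 (live events go through PutEvents instead)
    personalize.create_dataset_import_job(
        jobName='initial-import', datasetArn=dataset_arn,
        dataSource={'dataLocation': s3_path}, roleArn=role_arn)

    # 3. Create a solution version, here with AutoML instead of naming a recipe
    solution_arn = personalize.create_solution(
        name='movies-solution', datasetGroupArn=group_arn,
        performAutoML=True)['solutionArn']
    version_arn = personalize.create_solution_version(
        solutionArn=solution_arn)['solutionVersionArn']

    # 4. Evaluate the solution version (precision@K, MRR, NDCG, coverage)
    metrics = personalize.get_solution_metrics(solutionVersionArn=version_arn)

    # 5. Deploy the solution version as a campaign
    campaign_arn = personalize.create_campaign(
        name='movies-campaign', solutionVersionArn=version_arn,
        minProvisionedTPS=1)['campaignArn']

    # 6. Get recommendations for a user
    runtime = boto3.client('personalize-runtime')
    return metrics, runtime.get_recommendations(
        campaignArn=campaign_arn, userId='1')
```

Training resources are managed through the `personalize` client, while recommendations are served by the separate `personalize-runtime` client.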
  27. How Amazon Personalize works
      - Dataset Groups
      - Datasets
        - Users: age, gender, or loyalty membership
        - Items: price, type, or availability
        - Interactions: between users and items
      - User Events
      - Recipes and Solutions
      - Metrics
      - Campaigns
      - Recommendations
  28. Dataset schemas
      Dataset Type | Required Fields                                      | Reserved Keywords
      Users        | USER_ID (string), 1 metadata field                   |
      Items        | ITEM_ID (string), 1 metadata field                   |
      Interactions | USER_ID (string), ITEM_ID (string), TIMESTAMP (long) | EVENT_TYPE (string), EVENT_VALUE (string)
  29. Training data
      userId  movieId  timestamp
      1       1        964982703
      1       3        964981247
      1       6        964982224
      2       47       964983815
      2       50       964982931
      2       70       964982400
      ...
  30. Training data schema – Users ("categorical": true is for categories, like genre)
      {
        "type": "record",
        "name": "Users",
        "namespace": "com.amazonaws.personalize.schema",
        "fields": [
          { "name": "USER_ID", "type": "string" },
          { "name": "AGE", "type": "int" },
          { "name": "GENDER", "type": "string", "categorical": true }
        ],
        "version": "1.0"
      }
  31. Training data schema – Interactions (an interaction between a user and an item at a specific point in time)
      {
        "type": "record",
        "name": "Interactions",
        "namespace": "com.amazonaws.personalize.schema",
        "fields": [
          { "name": "USER_ID", "type": "string" },
          { "name": "ITEM_ID", "type": "string" },
          { "name": "TIMESTAMP", "type": "long" }
        ],
        "version": "1.0"
      }
  32. Using the EVENT_TYPE and EVENT_VALUE fields
      {
        "type": "record",
        "name": "Interactions",
        "namespace": "com.amazonaws.personalize.schema",
        "fields": [
          { "name": "USER_ID", "type": "string" },
          { "name": "ITEM_ID", "type": "string" },
          { "name": "EVENT_TYPE", "type": "string" },
          { "name": "EVENT_VALUE", "type": "float" },
          { "name": "TIMESTAMP", "type": "long" }
        ],
        "version": "1.0"
      }
  33. Using categorical data
      You can include more than one category in the training data using the "vertical bar" character, also known as "pipe":
      ITEM_ID,GENRE
      item_123,horror|comedy
  34. Solution metrics (with the exception of coverage, higher is better)
      {
        "solutionVersionArn": "arn:aws:personalize:…",
        "metrics": {
          "arn:aws:personalize:…": {
            "coverage": 0.27,
            "mean_reciprocal_rank_at_25": 0.0379,
            "normalized_discounted_cumulative_gain_at_5": 0.0405,
            "normalized_discounted_cumulative_gain_at_10": 0.0513,
            "normalized_discounted_cumulative_gain_at_25": 0.0828,
            "precision_at_5": 0.0136,
            "precision_at_10": 0.0102,
            "precision_at_25": 0.0091
          }
        }
      }
  35. Evaluating a solution version
      coverage – the proportion of unique recommended items from all queries out of the total number of unique items in the training data.
      mean_reciprocal_rank_at_25 – the mean of the reciprocal ranks of the first relevant recommendation out of the top 25 recommendations over all queries. Appropriate if you're interested in the single highest-ranked recommendation.
      normalized_discounted_cumulative_gain_at_K – discounted gain assumes that recommendations lower on a list are less relevant than higher ones, so each recommendation is given a lower weight by a factor dependent on its position. Rewards relevant items that appear near the top of the list, because the top of a list usually draws more attention.
      precision_at_K – the number of relevant recommendations out of the top K recommendations, divided by K. Rewards precise recommendation of the relevant items.
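To build intuition for these definitions, precision@K and a single query's reciprocal rank fit in a few lines. A toy re-implementation (Personalize evaluates these over held-out interactions across all queries; the item IDs below are made up):

```python
def precision_at_k(recommended, relevant, k):
    """Relevant items among the top-k recommendations, divided by k."""
    return sum(1 for item in recommended[:k] if item in relevant) / k

def reciprocal_rank(recommended, relevant, k=25):
    """1 / rank of the first relevant recommendation in the top k, else 0."""
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            return 1 / rank
    return 0.0

recs = ['m50', 'm47', 'm1', 'm70', 'm3']   # ranked recommendations for one user
held_out = {'m1', 'm3'}                    # items the user actually interacted with
precision_at_k(recs, held_out, 5)   # 2 relevant items in the top 5 -> 0.4
reciprocal_rank(recs, held_out)     # first hit at rank 3 -> 1/3
```

mean_reciprocal_rank_at_25 is then just the average of `reciprocal_rank` over all evaluation queries.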
  36. Recording live events – Getting a tracking ID

      import boto3

      personalize = boto3.client('personalize')

      response = personalize.create_event_tracker(
          name='MovieClickTracker',
          datasetGroupArn='arn:aws:personalize:…'
      )
      print(response['eventTrackerArn'])
      print(response['trackingId'])
  37. Recording live events – Event-interactions dataset (a new dataset is created automatically for the tracking events)
      {
        "datasets": [
          {
            "name": "ratings-dsgroup/EVENT_INTERACTIONS",
            "datasetArn": "arn:aws:personalize:…",
            "datasetType": "EVENT_INTERACTIONS",
            "status": "ACTIVE",
            "creationDateTime": 1554304597.806,
            "lastUpdatedDateTime": 1554304597.806
          },
          {
            "name": "ratings-dataset",
            "datasetArn": "arn:aws:personalize:…",
            "datasetType": "INTERACTIONS",
            "status": "ACTIVE",
            "creationDateTime": 1554299406.53,
            "lastUpdatedDateTime": 1554299406.53
          }
        ],
        "nextToken": "..."
      }
  38. Recording live events – PutEvents operation

      import boto3

      personalize_events = boto3.client(service_name='personalize-events')

      personalize_events.put_events(
          trackingId='tracking_id',
          userId='USER_ID',
          sessionId='session_id',
          eventList=[{
              'sentAt': TIMESTAMP,
              'eventType': 'EVENT_TYPE',
              'properties': "{\"itemId\": \"ITEM_ID\"}"
          }]
      )
  39. More advanced PutEvents operation – multiple events with more data

      import json

      personalize_events.put_events(
          trackingId='tracking_id',
          userId='user555',
          sessionId='session1',
          eventList=[{
              'eventId': 'event1',
              'sentAt': '1553631760',
              'eventType': 'like',
              'properties': json.dumps({
                  'itemId': 'choc-panama',
                  'eventValue': 'true'
              })
          }, {
              'eventId': 'event2',
              'sentAt': '1553631782',
              'eventType': 'rating',
              'properties': json.dumps({
                  'itemId': 'movie_ten',
                  'eventValue': '4',
                  'numRatings': '13'
              })
          }]
      )
  40. Recording live events with AWS Amplify – using Amazon Personalize

      import { Analytics, AmazonPersonalizeProvider } from 'aws-amplify';

      Analytics.addPluggable(new AmazonPersonalizeProvider());

      // Configure the plugin after adding it to the Analytics module
      Analytics.configure({
          AmazonPersonalize: {
              // REQUIRED - The trackingId to track the events
              trackingId: '<TRACKING_ID>',
              // OPTIONAL - Amazon Personalize service region
              region: 'XX-XXXX-X',
              // OPTIONAL - The number of events to be deleted from the buffer when flushed
              flushSize: 10,
              // OPTIONAL - The interval in ms to perform a buffer check and flush if necessary
              flushInterval: 5000, // 5s
          }
      });
  41. Recording live events with AWS Amplify – send events from the browser

      Analytics.record({
          eventType: "Identify",
          properties: {
              "userId": "<USER_ID>"
          }
      }, 'AmazonPersonalize');

      Analytics.record({
          eventType: "<EVENT_TYPE>",
          userId: "<USER_ID>", // optional
          properties: {
              "itemId": "<ITEM_ID>",
              "eventValue": "<EVENT_VALUE>"
          }
      }, "AmazonPersonalize");

      https://aws-amplify.github.io/docs/js/analytics
  42. Using predefined recipes
      Recipe type          | API                    | userId   | itemId   | inputList
      USER_PERSONALIZATION | GetRecommendations     | required | optional | N/A
      PERSONALIZED_RANKING | GetPersonalizedRanking | required | N/A      | list of itemIds
      RELATED_ITEMS        | GetRecommendations     | not used | required | N/A
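The table above maps onto the runtime API like this, one call per recipe type (a sketch with boto3; the campaign ARNs and item IDs are placeholders for campaigns you have already deployed):

```python
def query_campaigns(user_campaign, ranking_campaign, sims_campaign):
    """One runtime call per recipe type from the table above."""
    import boto3  # imported here so the sketch can be loaded without AWS configured
    runtime = boto3.client('personalize-runtime')

    # USER_PERSONALIZATION: userId is required, itemId is optional
    recs = runtime.get_recommendations(
        campaignArn=user_campaign, userId='user555')

    # PERSONALIZED_RANKING: rerank a candidate item list for this user
    ranking = runtime.get_personalized_ranking(
        campaignArn=ranking_campaign, userId='user555',
        inputList=['movie_one', 'movie_two', 'movie_ten'])

    # RELATED_ITEMS (e.g. SIMS): itemId is required, userId is not used
    similar = runtime.get_recommendations(
        campaignArn=sims_campaign, itemId='movie_ten')

    return recs, ranking, similar
```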
  43. Predefined USER_PERSONALIZATION recipes
      HRNN – a hierarchical recurrent neural network, which can model the temporal order of user-item interactions. Recommended when user behavior is changing with time (the evolving-intent problem). Supports AutoML.
      HRNN-Metadata – HRNN with additional features derived from contextual metadata (Interactions dataset), along with user and item metadata (Users and Items datasets). Performs better than non-metadata models when high-quality metadata is available. Can involve longer training times. Supports AutoML and metadata.
      HRNN-Coldstart – similar to HRNN-Metadata, with personalized exploration of new items. Recommended when frequently adding new items to a catalog and you want the items to immediately show up in recommendations. Supports AutoML and metadata.
      Popularity-Count – calculates popularity of items based on a count of events against that item in the user-item interactions dataset. Use as a baseline to compare other user-personalization recipes.
  44. Hierarchical Recurrent Neural Networks
      "Personalizing Session-based Recommendations with Hierarchical Recurrent Neural Networks"
      Massimo Quadrana and Paolo Cremonesi (Politecnico di Milano), Alexandros Karatzoglou (Telefonica Research), Balázs Hidasi (Gravity R&D) – RecSys'17
      From the abstract: session-based recommendations are highly relevant in many modern online services (e.g. e-commerce, video streaming), and Recurrent Neural Networks have recently been shown to perform very well in session-based settings. While in many session-based recommendation domains user identifiers are hard to come by, there are also domains in which user profiles are readily available. The paper proposes a seamless way to personalize RNN models with cross-session information transfer and devises a Hierarchical RNN model that relays and evolves latent hidden states of the RNNs across user sessions. Results on two industry datasets show large improvements over the session-only RNNs.
      From the paper's Figure 1: the model is a hierarchy of two GRUs. The session-level GRU models the user activity within sessions and generates the recommendations, while the user-level GRU models the evolution of the user across sessions and personalizes the session-level GRU by initializing its hidden state (and, optionally, by propagating the user representation in input). In a plain session-based RNN, users who interacted with the same sequence of items in a session would get the same recommendations; in HRNN, recommendations are influenced by the user's past sessions as well.
      https://arxiv.org/abs/1706.04148
  45. Predefined PERSONALIZED_RANKING recipes
      Personalized-Ranking – use this recipe when you're personalizing the results for your users, such as personalized reranking of search results or curated lists.
  46. Predefined RELATED_ITEMS recipes
      SIMS – item-to-item similarities, based on the concept of collaborative filtering. It generates items similar to a given item based on co-occurrence of the item in user histories in the user-item interactions dataset. In the absence of sufficient user behavior data for an item, or if the specified item ID is not found, the algorithm returns popular items as recommendations. Use for improving item discoverability and in detail pages. Provides fast performance.
  47.
  48. Amazon Personalize examples & notebooks: https://github.com/aws-samples/amazon-personalize-samples
  49.
  50. Takeaways
      - Forecasting and personalization can help improve your business efficiency
      - Amazon Forecast provides accurate time-series forecasting
      - Amazon Personalize provides real-time personalization and recommendations
      - Both are based on the same technology used at Amazon.com and don't require machine learning expertise to use
  51. Links
      - Blogs
        - https://aws.amazon.com/blogs/aws/amazon-forecast-time-series-forecasting-made-easy/
        - https://aws.amazon.com/blogs/aws/amazon-forecast-now-generally-available/
        - https://aws.amazon.com/blogs/aws/amazon-personalize-real-time-personalization-and-recommendation-for-everyone/
        - https://aws.amazon.com/blogs/aws/amazon-personalize-is-now-generally-available/
      - Examples & notebooks
        - https://github.com/aws-samples/amazon-forecast-samples
        - https://github.com/aws-samples/amazon-personalize-samples
  52. Links
      - Training algorithms
        - DeepAR: https://arxiv.org/abs/1704.04110
        - HRNN: https://arxiv.org/abs/1706.04148
      - Evaluating performance of a trained model
        - https://en.wikipedia.org/wiki/Mean_absolute_percentage_error (MAPE)
        - https://en.wikipedia.org/wiki/Quantile_regression
        - https://en.wikipedia.org/wiki/Mean_reciprocal_rank
        - https://en.wikipedia.org/wiki/Discounted_cumulative_gain
  53. Thank you! Danilo Poccia, @danilop