This talk covers TensorFlow eager execution and the Dataset API: the first half shows how to move code from graph mode to eager execution, and the second half covers the Dataset API for writing performant input pipelines.
Graphs can be annoying
● Metaprogramming
● Control flow concepts (tf.while_loop) are different from the host language (see the sketch below)
● Can't use Python data structures easily
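For contrast, here is a minimal graph-mode sketch (not from the slides; assumes TensorFlow 1.x) of a loop written with tf.while_loop, showing how graph control flow diverges from ordinary Python:

import tensorflow as tf

# Loop state is a list of tensors; the condition and body are
# functions over that state, not ordinary Python control flow.
i = tf.constant(0)
total = tf.constant(0)

def cond(i, total):
  return i < 10

def body(i, total):
  return i + 1, total + i

final_i, final_total = tf.while_loop(cond, body, [i, total])

with tf.Session() as sess:
  print(sess.run(final_total))  # 45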
Metaprogramming

x = tf.random_uniform([2, 2])
with tf.Session() as sess:
  for i in range(x.shape[0]):
    for j in range(x.shape[1]):
      print(sess.run(x[i, j]))

Each iteration adds nodes to the graph
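With eager execution enabled, the same loop runs each operation immediately and no Session is required; a minimal sketch, assuming a TensorFlow 1.x release where tf.enable_eager_execution() is available:

import tensorflow as tf
tf.enable_eager_execution()  # must be called once, at program startup

x = tf.random_uniform([2, 2])
for i in range(x.shape[0]):
  for j in range(x.shape[1]):
    print(x[i, j])  # each op executes immediately; nothing is added to a graph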
Python Control Flow

a = tf.constant(6)
while not tf.equal(a, 1):
  if tf.equal(a % 2, 0):
    a = a // 2
  else:
    a = 3 * a + 1
  print(a)

# Outputs
tf.Tensor(3, dtype=int32)
tf.Tensor(10, dtype=int32)
tf.Tensor(5, dtype=int32)
tf.Tensor(16, dtype=int32)
tf.Tensor(8, dtype=int32)
tf.Tensor(4, dtype=int32)
tf.Tensor(2, dtype=int32)
tf.Tensor(1, dtype=int32)
Using GPUs

tf.device() for manual placement:

with tf.device("/gpu:0"):
  x = tf.random_uniform([10, 10])
  y = tf.matmul(x, x)
  # x and y reside in GPU memory
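With eager execution, tensors can also be copied between devices explicitly. A sketch, assuming a GPU is available; .cpu() and .gpu() are the TF 1.x eager tensor methods:

x = tf.random_uniform([10, 10])  # created on the default device
x_gpu = x.gpu()                  # copy to GPU:0
y = tf.matmul(x_gpu, x_gpu)      # runs on the GPU
y_cpu = y.cpu()                  # copy the result back to host memory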
Building Models

The same APIs as graph building (tf.layers, tf.train.Optimizer, tf.data, etc.):

model = tf.layers.Dense(units=1, use_bias=True)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
Building Models

model = tf.layers.Dense(units=1, use_bias=True)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)

# Define a loss function
def loss(x, y):
  return tf.reduce_mean(tf.square(y - model(x)))
Training Models

# Compute and apply gradients (tfe is the conventional alias for tf.contrib.eager).
grad_fn = tfe.implicit_gradients(loss)
for (x, y) in get_next_batch():
  optimizer.apply_gradients(grad_fn(x, y))
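The same loop can be written with a gradient tape, which records the forward pass and differentiates it explicitly. A sketch, not from the slides: it assumes the model, optimizer, and loss defined above, the hypothetical get_next_batch helper, and a release where the tape is exposed as tf.GradientTape (earlier 1.x builds expose it as tfe.GradientTape):

for (x, y) in get_next_batch():
  with tf.GradientTape() as tape:
    loss_value = loss(x, y)  # the forward pass is recorded on the tape
  grads = tape.gradient(loss_value, model.variables)
  optimizer.apply_gradients(zip(grads, model.variables))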
Why are we here?
● Input data is the lifeblood of machine learning
● Modern accelerators need faster input pipelines
● Getting your data into TensorFlow can be painful
Feeding

sess.run(…, feed_dict={x: features, y: labels})

All the flexibility of Python…
…and all the performance
Input pipelines = lazy lists
Functional programming to the rescue!
● Data elements have the same type
● Dataset might be too large to materialize all at once… or infinite
● Compose functions like map() and filter() to preprocess (see the sketch below)
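A small illustration of the lazy, functional style (a sketch; these calls only describe the pipeline, and nothing runs until an iterator pulls elements):

dataset = tf.data.Dataset.range(100)
dataset = dataset.filter(lambda x: tf.equal(x % 2, 0))  # keep even numbers
dataset = dataset.map(lambda x: x * x)                  # square each element
# No data has been produced yet: the Dataset is just a description.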
The Dataset interface
Data sources and functional transformations

Create a Dataset from one or more tf.Tensor objects:

Dataset.from_tensors((features, labels))
Dataset.from_tensor_slices((features, labels))
TextLineDataset(filenames)
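The difference between the first two is easy to miss: from_tensors treats its argument as a single element, while from_tensor_slices slices along the first dimension. A sketch with made-up values:

features = tf.constant([[1, 2], [3, 4], [5, 6]])  # shape [3, 2]
labels = tf.constant([0, 1, 0])                   # shape [3]

# One element: the whole (features, labels) pair.
d1 = tf.data.Dataset.from_tensors((features, labels))

# Three elements: ([1, 2], 0), ([3, 4], 1), ([5, 6], 0).
d2 = tf.data.Dataset.from_tensor_slices((features, labels))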
The Dataset interface
Data sources and functional transformations

Or create a Dataset from another Dataset:

dataset.map(lambda x: tf.image.decode_jpeg(x))
dataset.repeat(NUM_EPOCHS)
dataset.batch(BATCH_SIZE)

...and many more.
The Dataset interface
Data sources and functional transformations

Or (in TensorFlow 1.4) create a Dataset from a Python generator:

def generator():
  while True:
    yield ...

Dataset.from_generator(generator, tf.int32)
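A slightly fuller sketch (assumes TF 1.4+; the optional output_shapes argument is passed because these elements are scalars):

def generator():
  for i in range(100):  # a finite generator, for illustration
    yield i

dataset = tf.data.Dataset.from_generator(
    generator, output_types=tf.int32, output_shapes=[])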
# Read records from a list of files.
dataset = TFRecordDataset(["file1.tfrecord", "file2.tfrecord", …])

# Parse string values into tensors.
dataset = dataset.map(lambda record: tf.parse_single_example(record, …))

# Randomly shuffle using a buffer of 10000 examples.
dataset = dataset.shuffle(10000)

# Repeat for 100 epochs.
dataset = dataset.repeat(100)

# Combine 128 consecutive elements into a batch.
dataset = dataset.batch(128)
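As a performance tweak for the pipeline above (a sketch, not from the slides; assumes TF 1.4+, with parse_fn a hypothetical stand-in for the parsing lambda), the map stage can run on multiple threads and a trailing prefetch overlaps input processing with model execution:

# parse_fn: hypothetical parsing function, e.g. the tf.parse_single_example call above.
dataset = dataset.map(parse_fn, num_parallel_calls=4)
dataset = dataset.batch(128)
dataset = dataset.prefetch(1)  # prepare the next batch while the current one trains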
The Iterator interface
Sequential access to Dataset elements

Create an Iterator from a Dataset:

dataset.make_one_shot_iterator()
dataset.make_initializable_iterator()
The Iterator interface
Sequential access to Dataset elements

Get the next element from the Iterator:

next_element = iterator.get_next()
while …:
  sess.run(next_element)
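An initializable iterator, by contrast, must be initialized explicitly before use, which lets the pipeline be parameterized by a placeholder; a sketch, not from the slides:

max_value = tf.placeholder(tf.int64, shape=[])
dataset = tf.data.Dataset.range(max_value)
iterator = dataset.make_initializable_iterator()
next_element = iterator.get_next()

with tf.Session() as sess:
  # Initialize (or re-initialize) the iterator, fixing the parameter.
  sess.run(iterator.initializer, feed_dict={max_value: 10})
  for _ in range(10):
    print(sess.run(next_element))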
dataset = …

# A one-shot iterator automatically initializes itself on first use.
iterator = dataset.make_one_shot_iterator()

# The return value of get_next() matches the dataset element type.
images, labels = iterator.get_next()
train_op = model_and_optimizer(images, labels)

# Loop until all elements have been consumed.
try:
  while True:
    sess.run(train_op)
except tf.errors.OutOfRangeError:
  pass
def input_fn():
  dataset = …
  # A one-shot iterator automatically initializes itself on first use.
  iterator = dataset.make_one_shot_iterator()
  # The return value of get_next() matches the dataset element type.
  images, labels = iterator.get_next()
  return images, labels

# The input_fn can be used as a regular Estimator input function.
estimator = tf.estimator.Estimator(…)
estimator.train(input_fn=input_fn, …)
tf.data API

tf.data.Dataset
Represents an input pipeline as functional transformations

tf.data.Iterator
Provides sequential access to the elements of a Dataset