# Tensor Networks: a brief description

October 06, 2015

## Transcript

1. ### Follow-on project: Tensor Networks, a brief description

Emir Muñoz, Fujitsu Ireland Ltd. Emir.Munoz@ie.fujitsu.com. October 2015. Sections: Network Topology, Tensor Product Nets, Relational Memories, High Ranks TPN, Applications.
2. ### Tensor Product Networks

Main reference: Smolensky, Paul: Tensor product variable binding and the representation of symbolic structures in connectionist systems, Artificial Intelligence 46 (1990), pp. 159-216. All this material came from http://www.cse.unsw.edu.au/~billw/cs9444/tensor-stuff/tensor-intro-04.html. Keywords: tensor product network, variable binding problem, rank, one-shot learning, orthonormality, relational memory, teaching and retrieval modes, proportional analogies, lesioning a network, random representations, sparse random representations, fact recognition scores, representable non-facts.
3. ### Network Topology and Activation

Figure: Connection of Units. Common model: assume weighted connections $w_{ij}$ from input units with activation $x_j$ to unit $i$. The output (activation function) for unit $i$ is $\sigma\left(\sum_j w_{ij} x_j\right)$, where $\sigma$ is a "squashing" function such as tanh.
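As a minimal sketch of the unit model above, the following computes $\sigma(\sum_j w_{ij} x_j)$ for a single unit. The function name `unit_output` and the sample weights and inputs are hypothetical, chosen only for illustration; passing the identity as the squashing function gives the linear case.

```python
import math

def unit_output(weights, inputs, squash=math.tanh):
    """Output of one unit: squash(sum_j w_ij * x_j)."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return squash(net)

# Hypothetical weights w_i1..w_i3 and input activations x_1..x_3.
w_i = [0.5, -1.0, 0.25]
x = [1.0, 0.5, 2.0]

y = unit_output(w_i, x)                             # tanh of the weighted sum
y_linear = unit_output(w_i, x, squash=lambda s: s)  # identity activation
```

The weighted sum here is 0.5 - 0.5 + 0.5 = 0.5, so `y_linear` is 0.5 and `y` is tanh(0.5).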
4. ### Network Topology: Feedforward Nets

Figure: Feedforward Nets. A net is feedforward if the graph consisting of the neurons as nodes and the connections as directed edges is a directed acyclic graph. Input nodes have no incoming edge; output nodes have no outgoing edge; anything else is called a hidden node or unit. Edges labelled with ω signify that there are connections with "trainable" weights between neurons in one "layer" and those in the next.
5. ### Network Topology: Fully Recurrent Nets

Figure: Fully Recurrent Nets.
6. ### Tensor Product Nets

Other possibilities for the activation function include linear networks (where $\sigma$ is the identity function). One particular kind is the rank 2 Tensor Product Network (TPN). TPNs come with different numbers of dimensions (rank). In the case of a TPN of rank 2, the topology is that of a matrix.
7. ### Rank 2 TPN

Figure: rank 2 TPN.
8. ### Rank 2 TPN (cont.)

The previous network is shown in teaching mode. There is also a retrieval mode, where you feed the net with (the representation of) a variable, and it outputs the value of the symbol (the "filler").
9. ### Teaching Mode

In rank 2 TPNs: vectors representing a variable and a filler are presented to the two sides of the network. The fact that the variable has that filler is learned by the network. The teaching is one-shot, unlike in other classes of neural network. Teaching is accomplished by adjusting the values of the binding unit memories.
10. ### Teaching Mode (cont.)

In rank 2 TPNs: specifically, if the $i$-th component of the filler vector is $f_i$ and the $j$-th component of the variable vector is $v_j$, then $f_i v_j$ is added to $b_{ij}$, the $(i, j)$-th binding unit memory, for all $i$ and $j$. Another way to look at this is to consider the binding units as a matrix $B$ and the filler and variable as column vectors $f$ and $v$. Then what we are doing is forming the outer product $f v^{\top}$ and adding it to $B$:

$$B = B + f v^{\top}$$
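The one-shot teaching step described above can be sketched in a few lines of plain Python. The helper names `zeros` and `teach` are hypothetical, and the two-component vectors are illustrative only; the update itself is just "add $f_i v_j$ to $b_{ij}$".

```python
def zeros(m, n):
    """An m-by-n binding matrix with all memories set to zero."""
    return [[0.0] * n for _ in range(m)]

def teach(B, f, v):
    """Add the outer product of filler f and variable v to B, in place."""
    for i, f_i in enumerate(f):
        for j, v_j in enumerate(v):
            B[i][j] += f_i * v_j
    return B

# Illustrative 2-component representations (assumed, not from the slides).
filler = [1.0, 0.0]
variable = [0.0, 1.0]

B = zeros(2, 2)
teach(B, filler, variable)   # a single presentation stores the binding
```

After this single call, only `B[0][1]` is non-zero, since it is the only position where both `filler[i]` and `variable[j]` are non-zero.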
11. ### Retrieval Mode

In rank 2 TPNs: for exact retrieval we must ensure that the vectors used to represent variables are orthogonal to each other (i.e. any two of them have dot product equal to zero), and that the same is true for the vectors used to represent the fillers. Each representation vector should also be of length 1 (i.e. the dot product of each vector with itself should be 1). It is common to refer to a set of vectors with these properties (orthogonality and length 1) as an orthonormal set. Orthonormality entails that the representation vectors are linearly independent, and in particular, if the matrix/tensor has $m$ rows and $n$ columns, then it can represent at most $m$ fillers and $n$ variables.
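A small check of the orthonormality condition stated above might look like the following sketch. The function names `dot` and `is_orthonormal` and the tolerance are assumptions for illustration.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def is_orthonormal(vectors, tol=1e-9):
    """True if every pair has dot product 0 and every vector has length 1."""
    for p, a in enumerate(vectors):
        for q, b in enumerate(vectors):
            expected = 1.0 if p == q else 0.0
            if abs(dot(a, b) - expected) > tol:
                return False
    return True

standard_basis = [[1.0, 0.0], [0.0, 1.0]]
not_unit_length = [[2.0, 0.0], [0.0, 1.0]]
not_orthogonal = [[1.0, 0.0], [1.0, 0.0]]

ok = is_orthonormal(standard_basis)
bad_length = is_orthonormal(not_unit_length)
bad_angle = is_orthonormal(not_orthogonal)
```

Only `standard_basis` passes; the other two sets violate the length and orthogonality conditions respectively.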
12. ### Retrieval from a TP Net

In rank 2 TPNs: retrieval is accomplished by computing dot products. To retrieve the value/filler for a variable $v = (v_j)$ from a rank 2 tensor with binding unit values $b_{ij}$, compute $f_i = \sum_j b_{ij} v_j$ for each $i$. The resulting vector $(f_i)$ represents the filler. To decide whether variable $v$ has filler $f$, compute $D = \sum_i \sum_j b_{ij} v_j f_i$. $D$ will be either 1 or 0. If it is 1, then variable $v$ has filler $f$, otherwise not.
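The two retrieval computations above can be sketched as follows. The names `retrieve` and `fact_score` are hypothetical, and the binding matrix is a hand-built example that stores one binding between standard basis vectors.

```python
def retrieve(B, v):
    """f_i = sum_j b_ij * v_j: the filler bundle bound to variable v."""
    return [sum(b_ij * v_j for b_ij, v_j in zip(row, v)) for row in B]

def fact_score(B, v, f):
    """D = sum_i sum_j b_ij * v_j * f_i."""
    return sum(f_i * r_i for f_i, r_i in zip(f, retrieve(B, v)))

# Binding matrix after teaching filler (1, 0) for variable (0, 1).
B = [[0.0, 1.0],
     [0.0, 0.0]]

stored_variable = [0.0, 1.0]
other_variable = [1.0, 0.0]
filler = [1.0, 0.0]

found = retrieve(B, stored_variable)
d_yes = fact_score(B, stored_variable, filler)
d_no = fact_score(B, other_variable, filler)
```

With orthonormal representations and each pair taught once, `d_yes` comes out as 1 and `d_no` as 0, matching the decision rule on the slide.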
13. ### Learning with Hadamard Representations

In rank 2 TPNs. Example: suppose we are using representations as follows: $(0.5, 0.5, 0.5, 0.5)$ to represent rabbit; $(0.5, -0.5, 0.5, -0.5)$ to represent mouse; $(0.5, 0.5, -0.5, -0.5)$ to represent carrot; $(0.5, -0.5, -0.5, 0.5)$ to represent cat; and we want to build a tensor to represent the pairs (rabbit, carrot) and (cat, mouse).
14. ### Learning with Hadamard Representations (cont.)

In rank 2 TPNs. Example:

$$\text{carrot} \times \text{rabbit} = \frac{1}{2}\begin{pmatrix} 1 \\ 1 \\ -1 \\ -1 \end{pmatrix} \times \frac{1}{2}\begin{pmatrix} 1 & 1 & 1 & 1 \end{pmatrix} = \frac{1}{4}\begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ -1 & -1 & -1 & -1 \\ -1 & -1 & -1 & -1 \end{pmatrix}$$

(Applying Kronecker product)
15. ### Learning with Hadamard Representations (cont.)

In rank 2 TPNs: we check that we can recover carrot from this by unbinding with rabbit. We must compute $f_i = \sum_j b_{ij} v_j$, where $b_{ij}$ is the matrix and $(v_j)$ is the rabbit vector. Example:

$$f_1 = b_{11}v_1 + b_{12}v_2 + b_{13}v_3 + b_{14}v_4 = \frac{1}{2} \times \frac{1}{4} \times (1 \times 1 + 1 \times 1 + 1 \times 1 + 1 \times 1) = 0.5$$

and similarly $f_2 = 0.5$, $f_3 = -0.5$, and $f_4 = -0.5$, so that $f$ represents carrot.
16. ### Learning with Hadamard Representations (cont.)

In rank 2 TPNs: for (cat, mouse), we compute. Example:

$$\text{mouse} \times \text{cat} = \frac{1}{2}\begin{pmatrix} 1 \\ -1 \\ 1 \\ -1 \end{pmatrix} \times \frac{1}{2}\begin{pmatrix} 1 & -1 & -1 & 1 \end{pmatrix} = \frac{1}{4}\begin{pmatrix} 1 & -1 & -1 & 1 \\ -1 & 1 & 1 & -1 \\ 1 & -1 & -1 & 1 \\ -1 & 1 & 1 & -1 \end{pmatrix}$$

(Applying Kronecker product)
17. ### Learning with Hadamard Representations (cont.)

In rank 2 TPNs: the tensor representing both of these is the sum of the two matrices. Example:

$$\frac{1}{4}\begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ -1 & -1 & -1 & -1 \\ -1 & -1 & -1 & -1 \end{pmatrix} + \frac{1}{4}\begin{pmatrix} 1 & -1 & -1 & 1 \\ -1 & 1 & 1 & -1 \\ 1 & -1 & -1 & 1 \\ -1 & 1 & 1 & -1 \end{pmatrix} = \frac{1}{4}\begin{pmatrix} 2 & 0 & 0 & 2 \\ 0 & 2 & 2 & 0 \\ 0 & -2 & -2 & 0 \\ -2 & 0 & 0 & -2 \end{pmatrix}$$
18. ### Learning with Hadamard Representations (cont.)

In rank 2 TPNs: we check that we can still recover carrot from this by unbinding with rabbit. We must compute $f_i = \sum_j b_{ij} v_j$, where $b_{ij}$ is the (new) matrix and $(v_j)$ is the rabbit vector. Example:

$$f_1 = b_{11}v_1 + b_{12}v_2 + b_{13}v_3 + b_{14}v_4 = \frac{1}{2} \times \frac{1}{4} \times (2 \times 1 + 0 \times 1 + 0 \times 1 + 2 \times 1) = 0.5$$

and similarly $f_2 = 0.5$, $f_3 = -0.5$, and $f_4 = -0.5$, so that $f$ represents carrot as before.
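The worked Hadamard example can be replayed in plain Python as a sanity check. The four vectors are the ones given in the slides; the helper names `outer`, `add`, and `retrieve` are hypothetical. Because every value is a multiple of 0.125, the floating-point arithmetic here is exact.

```python
rabbit = [0.5, 0.5, 0.5, 0.5]
mouse = [0.5, -0.5, 0.5, -0.5]
carrot = [0.5, 0.5, -0.5, -0.5]
cat = [0.5, -0.5, -0.5, 0.5]

def outer(f, v):
    """Outer product: entry (i, j) is f_i * v_j."""
    return [[f_i * v_j for v_j in v] for f_i in f]

def add(A, C):
    """Elementwise sum of two matrices of the same shape."""
    return [[a + c for a, c in zip(row_a, row_c)] for row_a, row_c in zip(A, C)]

def retrieve(B, v):
    """f_i = sum_j b_ij * v_j."""
    return [sum(b * x for b, x in zip(row, v)) for row in B]

# Teach (rabbit, carrot) and (cat, mouse): the variable is the animal asked
# about, the filler is what is bound to it.
B = add(outer(carrot, rabbit), outer(mouse, cat))

from_rabbit = retrieve(B, rabbit)
from_cat = retrieve(B, cat)
```

The first row of `B` is `[0.5, 0.0, 0.0, 0.5]`, which is the $\frac{1}{4}(2, 0, 0, 2)$ row of the summed matrix, and unbinding with rabbit or cat returns carrot or mouse respectively.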
19. ### TP Nets as Relational Memories

So far we have been using TPNs to store a particular kind of relational information: variable binding. In variable binding, each variable has a unique filler (at any given time). This restriction on the kind of information stored in the tensor is unnecessary. A rank 2 tensor will store an arbitrary binary relation:

| Animal | Food |
| --- | --- |
| rabbit | carrot |
| mouse | cheese |
| crocodile | student |
| rabbit | lettuce |
| guinea pig | lettuce |
| crocodile | lecturer |
20. ### TP Nets as Relational Memories (cont.)

This information can be represented and stored in the tensor in the usual way, putting the "animal" on the side we have been calling "variable" and the "food" on the side we have been calling "filler". Retrieval is the same. Example: we can present the vector representing rabbit to the variable/animal side of the tensor. What we get out of the filler/food side of the tensor will be the sum of the vectors representing the foods that the tensor has been taught that rabbit eats: in this case carrot + lettuce.
21. ### TP Nets as Relational Memories (cont.)

Checking a particular fact, like that (mouse, cheese) is in the relation, is done just as before: we compute $D = \sum_i \sum_j b_{ij} v_j f_i$, where $v$ is for varmint and $f$ for food, and if $D = 1$ then the varmint eats the food.
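The animal/food relation can be stored and queried as in the following sketch. It assumes one-hot (standard basis) vectors as the orthonormal representations, which is a choice made here for readability and not something the slides prescribe; `one_hot`, `retrieve`, and `fact_score` are hypothetical helper names.

```python
animals = ["rabbit", "mouse", "crocodile", "guinea pig"]
foods = ["carrot", "cheese", "student", "lettuce", "lecturer"]
eats = [("rabbit", "carrot"), ("mouse", "cheese"), ("crocodile", "student"),
        ("rabbit", "lettuce"), ("guinea pig", "lettuce"),
        ("crocodile", "lecturer")]

def one_hot(name, names):
    """Standard basis vector for `name` (an assumed representation)."""
    return [1.0 if n == name else 0.0 for n in names]

# Rows index the food (filler) side, columns the animal (variable) side.
B = [[0.0] * len(animals) for _ in foods]
for animal, food in eats:
    v, f = one_hot(animal, animals), one_hot(food, foods)
    for i in range(len(foods)):
        for j in range(len(animals)):
            B[i][j] += f[i] * v[j]

def retrieve(v):
    """Sum of the food vectors related to the presented animal vector."""
    return [sum(b * x for b, x in zip(row, v)) for row in B]

def fact_score(v, f):
    """D = sum_i sum_j b_ij * v_j * f_i."""
    return sum(f_i * r_i for f_i, r_i in zip(f, retrieve(v)))

bundle = retrieve(one_hot("rabbit", animals))
rabbit_foods = [food for food, a in zip(foods, bundle) if a > 0.5]
d_cheese = fact_score(one_hot("mouse", animals), one_hot("cheese", foods))
d_carrot = fact_score(one_hot("mouse", animals), one_hot("carrot", foods))
```

Presenting rabbit yields the bundle carrot + lettuce, and the fact score is 1 for (mouse, cheese) and 0 for (mouse, carrot).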
22. ### Rank 3 Tensors

We could better call these nets "matrix nets". The tensor aspect of things comes in when we generalise to enable us to store ternary (or higher rank) relations. Suppose we have ternary relations like kiss(frank, betty) and hit(max, frank). Now we need a tensor net with three sides: say a REL side, an ARG1 side and an ARG2 side, or more generally a $u$ side, a $v$ side and a $w$ side.
23. ### Gross Structure of a Rank 3 TP Net

This shows binding units, some activated (shaded), and neurons in the $u$, $v$, and $w$ vectors, but not interconnections.
24. ### Gross Structure of a Rank 3 TP Net (cont.)

The net is ready for retrieval from the $u$ side, given $v$ and $w$. There are 27 binding units, $3 \times 3 \times 3$. In general, if the $u$, $v$, and $w$ sides use vectors with $q$ components, there are $q^3$ binding units.
25. ### Connections of a Rank 3 TP Net

The binding units are labelled $t_{ijk}$ ($t$ for tensor). Each component of a side is connected to a hyperplane of binding units, e.g. $v_1$ is connected to $t_{i1k}$ for all $i$ and $k$.
26. ### Retrieval in a Rank 3 Tensor

If we have concepts (or their representations) for any two sides of the tensor, then we can retrieve something from the third side. Example: if we have $u = (u_i)$ and $v = (v_j)$, then we can compute $w_k = \sum_{ij} t_{ijk} u_i v_j$ for each value of $k$, and the result will be the sum of the vectors representing concepts $w$ such that $u(v, w)$ is stored in the tensor. This time the activation function for $w_k$ is not linear but multi-linear. As usual, we can check facts, too: $D = \sum_{ijk} t_{ijk} u_i v_j w_k$ is 1 exactly when $u(v, w)$ is stored in the tensor, and zero otherwise.
27. ### Teaching in a Rank 3 Tensor

To teach the network the fact $u(v, w)$, present $u$, $v$ and $w$ to the net. In teaching mode, this causes the content of each binding unit memory $t_{ijk}$ to be altered by adding $u_i v_j w_k$ to it.
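Rank 3 teaching, retrieval, and fact checking can be sketched together. The example facts kiss(frank, betty) and hit(max, frank) are from the slides; the one-hot representations, the choice of $q = 3$, and the helper names are assumptions made for illustration. The REL side and the argument sides each use their own copy of the standard basis.

```python
def zeros3(q):
    """A q x q x q block of binding units t[i][j][k], all zero."""
    return [[[0.0] * q for _ in range(q)] for _ in range(q)]

def teach3(t, u, v, w):
    """Add u_i * v_j * w_k to every binding unit t_ijk, in place."""
    for i, u_i in enumerate(u):
        for j, v_j in enumerate(v):
            for k, w_k in enumerate(w):
                t[i][j][k] += u_i * v_j * w_k

def retrieve_w(t, u, v):
    """w_k = sum_ij t_ijk * u_i * v_j."""
    q = len(t)
    return [sum(t[i][j][k] * u[i] * v[j] for i in range(q) for j in range(q))
            for k in range(q)]

def fact_score3(t, u, v, w):
    """D = sum_ijk t_ijk * u_i * v_j * w_k."""
    return sum(w_k * r_k for w_k, r_k in zip(w, retrieve_w(t, u, v)))

e = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
kiss, hit = e[0], e[1]               # REL side
frank, betty, max_ = e[0], e[1], e[2]  # ARG1 and ARG2 sides

t = zeros3(3)
teach3(t, kiss, frank, betty)   # kiss(frank, betty)
teach3(t, hit, max_, frank)     # hit(max, frank)

who_frank_kissed = retrieve_w(t, kiss, frank)
d_stored = fact_score3(t, hit, max_, frank)
d_not_stored = fact_score3(t, hit, frank, max_)
```

Retrieval from the $w$ side given kiss and frank returns the betty vector, and the fact score distinguishes hit(max, frank) from the unstored hit(frank, max).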
28. ### Higher Rank Tensor Product Networks

For a rank $r$ tensor product network: the binding units would have $r$ subscripts, $t_{i_1 i_2 \ldots i_r}$; there would be $r$ sides; there would be $r$ input/output vectors, say $u^1, u^2, \ldots, u^r$; to teach the tensor the fact $u^1(u^2, \ldots, u^r)$, add $u^1_{i_1} \times u^2_{i_2} \times \ldots \times u^r_{i_r}$ to each binding unit $t_{i_1 i_2 \ldots i_r}$; to retrieve, say, the $r$-th component given the first $r - 1$, you would compute

$$u^r_{i_r} = \sum_{i_1, i_2, \ldots, i_{r-1}} t_{i_1 i_2 \ldots i_r}\, u^1_{i_1} u^2_{i_2} \cdots u^{r-1}_{i_{r-1}}$$
29. ### Higher Rank Tensor Product Networks (cont.)

This rapidly becomes impractical, as the size of the network (number of binding units) grows as $n^r$. It is desirable to have $n$ fairly large in practice, since $n$ is the largest number of concepts that can be represented (per side of the tensor). For example, with a rank 6 tensor, with 64 concepts per side, we would need $64^6 = 2^{36} \approx 6.9 \times 10^{10}$ binding units (about 64 billion, counting a billion as $2^{30}$).
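The growth in network size is easy to tabulate. The sketch below uses a hypothetical helper `binding_units` and evaluates the three sizes mentioned in the slides (27, 81, and the rank 6 case).

```python
def binding_units(n, r):
    """Number of binding units for rank r with n components per side."""
    return n ** r

rank3_small = binding_units(3, 3)   # the 3 x 3 x 3 net
rank4_small = binding_units(3, 4)   # the rank 4 net with 3 components per side
rank6_large = binding_units(64, 6)  # 64 concepts per side, rank 6

for n, r in [(3, 3), (3, 4), (64, 6)]:
    print(f"n={n}, r={r}: {binding_units(n, r):,} binding units")
```

The last case is exactly $2^{36} = 68{,}719{,}476{,}736$, which is 64 times $2^{30}$.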
30. ### Gross Topology of a Rank 4 TPN

This one has 3 components for each of the 4 "directions", so it has a total of $3^4 = 81$ binding units.
31. ### Applications of TPN

Theory building for connectionist models. Construction of theories of cognition. Detailed diagrams of tensor product networks are complicated. Here is a rank 3 tensor (figure). Here $v$ and $w$ are inputs and $u$ is the output, but we could make any side the output.
32. ### Solving Proportional Analogy Problems using TPN

The aim is to simulate simple human analogical reasoning. The TPN is used to store facts relevant to the analogical reasoning problem. Proportional analogy problems are sometimes used in psychological testing. They are fairly easy for a human over a certain age, but it is not particularly clear how to solve them on a machine. A typical example is: dog : kennel :: rabbit : what? The aim is to find the best replacement for the "what?". Here the answer is burrow.
33. ### Solving Proportional Analogy Problems using TPN (cont.)

The human reasoning goes: "The dog lives-in the kennel; what does the rabbit live in? A burrow." The human names a relationship between dog and kennel, and then proceeds from there. However, the human does not pick just any relation between dog and kennel (like smaller-than(dog, kennel)): they pick the most salient relation. How? And how could we do this with a machine? The TPN approach actually finesses this question.
34. ### A set of “facts”

Example: woman loves baby; woman mother-of baby; woman bigger-than baby; woman feeds baby; mare feeds foal; mare mother-of foal; mare bigger-than foal; mare bigger-than rabbit; woman bigger-than rabbit; woman bigger-than foal; woman lives-in house; baby lives-in house; mare lives-in barn; foal lives-in barn; rabbit lives-in burrow; barn bigger-than woman; barn bigger-than baby; barn bigger-than mare; barn bigger-than foal; barn bigger-than rabbit. (Did someone think about RDF? ;-))
35. ### A set of “facts” (cont.)
36. ### Steps in the Simple Analogical Reasoning Algorithm

Present WOMAN and BABY to the arg1 and arg2 sides of the net. From the rel(ation) side of the network we get a "predicate bundle": the sum of the vectors representing predicates or relation symbols P such that the net has been taught that P(WOMAN, BABY) holds. Present this predicate bundle to the rel side of the same network and present MARE to the arg1 side of the net.
37. ### Steps in the Simple Analogical Reasoning Algorithm (cont.)

From the arg2 side of the net we get a "weighted argument bundle": the sum of the vectors representing second arguments y such that the net has been taught that P(MARE, y) holds for some P in the predicate bundle. The weight associated with each y is the number of predicates P in the predicate bundle for which P(MARE, y) holds. For the given set of facts, the arg2 bundle is 3×FOAL + 1×RABBIT. Pick the concept (arg2 item) which has the largest weight: FOAL.
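The analogy steps above can be run end to end on the listed facts. This is a sketch under stated assumptions: concepts and relations are given one-hot representations (the slides do not fix a representation), and the names `one_hot`, `rel_out`, and `arg2_out` are hypothetical.

```python
facts = [
    ("loves", "woman", "baby"), ("mother-of", "woman", "baby"),
    ("bigger-than", "woman", "baby"), ("feeds", "woman", "baby"),
    ("feeds", "mare", "foal"), ("mother-of", "mare", "foal"),
    ("bigger-than", "mare", "foal"), ("bigger-than", "mare", "rabbit"),
    ("bigger-than", "woman", "rabbit"), ("bigger-than", "woman", "foal"),
    ("lives-in", "woman", "house"), ("lives-in", "baby", "house"),
    ("lives-in", "mare", "barn"), ("lives-in", "foal", "barn"),
    ("lives-in", "rabbit", "burrow"), ("bigger-than", "barn", "woman"),
    ("bigger-than", "barn", "baby"), ("bigger-than", "barn", "mare"),
    ("bigger-than", "barn", "foal"), ("bigger-than", "barn", "rabbit"),
]
relations = list(dict.fromkeys(r for r, _, _ in facts))
concepts = list(dict.fromkeys(c for _, a, b in facts for c in (a, b)))
R, C = len(relations), len(concepts)

def one_hot(name, names):
    """Standard basis vector for `name` (an assumed representation)."""
    return [1.0 if n == name else 0.0 for n in names]

# Teach every fact: t[i][j][k] += rel_i * arg1_j * arg2_k.
t = [[[0.0] * C for _ in range(C)] for _ in range(R)]
for rel, a1, a2 in facts:
    u, v, w = one_hot(rel, relations), one_hot(a1, concepts), one_hot(a2, concepts)
    for i in range(R):
        for j in range(C):
            for k in range(C):
                t[i][j][k] += u[i] * v[j] * w[k]

def rel_out(v, w):
    """ARG1 and ARG2 in, REL out."""
    return [sum(t[i][j][k] * v[j] * w[k] for j in range(C) for k in range(C))
            for i in range(R)]

def arg2_out(u, v):
    """REL and ARG1 in, ARG2 out."""
    return [sum(t[i][j][k] * u[i] * v[j] for i in range(R) for j in range(C))
            for k in range(C)]

# woman : baby :: mare : what?
predicate_bundle = rel_out(one_hot("woman", concepts), one_hot("baby", concepts))
arg2_bundle = arg2_out(predicate_bundle, one_hot("mare", concepts))
weights = {c: x for c, x in zip(concepts, arg2_bundle) if x > 0}
best = max(weights, key=weights.get)
```

With one-hot vectors the bundle components can be read off directly as weights, giving 3 for foal and 1 for rabbit, so `best` is foal. Other orthonormal representations would need a dot product against each concept vector to recover the weights.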
38. ### Omnidirectional Access

In solving the analogy problem, the TPN was accessed in two different ways: (1) ARG1 and ARG2 in, REL out; (2) ARG1 and REL in, ARG2 out. This would not have been possible with a backprop net, where the input/output structure is "hard-wired". In the TPN, the same information in the tensor supports both these modes of operation.
39. ### Omnidirectional Access (cont.)

This is like when kids learn addition/subtraction: you learn that 9 + 7 = 16, and from this you also know that 16 − 7 = 9. We learn addition tables, but not subtraction tables. An obvious third access mode is possible: ARG2 and REL in, ARG1 out. And of course, you can have an ARG1, ARG2, and REL in, YES/NO out access mode. Less obviously, you can have access modes like: REL in, ARG1 ⊗ ARG2 out. In fact there are a total of 7 access modes to a rank 3 tensor. There are $2^k - 1$ access modes for a rank $k$ tensor. This property is referred to as omnidirectional access.
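One way to read the $2^k - 1$ count is that every non-empty subset of sides can serve as the input, with the remaining sides (or a YES/NO answer, when nothing remains) as the output. The sketch below enumerates the modes under that assumption; the function name `access_modes` is hypothetical.

```python
from itertools import combinations

def access_modes(sides):
    """All (inputs, outputs) pairs with a non-empty set of input sides."""
    modes = []
    for size in range(1, len(sides) + 1):
        for inputs in combinations(sides, size):
            outputs = tuple(s for s in sides if s not in inputs)
            # With every side presented as input, the output is a fact check.
            modes.append((inputs, outputs or ("YES/NO",)))
    return modes

modes = access_modes(("REL", "ARG1", "ARG2"))
for inputs, outputs in modes:
    print(" and ".join(inputs), "in,", " ⊗ ".join(outputs), "out")

rank4_count = len(access_modes(("A", "B", "C", "D")))
```

For rank 3 this lists 7 modes, including the two used in the analogy algorithm and the "REL in, ARG1 ⊗ ARG2 out" mode; for rank 4 the count is 15.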
40. ### Thanks!

Emir Muñoz, Emir.Munoz@ie.fujitsu.com