Tensor Products Introduced as Most General Multiplication (and as Lazy Evaluation)
This note is a fresh attempt at introducing tensor products in a simple and intuitive way. The setting is a vector space $V$ on which we wish to define a multiplication. Importantly, we do not wish to limit ourselves to a multiplication producing an element of $V$. For example, we would like to think of $u^T v$ and $u v^T$ as two examples of multiplying elements $u, v$ of $\mathbb{R}^n$. The first multiplication, $(u, v) \mapsto u^T v$, is a map from $\mathbb{R}^n \times \mathbb{R}^n$ to $\mathbb{R}$. The second multiplication, $(u, v) \mapsto u v^T$, is a map from $\mathbb{R}^n \times \mathbb{R}^n$ to $\mathbb{R}^{n \times n}$. In fact, since $u v^T$ is well-defined even if $u$ and $v$ have different dimensions, the most general setting is the following.
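Let $U$ and $V$ be fixed vector spaces agreed upon at the start. Assume someone has defined a multiplication rule $m \colon U \times V \rightarrow W$ but has not told us what it is. Here, $m$ is a function and $W$ is a vector space, both of which are unknown to us. For example, if $U = \mathbb{R}^n$ and $V = \mathbb{R}^p$, a possible choice might be $m(u, v) = u v^T$ where $W = \mathbb{R}^{n \times p}$.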
What does it mean for $m$ to be a rule for multiplication? We wish it to have certain properties characteristic of multiplication. In a field, multiplication is involved in a number of axioms, including the distributive law $x(y+z) = xy + xz$. Since $U$, $V$ and $W$ are vector spaces, potentially distinct from each other, not all the axioms for multiplication in a field make sense in this more general setting. (For instance, commutativity makes no sense if $U \neq V$.) The outcome is that $m$ is declared to be a rule for multiplication if it is bilinear: $m(x, y+z) = m(x, y) + m(x, z)$, $m(x+y, z) = m(x, z) + m(y, z)$ and $m(ax, y) = m(x, ay) = a\,m(x, y)$.
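As a quick sanity check, the following sketch tests the bilinearity conditions numerically for the two multiplications mentioned earlier, the inner product $u^T v$ and the outer product $u v^T$. The helper names (`add`, `scale`, `inner`, `outer`, `mat_add`, `mat_scale`) and the sample vectors are illustrative choices, not part of the original note, and checking a few sample vectors does not prove bilinearity in general.

```python
# Pure-Python spot check that two candidate multiplications are bilinear.
# Vectors are lists of ints so that equality tests are exact.

def add(x, y):
    """Componentwise sum of two vectors of equal length."""
    return [a + b for a, b in zip(x, y)]

def scale(a, x):
    """Scalar multiple of a vector."""
    return [a * xi for xi in x]

def inner(u, v):
    """m(u, v) = u^T v, a scalar."""
    return sum(a * b for a, b in zip(u, v))

def outer(u, v):
    """m(u, v) = u v^T, a len(u) x len(v) matrix stored as nested lists."""
    return [[a * b for b in v] for a in u]

def mat_add(A, B):
    """Entrywise sum of two matrices of equal shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_scale(c, A):
    """Scalar multiple of a matrix."""
    return [[c * a for a in row] for row in A]

x, y, z, a = [1, 2], [3, -1], [0, 5], 7

# Outer product: additive in each argument, and scalars pull out.
assert outer(x, add(y, z)) == mat_add(outer(x, y), outer(x, z))
assert outer(add(x, y), z) == mat_add(outer(x, z), outer(y, z))
assert outer(scale(a, x), y) == outer(x, scale(a, y)) == mat_scale(a, outer(x, y))

# Inner product: the same three conditions, with W being the real numbers.
assert inner(x, add(y, z)) == inner(x, y) + inner(x, z)
assert inner(add(x, y), z) == inner(x, z) + inner(y, z)
assert inner(scale(a, x), y) == inner(x, scale(a, y)) == a * inner(x, y)
```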
Not knowing $m$ does not prevent us from working with it; we simply write it as $m$ and are free to manipulate it according to the aforementioned rules. We do the equivalent all the time when we prove general results about a metric; we say “let $d(\cdot, \cdot)$ be a metric” and then make use of only the axioms of a metric, thereby ensuring our derivation is valid regardless of which metric is actually being represented by $d$.
While the tensor product is more than this, in the first instance, it suffices to treat it merely as an unspecified rule for multiplication. (It is specified, but no harm comes from forgetting about this in the first instance.) Rather than write $m(u, v)$ it is more convenient to write $u \otimes v$, and thinking of $\otimes$ as multiplication reminds us that $x \otimes (y + z) = x \otimes y + x \otimes z$.
The first point is that we can treat $\otimes$ as an unspecified multiplication, simplify expressions by manipulating it in accordance with the rules for multiplication, and at the end of the day, if we are eventually told the rule $m$, we can evaluate the simplified expression. In computer science parlance, this can be thought of as lazy evaluation.
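The lazy-evaluation reading can be made concrete with a small sketch. Here an expression is kept as a list of unevaluated terms, and a concrete rule is substituted only at the end. The representation and the names `tensor`, `plus`, `times`, `evaluate` and `det2` are hypothetical, chosen for illustration, and for simplicity the sketch assumes the rule eventually revealed is scalar-valued.

```python
# A formal expression is a list of (coefficient, u, v) terms standing for
# the sum of coefficient * tensor(u, v). No product is computed until evaluate().

def tensor(u, v):
    """The formal product of u and v, stored unevaluated."""
    return [(1, tuple(u), tuple(v))]

def plus(e, f):
    """Formal sum of two expressions."""
    return e + f

def times(c, e):
    """Scalar multiple of an expression."""
    return [(c * k, u, v) for k, u, v in e]

def evaluate(expr, m):
    """Substitute a concrete scalar-valued bilinear rule m for the formal product."""
    return sum(k * m(u, v) for k, u, v in expr)

def inner(u, v):
    """One possible rule: m(u, v) = u^T v."""
    return sum(a * b for a, b in zip(u, v))

def det2(u, v):
    """Another bilinear rule on pairs of 2-vectors: the 2 x 2 determinant."""
    return u[0] * v[1] - u[1] * v[0]

u, v, w = (1, 2), (3, 4), (0, 1)

# Build 2 * tensor(u, v) - tensor(u, w) without knowing the rule.
expr = plus(times(2, tensor(u, v)), times(-1, tensor(u, w)))

# By bilinearity the expression must equal m(u, 2v - w) for any rule m.
two_v_minus_w = (6, 7)
assert evaluate(expr, inner) == inner(u, two_v_minus_w)
assert evaluate(expr, det2) == det2(u, two_v_minus_w)
```

The same `expr` is evaluated under two different rules, which is the sense in which the manipulation is independent of the rule eventually revealed.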
Whereas there are many metrics “incompatible” with each other, the rules for multiplication are sufficiently rigid that there exists a “most general multiplication”, and all other multiplications are special cases of it, as explained presently. The tensor product represents this most general multiplication possible.
To consider a specific case first, take $U$ and $V$ to be $\mathbb{R}^2$. We already know two possible multiplications are $u^T v$ and $u v^T$. The claim is that the latter represents the most general multiplication possible. This means, among other things, that it must be possible to write $u^T v$ in terms of $u v^T$, and indeed it is: $u^T v = \mathrm{trace}(u v^T)$. Note that trace is a linear operator. The precise claim is the following. No matter what $m \colon \mathbb{R}^2 \times \mathbb{R}^2 \rightarrow W$ is, we can pretend it is $m(u, v) = u v^T$, work with this definition, then at the very end, if we are told what $m$ actually is, we can obtain the true answer by applying a particular linear transform. At the risk of belabouring the point, if we wish to simplify $m(u+v, w+x) - m(v, x) - m(u, w)$ then we can first pretend $m(u, v)$ is $u v^T$ to obtain $(u+v)(w+x)^T - v x^T - u w^T = u x^T + v w^T$. Later when we are told $m(u, v) = 2 u^T v$ we can deduce that the linear map required to convert from $u v^T$ to $2 u^T v$ is $2 u^T v = 2\,\mathrm{trace}(u v^T)$. Applying this linear map to $u x^T + v w^T$ yields $2(u^T x + v^T w)$ and this is indeed the correct answer because it equals $2(u+v)^T(w+x) - 2 v^T x - 2 u^T w$.
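The worked example can be replayed numerically. The sketch below pretends the rule is the outer product, simplifies, and only afterwards applies a linear map to recover the true answer. It assumes, for illustration, that the rule revealed at the end is $m(u, v) = 2 u^T v$, so that the linear map is twice the trace; the sample vectors and helper names are likewise arbitrary choices.

```python
def add(x, y):
    """Componentwise sum of two vectors."""
    return [a + b for a, b in zip(x, y)]

def inner(u, v):
    """u^T v as a scalar."""
    return sum(a * b for a, b in zip(u, v))

def outer(u, v):
    """u v^T as nested lists."""
    return [[a * b for b in v] for a in u]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def trace(A):
    """Sum of the diagonal entries of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))

u, v, w, x = [1, 2], [3, 4], [5, 6], [7, 8]

# Step 1: pretend m is the outer product and simplify
# m(u+v, w+x) - m(v, x) - m(u, w) down to u x^T + v w^T.
pretend = mat_sub(mat_sub(outer(add(u, v), add(w, x)), outer(v, x)), outer(u, w))
simplified = mat_add(outer(u, x), outer(v, w))
assert pretend == simplified

# Step 2: the rule is revealed (assumed here to be m(u, v) = 2 u^T v).
def m(u, v):
    return 2 * inner(u, v)

def convert(M):
    """Linear map taking u v^T to 2 u^T v, namely twice the trace."""
    return 2 * trace(M)

# Step 3: applying the linear map to the simplified expression gives
# the same value as evaluating the original expression with the true rule.
true_answer = m(add(u, v), add(w, x)) - m(v, x) - m(u, w)
assert convert(simplified) == true_answer
```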
Readers wishing to think further about the above example are encouraged to consider that the most general multiplication returning a real-valued number is of the form $(u, v) \mapsto v^T Q u$ for some matrix $Q$. How can $v^T Q u$ be obtained from $u v^T$? What about for multiplication rules that return a vector or a matrix?
The general situation works as follows. Given vector spaces $U$ and $V$, two things can be constructed, both of which use the same symbol $\otimes$ for brevity. We need a space $W$ for our most general multiplication, and we denote this by $W = U \otimes V$. We also need a rule for multiplication and we denote this by $u \otimes v$. By definition, $u \otimes v$ is an element of $U \otimes V$. (Just like not all matrices can be written as $u v^T$, there are elements in $U \otimes V$ that cannot be written as $u \otimes v$. They can, however, be written as linear combinations of such terms.)
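To make the parenthetical remark concrete, here is a small worked example in the $2 \times 2$ case. The choice of the identity matrix and the standard basis vectors $e_1, e_2$ is ours, for illustration. Every matrix of the form $u v^T$ has zero determinant, so the identity cannot be a single such product, yet it is a sum of two of them.

```latex
\det(u v^T)
  = \det\begin{pmatrix} u_1 v_1 & u_1 v_2 \\ u_2 v_1 & u_2 v_2 \end{pmatrix}
  = u_1 v_1 u_2 v_2 - u_1 v_2 u_2 v_1 = 0,
\qquad
\det\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = 1,
\qquad
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = e_1 e_1^T + e_2 e_2^T .
```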
At this point, I leave the reader to look up elsewhere the definition of a tensor product and relate it with the above.
Nice point of view.
Though it would be great if in the next post you explicitly showed it on tensors.
Good suggestion. I will try to find time to do this in the next few weeks.
Nice down-to-earth explanation of universal properties!