
Introduction to the Grassmann Algebra and Exterior Products

September 3, 2012

Sadly, Grassmann’s mathematical work was not appreciated during his lifetime. Among other things, he introduced what is now called the Grassmann algebra. It appears that Grassmann did this in part by looking for all possible ways a product structure could be introduced. Although there is strong geometric intuition behind the Grassmann algebra, it is not necessarily straightforward to grasp this intuition quickly from current introductory texts. For example, if the Grassmann algebra is about lengths, areas and volumes of parallelotopes, why can v_1 and v_2 be added together to form a new vector v_3 = v_1 + v_2 when in general the length of v_3 will not be the sum of the lengths of v_1 and v_2?

To my mind, the key point, which I have not seen written down elsewhere, is that in the context of Grassmann algebras, lower-dimensional parallelotopes should be considered merely as building blocks for higher-dimensional parallelotopes; some background is required before getting to this point though.

Stepping back, this note endeavours to re-invent the Grassmann algebra in an intuitive way, motivating the operations of addition and multiplication.  The point of departure is the desire to measure the relative volume of a d-dimensional oriented parallelotope in a vector space V of dimension d.  Let us initially denote an oriented parallelotope by the ordered set [v_1,\cdots,v_d] of vectors v_1,\cdots,v_d \in V that form the sides of the parallelotope.  (See the wiki for a picture of the three-dimensional case.)  Here, “oriented” just means that the sides of the parallelotope are ordered. In hindsight, it becomes clear that it is simpler to work with oriented parallelotopes than non-oriented ones; a (multi-)linear theory can be developed for the former.  (Perhaps better motivation would come from considering how to define integration on a manifold, but I am endeavouring here to introduce Grassmann algebras without mention of forms from differential geometry.)

Given a metric on V, the volume of the parallelotope [v_1,\cdots,v_d] can be computed by choosing an orthonormal basis for V and computing the determinant of the matrix A whose columns are the vectors v_1,\cdots,v_d expressed as linear combinations of the basis vectors; put simply, if we assume V is \mathbb{R}^d and we use the Euclidean inner product then A is the matrix whose ith column is v_i. Note that negative volumes are permissible, a consequence of working with oriented parallelotopes.  For brevity, parallelotopes will mean oriented parallelotopes and volumes will mean signed volumes.
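As a small sanity check of the determinant recipe, here is a minimal sketch in plain Python for V = \mathbb{R}^3 with the Euclidean inner product. The helper names det and signed_volume are my own, chosen for illustration; the example vectors are arbitrary.

```python
def det(rows):
    """Determinant by cofactor expansion along the first row (fine for small d)."""
    if len(rows) == 1:
        return rows[0][0]
    total = 0
    for j, entry in enumerate(rows[0]):
        minor = [row[:j] + row[j + 1:] for row in rows[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def signed_volume(vectors):
    """Signed volume of [v_1, ..., v_d] in R^d with the Euclidean inner product.

    The vectors are stored as rows; since det(A) == det(A^T), this equals the
    determinant of the matrix whose i-th column is v_i.
    """
    return det([list(v) for v in vectors])


v1, v2, v3 = (1, 0, 0), (1, 2, 0), (0, 1, 3)
vol = signed_volume([v1, v2, v3])
swapped = signed_volume([v2, v1, v3])  # same sides, different order
print(vol, swapped)  # 6 -6
```

Swapping two sides keeps the shape but flips the sign, which is exactly what "oriented" buys us.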

If we don’t have a metric — or, precisely, we want to state results that are true regardless of which metric is being used — we can still make sense of one parallelotope being twice as big as another one, at least in certain situations.  For example, the parallelotope [2v_1,\cdots,v_d] is twice as big as [v_1,\cdots,v_d] because, no matter how we choose the metric, the volume of the former really will be twice that of the latter.  A key question to ask is: if [v_1,\cdots,v_d] and [w_1,\cdots,w_d] are two parallelotopes, will the ratio of their volumes be independent of the metric chosen?

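Since we have limited ourselves to d-dimensional parallelotopes in d-dimensional space, it turns out that the answer is in the affirmative.  A proof can be based on the multiplicativity of the determinant: a change of metric induces a change of orthonormal basis given by some invertible matrix B, which replaces the coordinate matrix A of every parallelotope by B^{-1}A and hence scales every volume by the same factor \mathrm{det}(B^{-1}), so ratios of volumes are unchanged.  (Equivalently, when v_1,\cdots,v_d are linearly independent, the ratio is the determinant of the linear map T taking each v_i to w_i, and the equality \mathrm{det}(B^{-1}TB) = \mathrm{det}(T) shows that this determinant does not depend on the basis used to compute it.)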
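The metric-independence of the ratio can be checked numerically. The sketch below makes one modelling assumption that is mine rather than part of the argument above: a metric is represented by an invertible matrix m, with inner product <x, y> = (m x) . (m y), so that x -> m x gives coordinates in a basis that is orthonormal for that metric. The names apply and volume are hypothetical helpers.

```python
from fractions import Fraction


def det(rows):
    """Determinant by cofactor expansion along the first row (fine for small d)."""
    if len(rows) == 1:
        return rows[0][0]
    total = 0
    for j, entry in enumerate(rows[0]):
        minor = [row[:j] + row[j + 1:] for row in rows[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def apply(m, v):
    """Matrix-vector product m @ v for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def volume(vectors, m):
    """Signed volume of [v_1, ..., v_d] in the metric <x, y> = (m x) . (m y)."""
    return det([apply(m, v) for v in vectors])


v = [(1, 0), (0, 1)]
w = [(2, 1), (1, 3)]
euclid = [[1, 0], [0, 1]]
skewed = [[2, 1], [0, 3]]  # an arbitrary second metric

ratios = [Fraction(volume(w, m), volume(v, m)) for m in (euclid, skewed)]
print(ratios)  # both ratios equal 5
```

The individual volumes differ between the two metrics (1 and 5 versus 6 and 30), yet the ratio does not.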

If we decide that two (oriented) parallelotopes are equivalent whenever their (signed) volume is the same regardless of the metric chosen then it turns out that we can form a vector space structure on the set P_V of all d-dimensional parallelotopes up to equivalence in a given d-dimensional vector space V.  Note that we are working with a quotient space structure; although we use the notation [v_1,\cdots,v_d] to represent an element of P_V, different representations may correspond to the same element.  (Precisely, we have a projection \pi: V \times \cdots \times V \rightarrow P_V taking d vectors and returning the corresponding element of P_V, where \pi(v_1,\cdots,v_d) = \pi(w_1,\cdots,w_d) if and only if the signed volume of [v_1,\cdots,v_d] equals the signed volume of [w_1,\cdots,w_d] regardless of the metric chosen.)  We choose to define scalar multiplication in P_V by \alpha \cdot [v_1,v_2,\cdots,v_d] = [\alpha v_1,v_2,\cdots,v_d].  (Note that the \alpha could have multiplied any one of the v_i because elements of P_V are only distinguished up to differences in volume.)  That is to say, scalar multiplication corresponds to scaling the volume of the parallelotope.

Vector space addition in P_V is worthy of contemplation even if the ultimate definition is straightforward.  (From a pedagogical perspective, having a simple mathematical definition does not imply having an intuitive understanding; Grassmann algebras have a simple mathematical definition, but one that belies the ingenuity required by Grassmann to develop them and one that potentially lacks the intuition required to feel comfortable with them.)  Thinking first in terms of cubes then in terms of parallelotopes, it is clear geometrically that [v_1,v_2,\cdots,v_d] + [w_1,v_2,\cdots,v_d] = [v_1 + w_1, v_2, \cdots, v_d]. In other words, if all but one vector are the same, there is an obvious geometric meaning that can be given to vector space addition in P_V.  Perhaps other special cases can be found.  Nevertheless, the general rule we wish to follow (if at all possible) is that if [v_1,\cdots,v_d] + [w_1,\cdots,w_d] = [u_1,\cdots,u_d] then this should be taken to mean that the volume of the parallelotope [v_1,\cdots,v_d] plus the volume of the parallelotope [w_1,\cdots,w_d] is equal to the volume of the parallelotope [u_1,\cdots,u_d].  If this is possible, then one way to achieve it is to define [v_1,\cdots,v_d] + [w_1,\cdots,w_d] as follows.  Arbitrarily choose a basis e_1,\cdots,e_d for V.  Then we know that there exist constants \alpha and \beta such that the volume of [v_1,\cdots,v_d] is equal to \alpha times the volume of [e_1,\cdots,e_d], and the volume of [w_1,\cdots,w_d] equals \beta times the volume of [e_1,\cdots,e_d].  Then [v_1,\cdots,v_d] + [w_1,\cdots,w_d] is defined to be (\alpha + \beta) \cdot [e_1,\cdots,e_d].  One can check that this indeed works; it endows P_V with a well-defined vector space structure. 
(Precisely, one must first verify that our definitions are consistent — given x, y \in P_V, we claim that no matter which parallelotopes [v_1,\cdots,v_d] \in \pi^{-1}(x), [w_1,\cdots,w_d] \in \pi^{-1}(y) and [e_1,\cdots,e_d] we used, the same element \pi((\alpha + \beta) \cdot [e_1,\cdots,e_d]) will be obtained — and then verify that the axioms of a vector space are satisfied.)
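To see the two descriptions of addition agree in one concrete case, here is a minimal sketch in \mathbb{R}^3. The helper coefficient is a hypothetical name for the scalar \alpha with [v_1,\cdots,v_d] = \alpha \cdot [e_1,\cdots,e_d]; I compute it as a ratio of determinants, which by the earlier discussion does not depend on the metric. The basis and vectors are arbitrary choices.

```python
from fractions import Fraction


def det(rows):
    """Determinant by cofactor expansion along the first row (fine for small d)."""
    if len(rows) == 1:
        return rows[0][0]
    total = 0
    for j, entry in enumerate(rows[0]):
        minor = [row[:j] + row[j + 1:] for row in rows[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def coefficient(vectors, basis):
    """Scalar c such that [vectors] = c . [basis] in P_V."""
    return Fraction(det(vectors), det(basis))


basis = [[1, 1, 0], [0, 1, 0], [0, 0, 2]]  # deliberately not the standard basis
v1, w1 = [1, 0, 0], [2, 0, 1]
v2, v3 = [1, 2, 0], [0, 1, 3]
v1_plus_w1 = [a + b for a, b in zip(v1, w1)]

alpha = coefficient([v1, v2, v3], basis)
beta = coefficient([w1, v2, v3], basis)
gamma = coefficient([v1_plus_w1, v2, v3], basis)
print(alpha, beta, gamma)  # 3 13/2 19/2
```

Here [v_1,v_2,v_3] + [w_1,v_2,v_3], computed as (\alpha + \beta) \cdot [e_1,e_2,e_3], lands on the same element as the geometrically obvious [v_1 + w_1, v_2, v_3].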

After all this effort, one may be disappointed to learn that P_V is one-dimensional.  However, that is to be expected; we wanted P_V to represent the (signed) volume of an (oriented) parallelotope and hence P_V is essentially just the set of real numbers with the usual scalar multiplication and vector addition.  What we have done though is introduce the notation and mindset to pave the way for generalising this reasoning to parallelotopes of arbitrary dimension in V.

Importantly, the following approach will not work, in that it will not re-create the Grassmann algebra.  Consider all one-dimensional parallelotopes in V, where now \dim V > 1.  If [v_1] and [v_2] are two such parallelotopes then one might be tempted to declare that [v_3] = [v_1] + [v_2] if and only if the length of v_3 is equal to the sum of the lengths of v_1 and v_2 with respect to all metrics.  This would lead to an infinite-dimensional vector space though, since it would only be possible to add two vectors when one was a non-negative multiple of the other.

An algebra (in this context) is a vector space that also has defined on it a rule for multiplying two elements, such that the multiplicative structure is consistent with the vector space structure, e.g., the associative and distributive laws hold.  Does multiplication enter the picture in any way when we think of volume?  For a start, the area of a rectangle can be calculated by taking the product of the lengths of two adjoining sides.  We are thus tempted to introduce a symbol * that allows us to construct a higher-dimensional parallelotope from two lower-dimensional ones — namely, [v_1,\cdots,v_i] * [w_1,\cdots,w_j] = [v_1,\cdots,v_i,w_1,\cdots,w_j] — and have some faint hope that this simple concatenation-of-parallelotopes operator behaves in a way expected of a multiplication operator.

Now for the key decision, which I have not seen stated elsewhere yet believe to be the key to understanding Grassmann algebras in a simple way.  Because the paragraph before last pointed out that we cannot treat length in a metric-independent way if we wish to stay in finite dimensions, we must use our definition of metric-independent volume to induce a weaker notion of metric-independent length, area and volume on lower-dimensional parallelotopes of the ambient space V. Precisely, we declare that [v_1,\cdots,v_i] is equivalent to [w_1,\cdots,w_i] if and only if, for all vectors u_1,\cdots,u_{d-i}, we have that [v_1,\cdots,v_i,u_1,\cdots,u_{d-i}] has the same volume as [w_1,\cdots,w_i,u_1,\cdots,u_{d-i}], where as usual d is the dimension of V.  In particular, lower-dimensional parallelotopes are considered merely as building blocks for d-dimensional parallelotopes in d-dimensional spaces.  Two immediate questions are whether this works in theory and whether it is useful in practice.  It does work; it leads to the Grassmann algebra. And it has found numerous uses in practice, but that is a different story which will not be told here.
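The "for all vectors u" in this definition looks impossible to test, but because the volume is multilinear and alternating in the u's it is enough to try completions by distinct basis vectors. The sketch below does this for two-dimensional parallelotopes in \mathbb{R}^3; completions and equivalent are names I made up, and comparing determinants in one fixed basis stands in for "same volume in every metric".

```python
from itertools import combinations


def det(rows):
    """Determinant by cofactor expansion along the first row (fine for small d)."""
    if len(rows) == 1:
        return rows[0][0]
    total = 0
    for j, entry in enumerate(rows[0]):
        minor = [row[:j] + row[j + 1:] for row in rows[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def completions(vectors, d):
    """Volumes of [v_1..v_i, u_1..u_{d-i}] for every increasing choice of
    d - i distinct standard basis vectors as the u's."""
    basis = [[int(r == c) for c in range(d)] for r in range(d)]
    sides = [list(v) for v in vectors]
    return [det(sides + [basis[k] for k in picked])
            for picked in combinations(range(d), d - len(sides))]


def equivalent(p, q, d):
    """True when p and q are interchangeable as building blocks in R^d."""
    return completions(p, d) == completions(q, d)


square = [(1, 0, 0), (0, 1, 0)]
sheared = [(1, 1, 0), (0, 1, 0)]  # [v_1 + v_2, v_2]: same plane, same area
tilted = [(1, 0, 0), (0, 0, 1)]   # unit square in a different plane

print(equivalent(square, sheared, 3))  # True
print(equivalent(square, tilted, 3))   # False
```

The tilted square has the same Euclidean area as the original, yet it is not equivalent: as a building block it produces different volumes.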

It is now a straightforward journey to the finish line.  Let P_V^d denote what was earlier denoted P_V, and in general, let P_V^i denote the set of all i-dimensional (oriented) parallelotopes up to the aforementioned equivalence relation.  Each of these sets can be made into a vector space with vector space operations relating directly to volumes. Precisely, if [v_1,\cdots,v_i] \in P_V^i then the scalar multiple \alpha \cdot [v_1,\cdots,v_i] is the parallelotope [w_1,\cdots,w_i] (unique up to equivalence) such that, for all vectors u_1,\cdots,u_{d-i}, the volume of [w_1,\cdots,w_i,u_1,\cdots,u_{d-i}] is precisely \alpha times the volume of [v_1,\cdots,v_i,u_1,\cdots,u_{d-i}] regardless of which metric is used to measure volume.  (This implies that the volume of [w_1,\cdots,w_i] is precisely \alpha times the volume of [v_1,\cdots,v_i] but the converse is not necessarily true.)  Vector addition can be defined in a similar way.

It can be shown that P_V^1 is linearly isomorphic to V.  Indeed, if v_3 = v_1 + v_2 then [v_3] = [v_1] + [v_2] because, for any vectors u_1,\cdots,u_{d-1}, the volume of the parallelotope [v_3,u_1,\cdots,u_{d-1}] will equal the sum of the volumes of [v_1,u_1,\cdots,u_{d-1}] and [v_2,u_1,\cdots,u_{d-1}]. Conversely, if [v_3] = [v_1] + [v_2] then one can deduce by strategic choices of  u_1,\cdots,u_{d-1} that the only possibility is v_3 = v_1 + v_2.  (Think in terms of determinants of matrices.)
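To make the "strategic choices" concrete, here is the smallest case worked out, assuming V = \mathbb{R}^2 with standard basis e_1, e_2 and writing v = (a, b); the coordinate names a and b are mine.

```latex
\det\begin{pmatrix} a & 0 \\ b & 1 \end{pmatrix} = a
\quad (u_1 = e_2),
\qquad
\det\begin{pmatrix} a & 1 \\ b & 0 \end{pmatrix} = -b
\quad (u_1 = e_1).
```

So the volumes of [v, e_2] and [v, e_1] recover the coordinates of v up to sign. If [v_3] = [v_1] + [v_2], then applying these two choices of u_1 to each side forces a_3 = a_1 + a_2 and b_3 = b_1 + b_2, that is, v_3 = v_1 + v_2.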

As hinted at before, we expect multiplication to come into play and we expect it to behave nicely with respect to addition because we know, for example, that a rectangle of side lengths a,c and a rectangle of side lengths b,c have total area ac+bc = (a+b)c.  In other words, in P_V^2 at least, we expect that [v_1] * [v_3] + [v_2] * [v_3] = ([v_1]+[v_2]) * [v_3].  This is indeed the case — for any u_1,\cdots,u_{d-2} it is clear that [v_1,v_3,u_1,\cdots,u_{d-2}] + [v_2,v_3,u_1,\cdots,u_{d-2}] = [v_1+v_2,v_3,u_1,\cdots,u_{d-2}] — and here the point is to explain why * should behave like multiplication rather than prove rigorously that it does.

When it comes to rigorous proofs, it is time to switch from geometric intuition to mathematical precision.  Here, the key step is in recognising that the volume of a d-dimensional parallelotope [v_1,\cdots,v_d] in a d-dimensional vector space is a multi-linear function of the constituent vectors v_1,\cdots,v_d.  In fact, it is not just any multi-linear map but an alternating one, meaning that if two adjacent vectors are swapped then the volume changes sign.  This is the starting point for the modern definition of exterior algebra, also known as the Grassmann algebra.

I intentionally used non-conventional notation because it was important to introduce concepts one by one.  First, because the operator * introduced above is anti-commutative (it is almost as familiar as ordinary multiplication except that the sign can change, e.g., [v_1] * [v_2] = - [v_2] * [v_1]) it is common to denote it by the wedge product \wedge instead.  Furthermore, since P_V^1 is isomorphic to V it is customary to omit the square brackets, writing v_1 for [v_1], writing v_1 \wedge v_2 for [v_1,v_2], and so forth.
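The sign rule is easy to check in coordinates. In the sketch below I represent v_1 \wedge v_2 by its 2x2 minors, which is an assumption of this example (a standard choice of coordinates on P_V^2, not something derived in this note); the function name wedge is likewise my own.

```python
from itertools import combinations


def wedge(a, b):
    """Coordinates of a ^ b on the basis e_i ^ e_j (i < j): the 2x2 minors."""
    return {(i, j): a[i] * b[j] - a[j] * b[i]
            for i, j in combinations(range(len(a)), 2)}


a, b = (1, 2, 0), (0, 1, 3)
ab = wedge(a, b)
ba = wedge(b, a)
aa = wedge(a, a)

print(ab)  # {(0, 1): 1, (0, 2): 3, (1, 2): 6}
print(ba)  # every coordinate negated
print(aa)  # all zeros: a parallelotope with a repeated side is degenerate
```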

There are some loose ends which I do not tidy up since the aim of this note is to prepare the reader for a standard account of the exterior algebra; perhaps though the last point to clarify is that the Grassmann algebra is the direct sum of the base field plus P_V^1 plus P_V^2 up to P_V^d.  Thus, if two parallelotopes cannot be added geometrically to form a new parallelotope, either because they are of differing dimensions, or roughly speaking because changing metrics would cause them to change in incongruous ways as building blocks, then they are just left written as a sum.
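For a concrete case of a sum that must be "left written as a sum", consider two-dimensional parallelotopes in \mathbb{R}^4. The sketch relies on a standard fact that is not proved in this note: with coordinates p_{ij} given by 2x2 minors, an element of P_V^2 for d = 4 is a single parallelotope exactly when p_{01}p_{23} - p_{02}p_{13} + p_{03}p_{12} = 0. The helper names are hypothetical, and the test is specific to d = 4.

```python
from itertools import combinations


def wedge(a, b):
    """Coordinates of a ^ b on the basis e_i ^ e_j (i < j): the 2x2 minors."""
    return {(i, j): a[i] * b[j] - a[j] * b[i]
            for i, j in combinations(range(len(a)), 2)}


def add(x, y):
    """Coordinate-wise sum of two elements written in the same basis."""
    return {k: x[k] + y[k] for k in x}


def is_single_parallelotope(p):
    """Decomposability test, valid only for 2-dimensional elements in R^4."""
    return (p[0, 1] * p[2, 3] - p[0, 2] * p[1, 3] + p[0, 3] * p[1, 2]) == 0


e1, e2, e3, e4 = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

shared_side = add(wedge(e1, e2), wedge(e1, e3))     # equals e1 ^ (e2 + e3)
no_shared_side = add(wedge(e1, e2), wedge(e3, e4))  # stays a formal sum

print(is_single_parallelotope(shared_side))     # True
print(is_single_parallelotope(no_shared_side))  # False
```

When the two summands share a side the sum collapses to one parallelotope, just as in the d-dimensional case earlier; when they span complementary planes no single parallelotope will do.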

In summary:

  • The exterior algebra of a vector space V is a vector space whose elements represent equivalence classes of linear combinations of oriented parallelotopes in V.
  • If d is the dimension of V then two d-dimensional parallelotopes are equivalent if and only if they have the same d-dimensional volume as each other with respect to any and all metrics.
  • Multiplying a parallelotope by a scalar just multiplies its volume by the same amount (without changing the subspace in which it lies).
  • A higher-dimensional parallelotope is constructed from lower-dimensional ones via the wedge product \wedge which, except for possible sign changes, behaves precisely like a multiplication operator (because, roughly speaking, volume is determined by multiplying one-dimensional lengths together).
  • Two i-dimensional parallelotopes x and y are equivalent if and only if, when treated as building blocks for constructing parallelotopes x \wedge t and y \wedge t of the same dimension as V, the volumes of the resulting parallelotopes x \wedge t and y \wedge t are always the same, regardless of which metric is used and how t is chosen.
  • The sum of two i-dimensional parallelotopes x and y equals the i-dimensional parallelotope z if and only if, for all (d-i)-dimensional parallelotopes t, the volume of z \wedge t equals the sum of the volumes of x \wedge t and y \wedge t regardless of which metric is used.  (Such a z need not exist, in which case the resulting vector space sum is denoted simply by x+y.)

As always, this note may be unnecessarily long because it was written in a linear fashion from start to finish. Hopefully though, the general direction taken has some appeal.