Wednesday 1 January 2020

Baptizing perpendicularity and understanding dot product

In my studies, I like to spot mental models that appear repeatedly, in either the same or different areas. This is useful, because often when analyzing a problem, you hit on the idea that it can be solved through a model that you are acquainted with.

One of these models is the "generalization of a concept", which is often used in Mathematics and about which there is extensive literature.

One way, among others, to look at this process is to see it as the realization that one was assuming an arbitrary restriction in the conditions of the phenomenon. Since reality shows that the world does not always carry such a restriction, the restriction is lifted. You thus elaborate a more general concept, which is valid for cases built both with and without the relevant restriction.

But the point of this post is that, once you have given birth to the general concept, you should be able to baptize it. By this I mean not so much giving it a more or less appropriate name as giving it a faithful abstract description, one that communicates its essence. This is what will enlighten you as to the meaning of the concept in question, both at the elementary and the general level. Thus it may happen that you do not understand well what you were doing with an apparently basic idea until you super-generalize it; or that you don't grasp the general concept unless you see its evolution from the basic level.

I will illustrate this idea with an example: the generalized meaning of perpendicularity and how it helps us understand why, and to what extent, the dot product works when done analytically, i.e. component-wise. I will start with a more mathematical/abstract approach, but soon give way to the paradigm that this Blog promotes: see everything from a practical point of view, as a problem-solving technique.

Perpendicularity or orthogonality

This notion appears with a geometric meaning: two vectors (understood as "little arrows") are perpendicular if they form an angle of 90 degrees.

However, geometric definitions only work in 2D or 3D spaces. What if you lift this restriction and start playing with 4D or 10D or even infinite-dimensional vectors? (That is, by the way, another nice generalization: functions are infinite-dimensional vectors, where the input plays the role of dimensions and the output acts as the coefficients, coordinates or values of the vector in each dimension.)

Well, in that case, you use the algebraic version of the dot product.

The dot product also initially receives a geometric definition: it is an operation whereby the moduli of the vectors are multiplied, but then you apply a percentage, a trigonometric ratio (the cosine of the angle formed by the vectors), which in turn measures to what extent the vectors point in the same direction. Thus when one vector lies over the other (they are parallel, the angle is zero), the cosine is 1, so the ratio is 100%, because both vectors point in the same direction. Instead, when the vectors are perpendicular (the angle is 90 degrees), the cosine is 0, which means that the ratio is 0% as well, because the vectors point in totally different directions.
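As a quick sketch of that geometric definition (the function name and the sample moduli below are mine, purely for illustration):

```python
import math

def dot_geometric(mod_a, mod_b, angle_deg):
    # product of the moduli, scaled by the "coincidence ratio" cos(angle)
    return mod_a * mod_b * math.cos(math.radians(angle_deg))

print(dot_geometric(2, 3, 0))   # parallel: 100% ratio, so the full 6.0
print(dot_geometric(2, 3, 90))  # perpendicular: 0% ratio, so (numerically almost) 0
```

(Note that in floating point cos(90°) comes out as a tiny number rather than an exact 0, which is why the second result is only approximately zero.)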

However, the cosine is of no use anymore after you leave 3D behind. Fortunately, the dot product technique also evolves to adapt to the new higher-than-3D environment and takes an algebraic form guaranteeing the same effect: the dot product is also the result of (i) multiplying the respective coordinates and (ii) adding up those products.

This works for the simplest cases. For example, the dot product of the unit vectors of a 2D orthonormal basis [(0,1) and (1,0)] is 0×1 + 1×0 = 0 + 0 = 0, thus proving that such vectors are perpendicular. But it also works in the advanced cases, like in Fourier analysis, where the sum is an integral (because the number of dimensions is infinite), but the structure is analogous.
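In code, the algebraic recipe is a one-liner; here is a minimal sketch (the helper name `dot` is my own):

```python
def dot(a, b):
    # (i) multiply the respective coordinates, (ii) add up those products
    return sum(x * y for x, y in zip(a, b))

print(dot((0, 1), (1, 0)))              # 0: the 2D unit basis vectors are perpendicular
print(dot((1, 2, 3, 4), (4, 3, 2, 1)))  # 20: the same recipe works unchanged in 4D
```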

Time now for the baptism. We have kept the term "orthogonality", but this is just a vestige of the geometric context where the concept was born, in which dimensions were directions in the plane. The abstract meaning is "total dimensional discrepancy": if we are comparing vectors a and b, a has no component at all (zero amount) in the dimensions where b has components, and vice versa. As for the dot product, in the new context it is often given another name, "inner product". But more important than the name is its new abstract meaning: if it initially revealed the extent to which vectors share a direction in the plane, in a generalized sense it means "dimensional similarity". Thus, for example, in Fourier analysis the dimensions are time points in one reference frame or frequencies in another.
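Sticking with the Fourier example, the inner product of two functions can be sketched numerically: the sum becomes an (approximate) integral of pointwise products. The helper name `inner` and the midpoint-rule approximation are my own illustrative choices:

```python
import math

def inner(f, g, a=0.0, b=2 * math.pi, n=10_000):
    # "infinite-dimensional dot product": integrate f(x) * g(x) over [a, b]
    # (midpoint rule: sample each little subinterval at its center)
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) * g(a + (k + 0.5) * h) for k in range(n))

print(inner(math.sin, math.cos))  # ~0: sin and cos are "orthogonal" functions
print(inner(math.sin, math.sin))  # ~pi: a function fully "coincides" with itself
```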

Problem: orthogonality is the aim but it is also the condition

Now to the problem. This algebraic dot product is a tool for detecting orthogonality, but at the same time it only works when there is already orthogonality, in the sense that the basis must be orthogonal. How to prove this?

On StackExchange, I found an answer that presents the solution very "mathematically". Let us follow it and later check how you could have also walked this path "intuitively", just by understanding the deep meaning of perpendicularity:

We first multiply the components of the two vectors in a "total war" manner (each component of one vector against each component of the other):

a · b = (ax i + ay j) · (bx i + by j) = ax bx (i · i) + ax by (i · j) + ay bx (j · i) + ay by (j · j)

Then we apply the definition of the dot product, in two senses:

  • the product of the two unit basis vectors i and j, if we require that the basis be orthogonal, will again be 1×1, but multiplied by a 0% ratio of dimensional coincidence, so it is 0; because of this, the two middle terms vanish;
  • the product of the unit basis vector i with itself is the product of the moduli (1×1) with a ratio of 100% dimensional coincidence, so it is 1; the same applies to the product of j with itself; because of this, the first and the last products become simple products of the homogeneous components.
Thus the expression reduces to the following:

a · b = ax bx + ay by

We thus conclude that this algebraic dot product technique is valid to the extent that we are relying on an orthogonal basis, because otherwise the two middle terms would not have vanished and the answer would depend on which angle, other than 90 degrees, separates the two basis vectors i and j.
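This conclusion can be checked numerically. Below is a small sketch (the function name and sample vectors are my own): it evaluates the full expansion, keeping the middle terms with i · j = cos(angle), so you can see that the component-wise shortcut ax bx + ay by only agrees when the basis angle is 90 degrees:

```python
import math

def dot_in_basis(a, b, basis_angle_deg):
    # full expansion of (ax i + ay j) · (bx i + by j), where i and j are
    # unit vectors separated by basis_angle_deg, so i · j = cos(angle)
    ax, ay = a
    bx, by = b
    c = math.cos(math.radians(basis_angle_deg))
    return ax * bx + ay * by + (ax * by + ay * bx) * c

a, b = (1, 2), (3, 4)
print(dot_in_basis(a, b, 90))  # ~11 = 1*3 + 2*4: the middle terms vanish
print(dot_in_basis(a, b, 60))  # ~16: a skewed basis breaks the component-wise shortcut
```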

Now, the intuitive and practical approach. The coordinates of a vector are like the information provided by Cinderella's slipper: quantities that you measure to serve as "clues" for catching criminals (solving problems). You can also call them "whistle-blowers". Obviously, a spy who reports exactly the same as another one is superfluous: you shouldn't pay him! Hence the minimum requirement for hiring a set of whistle-blowers is that each of them contributes something new, even if they partly repeat each other. In mathematical jargon, it is said that those informers are "linearly independent". I would say that they are "helpful". But one that provides totally fresh and new information may be preferable, because it is "original" (technically, "perpendicular"); this way you can optimize your network: each specialist will investigate a different fact.

Let us check if that is the case. The dot product is like combining two sets of reports about two suspects (two vectors), so as to check to what extent both are "pointing to the same solution of the crime". For this purpose, the informers lay their reports on the table. All possible combinations among reports are like the "total war" product mentioned before. But soon you realize that you can mix apples with apples and pears with pears, but not apples with pears.

Combining apples with apples is what you do when you multiply homogeneous (100% dimensionally coincident) quantities, like in the above-mentioned first and last terms: ax * bx and ay * by. For example, you combine the reports for direction X (ax * bx) and you may get a higher or lower positive product (because you combine + with + or - with -) or a higher or lower negative product (because you multiply + with -); that will mean that direction X contributes, respectively, with a vote for coincidence (if the product is +) or for discrepancy (if it is -). Finally, you do the vote counting: you add up the mutually scaled reports for X and Y and thus get the overall degree of coincidence.

Should you also add up the middle terms, i.e. the products between heterogeneous quantities, between apples and pears (ax * by or ay * bx)? No, because by definition you know that these clues do not overlap at all; they refer to completely different facts. Hence, if the purpose is to learn to what extent they point in the same direction (generically, to what extent they are dimensionally coincident), the answer is zero, so they make your life simpler by not casting any vote.

That is how orthogonality at the level of information sources (definition of basis vectors) helps you detect orthogonality (or any other degree of dimensional coincidence) at the level of problem solving.







