I don't think there's any mathematical reason to lay out the elements in memory that way. Sure, given no context I would probably use i = row + n*col as the index, but it doesn't really matter much to me.
If I had to pick between a matrix being a row of vectors or a column of covectors, I'd pick the latter. And M[i][j] should be the element in row i column j, which is nonnegotiable.
This was answered in the article.
>> It is often fortunate that the OpenGL matrix array is laid out the way it is because it results in those three elements being consecutive in memory.
Matlab deliberately notes that its matrices are laid out like this since most matrix operations occur on columns, so a whole column can be loaded on a cache line.
The only thing that ever made this click for me was that the columns can be interpreted as the values for the new, post-transform basis vectors.
Anecdote from someone who does a lot of graphics programming (building a molecule viewer/editor):
I've only needed matrices exactly once, when building the engine. It does a few standard transforms (model view matrices etc).
The rest is all len-3 Vecs, and unit quaternions. You can use matrices for these, but I'm on team Vec+Quaternion!
> Mathematicians like to see their matrices laid out on paper this way (with the array indices increasing down the columns instead of across the rows as a programmer would usually write them).
Could a mathematician please confirm or disconfirm this?
I think that different branches of mathematics have different rules about this, which is why careful writers make it explicit.
Not a mathematician, just an engineer who used matrices a lot (and even worked for MathWorks at one point), but I would say that most mathematicians don't care. Matrices are 2D; they don't have a good way to be laid out in 1D (which is what is done here, by giving them linear indices). They shouldn't be represented in 1D.
The only types of mathematicians who actually care are:
- the ones who use software where picking the "incorrect" algorithm for the layout may impact performance significantly. Or worse, the ones who use software that doesn't make the same arbitrary choice (column major vs row major). And when I say that they care, it's probably a pain for them to have to think about it.
- the ones who write this kind of software (they may describe themselves as software engineers, but some may still call themselves mathematicians, applied mathematicians, or other things like that).
Now maybe what the author wanted to say is that some languages "favored by mathematicians" (Fortran, MATLAB, Julia, R) are column major, while languages "favored by computer scientists" (C, C++) are row major.
Languages that don't have multidimensional arrays tend to have "arrays of arrays" instead, and that naturally leads to a layout where the last subscript varies fastest. Languages that do have multidimensional arrays can of course lay them out in any order.
Ah, so you can get row vectors with a type cast in C but not column ones, while in Fortran and friends it's the converse (if they had casts). Yep, that is more mathy. Linear map evaluation is a linear combination of columns.
EDIT: type cast or just single square bracket application in C
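To make the layout difference concrete, here's a minimal C sketch (the array contents are arbitrary) showing that with C's arrays-of-arrays the last subscript is the one that's contiguous, so a row can be handed around as a plain pointer while a column has to be walked with a stride:

    #include <stdio.h>

    int main(void) {
        /* C arrays-of-arrays are row-major: the last subscript varies fastest,
           so a[i] is a contiguous row of 4 floats. */
        float a[3][4] = {
            {1, 2, 3, 4},
            {5, 6, 7, 8},
            {9, 10, 11, 12},
        };

        float *row1 = a[1];          /* points at {5, 6, 7, 8}, no copying needed */
        float *flat = &a[0][0];      /* same storage viewed as 12 consecutive floats */

        /* element (i, j) lives at offset i*4 + j in the flat view ... */
        printf("%f == %f\n", (double)a[2][3], (double)flat[2 * 4 + 3]);

        /* ... so a "column" can only be visited with a stride of 4 */
        for (int i = 0; i < 3; i++)
            printf("column 1 element: %f\n", (double)flat[i * 4 + 1]);

        printf("row 1 starts with %f\n", (double)row1[0]);
        return 0;
    }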
What I suspect he really means is that FORTRAN lays out its arrays column-major, whilst C chose row-major. Historically most math software was written in the former, including the de facto standard BLAS and LAPACK APIs used for most linear algebra. Mixing and matching memory layouts is a recipe for confusion and bugs, so "mathematicians" (which I'll read as people writing a lot of non-ML matrix-related code) tend to prefer to stick with column major.
Of course things have moved on since then and a lot of software these days is written in languages that inherited their array ordering from C, leading to much fun and confusion.
The other gotcha with a lot of these APIs is of course 0 vs 1-based array numbering.
The MKL blas/lapack implementation also provides the “cblas” interface (I’m sure most blas implementations do, I’m just familiar with MKL—BLIS seems quite willing to provide additional interfaces so I bet they provide it as well) which explicitly accepts arguments for row or column ordering.
Internally the matrix is tiled out anyway (for gemm at least) so column vs row ordering is probably a little less important nowadays (which isn’t to say it never matters).
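For what it's worth, a minimal sketch of that layout argument, assuming a cblas.h from whichever BLAS you link against (MKL, OpenBLAS, BLIS, ...); the 2x2 values are just for illustration:

    #include <cblas.h>   /* provided by MKL, OpenBLAS, BLIS, ... */
    #include <stdio.h>

    int main(void) {
        /* C = A * B for 2x2 matrices stored row-major in plain C arrays. */
        double A[4] = {1, 2,
                       3, 4};
        double B[4] = {5, 6,
                       7, 8};
        double C[4] = {0};

        /* The first argument tells the library how the buffers are laid out;
           passing CblasColMajor here instead would reinterpret the same bytes
           as the transposed matrices. */
        cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    2, 2, 2,          /* M, N, K */
                    1.0, A, 2,        /* alpha, A, lda */
                    B, 2,             /* B, ldb */
                    0.0, C, 2);       /* beta, C, ldc */

        printf("%g %g\n%g %g\n", C[0], C[1], C[2], C[3]);  /* 19 22 / 43 50 */
        return 0;
    }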
Oh yes, from an actual implementation POV you can just apply some transpose and ordering transforms to convert from row major to column major or vice-versa. cblas is pretty universal though I don't think any LAPACK C API ever gained as wide support for non column-major usage (and actually has some routines where you can't just pull transpose tricks for the transformation).
Certain layouts have performance advantages for certain operations on certain microarchitectures due to data access patterns (especially for level 2 BLAS), but that's largely irrelevant to historical discussion of the API's evolution.
This is one of those things that's perennially annoying in computer graphics. Depending on the API you can have different conventions for:
- data layout (row-major vs. column major)
- pre-multiplication vs. post-multiplication of matrices
Switching either of these conventions results in a transposition of the data in memory, but knowing which one you're switching is important to getting the implementation of the math right.
And then on top of that for even more fun you have:
- Left- vs. right-handed coordinate systems
- Y-up or Z-up
- Winding order for inside/outside
There are lots of small parity things like this where you can get it right by accident but then have weird issues down the line if you aren't scrupulous. (I once had a very tedious time tracking down some inconsistencies in inside/outside determination in a production rendering system.)
Not a mathematician, but programmers definitely don't agree on whether matrices should be row-major or column-major.
I'm surprised we even agree that they should be top-down.
At this point might as well make them match the x/y convention, with first index increasing to the right, and second index increasing from bottom to top.
Programmers don’t agree on the x/y convention.
Mathematician here. I never heard that.
(In many branches the idea is that you care about the abstract linear transformation and its properties instead of the dirty coefficients that depend on the specific basis. I don't expect a mathematician to have a strong opinion on the order. All are equivalent via isomorphism.)
I'm an applied mathematician and this is the most common layout for dense matrices due to BLAS and LAPACK. Note, many of these routines have a flag to denote when working with a transpose, which can be used to cheat a different memory layout in a pinch. There are also parameters for increments in memory, which can help when computing across a row as opposed to down a column, which can also be co-opted. Unless there's a reason not to, I personally default to column major ordering for all matrices and tensors and use explicit indexing functions, which tends to avoid headaches since my codes are consistent with most others.
Abstractly, there's no such thing as memory layout, so it doesn't matter for things like proofs, normally.
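As a rough illustration of the "explicit indexing function" habit described above, here's a small C sketch assuming column-major storage with a leading dimension, BLAS/LAPACK style; the idx helper is just a name made up for this example:

    #include <stdio.h>

    /* Column-major index of element (i, j) for a matrix stored with leading
       dimension ld (ld >= number of rows).  The helper name is made up. */
    static int idx(int i, int j, int ld) { return i + j * ld; }

    int main(void) {
        enum { M = 3, N = 2 };
        double a[M * N];

        /* Fill a 3x2 matrix so that a(i,j) = 10*i + j.  Walking down a column
           touches consecutive memory in this layout. */
        for (int j = 0; j < N; j++)
            for (int i = 0; i < M; i++)
                a[idx(i, j, M)] = 10.0 * i + j;

        /* Reading across row 1 instead means striding by the leading dimension,
           which is what the BLAS increment/leading-dimension parameters describe. */
        for (int j = 0; j < N; j++)
            printf("a(1,%d) = %g\n", j, a[idx(1, j, M)]);
        return 0;
    }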
I'm a mathematician. It's kind of a strange statement since, if we are talking about a matrix, it has two indices not one. Even if we do flatten the matrix to a vector, rows then columns are an almost universal ordering of those two indices and the natural lexicographic ordering would stride down the rows.
Recently graduated math student here. The definition of the "vec" operator which turns a matrix into a vector works like this, stacking up columns rather than rows.
https://en.wikipedia.org/wiki/Vectorization_(mathematics)
Most fields of math that use matrices don't number each element of the matrix separately, and if they do there will usually be two subscripts (one for the row number and one for the column number).
Generally, matrices would be thought in terms of the vectors that make up each row or column.
People must get taught math terribly if they think "I don't need to worry about piles of abstract math to understand a rotation, all I have to do is think about what happens to the XYZ axes under the matrix rotation". That is what you should learn in the math class!
Anyone who has taken linear algebra should know that (1) a rotation is a linear operation, (2) the result of a linear operation is calculated with matrix multiplication, (3) the result of a matrix multiplication is determined by what it does to the standard basis vectors, the results of which form the columns of the matrix.
This guy makes it sound like he had to come up with these concepts from scratch, and it's some sort of pure visual genius rather than math. But... it's just math.
I took a linear algebra class, as well as many others. It didn't work.
Most math classes I've taken granted me some kind of intuition for the subject material. Like I could understand the concept independent from the name of the thing.
In linear algebra, it was all a series of arbitrary facts without reason for existing. I memorized them for the final exam, and probably forgot them all the next day, as they weren't attached to anything in my mind.
"The inverse of the eigen-something is the determinant of the abelian".
It was just a list of facts like this to memorize by rote.
I passed the class with a decent grade I think. But I really understood nothing. At this point, I can't remember how to multiply matrices. Specifically do the rows go with the columns or do the columns go with the rows?
I don't know if there's something about linear algebra or I just didn't connect with the instructor. But I've taken a lot of other math classes, and usually been able to understand the subject material readily. Maybe linear algebra is different. It was completely impenetrable for me.
You might want to try Linear Algebra Done Right by Sheldon Axler. It's a short book, succinct but extremely clear and approachable. It explains Linear Algebra without using determinants, which are relegated to the end, and emphasises understanding the powerful ideas underpinning the subject rather than learning seemingly arbitrary manipulations of lists and tables of numbers.
Those manipulations are of course extremely useful and worth learning, but the reasons why, and where they come from, will be a lot clearer after reading Axler.
As someone pointed out elsewhere in this thread, the book is available free at https://linear.axler.net/
The page count suggests that we have different ideas of what's meant by "short". In any case, it looks great from the forewords. If I ever want to make a serious try to really get it, this is probably what I'll use.
It is widely considered to deliver on the promise of done right!
To remind oneself how to multiply matrices together, it suffices to remember how to apply a matrix to a column vector, and that ((A B) v) = (A (B v)).
For each 1-hot vector e_i (i.e. the column vector that has a 1 in the i-th position and 0s elsewhere), apply B e_i to get the i-th column of the matrix B. Then, apply the matrix A to the result, to obtain A (B e_i), which equals (A B) e_i. This is then the i-th column of the matrix A B.
And, when applying the matrix A to some column vector v, for each entry/row of the resulting vector, it is obtained by combining the corresponding row of A, with the column vector v.
So, to get the entry at the j-th row of the i-th column of (A B), one therefore combines the i-th column of B with the j-th row of A.
Or, alternatively/equivalently, you can just compute the matrix (A B) column by column, by, for each e_i , computing that the i-th column of (A B) is (A (B e_i)) (which is how I usually think of it).
To be clear, I don't have the process totally memorized; I actually use the above reasoning to remind myself of the computation process a fair portion of the time that I need to compute actual products of matrices, which is surprisingly often given that I don't have it totally memorized.
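A small C sketch of that column-by-column view, under the assumption of column-major storage; the helper names (matvec, matmul_by_columns) are made up for illustration:

    #include <stdio.h>

    /* y = A x for an m-by-n column-major matrix A (leading dimension m). */
    static void matvec(int m, int n, const double *A, const double *x, double *y) {
        for (int i = 0; i < m; i++) {
            y[i] = 0.0;
            for (int j = 0; j < n; j++)
                y[i] += A[i + j * m] * x[j];
        }
    }

    /* C = A B, computed one column at a time: column i of C is A * (column i of B).
       A is m-by-k, B is k-by-n, all column-major. */
    static void matmul_by_columns(int m, int k, int n,
                                  const double *A, const double *B, double *C) {
        for (int i = 0; i < n; i++)
            matvec(m, k, A, &B[i * k], &C[i * m]);
    }

    int main(void) {
        /* 2x2 example, stored column-major: A = [[1,2],[3,4]], B = [[5,6],[7,8]]. */
        double A[4] = {1, 3, 2, 4};
        double B[4] = {5, 7, 6, 8};
        double C[4];
        matmul_by_columns(2, 2, 2, A, B, C);
        printf("%g %g\n%g %g\n", C[0], C[2], C[1], C[3]);  /* 19 22 / 43 50 */
        return 0;
    }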
When I took linear algebra, the professor emphasized the linear maps, and somewhat de-emphasized the matrices that are used to notate them. I think this made understanding what is going on easier, but made the computations less familiar. I very much enjoyed the class.
Here's a recipe for matrix multiplication that you can't forget: choose bases b_i/c_j for your domain/codomain. Then all a matrix is is a list of the outputs of a function on your basis: if you have a linear function f, then the ith column of its matrix A is just f(b_i). If you have another function g from f's codomain, then same thing, its matrix B is just the list of outputs g(c_j). Then the ith column of BA is just g(f(b_i)). If you write these things down on paper and expand out what I wrote, you'll see the usual row and column thing pop out. The point is that f(b_i) is a weighted sum of the c_j (since the c_j are a basis for the target of f), but you can pull the weighted sums through the definition of g because it's linear. A basis gives you a minimal description/set of points where you need to define a function, and the definition for all other points follows from linearity.
The point of the eigen-stuff is that along some directions, linear functions are just scalar multiplication: f(v) = av. If the action in a direction is multiplication by a, then it can't also be multiplication by b. So unequal eigenvalues must mean different directions/linearly independent subspaces. So e.g. if you can find n different eigenvalues/eigenvectors, you've found a simple basis where each direction is just multiplication. You also know that it's invertible if the eigenvalues are nonzero since all you did was multiply by a_i along each direction, so you can invert it by multiplying by 1/a_i on each direction.
Taught properly it's all very straightforward, though determinants require some more buildup with a detour through things like quotienting and wedge products if you really want it to be straightforward IMO. You start by saying you want to look at oriented areas/volumes, and look at the properties you need. Then quotienting gives you a standard tool to say "I want exactly the thing that has those properties" (wedge products). Then the action on wedges gives you what your map does to volumes, with the determinant as the action on the full space. You basically define it to be what you want, and then you can calculate it by linearity/functoriality just like you expand out the definition of a linear map from a basis.
IDK why but the replies to your comment crack me up because they ended up confusing me rather than helping. It's the same for me. Impenetrable.
I'm an applied math PhD who thinks linear algebra is the best thing ever, and it's the nuts and bolts of modern AI, so for fun and profit I'll attempt a quick cheat sheet.
To manage expectations, this won't be very satisfying by itself. You have to do a lot of exercises for this stuff to become second nature. But hopefully it at least imparts a sense that the topic is conceptually meaningful and not just a profusion of interacting symbols. For brevity, we'll pretend real numbers are the only numbers that exist; assume basic knowledge of vectors; and, I won't say anything about eigenvalues.
1. The most important thing to know about matrices is that they are linear maps. Specifically, an m x n matrix is a map from n-dimensional space (R^n) to m-dimensional space (R^m). That means that you can use the matrix as a function, one which takes as input a vector with n entries and outputs a vector with m entries.
2. The columns of a matrix are vectors. They tell you what outputs are generated when you take the standard basis vectors and feed them as inputs to the associated linear map. The standard basis vectors of R^n are the n vectors of length 1 that point along the n coordinate axes of the space (the x-axis, y-axis, z-axis, and beyond for higher-dimensional spaces). Conversely, a vector with n entries is also an n x 1 column matrix.
3. Every vector can be expressed uniquely as a linear combination (weighted sum) of standard basis vectors, and linear maps work nicely with linear combinations. Specifically, F(ax + by) = aF(x) + bF(y) for any real-valued "weights" a,b and vectors x,y. From this, you can show that a linear map is uniquely determined by what it maps the standard basis vectors to. This + #2 explains why linear maps and matrices are equivalent concepts.
4a. The way you apply the linear map to an arbitrary vector is by matrix-vector multiplication. If you write out (for example) a 3 x 2 matrix and a 2 x 1 vector, you will see that there is only one reasonable way to do this: each 1 x 2 row of the matrix must combine with the 2 x 1 input vector to produce an entry of the 3 x 1 output vector. The combination operation is, you flip the row from horizontal to vertical so it's a vector, then you dot-product it with the input vector.
4b. Notice how when you multiply 3x2 matrix with 2x1 vector, you get a 3x1 vector. In the "size math" of matrix multiplication, (3x2) x (2x1) = (3x1); the inner 2's go away, leaving only the outer numbers. This "contraction" of the inner dimensions, which happens via the dot product of matching vectors, is a general feature of matrix multiplication. Contraction is also the defining feature of how we multiply tensors, the 3D and higher-dimensional analogues of matrices.
5. Matrix-matrix multiplication is just a bunch of matrix-vector multiplications put side-by-side into a single matrix. That is to say, if you multiply two matrices A and B, the columns of the resulting matrix C are just the individual matrix-vector multiplications of A with the columns of B.
6. Many basic geometric operations, such as rotation, shearing, and scaling, are linear operations, so long as you use a version of them that keeps the origin fixed (maps the zero vector to zero vector). This is why they can be represented by matrices and implemented in computers with matrix multiplication.
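To tie items 2 through 4 together, here's a tiny C sketch (the particular map is made up) that builds a 2x2 matrix from the images of the standard basis vectors and applies it as a weighted sum of its columns:

    #include <stdio.h>

    int main(void) {
        /* A linear map on R^2 is pinned down by where it sends e1 = (1,0)
           and e2 = (0,1); those images become the columns of its matrix. */
        double f_e1[2] = {2.0, 0.0};   /* stretch x by 2 */
        double f_e2[2] = {1.0, 1.0};   /* shear: e2 goes to (1,1) */

        /* 2x2 matrix with f_e1 and f_e2 as columns:
             | 2 1 |
             | 0 1 |                                                      */
        double A[2][2] = {
            {f_e1[0], f_e2[0]},
            {f_e1[1], f_e2[1]},
        };

        /* Applying A to v = (3, 4): since v = 3*e1 + 4*e2 and the map is linear,
           the output is 3*f(e1) + 4*f(e2), i.e. a weighted sum of the columns. */
        double v[2] = {3.0, 4.0};
        double out[2];
        for (int i = 0; i < 2; i++)
            out[i] = A[i][0] * v[0] + A[i][1] * v[1];

        printf("A v = (%g, %g)\n", out[0], out[1]);  /* (10, 4) */
        return 0;
    }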
I think this is pretty instructor-dependent. I had two LinAlg courses, and in the first, I felt like I was building a great intuition. In the second, the instructor seemed to make even the stuff I previously learned seem obtuse and like "facts to memorize."
Maybe linear algebra is more instructor-dependent, since we have fewer preexisting concepts to build on?
A lot of people who find themselves having to deal with matrices when programming have never taken that class or learned those things (or did so such a long time ago that they've completely forgotten). I assume this is aimed at such people, and he's just reassuring them that he's not going to talk about the abstract aspects of linear algebra, which certainly exist.
I'd take issue with his "most programmers are visual thinkers", though. Maybe most graphics programmers are, but I doubt it's an overwhelming majority even there.
This is interesting because, to me, programming is a deeply visual activity. It feels like wandering around in a world of forms until I find the structures I need, and actually writing out the code is mostly a formality.
I would describe my experience of it similarly, but wouldn't call it "visual thinking" in the sense meant in the article, where one uses actual imagery and visual-spatial reasoning. Indeed, I almost completely lack the ability to conjure mental imagery (aphantasia) and I've speculated it might be because a part of my visual cortex is given over to the pseudo-visual activity that seems to take place when I program.
I'm especially sure my sort of pseudo-visual thinking isn't what the article means by "visual thinking" because I also use it when working through "piles of abstract math", which I take to very kindly indeed.
Is your "wandering" of this sort of pseudo-visual nature, or do you see actual visual images that could be drawn? Very intriguing if the latter, and I'd be curious to know what they look like.
> Is your "wandering" of this sort of pseudo-visual nature, or do you see actual visual images that could be drawn?
They're like if the abstract machines you talk about in CS theory classes were physical objects.
For example, thinking about a data processing pipeline, I might see the different components performing transformations on messages flowing through it. I can focus on one component and think about how it takes apart the message to extract the structure it's trying to manipulate, interacts with its local state, etc. If something is active and stateful it feels different than if it's just a plain piece of data. I run the machine through its motions to understand where the complexity is and where things could break, comparing different designs against each other.
If I'm thinking about a data format, I think about the relationships between containers, headers, offsets between structures, etc., like pieces that I can move around to see how their relationships change and understand how it would work in practice.
It's more than an image that can be drawn because the pieces are in motion as they operate. It's the same kind of "material" that mathematical objects are made out of when I'm thinking about abstract math. It's an immensely useful skill for doing my job, in designing systems.
I actually struggle a lot with translating the systems in my head into prose. To me, certain design decisions are completely obvious and wouldn't need to be stated, so when we all understand the product goals I often neglect to explain why a certain thing works the way it does, because to me it's completely obvious how it's useful towards achieving the product goals. So that's something I have to actively put more effort into.
I also really struggled when I took a real linear algebra class, since it was taught in a very blackboardy "tabular" style which was harder for me to visualize. I was unfamiliar with it due to being used to thinking about matrices in the context of computer graphics and game engines.
Do you have anything I can read about that? I'm definitely on the spectrum and have whatever the opposite of aphantasia is, I can see things very clearly in my head
"In Experiment 2 we have shown that people with aphantasia report higher AQ scores (more traits associated with autism than controls), and fall more often within the range suggestive of autism (≥32)."
Math achievement correlates strongly with visuospatial reasoning. Programmers may not be as proficient in math as economists, but they are better at it than biologists or lawyers.
I would distinguish between visual imagination and visuospatial reasoning.
For people like myself with aphantasia, there are often problems solving strategies that can help you when you can’t visualize. Like draw a picture.
And lots of problems don’t really require as much visual imagination as you would think. I’m pretty good at math, programming, and economics. Not top tier, but pretty good.
If there are problems out there that you struggle with compared to others, then that’s the universe telling you that you don’t have a comparative advantage in it. Do something else and hire the people who can more easily solve them if you need it.
It sounds like you have routed around your spatial visualization deficit, but that just proves the importance of alternate cognitive strategies rather than indicate that such an aptitude or deficit doesn’t ceteris paribus impact mathematical achievement.
I took some sort of IQ test when I was a kid and there was an entire section that was "if you rotate this object around that axis, it matches which of the following options". Try as I might, I can't picture this in my head (picturing anything other than a sphere or a cube is tough) but I found that I could look at the options and logically exclude them in a very tedious way by inspection.
It's one of the reasons I like computer graphics so much: the computer does the rotation for you! Stereo graphics (using the funny LCD glasses) was a true revelation to me, and learning how to rotate things using matrices was another.
I have taken several linear algebra courses, one from my high school and two from universities. The thing is, not all courses of linear algebra will discuss rotations the way you discuss it. One reason is that sometimes a high school linear algebra course cannot assume students have learned trigonometry. I've seen teachers teach it just to solve larger linear systems of equations. Another reason is that sometimes a course will focus just on properties of vector spaces without relating them to geometry; after all who can visualize things when the course routinely deals with 10-dimensional vectors or N-dimensional ones where N isn't a constant.
I think teaching beginner linear algebra using matrices representing systems of equations is a pedagogical mistake. It gives the wrong impression that matrices are linear algebra and makes it difficult for students to think about it in an abstract way. A better way is to start by discussing abstract linear combinations and then illustrating what can be done with this using visualizations in various coordinate systems. Once the student understands this intuitively, systems of equations and matrices can be brought up as equivalent ways to represent linear transformations on paper. It’s important to emphasize that matrices are convenient but not the only way to write the language of linear algebra.
Usually you just draw a 2D or 3D picture and say "n" while pointing to it. e.g. I had a professor that drew a 2D picture on a board where he labeled one axis R^m and the other R^n and then drew a "graph" when discussing something like the implicit function theorem. One takeaway of a lot of linear algebra is that doing this is more-or-less correct (and then functional analysis tells you this is still kind-of correct-ish even in infinite dimensions). Actually SVD tells you in some sense that if you look at things in the right way, the "heart" of a linear map is just 1D multiplication acting independently along different axes, so you don't need to consider all n dimensions at once.
> Anyone who has taken linear algebra should know that [...]
My university level linear algebra class didn't touch practical applications at all, which was frustrating to me because I knew from some background doing hobbyist game dev that it could be very useful. I still wish I had a better understanding of the use cases for things like eigenvectors/values.
Here are some applications of eigenvectors and eigenvalues:
1) If you have a set of states and a stochastic transition function, which gives for each starting state, the probability distribution over what the state will be at the next time step, you can describe this as a matrix. The long-term behavior of applying this can be described using the eigenvectors and eigenvalues of this matrix. Any stable distribution will be an eigenvector with eigenvalue 1. If there is periodicity to the behavior, where for some initial distributions, the distribution will change over time in a periodic way, where it ends up endlessly cycling through a finite set of distributions, then the matrix will have eigenvalues that are roots of unity (i.e. a complex number s such that s^n = 1 for some positive integer n). Eigenvalues with absolute value less than 1 correspond to transient contributions to the distribution which will decay (the closer to 0, the quicker the decay.). When there are finitely many states, there will always be at least one eigenvector with eigenvalue 1.
2) Related to (1), there is the PageRank algorithm, where one takes a graph where each node has links to other nodes, and one models a random walk on these nodes, and one uses the eigenvector one (approximately) finds in order to find the relative importance of the different nodes.
3) Rotations generally have eigenvalues that are complex numbers with length 1. As mentioned in (1), eigenvalues that are complex numbers with length 1 are associated with periodic/oscillating behavior. Well, I guess it sorta depends how you are using the matrix. If you have a matrix M with all of its eigenvalues purely imaginary, then exp(t M) (with t representing time) will describe oscillations with rates given by those eigenvalues. exp(t M) itself will have eigenvalues that are complex numbers of length 1. This is very relevant in solutions to higher order differential equations or differential equations where the quantity changing over time is a vector quantity.
____
But, for purposes of gamedev, I think probably the eigenvalues/eigenvectors are probably the less relevant things.
Probably instead, at least for rendering and such, you want stuff like, "you can use homogeneous coordinates in order to incorporate translations and rotations into a single 4x4 matrix (and also for other things relating the 3d scene to the 2d screen)", and stuff about like... well, quaternions can be helpful.
Of course, it all depends what you are trying to do...
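As a rough illustration of (1) and (2), here's a tiny power-iteration sketch in C; the transition probabilities are made up, and a real PageRank adds a damping factor on top of this basic idea:

    #include <stdio.h>

    int main(void) {
        /* Column-stochastic transition matrix for 3 states: entry P[i][j] is the
           probability of moving to state i given the current state is j. */
        double P[3][3] = {
            {0.9,  0.2, 0.1},
            {0.05, 0.7, 0.3},
            {0.05, 0.1, 0.6},
        };
        double x[3] = {1.0, 0.0, 0.0};   /* start in state 0 with probability 1 */

        /* Power iteration: x <- P x.  The part of x along eigenvectors with
           |eigenvalue| < 1 decays, leaving the stationary distribution
           (an eigenvector with eigenvalue 1). */
        for (int step = 0; step < 200; step++) {
            double next[3] = {0, 0, 0};
            for (int i = 0; i < 3; i++)
                for (int j = 0; j < 3; j++)
                    next[i] += P[i][j] * x[j];
            for (int i = 0; i < 3; i++)
                x[i] = next[i];
        }

        printf("approximate stationary distribution: %.3f %.3f %.3f\n",
               x[0], x[1], x[2]);
        return 0;
    }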
> This guy makes it sound like he had to come up with these concepts from scratch, and it's some sort of pure visual genius rather than math. But... it's just math.
The problem with this kind of thinking is that it encourages the exact kind of teaching you disparage. It's very easy to get lost in the sauce of arbitrary notational choices that the underlying concepts end up completely lost. Before you know it, matrices are opaque self-justified atoms in an ecosystem rather than an arbitrary tabular shorthand. Mathematics is not a singular immutable dogma that dropped out of the sky as natural law, and rediscovery is the most powerful tool for understanding.
When I was studying and made the mistake of choosing 3D computer graphics as a lecture, I remember some 4x4 matrix that was used for rotation, with all kinds of weird terms in it, derived only once, in a way I was not able to understand and that didn't relate to any visual idea or imagination, which makes it extra hard for me to understand it, because I rely a lot on visualization of everything. So basically, there was a "magical formula" to rotate things and I didn't memorize it. Exam came and demanded having memorized this shitty rotation matrix. Failed the exam, changed lectures. High quality lecturing.
Later, in another lecture at another university, I had to rotate points around a center point again. This time I found 3 3x3 matrices on wikipedia, one for each axis. Maybe making at least seemingly a little bit more sense, but I think I never got to the basis of that stuff. Never seen a good visual explanation of this stuff. I ended up implementing the 3 matrix multiplications and checked the 3D coordinates coming out of that in my head by visualizing and thinking hard about whether the coordinates could be correct.
I think visualization is the least of my problems. Most math teaching sucks though, and sometimes it is just the wrong format or not visualized at all, which makes it very hard to understand.
The first lecture was using a 4x4 matrix because you can use it for a more general set of transformations, including affine transforms (think: translating an object by moving it in a particular direction).
Since you can combine a series of matrix multiplications by just pre-multiplying the matrix, this sets you up for doing a very efficient "move, scale, rotate" of an object using a single matrix multiplication of that pre-calculated 4x4 matrix.
If you just want to, e.g., scale and rotate the object, a 3x3 matrix suffices. Sounds like your first lecture jumped way too fast to the "here's the fully general version of this", which is much harder for building intuition for.
Sorry you had a bad intro to this stuff. It's actually kinda cool when explained well. I think they probably should have started by showing how you can use a matrix for scaling:
[[2, 0, 0],
[0, 1.5, 0],
[0, 0, 1]]
for example, will grow an object by 2x in the x dimension, 1.5x in the y dimension, and keep it unchanged in the z dimension. (You'll note that it follows the pattern of the identity matrix). The derivation of the rotation matrix is probably best first derived for 2d; the wikipedia article has a decentish explanation:
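For example, a minimal C sketch of the 2D case: the columns of the rotation matrix are just where the x and y axes end up, and applying it is two dot products:

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        /* Rotate the point (1, 0) by 90 degrees counter-clockwise.
           The columns of the matrix are the rotated images of the x and y axes:
             | cos t  -sin t |
             | sin t   cos t |                                            */
        double t = acos(-1.0) / 2.0;    /* pi/2 */
        double R[2][2] = {
            {cos(t), -sin(t)},
            {sin(t),  cos(t)},
        };

        double p[2] = {1.0, 0.0};
        double q[2] = {
            R[0][0] * p[0] + R[0][1] * p[1],
            R[1][0] * p[0] + R[1][1] * p[1],
        };

        printf("(%.3f, %.3f)\n", q[0], q[1]);   /* roughly (0, 1) */
        return 0;
    }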
The first time I learned it was from a book by LaMothe in the 90s and it starts with your demonstration of 3D matrix transforms, then goes "ha! gimbal lock" then shows 4D transforms and the extension to projection transforms, and from there you just have an abstraction of your world coordinate transform and your camera transform(s) and most everything else becomes vectors. I think it's probably the best way to teach it, with some 2D work leading into it as you suggest. It also sets up well for how most modern game dev platforms deal with coordinates.
Think OpenGL used all those 2,3,4D critters at API level. It must be very hardware friendly to reduce your pipeline to matrix product. Also your scene graph (tree) is just this, you attach relative rotations and translations to graph nodes. You push your mesh (stream of triangles) at tree nodes, and composition of relative transforms up to the root is matrix product (or was the inverse?) that transform the meshes that go to the pipeline. For instance character skeletons are scene subgraphs, bones have translations, articulations have rotations. That's why it is so convenient to have rotations and translations in a common representation, and a linear one (4D matrix) is super. All this excluding materials, textures, and so on, I mean.
> The first lecture was using a 4x4 matrix because you can use it for a more general set of transformations, including affine transforms (think: translating an object by moving it in a particular direction).
I think this is mixing up concepts that are orthogonal to linear spaces, linear transformations, and even specific operations such as rotations.
The way you mention "more general set of transformations" suggests you're actually referring to homogeneous coordinates, which is a trick that allows a subset of matrix-vector multiplication and vector addition in 3D spaces to be expressed as a single matrix-vector multiplication in 4D space.
This is fine and dandy if your goal is to take a headstart to 3D programming, where APIs are already designed around this. This is however a constrained level of abstraction above actual linear algebra, which may be and often is more confusing.
You can do a rotation or some rotations but SO(3) is not simply connected.
It mostly works for rigid bodies centered on the origin, but gimbal lock or Dirac's plate trick are good counterexample lenses. Twirling a baton or a lasso will show that 720 degrees is the invariant rotation in SO(3).
The point at infinity with a 4x4 matrix is one solution; SU(2), quaternions, or more recently the geometric product are other options with benefits at the cost of complexity.
I think you are confused about what 'simply connected' means. A 3x3 matrix can represent any rotation. Also from a given rotation there is a path through the space of rotations to any other rotation. It's just that some paths can't be smoothly mapped to some other paths.
SO(3) contains all of the orthogonal 3x3 matrices of determinant 1.
If you are dealing with rigid bodies rotated about the origin, like with the product of linear translations, you can avoid the problem. At least with an orthonormal basis for R^3 and a real-valued orthogonal 3x3 matrix, i.e. one whose product with its transpose is the identity matrix and whose determinant is 1.
But as soon as you are dealing with balls, where the magnitude can be anything from the origin to the radius, you run into the issue that the antipodes are actually the same point (consider the north and south poles being the same point); that is what I mean when I say the topology is not simply connected.
The rigid body rotation about the origin is just a special case.
Twist a belt twice and tape one end to the table and you can untwist it with just horizontal translation, twist it once (360deg) and you cannot.
In computer graphics, 4x4 matrices let you do a rotation and a translation together (among other things). There's the 3x3 rotation block you found later as well as a translation vector embedded in it. Multiplying a sequence of 4x4 matrices together accumulates the rotations and translations appropriately as if they were just a bunch of function applications. i.e. rotate(translate(point)) is just rotation_matrix * translation_matrix * point_vector if you construct your matrices properly. Multiplying a 4x4 matrix with another 4x4 matrix yields a 4x4 matrix result, which means that you can store an arbitrary chain of rotations and translations accumulated together into a single matrix...
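A rough C sketch of that composition, assuming column vectors and a row-by-column 4x4 layout (one of several conventions discussed elsewhere in this thread); the Mat4 helpers are made up for illustration:

    #include <math.h>
    #include <stdio.h>

    typedef struct { double m[4][4]; } Mat4;   /* m[row][column], column vectors */

    static Mat4 mat4_mul(Mat4 a, Mat4 b) {
        Mat4 c = {{{0}}};
        for (int i = 0; i < 4; i++)
            for (int j = 0; j < 4; j++)
                for (int k = 0; k < 4; k++)
                    c.m[i][j] += a.m[i][k] * b.m[k][j];
        return c;
    }

    static Mat4 translate(double tx, double ty, double tz) {
        Mat4 t = {{{1,0,0,tx},{0,1,0,ty},{0,0,1,tz},{0,0,0,1}}};
        return t;
    }

    static Mat4 rotate_z(double angle) {
        double c = cos(angle), s = sin(angle);
        Mat4 r = {{{c,-s,0,0},{s,c,0,0},{0,0,1,0},{0,0,0,1}}};
        return r;
    }

    int main(void) {
        /* "Rotate 90 degrees about z, then translate by (5,0,0)" collapses into
           one matrix; the point rides along as a homogeneous column (x,y,z,1). */
        Mat4 xform = mat4_mul(translate(5, 0, 0), rotate_z(acos(-1.0) / 2.0));

        double p[4] = {1, 0, 0, 1}, q[4] = {0, 0, 0, 0};
        for (int i = 0; i < 4; i++)
            for (int j = 0; j < 4; j++)
                q[i] += xform.m[i][j] * p[j];

        printf("(%.3f, %.3f, %.3f)\n", q[0], q[1], q[2]);  /* roughly (5, 1, 0) */
        return 0;
    }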
Yeah you need to build up the understanding so that you can re-derive those matrices as needed (it's mostly just basic trigonometry). If you can't, that means a failure of your lecturer or a failure in your studying.
The mathematical term for the four by four matrices you were looking at is "quaternion" (I.e. you were looking at a set of four by four matrices isomorphic to the unit quaternions).
Why use quaternions at all, when three by three matrices can also represent rotations? Three by three matrices contain lots of redundant information beyond rotation, and multiplying quaternions requires fewer scalar additions and multiplications than multiplying three by three matrices. So it is cheaper to compose rotations. It also avoids singularities (gimbal lock).
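A minimal sketch of that composition step, assuming the Hamilton (w, x, y, z) convention; composing two rotations this way costs 16 multiplications and 12 additions, versus 27 and 18 for composing two 3x3 matrices:

    #include <stdio.h>

    typedef struct { double w, x, y, z; } Quat;

    /* Hamilton product: for unit quaternions this composes the corresponding
       rotations.  16 multiplications, 12 additions. */
    static Quat quat_mul(Quat a, Quat b) {
        Quat r;
        r.w = a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z;
        r.x = a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y;
        r.y = a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x;
        r.z = a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w;
        return r;
    }

    int main(void) {
        /* Two 90-degree rotations about the z axis: q = (cos 45, 0, 0, sin 45). */
        double s = 0.70710678118654752;
        Quat q90 = {s, 0, 0, s};
        Quat q180 = quat_mul(q90, q90);
        /* Expect (0, 0, 0, 1), i.e. a 180-degree rotation about z. */
        printf("%.3f %.3f %.3f %.3f\n", q180.w, q180.x, q180.y, q180.z);
        return 0;
    }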
This was part of Steve Baker’s (“Omniverous Hexapod”, sic) extensions to a long-standing Usenet FAQ about graphics programming, put out by “Carniverous Hexapod” (sic). It’s at least two decades old, and the FAQ he built on may be from the 1990s? I have the niggling recollection that the Carniverous name may have been based on the aliens in Vernor Vinge’s _A Fire Upon the Deep_.
He did not invent it, but he probably had to deal with aspiring graphics programmers who were not very math-savvy.
Honestly, many math teachers are kinda bad at conveying all that.
When everything clicked a few years down the line it all became so simple.
Like when you mention "linear operation": the word linear doesn't always make intuitive sense in terms of rotations or scaling if you have only encountered simple 1 or 2 dimensional linear transformations when doing more basic graphics programming.
As a teacher, I think the biggest lesson I had to learn was to always have at least 3 different ways of explaining everything to give different kinds of people different entrypoints into understanding concepts.
For someone uninitiated, a term like "basis vector" can be pure gibberish if it doesn't follow an example of a transform such as a viewport change, and it needs to be repeated after your other explanations (for example, that the vector components in the source view are just scalars upon the basis vectors when multiplied with a matrix, rather than some heavy, un-intuitive concept).
Math is just a standardized way to communicate those concepts though, it's a model of the world like any other. I get what you mean, but these intuitive or visualising approaches help many people with different thinking processes.
Just imagine that everyone has equal math ability, except the model of math and representations of mathematical concepts and notation is more made for a certain type of brains than others. These kind of explanations allow bringing those people in as well.
For my fellow visual thinkers who might be looking for a linear algebra book that focuses more on developing the geometric intuition for stuff like this, rather than just pure numeric linear system solving, let me recommend:
"Practical Linear Algebra: A Geometry Toolbox" by Farin and Hansford.
There are a lot more ways to look at and understand these mysterious beasts called matrices. They seem to represent a more fundamental primordial truth. I'm not sure what it is. The determinant of a matrix indicates the area or volume spanned by its component vectors. Complex matrices used in the Fourier transform are beautiful. Quantum mechanics and AI seem to be built on matrices. There is hardly any area of mathematics that doesn't utilize matrices as tools. What exactly is a matrix? Just a grid of numbers? I don't think so.
The fundamental truth is that matrices represent linear transformations, and all of linear algebra is developed in terms of linear transformations rather than just grids of numbers. It all becomes much clearer when you let go of the tabular representation and study the original intentions that motivated the operations you do on matrices.
My appreciation for the subject grew considerably after working through the book "Linear Algebra done right" by Axler https://linear.axler.net
Spatial transformations? Take a look at the complex matrices in Fourier transforms with nth roots of unity as its elements. The values are cyclic, and do not represent points in an n-D space of Euclidean coordinates.
Yes; I wrote linear transformation on purpose not to remain constrained on spatial or geometric interpretations.
The (discrete) Fourier transform is also a linear transformation, which is why the initial effort of thinking abstractly in terms of vector spaces and transformations between them pays lots of dividends when it's time to understand more advanced topics such as the DFT, which is "just" a change of basis.
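To make that concrete, here's a small C99 sketch (the input values are arbitrary) that builds the N = 4 DFT matrix from powers of an Nth root of unity and applies it as an ordinary matrix-vector product:

    #include <complex.h>
    #include <math.h>
    #include <stdio.h>

    int main(void) {
        enum { N = 4 };
        /* DFT matrix: W[j][k] = w^(j*k) where w = exp(-2*pi*i/N) is an Nth
           root of unity, so the entries cycle through the same values. */
        double complex W[N][N];
        double complex w = cexp(-2.0 * acos(-1.0) * I / N);
        for (int j = 0; j < N; j++)
            for (int k = 0; k < N; k++)
                W[j][k] = cpow(w, (double)(j * k));

        /* The DFT of a signal is just this matrix applied to it. */
        double complex x[N] = {1, 2, 3, 4};
        for (int j = 0; j < N; j++) {
            double complex sum = 0;
            for (int k = 0; k < N; k++)
                sum += W[j][k] * x[k];
            printf("X[%d] = %.2f %+.2fi\n", j, creal(sum), cimag(sum));
        }
        return 0;
    }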
>[Matrices] seem to represent a more fundamental primordial truth.
No, matrices (or more specifically matrix multiplication) are a useful result picked out of a huge search space defined as "all the ways to combine piles of numbers with arithmetic operators". The utility of the discovery is determined by humans looking for compact ways to represent ideas (abstraction). One of the most interesting anecdotes in the history of linear algebra was how Hamilton finally "discovered" a way to multiply them. "...he was out walking along the Royal Canal in Dublin with his wife when the solution in the form of the equation i^2 = j^2 = k^2 = ijk = −1 occurred to him; Hamilton then carved this equation using his penknife into the side of the nearby Broom Bridge" [0]
The "primordial truth" is found in the selection criteria of the human minds performing the search.
A lot of areas use grids of numbers. And matrix theory actually incorporates every area that uses grids of numbers, and every rule in those areas.
For example, the simplest difficult thing in matrix theory, matrix multiplication, is an example of this IMO. It looks really weird in the context of grids of numbers, its properties seem incidental, and the proofs are complicated. But matrix multiplication is really simple and natural in the context of linear transformations between vector spaces.
Translation (move something left/right, up/down)
Rotation (turn something 2 degrees, 90 degrees, 180 degrees, 360 degrees back to the same heading)
Scaling (make something larger, smaller, etc)
(And a few more that don't help right now)
The first two can be visualized simply in 2d: just take a paper/book/etc. Move it left-right, up-down, rotate it... The book in the original position and rotation compared to the new position and rotation can be described as a vector space transformation. Why?
Because you can look at it in 2 ways, either the book moved from your vantage point, or you follow the book looking at it the same way and the world around the book moved.
In both cases, something moved from one space (point of reference) to another "space".
The thing that defines the space is a "basis vector"; basically it says what is "up", what is "left", and what is "in" in the way we move from one space to another.
Think of it as: you have a piece of card on a paper. Draw a line/axis along the bottom edge as the X count, then draw on the left side upwards the Y count. In the X,Y space (the "from space") you count the X and Y steps of various feature points.
Now draw the "to space" as another X axis and another Y axis (could be rotated, could be scaled, could just be moved) and take the counts in steps and put them inside the "to space" measured in equal units as they were in the from space.
Once the feature points are replicated in the "to space" you should have the same image as before, just within the new space.
This is the essence of a so-called linear (equal number of steps) transform (moved somewhere else), and also exactly what multiplying a set of vectors by a matrix achieves (simplified; in this context, the matrix really is mostly a representation of a number of the above-mentioned basis vectors that define the X, Y, etc. of the movement).
The set of all matrices of a fixed size is a vector space because matrix addition and scalar multiplication are well-defined and follow all the vector space axioms.
But be careful of the map–territory relation.
If you can find a model that is a vector space, which you can extend to an inner product space and extend that to a Hilbert space, nice things happen.
Really the amazing part is finding a map (model) that works within the superpowers of algorithms, which often depends upon finding many-to-one reductions.
Get stuck with a hay in the haystack problem and math as we know it now can be intractable.
Vector spaces are nice and you can map them to abstract algebra, categories, or topos and see why.
A matrix is just a list of where a linear map sends each basis element (the nth column of a matrix is the output vector for the nth input basis vector). Lots of things are linear (e.g. scaling, rotating, differentiating, integrating, projecting, and any weighted sums of these things). Lots of other things are approximately linear locally (the derivative if it exists is the best linear approximation. i.e. the best matrix to approximate a more general function), and e.g. knowing the linear behavior near a fixed point can tell you a lot about even nonlinear systems.
Yes, I think of them as saying "and this is what the coordinates in our coordinate system [basis] shall mean from now on". Systems of nonlinear equations, on the other hand, are some kind of sea monsters.
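As a rough sketch of the "best linear approximation" remark above, here's a finite-difference Jacobian in C for a made-up nonlinear map; each column approximates where the corresponding basis direction is sent:

    #include <math.h>
    #include <stdio.h>

    /* A nonlinear map f : R^2 -> R^2, made up for illustration. */
    static void f(const double x[2], double out[2]) {
        out[0] = x[0] * x[0] - x[1];
        out[1] = sin(x[0]) + x[1] * x[1];
    }

    int main(void) {
        /* The derivative at a point is a matrix: column j is (approximately) how
           the output moves when you nudge the j-th input, i.e. where the j-th
           basis direction is sent by the best linear approximation. */
        double p[2] = {1.0, 2.0};
        double h = 1e-6, J[2][2], base[2], bumped[2];
        f(p, base);
        for (int j = 0; j < 2; j++) {
            double q[2] = {p[0], p[1]};
            q[j] += h;
            f(q, bumped);
            for (int i = 0; i < 2; i++)
                J[i][j] = (bumped[i] - base[i]) / h;
        }
        /* Exact Jacobian at (1,2) is [[2, -1], [cos(1), 4]]. */
        printf("%.3f %.3f\n%.3f %.3f\n", J[0][0], J[0][1], J[1][0], J[1][1]);
        return 0;
    }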
The age doesn't affect this matrix part, but just FYI that any specific APIs discussed will probably be out of date compared to modern GPU programming.
> What stops most novice graphics programmers from getting friendly with matrices is that they look like 16 utterly random numbers.
Wait until they see how physicists and chemists treat matrices! They will pine for the day when a transformation can be described by 16 numbers in a table.
I don't think there's any mathematical reason to lay out the elements in memory that way. Sure given no context I would probably use i = row + n col as index, but it doesn't really matter much me.
If I had to pick between a matrix being a row of vectors or a column of covectors, I'd pick the latter. And M[i][j] should be the element in row i column j, which is nonnegotiable.
This was answered in the article.
>> It is often fortunate that the OpenGL matrix array is laid out the way it is because it results in those three elements being consecutive in memory.
Matlab deliberately notes that its matrices are laid out like this since most matrix operations occur on columns, a whole column can be loaded on a cache line.
The only thing that ever made this click for me was that the columns can be interpreted as the values for the new, post-transform basis vectors
Anecdote from someone does a lot of graphics programming. (Building a molecule viewer/editor):
I've only needed matrices exactly once, when building the engine. It does a few standard transforms (model view matrices etc).
The rest is all len-3 Vecs, and unit quaternions. You can use matrices for these, but I'm on team Vec+Quaternion!
> Mathematicians like to see their matrices laid out on paper this way (with the array indices increasing down the columns instead of across the rows as a programmer would usually write them).
Could a mathematician please confirm of disconfirm this?
I think that different branches of mathematics have different rules about this, which is why careful writers make it explicit.
Not a mathematician, just an engineer that used matrix a lot (and even worked for MathWorks at one point), I would say that most mathematicians don't care. Matrix is 2D, they don't have a good way to be laid out in 1D (which is what is done here, by giving them linear indices). They should not be represented in 1D.
The only type of mathematicians that actually care are: - the one that use software where using one or the other and the "incorrect" algorithm may impact the performance significantly. Or worse, the one that would use software that don't use the same arbitrary choice (column major vs row major). And when I say that they care, it's probably a pain for them to think about it. - the one that write these kind of software (they may describe themselves as software engineer, but some may still call themselves mathematicians, applied mathematicians, or other things like that).
Now maybe what the author wanted to say is that some language "favored by mathematician" (Fortran, MATLAB, Julia, R) are column major, while language "favored by computer scientist" (C, C++) are row major
Languages that don't have multidimensional arrays tend to have "arrays of arrays" instead, and that naturally leads to a layout where the last subscript varies fastest. Languages that do have multidimensional arrays can of course lay them out in any order.
ah, so you can get row vectors with a type cast in C but not column ones. While in Fortran and friends is the converse (if they casted). Yep that is more mathy. Linear map evaluation is linear combination of columns.
EDIT: type cast or just single square bracket application in C
What I suspect he really means is that FORTRAN lays out its arrays column-major, whilst C choose row-major. Historically most math software was written in the former, including the de facto standard BLAS and LAPACK APIs used for most linear algebra. Mix-and-matching memory layouts is a recipe for confusion and bugs, so "mathematicians" (which I'll read as people writing a lot of non-ML matrix-related code) tend to prefer to stick with column major.
Of course things have moved on since then and a lot of software these days is written in languages that inherited their array ordering from C, leading to much fun and confusion.
The other gotcha with a lot of these APIs is of course 0 vs 1-based array numbering.
The MKL blas/lapack implementation also provides the “cblas” interface (I’m sure most blas implementations do, I’m just familiar with MKL—BLIS seems quite willing to provide additional interfaces to I bet they provide it as well) which explicitly accepts arguments for row or column ordering.
Internally the matrix is tiled out anyway (for gemm at least) so column vs row ordering is probably a little less important nowadays (which isn’t to say it never matters).
Oh yes, from an actual implementation POV you can just apply some transpose and ordering transforms to convert from row major to column major or vice-versa. cblas is pretty universal though I don't think any LAPACK C API ever gained as wide support for non column-major usage (and actually has some routines where you can't just pull transpose tricks for the transformation).
Certain layouts have performance advantages for certain operations on certain microarchitectures due to data access patterns (especially for level 2 BLAS), but that's largely irrelevant to historical discussion of the API's evolution.
This is one of those things that's perennially annoying in computer graphics. Depending on the API you can have different conventions for:
- data layout (row-major vs. column major)
- pre-multiplication vs. post-multiplication of matrices
Switching either of these conventions results in a transposition of the data in memory, but knowing which one you're switching is important to getting the implementation of the math right.
And then on top of that for even more fun you have:
- Left- vs. right-handed coordinate systems
- Y-up or Z-up
- Winding order for inside/outside
There are lots of small parity things like this where you can get it right by accident but then have weird issues down the line if you aren't scrupulous. (I once had a very tedious time tracking down some inconsistencies in inside/outside determination in a production rendering system.)
Not a mathematician, but programmers definitely don't agree on whether matrices should be row-major or column-major.
I'm surprised we even agree that they should be top-down.
At this point might as well make them match the x/y convention, with first index increasing to the right, and second index increasing from bottom to top.
Programmers don’t agree on the x/y convention.
Mathematician here. I never heard that.
(In many branches the idea is that you care about the abstract linear transformation and properties instead of the dirty coefficients that depend on the specific base. I don't expect a mathematician to have an strong opinion on the order. All are equivalent via isomorphism.)
I'm an applied mathematician and this is the most common layout for dense matrices due to BLAS and LAPACK. Note, many of these routines have a flag to denote when working with a transpose, which can be used to cheat a different memory layout in a pinch. There are also parameters for increments in memory, which can help when computing across a row as opposed to down a column, which can also be co-opted. Unless there's a reason not to, I personally default to column major ordering for all matrices and tensors and use explicit indexing functions, which tends to avoid headaches since my codes are consistent with most others.
Abstractly, there's no such thing as memory layout, so it doesn't matter for things like proofs, normally.
I'm a mathematician. It's kind of a strange statement since, if we are talking about a matrix, it has two indices not one. Even if we do flatten the matrix to a vector, rows then columns are an almost universal ordering of those two indices and the natural lexicographic ordering would stride down the rows.
Yes. I think what all mathematicians can agree on is that the layout (and the starting index! :-) is like this:
Recently graduated math student here. The definition of the "vec" operator which turns a matrix into a vector works like this, stacking up columns rather than rows.
https://en.wikipedia.org/wiki/Vectorization_(mathematics)
Most fields of math that use matrices don't number each element of the matrix separately, and if they do there will usually be two subscripts (one for the row number and one for the column number).
Generally, matrices would be thought in terms of the vectors that make up each row or column.
People must get taught math terribly if they think "I don't need to worry about piles of abstract math to understand a rotation, all I have to do is think about what happens to the XYZ axes under the matrix rotation". That is what you should learn in the math class!
Anyone who has taken linear algebra should know that (1) a rotation is a linear operation, (2) the result of a linear operation is calculated with matrix multiplication, (3) the result of a matrix multiplication is determined by what it does to the standard basis vectors, the results of which form the columns of the matrix.
This guy makes it sound like he had to come up with these concepts from scratch, and it's some sort of pure visual genius rather than math. But... it's just math.
I took a linear algebra class, as well as many others. It didn't work.
Most math classes I've taken granted me some kind of intuition for the subject material. Like I could understand the concept independent from the name of the thing.
In linear algebra, it was all a series of arbitrary facts without reason for existing. I memorized them for the final exam, and probably forgot them all the next day, as they weren't attached to anything in my mind.
"The inverse of the eigen-something is the determinant of the abelian".
It was just a list of facts like this to memorize by rote.
I passed the class with a decent grade I think. But I really understood nothing. At this point, I can't remember how to multiply matrices. Specifically do the rows go with the columns or do the columns go with the rows?
I don't know if there's something about linear algebra or I just didn't connect with the instructor. But I've taken a lot of other math classes, and usually been able to understand the subject material readily. Maybe linear algebra is different. It was completely impenetrable for me.
You might want to try Linear Algebra Done Right by Sheldon Axler. It's a short book, succinct but extremely clear and approachable. It explains Linear Algebra without using determinants, which are relegated to the end, and emphasises understanding the powerful ideas underpinning the subject rather than learning seemingly arbitrary manipulations of lists and tables of numbers.
Those manipulations are of course extremely useful and worth learning, but the reasons why, and where they come from, will be a lot clearer after reading Axler.
As someone pointed out elsewhere in this thread, the book is available free at https://linear.axler.net/
The page count suggests that we have different ideas of what's meant by "short". In any case, it looks great from the forewords. If I ever want to make a serious try to really get it, this is probably what I'll use.
It is widely considered to deliver on the promise of done right!
To remind oneself how to multiply matrices together, it suffices to remember how to apply a matrix to a column vector, and that ((A B) v) = (A (B v)).
For each 1-hot vector e_i (i.e. the row vector that has a 1 in the i-th position and 0s elsewhere), apply B e_i to get the i-th column of the matrix B. Then, apply the matrix A to the result, to obtain A (B e_i), which equals (A B) e_i . This is then the i-th column of the matrix A B. And, when applying the matrix A to some column vector v, for each entry/row of the resulting vector, it is obtained by combining the corresponding row of A, with the column vector v.
So, to get the entry at the j-th row of the i-th column of (A B), one therefore combines the i-th column of B with the j-th row of A. Or, alternatively/equivalently, you can just compute the matrix (A B) column by column, by, for each e_i , computing that the i-th column of (A B) is (A (B e_i)) (which is how I usually think of it).
To be clear, I don't have the process totally memorized; I actually use the above reasoning to remind myself of the computation process a fair portion of the time that I need to compute actual products of matrices, which is surprisingly often given that I don't have it totally memorized.
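A quick numpy check of the reasoning above (my sketch, with arbitrary sizes): the i-th column of A B really is A applied to the i-th column of B.

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((3, 4))
    B = rng.standard_normal((4, 2))

    AB = A @ B
    for i in range(B.shape[1]):
        e_i = np.zeros(B.shape[1]); e_i[i] = 1.0     # one-hot column vector
        assert np.allclose(B @ e_i, B[:, i])         # B e_i is the i-th column of B
        assert np.allclose(A @ (B @ e_i), AB[:, i])  # (A B) e_i, computed column by column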
When I took linear algebra, the professor emphasized the linear maps, and somewhat de-emphasized the matrices that are used to notate them. I think this made understanding what is going on easier, but made the computations less familiar. I very much enjoyed the class.
Here's a recipe for matrix multiplication that you can't forget: choose bases b_i/c_j for your domain/codomain. Then all a matrix is is listing the outputs of a function for your basis: if you have a linear function f, then the ith column of its matrix A is just f(b_i). If you have another function g from f's codomain, then same thing, its matrix B is just the list of outputs g(c_j). Then the ith column of BA is just g(f(b_i)). If you write these things down on paper and expand out what I wrote, you'll see the usual row and column thing pop out. The point is that f(b_i) is a weighted sum of the c_i (since c_i is a basis for the target of f), but you can pull the weighted sums through the definition of g because it's linear. A basis gives you a minimal description/set of points where you need to define a function, and the definition for all other points follows from linearity.
The point of the eigen-stuff is that along some directions, linear functions are just scalar multiplication: f(v) = av. If the action in a direction is multiplication by a, then it can't also be multiplication by b. So unequal eigenvalues must mean different directions/linearly independent subspaces. So e.g. if you can find n different eigenvalues/eigenvectors, you've found a simple basis where each direction is just multiplication. You also know that it's invertible if the eigenvalues are nonzero since all you did was multiply by a_i along each direction, so you can invert it by multiplying by 1/a_i on each direction.
Taught properly it's all very straightforward, though determinants require some more buildup with a detour through things like quotienting and wedge products if you really want it to be straightforward IMO. You start by saying you want to look at oriented areas/volumes, and look at the properties you need. Then quotienting gives you a standard tool to say "I want exactly the thing that has those properties" (wedge products). Then the action on wedges gives you what your map does to volumes, with the determinant as the action on the full space. You basically define it to be what you want, and then you can calculate it by linearity/functoriality just like you expand out the definition of a linear map from a basis.
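If it helps, here is a small numpy illustration of the eigen and determinant points above (the matrix is my own arbitrary example): along each eigenvector the map is plain scalar multiplication, the inverse multiplies by 1/a_i along that direction, and the determinant (the volume-scaling factor) is the product of the eigenvalues.

    import numpy as np

    M = np.array([[3.0, 1.0],
                  [0.0, 2.0]])          # eigenvalues 3 and 2
    vals, vecs = np.linalg.eig(M)

    for a, v in zip(vals, vecs.T):
        assert np.allclose(M @ v, a * v)                 # M acts as multiplication by a along v
        assert np.allclose(np.linalg.inv(M) @ v, v / a)  # the inverse multiplies by 1/a

    assert np.isclose(np.linalg.det(M), np.prod(vals))   # volume scaling = product of eigenvalues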
IDK why, but the replies to your comment crack me up because they ended up confusing me rather than helping. It's the same for me. Impenetrable.
I'm an applied math PhD who thinks linear algebra is the best thing ever, and it's the nuts and bolts of modern AI, so for fun and profit I'll attempt a quick cheat sheet.
To manage expectations, this won't be very satisfying by itself. You have to do a lot of exercises for this stuff to become second nature. But hopefully it at least imparts a sense that the topic is conceptually meaningful and not just a profusion of interacting symbols. For brevity, we'll pretend real numbers are the only numbers that exist; assume basic knowledge of vectors; and I won't say anything about eigenvalues. (A short numpy sketch after point 6 ties several of the points together.)
1. The most important thing to know about matrices is that they are linear maps. Specifically, an m x n matrix is a map from n-dimensional space (R^n) to m-dimensional space (R^m). That means that you can use the matrix as a function, one which takes as input a vector with n entries and outputs a vector with m entries.
2. The columns of a matrix are vectors. They tell you what outputs are generated when you take the standard basis vectors and feed them as inputs to the associated linear map. The standard basis vectors of R^n are the n vectors of length 1 that point along the n coordinate axes of the space (the x-axis, y-axis, z-axis, and beyond for higher-dimensional spaces). Conversely, a vector with n entries is also an n x 1 column matrix.
3. Every vector can be expressed uniquely as a linear combination (weighted sum) of standard basis vectors, and linear maps work nicely with linear combinations. Specifically, F(ax + by) = aF(x) + bF(y) for any real-valued "weights" a,b and vectors x,y. From this, you can show that a linear map is uniquely determined by what it maps the standard basis vectors to. This + #2 explains why linear maps and matrices are equivalent concepts.
4a. The way you apply the linear map to an arbitrary vector is by matrix-vector multiplication. If you write out (for example) a 3 x 2 matrix and a 2 x 1 vector, you will see that there is only one reasonable way to do this: each 1 x 2 row of the matrix must combine with the 2 x 1 input vector to produce an entry of the 3 x 1 output vector. The combination operation is, you flip the row from horizontal to vertical so it's a vector, then you dot-product it with the input vector.
4b. Notice how when you multiply 3x2 matrix with 2x1 vector, you get a 3x1 vector. In the "size math" of matrix multiplication, (3x2) x (2x1) = (3x1); the inner 2's go away, leaving only the outer numbers. This "contraction" of the inner dimensions, which happens via the dot product of matching vectors, is a general feature of matrix multiplication. Contraction is also the defining feature of how we multiply tensors, the 3D and higher-dimensional analogues of matrices.
5. Matrix-matrix multiplication is just a bunch of matrix-vector multiplications put side-by-side into a single matrix. That is to say, if you multiply two matrices A and B, the columns of the resulting matrix C are just the individual matrix-vector multiplications of A with the columns of B.
6. Many basic geometric operations, such as rotation, shearing, and scaling, are linear operations, so long as you use a version of them that keeps the origin fixed (maps the zero vector to zero vector). This is why they can be represented by matrices and implemented in computers with matrix multiplication.
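Here is the sketch mentioned above (mine, with arbitrary example values), tying together points 1, 2, 5, and 6 in numpy:

    import numpy as np

    theta = np.pi / 2                     # a 90-degree rotation, a linear map R^2 -> R^2 (point 1)
    A = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])

    e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
    assert np.allclose(A @ e1, A[:, 0])   # point 2: columns = images of the standard basis vectors
    assert np.allclose(A @ e2, A[:, 1])

    S = np.diag([2.0, 0.5])               # point 6: scaling is also linear (the origin stays fixed)
    B = np.column_stack([A @ S[:, 0], A @ S[:, 1]])
    assert np.allclose(B, A @ S)          # point 5: A S is just A applied to S's columns, side by side

    assert np.allclose(A @ np.zeros(2), np.zeros(2))  # point 6: the zero vector maps to the zero vector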
I think this is pretty instructor-dependent. I had two LinAlg courses, and in the first, I felt like I was building a great intuition. In the second, the instructor seemed to make even the stuff I previously learned seem obtuse and like "facts to memorize."
Maybe linear algebra is more instructor-dependent, since we have fewer preexisting concepts to build on?
A lot of people who find themselves having to deal with matrices when programming have never taken that class or learned those things (or did so such a long time ago that they've completely forgotten). I assume this is aimed at such people, and he's just reassuring them that he's not going to talk about the abstract aspects of linear algebra, which certainly exist.
I'd take issue with his "most programmers are visual thinkers", though. Maybe most graphics programmers are, but I doubt it's an overwhelming majority even there.
> most programmers are visual thinkers
I remember reading that there's a link between aphantasia (inability to visualize) and being on the spectrum.
Being an armchair psychologist expert with decades of experience, I can say with absolute certainty that a lot of programmers are NOT visual thinkers.
This is interesting because, to me, programing is a deeply visual activity. It feels like wandering around in a world of forms until I find the structures I need and actually writing out the code is mostly a formality.
I would describe my experience of it similarly, but wouldn't call it "visual thinking" in the sense meant in the article, where one uses actual imagery and visual-spatial reasoning. Indeed, I almost completely lack the ability to conjure mental imagery (aphantasia) and I've speculated it might be because a part of my visual cortex is given over to the pseudo-visual activity that seems to take place when I program.
I'm especially sure my sort of pseudo-visual thinking isn't what the article means by "visual thinking" because I also use it when working through "piles of abstract math", which I take to very kindly indeed.
Is your "wandering" of this sort of pseudo-visual nature, or do you see actual visual images that could be drawn? Very intriguing if the latter, and I'd be curious to know what they look like.
> Is your "wandering" of this sort of pseudo-visual nature, or do you see actual visual images that could be drawn?
They're like if the abstract machines you talk about in CS theory classes were physical objects.
For example, thinking about a data processing pipeline, I might see the different components performing transformations on messages flowing through it. I can focus on one component and think about how it takes apart the message to extract the structure it's trying to manipulate, interacts with its local state, etc. If something is active and stateful it feels different than if it's just a plain piece of data. I run the machine through its motions to understand where the complexity is and where things could break, comparing different designs against each other.
If I'm thinking about a data format, I think about the relationships between containers, headers, offsets between structures, etc., like pieces that I can move around to see how their relationships change and understand how it would work in practice.
It's more than an image that can be drawn because the pieces are in motion as they operate. It's the same kind of "material" that mathematical objects are made out of when I'm thinking about abstract math. It's an immensely useful skill for doing my job, in designing systems.
I actually struggle a lot with translating the systems in my head into prose. To me, certain design decisions are completely obvious and wouldn't need to be stated, so when we all understand the product goals I often neglect to explain why a certain thing works the way it does, because to me it's completely obvious how it's useful towards achieving the product goals. So that's something I have to actively put more effort into.
I also really struggled when I took a real linear algebra class, since it was taught in a very blackboardy "tabular" style which was harder for me to visualize. I was unfamiliar with it due to being used to thinking about matrices in the context of computer graphics and game engines.
Do you have anything I can read about that? I'm definitely on the spectrum and have whatever the opposite of aphantasia is, I can see things very clearly in my head
"In Experiment 2 we have shown that people with aphantasia report higher AQ scores (more traits associated with autism than controls), and fall more often within the range suggestive of autism (≥32)."
https://www.sciencedirect.com/science/article/abs/pii/S10538...
Math achievement correlates strongly with visuospatial reasoning. Programmers may not be as proficient in math as economists, but they are better at it than biologists or lawyers.
I would distinguish between visual imagination and visuospatial reasoning.
For people like myself with aphantasia, there are often problems solving strategies that can help you when you can’t visualize. Like draw a picture.
And lots of problems don’t really require as much visual imagination as you would think. I’m pretty good at math, programming, and economics. Not top tier, but pretty good.
If there are problems out there that you struggle with compared to others, then that’s the universe telling you that you don’t have a comparative advantage in it. Do something else and hire the people who can more easily solve them if you need it.
It sounds like you have routed around your spatial visualization deficit, but that just proves the importance of alternate cognitive strategies rather than indicate that such an aptitude or deficit doesn’t ceteris paribus impact mathematical achievement.
https://en.wikipedia.org/wiki/Spatial_visualization_ability
You probably are high g (iq), which has, historically at least, dominated other factors in determining overall outcomes.
I took some sort of IQ test when I was a kid and there was an entire section that was "if you rotate this object around that axis, it matches which of the following options". Try as I might, I can't picture this in my head (picturing anything other than a sphere or a cube is tough), but I found that I could look at the options and logically exclude them in a very tedious way by inspection.
It's one of the reasons I like computer graphics so much: the computer does the rotation for you! Stereo graphics (using the funny LCD glasses) was a true revelation to me, and learning how to rotate things using matrices was another.
And since the economist's main skill at math is fitting a very short ruler to a very large curve... I wouldn't put them ahead of lawyers...
There are economists and there are economists. I doubt Pam Bondi was top in real analysis or other college level maths. Maybe, but I doubt it.
I have taken several linear algebra courses, one from my high school and two from universities. The thing is, not all courses of linear algebra will discuss rotations the way you discuss it. One reason is that sometimes a high school linear algebra course cannot assume students have learned trigonometry. I've seen teachers teach it just to solve larger linear systems of equations. Another reason is that sometimes a course will focus just on properties of vector spaces without relating them to geometry; after all who can visualize things when the course routinely deals with 10-dimensional vectors or N-dimensional ones where N isn't a constant.
I think teaching beginner linear algebra using matrices representing systems of equations is a pedagogical mistake. It gives the wrong impression that matrices are linear algebra and makes it difficult for students to think about it in an abstract way. A better way is to start by discussing abstract linear combinations and then illustrating what can be done with this using visualizations in various coordinate systems. Once the student understands this intuitively, systems of equations and matrices can be brought up as equivalent ways to represent linear transformations on paper. It’s important to emphasize that matrices are convenient but not the only way to write the language of linear algebra.
Usually you just draw a 2D or 3D picture and say "n" while pointing to it. e.g. I had a professor that drew a 2D picture on a board where he labeled one axis R^m and the other R^n and then drew a "graph" when discussing something like the implicit function theorem. One takeaway of a lot of linear algebra is that doing this is more-or-less correct (and then functional analysis tells you this is still kind-of correct-ish even in infinite dimensions). Actually SVD tells you in some sense that if you look at things in the right way, the "heart" of a linear map is just 1D multiplication acting independently along different axes, so you don't need to consider all n dimensions at once.
> Anyone who has taken linear algebra should know that [...]
My university level linear algebra class didn't touch practical applications at all, which was frustrating to me because I knew it could be very useful, given some background doing hobbyist game dev. I still wish I had a better understanding of the use cases for things like eigenvectors/values.
Here are some applications of eigenvectors and eigenvalues:
1) If you have a set of states and a stochastic transition function, which gives for each starting state the probability distribution over what the state will be at the next time step, you can describe this as a matrix. The long-term behavior of applying this can be described using the eigenvectors and eigenvalues of this matrix. Any stable distribution will be an eigenvector with eigenvalue 1. If there is periodicity to the behavior, where for some initial distributions the distribution will change over time in a periodic way, where it ends up endlessly cycling through a finite set of distributions, then the matrix will have eigenvalues that are roots of unity (i.e. a complex number s such that s^n = 1 for some positive integer n). Eigenvalues with absolute value less than 1 correspond to transient contributions to the distribution which will decay (the closer to 0, the quicker the decay). When there are finitely many states, there will always be at least one eigenvector with eigenvalue 1. (A short numpy sketch after this list shows how to find such a stationary distribution.)
2) Related to (1), there is the PageRank algorithm, where one takes a graph where each node has links to other nodes, and one models a random walk on these nodes, and one uses the eigenvector one (approximately) finds in order to find the relative importance of the different nodes.
3) Rotations generally have eigenvalues that are complex numbers with length 1. As mentioned in (1), eigenvalues that are complex numbers with length 1 are associated with periodic/oscillating behavior. Well, I guess it sorta depends how you are using the matrix. If you have a matrix M with all of its eigenvalues purely imaginary, then exp(t M) (with t representing time) will describe oscillations with rates given by those eigenvalues. exp(t M) itself will have eigenvalues that are complex numbers of length 1. This is very relevant in solutions to higher order differential equations or differential equations where the quantity changing over time is a vector quantity.
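And here is the sketch promised in (1) (the transition matrix is my own toy example): the stationary distribution of a 3-state chain, found as the eigenvector with eigenvalue 1.

    import numpy as np

    # Column-stochastic transition matrix: column j is the distribution over next states
    # given that the current state is j.
    P = np.array([[0.90, 0.20, 0.10],
                  [0.05, 0.70, 0.30],
                  [0.05, 0.10, 0.60]])
    assert np.allclose(P.sum(axis=0), 1.0)

    vals, vecs = np.linalg.eig(P)
    k = np.argmin(np.abs(vals - 1.0))   # pick the eigenvalue that is (numerically) 1
    pi = np.real(vecs[:, k])
    pi = pi / pi.sum()                  # normalize to a probability distribution

    assert np.allclose(P @ pi, pi)      # stationary: one step of the chain leaves it unchanged
    # Running the chain for many steps from the uniform distribution converges to pi.
    assert np.allclose(np.linalg.matrix_power(P, 100) @ np.full(3, 1/3), pi, atol=1e-6)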
____
But, for purposes of gamedev, I think the eigenvalues/eigenvectors are probably the less relevant things. Instead, at least for rendering and such, you want stuff like, "you can use homogeneous coordinates in order to incorporate translations and rotations into a single 4x4 matrix (and also for other things relating the 3d scene to the 2d screen)", and stuff about like... well, quaternions can be helpful.
Of course, it all depends what you are trying to do...
> This guy makes it sound like he had to come up with these concepts from scratch, and it's some sort of pure visual genius rather than math. But... it's just math.
The problem with this kind of thinking is that it encourages the exact kind of teaching you disparage. It's very easy to get so lost in the sauce of arbitrary notational choices that the underlying concepts end up completely obscured. Before you know it, matrices are opaque, self-justified atoms in an ecosystem rather than an arbitrary tabular shorthand. Mathematics is not a singular immutable dogma that dropped out of the sky as natural law, and rediscovery is the most powerful tool for understanding.
When I was studying and made the mistake of choosing 3D computer graphics as a lecture, I remember some 4x4 matrix that was used for rotation, with all kinds of weird terms in it, derived only once, in a way I was not able to understand and that didn't relate to any visual idea or imagination. That made it extra hard for me to understand, because I rely a lot on visualizing everything. So basically, there was a "magical formula" to rotate things and I didn't memorize it. The exam came and demanded having memorized this shitty rotation matrix. Failed the exam, changed lectures. High quality lecturing.
Later, in another lecture at another university, I had to rotate points around a center point again. This time I found 3 3x3 matrices on Wikipedia, one for each axis. Maybe they made at least seemingly a little bit more sense, but I think I never got to the basis of that stuff. I've never seen a good visual explanation of it. I ended up implementing the 3 matrix multiplications and checked the 3D coordinates coming out of them in my head by visualizing and thinking hard about whether the coordinates could be correct.
I think visualization is the least of my problems. Most math teaching sucks though, and sometimes it is just the wrong format or not visualized at all, which makes it very hard to understand.
You can do rotation with a 3x3 matrix.
The first lecture was using a 4x4 matrix because you can use it for a more general set of transformations, including affine transforms (think: translating an object by moving it in a particular direction).
Since you can combine a series of matrix multiplications by just pre-multiplying the matrix, this sets you up for doing a very efficient "move, scale, rotate" of an object using a single matrix multiplication of that pre-calculated 4x4 matrix.
If you just want to, e.g., scale and rotate the object, a 3x3 matrix suffices. Sounds like your first lecture jumped way too fast to the "here's the fully general version of this", which is much harder for building intuition for.
Sorry you had a bad intro to this stuff. It's actually kinda cool when explained well. I think they probably should have started by showing how you can use a matrix for scaling:
A diagonal matrix with 2, 1.5, and 1 on the diagonal, for example, will grow an object by 2x in the x dimension, 1.5x in the y dimension, and keep it unchanged in the z dimension. (You'll note that it follows the pattern of the identity matrix.) The rotation matrix is probably best first derived for 2D; the Wikipedia article has a decentish explanation: https://en.wikipedia.org/wiki/Rotation_matrix
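A hedged numpy sketch of both of those (the diagonal scale matrix just described and a 2D rotation; the specific numbers are mine):

    import numpy as np

    S = np.diag([2.0, 1.5, 1.0])     # scale: 2x in x, 1.5x in y, unchanged in z
    assert np.allclose(S @ np.array([1.0, 1.0, 1.0]), [2.0, 1.5, 1.0])

    theta = np.radians(90)           # 2D rotation by 90 degrees
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    assert np.allclose(R @ np.array([1.0, 0.0]), [0.0, 1.0])   # the x-axis rotates onto the y-axis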
The first time I learned it was from a book by LaMothe in the 90s and it starts with your demonstration of 3D matrix transforms, then goes "ha! gimbal lock" then shows 4D transforms and the extension to projection transforms, and from there you just have an abstraction of your world coordinate transform and your camera transform(s) and most everything else becomes vectors. I think it's probably the best way to teach it, with some 2D work leading into it as you suggest. It also sets up well for how most modern game dev platforms deal with coordinates.
I think OpenGL used all those 2, 3, 4D critters at the API level. It must be very hardware-friendly to reduce your pipeline to matrix products. Also your scene graph (tree) is just this: you attach relative rotations and translations to graph nodes. You push your mesh (stream of triangles) at tree nodes, and the composition of relative transforms up to the root is a matrix product (or was it the inverse?) that transforms the meshes that go to the pipeline. For instance, character skeletons are scene subgraphs: bones have translations, articulations have rotations. That's why it is so convenient to have rotations and translations in a common representation, and a linear one (a 4D matrix) is super. All this excluding materials, textures, and so on, I mean.
Tricks of the * Game Programming Gurus :)
> The first lecture was using a 4x4 matrix because you can use it for a more general set of transformations, including affine transforms (think: translating an object by moving it in a particular direction).
I think this is mixing up concepts that are orthogonal to linear spaces, linear transformations, and even specific operations such as rotations.
The way you mention "more general set of transformations" suggests you're actually referring to homogeneous coordinates, which is a trick that allows a subset of matrix-vector multiplication and vector addition in 3D spaces to be expressed as a single matrix-vector multiplication in 4D space.
This is fine and dandy if your goal is to take a headstart to 3D programming, where APIs are already designed around this. This is however a constrained level of abstraction above actual linear algebra, which may be and often is more confusing.
> You can do rotation with a 3x3 matrix.
You can do a rotation or some rotations but SO(3) is not simply connected.
It mostly works for rigid bodies centered on the origin, but gimbal lock and Dirac's plate trick are good counterexample lenses: twirling a baton or a lasso shows that a 720-degree rotation path can be untangled while a 360-degree one cannot.
The point at infinity with a 4x4 matrix is one solution; SU(2), quaternions, or more recently the geometric product are other options with benefits at the cost of complexity.
I think you are confused about what 'simply connected' means. A 3x3 matrix can represent any rotation. Also from a given rotation there is a path through the space of rotations to any other rotation. It's just that some paths can't be smoothly mapped to some other paths.
SO(3) contains all of the orthogonal 3x3 matrices of determinant 1.
If you are dealing with rigid bodies rotated through the origin, as with a product of linear transformations, you can avoid the problem. At least, that is, with an orthonormal basis of R^3 and a real orthogonal 3x3 matrix (one whose product with its transpose is the identity) with determinant 1.
But as soon as you are dealing with the ball picture of rotations, where the magnitude runs from the origin out to the radius, you run into the issue that antipodal points on the boundary are actually the same point (think of the north and south poles being identified). That is what I mean when I say the topology is not simply connected.
The rigid body rotation about the origin is just a special case.
Twist a belt twice and tape one end to the table and you can untwist it with just horizontal translation, twist it once (360deg) and you cannot.
In computer graphics, 4x4 matrices let you do a rotation and a translation together (among other things). There's the 3x3 rotation block you found later as well as a translation vector embedded in it. Multiplying a sequence of 4x4 matrices together accumulates the rotations and translations appropriately as if they were just a bunch of function applications. i.e. rotate(translate(point)) is just rotation_matrix * translation_matrix * point_vector if you construct your matrices properly. Multiplying a 4x4 matrix with another 4x4 matrix yields a 4x4 matrix result, which means that you can store an arbitrary chain of rotations and translations accumulated together into a single matrix...
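A small numpy sketch of that accumulation (example values are mine): a 4x4 translation and a 4x4 rotation multiplied into one matrix and applied to a homogeneous point.

    import numpy as np

    def translation(tx, ty, tz):
        T = np.eye(4)
        T[:3, 3] = [tx, ty, tz]
        return T

    def rotation_z(theta):            # rotation about the z axis, embedded in a 4x4 matrix
        R = np.eye(4)
        R[0, 0], R[0, 1] = np.cos(theta), -np.sin(theta)
        R[1, 0], R[1, 1] = np.sin(theta),  np.cos(theta)
        return R

    p = np.array([1.0, 0.0, 0.0, 1.0])                 # the point (1, 0, 0) in homogeneous coordinates
    M = rotation_z(np.pi / 2) @ translation(2, 0, 0)   # one matrix: translate first, then rotate
    # rotate(translate(p)): (1,0,0) -> (3,0,0) -> (0,3,0)
    assert np.allclose(M @ p, [0.0, 3.0, 0.0, 1.0])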
Yeah you need to build up the understanding so that you can re-derive those matrices as needed (it's mostly just basic trigonometry). If you can't, that means a failure of your lecturer or a failure in your studying.
The mathematical term for the four by four matrices you were looking at is "quaternion" (i.e. you were looking at a set of four by four matrices isomorphic to the unit quaternions).
Why use quaternions at all, when three by three matrices can also represent rotations? Three by three matrices contain lots of redundant information beyond rotation, and multiplying quaternions requires fewer scalar additions and multiplications than multiplying three by three matrices. So it is cheaper to compose rotations. It also avoids singularities (gimbal lock).
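A self-contained numpy sketch of composing rotations with quaternions (the helper names and the (w, x, y, z) convention are my choices, not from the comment), checked against composing the corresponding 3x3 matrices:

    import numpy as np

    def quat_from_axis_angle(axis, theta):        # q = (w, x, y, z)
        axis = np.asarray(axis, dtype=float)
        axis /= np.linalg.norm(axis)
        return np.concatenate([[np.cos(theta / 2)], np.sin(theta / 2) * axis])

    def quat_mul(q1, q2):                         # Hamilton product
        w1, x1, y1, z1 = q1
        w2, x2, y2, z2 = q2
        return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                         w1*x2 + x1*w2 + y1*z2 - z1*y2,
                         w1*y2 - x1*z2 + y1*w2 + z1*x2,
                         w1*z2 + x1*y2 - y1*x2 + z1*w2])

    def quat_to_matrix(q):                        # unit quaternion -> 3x3 rotation matrix
        w, x, y, z = q
        return np.array([[1 - 2*(y*y + z*z), 2*(x*y - z*w),     2*(x*z + y*w)],
                         [2*(x*y + z*w),     1 - 2*(x*x + z*z), 2*(y*z - x*w)],
                         [2*(x*z - y*w),     2*(y*z + x*w),     1 - 2*(x*x + y*y)]])

    qa = quat_from_axis_angle([0, 0, 1], np.pi / 2)   # 90 degrees about z
    qb = quat_from_axis_angle([1, 0, 0], np.pi / 3)   # 60 degrees about x
    # Composing rotations as quaternions matches composing them as 3x3 matrices.
    assert np.allclose(quat_to_matrix(quat_mul(qa, qb)),
                       quat_to_matrix(qa) @ quat_to_matrix(qb))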
This was part of Steve Baker’s (“Omniverous Hexapod”, sic) extensions to a long-standing Usenet FAQ about graphics programming, put out by “Carniverous Hexapod” (sic). It’s at least two decades old, and the FAQ he built on may be from the 1990s? I have the niggling recollection that the Carniverous name may have been based on Vernor Vinge’s _Fire upon the deep_ aliens.
He did not invent it, but he probably had to deal with aspiring graphics programmers who were not very math-savvy.
Honestly, many math teachers are kinda bad at conveying all that.
When everything clicked a few years down the line it all became so simple.
Like you mention "linear operation": the word linear doesn't always make intuitive sense in terms of rotations or scaling if you have only encountered simple 1- or 2-dimensional linear transformations when doing more basic graphics programming.
As a teacher, I think the biggest lesson I had to learn was to always have at least 3 different ways of explaining everything to give different kinds of people different entrypoints into understanding concepts.
For someone uninitiated, a term like "basis vector" can be pure gibberish if it doesn't follow an example such as a transform as a viewport change, and it needs to be repeated after your other explanations (for example, how the vector components in the source view are just scalars on the basis vectors when multiplied with a matrix, rather than a heavy, unintuitive concept).
Math is just a standardized way to communicate those concepts, though; it's a model of the world like any other. I get what you mean, but these intuitive or visualising approaches help many people with different thinking processes.
Just imagine that everyone has equal math ability, except that the model of math and the representations of mathematical concepts and notation are made more for a certain type of brain than others. These kinds of explanations allow bringing those people in as well.
For my fellow visual thinkers who might be looking for a linear algebra book that focuses more on developing the geometric intuition for stuff like this, rather than just pure numeric linear system solving, let me recommend:
"Practical Linear Algebra: A Geometry Toolbox" by Farin and Hansford.
There are a lot more ways to look at and understand these mysterious beasts called matrices. They seem to represent a more fundamental primordial truth. I'm not sure what it is. The determinant of a matrix indicates the area or volume spanned by its component vectors. Complex matrices used in the Fourier transform are beautiful. Quantum mechanics and AI seem to be built on matrices. There is hardly any area of mathematics that doesn't utilize matrices as tools. What exactly is a matrix? Just a grid of numbers? I don't think so.
The fundamental truth is that matrices represent linear transformations, and all of linear algebra is developed in terms of linear transformations rather than just grid of numbers. It all becomes much clearer when you let go of the tabular representation and study the original intentions that motivated the operations you do on matrices.
My appreciation for the subject grew considerably after working through the book "Linear Algebra done right" by Axler https://linear.axler.net
Spatial transformations? Take a look at the complex matrices in Fourier transforms, with nth roots of unity as their elements. The values are cyclic, and do not represent points in an n-D space of Euclidean coordinates.
Yes; I wrote linear transformation on purpose not to remain constrained on spatial or geometric interpretations.
The (discrete) Fourier transform is also a linear transformation, which is why the initial effort of thinking abstractly in terms of vector spaces and transformations between them pays lots of dividends when it's time to understand more advanced topics such as the DFT, which is "just" a change of basis.
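To make "just a change of basis" tangible, a short numpy sketch (mine): build the DFT matrix explicitly and check it against np.fft.fft.

    import numpy as np

    N = 8
    n = np.arange(N)
    # DFT matrix: entry (j, k) is the k-th root-of-unity basis function sampled at j.
    W = np.exp(-2j * np.pi * np.outer(n, n) / N)

    x = np.random.default_rng(0).standard_normal(N)
    # The DFT of x is just this linear map applied to x.
    assert np.allclose(W @ x, np.fft.fft(x))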
>[Matrices] seem to represent a more fundamental primordial truth.
No, matrices (or more specifically matrix multiplication) are a useful result picked out of a huge search space defined as "all the ways to combine piles of numbers with arithmetic operators". The utility of the discovery is determined by humans looking for compact ways to represent ideas (abstraction). One of the most interesting anecdotes in the history of linear algebra was how Hamilton finally "discovered" a way to multiply them. "...he was out walking along the Royal Canal in Dublin with his wife when the solution in the form of the equation i^2 = j^2 = k^2 = ijk = −1 occurred to him; Hamilton then carved this equation using his penknife into the side of the nearby Broom Bridge" [0]
The "primordial truth" is found in the selection criteria of the human minds performing the search.
0 - https://en.wikipedia.org/wiki/William_Rowan_Hamilton
A matrix is just a grid of numbers.
A lot of areas use grids of numbers. And matrix theory actually incorporates every area that uses grids of numbers, and every rule in those areas.
Matrix multiplication, the simplest difficult thing in matrix theory, is a good example of this IMO. It looks really weird in the context of grids of numbers, its properties seem incidental, and the proofs are complicated. But matrix multiplication is really simple and natural in the context of linear transformations between vector spaces.
This is the most important part.
"...linear transformations between vector spaces."
When you understand what that implies you can start reasoning about it visually.
The 3 simplest transformations (that you can find in Blender or any other 3D program, or even partly in 2D programs) are:
Translation (moving something left,right,up,down,in,out).
Rotation (turn something 2 degrees, 90 degrees, 180 degrees, 360 degrees back to the same heading)
Scaling (make something larger, smaller, etc)
(And a few more that don't help right now.)
The first 2 can be visualized simply in 2D: just take a paper/book/etc. Move it left-right, up-down, rotate it... the book in the original position and rotation, compared to the new position and rotation, can be described as a vector space transformation. Why?
Because you can look at it in 2 ways, either the book moved from your vantage point, or you follow the book looking at it the same way and the world around the book moved.
In both cases, something moved from one space (point of reference) to another "space".
The thing that defines the space is a "basis vector": basically it says what is "up", what is "left", and what is "in" in the way we move from one space to another.
Think of it as having a piece of card on a paper. Draw a line/axis along the bottom edge to count X steps, then draw one up the left side to count Y steps. In the X,Y space (the "from" space) you count the X and Y steps of various feature points.
Now draw the "to space" as another X axis and another Y axis (could be rotated, could be scaled, could just be moved) and take the counts in steps and put them inside the "to space" measured in equal units as they were in the from space.
Once the feature points are replicated in the "to space" you should have the same image as before, just within the new space.
This is the essence of a so-called linear (equal number of steps) transform (moved somewhere else), and it is also exactly what multiplying a set of vectors by a matrix achieves (simplified; in this context, the matrix is really mostly a representation of a number of the above-mentioned basis vectors that define the X, Y, etc. of the movement).
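A tiny numpy sketch of the counting-steps-along-basis-vectors idea (the frame here is my own example): the matrix's columns are the new frame's basis vectors, and multiplying by the step counts reproduces the point in the original coordinates.

    import numpy as np

    # Basis vectors of the "to space": rotated 30 degrees and scaled by 2.
    theta = np.radians(30)
    u = 2 * np.array([np.cos(theta), np.sin(theta)])   # the new "right"
    v = 2 * np.array([-np.sin(theta), np.cos(theta)])  # the new "up"
    M = np.column_stack([u, v])                        # matrix = basis vectors as columns

    steps = np.array([3.0, 1.0])       # 3 steps along u, 1 step along v
    point = M @ steps                  # the same point, expressed in the original x,y coordinates
    assert np.allclose(point, 3 * u + 1 * v)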
The set of all matrices of a fixed size is a vector space, because matrix addition and scalar multiplication are well-defined and follow all the vector space axioms.
But be careful of the map–territory relation.
If you can find a model that is a vector space, which you can extend to an inner product space and extend that to a Hilbert space, nice things happen.
Really the amazing part is finding a map (model) that works within the superpowers of algorithms, which often depends upon finding many to one reductions.
Get stuck with a hay in the haystack problem and math as we know it now can be intractable.
Vector spaces are nice and you can map them to abstract algebra, categories, or topos and see why.
I encourage you to dig into the above.
Take linear algebra
OK, now what?
A matrix is just a list of where a linear map sends each basis element (the nth column of a matrix is the output vector for the nth input basis vector). Lots of things are linear (e.g. scaling, rotating, differentiating, integrating, projecting, and any weighted sums of these things). Lots of other things are approximately linear locally (the derivative if it exists is the best linear approximation. i.e. the best matrix to approximate a more general function), and e.g. knowing the linear behavior near a fixed point can tell you a lot about even nonlinear systems.
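For instance (my own small example): differentiation of polynomials of degree at most 3 is linear, and its matrix in the basis {1, x, x^2, x^3} is just the list of where each basis element goes.

    import numpy as np

    # Column i is d/dx applied to the i-th basis element: 1 -> 0, x -> 1, x^2 -> 2x, x^3 -> 3x^2.
    D = np.array([[0, 1, 0, 0],
                  [0, 0, 2, 0],
                  [0, 0, 0, 3],
                  [0, 0, 0, 0]], dtype=float)

    p = np.array([5.0, 4.0, 3.0, 2.0])   # coefficients of 5 + 4x + 3x^2 + 2x^3
    assert np.allclose(D @ p, [4.0, 6.0, 6.0, 0.0])   # derivative: 4 + 6x + 6x^2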
Yes, I think of them as saying "and this is what the coordinates in our coordinate system [basis] shall mean from now on". Systems of nonlinear equations, on the other hand, are some kind of sea monsters.
Part of a fairly old OpenGL tutorial, about 2002.
The age doesn't affect this matrix part, but just FYI that any specific APIs discussed will probably be out of date compared to modern GPU programming.
With the advent of things like r/MyBoyfriendIsAI I was expecting a substantially different article than the one I clicked into.
> What stops most novice graphics programmers from getting friendly with matrices is that they look like 16 utterly random numbers.
Wait until they see how physicists and chemists treat matrices! They will pine for the day when a transformation can be described by 16 numbers in a table.
yellow text on green background... my eyes!
Was thinking that background colors aren't the author's friend. Sadly, great math articles are written on visually horrific web press.
FireFox Reader Mode can also be your friend... ; - )
Yea this is horrendous. Not reading this