Introduction to Modern Algebra
Algebra is all about the study of sets imbued with additional operator structure. Generally, one can view the different fields of algebra as revolving around the following process:
- We posit the existence of some underlying set $S$ for our algebra.
- We introduce an algebraic operation to $S$, for example, a binary multiplication operator $\cdot : S \times S \to S$.
- We then pose additional restrictions on the operator $\cdot$; for example, $\cdot$ might be associative, commutative, etc.
- We then use these axioms to study the unified behavior of this class of objects.
Of course, the more restrictions we pose on the operator $\cdot$, the more interesting the resulting algebra will be. The crux of algebra is the delicate balance between defining a class of axioms that are general enough to capture a large swath of mathematical objects one might care about, but strict enough to enable the mathematician to prove things about the structure of the object $S$. It can be easy to fall into the former pitfall. For example, consider one of the simplest algebraic structures there is --- the magma.
Definition: A magma is a set $M$ with the following additional structure:
- A binary multiplication operator $\cdot : M \times M \to M$.
A very broad class of objects, indeed, but from the perspective of identifying interesting structure and common patterns, quite useless. There is very little you can tell me about a magma without knowing more about the object's structure.
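To make this concrete, here is a minimal Python sketch (my own illustration, not from any library): rock-paper-scissors, where $x \cdot y$ is the winner of $x$ versus $y$, is closed and hence a magma, but it is not associative, so it is far from being a group.

```python
# A finite magma: a set together with a closed binary operation,
# encoded here as "rock-paper-scissors," where x * y is the winner
# of x versus y.
ELEMENTS = ["rock", "paper", "scissors"]
BEATS = {("rock", "scissors"), ("paper", "rock"), ("scissors", "paper")}

def op(x, y):
    """Return whichever of x, y wins (ties go to x)."""
    return x if (x, y) in BEATS or x == y else y

# Closure: op always lands back in ELEMENTS, so this is a magma.
assert all(op(x, y) in ELEMENTS for x in ELEMENTS for y in ELEMENTS)

# Associativity fails: (paper * rock) * scissors != paper * (rock * scissors).
print(op(op("paper", "rock"), "scissors"))  # scissors
print(op("paper", op("rock", "scissors")))  # paper
```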
Now, luckily, we have a good two hundred years of development of the subject of modern algebra. A lot of cool math dudes have already figured out a zoo of object classes that actually do have interesting structure. Probably one of the most studied is the group.
Groups
The definition of a group is slightly longer than the definition of a magma.
Definition: A group is a set $G$ with the following additional structure:
- A binary operator $\cdot : G \times G \to G$ satisfying:
  - Associativity: for $a, b, c \in G$, one has $(a \cdot b) \cdot c = a \cdot (b \cdot c)$.
  - Identity Element: there exists an element $e \in G$ such that for every $a \in G$, $e \cdot a = a \cdot e = a$.
  - Inverses: every element $a \in G$ has an inverse $a^{-1} \in G$ such that $a \cdot a^{-1} = a^{-1} \cdot a = e$.
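Since the axioms are just universally quantified statements, we can brute-force check them on a small finite example. Here is a quick Python sketch (my own, with my own names) verifying that the integers mod 6 under addition form a group:

```python
# A brute-force check of the group axioms for the integers mod 6
# under addition.
G = range(6)
op = lambda a, b: (a + b) % 6
e = 0  # candidate identity element

# Associativity: (a . b) . c == a . (b . c) for all triples.
assert all(op(op(a, b), c) == op(a, op(b, c))
           for a in G for b in G for c in G)

# Identity: e . a == a . e == a for every a.
assert all(op(e, a) == a == op(a, e) for a in G)

# Inverses: every a has some b with a . b == b . a == e.
assert all(any(op(a, b) == e == op(b, a) for b in G) for a in G)

print("(Z_6, +) satisfies all three group axioms")
```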
The most common example of a group I can give you is the nonzero real numbers $\mathbb{R}^\times$ under multiplication (here $\mathbb{R}^\times$ denotes $\mathbb{R}$ with $0$ removed). Additionally, the real numbers $\mathbb{R}$ are also a group under addition. But there are definitely some sets that are not groups: the nonzero integers under multiplication fail to be a group because inverses do not exist in the set (for example, $2$ has no integer multiplicative inverse). However, if we go with addition instead of multiplication as our operation, then $\mathbb{Z}$ is indeed a group. Note, however, that in both of these examples, the binary operator (addition or multiplication over the reals) is commutative, meaning that for all $a, b$,
$$a \cdot b = b \cdot a.$$
Groups with this property are called abelian. And although the examples I've given so far are abelian, there are definitely also many groups that are non-abelian, so don't go around assuming that all groups are abelian unless it's part of your assumptions! That's a classic beginner mistake.
Definition: A group is called abelian if its operation is commutative. When this is the case, one often uses $+$ to denote the operation, to communicate that it is commutative.
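To back up the claim that non-abelian groups abound, here is a quick Python illustration (my own example): permutations of $\{0, 1, 2\}$ under composition form a group, the symmetric group $S_3$, and the order of composition matters.

```python
# Permutations of {0, 1, 2} under composition form a group (S_3),
# and composition is not commutative.
def compose(p, q):
    """Composition of permutations given as tuples: (p o q)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))

swap01 = (1, 0, 2)   # swaps 0 and 1
swap12 = (0, 2, 1)   # swaps 1 and 2

print(compose(swap01, swap12))  # (1, 2, 0)
print(compose(swap12, swap01))  # (2, 0, 1) -- a different permutation
```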
Groups, by definition, involve only a single binary operation; however, it is possible to define algebras with more than one operation. Just like with one operation, there is a zoo of different algebras one can study, but the important one for linear algebra is the field.
Fields
A field is like a generalization of our conception of the real numbers $\mathbb{R}$. A field has two fundamental operations, an addition operation and a multiplication operation, and both are commutative. This means that we can treat elements of fields more or less like the real numbers when we're manipulating them. Division and subtraction are well-defined in fields as the corresponding inverses of multiplication and addition, so a lot of our intuition maps directly from real numbers onto fields. The formal definition reads:
Definition: A field is a set $F$ with the following additional structure:
- $(F, +)$ is an abelian group with the identity $0$.
- $(F \setminus \{0\}, \cdot)$ is an abelian group with the identity $1$.
- The operations $+$ and $\cdot$ satisfy the distributive property: for any $a, b, c \in F$,
$$a \cdot (b + c) = a \cdot b + a \cdot c.$$
Of course, there are more complicated fields out there. For example, the complex numbers $\mathbb{C}$ are also a field, and for any prime number $p$, the integers modulo $p$ ($\mathbb{Z}_p$) are also a field. The results of linear algebra generally apply to all of these fields.
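As a sanity check on that last claim, here is a small Python sketch (my own, not from the text) that finds a multiplicative inverse for every nonzero element of $\mathbb{Z}_5$; with a composite modulus like $6$, the search fails for some elements (try $a = 2$).

```python
# Checking that the integers mod p form a field when p is prime:
# every nonzero element has a multiplicative inverse.
p = 5
for a in range(1, p):
    inverse = next(b for b in range(1, p) if (a * b) % p == 1)
    print(f"{a}^(-1) = {inverse} (mod {p})")
```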
Now, while fields are an incredibly rich subject and definitely worthy of further study, this tutorial is primarily about linear algebra. While linear algebra often involves groups (matrix groups) and fields, and group theory sometimes involves linear algebra (representation theory), linear algebra generally revolves around a different fundamental algebraic structure called the vector space.
Vector Spaces
Any discussion about linear algebra necessitates talking about vector spaces. The definition of vector spaces is more involved than that of groups because instead of supporting just one binary operation, linear algebra involves two fundamental binary operations. Consider a set of vectors $V$:
- We want to allow the addition of vectors. This means for two vectors $v, w \in V$, we require a notion of what $v + w$ is.
- We allow the scaling of vectors. If $\alpha$ is a length (i.e., a scalar) and $v$ is a vector, we require a notion of what it means to scale $v$ by the length $\alpha$, written $\alpha \cdot v$.
Therefore, we not only need a set of vectors, we also need to define a set of scalars. And we need to define the corresponding operations. This definition is therefore more involved than that of a group:
Definition: A vector space $V$ over a scalar field $F$ (sometimes written $V/F$) is a set with the following additional structure:
- $V$ is an abelian group under a vector addition operator $+$.
- $F$ is a field.
- A scalar multiplication operator $\cdot : F \times V \to V$ satisfying $1 \cdot v = v$ for every $v \in V$, where $1$ is the multiplicative identity of $F$.
- Vector addition and scalar multiplication must satisfy the distributive property:
$$\alpha \cdot (v + w) = \alpha \cdot v + \alpha \cdot w, \qquad (\alpha + \beta) \cdot v = \alpha \cdot v + \beta \cdot v,$$
where $\alpha, \beta \in F$ and $v, w \in V$.
- Scalar multiplication must be compatible with the multiplication operation in $F$:
$$\alpha \cdot (\beta \cdot v) = (\alpha \beta) \cdot v,$$
where $\alpha, \beta \in F$ and $v \in V$. This means it shouldn't matter whether we scale a vector by a length $\beta$ and then by $\alpha$, or just scale once by the length $\alpha \beta$.
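To see the axioms in action, here is a quick Python spot-check (my own sketch) using triples of floats with componentwise operations, which previews the first example below:

```python
# A spot-check of the vector space axioms on triples of floats,
# i.e., R^3 with componentwise operations.
def add(v, w):
    """Componentwise vector addition."""
    return tuple(vi + wi for vi, wi in zip(v, w))

def scale(alpha, v):
    """Scalar multiplication: stretch every component by alpha."""
    return tuple(alpha * vi for vi in v)

v, w = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)
alpha, beta = 2.0, 3.0

# Distributivity over vector addition and over scalar addition.
assert scale(alpha, add(v, w)) == add(scale(alpha, v), scale(alpha, w))
assert scale(alpha + beta, v) == add(scale(alpha, v), scale(beta, v))

# Compatibility with multiplication in the field.
assert scale(alpha, scale(beta, v)) == scale(alpha * beta, v)

# The multiplicative identity of the field acts trivially.
assert scale(1.0, v) == v
```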
Now, I hear you ask, "what is the need for all this abstraction? Can't we just think about vectors as arrays of real numbers?" And the answer is: kinda. While it turns out that we can always run computations by writing out vectors using arrays of scalars, it isn't always clear how exactly to convert an abstract vector into an array, as there can be infinitely many ways to view the same vector. The abstract formalism here allows us to study linear algebra without having to delve into how the vectors are represented, and it can be a very powerful advantage to think about vectors without the baggage of arrays and matrices. Then again, at the end of the day, we will eventually want to run computations, which is when matrices and arrays will become necessary. But before that, we can spend some time thinking about vectors in the abstract. Let's first look at some examples of vector spaces.
The Canonical Real Vector Space
The first example of a vector space is the set of arrays of real numbers of length $n$,
$$\mathbb{R}^n = \{(x_1, x_2, \ldots, x_n) : x_i \in \mathbb{R}\}.$$
This is one you've likely studied extensively in a previous linear algebra class. The underlying field is simply the real numbers $\mathbb{R}$. Scalar multiplication is defined as one might expect,
$$\alpha \cdot (x_1, x_2, \ldots, x_n) = (\alpha x_1, \alpha x_2, \ldots, \alpha x_n).$$
There is generally nothing surprising here. Let's get a little more exotic.
The Complex Numbers $\mathbb{C}$ over $\mathbb{R}$
Note that one can view the complex numbers $\mathbb{C}$ as vectors over the set of real scalars $\mathbb{R}$. This is kinda weird at first, but scalar multiplication is easily defined:
$$\alpha \cdot (a + bi) = \alpha a + (\alpha b) i.$$
Note that this vector space is basically equivalent to $\mathbb{R}^2$ over $\mathbb{R}$, since we can identify $1$ with $(1, 0)$ and $i$ with $(0, 1)$ and perform calculations the same way in both spaces.
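Here is a tiny Python sketch (my own) of that identification: a real scalar acts on $a + bi$ exactly as it acts on the pair $(a, b)$.

```python
# The identification of C over R with R^2 over R.
z = complex(3, 4)   # the "vector" 3 + 4i
alpha = 2.0         # a real scalar

print(alpha * z)                         # (6+8j)
print((alpha * z.real, alpha * z.imag))  # (6.0, 8.0), i.e. alpha * (3, 4)
```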
Real Polynomials $\mathbb{R}[x]$ over $\mathbb{R}$
Note that the set $\mathbb{R}[x]$ of real polynomials in the variable $x$ is a vector space over $\mathbb{R}$: polynomials can be added together to give other polynomials. Moreover, polynomials can be multiplied by scalars,
$$\alpha \cdot (a_0 + a_1 x + \cdots + a_n x^n) = \alpha a_0 + \alpha a_1 x + \cdots + \alpha a_n x^n.$$
Note that there's not an easy equivalence between this set and any vector space of the form $\mathbb{R}^n$. That's because this is an infinite dimensional space.
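One way to play with this in code (a sketch of my own, with my own names) is to store a polynomial as its list of coefficients; addition and scalar multiplication land back in the set, but no single length $n$ works for all polynomials, which hints at the infinite dimensionality.

```python
# Polynomials as coefficient lists: index i holds the coefficient of x^i.
from itertools import zip_longest

def poly_add(p, q):
    """Add coefficient lists, padding the shorter one with zeros."""
    return [a + b for a, b in zip_longest(p, q, fillvalue=0.0)]

def poly_scale(alpha, p):
    """Multiply every coefficient by the scalar alpha."""
    return [alpha * a for a in p]

p = [1.0, 0.0, 2.0]        # 1 + 2x^2
q = [0.0, 3.0]             # 3x
print(poly_add(p, q))      # [1.0, 3.0, 2.0], i.e. 1 + 3x + 2x^2
print(poly_scale(2.0, p))  # [2.0, 0.0, 4.0], i.e. 2 + 4x^2
```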
Square-Integrable Functions over $\mathbb{R}$
The final example is the set $L^2(\mathbb{R})$ of square integrable functions over $\mathbb{R}$. These are defined as the functions $f : \mathbb{R} \to \mathbb{R}$ such that
$$\int_{\mathbb{R}} |f(x)|^2 \, dx < \infty,$$
where the integral on the left is the Lebesgue integral with respect to the Borel measure on $\mathbb{R}$. If you don't understand what that means, that's fine; it's not particularly important right now. Again, note that there is not an equivalence to any finite dimensional vector space $\mathbb{R}^n$.
Morphisms and Linear Functions
The second part of algebra, after one has defined a class of objects to study, is to examine relationships between objects. More precisely, functions between objects. However, we're not interested in just all functions; we're specifically interested in functions that preserve some structure of the underlying object they act on. For example, suppose we have a function between two groups $G$ and $H$, denoted
$$f : G \to H;$$
we say that $f$ is a morphism (or homomorphism) if the application of $f$ is compatible with the underlying operations of the groups. Formally,
$$f(a \cdot b) = f(a) \cdot f(b) \quad \text{for all } a, b \in G.$$
In this sense, $f$ preserves the multiplicative structure of $G$; it embeds the structure of $G$ into $H$. However, that is not to say that $f$ preserves all of the multiplicative structure. There are always trivial morphisms, for example, the constant function
$$f(a) = e_H \quad \text{for all } a \in G,$$
where $e_H$ denotes the identity element of $H$. This is clearly a morphism, because
$$f(a \cdot b) = e_H = e_H \cdot e_H = f(a) \cdot f(b).$$
But as a morphism, it is not very interesting.
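For a less trivial example, here is a quick Python check (my own) that reduction mod 3 is a morphism from $(\mathbb{Z}_6, +)$ to $(\mathbb{Z}_3, +)$:

```python
# A non-trivial group morphism: reduction mod 3 maps (Z_6, +) onto
# (Z_3, +) and is compatible with the two addition operations.
f = lambda a: a % 3

assert all(f((a + b) % 6) == (f(a) + f(b)) % 3
           for a in range(6) for b in range(6))
print("a -> a mod 3 is a morphism from Z_6 to Z_3")
```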
Similarly, one can define morphisms for two fields $F$ and $K$. A function $f : F \to K$ is a morphism if it preserves both the additive and multiplicative structure of $F$, namely,
$$f(a + b) = f(a) + f(b) \quad \text{and} \quad f(a \cdot b) = f(a) \cdot f(b).$$
Finally, we come to morphisms for vector spaces. Morphisms for vector spaces preserve both vector addition and scalar multiplication. These morphisms are usually called linear functions or linear transformations. Let's give a full definition for this object because it is crucially important to the study of linear algebra:
Definition: A function $f : V \to W$ is a linear function between two vector spaces $V$ and $W$ over $F$ if it is compatible with vector addition and scalar multiplication. That is, for any $v, w \in V$ and $\alpha \in F$,
$$f(v + w) = f(v) + f(w) \quad \text{and} \quad f(\alpha \cdot v) = \alpha \cdot f(v).$$
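Here is a Python spot-check of the two conditions on a concrete map (the particular $f(x, y) = (2x + y, 3y)$ is my own example, not from the text):

```python
# Checking that a particular map f : R^2 -> R^2 is linear.
def f(v):
    x, y = v
    return (2 * x + y, 3 * y)

def add(v, w):
    return tuple(vi + wi for vi, wi in zip(v, w))

def scale(alpha, v):
    return tuple(alpha * vi for vi in v)

v, w, alpha = (1.0, 2.0), (3.0, 4.0), 5.0
assert f(add(v, w)) == add(f(v), f(w))           # compatible with +
assert f(scale(alpha, v)) == scale(alpha, f(v))  # compatible with scaling
```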
In a somewhat meta turn of events, note that the set of linear functions between two vector spaces $V$ and $W$ is itself a vector space. This is because for any linear functions $f, g : V \to W$, we can define the sum
$$(f + g)(v) = f(v) + g(v),$$
as well as the scalar product
$$(\alpha \cdot f)(v) = \alpha \cdot f(v).$$
Additive inverses are defined via
$$(-f)(v) = -f(v).$$
And you can check that the trivial linear function that sends everything to $0$ is the additive identity.
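These pointwise definitions translate directly into code. A short Python sketch (my own example maps, my own names):

```python
# The vector space structure on linear functions: sums and scalar
# multiples are defined pointwise; the zero map is the additive identity.
def f(v): return (2 * v[0], 2 * v[1])  # doubling map
def g(v): return (v[1], v[0])          # swap coordinates
def zero(v): return (0.0, 0.0)         # the trivial linear map

def map_add(f, g):
    """(f + g)(v) = f(v) + g(v), computed componentwise."""
    return lambda v: tuple(a + b for a, b in zip(f(v), g(v)))

def map_scale(alpha, f):
    """(alpha . f)(v) = alpha . f(v)."""
    return lambda v: tuple(alpha * a for a in f(v))

v = (1.0, 3.0)
print(map_add(f, g)(v))             # (5.0, 7.0) = f(v) + g(v)
print(map_scale(2.0, f)(v))         # (4.0, 12.0) = 2 . f(v)
print(map_add(f, zero)(v) == f(v))  # True: adding the zero map changes nothing
```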