Okay, let's do this.

I should mention that this is mostly based on the book "Foundations of Free Noncommutative Function Theory" by Kaliuzhnyi-Verbovetskyi and Vinnikov.

This is a very general text; I spend a lot of time translating concepts into something more familiar. https://twitter.com/GauntlettConnor/status/1355956634074267651
What I want to do today is tell you a bit about "nc functions", which are functions of square matrices of any size. I want to build up an analogue of calculus for these functions; the only prerequisites are a little linear algebra and the definition of differentiability.
As I mentioned I might, I'm pulling bits of this from a talk I wrote in December, which was aimed at other Master's students; I'll try to define most new things as I go. If something's unclear feel free to ask (no guarantees I can answer; I'm in no way an expert!).
First, let's talk about the Fréchet derivative. This isn't really related but it gave me some good motivation for later.

This construction works in any Banach space, but I'm interested in matrices so we'll consider the specific Banach space ℂ^{n×n} (n by n matrices over ℂ).
There's a norm (idea of "size" or "length") on ℂ^{n×n} called the matrix norm. I'm not gonna get into it, just trust me - there is one, it works how you might expect length to work, everyone likes this norm.

I'll write ||X|| for the norm of a matrix X.
Let g : ℂ^{n×n} --> ℂ^{n×n}. We say that g is "Fréchet differentiable" at X if there exists a linear map L(X) : ℂ^{n×n} --> ℂ^{n×n} such that

g(X + Z) - g(X) - L(X)(Z) = o(||Z||) as ||Z|| --> 0.

Think about differentiability on ℝ^n: L is just like the Jacobian, but for a matrix "direction" Z rather than a vector h.
We call L the Fréchet derivative of g, if it exists, just like the Jacobian. This gives us an idea of "derivative" for functions on square matrices of (importantly) *fixed* size n×n. This is a whole area of theory itself - for instance, see Higham's "Functions of Matrices".
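If you want to see this numerically, here's a minimal sketch (assuming numpy; for g(X) = X² the derivative is L(X)(Z) = XZ + ZX, and the leftover term is Z², which is o(||Z||)):

```python
# A minimal numerical sketch of the Fréchet derivative, assuming numpy.
# For g(X) = X^2 we have g(X+Z) - g(X) = XZ + ZX + Z^2, so L(X)(Z) = XZ + ZX
# and the remainder Z^2 is o(||Z||).
import numpy as np

rng = np.random.default_rng(0)
n = 4
X = rng.standard_normal((n, n))
Z = rng.standard_normal((n, n))

def g(A):
    return A @ A

for t in [1e-1, 1e-2, 1e-3]:
    Zt = t * Z
    remainder = g(X + Zt) - g(X) - (X @ Zt + Zt @ X)
    # ||remainder|| / ||Zt|| should tend to 0 as t -> 0
    print(t, np.linalg.norm(remainder) / np.linalg.norm(Zt))
```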
What's cool here is that a lot of "standard calculus" rules apply, like versions of the chain and product rules. What I *really* wanted to tell you though, was that if X and Z are n×n and f is "nice" (think of a polynomial or power series, something we can apply to matrices of any size), then we get this equation:

f( [ X  Z ] ) = [ f(X)  L(X)(Z) ]
   [ 0  X ]     [  0     f(X)   ]

where L(X) is the Fréchet derivative of f at X. This looks weird and a little surprising, but bear with me here.
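Here's a quick numerical sanity check of that identity, again with f(A) = A² (so L(X)(Z) = XZ + ZX):

```python
# Checking the block identity for f(A) = A^2 (a sketch with numpy):
# the top-right block of f([[X, Z], [0, X]]) is exactly L(X)(Z) = XZ + ZX.
import numpy as np

rng = np.random.default_rng(1)
n = 3
X = rng.standard_normal((n, n))
Z = rng.standard_normal((n, n))

T = np.block([[X, Z], [np.zeros((n, n)), X]])
F = T @ T                                     # f applied to the 2n x 2n block matrix
print(np.allclose(F[:n, n:], X @ Z + Z @ X))  # True
```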
Now I want to define a couple of things. Everything up to now has been an aside, but these ideas are the beginnings of actual noncommutative (herein "nc" for short) function theory - at least for matrices, which is the setting I'm working in.
If X and Y are square matrices (of any size, and they don't have to be the same size), then their "direct sum" is the block-diagonal square matrix

X ⊕ Y = [ X  0 ]
        [ 0  Y ]

and we use the symbol ⊕ for direct summing matrices.
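In code, this is just a block-diagonal matrix; a tiny numpy sketch:

```python
# Direct sums in code (a numpy sketch): X ⊕ Y is block-diagonal,
# with X in the top-left and Y in the bottom-right.
import numpy as np

X = np.array([[1.0, 2.0], [3.0, 4.0]])   # 2x2
Y = np.array([[5.0]])                    # 1x1 -- sizes need not match
XY = np.block([[X, np.zeros((2, 1))],
               [np.zeros((1, 2)), Y]])   # the 3x3 matrix X ⊕ Y
print(XY)
```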
The "nc space" over ℂ is the disjoint union of the sets of n×n matrices for each n. Convention also includes a "0×0" matrix.

An "nc set" Ω is a subset of nc space that is closed under direct sums (i.e. taking the direct sum of two matrices in Ω gives you another matrix in Ω).
Now, this is nc function theory, so I had best tell you what an nc function is.

We say f : Ω --> ℂ_nc is an "nc function" if f preserves size (it sends n×n matrices to n×n matrices), and respects direct sums and "similarities" of matrices. That is,

- f(X ⊕ Y) = f(X) ⊕ f(Y) for X, Y in Ω;
- f(SXS^{-1}) = S f(X) S^{-1} for invertible S, whenever X and SXS^{-1} are both in Ω.
You might ask, why do we not ask nc sets to be closed under similarities, if we ask nc functions to respect them? The answer is basically "we don't need to"; this is dealt with in the appendix to the book.

This turns out to be the "correct" definition for the structure of ℂ_nc.
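A small numpy sketch checking both axioms for the simplest nontrivial example, f(X) = X²:

```python
# Checking the two nc-function axioms for f(X) = X^2 (a numpy sketch):
# f respects direct sums, and f(S X S^{-1}) = S f(X) S^{-1}.
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((2, 2))
Y = rng.standard_normal((3, 3))
S = rng.standard_normal((2, 2))          # invertible with probability 1

def f(A):
    return A @ A

def dsum(A, B):
    m, n = A.shape[0], B.shape[0]
    return np.block([[A, np.zeros((m, n))], [np.zeros((n, m)), B]])

print(np.allclose(f(dsum(X, Y)), dsum(f(X), f(Y))))   # direct sums
Sinv = np.linalg.inv(S)
print(np.allclose(f(S @ X @ Sinv), S @ f(X) @ Sinv))  # similarities
```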
Remember the equation from earlier I said we'd come back to? We're doing that now. The real thrust of the theory is to define an analogue of "derivative" for nc functions, and see what analogues of the classical theory come over with the right definition.
These are the "nc difference-differential operators" Δ_L and Δ_R (for Left and Right) - we only look at Δ_R, but the Δ_L theory is identical. The idea is to evaluate nc functions on block-triangular matrices, and look at the top-right block, like the Fréchet derivative equation.
One thing we change: for the Fréchet derivative, we had X in the top-left and bottom-right blocks. Now, we can be more free - in the bottom-right we put another matrix Y, which need not be the same size as X. The Z we choose will have to be of the right size to fill the gap.
Note that X and Y are still square, but Z will usually be rectangular. In block matrix notation, here's how I define the "right nc difference-differential operator" Δ_R of an nc function f:

f( [ X  Z ] ) = [ f(X)  Δ_Rf(X,Y)(Z) ]
   [ 0  Y ]     [  0        f(Y)     ]

Compare and contrast this with the Fréchet derivative equation, which is a classical construction.
Notice that now we have a dependence on both X and Y, but otherwise, this is very similar to what we got for L(X)(Z). In fact, the definition is only the top right block: it's a (reasonably quick) theorem that the other three blocks have this form.
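Again with f(A) = A², you can see the whole thing in a few lines of numpy: the top-right block works out to XZ + ZY, which is Δ_Rf(X,Y)(Z).

```python
# Reading off Δ_R f(X,Y)(Z) for f(A) = A^2 (a numpy sketch): evaluating f on
# the block matrix [[X, Z], [0, Y]] puts XZ + ZY in the top-right block.
import numpy as np

rng = np.random.default_rng(3)
p, q = 2, 3                              # X is p x p, Y is q x q, Z is p x q
X = rng.standard_normal((p, p))
Y = rng.standard_normal((q, q))
Z = rng.standard_normal((p, q))

T = np.block([[X, Z], [np.zeros((q, p)), Y]])
F = T @ T                                      # f applied blockwise
print(np.allclose(F[:p, p:], X @ Z + Z @ Y))   # Δ_R f(X,Y)(Z) = XZ + ZY
```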
This comparison was what really assured me that we had the right definition here. And just like for the Jacobian, just like the Fréchet derivative, this definition gives you calculus rules. Success! That's exactly what we wanted.
A quick list of some nice properties of Δ_R:

- Δ_R of a constant is 0;
- Linearity: Δ_R(af + bg) = aΔ_R(f) + bΔ_R(g) for constants a,b;
- Product rule:
Δ_R(fg)(X,Y)(Z) = f(X) Δ_Rg(X,Y)(Z) + Δ_Rf(X,Y)(Z) g(Y);
- Chain rule:
Δ_R(g∘f)(X,Y)(Z) = Δ_Rg( f(X), f(Y) )(Δ_Rf(X,Y)(Z)).
The last two might look a bit wild at first glance, but take a moment with them. The order of multiplication matters now, but the product rule is still roughly "fg' + f'g", and the chain rule is still roughly "g'(f)f'" (recall the "Z" entry is like the direction h). Awesome!
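Both rules are easy to check numerically: the block trick gives a generic way to compute Δ_R of any polynomial, so we can compare the two sides directly (a numpy sketch, with f(A) = A² and g(A) = A³):

```python
# Verifying the product and chain rules numerically (a numpy sketch), using
# the block trick: for a polynomial h, the top-right block of
# h([[X, Z], [0, Y]]) is Δ_R h(X,Y)(Z).
import numpy as np

rng = np.random.default_rng(4)
p, q = 2, 3
X = rng.standard_normal((p, p))
Y = rng.standard_normal((q, q))
Z = rng.standard_normal((p, q))

def delta_R(h, X, Y, Z):
    p, q = X.shape[0], Y.shape[0]
    T = np.block([[X, Z], [np.zeros((q, p)), Y]])
    return h(T)[:p, p:]

f = lambda A: A @ A              # f(A) = A^2
g = lambda A: A @ A @ A          # g(A) = A^3
fg = lambda A: f(A) @ g(A)       # (fg)(A) = A^5

# Product rule: Δ_R(fg)(X,Y)(Z) = f(X) Δ_Rg(X,Y)(Z) + Δ_Rf(X,Y)(Z) g(Y)
lhs = delta_R(fg, X, Y, Z)
rhs = f(X) @ delta_R(g, X, Y, Z) + delta_R(f, X, Y, Z) @ g(Y)
print(np.allclose(lhs, rhs))     # True

# Chain rule: Δ_R(g∘f)(X,Y)(Z) = Δ_Rg(f(X), f(Y))(Δ_Rf(X,Y)(Z))
gof = lambda A: g(f(A))          # (g∘f)(A) = A^6
lhs2 = delta_R(gof, X, Y, Z)
rhs2 = delta_R(g, f(X), f(Y), delta_R(f, X, Y, Z))
print(np.allclose(lhs2, rhs2))   # True
```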
If you want an example to think about, I would point you to polynomials: these are easy to work with, and it's not difficult to show they're nc functions on the whole nc space, nor is it too hard to compute Δ_R for low degrees.

In fact, they're sort of the "canonical" example.
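For instance, multiplying out the block matrix (or inducting on k) gives the monomial formula

Δ_R(X^k)(X,Y)(Z) = X^{k-1}Z + X^{k-2}ZY + ... + XZY^{k-2} + ZY^{k-1},

which is the nc cousin of the classical d(x^k) = kx^{k-1}dx; Δ_R of a general polynomial then follows by linearity.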
So now I've given you some motivation and some of the early definitions, and given you a pointer for some nice examples to work out these concepts. Hopefully this is enough for you to get a feel for the basics, maybe enough to think of some questions (feel free to ask if so!).
So what's next?

- Higher order derivatives: notice Δ_Rf picks up an extra matrix variable and an extra direction Z, and iterating Δ_R keeps this going.
- These give you power series and analyticity;
- You can define integrability with a "differential" equation g = Δ_Rf.

If you've made it this far, thank you!