I’ll review (extremely rapidly!) to fix notation.
We equip the vector space $C_c^\infty(\mathbb{R}^n)$ of smooth compactly supported real valued functions on $\mathbb{R}^n$ with
the following topology: a sequence $f_k$ converges to $f$ if the $f_k$ and all their derivatives converge
to $f$ and all of its derivatives with respect to the sup norm $\|\cdot\|_\infty$. The
topological dual of $C_c^\infty(\mathbb{R}^n)$ is the set $\mathcal{D}'(\mathbb{R}^n)$ of distributions on $\mathbb{R}^n$,
or continuous linear functionals $T\colon C_c^\infty(\mathbb{R}^n)\to\mathbb{R}$.
Every smooth function $g$ naturally gives rise to a distribution $T_g$ via integration, $T_g(f) = \int_{\mathbb{R}^n} g f\,dV$,
but not all distributions are of this form. Important examples are given by the delta distributions: for any $p\in\mathbb{R}^n$ we
define $\delta_p$ by $\delta_p(f) = f(p)$, and we write $\delta$ for $\delta_0$.
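Since a distribution is nothing more than a functional on test functions, these definitions transcribe almost verbatim into code. Here is a minimal sketch of my own (not from the original text), writing $T_g$ for the distribution induced by $g$ and $\delta_p$ for the delta at $p$, and using a rapidly decaying Gaussian as a stand-in for a compactly supported test function:

```python
# A sketch of distributions as higher-order functions, using SymPy.
# Caveat: the Gaussian "test function" below is Schwartz-class rather than
# compactly supported, which is good enough for these illustrations.
import sympy as sp

x = sp.symbols('x', real=True)

def T(g):
    """The distribution induced by g via integration: f -> integral of g*f."""
    return lambda f: sp.integrate(g * f, (x, -sp.oo, sp.oo))

def delta(p=0):
    """The delta distribution at p: f -> f(p)."""
    return lambda f: f.subs(x, p)

f = sp.exp(-x**2)             # stand-in test function
print(T(sp.exp(-x**2))(f))    # integral of e^{-2x^2} over R, i.e. sqrt(pi/2)
print(delta(1)(f))            # f(1) = e^{-1}
```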
The set of distributions is closed under differentiation, where the derivative of a distribution is defined by its action
on a function in analogy to integration by parts. When $n=1$ this lets us define the first derivative of
$T$ by $T'(f) = -T(f')$. More generally, if $D = \partial_1^{a_1}\cdots\partial_n^{a_n}$ is some differential
operator on $C_c^\infty(\mathbb{R}^n)$ we define $DT$ on $f$ by
$$(DT)(f) = (-1)^{a_1+\cdots+a_n}\,T(Df).$$
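A classic consequence of this definition (my example, not from the post): the distributional derivative of the Heaviside step function is $\delta$. A quick SymPy check, again with a Gaussian standing in for a compactly supported test function:

```python
import sympy as sp

x = sp.symbols('x', real=True)

# T_H is the distribution induced by the Heaviside function: f -> integral of f over [0, oo).
T_H = lambda f: sp.integrate(f, (x, 0, sp.oo))

# Its distributional derivative acts by T_H'(f) = -T_H(f').
dT_H = lambda f: -T_H(sp.diff(f, x))

f = sp.exp(-x**2)
print(dT_H(f))   # equals f(0) = 1, matching the action of delta
```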
Distributions can be multiplied by smooth functions: if $g$ is smooth and $T\in\mathcal{D}'(\mathbb{R}^n)$, we define
$gT$ to be the distribution such that $(gT)(f) = T(gf)$ for all $f\in C_c^\infty(\mathbb{R}^n)$.
Distributions can also be convolved with functions in $C_c^\infty(\mathbb{R}^n)$, but this operation now yields
smooth functions rather than distributions: if $T\in\mathcal{D}'(\mathbb{R}^n)$ and
$f\in C_c^\infty(\mathbb{R}^n)$, their convolutional product is defined below, where $f_x$ is the function $f_x(y) = f(x-y)$:
$$(T\ast f)(x) = T(f_x).$$
As an example, if $T = \delta$, we compute the value of $\delta\ast f$ at $x$:
$$(\delta\ast f)(x) = \delta(f_x) = f_x(0) = f(x-0) = f(x).$$
Thus, $\delta\ast f = f$, and convolution with $\delta$ realizes the identity operator on $C_c^\infty(\mathbb{R}^n)$.
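The defining formula $(T\ast f)(x) = T(f_x)$ is easy to transcribe directly. Here is a small sketch of mine (the helper `conv` is not the author's) confirming that convolution with $\delta$ is the identity:

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)

def conv(T, f):
    """(T*f)(x) = T(f_x), where f_x(y) = f(x - y) and T eats expressions in y."""
    return T(f.subs(x, x - y))

delta = lambda h: h.subs(y, 0)   # delta acting on test functions of y

f = sp.exp(-x**2)
print(sp.simplify(conv(delta, f) - f))   # 0, i.e. delta * f = f
```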
Differentiation and convolution interact in a particularly simple way: if $D$ is a linear differential operator
and $T\in\mathcal{D}'(\mathbb{R}^n)$, $f\in C_c^\infty(\mathbb{R}^n)$, then we may differentiate the function $T\ast f$ by
$$D(T\ast f) = (DT)\ast f = T\ast(Df).$$
Once you know this line of equalities, the motivation for introducing distributions to solve PDEs becomes clear!
The general goal is to convert (hopefully easier to find) distributional
solutions of a differential equation into actual, real-valued function solutions through convolution. If $D$ is a differential
operator, we say a distribution $T$ is a fundamental solution for $D$ if
$$DT = \delta.$$
Such a fundamental solution lets us find a real solution to the differential equation $Du = f$,
with $f\in C_c^\infty(\mathbb{R}^n)$, by simply taking $u = T\ast f$, as we easily confirm:
$$Du = D(T\ast f) = (DT)\ast f = \delta\ast f = f.$$
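As a concrete instance (my example, not the author's): the Heaviside distribution is a fundamental solution of $D = \frac{d}{dx}$, since its distributional derivative is $\delta$, so $u = \int_{-\infty}^x f$ should solve $u' = f$. A SymPy spot-check with a Gaussian right-hand side:

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)

f = sp.exp(-x**2)   # right-hand side of u' = f

# u = (H * f)(x) = integral of H(x-y) f(y) dy = integral of f over (-oo, x]
u = sp.integrate(f.subs(x, y), (y, -sp.oo, x))

print(sp.simplify(sp.diff(u, x) - f))   # 0, i.e. u' = f
```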
Of course, this (crucially!) relies on the fact that $D(T\ast f) = (DT)\ast f$, and recently I realized I
had completely forgotten how to prove this fact!
Luckily, shortly after, Daniel O'Connor showed me how it works, and so
I want to write it down for the next time that I forget.
Proving $D(T\ast f) = (DT)\ast f$
To prove this, we start small and build up to the general case. The interesting part is actually this small beginning;
the rest is just packaging.
Theorem 1: On $\mathbb{R}$, suppose $T\in\mathcal{D}'(\mathbb{R})$ and $f\in C_c^\infty(\mathbb{R})$. Then $T\ast f$ is differentiable
and $(T\ast f)' = T\ast f' = T'\ast f$.
Proof:
Here’s Daniel’s argument:
Let $x\in\mathbb{R}$, and consider the difference quotient
$$\lim_{h\to 0}\frac{(T\ast f)(x+h) - (T\ast f)(x)}{h}.$$
If the limit exists, then $T\ast f$ is differentiable at $x$.
Using the definition of $\ast$ and the linearity of $T$, we may evaluate this as
$$\lim_{h\to 0}\frac{T(f_{x+h}) - T(f_x)}{h} = \lim_{h\to 0} T\!\left(\frac{f_{x+h} - f_x}{h}\right).$$
Using the continuity of $T$, we may take the limit inside, and so
$$(T\ast f)'(x) = T\!\left(\lim_{h\to 0}\frac{f_{x+h} - f_x}{h}\right).$$
The quantity inside of $T$ attempts to assign to each $y$ the value
$$\lim_{h\to 0}\frac{f(x+h-y) - f(x-y)}{h} = f'(x-y),$$
so in our notation, this is the function $(f')_x$, which itself is in $C_c^\infty(\mathbb{R})$ as $f$ was.
Thus, $(T\ast f)'(x)$ exists, and $(T\ast f)'(x) = T((f')_x)$.
But this new term is exactly the definition of $T$ convolved with $f'$, when evaluated at $x$! Thus as functions, we have shown
$$(T\ast f)' = T\ast f'.$$
This is half of what we want, but the rest is just a straightforward application of the definition of the distributional
derivative. By definition, $T'$ is the linear functional such that $T'(g) = -T(g')$ for all $g\in C_c^\infty(\mathbb{R})$,
so computing $T'\ast f$, we see for $x\in\mathbb{R}$
$$(T'\ast f)(x) = T'(f_x) = -T((f_x)'),$$
where $(f_x)'$ is the function sending $y\mapsto \frac{d}{dy}f(x-y)$.
Computing this derivative with the chain rule shows $(f_x)'(y) = -f'(x-y) = -(f')_x(y)$, and so
$$(f_x)' = -(f')_x.$$
Stringing all this together, we see $(T'\ast f)(x) = -T(-(f')_x) = T((f')_x)$, where we recognize this second
term as defining the convolution $T\ast f'$. As this equality holds for all $x$ we have equality
between functions:
$$T'\ast f = T\ast f'.$$
Combining with our earlier result proves the theorem, as we have shown both $(T\ast f)'$ and $T'\ast f$ are equal to $T\ast f'$. $\square$
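Theorem 1 is easy to sanity-check symbolically for a concrete distribution. Here is my own example (not from the post), taking $T = \delta_2$, whose convolution with $f$ is the translate $f(x-2)$:

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)

def conv(T, f):
    """(T*f)(x) = T(f_x), with f_x(y) = f(x - y)."""
    return T(f.subs(x, x - y))

T  = lambda h: h.subs(y, 2)          # T = delta_2
dT = lambda h: -T(sp.diff(h, y))     # T'(h) = -T(h')

f = sp.exp(-x**2)

lhs = sp.diff(conv(T, f), x)         # (T*f)'
mid = conv(T, sp.diff(f, x))         # T*f'
rhs = conv(dT, f)                    # T'*f
print(sp.simplify(lhs - mid), sp.simplify(lhs - rhs))   # 0 0
```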
Corollary 2: Let $D = \frac{d^k}{dx^k}$ be the $k$-th derivative operator on $C_c^\infty(\mathbb{R})$. Then for
any $T\in\mathcal{D}'(\mathbb{R})$ and $f\in C_c^\infty(\mathbb{R})$, the convolution $T\ast f$ is $k$ times differentiable and
$$D(T\ast f) = (DT)\ast f = T\ast(Df).$$
Proof: We can proceed inductively using Theorem 1, as $D$ is the $k$-fold composition of the first derivative operator. $\square$
Lemma 3:
On $\mathbb{R}^n$, let $\partial_1$ denote the directional derivative with respect to the first coordinate.
Then for any $T\in\mathcal{D}'(\mathbb{R}^n)$ and $f\in C_c^\infty(\mathbb{R}^n)$, $\partial_1(T\ast f)$ exists, and
$$\partial_1(T\ast f) = T\ast(\partial_1 f) = (\partial_1 T)\ast f.$$
Proof:
The proof is exactly analogous to the one dimensional case in Theorem 1, so we can proceed rather quickly.
It suffices to check this equality holds at an arbitrary fixed $x\in\mathbb{R}^n$, where
$$\partial_1(T\ast f)(x) = \lim_{h\to 0}\frac{(T\ast f)(x+he_1) - (T\ast f)(x)}{h}.$$
Evaluating the convolutions and using the linearity and continuity of $T$ shows this to be
$T((\partial_1 f)_x)$, which is the convolution of $T$ with $\partial_1 f$ evaluated at $x$.
Thus,
$$\partial_1(T\ast f) = T\ast(\partial_1 f).$$
The second equality again follows simply by using the definition of $\partial_1 T$ to compute $(\partial_1 T)\ast f$
at $x$, resulting in
$$((\partial_1 T)\ast f)(x) = -T(\partial_1(f_x)) = T((\partial_1 f)_x) = (T\ast(\partial_1 f))(x). \qquad\square$$
Corollary 4:
If $D = \partial_1^{a_1}\cdots\partial_n^{a_n}$ is any monomial in the coordinate partial derivative operators on $\mathbb{R}^n$,
then for any $T\in\mathcal{D}'(\mathbb{R}^n)$ and $f\in C_c^\infty(\mathbb{R}^n)$,
$$D(T\ast f) = (DT)\ast f = T\ast(Df).$$
Proof: As in Corollary 2, we inductively apply Lemma 3 to each partial derivative operator which shows up in $D$. $\square$
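In higher dimensions the same bookkeeping works coordinate by coordinate. Here is a small SymPy check of my own (not the author's) of Corollary 4 on $\mathbb{R}^2$, with the monomial $D = \partial_1\partial_2$ and $T$ the delta at $(1,2)$:

```python
import sympy as sp

x1, x2, y1, y2 = sp.symbols('x1 x2 y1 y2', real=True)

def conv(T, f):
    """(T*f)(x) = T(f_x), with f_x(y) = f(x - y), on R^2."""
    return T(f.subs([(x1, x1 - y1), (x2, x2 - y2)], simultaneous=True))

T  = lambda h: h.subs([(y1, 1), (y2, 2)])   # delta at p = (1, 2)
DT = lambda h: T(sp.diff(h, y1, y2))        # (d1 d2 T)(h) = (-1)^2 T(d1 d2 h)

f  = sp.exp(-x1**2 - x2**2)
Df = sp.diff(f, x1, x2)                     # Df = d1 d2 f

lhs = sp.diff(conv(T, f), x1, x2)           # D(T*f)
print(sp.simplify(lhs - conv(DT, f)), sp.simplify(lhs - conv(T, Df)))  # 0 0
```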
To build upwards from this, it's useful to stop for a second and factor out a little argument about convolution:
Lemma 5: If $S, T$ are distributions and $f, g$ smooth compactly supported functions, then
$$T\ast(f+g) = T\ast f + T\ast g \quad\text{and}\quad (S+T)\ast f = S\ast f + T\ast f.$$
Proof:
Let $x\in\mathbb{R}^n$.
First consider $T\ast(f+g)$ evaluated at $x$. This is by definition $T((f+g)_x)$, that is,
$T(f_x + g_x)$. Using the linearity of $T$, we see this to be $T(f_x) + T(g_x)$, which
is by definition $(T\ast f)(x) + (T\ast g)(x)$. Thus,
$$T\ast(f+g) = T\ast f + T\ast g.$$
Next, consider $(S+T)\ast f$ evaluated at $x$. By the definition of convolution,
$((S+T)\ast f)(x) = (S+T)(f_x)$.
Using the definition of $S+T$ in $\mathcal{D}'(\mathbb{R}^n)$, we distribute as
$(S+T)(f_x) = S(f_x) + T(f_x)$, where the last terms are each by definition equal
to $(S\ast f)(x)$ and $(T\ast f)(x)$ respectively. Thus,
$$(S+T)\ast f = S\ast f + T\ast f. \qquad\square$$
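Both halves of Lemma 5 are one-liners to verify in SymPy; the particular $S$, $T$, $f$, $g$ below are my choices, not the author's:

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)

def conv(T, f):
    """(T*f)(x) = T(f_x), with f_x(y) = f(x - y)."""
    return T(f.subs(x, x - y))

S  = lambda h: h.subs(y, 1)                                          # delta_1
T  = lambda h: sp.integrate(sp.exp(-y**2) * h, (y, -sp.oo, sp.oo))   # T_g with g = e^{-y^2}
ST = lambda h: S(h) + T(h)                                           # the distribution S + T

f, g = sp.exp(-x**2), x * sp.exp(-x**2)

print(sp.simplify(conv(T, f + g) - (conv(T, f) + conv(T, g))))   # 0
print(sp.simplify(conv(ST, f) - (conv(S, f) + conv(T, f))))      # 0
```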
Lemma 6: Let $D_1, D_2$ be differential operators on $\mathbb{R}^n$ such that
$$D_i(T\ast f) = (D_iT)\ast f = T\ast(D_if)$$
for $i\in\{1,2\}$ and any $T\in\mathcal{D}'(\mathbb{R}^n)$, $f\in C_c^\infty(\mathbb{R}^n)$.
Then $D_1 + D_2$ also satisfies
$$(D_1+D_2)(T\ast f) = ((D_1+D_2)T)\ast f = T\ast((D_1+D_2)f)$$
for all $T$ and $f$.
Proof:
The differential operator $D_1+D_2$ acts on functions by $(D_1+D_2)g = D_1g + D_2g$. Thus if $T\in\mathcal{D}'(\mathbb{R}^n)$, $f\in C_c^\infty(\mathbb{R}^n)$,
$$(D_1+D_2)(T\ast f) = D_1(T\ast f) + D_2(T\ast f).$$
To get the first of the two claimed equalities, we can use half our hypothesis on the $D_i$ to rewrite this as
$(D_1T)\ast f + (D_2T)\ast f$, and then use Lemma 5 to factor out the convolution, giving
$(D_1T + D_2T)\ast f$. Factoring out the $T$ gives what we wanted:
$$(D_1+D_2)(T\ast f) = ((D_1+D_2)T)\ast f.$$
To get the second equality, we use the other half of our assumption on the $D_i$ to rewrite
$D_1(T\ast f) + D_2(T\ast f)$ as $T\ast(D_1f) + T\ast(D_2f)$. We use Lemma 5 to factor out the distribution $T$ from this convolution,
followed by the further factoring $D_1f + D_2f = (D_1+D_2)f$. All together, this gives what we wanted:
$$(D_1+D_2)(T\ast f) = T\ast((D_1+D_2)f). \qquad\square$$
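A quick SymPy instance of Lemma 6 (my example, not the author's), with $D_1 = \frac{d}{dx}$, $D_2 = \frac{d^2}{dx^2}$, and $T = \delta_1$:

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)

def conv(T, f):
    """(T*f)(x) = T(f_x), with f_x(y) = f(x - y)."""
    return T(f.subs(x, x - y))

T  = lambda h: h.subs(y, 1)                              # delta_1
# (D1+D2)T acts by h -> -T(h') + T(h''), from one and two sign flips.
DT = lambda h: -T(sp.diff(h, y)) + T(sp.diff(h, y, 2))

f  = sp.exp(-x**2)
Df = sp.diff(f, x) + sp.diff(f, x, 2)                    # (D1+D2)f

lhs = sp.diff(conv(T, f), x) + sp.diff(conv(T, f), x, 2) # (D1+D2)(T*f)
print(sp.simplify(lhs - conv(DT, f)), sp.simplify(lhs - conv(T, Df)))  # 0 0
```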
Lemma 7: Let $D$ be a differential operator on $\mathbb{R}^n$ such that
$$D(T\ast f) = (DT)\ast f = T\ast(Df)$$
for any $T\in\mathcal{D}'(\mathbb{R}^n)$ and $f\in C_c^\infty(\mathbb{R}^n)$.
Then if $g$ is any smooth function, the differential operator $gD$ defined by $(gD)f = g\cdot Df$
also satisfies
$$(gD)(T\ast f) = ((gD)T)\ast f = T\ast((gD)f)$$
for all $T$ and $f$.
Proof:
Evaluating $(gD)(T\ast f) = g\cdot D(T\ast f)$, we can use that $D$ satisfies our hypothesis to conclude this is equal to
$g\cdot((DT)\ast f)$ and $g\cdot(T\ast(Df))$. Taking the former and evaluating at $x$, we see it to equal
$$g(x)\cdot((DT)\ast f)(x) = g(x)\cdot(DT)(f_x).$$
At this fixed $x$, $g(x)$ is a constant, and so this is the same as evaluating the distribution
$g(x)\,DT$ on $f_x$. But this is the definition of the convolution of $g(x)\,DT$ with the function $f$, evaluated at $x$!
Thus, all together, $(gD)(T\ast f)$ is the function $x\mapsto ((g(x)\,DT)\ast f)(x)$, which is the first
half of what we want:
$$(gD)(T\ast f) = ((gD)T)\ast f.$$
The other case is similar, considering $g\cdot(T\ast(Df))$ evaluated at $x$. This yields $g(x)\cdot T((Df)_x)$,
and as $g(x)$ is a constant at this $x$, we may pull it inside to get $T(g(x)\cdot(Df)_x)$.
That is, $x$ is sent to the result of convolving the function $g(x)\,Df$ with $T$, so
$$(gD)(T\ast f) = T\ast((gD)f). \qquad\square$$
Finally, we need a description of the class of linear differential operators on $\mathbb{R}^n$ which is amenable
to our start-small-and-build-upwards approach:
Lemma 8: Any linear differential operator on $\mathbb{R}^n$ is a multinomial in the partial derivative operators $\partial_1,\ldots,\partial_n$,
with coefficients in $C^\infty(\mathbb{R}^n)$.
All the hard work is done; now it's just putting the pieces together to state the main result:
Theorem 9: Let $D$ be any linear differential operator on $\mathbb{R}^n$. Then
for all $T\in\mathcal{D}'(\mathbb{R}^n)$, $f\in C_c^\infty(\mathbb{R}^n)$,
$$D(T\ast f) = (DT)\ast f = T\ast(Df).$$
Proof: We write $D$ as a multinomial in the partial derivatives,
$$D = \sum_\alpha g_\alpha\,\partial^\alpha,$$
where $\alpha$ ranges over some finite subset of all multi-indices,
$\partial^\alpha = \partial_1^{\alpha_1}\cdots\partial_n^{\alpha_n}$, and for each index $\alpha$, $g_\alpha$ is some
smooth function in $C^\infty(\mathbb{R}^n)$. But as each $\partial^\alpha$ satisfies the desired property by Corollary 4,
we can apply Lemmas 6 and 7 finitely many times to conclude that $D$ does as well. $\square$
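To close the loop with the fundamental-solution motivation (this example is mine, not the author's): for the constant-coefficient operator $D = \frac{d}{dx} - 1$, the distribution induced by $E(x) = H(x)e^x$ satisfies $DE = \delta$, so $u = E\ast f$ should solve $u' - u = f$. SymPy confirms this for a Gaussian right-hand side:

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)

f = sp.exp(-x**2)   # right-hand side of u' - u = f

# u = (E*f)(x) = integral of H(x-y) e^{x-y} f(y) dy
#             = e^x * integral of e^{-y} f(y) over (-oo, x]
u = sp.exp(x) * sp.integrate(sp.exp(-y) * f.subs(x, y), (y, -sp.oo, x))

print(sp.simplify(sp.diff(u, x) - u - f))   # 0, i.e. u' - u = f
```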