Convolution and Differentiation of Distributions

If F is a distribution and g is a function, why is (DF)*g equal to F*Dg?

Steve Trettel

| Analysis

I’ll review (extremely rapidly!) to fix notation. We equip the vector space $C_c^\infty$ of smooth compactly supported real-valued functions on $\mathbb{R}^n$ with the following topology: a sequence $\phi_n$ converges to $\phi$ if the $\phi_n$ and all their derivatives converge to $\phi$ and all of its derivatives with respect to the norm $\|\phi\|=\int_{\mathbb{R}^n}|\phi|\,d\mathrm{vol}$. The topological dual $\mathcal{D}$ of $C_c^\infty$ is the set of distributions on $\mathbb{R}^n$, or continuous linear functionals $C_c^\infty\to\mathbb{R}$.

Every smooth function $f\in C_c^\infty$ naturally gives rise to a distribution $F\in\mathcal{D}$ via integration, $f\mapsto F$ where $F(\phi)=\int_{\mathbb{R}^n}f\phi\,d\mathrm{vol}$, but not all distributions are of this form. Important examples are given by the delta distributions: for any $p\in\mathbb{R}^n$ we define $\delta_p\in\mathcal{D}$ by $\delta_p(\phi):=\phi(p)$, and we write $\delta$ for $\delta_0$.
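If it helps to see these objects concretely, here is a minimal numerical sketch (in Python with scipy, both my own choices rather than anything from the post): a distribution is modeled as a callable that eats a test function and returns a number, with function-induced distributions given by quadrature and deltas given by evaluation.

```python
# Minimal model: a distribution is a callable taking a test function to a number.
import math
from scipy.integrate import quad

def bump(x):
    """A smooth test function supported on (-1, 1)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1 else 0.0

def distribution_from_function(f, a=-2.0, b=2.0):
    """The distribution phi -> integral of f*phi; (a, b) should contain supp(phi)."""
    return lambda phi: quad(lambda x: f(x) * phi(x), a, b)[0]

def delta(p=0.0):
    """The delta distribution at p: evaluate the test function there."""
    return lambda phi: phi(p)

F = distribution_from_function(math.cos)
print(F(bump))           # integral of cos(x) * bump(x) dx
print(delta(0.5)(bump))  # bump(0.5)
```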

The set of distributions is closed under differentiation, where the derivative of a distribution $F$ is defined by its action on a function $\phi$ in analogy to integration by parts. On $\mathbb{R}$ this lets us define the first derivative $F'$ of $F$ by $F'(\phi):=-F(\phi')$. More generally, if $\partial$ is some differential operator on $C_c^\infty$ we define $\partial$ on $\mathcal{D}$ by $\partial F(\phi):=F(\partial^\ast\phi)$, where $\partial^\ast$ is the formal adjoint of $\partial$ coming from integration by parts.
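As a quick sanity check of this definition, the sketch below (again Python with scipy; the helper `num_deriv` and the bump function are hypothetical conveniences of mine) verifies numerically that when $F$ is induced by a smooth $f$, the distributional derivative $F'(\phi)=-F(\phi')$ returns the same number as the distribution induced by $f'$, exactly as integration by parts predicts.

```python
# Check: for F induced by smooth f, F'(phi) := -F(phi') matches the
# distribution induced by f' (integration by parts in disguise).
import math
from scipy.integrate import quad

def bump(x):
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1 else 0.0

def num_deriv(phi, h=1e-5):
    """Central-difference derivative of a test function (an approximation)."""
    return lambda x: (phi(x + h) - phi(x - h)) / (2 * h)

def distribution_from_function(f, a=-2.0, b=2.0):
    return lambda phi: quad(lambda x: f(x) * phi(x), a, b)[0]

def derivative(F):
    """The distributional derivative F'(phi) = -F(phi')."""
    return lambda phi: -F(num_deriv(phi))

F = distribution_from_function(math.sin)
print(derivative(F)(bump))                         # -integral of sin(x) * bump'(x) dx
print(distribution_from_function(math.cos)(bump))  # integral of cos(x) * bump(x) dx -- same value
```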

Distributions can be multiplied by smooth functions: if $\psi\colon\mathbb{R}^n\to\mathbb{R}$ is smooth and $F\in\mathcal{D}$, we define $\psi F$ to be the distribution such that $(\psi F)(\phi)=F(\psi\phi)$ for all $\phi\in C_c^\infty$. Distributions can also be convolved with functions in $C_c^\infty$, but this operation now yields smooth functions rather than distributions: if $F\in\mathcal{D}$ and $g\in C_c^\infty$, their convolutional product $F\star g$ is defined below, where $g^{(p)}$ is the function $x\mapsto g(p-x)$:
$$F\star g\colon p\mapsto F\left(g^{(p)}\right)$$

As an example, if $\phi\in C_c^\infty$, we compute the value of $\delta\star\phi$ at $p\in\mathbb{R}^n$:
$$(\delta\star\phi)(p):=\delta\left(\phi^{(p)}\right)=\phi^{(p)}(0)=\phi(p-0)=\phi(p)$$
Thus $\delta\star\phi=\phi$, and convolution with $\delta$ realizes the identity operator on $C_c^\infty$.
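The same computation is easy to watch numerically; in the sketch below (my own illustration, not part of the original argument), convolving with $\delta$ returns the test function and convolving with $\delta_p$ translates it.

```python
# Convolution of a distribution with a test function, and delta as the identity.
import math

def bump(x):
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1 else 0.0

def delta(p=0.0):
    return lambda phi: phi(p)

def convolve(F, g):
    """(F * g)(x) = F(g^(x)), where g^(x)(p) = g(x - p)."""
    return lambda x: F(lambda p: g(x - p))

print(convolve(delta(), bump)(0.3), bump(0.3))      # delta * g = g
print(convolve(delta(2.0), bump)(2.3), bump(0.3))   # delta_p * g translates g by p
```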

Differentiation and convolution interact in a particularly simple way: if $\partial$ is a constant-coefficient linear differential operator and $F\in\mathcal{D}$, $g\in C_c^\infty$, then we may differentiate the function $F\star g$ by
$$\partial(F\star g)=(\partial F)\star g=F\star(\partial g)$$

Once you know this line of equalities, the motivation for introducing distributions to solve PDEs becomes clear! The general goal is to convert (hopefully easier to find) distributional solutions of a differential equation into actual, real-valued function solutions through convolution. If $\partial$ is such an operator, we say a distribution $F$ is a fundamental solution for $\partial$ if
$$\partial F=\delta$$
Such a fundamental solution lets us find a real solution to the differential equation $\partial u=g$, with $g\in C_c^\infty$, by simply taking $u=F\star g$, as we easily confirm:

$$\partial(F\star g)=(\partial F)\star g=\delta\star g=g$$
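To see this machinery run on a concrete example, recall the classical fact that on $\mathbb{R}$ the distribution $F$ induced by $f(x)=\tfrac12|x|$ is a fundamental solution of $\partial=\frac{d^2}{dx^2}$. The sketch below (the quadrature scheme and the finite-difference second derivative are ad hoc numerical choices of mine) checks that $u=F\star g$ really does satisfy $u''=g$ at a sample point.

```python
# On R, F induced by f(x) = |x|/2 is a fundamental solution of d^2/dx^2,
# so u = F * g should satisfy u'' = g.
import math
from scipy.integrate import quad

def bump(x):
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1 else 0.0

def u(x):
    """(F * g)(x) = integral of (|p|/2) * g(x - p) dp, with g = bump."""
    lo, hi = x - 1.0, x + 1.0                      # support of p -> bump(x - p)
    integrand = lambda p: 0.5 * abs(p) * bump(x - p)
    if lo < 0.0 < hi:                              # split at the kink of |p|
        return quad(integrand, lo, 0.0)[0] + quad(integrand, 0.0, hi)[0]
    return quad(integrand, lo, hi)[0]

x, h = 0.4, 1e-3
u_second = (u(x + h) - 2.0 * u(x) + u(x - h)) / h**2   # central second difference
print(u_second, bump(x))                               # agree to a few digits
```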

Of course, this (crucially!) relies on the fact that $\partial(F\star g)=(\partial F)\star g$, and recently I realized I had completely forgotten how to prove this fact! Luckily, shortly afterwards Daniel O’Connor showed me how it works, and so I want to write it down for the next time that I forget.

Proving $\partial(F\star g)=(\partial F)\star g=F\star(\partial g)$

To prove this, we start small and build up to the general case. The interesting part is actually this small beginning, however; the rest is just packaging.

Theorem 1: On $\mathbb{R}$, suppose $F\in\mathcal{D}$ and $g\in C_c^\infty$. Then $F\star g$ is differentiable and $(F\star g)'=F'\star g=F\star g'$.

Proof: Here’s Daniel’s argument. Let $x\in\mathbb{R}$, $h\neq 0$, and consider the difference quotient
$$\psi_h(x):=\frac{(F\star g)(x+h)-(F\star g)(x)}{h}$$
If the limit $\lim_{h\to 0}\psi_h(x)$ exists, then $F\star g$ is differentiable at $x$. Using the definition of $F\star g$ and the linearity of $F$, we may evaluate this as
$$\psi_h(x)=\frac{F\left(g^{(x+h)}\right)-F\left(g^{(x)}\right)}{h}=F\left(\frac{g^{(x+h)}-g^{(x)}}{h}\right)$$

Using the continuity of $F$, we may take the limit inside, and so
$$\lim_{h\to 0}\psi_h(x)=F\left(\lim_{h\to 0}\frac{g^{(x+h)}-g^{(x)}}{h}\right).$$

The quantity inside of $F$ attempts to assign to each $p\in\mathbb{R}$ the value
$$p\mapsto\lim_{h\to 0}\frac{g(x+h-p)-g(x-p)}{h}=g'(x-p)$$
so in our notation, this is the function $(g')^{(x)}$, which itself is in $C_c^\infty$ as $g$ was. Thus $(F\star g)'(x)$ exists, and $(F\star g)'(x)=F\left((g')^{(x)}\right)$. But this new term is exactly the definition of $F$ convolved with $g'$, when evaluated at $x$! Thus as functions, we have shown
$$(F\star g)'=F\star g'$$

This is half of what we want, but the rest is just a straightforward application of the definition of the distributional derivative. By definition, $F'$ is the linear functional such that $F'(\phi)=-F(\phi')$ for all $\phi\in C_c^\infty$, so computing $F'\star g$, we see for $x\in\mathbb{R}$
$$(F'\star g)(x)=F'\left(g^{(x)}\right):=-F\left(\left(g^{(x)}\right)'\right)$$

where $\left(g^{(x)}\right)'$ is the function sending $p\mapsto\frac{d}{dp}g(x-p)$. Computing this derivative with the chain rule shows $\left(g^{(x)}\right)'=-(g')^{(x)}$, and so
$$-F\left(\left(g^{(x)}\right)'\right)=-F\left(-(g')^{(x)}\right)=F\left((g')^{(x)}\right)$$
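This sign bookkeeping is the one step I always second-guess, so here is a tiny symbolic check (using sympy, purely my own addition) that $\frac{d}{dp}g(x-p)=-g'(x-p)$: the two derivatives sum to zero.

```python
# d/dp g(x - p) = -g'(x - p): the p- and x-derivatives of g(x - p) cancel.
import sympy as sp

x, p = sp.symbols('x p')
g = sp.Function('g')
print(sp.simplify(sp.diff(g(x - p), p) + sp.diff(g(x - p), x)))  # prints 0
```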

Stringing all this together, we see $(F'\star g)(x)=F\left((g')^{(x)}\right)$, where we recognize this second term as defining the convolution $F\star g'$ evaluated at $x$. As this equality holds for all $x\in\mathbb{R}$ we have equality between functions:
$$F'\star g=F\star g'$$

Combining with our earlier result proves the theorem, as we have shown both $(F\star g)'$ and $F'\star g$ are equal to $F\star g'$.
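Here is a numerical illustration of Theorem 1 (a sketch of my own, with scipy quadrature and finite differences standing in for the exact derivatives): for $F$ induced by $f(x)=e^{-x^2}$ and $g$ a bump function, the three quantities $(F\star g)'(x)$, $(F'\star g)(x)$, and $(F\star g')(x)$ agree to several digits.

```python
# Theorem 1 numerically: (F*g)'(x), (F'*g)(x) and (F*g')(x) agree,
# here with F induced by f(t) = exp(-t^2) and g a bump function.
import math
from scipy.integrate import quad

def bump(x):
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1 else 0.0

def num_deriv(phi, h=1e-5):
    return lambda t: (phi(t + h) - phi(t - h)) / (2 * h)

f = lambda t: math.exp(-t * t)

def F(phi, a, b):
    """The distribution induced by f, tested against phi on [a, b]."""
    return quad(lambda q: f(q) * phi(q), a, b)[0]

def Fg(x):        # (F*g)(x) = F(p -> g(x - p))
    return F(lambda q: bump(x - q), x - 1, x + 1)

def Fprime_g(x):  # (F'*g)(x) = -F((g^(x))')
    return -F(num_deriv(lambda q: bump(x - q)), x - 1, x + 1)

def F_gprime(x):  # (F*g')(x) = F(p -> g'(x - p))
    return F(lambda q: num_deriv(bump)(x - q), x - 1, x + 1)

x = 0.4
print(num_deriv(Fg)(x), Fprime_g(x), F_gprime(x))   # three nearly equal numbers
```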

Corollary 2: Let $D^k$ be the $k$th derivative operator on $C_c^\infty(\mathbb{R})$. Then for any $F\in\mathcal{D}$ and $g\in C_c^\infty$, the convolution $F\star g$ is $k$ times differentiable and
$$D^k(F\star g)=(D^kF)\star g=F\star(D^kg)$$

Proof: We can proceed inductively using Theorem 1, as $D^k=D\circ D^{k-1}$ is the $k$-fold composition of the first derivative operator.

Lemma 3: On $\mathbb{R}^n$, let $\partial_x$ denote the partial derivative with respect to the first coordinate. Then for any $F\in\mathcal{D}$ and $g\in C_c^\infty$, the partial derivative $\partial_x(F\star g)$ exists, and $\partial_x(F\star g)=(\partial_xF)\star g=F\star(\partial_xg)$.

Proof: The proof is exactly analogous to the one-dimensional case in Theorem 1, so we can proceed rather quickly. It suffices to check this equality holds at an arbitrary fixed $p\in\mathbb{R}^n$, where
$$\partial_x(F\star g)(p)=\lim_{h\to 0}\frac{(F\star g)(p+he_1)-(F\star g)(p)}{h}$$
Evaluating the convolutions and using the linearity and continuity of $F$ shows this to be $F\left((\partial_xg)^{(p)}\right)$, which is the convolution of $F$ with $\partial_xg$ evaluated at $p$. Thus, $\partial_x(F\star g)=F\star(\partial_xg)$. The second equality again follows simply by using the definition of $\partial_xF$ to compute $(\partial_xF)\star g$ at $p$, resulting in $(\partial_xF)\star g=F\star(\partial_xg)$.
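A quick check of Lemma 3 in $\mathbb{R}^2$ (an illustration of mine; taking $F$ to be a delta keeps the quadrature out of it): all three expressions reduce to $(\partial_xg)(p-q)$.

```python
# Lemma 3 in R^2 with F = delta at q: all three quantities equal (d_x g)(p - q).
import math

def bump2(x, y):
    """A smooth test function on R^2 supported in the unit disk."""
    r2 = x * x + y * y
    return math.exp(-1.0 / (1.0 - r2)) if r2 < 1 else 0.0

def dx(phi, h=1e-5):
    """Finite-difference partial derivative in the first coordinate."""
    return lambda x, y: (phi(x + h, y) - phi(x - h, y)) / (2 * h)

def convolve2(F, g):
    """(F * g)(p) = F(g^(p)), with g^(p)(y) = g(p - y)."""
    return lambda px, py: F(lambda yx, yy: g(px - yx, py - yy))

qx, qy = 0.5, -0.2
F   = lambda phi: phi(qx, qy)          # delta at q
dxF = lambda phi: -dx(phi)(qx, qy)     # its distributional x-derivative

px, py = 0.7, 0.1
print(dx(convolve2(F, bump2))(px, py))   # d_x(F*g) at p
print(convolve2(dxF, bump2)(px, py))     # (d_x F)*g at p
print(convolve2(F, dx(bump2))(px, py))   # F*(d_x g) at p -- all agree
```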

Corollary 4: If $L=\partial_x^a\partial_y^b\partial_z^c\cdots$ is any monomial in the coordinate partial derivative operators on $\mathbb{R}^n$, then for any $F\in\mathcal{D}$ and $g\in C_c^\infty$,
$$L(F\star g)=(LF)\star g=F\star(Lg)$$

Proof: As in Corollary 2, we inductively apply Lemma 3 to each partial derivative operator which shows up in $L$.

To build upwards from this, it’s useful to stop for a second and factor out a little argument about convolution:

Lemma 5: If $F,\Phi$ are distributions and $g,\gamma$ are smooth compactly supported functions, then $(F+\Phi)\star g=F\star g+\Phi\star g$ and $F\star(g+\gamma)=F\star g+F\star\gamma$.

Proof: Let $x\in\mathbb{R}^n$. First consider $F\star(g+\gamma)$ evaluated at $x$. This is by definition $F\left((g+\gamma)^{(x)}\right)$, that is, $F\left(g^{(x)}+\gamma^{(x)}\right)$. Using the linearity of $F$, we see this to be $F\left(g^{(x)}\right)+F\left(\gamma^{(x)}\right)$, which is by definition $(F\star g)(x)+(F\star\gamma)(x)$. Thus, $F\star(g+\gamma)=F\star g+F\star\gamma$.

Next, consider $(F+\Phi)\star g$ evaluated at $x$. By the definition of convolution, $((F+\Phi)\star g)(x)=(F+\Phi)\left(g^{(x)}\right)$. Using the definition of $+$ in $\mathcal{D}$, we distribute as $(F+\Phi)\left(g^{(x)}\right)=F\left(g^{(x)}\right)+\Phi\left(g^{(x)}\right)$, where the last terms are each by definition equal to $(F\star g)(x)$ and $(\Phi\star g)(x)$ respectively. Thus, $(F+\Phi)\star g=F\star g+\Phi\star g$.
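Both identities are easy to confirm numerically; the sketch below (my own, with deltas as the distributions and bump functions as the test functions) checks them at a sample point.

```python
# Both halves of Lemma 5, checked at a point with deltas as the distributions.
import math

def bump(x):
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1 else 0.0

def convolve(F, g):
    return lambda x: F(lambda p: g(x - p))

F, Phi = (lambda phi: phi(0.0)), (lambda phi: phi(1.0))        # delta_0, delta_1
add_dist = lambda F1, F2: (lambda phi: F1(phi) + F2(phi))      # sum of distributions
g, gamma = bump, (lambda x: 3.0 * bump(x - 0.5))               # two test functions
add_fn = lambda g1, g2: (lambda x: g1(x) + g2(x))

x = 0.7
print(convolve(add_dist(F, Phi), g)(x), convolve(F, g)(x) + convolve(Phi, g)(x))
print(convolve(F, add_fn(g, gamma))(x), convolve(F, g)(x) + convolve(F, gamma)(x))
```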

Lemma 6: Let $L_1,L_2$ be differential operators on $\mathbb{R}^n$ such that $L_i(F\star g)=(L_iF)\star g=F\star(L_ig)$ for $i\in\{1,2\}$ and any $F\in\mathcal{D}$, $g\in C_c^\infty$. Then $L=L_1+L_2$ also satisfies $L(F\star g)=(LF)\star g=F\star(Lg)$ for all $F,g$.

Proof: The differential operator $L=L_1+L_2$ acts on functions by $L\phi=L_1\phi+L_2\phi$. Thus if $F\in\mathcal{D}$, $g\in C_c^\infty$,
$$(L_1+L_2)(F\star g)=L_1(F\star g)+L_2(F\star g).$$

To get the first of the two claimed equalities, we can use half our hypothesis on the $L_i$ to rewrite this as $(L_1F)\star g+(L_2F)\star g$, and then use Lemma 5 to factor out the convolution, giving $(L_1+L_2)(F\star g)=(L_1F+L_2F)\star g$. Factoring out the $F$ gives what we wanted:
$$(L_1+L_2)(F\star g)=\big((L_1+L_2)F\big)\star g$$

To get the second equality, we use the other half of our assumption on the $L_i$ to rewrite $L_1(F\star g)+L_2(F\star g)$ as $F\star(L_1g)+F\star(L_2g)$. We use Lemma 5 to factor out the distribution from this convolution, followed by the further factoring $L_1g+L_2g=(L_1+L_2)g$. All together, this gives what we wanted:
$$(L_1+L_2)(F\star g)=F\star\big((L_1+L_2)g\big)$$

Lemma 7: Let $L$ be a differential operator on $\mathbb{R}^n$ such that $L(F\star g)=(LF)\star g=F\star(Lg)$ for any $F\in\mathcal{D}$ and $g\in C_c^\infty$. Then if $c\in\mathbb{R}$ is any constant, the differential operator $K=cL$ defined by $K\phi=c\,L(\phi)$ also satisfies $K(F\star g)=(KF)\star g=F\star(Kg)$ for all $F,g$.

Proof: Evaluating $K(F\star g)=c\,L(F\star g)$, we can use that $L$ satisfies our hypothesis to conclude this is equal to both $c\,\big((LF)\star g\big)$ and $c\,\big(F\star(Lg)\big)$. Taking the former and evaluating at $x\in\mathbb{R}^n$, we see it to equal
$$c\,\big((LF)\star g\big)(x)=c\,(LF)\left(g^{(x)}\right)$$
Since $LF$ is linear, multiplying its output by the constant $c$ is the same as evaluating the distribution $cLF$ on $g^{(x)}$. But this is the definition of the convolution of $cLF$ with the function $g$, evaluated at $x$! Thus, all together, $c\,\big((LF)\star g\big)$ is the function $x\mapsto\big((cLF)\star g\big)(x)$, which is the first half of what we want:
$$K(F\star g)=c\,\big((LF)\star g\big)=(cLF)\star g=(KF)\star g$$

The other case is similar, considering $c\,\big(F\star(Lg)\big)$ evaluated at $x$. This yields $c\,F\left((Lg)^{(x)}\right)$, and by the linearity of $F$ we may pull the constant inside to get $F\left(c\,(Lg)^{(x)}\right)=F\left((c\,Lg)^{(x)}\right)$. That is, $c\,\big(F\star(Lg)\big)$ sends $x$ to the result of convolving the function $c\,Lg$ with $F$, so
$$K(F\star g)=c\,\big(F\star(Lg)\big)=F\star(c\,Lg)=F\star(Kg)$$

Finally, we need a description of the class of constant-coefficient linear differential operators on $\mathbb{R}^n$ which is amenable to our start-small-and-build-upwards approach:

Lemma 8: Any constant-coefficient linear differential operator on $\mathbb{R}^n$ is a multinomial in the partial derivative operators $\partial_x,\partial_y,\partial_z,\ldots$, with real coefficients.

All the hard work is done; now it’s just a matter of putting the pieces together to state the main result:

Theorem 9: Let $\partial$ be any constant-coefficient linear differential operator on $\mathbb{R}^n$. Then $\partial(F\star g)=(\partial F)\star g=F\star(\partial g)$ for all $F\in\mathcal{D}$, $g\in C_c^\infty$.

Proof: We write $\partial$ as a multinomial in the partial derivatives,
$$\partial=\sum_{[\alpha]}c_{[\alpha]}\,\partial^{[\alpha]}$$
where $[\alpha]=[a,b,c,\ldots]$ ranges over some finite subset of all multi-indices, $\partial^{[\alpha]}=\partial_x^a\partial_y^b\partial_z^c\cdots$, and each $c_{[\alpha]}$ is a real constant. But as each $\partial^{[\alpha]}$ satisfies the desired property by Corollary 4, we can apply Lemmas 6 and 7 finitely many times to conclude that $\partial$ does as well.
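As a final sanity check (my own illustration, not part of the proof), here is the constant-coefficient operator $L=D^2+2D$ on $\mathbb{R}$ applied to $F=\delta_q$: the three quantities in the theorem agree numerically.

```python
# The constant-coefficient operator L = D^2 + 2D, applied to F = delta_q:
# L(F*g), (LF)*g and F*(Lg) agree at a sample point.
import math

def bump(x):
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1 else 0.0

def d1(phi, h=1e-4):
    return lambda x: (phi(x + h) - phi(x - h)) / (2 * h)

def d2(phi, h=1e-4):
    return lambda x: (phi(x + h) - 2.0 * phi(x) + phi(x - h)) / (h * h)

def L(phi):
    """L = D^2 + 2D acting on functions."""
    return lambda x: d2(phi)(x) + 2.0 * d1(phi)(x)

def convolve(F, g):
    return lambda x: F(lambda p: g(x - p))

q = 0.5
F  = lambda phi: phi(q)                            # delta at q
LF = lambda phi: d2(phi)(q) - 2.0 * d1(phi)(q)     # L(delta_q): note the sign flip on D from F'(phi) = -F(phi')

g, x = bump, 0.8
print(L(convolve(F, g))(x))      # L(F*g)(x)
print(convolve(LF, g)(x))        # (LF)*g(x)
print(convolve(F, L(g))(x))      # F*(Lg)(x) -- all three agree
```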