# The differential operator $\frac{d}{dx}$ binds variables

Recently the question “If $\frac{d}{dx}$ is an operator, on what does it operate?” was asked on mathoverflow. It seems that some users there objected to the question, apparently interpreting it as an elementary inquiry about what kind of thing a differential operator is, and on this interpretation, I would agree that the question would not be right for mathoverflow. And so the question was closed down (and then reopened, and then closed again... sigh). (Update 12/6/12: it was opened again, and so I’ve now posted my answer over there.)

Meanwhile, I find the question to be more interesting than that, and I believe that the OP intends the question in the way I am interpreting it, namely, as a logic question, a question about the nature of mathematical reference, about the connection between our mathematical symbols and the abstract mathematical objects to which we take them to refer.  And specifically, about the curious form of variable binding that expressions involving $dx$ seem to involve.  So let me write here the answer that I had intended to post on mathoverflow:

————————-

To my way of thinking, this is a serious question, and I am not really satisfied by the other answers and comments, which seem to answer a different question than the one that I find interesting here.

The problem is this. We want to regard $\frac{d}{dx}$ as an operator in the abstract senses mentioned by several of the other comments and answers. In the most elementary situation, it operates on functions of a single real variable, returning another such function, the derivative. And the same for $\frac{d}{dt}$.

The problem is that, described this way, the operators $\frac{d}{dx}$ and $\frac{d}{dt}$ seem to be the same operator, namely, the operator that takes a function to its derivative, but nevertheless we cannot seem freely to substitute these symbols for one another in formal expressions. For example, if an instructor were to write $\frac{d}{dt}x^3=3x^2$, a student might object, “don’t you mean $\frac{d}{dx}$?” and the instructor would likely reply, “Oh, yes, excuse me, I meant $\frac{d}{dx}x^3=3x^2$. The other expression would have a different meaning.”

But if they are the same operator, why don’t the two expressions have the same meaning? Why can’t we freely substitute different names for this operator and get the same result? What is going on with the logic of reference here?

The situation is that the operator $\frac{d}{dx}$ seems to make sense only when applied to functions whose independent variable is described by the symbol “$x$”. But this collides with the idea that what the function is at bottom has nothing to do with the way we represent it, with the particular symbols that we might use to express which function is meant. That is, the function is the abstract object (whether interpreted in set theory or category theory or whatever foundational theory), and is not connected in any intimate way with the symbol “$x$”. Surely the functions $x\mapsto x^3$ and $t\mapsto t^3$, with the same domain and codomain, are simply different ways of describing exactly the same function. So why can’t we seem to substitute them for one another in the formal expressions?

The answer is that the syntactic use of $\frac{d}{dx}$ in a formal expression involves a kind of binding of the variable $x$.

Consider the issue of collision of bound variables in first order logic: if $\varphi(x)$ is  the assertion that $x$ is not maximal with respect to $\lt$, expressed by $\exists y\ x\lt y$, then $\varphi(y)$, the assertion that $y$ is not maximal, is not correctly described as the assertion $\exists y\ y\lt y$, which is what would be obtained by simply replacing the occurrence of $x$ in $\varphi(x)$ with the symbol $y$. For the intended meaning, we cannot simply syntactically replace the occurrence of $x$ with the symbol $y$, if that occurrence of $x$ falls under the scope of a quantifier.
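The collision just described can be made concrete with a small sketch. The tuple encoding of formulas and the helper names `naive_subst`, `subst`, `fresh` and `free_vars` below are illustrative assumptions of mine, not any standard library; the point is only that naive textual replacement captures the variable, while a capture-avoiding substitution first renames the bound variable.

```python
from itertools import count

# Hypothetical encoding: variables are strings; formulas are tuples.
#   ("lt", a, b)         means  a < b
#   ("exists", v, body)  means  "there exists v such that body"

def free_vars(phi):
    """Return the set of variables occurring free in phi."""
    if phi[0] == "lt":
        return {phi[1], phi[2]}
    _, v, body = phi
    return free_vars(body) - {v}

def naive_subst(phi, x, y):
    """Textually replace x by y, ignoring quantifier scope (incorrect in general)."""
    if phi[0] == "lt":
        return ("lt", *(y if s == x else s for s in phi[1:]))
    _, v, body = phi
    return ("exists", v, naive_subst(body, x, y))

def fresh(avoid):
    """Pick a variable name z0, z1, ... that is not in the set avoid."""
    return next(f"z{i}" for i in count() if f"z{i}" not in avoid)

def subst(phi, x, y):
    """Substitute y for the free occurrences of x, renaming binders to avoid capture."""
    if phi[0] == "lt":
        return ("lt", *(y if s == x else s for s in phi[1:]))
    _, v, body = phi
    if v == x:
        return phi  # x is bound here, so there is nothing free to replace
    if v == y and x in free_vars(body):
        w = fresh(free_vars(body) | {x, y})  # y would be captured: rename the binder
        body, v = subst(body, v, w), w
    return ("exists", v, subst(body, x, y))

phi = ("exists", "y", ("lt", "x", "y"))  # "x is not maximal"
print(naive_subst(phi, "x", "y"))  # ('exists', 'y', ('lt', 'y', 'y'))
print(subst(phi, "x", "y"))        # ('exists', 'z0', ('lt', 'y', 'z0'))
```

The naive result asserts $\exists y\ y\lt y$, while the capture-avoiding result asserts $\exists z_0\ y\lt z_0$, which is the intended meaning of $\varphi(y)$.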

Similarly, although the functions $x\mapsto x^3$ and $t\mapsto t^3$ are equal as functions of a real variable, we cannot simply syntactically substitute the expression $x^3$ for $t^3$ in $\frac{d}{dt}t^3$ to get $\frac{d}{dt}x^3$. One might even take the latter as a kind of ill-formed expression, without further explanation of how $x^3$ is to be taken as a function of $t$.

So the expression $\frac{d}{dx}$ causes a binding of the variable $x$, much like a quantifier might, and this prevents free substitution in just the way that collision does. But the case here is not quite the same as the way $x$ is a bound variable in $\int_0^1 x^3\ dx$, since $x$ remains free in $\frac{d}{dx}x^3$, but we would say that $\int_0^1 x^3\ dx$ has the same meaning as $\int_0^1 y^3\ dy$.

Of course, the issue evaporates if one uses a notation, such as the $\lambda$-calculus, which insists that one be completely explicit about which syntactic variables are to be regarded as the independent variables of a functional term, as in $\lambda x.x^3$, which means the function of the variable $x$ with value $x^3$. And this is how I take several of the other answers to the question, namely, that the use of the operator $\frac{d}{dx}$ indicates that one has previously indicated which of the arguments of the given function is to be regarded as $x$, and it is with respect to this argument that one is differentiating. In practice, this is almost always clear without much remark. For example, our use of $\frac{\partial}{\partial x}$ and $\frac{\partial}{\partial y}$ seems to manage very well in complex situations, sometimes with dozens of variables running around, without adopting the onerous formalism of the $\lambda$-calculus, even if that formalism is essentially what these solutions are about.
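To illustrate how the ambiguity disappears once the independent variable is named explicitly, here is a minimal sketch in Python, whose `lambda` plays the role of $\lambda x.x^3$. The operator `D` is a hypothetical helper of mine that approximates the derivative numerically by a central difference; it is a stand-in for the abstract derivative operator, not a definition of it.

```python
def D(f, h=1e-5):
    """Return a function approximating the derivative of f (central difference)."""
    def df(a):
        return (f(a + h) - f(a - h)) / (2 * h)
    return df

# The name of the bound variable is irrelevant: these denote the same function.
cube_x = lambda x: x**3
cube_t = lambda t: t**3
print(D(cube_x)(2.0) == D(cube_t)(2.0))   # True
print(abs(D(cube_x)(2.0) - 12.0) < 1e-6)  # True, since 3 * 2**2 == 12

# One explicit reading of d/dt x^3: x is a fixed parameter, and the
# function of t being differentiated is constant.
x = 2.0
print(D(lambda t: x**3)(5.0))  # 0.0
```

Here `D` takes a function rather than an expression, so there is exactly one operator, and the choice between the readings of $\frac{d}{dt}x^3$ has to be made when the `lambda` is written.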

Meanwhile, it is easy to make examples where one must be very specific about which variables are the independent variable and which are not, as Todd mentions in his comment to David’s answer. For example, cases like

$$\frac{d}{dx}\int_0^x(t^2+x^3)dt\qquad \frac{d}{dt}\int_t^x(t^2+x^3)dt$$

are surely clarified for students by a discussion of the usage of variables in formal expressions and more specifically the issue of bound and free variables.
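As a worked sketch of how the bookkeeping goes, the two expressions above can be evaluated as follows. In the second one I rename the bound variable of integration to $s$ (my choice of letter) and treat $x$ as a constant with respect to $t$, which is an assumption about the intended reading; only the lower limit then depends on the free variable $t$.

$$\frac{d}{dx}\int_0^x(t^2+x^3)\,dt=\frac{d}{dx}\left(\frac{x^3}{3}+x^4\right)=x^2+4x^3$$

$$\frac{d}{dt}\int_t^x(s^2+x^3)\,ds=\frac{d}{dt}\left(\frac{x^3}{3}+x^4-\frac{t^3}{3}-x^3t\right)=-(t^2+x^3)$$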

## 10 thoughts on “The differential operator $\frac{d}{dx}$ binds variables”

1. In the second integral you have written the lower limit as $t$; shouldn’t it be 0?

• Well, I had wanted just to mix things up a bit, since that instance of $t$ is the only one relevant for this instance of $\frac{d}{dt}$.

2. I realized at some point that freshman calculus is full of these problems. One of the things that hammered this home for me was when I saw a thread somewhere (I have lost the link) where several people thought that $\int_0^x x\,dx$ is somehow misformed for using $x$ as the limit of integration and in the integrand.

• I do think that that is misformed. $\int_0^x x dt$ would be well formed (although likely to confuse undergraduates). But I don’t see how $x$ can both be the upper limit of integration and the variable of integration. Can you explain what this notation is meant to mean?

• David, it is the same kind of thing as $n\cdot\sum_{n=0}^\infty \frac{1}{2^n}$, which is just $n\cdot 2$, or $2n$. The $n$ appearing in the sum is a bound variable, and has nothing to do with the $n$ in front. Such an expression might arise when you have a variable $n$ to be multiplied by a certain sum, which you had previously calculated using (a different copy of) the symbol $n$, resulting in just this expression. The expression is not really ill-formed, but it is best understood by means of the concepts of free and bound variables, since the $n$ appearing in the summand is a bound variable. This kind of issue arises pervasively in the collision of local and global variables in programming: imagine that you used the variable $x$ in a subroutine, and also used $x$ with a different meaning in a deeper subroutine called by that routine. There are really degrees of globalness, for the scope is bounded usually by a quantifier or, in calculus, by a sum or integral sign.

In particular, I would write:

$$\int_0^x x\ dx\quad =\quad (x^2/2)\mid_0^x\quad = \quad x^2/2-0=x^2/2.$$

One just has to keep track of which $x$ is bound and which is free, and it is perfectly sensible (if ill-advised).
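The programming analogy mentioned above can be sketched directly in Python, where the variable of a generator expression is local to that expression. This is only an illustration: the infinite sum is truncated at 60 terms, and the outer value `n = 3` is an arbitrary choice of mine.

```python
n = 3  # the free (outer) n

# The n inside the generator expression is bound by "for n in ..." and is
# unrelated to the outer n; 60 terms approximate the infinite sum, which is 2.
total = n * sum(1 / 2**n for n in range(60))

print(abs(total - 2 * n) < 1e-9)  # True: the expression equals 2n
print(n)                          # 3: the outer n was never touched
```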

3. The use of partial derivatives does in fact lead to problems when there are unannounced constraints (e.g., if we are actually working on some submanifold of the ambient space using an implicit function rule). It is common for students to first encounter this in a thermodynamics class, where one derives various relationships between energy, pressure, temperature, volume, entropy, and chemical potential, while holding some subset of the variables constant. I remember it being rather confusing, at least for my classmates and me. Your mention of lambda-calculus as a way to clear up ambiguities makes me curious about whether it can be fruitfully applied to physics education.

4. You might be interested in my paper, “The First-Order Syntax of Variadic Functions”, to appear in NDJFL, http://arxiv.org/pdf/1105.4135.pdf

It does not talk about differential operators specifically, but it deals with similar issues of variable-binding-in-terms (as opposed to in formulas). In particular (Re: David Speyer), a rigorous semantics justifies saying things like $\sum_{n=0}^n n = n(n+1)/2$.

In addition to what DJH already wrote to David Speyer I’d add the following amusing example. Is the first-order sentence “$\forall x\,\exists x\ x=0$” true, say, in the reals? The answer is Yes. One way to see this is to carefully apply the Tarskian definition of truth-in-a-model. You’ll see it’s entirely similar to the $\int_0^x x\,dx$ question.

By the way, I don’t know how standard this is, but Stewart’s “Calculus” has the following to say about differentials (additions in brackets are mine): “If $y=f(x)$, where $f$ is a differentiable function, then the differential $dx$ is an independent variable; that is, $dx$ can be given the value of any real number. The differential $dy$ is then [a dependent variable] defined in terms of $dx$ [and $x$] by the equation $dy=f'(x)\,dx$.” This actually has the potential to be made quite logically rigorous (defining not just “variables” as in FOL but “independent variables” and “dependent variables” which depend on independent variables in various ways). One could very compellingly argue that elementary calculus isn’t just about real numbers and functions, but also about formal terms in a certain sophisticated language.

5. You are careful to say that d/dx does not bind in the usual sense of the word, but I was wondering if you have a more precise idea of what sort of binding this is?

Over at https://cs.stackexchange.com/questions/82230 I asked a similar question (before discovering your post today) and it was suggested to interpret d/dx (f(x)) as D(lambda x.f(x))(x). The problem with that is that the expression with the lambda contains a free variable x, while it is not clear that the expression d/dx (f(x)) does so too. (I could of course write d/dx (f(5)), but it is usually considered to be something different from D(lambda x.f(x))(5).)

Anyway, thanks for highlighting some of the subtleties over at MO. I find it extremely difficult to discuss this with people, since most consider the problem trivial or useless. I find it quite interesting from the perspective of (computer) formalizing the differential calculus as used by physicists and engineers.

6. Why should someone consider $\frac{d}{dx}$ as an operator? I remember that in my undergrad calculus course we never thought of it as an operator; rather, it was a notation, or say an abbreviation, for a clear and precise statement. To be precise, no one is allowed to write $\frac{d}{dx}x^3+1$ unless the meaning is clear from the context. To make sure, I again consulted my undergrad textbook as well as other textbooks in calculus and analysis, including the calculus and analysis books of Rudin, Apostol, and R. Silverman; none of them mentions this as an operator.
I took the following lines from Apostol’s Calculus book.