Lesson overview
In this lesson, we'll use calculus to find the minima and maxima of a function \(f(x)\).
This can be done by using derivatives. Wherever a differentiable function "flattens out," so that its steepness at that point becomes zero, that point is a candidate for a minimum or a maximum value of \(f(x)\). For example, if \(f(x)\) denotes the altitude at various points along a mountain, then \(f'(x)\) denotes the steepness of the mountain at each point. At the bottom and top of the mountain, the steepness \(f'(x)\) becomes zero. Thus, the values of \(x\) for which \(f'(x)\) becomes zero could be associated with either the bottom or the top of the mountain. Knowing that \(f'(x)=0\) at the point \(x\) is not enough information, however, to tell whether that particular \(x\) value is associated with the top of the mountain or the bottom. To determine whether \(f(x)\) is at a maximum or a minimum value, we must take the second derivative of \(f(x)\), which is written \(f''(x)\). If \(f''(x)<0\), then the steepness of the mountain \(f'(x)\) is getting smaller with increasing \(x\): it passes from positive, through zero, to negative. Such a point must be associated with the top of the mountain (a maximum value of \(f(x)\)). But if \(f''(x)>0\), then the steepness \(f'(x)\) is getting bigger with increasing \(x\): it passes from negative, through zero, to positive. Such a point must therefore be associated with the bottom of the mountain (a minimum value of \(f(x)\)).
How do we know what the maximum and minimum values of a function are?
In calculus, the function \(f(x)\) gives the height of the graph at each point \(x\), and the derivative \(f'(x)\) gives the slope of the graph at each \(x\). If \(f(x)\) is an arbitrary function as in Figure 1, you can see visually that there are certain points where the function reaches a local minimum or maximum. But is there a purely analytical way to determine the values of \(x\) for which the function hits a maximum or minimum value? Yes, there is. Whenever a differentiable function \(f(x)\) reaches a local maximum or minimum, the derivative \(f'(x)\) at that point must be zero. But how do we know, without looking at the graph, whether that point is a local maximum or a local minimum?
In order for \(f(x)\) to reach a local maximum at some point, say \(x=a\), the function has to keep increasing at a slower and slower rate as \(x\) approaches \(a\) from the left, until eventually it levels off at its maximum value \(f(a)\). To put that another way, as \(x\) approaches \(a\) from the left, the slope \(f'(x)\) has to gradually decrease to zero. As you move away from the local maximum to the right, the function \(f(x)\) starts decreasing; this means that the derivative decreases from zero to negative values as you move further to the right. What all of this means is that at \(x=a\), the slope is getting smaller as you move to the right by a small amount \(dx\).
The derivative \(f'(x)\) is just a function; you could imagine plotting that function for \(x\) values close to \(a\), and what you would see is this: the function starts off positive; as you move to the right approaching \(a\), it gradually gets less positive; when you reach \(a\), it becomes zero; and then as you move away from \(a\) to the right, it becomes negative. When we view \(f'(x)\) like this, as just another function on a graph, it is clear that the slope of this function at \(x=a\) is negative. And yes, that's right, we're thinking about the slope of the slope, something which is all too easy to get confused about. For me, the easiest way to think about this when I first learned the concept was to say: since \(f'(x)\) is just some function, I'm going to think about taking the slope of some other plain old function. That's much less confusing than thinking about the slope of the slope of the function. What we have learned here is that two conditions together identify a point along \(f(x)\) as a local maximum: first, the derivative at that point must be zero; and second, the second derivative at that point must be negative.
A point \(f(a)\) along \(f(x)\) is a local maximum if \(f'(a)=0\) and \(f''(a)<0\).
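To make the "slope of the slope" idea concrete, here is a minimal Python sketch. The function \(f(x)=4x-x^2\), the helper name `slope`, and the step size `h` are illustrative assumptions, not part of the lesson; the derivatives are approximated numerically with central differences rather than computed exactly.

```python
def f(x):
    # Hypothetical example function; it has a local maximum at x = 2.
    return 4 * x - x ** 2

def slope(g, x, h=1e-5):
    # Central-difference approximation of the derivative g'(x).
    return (g(x + h) - g(x - h)) / (2 * h)

def f_prime(x):
    # Treat f'(x) as "just another function" of x.
    return slope(f, x)

# Sample the slope on both sides of the maximum at x = 2.
xs = [1.0, 1.5, 2.0, 2.5, 3.0]
slopes = [f_prime(x) for x in xs]

# Slope of the slope at x = 2, i.e. an estimate of f''(2).
slope_of_slope = slope(f_prime, 2.0)

for x, s in zip(xs, slopes):
    print(f"x = {x:.1f}   f'(x) = {s:+.3f}")
print(f"f''(2) is approximately {slope_of_slope:.3f}")
```

Reading the printed values from left to right, the slope starts positive, passes through roughly zero at \(x=2\), and then turns negative, so the estimate of \(f''(2)\) comes out negative, in line with the rule stated above.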
How do we know if \(f(x)\) hits a minimum value at some point? Figuring this out is analogous to determining whether \(f(x)\) hits a local maximum, except that things are switched around. If \(f(x)\) has a minimum at some point, say \(x=b\), then as \(x\) approaches \(b\) from the left, \(f(x)\) has to decrease at a slower and slower rate until it levels off at its minimum value \(f(b)\); the slope \(f'(x)\) must start off negative and then gradually increase to zero. As you move away from \(b\) to the right, the function has to start increasing, which means that the slope must go from zero to some positive value. What we can conclude from this is that if we were looking at the graph of \(f'(x)\) vs. \(x\) (just think of \(f'(x)\) as any other plain old function) and imagine moving away from \(x=b\) by a small amount \(dx\), then the change \(df'\) will be positive. Thus, the derivative of this function, \(f''(b)\), must be positive. To wrap that all up, a point is a local minimum when the slope at that point is zero and the second derivative there is positive.
A point \(f(b)\) along \(f(x)\) is a local minimum if \(f'(b)=0\) and \(f''(b)>0\).
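The two rules can be combined into a small numerical check. The following Python sketch is only an illustration under stated assumptions: the function names, the step sizes, the tolerance `tol`, and the test function \(g(x)=x^3-3x\) are all hypothetical choices, and the derivatives are finite-difference estimates rather than exact values.

```python
def first_derivative(f, x, h=1e-6):
    # Central-difference estimate of f'(x).
    return (f(x + h) - f(x - h)) / (2 * h)

def second_derivative(f, x, h=1e-4):
    # Central-difference estimate of f''(x).
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

def classify_critical_point(f, a, tol=1e-4):
    # Step 1: the slope must (approximately) vanish at x = a.
    if abs(first_derivative(f, a)) > tol:
        return "not a critical point"
    # Step 2: the sign of the second derivative decides the type.
    curvature = second_derivative(f, a)
    if curvature < -tol:
        return "local maximum"
    if curvature > tol:
        return "local minimum"
    # If f''(a) is (approximately) zero, this test alone cannot decide.
    return "inconclusive"

def g(x):
    # Hypothetical test function: g'(x) = 3x^2 - 3 vanishes at x = -1 and x = 1.
    return x ** 3 - 3 * x

for a in (-1.0, 0.0, 1.0):
    print(f"x = {a:+.1f}: {classify_critical_point(g, a)}")
```

For this test function, \(g''(x)=6x\), so the sketch should report a local maximum at \(x=-1\) and a local minimum at \(x=1\), while \(x=0\) is rejected in the first step because the slope there is not zero.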
Example: finding the minimum of \(x^2\)
We all already know that the minimum value of the function \(f(x)=x^2\) occurs at \(x=0\), where \(f(0)=0\). But our goal in this lesson is to find this minimum "without looking," so to speak. We're going to find the minimum using just the analytical tools we developed in the previous section. As we discussed there, there are essentially two requirements for confirming that a function \(f(x)\) has a minimum at a point \(x=b\): its derivative has to be zero at that point (this tells us that we could be looking at a maximum or a minimum, without knowing which of the two) and its second derivative at that point must be positive (this tells us which of the two we're dealing with: a minimum). That's basically all it boils down to.
To find the minimum of \(f(x)=x^2\), let's say that the minimum of this function occurs at some point \(x=b\)—since we're not looking at the graph, we have no idea what that point actually is. Let's start out by taking the first derivative of \(f(x)\) to get
$$f'(x)=2x.\tag{1}$$
The function \(f(x)=x^2\) can only have a minimum or maximum at those values of \(x\) where the derivative \(f'(x)\) is zero. As we can see from Equation (1), there is only one such value: setting \(2x=0\) gives \(x=0\). We can verify this through substitution:
$$f'(0)=2·0=0.$$
Again, if we're not looking at a graph, all that we know right now is that \(f(x)=x^2\) must be at either a minimum or a maximum at \(f(0)\). We're not sure which yet though. The only way to find out whether this point is associated with a maximum or a minimum is to evaluate \(f''(0)\). If the value we get for this second derivative is positive, then we'll know that we're at a minimum; if this second derivative is negative, then we'll know that \(f(0)\) is a maximum. Let's start off by taking the derivative of \(f'(x)\) with respect to \(x\) to get:
$$f''(x)=2.\tag{2}$$
Equation (2) tells us that the second derivative is equal to \(2\) at every value of \(x\), including zero. Thus, \(f''(0)=2\). Since the first derivative at \(x=0\) is zero and the second derivative at \(x=0\) is positive, the minimum of \(f(x)\) must occur at \(x=0\), where \(f(0)=0\). If we now look at the graph of \(x^2\) below, we see that this is indeed the minimum.
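The same search can be carried out "without looking" in a few lines of Python. This is a sketch under assumptions: it hard-codes the derivatives from Equations (1) and (2), and it locates the zero of \(f'(x)\) by bisection on a hypothetical starting interval \([-3, 5]\); the function names and the tolerance are illustrative choices.

```python
def f_prime(x):
    # Equation (1): the derivative of f(x) = x^2.
    return 2 * x

def f_second(x):
    # Equation (2): the second derivative is the constant 2.
    return 2

def find_zero_of_slope(lo, hi, tol=1e-12):
    # Bisection search for a zero of f_prime.
    # Assumes f_prime(lo) < 0 < f_prime(hi), which holds for the interval used below.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        value = f_prime(mid)
        if value == 0:
            return mid
        if value < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical starting interval; any interval with a negative slope on the
# left end and a positive slope on the right end would do.
b = find_zero_of_slope(-3.0, 5.0)

# Second-derivative test (this sketch does not handle the case f''(b) = 0).
kind = "minimum" if f_second(b) > 0 else "maximum"
print(f"f'(x) = 0 at x = {b}; f''(x) = {f_second(b)}, so this is a {kind}")
```

The search settles on \(x=0\), and because the second derivative there is positive, the point is classified as a minimum, matching the calculation above.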
This article is licensed under a CC BY-NC-SA 4.0 license.