Ask Uncle Colin: why does the Newton-Raphson method work?

Ask Uncle Colin is a chance to ask your burning, possibly embarrassing, maths questions – and to show off your skills at coming up with clever acronyms. Send your questions to colin@flyingcoloursmaths.co.uk and Uncle Colin will do what he can.

Dear Uncle Colin,

I know how to use the Newton-Raphson method – but I don’t know why it works and I’m worried nobody will like me because of it.

-- Getting An Understanding Starts Somewhere

But of course, GAUSS! First up, just for the sake of clarity, not knowing where the Newton-Raphson method comes from is perfectly normal. In fact, not many people know that the version we know and love was developed by Simpson (Newton and Raphson had similar, but much more limited, processes for finding polynomial roots) – I didn’t know that until recently, and a few people like me (I think).

If you want to find the root of a function with the Newton-Raphson method, starting from a sensible guess $x_0$, here’s what you do:

1. Work out the value of the function at $x_0$.
2. Work out the value of the derivative of the function at $x_0$.
3. Divide the first by the second.
4. Take the result away from $x_0$ – and you get your next guess, $x_1$.
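
If it helps to see the recipe as code, here is a minimal Python sketch of a single step. The name `newton_step` and its arguments `f`, `df` and `x0` are made up for illustration, and the zero-derivative check is an extra safety net rather than part of the recipe itself.

```python
def newton_step(f, df, x0):
    """Return the next guess x1 from the current guess x0.

    f and df are hypothetical callables for the function and its derivative.
    """
    value = f(x0)   # step 1: the function at x0
    slope = df(x0)  # step 2: the derivative at x0
    if slope == 0:
        # Assumption: we simply give up here, since a flat tangent
        # never crosses the x-axis.
        raise ZeroDivisionError("derivative is zero at x0")
    ratio = value / slope  # step 3: divide the first by the second
    return x0 - ratio      # step 4: take the result away from x0
```

Calling it again with the result as the new `x0` gives the guess after that, and so on.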

For example, if I wanted to work out $\sqrt{123}$ and figured out that $f(x) = x^2-123$ would give me a zero at that point, I’d guess that the answer was about 11 (because $11^2=121$).

I’d work out: $f(11) = -2$.

I’d differentiate to get $f'(x) = 2x$, so $f'(11) = 22$.

I’d divide the first by the second to get $-\frac{1}{11}$.

I’d take that away for a second, improved guess of $x_1 = 11 + \frac{1}{11} = 11.\dot 0 \dot 9$. (The true value of $\sqrt{123}$ is about 11.0905, so that’s good to three decimal places.)
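
The same sum can be checked, and carried on for a few more rounds, with a short Python sketch. The four-round cut-off and the names `f`, `df` and `guesses` are arbitrary choices for illustration.

```python
import math

def f(x):
    return x * x - 123  # zero at the square root of 123

def df(x):
    return 2 * x  # derivative of f

x = 11.0  # first guess, because 11^2 = 121
guesses = [x]
for _ in range(4):
    x = x - f(x) / df(x)  # one Newton-Raphson step
    guesses.append(x)

# Print each guess next to its distance from the library's square root.
for n, guess in enumerate(guesses):
    print(n, guess, abs(guess - math.sqrt(123)))
```

The second entry in `guesses` is the $11 + \frac{1}{11}$ from above, and the distance from `math.sqrt(123)` should shrink very quickly from one round to the next.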

But why does it work? That’s a more interesting question. It’s actually a very simple idea: draw a tangent to the curve at your best guess, and use where that crosses the $x$-axis as your next guess.

Not convinced? Well, the tangent at $x_0$ has a gradient of $f'(x_0)$ and goes through $(x_0, f(x_0))$, so the equation of the line is $(y-f(x_0)) = f'(x_0) (x - x_0)$ – and the point $(x_1, 0)$ lies on this line.

Substituting, you get $- f(x_0) = f'(x_0) (x_1 - x_0)$. Divide by the gradient (which needs to be non-zero), add $x_0$, and you get:

$x_0 - \frac{ f(x_0)}{f'(x_0)} = x_1$, which is the Newton-Raphson recipe. Neat, eh?
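
If you’d rather see the tangent argument checked numerically, here is a small Python sketch using the $\sqrt{123}$ example. The helper `tangent_at` is a made-up name: it builds the tangent line at $x_0$, and the last two lines check that the line touches the curve at $x_0$ and hits the $x$-axis at $x_1$ (up to floating-point rounding).

```python
def tangent_at(f, df, x0):
    """Return the tangent line to y = f(x) at x0, as a function of x."""
    def line(x):
        # Point-gradient form: y - f(x0) = f'(x0) * (x - x0)
        return f(x0) + df(x0) * (x - x0)
    return line

def f(x):
    return x * x - 123

def df(x):
    return 2 * x

x0 = 11.0
x1 = x0 - f(x0) / df(x0)  # the Newton-Raphson recipe
line = tangent_at(f, df, x0)

print(line(x0) == f(x0))  # the tangent meets the curve at x0
print(abs(line(x1)))      # tiny: the tangent crosses the x-axis at x1
```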

-- Uncle Colin
