Education is not Moneyball

I initially embraced value-added methods of teacher evaluation, figuring that they could revolutionize education in the same way that sabermetricians revolutionized professional baseball. Over time, however, I realized that this analogy was somewhat flawed. There are lots of ways to analyze data, and the owners of baseball teams have a real motivation — they want to win ball games and sell tickets — to use data appropriately to ensure their best chance of success. I’m not so sure that the “owners” of public education — the politicians and ultimately the voters — share this motivation.

An excellent editorial on the contrasting use of statistics in baseball and in education appeared in Education Week: http://www.edweek.org/tm/articles/2014/08/27/fp_eger_valueadded.html?cmp=ENL-TU-NEWS1

I appreciate the tack that this editorial takes: the author is not philosophically opposed to sabermetric-like analysis of education but argues forcefully that, pragmatically, we’re not there yet.

Both the Gates Foundation and the Education Department have been advocates of using value-added models to gauge teacher performance, but my sense is that they are increasingly nervous about the accuracy and fairness of the new methodology, especially as schools transition to the Common Core State Standards.

There are definitely grounds for apprehensiveness. Oddly enough, many of the reasons that the similarly structured WAR [Wins Above Replacement] works in baseball point to reasons why teachers should be skeptical of value-added models.

WAR works because baseball is standardized. All major league baseball players play on the same field, against the same competition with the same rules, and with a sizable sample (162 games). Meanwhile, public schools aren’t playing a codified game. They’re playing Calvinball—the only permanent rule seems to be that you can’t play it the same way twice. Within the same school some teachers have SmartBoards while others use blackboards; some have spacious classrooms, while others are in overcrowded closets; some buy their own supplies while others are given all they need. The differences across schools and districts are even larger.

The American Statistical Association released a brief report on value-added assessment that was devastating to its advocates. The ASA set out some caveats on the use of value-added measurement (VAM) which should give education reformers pause. Some quotes:

VAMs are complicated statistical models, and they require high levels of statistical expertise. Sound statistical practices need to be used when developing and interpreting them, especially when they are part of a high-stakes accountability system. These practices include evaluating model assumptions, checking how well the model fits the data, investigating sensitivity of estimates to aspects of the model, reporting measures of estimated precision such as confidence intervals or standard errors, and assessing the usefulness of the models for answering the desired questions about teacher effectiveness and how to improve the educational system.

VAMs typically measure correlation, not causation: Effects – positive or negative – attributed to a teacher may actually be caused by other factors that are not captured in the model.

Under some conditions, VAM scores and rankings can change substantially when a different model or test is used, and a thorough analysis should be undertaken to evaluate the sensitivity of estimates to different models.

VAMs should be viewed within the context of quality improvement, which distinguishes aspects of quality that can be attributed to the system from those that can be attributed to individual teachers, teacher preparation programs, or schools. Most VAM studies find that teachers account for about 1% to 14% of the variability in test scores, and that the majority of opportunities for quality improvement are found in the system-level conditions. Ranking teachers by their VAM scores can have unintended consequences that reduce quality.


The number of digits in n! (Part 4)

When I was in school, I stared at this graph for weeks, if not months, trying to figure out an equation for the number of digits in n!. And I never could figure it out. Even though I was never able to work this out for myself, there is a very good estimate that comes from Stirling’s approximation. The next integer after \log_{10} n! gives the number of digits in n!, and

\log_{10} n! \approx \displaystyle \frac{\left(n + \displaystyle \frac{1}{2} \right) \ln n - n + \displaystyle \frac{1}{2} \ln (2\pi)}{\ln 10}

The graph below shows just how accurate this approximation really is. The solid curve is the approximation; the dots are the values of \log_{10} n!. Not bad at all… the error in the approximation is smaller than the size of the dots.

stirling

The following output from a calculator shows just how close the approximation to \log_{10} 69! is to the real answer. There are additional terms in Stirling’s series that would give even closer answers.

stirling69-2

stirling69
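
For readers who want to check this without a graphing calculator, here is a short Python sketch of the same computation; the exact value comes from Python’s arbitrary-precision integers, and the variable names are mine, just for illustration.

    import math

    n = 69

    # Exact value of log10(n!), using Python's arbitrary-precision integers
    exact = math.log10(math.factorial(n))

    # Stirling's approximation: ln n! ~ (n + 1/2) ln n - n + (1/2) ln(2 pi),
    # converted to base 10 by dividing by ln 10
    stirling = ((n + 0.5) * math.log(n) - n + 0.5 * math.log(2 * math.pi)) / math.log(10)

    print(exact)                  # approximately 98.233
    print(stirling)               # agrees with the exact value to several decimal places
    print(math.floor(exact) + 1)  # 99, the number of digits in 69!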

As I mentioned earlier in this series, I’m still mildly annoyed with my adolescent self that I wasn’t able to figure this out for myself… especially given the months that I spent staring at this problem trying to figure out the answer.

First, I’m annoyed that I didn’t think to investigate \log_{10} n!. I had ample experience using log tables (after all, this was the 1980s, before scientific calculators were in the mainstream) and I should have known this.

Second, I’m annoyed that I didn’t have at the tips of my fingers the change of base formula

\log_{10} n! = \displaystyle \frac{\ln n!}{\ln 10}

Third, I’m annoyed that, even though I knew calculus pretty well, I wasn’t able to get at least the first couple of terms of Stirling’s series on my own even though the derivation was entirely in my grasp. To begin,

\ln n! = \ln (1 \cdot 2 \cdot 3 \dots \cdot n) = \ln 1 + \ln 2 + \ln 3 + \dots + \ln n

For example, if n = 10, then \ln 10! would be the sum of the areas of the 9 rectangles shown below (since \ln 1 = 0):

stirlingintegral

The areas of these nine rectangles are closely approximated by the area under the curve y = \ln x between x = 1\frac{1}{2} and x = 10\frac{1}{2}. (Indeed, I chose a Riemann sum with midpoints so that the Riemann sum would closely approximate the integral.)

In general, for n! instead of 10!, we have

\ln n! \approx \displaystyle \int_{3/2}^{n+1/2} \ln x \, dx

This is a standard integral that can be obtained via integration by parts:

\ln n! \approx \bigg[ \displaystyle x \ln x - x \bigg]_{3/2}^{n+1/2}

\ln n! \approx \left[ \left(n + \displaystyle \frac{1}{2} \right) \ln \left(n + \displaystyle \frac{1}{2} \right) - \left(n + \displaystyle \frac{1}{2} \right) \right] - \left[ \displaystyle \frac{3}{2} \ln \frac{3}{2} - \frac{3}{2} \right]

\ln n! \approx \left(n + \displaystyle \frac{1}{2} \right) \ln \left(n + \displaystyle \frac{1}{2} \right) - n - \displaystyle \frac{3}{2} \ln \frac{3}{2} + 1

We can see that this is already taking the form of Stirling’s approximation, given above. Indeed, this is surprisingly close. Let’s use the Taylor approximation \ln(1+x) \approx x for x \approx 0:

\ln n! \approx \left(n + \displaystyle \frac{1}{2} \right) \ln \left(n \left[1 + \displaystyle \frac{1}{2n}\right] \right) - n - \displaystyle \frac{3}{2} \ln \frac{3}{2} + 1

\ln n! \approx \left(n + \displaystyle \frac{1}{2} \right) \left[\ln n + \ln \left(1 + \displaystyle \frac{1}{2n} \right) \right] - n - \displaystyle \frac{3}{2} \ln \frac{3}{2} + 1

\ln n! \approx \left(n + \displaystyle \frac{1}{2} \right) \left[\ln n + \displaystyle \frac{1}{2n} \right] - n - \displaystyle \frac{3}{2} \ln \frac{3}{2} + 1

\ln n! \approx \left(n + \displaystyle \frac{1}{2} \right) \ln n + \left(n + \displaystyle \frac{1}{2} \right) \displaystyle \frac{1}{2n} - n - \displaystyle \frac{3}{2} \ln \frac{3}{2} + 1

\ln n! \approx \left(n + \displaystyle \frac{1}{2} \right) \ln n +\displaystyle \frac{1}{2} + \frac{1}{4n} - n - \displaystyle \frac{3}{2} \ln \frac{3}{2} + 1

\ln n! \approx \left(n + \displaystyle \frac{1}{2} \right) \ln n - n + \left(\displaystyle \frac{3}{2} - \displaystyle \frac{3}{2} \ln \frac{3}{2} \right) + \displaystyle \frac{1}{4n}

By way of comparison, the first few terms of the Stirling series for \ln n! are

\ln n! \approx \left(n + \displaystyle \frac{1}{2} \right) \ln n - n + \displaystyle \frac{1}{2} \ln (2\pi) + \displaystyle \frac{1}{12n}

We see that the above argument, starting with an elementary Riemann sum, provides the first two significant terms in this series. Also, while the third term is incorrect, it’s closer to the correct third term than we have any right to expect:

\displaystyle \frac{3}{2} - \displaystyle \frac{3}{2} \ln \frac{3}{2} \approx 0.8918\dots

\displaystyle \frac{1}{2} \ln (2\pi) \approx 0.9189\dots

The correct third term of \displaystyle \frac{1}{2} \ln(2\pi) can also be found using elementary calculus, though the argument is much more sophisticated than the one above. See the MathWorld website for details.
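
For those who want to see the sizes of these errors numerically, here is a short Python comparison of the exact value of \ln n! with the elementary Riemann-sum approximation and the first terms of Stirling’s series; this is just a sketch to accompany the derivation above, not part of it.

    import math

    def ln_factorial_exact(n):
        # Exact ln(n!) from Python's arbitrary-precision factorial
        return math.log(math.factorial(n))

    def ln_factorial_riemann(n):
        # Elementary approximation from the midpoint Riemann sum:
        # (n + 1/2) ln(n + 1/2) - n - (3/2) ln(3/2) + 1
        return (n + 0.5) * math.log(n + 0.5) - n - 1.5 * math.log(1.5) + 1

    def ln_factorial_stirling(n):
        # First terms of Stirling's series:
        # (n + 1/2) ln n - n + (1/2) ln(2 pi) + 1/(12n)
        return (n + 0.5) * math.log(n) - n + 0.5 * math.log(2 * math.pi) + 1 / (12 * n)

    for n in (10, 69, 1000):
        print(n, ln_factorial_exact(n), ln_factorial_riemann(n), ln_factorial_stirling(n))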

The number of digits in n! (Part 3)

The following graph shows the number of digits in n! as a function of n.

factorialdigits

When I was in school, I stared at this graph for weeks, if not months, trying to figure out an equation that would fit these points. And I never could figure it out.

When I took calculus in college, I distinctly remember getting up the nerve to ask my professor, the great L. Craig Evans (now at UC Berkeley), if he knew how to solve this problem. To my great consternation, he immediately wrote down what I now realize to be the right answer, using Stirling’s approximation:

\ln n! \approx \left(n + \displaystyle \frac{1}{2} \right) \ln n - n + \frac{1}{2} \ln (2\pi)

While I now know that this was the way to go about solving this problem, I didn’t appreciate how this formula could help me at the time. I only saw the n! on the left-hand side and did not see the immediate connection between this formula and the number of digits in n!.

But now I know better.

For starters, the number of base-10 digits in a number n is always the next integer greater than \log_{10} n. For example, \log_{10} 2000 \approx 3.301, and the next integer larger than 3.301 is 4. Unsurprisingly, the number 2000 has 4 digits.

Second, the change of base formula for logarithms gives

\log_{10} n! = \displaystyle \frac{\ln n!}{\ln 10}

Therefore, the number of digits in n! will be about

\displaystyle \frac{\left(n + \displaystyle \frac{1}{2} \right) \ln n - n + \frac{1}{2} \ln (2\pi)}{\ln 10}
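
As a quick sanity check, here is a small Python sketch (the function names are mine, purely for illustration) comparing the exact digit count of n! with the count predicted by this formula:

    import math

    def digits_exact(n):
        # Count the digits of n! directly, using Python's arbitrary-precision integers
        return len(str(math.factorial(n)))

    def digits_stirling(n):
        # Next integer greater than log10(n!), with log10(n!) estimated by Stirling's approximation
        approx = ((n + 0.5) * math.log(n) - n + 0.5 * math.log(2 * math.pi)) / math.log(10)
        return math.floor(approx) + 1

    for n in (10, 69, 100, 1000):
        print(n, digits_exact(n), digits_stirling(n))   # the two counts agree for these n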

The graph below shows just how accurate this approximation really is. The solid curve is the approximation; the dots are the values of \log_{10} n!. (In other words, this series of dots is only slightly different from the dots above, which have integers as coordinates.) Not bad at all… the error in the approximation is smaller than the size of the dots in this picture.

stirling

The number of digits in n! (Part 2)

The following graph shows the number of digits in n! as a function of n.

factorialdigits

When I was in school, I stared at this graph for weeks, if not months, trying to figure out an equation that would fit these points. And I never could figure it out.

In retrospect, my biggest mistake was thinking that the formula had to be something like y = a x^m, where the exponent m was a little larger than 1. After all, the graph is clearly not a straight line, but it’s also not as curved as a parabola.

What I didn’t know then, but know now, is that there’s a really easy way to determine if a data set exhibits power-law behavior. If y = a x^m, then

\ln y = \ln a + m \ln x.

If we make the substitutions Y = \ln y, B = \ln a, and X = \ln x, then this equation becomes

Y = m X + B

In other words, if the data exhibits power-law behavior, then the log-transformed data would look very much like a straight line. Well, here’s the graph of (X,Y) after applying the transformation:

loglogfactorialdigits

Ignoring the first couple of points, the dots show an ever-so-slight concave-down pattern, but not enough that it would have discouraged me from blindly trying a pattern like y = a x^m. However, because these points do not lie on a straight line and exhibit heteroscedastic behavior, my adolescent self was doomed to failure.
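
For anyone who wants to reproduce this check today, here is a short Python sketch (using numpy for the least-squares fit; the variable names are just for illustration) that log-transforms the digit counts and fits a line:

    import math
    import numpy as np

    # Number of digits in n! for n = 1, ..., 100
    ns = np.arange(1, 101)
    digits = np.array([len(str(math.factorial(int(n)))) for n in ns])

    # Log-transform both coordinates: if digits = a * n^m, then
    # ln(digits) = ln(a) + m * ln(n), which would plot as a straight line
    X = np.log(ns)
    Y = np.log(digits)

    # Least-squares fit of a line to the transformed data
    m, B = np.polyfit(X, Y, 1)
    print("slope:", m, "intercept:", B)

    # The residuals show the slight concave-down pattern visible in the plot,
    # which is why the data is not well described by a power law
    print(Y - (m * X + B))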

The number of digits in n! (Part 1)

When I was in school, perhaps my favorite pet project was trying to find a formula for the number of digits in n!. For starters:

  • 0! = 1: 1 digit
  • 1! = 1: 1 digit
  • 2! = 2: 1 digit
  • 3! = 6: 1 digit
  • 4! = 24: 2 digits
  • 5! = 120: 3 digits
  • 6! = 720: 3 digits
  • 7! = 5040: 4 digits
  • 8! = 40,320: 5 digits

I owned what was then a top-of-the-line scientific calculator (with approximately the same computational capability as a modern TI-30), and I distinctly remember making a graph like the following on graph paper. The above calculations contribute the points (0,1), (1,1), (2,1), (3,1), (4,2), (5,3), (6,3), (7,4), and (8,5).

factorialdigits

I had to stop (or, more accurately, I thought I had to stop) at 69! because my calculator couldn’t handle numbers larger than 10^{100}.

I stared at this graph for weeks, if not months, trying to figure out an equation that would fit these points. And I never could figure it out.

And, to this day, I’m somewhat annoyed at my adolescent self that I wasn’t able to figure out this puzzle for myself… since I had all the tools in my possession needed to solve the puzzle, though I didn’t know how to use the tools.

In this series of posts, I’ll answer this question with the clever application of some concepts from calculus and precalculus.
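
Today, of course, a few lines of Python will generate the entire data set in an instant, with no 10^{100} ceiling. Here is a quick sketch, just to show the data behind the graph:

    from math import factorial

    # Number of digits in n! for n = 0, 1, 2, ..., 100
    # Python's integers never overflow, so n can go far past 69
    for n in range(0, 101):
        print(n, len(str(factorial(n))))

    # Reproduces the points above: (0, 1), (1, 1), (2, 1), (3, 1), (4, 2), ...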

Too many significant digits (Part 2)

In yesterday’s post, I had a little fun with this claim that the Nike app could measure distance to the nearest trillionth of a mile. The more likely scenario is that the app just reported all of the digits of a double-precision floating point number, whether or not they were significant.

significantdigits

In real life, I’d expect that the first three decimal places are accurate, at most. According to Wikipedia, the official length of a marathon is 42.195 kilometers, but any particular marathon course may be off by as much as 42 meters (0.1% of the total distance) to account for slight measurement errors when laying out a course of that length.

A history of the Jones-Oerth Counter, the device used to measure the distance of road running courses, can be found at the USA Track and Field website. And my friends who are serious runners swear that the Jones-Oerth Counter is much more accurate than GPS.

The same story often appears in students’ homework. For example:

If a living room is 17 feet long and 14 feet wide, how long is the diagonal distance across the room?

Using the Pythagorean theorem, students will find that the answer is \sqrt{485} feet. Then they’ll plug this into a calculator and write down the answer on their homework: 22.02271555 feet.

This answer, of course, is ridiculous because a standard ruler cannot possibly measure a distance that precisely. The answer follows from the false premise that the numbers 17 and 14 are somehow exact, with absolutely no measurement error. My guess is that at most two decimal places are significant (i.e., the numbers 17 and 14 can be measured accurately to within one-hundredth of a foot, or about one-eighth of an inch).
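
Here is a quick Python illustration of reporting the answer with a defensible number of digits (the choice of two decimal places is just for illustration, following the guess above):

    import math

    length = 17.0  # feet, measured to roughly the nearest hundredth of a foot at best
    width = 14.0   # feet

    diagonal = math.hypot(length, width)   # sqrt(17**2 + 14**2) = sqrt(485)

    print(diagonal)            # 22.02271555... (far more digits than the measurements support)
    print(round(diagonal, 2))  # 22.02 -- a more honest report given the data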

My experience is that not many students are comfortable with the concept of significant digits (or significant figures), even though this is a standard topic in introductory courses in chemistry and physics. An excellent write-up of the issues can be found here: http://www.angelfire.com/oh/cmulliss/

Other resources:

http://mathworld.wolfram.com/SignificantDigits.html

http://en.wikipedia.org/wiki/Significant_figures

http://en.wikipedia.org/wiki/Significance_arithmetic


Too many significant digits (Part 1)

The following appeared on my Facebook feed a while back:

significantdigits

Just look at that: the Nike app claimed to measure the length of my friend’s run with twelve decimal places of accuracy.

Let’s have some fun with this. Just suppose that the app was able to measure distance to the nearest trillionth of a mile. One trillionth of a mile is…

5.28 billionths of a foot,

or about 63.4 billionths of an inch,

or about 161 billionths of a centimeter,

or about 1.61 billionths of a meter,

or about 1.61 nanometers.
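
These conversions are easy to verify with a few lines of Python (a quick sketch using the standard definitions 1 mile = 5280 feet and 1 mile = 1609.344 meters):

    trillionth_of_a_mile = 1e-12  # miles

    feet = trillionth_of_a_mile * 5280        # 5.28e-09 feet
    inches = feet * 12                        # about 6.34e-08 inches
    meters = trillionth_of_a_mile * 1609.344  # about 1.61e-09 meters
    centimeters = meters * 100                # about 1.61e-07 centimeters
    nanometers = meters * 1e9                 # about 1.61 nanometers

    print(feet, inches, centimeters, nanometers)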

By way of comparison, the fingernails on the average adult grow about 3 millimeters a month. A world-class runner could run 6.25 miles in about 30 minutes; in those 30 minutes, his/her fingernails would grow about 2 microns, or about 2000 nanometers. (Of course, they’ll grow more for less athletic runners covering the same distance at a slower speed.)

So if the Nike app can measure my distance to the nearest trillionth of a mile, it would have absolutely no difficulty measuring how much my fingernails grew while running.

Or, it could be that the Nike app really isn’t measuring the distance all that precisely. Probably the app used double-precision arithmetic, and whoever programmed the app didn’t tell it to truncate after a reasonable number of digits.

Lychrel Numbers

A friend of mine posted the following on Facebook (with names redacted):

So [my daughter] comes home with this assignment:

For each number from 10 – 99, carry out the following process.

  1.  If the number is a palindrome (e.g., 77), stop.
  2.  Else reverse the number and add that to the original. E.g.: 45+54 = 99.
  3.  If the result is not a palindrome, repeat step (2) with the result.
  4.  Record the final palindromic result and the number of steps taken.

Most are simple.

  • 46 + 64 = 110
  • 110 + 011 = 121
  • Stop. 2 steps taken.

The numbers 89 and 98 were given for extra credit, and they mysteriously explode, taking 24 steps. It made [my daughter] cry.

She wanted me to check her work, so I decided it was a good time to teach the wonders of Python, and we very quickly had a couple of simple functions to do the trick.

Well, you saw where this was going. How many steps does 887 take?

We’re up to 104000 steps so far, and Python is crying.

True or false: For a given n, the above algorithm completes in finite time?

I guess I’ve been living under a rock for the past 20 years, because I had never heard of this problem before. It turns out that numbers not known to lead to a palindrome are called Lychrel numbers. However, no number in base-10 has been proven to be a Lychrel number. The first few candidate Lychrel numbers (i.e., numbers that have not been proven to not be Lychrel numbers) are 196, 295, 394, 493, 592, 689, 691, 788, 790, 879, 887, 978, 986, 1495, 1497, 1585, 1587, 1675, 1677, 1765, 1767, 1855, 1857, 1945, 1947, 1997, 2494, 2496, 2584, 2586, 2674, 2676, 2764, 2766, 2854, 2856, 2944, 2946, 2996, 3493, 3495, 3583, 3585, 3673, 3675…

The above algorithm is called the 196-algorithm, after the smallest suspected Lychrel number.
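
A minimal Python sketch of the 196-algorithm, along the lines of the functions my friend describes, might look like this (the function names and the step cutoff are mine, purely for illustration):

    def reverse_and_add(n):
        # One step of the process: add n to its base-10 reversal
        return n + int(str(n)[::-1])

    def steps_to_palindrome(n, max_steps=1000):
        # Repeat reverse-and-add until a palindrome appears, or give up after max_steps
        steps = 0
        while str(n) != str(n)[::-1]:
            if steps >= max_steps:
                return None, n   # no palindrome found within max_steps
            n = reverse_and_add(n)
            steps += 1
        return steps, n

    print(steps_to_palindrome(46))      # (2, 121)
    print(steps_to_palindrome(89))      # (24, 8813200023188)
    print(steps_to_palindrome(887)[0])  # None -- still no palindrome after 1000 steps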

For further reading, I suggest the following links and the references therein:

http://mathworld.wolfram.com/196-Algorithm.html

http://mathworld.wolfram.com/LychrelNumber.html

http://www.p196.org/

http://www.mathpages.com/home/kmath004/kmath004.htm (which contains a proof that 10110 is a Lychrel number in binary and that Lychrel numbers always exist in base 2^k)

http://en.wikipedia.org/wiki/Lychrel_number

Proof without words: The difference of consecutive cubes

Source: https://www.facebook.com/photo.php?fbid=451328078334029&set=a.416585131808324.1073741827.416199381846899&type=1&theater

For a more conventional algebraic proof, notice that

(n+1)^3 - n^3 = n^3 + 3n^2 + 3n + 1 - n^3 = 3n(n+1) + 1

The product n(n+1) is always an even number times an odd number: if n is even, then n+1 is odd, but if n is odd, then n+1 is even. So n(n+1) is a multiple of 2, and so 3n(n+1) is a multiple of 6. Therefore, 3n(n+1)+1 is one more than a multiple of 6, proving the theorem.
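
For readers who want a quick numerical sanity check of this identity, a couple of lines of Python will do it (just an illustration, not a substitute for the proof):

    for n in range(1, 20):
        difference = (n + 1)**3 - n**3
        # Each difference equals 3n(n+1) + 1 and so leaves a remainder of 1 when divided by 6
        assert difference == 3 * n * (n + 1) + 1
        assert difference % 6 == 1
        print(n, difference)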