The NCLB Fiasco After Ten Years: What Were Bush And Kennedy Thinking?

It has finally dawned on me what Ted Kennedy, George W. Bush, and company were probably thinking ten years ago when they came up with the ill-fated No Child Left Behind school reform act.

Until now, I've never been able to grasp what kind of picture they had in their heads when they decided that the way to close the black/Hispanic-white/Asian Achievement Gap was through more school testing.

Mandating frequent testing of K-12 students to solve the problem of cognitive inequality always struck me as much like trying to solve the problem of height inequality by requiring that everybody play a lot of basketball. That wouldn't make short people taller—it would just make their shortness more obvious.

But now, I think, I've finally stumbled upon the wacky analogy unconsciously underlying the conventional wisdom about how more school testing would Leave No Child Behind. So I'm going to take some time to explain the mental framework behind so much of mainstream school reform thinking, as exemplified by Bush's popular soundbite on "the soft bigotry of low expectations".

The great thing about espousing the conventional wisdom, as Kennedy and Bush did, is that you don't have to justify it very much. You just handwave problems away. And, because you aren't questioned aggressively about what model you have in mind for how your law is supposed to work, it's hard for skeptics to instill any doubts.

Recently, the Obama Administration has been waking up to the fact that the NCLB mandate requiring every public school student in America to score "proficient" (on a scale running from "below basic" to "basic" to "proficient" to "advanced") on both reading and math tests by 2014 has left them with a big mess.

Barack Obama has been campaigning to change the law, because up to 80 percent of schools will otherwise be officially found "failing" this year and therefore must be punished. He complained last Monday:

"That's an astonishing number. … We know that four out of five schools in this country aren't failing. So what we're doing to measure success and failure is out of line." [Obama Urges Education Law Overhaul, By Helene Cooper, NYT, March 15, 2011]

Why the sudden surge in schools classified as "failing" under the NCLB?

Because, while the people who wrote the NCLB might have been naive about testing, they weren't at all naive about the politics of blame. Accordingly, they required only modest improvement in test scores during the early years of NCLB. The goal of 100 percent proficiency was cleverly backloaded.

This has created a hockey stick-like profile of required performance that zooms up to 100% proficiency at the last moment—like the student who tells his parents at 8pm they must go to the store now because, he just remembered, his big science fair project is due in the morning.

And, this way, most of the penalties wouldn't kick in until the second decade of NCLB's existence—when, by the way, Kennedy is dead and Bush retired.

Amazingly, however, it now turns out that passing a federal law requiring every student in the country to be "proficient" at reading and math really doesn't mean they will be, in fact, "proficient". (With students, failure is always an option.)

Why did the public policy mainstream place so much faith in the power of K-12 testing to eliminate achievement inequality? Why did they think that it was feasible for everybody (or practically everybody, for that matter) to become "proficient"? Why did they think it was crucial for everybody to pass—but not very important for some people to do much better than passing?

I've been stumped by these questions for a decade.

At last, however, I've realized there is at least one common kind of test that does function rather like what the conventional wisdom expected from K-12 tests.

Almost everybody (outside a few cities) takes this particular kind of test. And the great majority of them eventually buckle down and pass it.

And passing is all that's required. While high scorers on the SAT get into fancier colleges and are more likely to go on to postgraduate studies, high scorers on this other kind of test aren't treated much differently from those who just scrape by.

So what is this widespread test that provides the mental template for NCLB and for so much else in the school reform brouhaha?

In the interests of dramatic suspense, I'm not going to reveal it yet. See if you can guess!

There's a fundamental distinction about testing that is poorly understood. Tests can be thought of as measuring either:  

  1. Relative performance versus other test-takers; or
     
  2. Absolute performance against some predetermined level of adequate learning.

Most school reform rhetoric assumes that the latter is how tests inevitably work. That's why we hear constantly about how we must make the standards more rigorous to raise performance.
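To make the distinction concrete, here is a minimal sketch that scores the same handful of raw results both ways. The raw scores, the cutoff of 70, and the function names are all made up for illustration; no real test is scored exactly like this.

```python
def absolute_results(raw_scores, passing_score):
    """Absolute scoring: pass/fail against a fixed cutoff, ignoring everyone else."""
    return [score >= passing_score for score in raw_scores]


def relative_results(raw_scores):
    """Relative scoring: percent of test-takers who scored strictly below each score."""
    n = len(raw_scores)
    return [
        100.0 * sum(other < score for other in raw_scores) / n
        for score in raw_scores
    ]


# Hypothetical raw scores for five test-takers (illustrative only).
raw = [55, 70, 70, 85, 90]

passed = absolute_results(raw, passing_score=70)  # hypothetical cutoff
ranks = relative_results(raw)

print(passed)
print(ranks)
```

Under the absolute rule, everybody could pass if everybody cleared the cutoff. Under the relative rule, somebody always ranks at the bottom no matter how well the whole group does.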

The idea of an absolute test is much easier to give a pep talk about:

"Every single one of you must learn how to use the quadratic formula! It will be hard, you will lose sleep studying it, but I know that, in the end, each and every one of you can and will do it!!"

The problem with our thinking about education in modern America is not that adults address children in this manner. That's good. The problem is that adults are supposed to talk to other adults about education as if they, too, were children.

A grown-up conversation about school reform ought to mention that the most obvious analogs for K-12 achievement tests are college admissions tests, such as the SAT and ACT. And those are relativist rather than absolutist tests. The SAT clearly doesn't make people more equal. The SAT is deliberately designed to leave many children "behind" and send a few far ahead.

The SAT doesn't have a passing score to push everybody toward any "basic" or "proficient" minimum competence. Instead, the SAT elaborately distinguishes among students for the benefit of exclusive colleges.

And it more or less works at what it says it does.

Indeed, a large fraction of the most successful and enduring tests (such as the SAT, ACT, SAT Subject Tests, GRE, LSAT, MCAT, DAT, GMAT, and the military's AFQT) are built upon the assumption that human performance is distributed relativistically upon a bell curve. A median score and a standard deviation are determined. Thus, if the median is, say, 500 and the standard deviation is 100, somebody who scores a 600 ranks at the 84th percentile.
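The 84th percentile figure can be checked with a short sketch, assuming scores follow an idealized normal (bell) curve with the median of 500 and standard deviation of 100 used above. Real test scales are only approximately normal, and the function name here is hypothetical.

```python
import math


def percentile_from_score(score, mean=500.0, sd=100.0):
    """Percent of test-takers expected to score below `score`,
    assuming an idealized normal distribution (an assumption, not
    the published scaling procedure of any particular test)."""
    z = (score - mean) / sd  # distance from the middle, in standard deviations
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


print(round(percentile_from_score(600), 1))  # one standard deviation above the median
print(round(percentile_from_score(500), 1))  # exactly at the median
```

A score one standard deviation above the median lands at roughly the 84th percentile, and a score at the median lands at the 50th.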

Among famous education exams, the Advanced Placement test started off absolute—scoring 3 on a 1 to 5 scale was "passing". But colleges have been gradually relativising how they treat scores—e.g., a 5 on U.S. History might get you credit for two semesters of history, a 4 gets you one semester, and 3 or less nothing. It's up to each college: MIT only blesses 5s and Caltech doesn't give any credit. Moreover, less than ten percent of 17-year-olds take any single AP test in a year, so the AP isn't much of a model for the mass of students.

Universities use SAT-type test scores as each sees fit. Caltech, for instance, typically wants higher scoring students than the adjacent Pasadena City College, the junior college that my father attended in the 1930s.

Scoring upon a bell curve is both mathematically elegant and pragmatically useful, which is why it's so widely used. Psychometricians feel much more confident about what they are measuring when told to devise tests that are explicitly relativist.

But there are some assumptions behind bell curve scoring whose implications are not at all popular. When Richard Herrnstein and Charles Murray spelled these out in epic detail in 1994 in The Bell Curve, they didn't make themselves the toast of the town. ["The Bell Curve" and its critics, By Charles Murray, Commentary, May 1995]

The point of relativistic tests such as the SAT is not to make sure that every student knows what he or she needs to know: it's to find out who is best. Nor do these test designers claim that administering their test will make the students smarter. In fact, the designers worry when scores go up that perhaps somebody is gaming the test.

Of course, relativist tests do top out at some score. But that's merely for convenience and cost-effectiveness. It's obvious that an 800 on the SAT doesn't necessarily represent the ultimate in human intellectual proficiency. What would John Updike have scored on the SAT Verbal if the test had been 48 hours long? 1050? What would John von Neumann have scored on the SAT Math? 1100?

In turn, it's hard for those of us who grasp these basics of psychometrics to realize that people at the Kennedy/Bush level of intellectual sophistication and/or substance abuse don't find the logic of bell curve tests intuitive at all.

Further, to reach Kennedy/Bush levels of success in politics, you can't go around saying things like, "Well, obviously, your SAT score shows you aren't smart enough to get into Caltech, so you'd better come up with a more practical plan for your life."

In fact, today, you probably shouldn't even think that.

Instead, in public life, you get rewarded for uplifting demagoguery. When Bush attributed poor performance in school to "the soft bigotry of low expectations", he most likely quite sincerely meant it.

The conventional wisdom espouses the notion that there are academic accomplishments that every student should have because they are crucial for his future. For example, here's the United Federation of Teachers explicating this popular assumption:

"Algebra 2 is often described as a 'gateway course' because it correlates so closely with college success. Students who complete Algebra 2 are twice as likely to earn a bachelor's degree as students who do not, and passing Algebra 2 reduces the gap in college-completion rates between African American and Latino students and their white peers by half." [Beyond high school graduation: What the data tell us, by Maisie McAdoo, UFT.org, April 1, 2010]

In fact, in my experience, techniques that are reserved for Algebra 2 aren't going to be used on the job by the great majority of workers, even among college graduates. Nevertheless, I don't doubt that success in Algebra 2 in high school does correlate with success in college—even in classes that don't use Algebra 2 at all. That's because Algebra 2 measures logic, powers of abstraction, and work ethic, all of which are good things to have at college.

But you aren't supposed to think like that. You are supposed to think like this: Success in life correlates with graduating from college, which correlates with success in high school Algebra 2. Therefore, knowing Algebra 2 makes people a success in life—so the public schools must teach everybody Algebra 2!

Unfortunately, not everybody who takes Algebra 2 learns Algebra 2. Which is why the UFT is obliged to go on:

"In 2008, the Alliance for Excellent Education reported, of 90,000 students who took an end-of-course Algebra 2 exam, the average score was 27 percent."

The ensuing arguments in mainstream public policy discourse are over whom to blame: poverty, teachers, racism, government schools, parents, YouTube, or whatever.

So what common test is the closest model for school reform's conventional wisdom? What test is the opposite of the SAT in that all the emphasis is put upon achieving a passing score?

That's right, you guessed it: the driver's license test!

When the conventionally-minded imagine that K-12 tests will bring about equality, they are assuming that these tests will work more or less like the test your teenager takes down at the DMV. Compare it to the SAT: 

  • The driver's test is supposed to be scored in an absolute fashion, not relative to what everybody else is doing.
     
  • Given enough tries, most people eventually pass.
     
  • The whole point for driver's test takers is to reach the minimum level of competence to be allowed to drive. The passing score is intended to be good enough.
     
  • There are very few rewards for acing the driver's test. If you get a perfect score on the driver's test, you won't get a letter from Team Penske imploring you to try out to be one of their racecar drivers. NASA doesn't invite you to enroll in astronaut school.

Granted, for some people the driver's license exam is a stepping stone to harder license tests, such as for driving an 18-wheeler. For most people, though, it's the beginning and end of the line.

  • If the government makes the driver's test harder, teenagers will study more for it.

My impression is that the driver's test is more difficult than when I breezed through it in the lackadaisical 1970s. The average age when young adults get their first driver's license has gone up in many states. But for teens, this is a high-stakes test. So many work hard at studying for the written part and practicing for the behind-the-wheel part.

Obviously, once you come out and articulate this mindset, then the driver's license test sounds like a pretty dumb analogy for school achievement tests. Driver's licenses are absolute, school achievement tests relative—hence all the concern about The Gap.

It's easy to develop a more sensible goal than the NCLB's implicit intention of raising black and Hispanic average performance by about a standard deviation while simultaneously not letting whites and Asians improve (because that would merely perpetuate The Gap).

For example, we could try to raise everybody's absolute performance by half of a standard deviation. That would leave blacks and Hispanics better equipped to perform in the economy and in life.
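The arithmetic behind that kind of goal can be sketched under loudly stated assumptions: two hypothetical groups whose averages sit one standard deviation apart on an idealized bell curve, a fixed absolute bar set at the higher group's original average, and a half standard deviation gain for everyone. All the numbers and names below are illustrative, not measured data.

```python
import math


def share_above(cutoff, mean, sd=1.0):
    """Fraction of a normally distributed group scoring above a fixed cutoff."""
    z = (cutoff - mean) / sd
    return 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))


# Hypothetical setup, in standard deviation units (assumptions for illustration).
lower_mean, higher_mean = -1.0, 0.0  # group averages one SD apart
cutoff = 0.0                         # fixed absolute bar
gain = 0.5                           # everybody improves by half an SD

before = (share_above(cutoff, lower_mean), share_above(cutoff, higher_mean))
after = (
    share_above(cutoff, lower_mean + gain),
    share_above(cutoff, higher_mean + gain),
)
gap_before = higher_mean - lower_mean
gap_after = (higher_mean + gain) - (lower_mean + gain)

print([round(x, 3) for x in before])
print([round(x, 3) for x in after])
print(gap_before, gap_after)
```

In this sketch the share of the lower-scoring group clearing the fixed bar roughly doubles, from about 16 percent to about 31 percent, while the distance between the two averages stays exactly where it was.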

Of course, given the nature of the IQ Bell Curve, it would also leave whites and Asians still better equipped. But is our objective improving everyone's potential, or equality?

A good question. However, we needn't face it—because any common sense goals in education policy are unlikely as long as the conventional wisdom is protected from serious questioning.

Amusing end note: Ironically, both Ted Kennedy and George W. Bush had a lot of time during their adult years to think about drivers' licenses.

George W. Bush had his license suspended for either one month or two years (sources differ) after his drunk driving arrest in 1976.

And Teddy's punishment for killing that poor girl at Chappaquiddick was having his driver's license suspended too—for a whole year!

[Steve Sailer (email him) is movie critic for The American Conservative. His website www.iSteve.blogspot.com features his daily blog. His new book, AMERICA'S HALF-BLOOD PRINCE: BARACK OBAMA'S "STORY OF RACE AND INHERITANCE", is available here.]