From the Washington Post:
Perls and his colleagues analyzed the genes of participants in the New England Centenarian Study, which is the largest study of centenarians and their families in the world. The study involves about 1,600 centenarians and has been ongoing since 1995.
"A lot of people might ask, 'Well, who would want to live to 100?' because they think they have every age-related disease under the sun and are on death's doorstep," Perls said. "But this isn't true. We have noted in previous work that 90 percent of centenarians are disability-free at the average age of 93."
They also noticed that longevity seemed to run in centenarians' families, indicating that genetics must play a role.
So the researchers compared the genes of 1,055 centenarians with 1,267 other people to see whether they could identify any unique patterns. Based on that work, the researchers identified 150 genetic variations that appeared to be associated with longevity that could be used to predict with 77 percent accuracy whether someone would live to be at least 100.
"Seventy-seven percent is a very high accuracy for a genetic model, which means that the traits that we are looking at have a very strong genetic base," said Paola Sebastiani, a professor of biostatistics at the Boston University School of Public Health who helped conduct the study.
I'm not interested enough in this topic to go find the paper, but my question would be whether the researchers split their sample in half, data-mined one half, then tested their findings on the other half. That's proper research hygiene, so that you don't just end up with a lot of small, random associations that happen to fit the data you mined.
But it's hard to make yourself do it. I remember taking a finance course at UCLA in 1981 where we had to do a SAS analysis of a hypothesis about patterns in the stock market to test the Efficient Markets Hypothesis. So, I typed in from the Baseball Encyclopedia the dates of World Series home games involving the Yankees going back to 1921, along with the stock market volume and the change in the Dow Jones average. The professor had said over and over that you had to divide your sample in two, but there just weren't enough home games, so I didn't do it and got only a B on the project. (I didn't find any effect on prices, but NYSE volume was down on days of Yankee World Series home games.)
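For anybody who wants to see why the split matters, here is a rough sketch in Python of the split-half check described above. To be clear, this is not the researchers' method and it uses no real data: the genotypes and the "centenarian" labels are pure coin flips, and the sample size, the number of variants, the number of variants kept, and all the function names are made up for illustration. Because nothing in the data is real, an honest test should come out at about coin-flip accuracy.

```python
import random


def make_noise_data(n_people, n_variants, rng):
    """Pure noise: random 0/1 variants and random 0/1 'centenarian' labels."""
    genotypes = [[rng.randint(0, 1) for _ in range(n_variants)]
                 for _ in range(n_people)]
    labels = [rng.randint(0, 1) for _ in range(n_people)]
    return genotypes, labels


def mine_variants(genotypes, labels, top_k):
    """Data-mine: keep the top_k variants whose frequency differs most
    between cases and controls, with the direction of each difference."""
    cases = [g for g, y in zip(genotypes, labels) if y == 1]
    controls = [g for g, y in zip(genotypes, labels) if y == 0]
    diffs = []
    for j in range(len(genotypes[0])):
        case_freq = sum(g[j] for g in cases) / len(cases)
        control_freq = sum(g[j] for g in controls) / len(controls)
        diffs.append((case_freq - control_freq, j))
    diffs.sort(key=lambda d: abs(d[0]), reverse=True)
    return [(j, 1 if diff > 0 else -1) for diff, j in diffs[:top_k]]


def score(person, model):
    """Add up the mined variants, each signed by its mined direction."""
    return sum(sign * person[j] for j, sign in model)


def fit_threshold(genotypes, labels, model):
    """Cutoff halfway between the mean case score and mean control score."""
    case_scores = [score(g, model) for g, y in zip(genotypes, labels) if y == 1]
    control_scores = [score(g, model) for g, y in zip(genotypes, labels) if y == 0]
    case_mean = sum(case_scores) / len(case_scores)
    control_mean = sum(control_scores) / len(control_scores)
    return (case_mean + control_mean) / 2


def accuracy(genotypes, labels, model, threshold):
    """Share of people whose predicted label matches their actual label."""
    hits = sum((score(g, model) > threshold) == (y == 1)
               for g, y in zip(genotypes, labels))
    return hits / len(labels)


def split_half_demo(seed=0, n_people=400, n_variants=1000, top_k=30):
    """Mine the first half of the sample, then test on the untouched half."""
    rng = random.Random(seed)
    genotypes, labels = make_noise_data(n_people, n_variants, rng)
    half = n_people // 2
    train_g, test_g = genotypes[:half], genotypes[half:]
    train_y, test_y = labels[:half], labels[half:]

    model = mine_variants(train_g, train_y, top_k)
    threshold = fit_threshold(train_g, train_y, model)

    in_sample = accuracy(train_g, train_y, model, threshold)
    out_of_sample = accuracy(test_g, test_y, model, threshold)
    return in_sample, out_of_sample


if __name__ == "__main__":
    in_sample, out_of_sample = split_half_demo()
    print(f"accuracy on the half that was mined: {in_sample:.2f}")
    print(f"accuracy on the held-out half:       {out_of_sample:.2f}")
```

The accuracy on the mined half should look impressive even though there is nothing to find, while the held-out half should land near 50 percent. That gap is the whole argument for splitting the sample. It is also why an accuracy figure means little until you know which half it was measured on.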