Epidemiology - Drug Baron

Yearly Archive

Yearly Archives: 2012

June 11, 2012

Constructing better multivariate biomarker composites

The earliest biomarkers, such as body temperature or blood pressure, were single measurements that reflected multiple physiological processes. Today, though, our reductionist approach to biology has turned up the resolution of our lens: we can measure levels of individual proteins, metabolites and nucleic acid species, opening the biomarker floodgates.

But this increased resolution has not necessarily translated into increased power to predict. The principal use of biomarkers, after all, is to take things that are easy to measure and use them to predict more complex biological phenomena. Unfortunately, the levels of most individual molecular species are, on their own, a poor proxy for physiological processes that involve dozens or even hundreds of component pathways.

The solution is to combine individual markers into more powerful signatures. Biomarkers like body temperature allow physiology to perform the integration step. But for individual molecular biomarkers that job falls to the scientist.

Unsurprisingly, the success of such efforts is patchy – simply because there are an infinite number of ways to combine individual molecular biomarkers into composite scores. How do you choose between linear and non-linear combinations, the magnitude of the coefficients and even, at the simplest level, which biomarkers to include in the composite score in the first place?

The first port of call is usually empiricism. Some form of prior knowledge is used to select an optimal combination. For example, I may believe that a couple of biomarkers are more likely to contribute than others and so I may give them a stronger weighting in my composite score. But with an infinite array of possible combinations it is hard to believe that this approach is going to come anywhere close to the optimum combination.
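To make the idea of a prior-weighted composite concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than anything from a real study: the marker names, the measurements and the weights are hypothetical, and standardizing each marker to a z-score before weighting is just one common way to put markers on a comparable scale.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize one marker so markers on different scales are comparable."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # A marker that never varies carries no information.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def weighted_composite(markers, weights):
    """Linear composite: weighted sum of z-scored markers, one score per subject."""
    standardized = {name: z_scores(vals) for name, vals in markers.items()}
    n_subjects = len(next(iter(markers.values())))
    return [
        sum(weights[name] * standardized[name][i] for name in weights)
        for i in range(n_subjects)
    ]


# Hypothetical measurements for four subjects (not real data).
markers = {
    "marker_a": [1.0, 2.0, 3.0, 4.0],
    "marker_b": [10.0, 30.0, 20.0, 40.0],
    "marker_c": [5.0, 5.0, 6.0, 4.0],
}

# Weights chosen from prior belief alone: marker_a is trusted most.
prior_weights = {"marker_a": 2.0, "marker_b": 1.0, "marker_c": 0.5}

scores = weighted_composite(markers, prior_weights)
```

Nothing in this calculation checks the weights against an outcome, which is exactly why such a composite may or may not be any good.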

Unless you have a predictive dataset, however, this kind of ‘stab in the dark’ combination is the best you can do. Just don’t be surprised if the resulting composite score is worse than any of the individual biomarkers that compose it.

With a dataset that combines measurements of each individual biomarker and the outcome being modeled, more sophisticated integration strategies become possible. The most obvious is to test each individual marker in turn for its association with the outcome and then combine those markers that show a statistically significant association. Perhaps you might even increase the weighting of the ones that are most strongly associated.
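The screen-then-combine strategy just described might look something like the following Python sketch. The details are assumptions made for illustration: association is measured with a Pearson correlation, significance is approximated with a Fisher z-transform (a normal approximation, not an exact test), the 0.05 threshold is conventional rather than prescribed, and the marker names and data are hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist, mean, pstdev


def pearson_r(x, y):
    """Pearson correlation between one marker and the outcome."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def approx_p_value(r, n):
    """Two-sided p-value from the Fisher z-transform (approximate; needs n > 3)."""
    if abs(r) >= 1.0:
        return 0.0
    z = atanh(r) * sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def screen_and_combine(markers, outcome, alpha=0.05):
    """Keep markers that pass a univariate test, then sum them weighted by r."""
    n = len(outcome)
    selected = {}
    for name, values in markers.items():
        r = pearson_r(values, outcome)
        if approx_p_value(r, n) < alpha:
            selected[name] = r  # stronger association, larger weight

    composite = [0.0] * n
    for name, r in selected.items():
        mu, sigma = mean(markers[name]), pstdev(markers[name])
        for i, value in enumerate(markers[name]):
            composite[i] += r * (value - mu) / sigma
    return selected, composite


# Hypothetical data for ten subjects (not real measurements).
outcome = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
markers = {
    "signal": [1.1, 1.9, 3.2, 3.8, 5.1, 6.0, 6.8, 8.1, 9.2, 9.9],
    "noise": [5, 3, 6, 4, 5, 6, 3, 5, 4, 6],
}

selected, composite = screen_and_combine(markers, outcome)
```

Note that each marker is tested in isolation here, so the procedure takes no account of how the selected markers relate to one another.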

But how powerful are these ad hoc marker composites?

From a theoretical perspective, one might imagine the answer is not very powerful at all. While common sense suggests that each time you add …
