It occurred to me that with so many reports of clinical studies quoted in the news, I should review this topic, or, rather, supplement my earlier blog, which discussed the difference between absolute and relative risk and the validity of a verbal report as compared to a published journal article, among other topics.
Let me begin by re-emphasizing that correlation is not causation: if people with disease A do or do not ingest substance B on a regular basis, this does not necessarily mean that you can reduce your risk of disease A by ingesting or refraining from substance B. However, it is fair to say that if every study shows a correlation (such as between smoking and lung cancer), then there probably is a causal relationship, bearing in mind that it is still possible that there is a common denominator of which we are unaware that links A and B. The classic example is the correlation between coffee drinking and coronary artery disease, where the confounding effect was that coffee drinkers were more likely to be smokers.
There are two classic errors that can occur in a clinical study. A type I error is when the scientist mistakenly concludes that the intervention has an effect on the incidence of the disease when in fact it does not. A type II error is when the scientist mistakenly concludes that the intervention has no effect when in fact it does. Most studies are designed to minimize and accurately define the chance of a type I error, and a positive study is generally accepted as statistically significant if the probability of a type I error is less than 5%. This means that there is one chance in twenty that the conclusion is wrong, which is why I tell my patients to wait for a second prospective study. The chance of a type I error can never be zero. The chance of a type II error shrinks as the number of people in the study grows (the smallest effect that can be reliably detected falls roughly with the square root of the sample size), so by including enough patients, the likelihood of a type II error can be made as small as you like, at additional expense and inconvenience to the investigator.
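The link between sample size and the type II error can be sketched numerically. The short Python sketch below is my own illustration, not from any particular study: it uses the standard normal approximation to a two-proportion test, with hypothetical incidence rates of 20% in the untreated group and 15% in the treated group, and shows the power (the chance of avoiding a type II error) climbing as the number of patients per arm grows.

```python
import math

def power_two_proportions(p_control, p_treated, n_per_arm, alpha=0.05):
    """Approximate power (1 minus the type II error rate) of a two-arm trial
    comparing disease incidence, via the normal approximation to the
    two-sided two-proportion z-test.  Illustrative only."""
    z_alpha = 1.96  # two-sided 5% critical value
    p_bar = (p_control + p_treated) / 2
    se_null = math.sqrt(2 * p_bar * (1 - p_bar) / n_per_arm)
    se_alt = math.sqrt(p_control * (1 - p_control) / n_per_arm
                       + p_treated * (1 - p_treated) / n_per_arm)
    z = (abs(p_control - p_treated) - z_alpha * se_null) / se_alt
    # Standard normal CDF built from the error function
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Hypothetical trial: 20% incidence untreated vs 15% treated
for n in (200, 800, 3200):
    print(n, round(power_two_proportions(0.20, 0.15, n), 2))
```

With these made-up rates, a few hundred patients per arm would miss the effect more often than not, while a few thousand would almost never miss it, which is exactly why large trials are expensive.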
There are two types of studies: retrospective and prospective. A retrospective study interviews people with disease A and compares the results with a matched set of individuals (matched as to age, sex, race, and/or other qualities as closely as possible). The study then tries to find differences in the past histories of the two groups: did they go to graduate school, do they exercise, do they take Vitamin E, or whatever the scientist feels like studying. It is statistically fairer and more rigorous to specify in advance which action you are studying, rather than to do a blanket survey for differences. By the laws of chance, you are virtually guaranteed to make a type I error if you survey enough interventions. To be called an accurate marksman, you must specify which leaf on the tree you are trying to hit before you fire a bullet into its branches and see that you hit a leaf.
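The blanket-survey pitfall can be made concrete with a small simulation of my own devising (the numbers are hypothetical). Suppose none of 20 surveyed interventions truly matters, yet each test is run at the usual 5% significance level: the chance that at least one comes up "significant" by pure accident is 1 − 0.95²⁰, roughly 64%.

```python
import random

random.seed(1)

def fraction_surveys_with_false_positive(n_interventions=20, alpha=0.05,
                                         n_surveys=10_000):
    """Simulate blanket surveys in which no intervention has any real effect.

    Each intervention independently comes up 'significant' with probability
    alpha (a pure type I error).  Return the fraction of simulated surveys
    reporting at least one such false positive."""
    hits = 0
    for _ in range(n_surveys):
        if any(random.random() < alpha for _ in range(n_interventions)):
            hits += 1
    return hits / n_surveys

# Analytically: 1 - (1 - 0.05)**20 is about 0.64
print(round(fraction_surveys_with_false_positive(), 2))
```

Shooting first and circling a leaf afterward, in other words, is almost guaranteed to "hit" something.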
A prospective study can be designed so as to accurately calculate the chance of a type I or type II error. In such a study, we take a matched set of patients, perform intervention B on one set, and compare the incidence of disease A in both sets after a fixed time period, which may be as long as five years for a dietary intervention, and as short as 30 days if we are looking for the occurrence of a second heart attack after the first one. In this case we have defined, in advance, both the intervention and the disease for which we are looking.
Most of the medical news with which we are bombarded has to do with retrospective studies, and we are assailed with the information that intervention A is good or bad for you based on them. A retrospective study should only be used to suggest a hypothesis, which then should be tested by a prospective study. Any retrospective study is filled with pitfalls, beginning with relying on the subject's memory of what he or she did or didn't do, e.g. the number of aspirin tablets you took per week, on average, over the past five years. And then we have headlines such as "more coffee drinkers get pancreatic cancer" (subsequently shown to be untrue). But since every retrospective study shows that adult females who drink two or more cups of coffee a day have a lower incidence of adult onset diabetes, there probably is a causal link between the two.
Be especially wary of the claim that people who live in a certain area have a higher or lower incidence of a disease than the nation as a whole. By the laws of statistics, the population of most counties or cities will fall either above or below the national average of almost everything. We don't even have consistent results comparing the lifespans of taller and shorter people, or of right-handed people to left-handed people.
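This point, too, can be illustrated with a toy simulation of my own (the disease rate and county sizes are hypothetical): even when every county's true risk exactly equals the national rate, sampling chance alone puts almost every county's observed rate above or below that average.

```python
import random

def county_deviation_counts(national_rate=0.01, county_size=5_000,
                            n_counties=100, seed=0):
    """Simulate counties whose true disease risk equals the national rate,
    and count how many nevertheless observe a rate above or below it.
    All figures are hypothetical, chosen only for illustration."""
    rng = random.Random(seed)
    above = below = 0
    for _ in range(n_counties):
        cases = sum(1 for _ in range(county_size)
                    if rng.random() < national_rate)
        observed_rate = cases / county_size
        if observed_rate > national_rate:
            above += 1
        elif observed_rate < national_rate:
            below += 1
    return above, below

above, below = county_deviation_counts()
print(above, below)  # nearly every county lands above or below the average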
When I read about the results of a retrospective study, I say to myself, "Very interesting. I wonder what a prospective study would show." It has been my experience that most retrospective conclusions are not validated by the ensuing prospective study, and very often a second prospective study does not agree with the first. The Lancet ran back-to-back articles with opposite conclusions as to whether or not a low-salt diet prevented hypertension.
One last point. Often it is noticed that people with a certain medical problem (e.g. ASCVD) have an abnormal test result (such as an elevated CRP). We then jump to the erroneous conclusion that lowering the CRP will reduce the incidence of coronary artery disease, a conclusion that has never been validated. My analogy to this is that pulling down on the metal floor indicator in the lobby of an office building will not bring the elevator down.