Supplementary Online Material
A structural approach to selection bias
Miguel A. Hernán, Sonia Hernández-Díaz, James M. Robins
A.2. Hazard ratios as effect measures
The causal DAG in Appendix Figure 1a describes a randomized study of the effect of surgery E on death at times 1 (D1) and 2 (D2). Suppose the effect of exposure on D1 is protective. Then the lack of an arrow from E to D2 indicates that, although the exposure E has a direct protective effect (decreases the risk of death) at time 1, it has no direct effect on death at time 2. That is, exposure does not influence the survival status at time 2 of any subject who would survive past time 1 when unexposed (and thus when exposed). Suppose further that U is an unmeasured haplotype that decreases the subject’s risk of death at all times. The associational risk ratios ARRED1 and ARRED2 are unbiased measures of the effect of E on death at times 1 and 2, respectively. (Because of the absence of confounding, ARRED1 and ARRED2 equal the causal risk ratios CRRED1 and CRRED2, respectively.) Note that, even though E has no direct effect on D2, ARRED2 (or, equivalently, CRRED2) will be less than 1.0 because it is a measure of the effect of E on total mortality through time 2.
Consider now the time-specific associational hazard (rate) ratio as an effect measure. In discrete time, the hazard of death at time 1 is the probability of dying at time 1, and thus the associational hazard ratio at time 1 is the same as ARRED1. However, the hazard at time 2 is the probability of dying at time 2 among those who survived past time 1. The associational hazard ratio at time 2 is therefore ARRED2|D1=0. The square around D1 in Appendix Figure 1a indicates this conditioning. Exposed survivors of time 1 are less likely than unexposed survivors of time 1 to have the protective haplotype U (because exposure can explain their survival) and therefore are more likely to die at time 2. That is, conditional on D1=0, exposure is associated with a higher mortality at time 2. Thus the hazard ratio at time 1 is less than 1.0 whereas the hazard ratio at time 2 is greater than 1.0, i.e., the hazards have crossed. We conclude that the hazard ratio at time 2 is a biased estimate of the direct effect of exposure on mortality at time 2. The bias is selection bias arising from conditioning on D1, a common effect of exposure and of U. Because U is also a cause of D2, this conditioning opens the non-causal (i.e., associational) path E → D1 ← U → D2 between E and D2 (ref. 13). In the survival analysis literature, an unmeasured cause of death that is marginally unassociated with exposure, such as U, is often referred to as a frailty.
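The crossing of the hazards can be checked with a small exact calculation. The sketch below assumes a binary U with prevalence 0.5 and made-up risks that respect the structure of Appendix Figure 1a (E and U both protective at time 1, and the hazard at time 2 depending on U only). All numeric values and function names are hypothetical and are not taken from the paper.

```python
# Hypothetical parameters chosen for illustration; none come from the paper.
P_U = {0: 0.5, 1: 0.5}  # Pr[U=u]; U is independent of the randomized exposure E

# Pr[D1=1 | E=e, U=u]: both surgery (E=1) and the haplotype (U=1) lower the risk.
P_D1 = {(0, 0): 0.8, (0, 1): 0.4, (1, 0): 0.4, (1, 1): 0.2}

# Pr[death at time 2 | survived time 1, U=u]: depends on U only (no arrow E -> D2).
P_D2 = {0: 0.5, 1: 0.1}


def survival_time1(e):
    """Pr[D1=0 | E=e], averaging over the unmeasured U."""
    return sum(P_U[u] * (1 - P_D1[(e, u)]) for u in P_U)


def hazard_time2(e):
    """Pr[death at time 2 | D1=0, E=e]."""
    dies_at_2 = sum(P_U[u] * (1 - P_D1[(e, u)]) * P_D2[u] for u in P_U)
    return dies_at_2 / survival_time1(e)


def cumulative_risk(e):
    """Pr[dead by time 2 | E=e], i.e. total mortality through time 2."""
    return (1 - survival_time1(e)) + survival_time1(e) * hazard_time2(e)


# In discrete time the hazard at time 1 is simply the risk at time 1.
hazard_ratio_1 = (1 - survival_time1(1)) / (1 - survival_time1(0))
hazard_ratio_2 = hazard_time2(1) / hazard_time2(0)
risk_ratio_2 = cumulative_risk(1) / cumulative_risk(0)

print(f"hazard ratio at time 1: {hazard_ratio_1:.3f}")
print(f"hazard ratio at time 2: {hazard_ratio_2:.3f}")
print(f"risk ratio through time 2: {risk_ratio_2:.3f}")
```

With these illustrative numbers the hazard ratio is 0.5 at time 1 and about 1.36 at time 2, while the risk ratio for total mortality through time 2 stays below 1.0 (about 0.72).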
In contrast to the above, the conditional hazard ratio ARRED2|D1=0,U at time 2 given U is equal to 1.0 within each stratum of U because the path E → D1 ← U → D2 between E and D2 is now blocked by conditioning on the noncollider U. Thus the conditional hazard ratio correctly indicates the absence of a direct effect of E on D2. The fact that the unconditional hazard ratio ARRED2|D1=0 differs from the common stratum-specific hazard ratio of 1.0, even though U is independent of E, shows the noncollapsibility of the hazard ratio (ref. 26).
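One way to see why the marginal and stratum-specific hazard ratios disagree is to write the hazard at time 2 among survivors as a weighted average over U. The identity below is a sketch that assumes only the structure of Appendix Figure 1a: U is marginally independent of E, and the hazard at time 2 given U does not depend on E. Here D_2=1 denotes death at time 2 among those with D_1=0.

```latex
\Pr[D_2=1 \mid D_1=0, E=e]
  = \sum_{u} \Pr[D_2=1 \mid D_1=0, U=u]\;\Pr[U=u \mid D_1=0, E=e],
\qquad
\Pr[U=u \mid D_1=0, E=e]
  = \frac{\Pr[D_1=0 \mid E=e, U=u]\,\Pr[U=u]}
         {\sum_{u'} \Pr[D_1=0 \mid E=e, U=u']\,\Pr[U=u']}.
```

The stratum-specific hazards are the same for e=1 and e=0, but the weights are not, because survival past time 1 depends on both E and U. The ratio of the two weighted averages therefore differs from 1.0 even though every stratum-specific ratio equals 1.0.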
Unfortunately, the unbiased measure ARRED2|D1=0,U of the direct effect of E on D2 cannot be computed because U is unobserved. In the absence of data on U, it is impossible to know whether exposure has a direct effect on D2. That is, the data cannot determine whether the true causal DAG generating the data is that in Appendix Figure 1a or that in Appendix Figure 1b.
A.3. Effect modification and common effects in DAGs
Although an arrow on a causal DAG represents a direct effect, a standard causal DAG does not distinguish a harmful effect from a protective effect. Similarly, a standard DAG does not indicate the presence of effect modification. For example, although Appendix Figure 1a implies that both E and U affect death D1, the DAG does not distinguish among the following three qualitatively distinct ways that U could modify the effect of E on D1:
1. The causal effect of exposure E on mortality D1 is in the same direction (i.e., harmful or beneficial) in both stratum U=1 and stratum U=0.
2. The direction of the causal effect of exposure E on mortality D1 in stratum U=1 is the opposite of that in stratum U=0 (i.e., there is a qualitative interaction between U and E).
3. Exposure E has a causal effect on D1 in one stratum of U but no causal effect in the other stratum, e.g., E only kills subjects with U=0.
Because standard DAGs do not represent interaction, it is not possible to infer from a DAG the direction of the conditional association between two marginally independent causes (E and U) within strata of their common effect D1. For example, suppose that, in the presence of an undiscovered background factor V that is unassociated with E or U, having either E=1 or U=1 is sufficient and necessary to cause death (an ‘or’ mechanism), but that neither E nor U causes death in the absence of V. Then among those who died by time 1 (D1=1), E and U will be negatively associated, as it is more likely that an unexposed subject (E=0) had U=1 because the absence of exposure increases the chance that U was the cause of death. (Indeed the logarithm of the conditional odds ratio ORUE|D1=1 will approach minus infinity as the population prevalence of V approaches 1.0.) Although this ‘or’ mechanism was the only explanation given in the main text for the conditional association of independent causes within strata of a common effect, other possibilities exist. For example, suppose that in the presence of the undiscovered background factor V, having both E=1 and U=1 is sufficient and necessary to cause death (an ‘and’ mechanism) and that neither E nor U causes death in the absence of V. Then, among those who die by time 1, those who had been exposed (E=1) are more likely to have the haplotype (U=1), i.e., E and U are positively correlated. A standard DAG such as that in Appendix Figure 1a fails to distinguish the case of E and U interacting through an ‘or’ mechanism from the case of an ‘and’ mechanism.
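The opposite signs of the conditional association under the two mechanisms can be illustrated numerically. The sketch below is not the authors' calculation. It assumes binary E, U, and V with hypothetical prevalences, and it adds a small background risk of death from other causes, an assumption made here only so that every cell of the 2×2 table among the dead is non-empty and the odds ratio is defined.

```python
from itertools import product
from math import log


def death_risk(e, u, mechanism, p_v, background):
    """Pr[D1=1 | E=e, U=u] under an 'or' or an 'and' mechanism.

    p_v is the prevalence of the background factor V; `background` is a
    hypothetical risk of death from unrelated causes (an added assumption).
    """
    if mechanism == "or":
        triggered = e == 1 or u == 1
    elif mechanism == "and":
        triggered = e == 1 and u == 1
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    p_mechanism = p_v if triggered else 0.0
    return 1 - (1 - background) * (1 - p_mechanism)


def log_odds_ratio_among_dead(mechanism, p_e=0.5, p_u=0.5, p_v=0.3, background=0.05):
    """log of the E-U odds ratio within the stratum D1=1 (E, U independent a priori)."""
    joint = {}
    for e, u in product((0, 1), repeat=2):
        prior = (p_e if e else 1 - p_e) * (p_u if u else 1 - p_u)
        joint[(e, u)] = prior * death_risk(e, u, mechanism, p_v, background)
    odds_ratio = joint[(1, 1)] * joint[(0, 0)] / (joint[(1, 0)] * joint[(0, 1)])
    return log(odds_ratio)


log_or_or = log_odds_ratio_among_dead("or")
log_or_and = log_odds_ratio_among_dead("and")
print(f"'or' mechanism:  log OR = {log_or_or:.3f}")
print(f"'and' mechanism: log OR = {log_or_and:.3f}")
```

Under these assumptions the log odds ratio among the dead is negative for the ‘or’ mechanism and positive for the ‘and’ mechanism, although both correspond to the same DAG.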
Although conditioning on the common effect D1 always induces a conditional association between independent causes E and U in at least one of the two strata of D1 (say, D1=1), there is a special situation under which E and U remain conditionally independent within the other stratum (say, D1=0). This situation occurs when the data follow a multiplicative survival model, that is, when the probability Pr[D1=0| U=u, E=e] of survival (i.e., D1=0) given E and U is equal to a product g(u) h(e) of functions of u and e. The multiplicative model Pr[D1=0| U=u, E=e] = g(u) h(e) is equivalent to the model that assumes the survival ratio Pr[D1=0| U=u, E=e] / Pr[D1=0| U=u, E=0] does not depend on u; this ratio equals h(e)/h(0), which is h(e) when h is scaled so that h(0)=1. (Note that if Pr[D1=0| U=u, E=e] = g(u) h(e), then Pr[D1=1| U=u, E=e] = 1 − [g(u) h(e)] does not follow a multiplicative mortality model. Hence, when E and U are conditionally independent given D1=0, they will be conditionally dependent given D1=1.)
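The conditional independence given D1=0 follows in one step from Bayes' rule. The derivation below is a sketch that assumes only what the text states: E and U are marginally independent, and survival factorizes as g(u) h(e).

```latex
\Pr[U=u, E=e \mid D_1=0]
  = \frac{g(u)\,h(e)\,\Pr[U=u]\,\Pr[E=e]}
         {\sum_{u',e'} g(u')\,h(e')\,\Pr[U=u']\,\Pr[E=e']}
  = \frac{g(u)\,\Pr[U=u]}{\sum_{u'} g(u')\,\Pr[U=u']}
    \times
    \frac{h(e)\,\Pr[E=e]}{\sum_{e'} h(e')\,\Pr[E=e']}
```

The joint distribution of U and E among survivors is a product of a function of u and a function of e, which is exactly conditional independence given D1=0. The same argument fails for D1=1 because 1 − g(u) h(e) does not factorize.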
Biologically, this multiplicative survival model will hold when E and U affect survival through totally independent mechanisms in such a way that U cannot possibly modify the effect of E on D1, and vice versa. For example, suppose that the surgery E affects survival through the removal of a tumor whereas the haplotype U affects survival through increased LDL-cholesterol levels that raise the risk of heart attack (whether a tumor is present or not), and that death by tumor and death by heart attack are independent in the sense that they do not share a common cause. In this scenario we can consider two cause-specific mortality variables: death from tumor D1A and death from heart attack D1B. The observed mortality variable D1 is equal to 1 (death) when either D1A or D1B is equal to 1, and D1 is equal to 0 (survival) when both D1A and D1B equal 0. We assume the measured variables are those in Appendix Figure 1a, so data on the underlying cause of death are not recorded. Appendix Figure 2 is an expansion of Appendix Figure 1a that represents this scenario (variable D2 is not represented because it is not essential to the current discussion). Because D1=0 implies both D1A=0 and D1B=0, conditioning on observed survival (D1=0) is equivalent to simultaneously conditioning on D1A=0 and D1B=0 as well. As a consequence, we find by applying d-separation (ref. 13) to Appendix Figure 2 that E and U are conditionally independent given D1=0, i.e., the path between E and U through the conditioned-on collider D1 is blocked by conditioning on the non-colliders D1A and D1B (ref. 8). On the other hand, conditioning on D1=1 does not imply conditioning on any specific values of D1A and D1B, as the event D1=1 is compatible with three possible unmeasured events: D1A=1 and D1B=1, D1A=1 and D1B=0, and D1A=0 and D1B=1. Thus the path between E and U through the conditioned-on collider D1 is not blocked, and E and U are associated given D1=1.
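The competing-causes scenario can be verified by exact enumeration. The sketch below assumes binary variables and hypothetical probabilities (none are from the paper): E affects only tumor death D1A, U affects only heart-attack death D1B, and D1 equals 1 when either cause-specific death occurs.

```python
from itertools import product

# Hypothetical inputs, for illustration only.
P_E1 = 0.5                  # Pr[E=1]
P_U1 = 0.5                  # Pr[U=1]
P_D1A = {0: 0.4, 1: 0.1}    # Pr[D1A=1 | E=e]: surgery lowers tumor death
P_D1B = {0: 0.1, 1: 0.3}    # Pr[D1B=1 | U=u]: haplotype raises heart-attack death


def bernoulli(p, x):
    """Probability that a Bernoulli(p) variable takes the value x."""
    return p if x == 1 else 1 - p


def eu_table_given_d1(d1):
    """Unnormalized joint weights of (E, U) within the stratum D1=d1."""
    table = {(e, u): 0.0 for e, u in product((0, 1), repeat=2)}
    for e, u, d1a, d1b in product((0, 1), repeat=4):
        if max(d1a, d1b) != d1:  # D1 is functionally determined by D1A and D1B
            continue
        table[(e, u)] += (bernoulli(P_E1, e) * bernoulli(P_U1, u)
                          * bernoulli(P_D1A[e], d1a) * bernoulli(P_D1B[u], d1b))
    return table


def odds_ratio(table):
    return table[(1, 1)] * table[(0, 0)] / (table[(1, 0)] * table[(0, 1)])


or_survivors = odds_ratio(eu_table_given_d1(0))
or_deaths = odds_ratio(eu_table_given_d1(1))
print(f"E-U odds ratio given D1=0: {or_survivors:.3f}")
print(f"E-U odds ratio given D1=1: {or_deaths:.3f}")
```

Among survivors the odds ratio is 1.0 up to floating-point error, whereas among the dead it differs from 1.0, matching the d-separation argument.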
What is interesting about Appendix Figure 2 is that by adding the unmeasured variables D1A and D1B, which functionally determine the observed variable D1, we have created a somewhat nonstandard DAG that succeeds in representing both the conditional independence between E and U given D1=0 and their conditional dependence given D1=1. As far as we are aware, this is the first time such a conditional independence structure has been represented on a DAG.
If E and U affect survival through a common mechanism, then there will exist an arrow either from E to D1B or from U to D1A, as shown in Appendix Figure 3a. In that case the multiplicative survival model will not hold, and E and U will be dependent within both strata of D1. Similarly, if the causes D1A and D1B are not independent because of a common cause V, as shown in Appendix Figure 3b, the multiplicative survival model will not hold, and E and U will be dependent within both strata of D1.
In summary, conditioning on a common effect always induces an association between its causes, but this association may be restricted to certain levels of the common effect.
A.4. Generalizations of structure (3)
Consider Appendix Figure 4a, which represents a study restricted to firefighters (F=1). E and D are unassociated among firefighters because the path E → F ← A → C ← D is blocked by C. If we then stratify on the covariate C as in Appendix Figure 4b, E and D are conditionally associated among firefighters in a given stratum of C; yet C is caused neither by E nor by a cause of E. This example demonstrates that our previous formulation of structure (c) is insufficiently general to cover examples in which we have already conditioned on another variable F before conditioning on C. One could try to argue that our previous formulation works by insisting that the set (F, C) of all conditioned variables be regarded as a single super-variable and then applying our previous formulation with this super-variable in place of C. This fix-up fails because it would require E and D to be conditionally associated within joint levels of the super-variable (C, F) in Appendix Figure 4a as well, which is not the case.
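The firefighter example can be checked by enumeration under an assumed structure. The sketch below assumes that the path has the form E → F ← A → C ← D, with F and C both colliders, that all variables are binary, and that the conditional probabilities are additive. The structure, the numbers, and the function names are illustrative assumptions, not values from the paper.

```python
from itertools import product

# Hypothetical marginal probabilities of the three root variables.
P_E, P_A, P_D = 0.5, 0.5, 0.3


def p_f1(e, a):
    """Pr[F=1 | E=e, A=a], assumed additive."""
    return 0.1 + 0.3 * e + 0.4 * a


def p_c1(a, d):
    """Pr[C=1 | A=a, D=d], assumed additive."""
    return 0.1 + 0.3 * a + 0.4 * d


def bernoulli(p, x):
    return p if x == 1 else 1 - p


def ed_odds_ratio(f, c=None):
    """E-D odds ratio within F=f, and additionally within C=c if c is given."""
    table = {(e, d): 0.0 for e, d in product((0, 1), repeat=2)}
    for e, a, d, c_val in product((0, 1), repeat=4):
        if c is not None and c_val != c:
            continue
        table[(e, d)] += (bernoulli(P_E, e) * bernoulli(P_A, a) * bernoulli(P_D, d)
                          * bernoulli(p_f1(e, a), f) * bernoulli(p_c1(a, d), c_val))
    return table[(1, 1)] * table[(0, 0)] / (table[(1, 0)] * table[(0, 1)])


or_firefighters = ed_odds_ratio(f=1)
or_firefighters_c1 = ed_odds_ratio(f=1, c=1)
print(f"E-D odds ratio given F=1: {or_firefighters:.3f}")
print(f"E-D odds ratio given F=1, C=1: {or_firefighters_c1:.3f}")
```

Under these assumptions E and D are independent among firefighters (odds ratio 1.0 up to floating-point error) but become associated once C is also conditioned on.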
However, a general formulation that works in all settings is the following. A conditional association between E and D will occur within strata of a common effect C of two other variables, one of which is either the exposure or statistically associated with the exposure and the other is either the outcome or statistically associated with the outcome.
Clearly, our earlier formulation is implied by the new formulation and, furthermore, the new formulation gives the correct results for both Appendix Figures 4b and 4c. A drawback of this new formulation is that it is not stated purely in terms of causal structures, as it makes reference to (possibly non-causal) statistical associations. It is in fact possible to provide a fully general formulation in terms of causal structures, but it is not particularly intuitive, simple, or helpful, and so we do not give it here.
Figure Legends
Appendix Figure 1: Effect of exposure on survival
Appendix Figure 2: Multiplicative survival model
Appendix Figure 3: Multiplicative survival model does not hold
Appendix Figure 4: Conditioning on two variables