Wednesday, February 10, 2010

survival

http://www.mail-archive.com/nmusers@globomaxnm.com/msg00398.html
Cox Proportional Hazards Model with Time-Dependent Covariate

Wang, Yaning (Tue, 14 Aug 2007 16:23:56 -0700)

Both Splus and SAS can do it. They both use the counting-process syntax. In SAS, follow Example 54.5: Time-Dependent Repeated Measurements of a Covariate:

proc phreg data=Tumor1;
   model (T1,T2)*Status(0) = NPap;
   output out=Out1 survival=sur / order=data;
   id ID Time Dead;
run;

Even though SAS correctly estimates the slope for the time-dependent covariate, the survival prediction (SURVIVAL=sur) in the output data set (OUT=Out1) is wrong. The SAS documentation explains how SAS calculates these survival predictions here.

Unfortunately, SAS does not follow the correct equation described in its own documentation (Empirical Cumulative Hazards Function Estimates). Instead, SAS calculates S(t,x) = S(t,0)^exp(beta*x), which only applies to time-independent covariates. However, S(t,0) from SAS is correct if a new individual is created with x = 0 at all time points. Starting from S(t,0), you can write some code that follows the correct equation in the SAS documentation to calculate S(t,x) properly; I tested this and it is consistent with the Splus results.
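[Editorial note: to illustrate the fix Yaning describes, here is a minimal Python sketch — not from the original post — of the "right equation". The idea is to recover the baseline cumulative hazard from the correct S(t,0) that SAS produces, then weight each hazard increment by exp(beta*x(t)) with the covariate evaluated at that time. The function name and data layout are hypothetical.]

```python
import math

def survival_td(base_surv, beta, x_of_t):
    """Survival prediction for a time-dependent covariate.

    base_surv : list of (t_i, S0_i) pairs -- the step function S(t, 0)
                for a reference individual with x = 0 at all times.
    beta      : estimated log hazard ratio for the covariate.
    x_of_t    : function returning the covariate value at time t.

    Computes S(t) = exp(-sum_{t_i <= t} exp(beta * x(t_i)) * dH0(t_i)),
    where the baseline cumulative-hazard increments dH0(t_i) are
    recovered from H0(t) = -log S(t, 0).
    """
    out, H0_prev, H = [], 0.0, 0.0
    for t, s0 in base_surv:
        H0 = -math.log(s0)
        dH0 = H0 - H0_prev                    # baseline hazard increment at t
        H += math.exp(beta * x_of_t(t)) * dH0  # weight by covariate at time t
        out.append((t, math.exp(-H)))
        H0_prev = H0
    return out
```

As a sanity check, with a covariate that is constant over time this reduces exactly to S(t,0)^exp(beta*x), i.e. the formula SAS applies is the special case that only holds for time-independent covariates.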


Nick and Rene:
With the predicted S(t,x), a step function rather than a smooth function, you can simulate the event times. The hazard function can also be calculated; of course, it is likewise a step function, not a smooth function. As a result, the integration steps will be coarser than in NONMEM, which assumes a parametric smooth hazard function.
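[Editorial note: simulating event times from a step survival function amounts to inverse-transform sampling: draw u ~ Uniform(0,1) and take the first breakpoint where S(t) has dropped to u or below; if S(t) never drops that far, the subject is censored at the last breakpoint. A minimal Python sketch, with an illustrative data layout:]

```python
import random

def simulate_event_time(step_surv, rng=random):
    """Inverse-transform sampling from a step survival function.

    step_surv : list of (t_i, S_i) pairs with S_i non-increasing in t_i.
    Returns (time, event): event is 1 if the event occurred at or before
    the last breakpoint, 0 if the subject is censored at the last time.
    """
    u = rng.random()
    for t, s in step_surv:
        if s <= u:                # survival has fallen below u: event at t
            return t, 1
    return step_surv[-1][0], 0    # still "alive": censor at the last time
```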

Yaning



Nick Holford (Thu, 05 Jul 2007 02:07:50 -0700)

Mike and Mike,

Thanks for your stimulating responses on this topic. I think I agree with practically everything you both have contributed but I think I need to expand more on the reasons why it seems to me that traditional time to event analysis has not really described the full story.

I accept (Mike C) that the Cox model assumes only that the unspecified baseline hazard is proportional, not identical, in both groups (for simplicity I will make remarks as if we are studying a clinical trial with a placebo group and an active drug group). Do you know of any realistic examples where the hazard might be proportional but not equal? I would think that in any randomized trial setting the randomization would allow the assumption of equality.

It seems to me that there are three situations where standard textbooks and reports in the medical literature fail to deal fully with reality. I have had Collett's book with me for 18 months now and have read it from cover to cover. It is the best source I have found for a parametric hazard model perspective but it only hints at these issues.

...


Smith, Mike K (Wed, 04 Jul 2007 09:14:20 -0700)

Nick,

I would argue that parametric survival models depend on the "structural model" (Weibull, exponential, Gompertz, etc.) that you choose for the hazard function, and so suffer the same issues as standard PK model building, where the choice of covariates, error structures, etc. depends on the correct choice of hazard model. The choice of model is still an assumption...

On the other hand, my understanding of proportional hazards models is that we don't necessarily care what the parametric form of the hazard is. We assume that the hazard changes proportionately with changes in the covariates (hence the name). A treatment, dose or exposure variable could be a covariate, and although it is usually added in a linear form, it doesn't have to be. In many cases the form or "shape" of the hazard function itself is a bit of a "nuisance variable"; what we want to know is the factors influencing survival rates. In that case the proportional hazards model does just fine.
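[Editorial note: the proportionality Mike describes can be made concrete with a tiny Python sketch. The coefficients and covariate vectors below are made up; the point is that the hazard ratio between two individuals is exp(beta'(x_a - x_b)) at every time point, whatever the unspecified baseline hazard h0(t) happens to be.]

```python
import math

def hazard(h0_t, beta, x):
    """Proportional-hazards model: h(t | x) = h0(t) * exp(sum(b_i * x_i))."""
    return h0_t * math.exp(sum(b * xi for b, xi in zip(beta, x)))

beta = [0.7, -0.2]               # e.g. treatment indicator, log-dose (illustrative)
x_a, x_b = [1, 2.0], [0, 2.0]    # two individuals differing only in treatment
for h0 in (0.05, 0.8, 3.1):      # three arbitrary baseline hazard values
    hr = hazard(h0, beta, x_a) / hazard(h0, beta, x_b)
    # hr is exp(0.7) each time: the ratio is free of h0(t)
```

This is why the baseline hazard can be treated as a nuisance: the covariate effects are identified from the ratios alone.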

I'm hoping that your last paragraph was written at least partly tongue-in-cheek... I would argue that if the parametric hazard models you have tried do not capture features in your data, then you may want to examine proportional hazards models. There is a fairly huge statistical literature on these topics (and I have to confess I'm not an expert by any means!).

A good reference book is by D. Collett: "Modelling Survival Data in Medical Research", Chapman & Hall / CRC Press. 2003.

Mike




Rene Bruno (Wed, 04 Jul 2007 10:25:17 -0700)

All,

I agree it depends on what you want to do with the model. My understanding is that if you want to simulate events (e.g. to simulate clinical trials) you need to use a parametric model. The Cox model only allows you to assess the hazard ratio as a function of covariates.

Am I correct?

Rene




Mike Cole (Wednesday, July 04, 2007 5:30 PM)

Nick

I've come in at the end of this email exchange but felt I had to 'defend' the extremely widely used method of Cox regression and the proportional hazards model. There are several advantages which you don't spell out in your email, and a couple of inaccuracies as well.

1. There is both a non-parametric version of the proportional hazards model (the widely used Cox regression model) and a parametric version which assumes a parametric form for the survival times but still retains the proportionality assumption.

2. The hazard functions in the proportional hazards model are NOT assumed to be the same for each treatment group; they are assumed to be proportional. Pedantic maybe, but it needs to be spelt out for clarity. The proportionality assumption is often valid for survival data. There are extensions to the Cox model which allow different hazard functions between subgroups or strata.

3. The choice of survival time distribution is often difficult to justify with parametric models. That said, when this is possible the parametric model allows a greater degree of interpretation and provides more precise parameter estimates.

4. Whereas a parametric survival model is limited by the flexibility of the chosen survival distribution (and corresponding shape of the hazard function) the semi-parametric Cox method estimates this in a non-parametric way and so is extremely flexible.

5. "So it depends what you want -- if you just want to collect P values then use the semi parametric method. But if you want to understand the biology of the disease and the effects of drug treatments you need to seriously consider the parametric method." I would suggest that Professor Sir David Cox, the originator of this method, would have a few choice words to say about this remark :-) By the way, he has written over 300 papers and books, and the original paper has now been cited over 22,000 times.

Finally (and I might be opening myself up to a torrent of emails here), why would you want to analyse survival data in NONMEM when this is covered so comprehensively in other software packages, and many scripts are available to use in R/Splus??

Mike



Nick Holford (03 July 2007 22:59)

Jeff,

Thanks for highlighting the time to event analysis terminology issue. I think nmusers need to pay particular attention to a major difference between two classes of methods.

1. The Cox proportional hazards model is a semiparametric method that is used to describe the difference between treatments. It assumes the underlying hazard for both treatments is the same.

2. Parametric methods (e.g. using the Weibull distribution) try to describe the underlying hazard for each treatment and do not require the assumption that the underlying hazard is the same.

The semiparametric method is somewhat similar to doing a bioequivalence analysis with NCA. It can tell you about the difference between the two formulations under the assumption that the clearance is the same, but it doesn't tell you the underlying PK parameters (clearance, volume, absorption rate constant, etc.) and cannot make predictions of the time course of concentration. The parametric time-to-event method describes the full hazard function but depends on assuming a particular model -- just like assuming a specific compartmental model and input function in compartmental PK.
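[Editorial note: for concreteness, here is a small Python sketch of the Weibull hazard Nick mentions, using the parameterisation h(t) = (k/lam)(t/lam)^(k-1) and S(t) = exp(-(t/lam)^k) — one of several common conventions, so check your software's definition before comparing parameter values.]

```python
import math

def weibull_hazard(t, lam, k):
    """Weibull hazard h(t) = (k/lam) * (t/lam)**(k-1); lam = scale, k = shape."""
    return (k / lam) * (t / lam) ** (k - 1)

def weibull_survival(t, lam, k):
    """Weibull survivor function S(t) = exp(-(t/lam)**k)."""
    return math.exp(-((t / lam) ** k))

# k = 1 recovers the exponential model (constant hazard 1/lam);
# k < 1 gives a hazard decreasing over time, k > 1 an increasing one --
# this is the "shape" assumption a parametric fit commits you to.
```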

As nmusers will appreciate, one can learn and understand much more from a compartmental model than one can from doing a bioequivalence analysis. The parametric approach does not require the restrictive assumption that the underlying hazard is the same for both treatments (which is analogous to having to assume clearance is the same for a bioequivalence analysis).

So it depends what you want -- if you just want to collect P values then use the semiparametric method. But if you want to understand the biology of the disease and the effects of drug treatments you need to seriously consider the parametric method.

Nick




from http://groups.google.com/group/medstats/browse_thread/thread/a1ceade648118a7d

This is a very difficult question to answer. There are others on this list who can offer a more authoritative response, but let me share what little I know.

It's not quite what you want, but there is a review article in JASA:

Meier P, Karrison T, Chappell R, Xie H. The Price of Kaplan-Meier. Journal of the American Statistical Association 2004; 99(467): 890-896.

This article refers to another article by Rupert Miller published in 1983 with the provocative title "What Price Kaplan-Meier?".

Kaplan-Meier and Cox regression are not the same, but neither relies on parametric assumptions, and thus both pay a "price" relative to a correctly specified parametric model.
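[Editorial note: for reference, the Kaplan-Meier product-limit estimator itself is only a few lines. This Python sketch handles ties simply, with events and censorings at the same time leaving the risk set together; real implementations make finer distinctions.]

```python
def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) estimate of the survivor function.

    times  : observed times (event or censoring), in any order
    events : 1 if the event occurred at that time, 0 if censored
    Returns a list of (t, S_hat) pairs at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    s, out, i = 1.0, [], 0
    while i < len(data):
        t = data[i][0]
        d = n = 0
        while i < len(data) and data[i][0] == t:
            d += data[i][1]            # events observed at time t
            n += 1                     # subjects leaving the risk set at t
            i += 1
        if d:                          # the estimate steps only at event times
            s *= 1 - d / n_at_risk
            out.append((t, s))
        n_at_risk -= n
    return out
```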

Cox regression is popular because it is (a) easy and (b) (as you noted) safe. By "safe" I mean "more likely to satisfy the underlying assumptions than a parametric model". It's harder to make an unsupportable assumption with Cox regression.

A parametric model allows you to incorporate Bayesian priors. I don't think you can fit a Bayesian form of Cox regression, but I could be wrong. Parametric models offer various goodness of fit measures that can avoid some of the problems with unsupportable assumptions.

Frank Harrell gave a nice web seminar a few months ago about this issue, but I missed it so can't share his perspective. The handout from the webinar is here.

Since there is no strong consensus in the research community, you can use either approach and get published. I also suspect that the differences between the parametric and non-parametric models are small in a practical sense, but can't offer any data to support this hunch. In general, though, I think the research community focuses too much attention on data analysis issues and not enough on data collection issues. After all, if you collect the wrong data, it doesn't matter how fancy your analysis is.

I hope this helps.
--
Steve Simon, Standard Disclaimer
