I know there is a COXPHFIT function in MATLAB for Cox regression, but I have trouble understanding how to apply it.
1) How do I compare two groups of samples, given survival time in days (survdays), censoring (cens) and some predictor value (x)? The groups are defined by the logical variable groups, and they contain different numbers of samples.
2) What is the baseline parameter in coxphfit? I did read the docs, but how should I choose the baseline properly?
It would be great if you know a site with good, detailed examples on medical survival data. I found only the MathWorks demo, which does not even mention coxphfit.
Do you perhaps know of another, third-party function for Cox regression?
UPDATE: I added the r tag since the answer I got is for R.
With survival analysis, the hazard function is the instantaneous death rate.
In these analyses, you are typically measuring what effect something has on this hazard function. For example, you may ask “does swallowing arsenic increase the rate at which people die?”. The baseline (background) hazard is the rate at which people would die anyway (without swallowing arsenic, in this case).
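To make that concrete, here is a minimal sketch of the standard definitions. The symbols $T$ (survival time), $h$, $h_0$ and $\beta$ are my own notation, not something taken from the MATLAB docs.

$$
h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t},
\qquad
h(t \mid x) = h_0(t)\, e^{\beta x}
$$

The first expression is the instantaneous death rate at time $t$ among those still alive. The second is the Cox proportional hazards model: $h_0(t)$ is the baseline hazard (the hazard when $x = 0$), and $e^{\beta}$ is the hazard ratio for a one-unit increase in the predictor $x$.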
If you read the docs for coxphfit carefully, you will notice that the function tries to calculate the baseline hazard; it is not something that you enter.
EDIT: MATLAB’s coxphfit function doesn’t obviously work with grouped data. If you are happy to switch to R, then the analysis is a one-liner.
ANOTHER EDIT: The baseline parameter to MATLAB’s coxphfit appears to be a normalising constant. R’s coxph function doesn’t have an equivalent parameter. I looked in Statistical Computing by Michael Crawley, and it seems to suggest that the baseline hazard isn’t important, since it cancels out when you calculate the likelihood of your individual dying. See Chapter 33, and pp. 615-616 in particular. My knowledge of how the model works isn’t deep enough to explain the discrepancy between the MATLAB and R implementations; perhaps you could ask on the Stack Exchange Stats Analysis site.
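For what it’s worth, the cancellation can be written out. This is only a sketch of the usual Cox partial likelihood, assuming no tied event times; the notation ($h_0$, $\beta$, $x_i$, the risk set $R(t_i)$ of individuals still under observation at time $t_i$, and the reference value $x_0$) is mine rather than Crawley’s or MATLAB’s.

$$
L(\beta)
= \prod_{i:\,\text{event}} \frac{h_0(t_i)\, e^{\beta x_i}}{\sum_{j \in R(t_i)} h_0(t_i)\, e^{\beta x_j}}
= \prod_{i:\,\text{event}} \frac{e^{\beta x_i}}{\sum_{j \in R(t_i)} e^{\beta x_j}}
$$

The baseline hazard $h_0(t_i)$ appears in both numerator and denominator, so it drops out and $\beta$ can be estimated without it. Measuring the predictor relative to some reference value $x_0$ only rescales the baseline:

$$
h_0(t)\, e^{\beta x} = \left[ h_0(t)\, e^{\beta x_0} \right] e^{\beta (x - x_0)}
$$

So the estimate of $\beta$ is unchanged by the choice of $x_0$. That would be consistent with the baseline parameter acting as a normalising constant, assuming it simply sets that reference value.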