Editorial Team
Asked: June 13, 2026


I previously worked with SAS and then switched to R for academic reasons.
My data (healthdemo) are health records containing diagnostic codes (ICD-10), and I want to separate these codes into different columns. This is part of str(healthdemo):

 $ PATIENT_KEY     : int  7391510 7404298 7390196 7381208 7401691 7381223 7383005 10188634 7384574 7398317 ...
 $ ICDCODE         : Factor w/ 1125 levels "","H00","H00.0",..: 654 56 654 654 665 48 90 679 654 654 ...
 $ PATIENT_ID      : int  39387 50244 38388 27346 49922 27901 27867 61527 33186 45309 ...
 $ DATE_OF_BIRTH   : Factor w/ 14801 levels "","01/01/1000",..: 7506 10250 52 73 94 6130 85 2710 95 100 ...

ICDCODE contains many diseases, from H00 to J99. First, I separated the letter from the numbers in ICDCODE:

healthdemo$icd_char = substr(healthdemo$ICDCODE,1,1)
healthdemo$icd_num = substr(healthdemo$ICDCODE,2,2)

Then I created the disease columns like this:

healthdemo$cvd = 0
healthdemo$ihd = 0
healthdemo$mi = 0
healthdemo$dys = 0
healthdemo$afib = 0
healthdemo$chf = 0

Now I want to apply logic similar to these SAS statements, which I used to use:

if icd_char = 'I' and 01 <= icd_num < 52 then cvd = 1;

if icd_char = 'I' and 20 <= icd_num <= 25 then ihd = 1;

if icd_char = 'I' and 21 <= icd_num <= 22 then mi = 1;

if icd_char = 'I' and 46 <= icd_num <= 49 then dys = 1;

if icd_char = 'I' and icd_num = 48 then afib = 1;

These statements flag each patient whose ICD letter and ICD number fall in the given range, for example by setting cvd = 1, and so on for the other columns.

I tried the following in R, but it didn't work for me:

healthdemo$cvd[healthdemo$icd_char == 'I' & 01 <= healthdemo$icd_num 
      & healthdemo$icd_num < 52 ] <- 1

and this:

if (healthdemo$icd_char == "I" &  01 < = healthdemo$icd_num < 52  )
   {healthdemo$cvd <- 1} 

Could somebody help me, please?

1 Answer

  1. Editorial Team
    Added an answer on June 13, 2026 at 11:19 pm

    I had a similar struggle when I transitioned from SAS to R for health-related research. My solution was to let go of the “if…then” approach as much as possible and take advantage of R’s native vectorized programming capabilities. Here are two approaches to your problem.

    First, you can use indexing to find and replace elements. Here is some hospital discharge data of the kind you describe:

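    # Load a sample of hospital discharge records; keep text columns as character, not factor
    hosp <- read.csv(file = "http://www.columbia.edu/~cjd11/charles_dimaggio/DIRE/resources/R/sparcsShort.csv", stringsAsFactors = FALSE)
    head(hosp)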
    

    Say I want to identify every birth-related diagnosis in Manhattan. I first create a logical vector that holds TRUE or FALSE for each row according to my search criteria, and then I index my data frame by that logical vector. In this case I am also restricting the columns (variables) I want returned:

    myObs <- hosp$county == 59 & hosp$pdx == "V3000 "  # note the trailing space in the code
    myVars <- c("age", "sex", "disp")
    myFile <- hosp[myObs, myVars]  # rows where myObs is TRUE, columns named in myVars
    head(myFile)
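
The same logical-indexing idea can be applied to the ICD-10 ranges in the question. The following is a minimal sketch, not a definitive implementation: the toy `healthdemo` data frame and its codes are made up for illustration, and `chf` is left out because the question gives no rule for it. Two details matter. First, `substr(x, 2, 2)` extracts a single character, so the sketch takes positions 2 to 3 and converts them to numbers. Second, R has no chained comparison such as `1 <= x < 52`, so each bound is written as its own condition joined with `&`.

```r
# Toy stand-in for healthdemo; these codes are invented for illustration.
healthdemo <- data.frame(
  PATIENT_ID = 1:6,
  ICDCODE = factor(c("I21.0", "I48", "I10", "H00.0", "J45", "I50.9"))
)

# Convert the factor to character before slicing it.
code <- as.character(healthdemo$ICDCODE)
healthdemo$icd_char <- substr(code, 1, 1)
# Characters 2-3 hold the two-digit category; convert them to numbers.
healthdemo$icd_num <- as.numeric(substr(code, 2, 3))

# Each condition is a logical vector; as.integer() turns TRUE/FALSE into 1/0.
is_i <- healthdemo$icd_char == "I"
num <- healthdemo$icd_num
healthdemo$cvd  <- as.integer(is_i & num >= 1  & num < 52)
healthdemo$ihd  <- as.integer(is_i & num >= 20 & num <= 25)
healthdemo$mi   <- as.integer(is_i & num >= 21 & num <= 22)
healthdemo$dys  <- as.integer(is_i & num >= 46 & num <= 49)
healthdemo$afib <- as.integer(is_i & num == 48)

healthdemo[, c("ICDCODE", "cvd", "ihd", "mi", "dys", "afib")]
```

Because every comparison is evaluated for all rows at once, no loop or `if` statement is needed; `if` expects a single TRUE or FALSE, which is why the attempt in the question fails.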
    

    The second, and perhaps more computationally elegant, approach is to use a function like “grep”. Say you’re interested in identifying all substance abuse diagnoses: alcohol abuse (291, 303, 305 and sub-codes); opioids, cannabis, amphetamines, hallucinogens, and cocaine (304 and related sub-codes); or non-specific substance abuse-related diagnoses (292). In SAS you would write out a long if-then statement (or a more efficient array) of some kind:

    #/*********************** SUBSTANCE ABUSE *****************/
    #if pdx in /* use ICD9 codes to create diagnoses */ ('2910','2911','2912','2913','2914','2915',
    #   '29181','29189', '2919','2920','29211','29212','2922','29281','29282','29283', #........etc....,'30592','30593')
    #Then subst_ab=1; 
    #Else subst_ab=0;
    

    In R, you can instead write:

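    # Row indices of diagnoses that start with 291, 292, 303, 304, or 305 followed by a digit
    substance <- grep("^(291|292|303|304|305)[0-9]", hosp$pdx)
    hosp$pdx[substance]              # inspect the matched codes
    hosp$subsAb <- "No"              # default for every row
    hosp$subsAb[substance] <- "Yes"  # overwrite the matched rows
    hosp$subsAb[1:100]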
    
    table(hosp$subsAb)
    plot(table(hosp$subsAb))
    
    library(ggplot2)
    # Age by substance-abuse flag; the low alpha shows where points pile up
    ggplot(hosp, aes(subsAb, age)) + geom_point(alpha = 1/50)
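
The pattern-matching approach carries over to the ICD-10 codes in the question. This is a small sketch under stated assumptions: the `codes` vector is made up for illustration, and `grepl()` is used in place of `grep()` because it returns TRUE or FALSE for every element rather than the indices of the matches, so the result can be converted straight into a 0/1 column.

```r
# Made-up ICD-10 codes for illustration.
codes <- c("I21.0", "I48", "I10", "H00.0", "J45", "I50.9")

# "^" anchors the match at the start of the code; a character class
# such as [0-5] covers a range of digits in that position.
ihd  <- as.integer(grepl("^I2[0-5]", codes))  # I20 to I25
mi   <- as.integer(grepl("^I2[12]", codes))   # I21 to I22
dys  <- as.integer(grepl("^I4[6-9]", codes))  # I46 to I49
afib <- as.integer(grepl("^I48", codes))      # I48 only

data.frame(codes, ihd, mi, dys, afib)
```

Regular expressions suit ranges that fall within one tens digit. A range such as I01 to I51 spans several, so a numeric comparison on the extracted digits is usually easier to read for that case.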
    

    Tomas Aragon has written a wonderful introduction to R for epidemiologists that goes into these approaches in detail. (http://www.medepi.net/docs/ph251d_fall2012_epir-chap01-04.pdf)
