The Archive Base

Editorial Team
Asked: June 13, 2026 (2026-06-13T15:35:02+00:00)

Currently I am facing the following problem, which I’m working in Stata to solve.

I am currently facing the following problem, which I am trying to solve in Stata. I have added the algorithm tag because I am mainly interested in the steps rather than the Stata code.

I have some variables, say var1 – var20, each of which may contain a string. I am only interested in some of these strings, call them A, B, C, D, E, F, but other strings can also occur (all of these will be denoted X). I also have a unique identifier, ID. Part of the data could look like this:

ID  |  var1  |  var2  |  var3  |  ..  |  var20  
1   |   E    |        |        |      |    X
1   |        |   A    |        |      |    C
2   |   X    |   F    |   A    |      |   
8   |        |        |        |      |    E

Now I want to create one row for every ID and for every occurrence of one of the strings A, B, C, D, E, F in any of the variables. The data above should then look like this:

ID  |  var1  |  var2  |  var3  |  ..  |  var20
1   |    E   |        |        |      |       
1   |        |    A   |        |      |       
1   |        |        |        |      |    C
2   |        |    F   |        |      |
2   |        |        |    A   |      |
8   |        |        |        |      |    E

Here we ignore every string X that is not one of A, B, C, D, E, F. My attempt so far was to create a variable that counts, for each row, the number N of occurrences of A, B, C, D, E, F. In the original data above that variable would be N = 1, 2, 2, 1. Then I expand each row into N copies of itself. This results in the data:

ID  |  var1  |  var2  |  var3  |  ..  |  var20  
1   |   E    |        |        |      |    X
1   |        |   A    |        |      |    C
1   |        |   A    |        |      |    C
2   |   X    |   F    |   A    |      |   
2   |   X    |   F    |   A    |      |   
8   |        |        |        |      |    E

My question is: how do I proceed from here? And sorry for the poor title, but I couldn't word it any more specifically.


1 Answer

  1. Editorial Team
    Added an answer on June 13, 2026 at 3:35 pm

    Sorry, I thought the final block was your desired output (I now understand that it's what you've accomplished so far). You can get the middle block with two calls to reshape (long, then wide).

    First I’ll generate data to match yours.

    clear
    set obs 4
    
    * ids
    generate n = _n
    generate id = 1 in 1/2
    replace id = 2 in 3
    replace id = 8 in 4
    
    * generate your variables
    forvalues i = 1/20 {
        generate var`i' = ""
    }
    replace var1 = "E" in 1
    replace var1 = "X" in 3
    replace var2 = "A" in 2
    replace var2 = "F" in 3
    replace var3 = "A" in 3
    replace var20 = "X" in 1
    replace var20 = "C" in 2
    replace var20 = "E" in 4
    

    Now the two calls to reshape.

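    * reshape to long, keep only desired obs, then reshape to wide
    reshape long var, i(n id) string
    keep if inlist(var, "A", "B", "C", "D", "E", "F")
    * one new row index per remaining string
    * (long rather than int, because int overflows past 32,740 observations)
    tempvar long_id
    generate long `long_id' = _n
    reshape wide var, i(`long_id') string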
    

    The first reshape converts your data from wide to long. The stub var tells reshape that the variables to stack all start with var. The i(n id) option says that each combination of n and id identifies one observation. The result has one observation per n–id combination for each of var1 through var20, so there are now 4*20 = 80 observations. I then keep only the strings you want with inlist().
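
    The long step can be checked on a smaller case. This is only a sketch under assumptions: the toy data has three variables instead of twenty, and the suffix variable is given the hypothetical name slot through j() rather than relying on the default.

    ```stata
    * toy data: 3 rows, 3 string variables (a cut-down, hypothetical version of the example)
    clear
    input n id str1 var1 str1 var2 str1 var3
    1 1 "E" ""  "X"
    2 1 ""  "A" "C"
    3 2 "X" "F" "A"
    end

    * wide -> long: one observation per row and per variable suffix
    reshape long var, i(n id) j(slot)
    * 3 rows * 3 variables = 9 observations
    assert _N == 9
    list n id slot var, sepby(n)

    * drop blanks and every string that is not one of the wanted six
    keep if inlist(var, "A", "B", "C", "D", "E", "F")
    * E | A, C | F, A  -> 5 observations remain
    assert _N == 5
    ```

    The assert lines stop the do-file if the counts are not what the comments claim, so they double as a quick sanity check when you adapt this to var1 through var20.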

    For the second reshape call, var says that the values to spread out are in the variable var and that var is also the prefix for the new variables. You wanted one row per remaining letter, so I made a new index (with no real meaning in the end) to serve as the i index for the second reshape call. If I had used n–id as the unique observation, we would end up back where we started, but with only the good strings. The j index remains from the first reshape call (variable _j), so reshape already knows which suffix to give each var.
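
    To see the difference between the two choices of i, here is a small sketch on toy data. The names slot and row are hypothetical, and the data is a three-variable stand-in for the real twenty.

    ```stata
    * toy data in long form, already filtered to the wanted strings
    clear
    input n id str1 var1 str1 var2 str1 var3
    1 1 "E" ""  "X"
    2 1 ""  "A" "C"
    3 2 "X" "F" "A"
    end
    reshape long var, i(n id) j(slot)
    keep if inlist(var, "A", "B", "C", "D", "E", "F")

    * choice 1: reuse n and id as the row key
    * this rebuilds the original 3 rows, minus the unwanted strings
    preserve
    reshape wide var, i(n id) j(slot)
    assert _N == 3
    restore

    * choice 2: a fresh key per remaining string
    * this gives one row per wanted string, which is the layout asked for
    generate long row = _n
    reshape wide var, i(row) j(slot)
    assert _N == 5
    list n id var1 var2 var3, sepby(n)
    ```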

    These two reshape calls yield:

    . list n id var1 var2 var3 var20
    
         +-------------------------------------+
         | n   id   var1   var2   var3   var20 |
         |-------------------------------------|
      1. | 1    1      E                       |
      2. | 2    1             A                |
      3. | 2    1                            C |
      4. | 3    2             F                |
      5. | 3    2                    A         |
         |-------------------------------------|
      6. | 4    8                            E |
         +-------------------------------------+
    

    You can easily add back variables that don’t survive the two reshapes.

    * if you need to add back dropped variables
    forvalues i = 1/20 {
        * _rc is nonzero when var`i' does not exist
        capture confirm variable var`i'
        if _rc {
            generate var`i' = ""
        }
    }
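
    If you would rather continue from the duplicated rows in your question, the following is one possible sketch, not the reshape method above. It numbers the copies of each original row and, in copy k, keeps only the k-th wanted string. The helper variables nkeep, k and seen are hypothetical names, and the toy data has three variables; for the real data you would loop over var1-var20, assuming those variables are stored next to each other.

    ```stata
    * toy data (a cut-down, hypothetical version of the example)
    clear
    input n id str1 var1 str1 var2 str1 var3
    1 1 "E" ""  "X"
    2 1 ""  "A" "C"
    3 2 "X" "F" "A"
    end

    * nkeep = number of wanted strings in each row (N in the question)
    generate nkeep = 0
    foreach v of varlist var1-var3 {
        replace nkeep = nkeep + inlist(`v', "A", "B", "C", "D", "E", "F")
    }

    * rows with nothing to keep are dropped; the rest become nkeep copies
    drop if nkeep == 0
    expand nkeep

    * k = 1, ..., nkeep numbers the copies of one original row
    bysort n: generate k = _n

    * walk through the variables in order; seen counts the wanted strings met so far
    * a value survives only if it is wanted and is the k-th wanted string of its row
    generate seen = 0
    foreach v of varlist var1-var3 {
        replace seen = seen + inlist(`v', "A", "B", "C", "D", "E", "F")
        replace `v' = "" if !inlist(`v', "A", "B", "C", "D", "E", "F") | seen != k
    }
    drop nkeep k seen
    list n id var1 var2 var3, sepby(n)
    ```

    Unlike the reshape route, this keeps every var column in place, so nothing has to be added back afterwards.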
    
© 2021 The Archive Base. All Rights Reserved