The Archive Base

Editorial Team
Asked: May 26, 2026

I’m trying to wrap my head around closures, and I think I’ve found a case where they might be helpful.

I have the following pieces to work with:

  • A set of regular expressions designed to clean state names, housed in a function
  • A data.frame with state names (of the standardized form that the function above creates) and state ID codes, to link the two (the “merge map”)

The idea is, given some data.frame with sloppy state names (is the capital listed as “Washington, D.C.”, “washington DC”, “District of Columbia”, etc.?), to have a single function return the same data.frame with the state name column removed and only the state ID codes remaining. Then subsequent merges can happen consistently.

I can do this in any number of ways, but one that seems particularly elegant would be to house the merge map, the regular expressions, and the code that processes everything inside a closure (following the idea that a closure is a function with data).

Question 1: Is this a reasonable idea?

Question 2: If so, how do I do it in R?

Here’s a stupid simple clean state names function that works on the example data:

cleanStateNames <- function(x) {
  x <- tolower(x)
  x[grepl("columbia",x)] <- "DC"
  x
}

Here’s some example data that the eventual function will be run on:

dat <- structure(list(state = c("Alabama", "Alaska", "Arizona", "Arkansas", 
"California", "Colorado", "Connecticut", "Delaware", "District of Columbia", 
"Florida"), pop08 = structure(c(29L, 44L, 40L, 18L, 25L, 30L, 
22L, 48L, 36L, 13L), .Label = c("1,050,788", "1,288,198", "1,315,809", 
"1,316,456", "1,523,816", "1,783,432", "1,814,468", "1,984,356", 
"10,003,422", "11,485,910", "12,448,279", "12,901,563", "18,328,340", 
"19,490,297", "2,600,167", "2,736,424", "2,802,134", "2,855,390", 
"2,938,618", "24,326,974", "3,002,555", "3,501,252", "3,642,361", 
"3,790,060", "36,756,666", "4,269,245", "4,410,796", "4,479,800", 
"4,661,900", "4,939,456", "5,220,393", "5,627,967", "5,633,597", 
"5,911,605", "532,668", "591,833", "6,214,888", "6,376,792", 
"6,497,967", "6,500,180", "6,549,224", "621,270", "641,481", 
"686,293", "7,769,089", "8,682,661", "804,194", "873,092", "9,222,414", 
"9,685,744", "967,440"), class = "factor")), .Names = c("state", 
"pop08"), row.names = c(NA, 10L), class = "data.frame")

And a sample merge map (the actual one links FIPS codes to states, so it can’t be trivially generated):

merge_map <- data.frame(state=dat$state, id=seq(10) )

EDIT: Building off of crippledlambda’s answer below, here’s an attempt at the function:

prepForMerge <- local({
  merge_map <- structure(list(state = c("alabama", "alaska", "arizona", "arkansas",  "california", "colorado", "connecticut", "delaware", "DC", "florida" ), id = 1:10), .Names = c("state", "id"), row.names = c(NA, -10L ), class = "data.frame")
  list(
    replace_merge_map=function(new_merge_map) {
      merge_map <<- new_merge_map
    },
    show_merge_map=function() {
      merge_map
    },
    return_prepped_data.frame=function(dat) {
      dat$state <- cleanStateNames(dat$state)
      dat <- merge(dat,merge_map)
      dat <- subset(dat,select=c(-state))
      dat
    }
  )
})

> prepForMerge$return_prepped_data.frame(dat)
        pop08 id
1   4,661,900  1
2     686,293  2
3   6,500,180  3
4   2,855,390  4
5  36,756,666  5
6   4,939,456  6
7   3,501,252  7
8     591,833  9
9     873,092  8
10 18,328,340 10

Two problems remain before I’d consider this question solved:

  1. Calling prepForMerge$return_prepped_data.frame(dat) each time is painful. Is there any way to have a default function so that I could just call prepForMerge(dat)? I’m guessing not, given how it’s implemented, but perhaps there’s at least a convention for the default function.

  2. How do I avoid mixing the data and code in the merge_map definition? Ideally I’d clean merge_map elsewhere, then just grab it inside the closure and store that.

1 Answer

  1. Editorial Team
    Added an answer on May 26, 2026 at 5:55 am

    I may be missing the point of your question, but this is one way in which you can use a closure:

    > replaceStateNames <- local({
    +   statenames <- c("Alabama", "Alaska", "Arizona", "Arkansas", 
    +                   "California", "Colorado", "Connecticut", "Delaware",
    +                   "District of Columbia", "Florida")
    +   function(patt,newtext) {
    +     statenames <- tolower(statenames)
    +     statenames[grepl(patt,statenames)] <- newtext
    +     statenames
    +   }
    + })
    > 
    > replaceStateNames("columbia","DC")
     [1] "alabama"     "alaska"      "arizona"     "arkansas"    "california" 
     [6] "colorado"    "connecticut" "delaware"    "DC"          "florida"    
    > replaceStateNames("alaska","palincountry")
     [1] "alabama"              "palincountry"         "arizona"             
     [4] "arkansas"             "california"           "colorado"            
     [7] "connecticut"          "delaware"             "district of columbia"
    [10] "florida"             
    > replaceStateNames("florida","jebbushland")
     [1] "alabama"              "alaska"               "arizona"             
     [4] "arkansas"             "california"           "colorado"            
     [7] "connecticut"          "delaware"             "district of columbia"
    [10] "jebbushland"    
    > 
    

    But to generalize, you can replace statenames with your data frame definition, and return a function (or list of functions) which uses this data frame without having to pass it as an argument to the function call. Example (but note I’ve used the ignore.case=TRUE argument in grepl):

    > replaceStateNames <- local({
    +   statenames <- c("Alabama", "Alaska", "Arizona", "Arkansas", 
    +                   "California", "Colorado", "Connecticut", "Delaware",
    +                   "District of Columbia", "Florida")
    +   list(justreturn=function(patt,newtext) {
    +     statenames[grepl(patt,statenames,ignore.case=TRUE)] <- newtext
    +     statenames
    +   },reassign=function(patt,newtext) {
    +     statenames <<- replace(statenames,grepl(patt,statenames,ignore.case=TRUE),newtext)
    +     statenames
    +   })
    + })
    

    Just like the first example:

    > replaceStateNames$justreturn("columbia","DC")
     [1] "Alabama"     "Alaska"      "Arizona"     "Arkansas"    "California" 
     [6] "Colorado"    "Connecticut" "Delaware"    "DC"          "Florida"    
    

    Calling justreturn with a pattern that matches nothing simply returns the lexically scoped value of statenames, which confirms that the original values are unchanged:

    > replaceStateNames$justreturn("shouldnotmatch","anythinghere")
     [1] "Alabama"              "Alaska"               "Arizona"             
     [4] "Arkansas"             "California"           "Colorado"            
     [7] "Connecticut"          "Delaware"             "District of Columbia"
    [10] "Florida"             
    

    Do the same thing, but make the change “permanent”:

    > replaceStateNames$reassign("columbia","DC")
     [1] "Alabama"     "Alaska"      "Arizona"     "Arkansas"    "California" 
     [6] "Colorado"    "Connecticut" "Delaware"    "DC"          "Florida"    
    

    And note that the value of statenames attached to these functions has changed.

    > replaceStateNames$justreturn("shouldnotmatch","anythinghere")
     [1] "Alabama"     "Alaska"      "Arizona"     "Arkansas"    "California" 
     [6] "Colorado"    "Connecticut" "Delaware"    "DC"          "Florida"    
    

    In any case, you can replace statenames with a data frame, and these simple functions with a “merge map” or any other mapping you desire.
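
As a rough sketch of that last point, and assuming the cleanStateNames function and a merge map shaped like the ones in the question, a function factory can capture the merge map and return a single function. The name makePrepForMerge and the toy data below are hypothetical, chosen only for illustration.

```r
# Toy cleaner from the question: lower-case everything, map "columbia" to "DC".
cleanStateNames <- function(x) {
  x <- tolower(x)
  x[grepl("columbia", x)] <- "DC"
  x
}

# Hypothetical factory: merge_map is captured in the returned function's
# enclosing environment, so callers never have to pass it again.
makePrepForMerge <- function(merge_map) {
  force(merge_map)  # evaluate the argument now, not at the first call
  function(dat) {
    dat$state <- cleanStateNames(as.character(dat$state))
    dat <- merge(dat, merge_map, by = "state")
    dat[, setdiff(names(dat), "state"), drop = FALSE]  # drop the name column
  }
}

# Illustrative data only (not a real FIPS map).
merge_map <- data.frame(state = c("alabama", "DC"), id = c(1L, 9L),
                        stringsAsFactors = FALSE)
sloppy <- data.frame(state = c("District of Columbia", "Alabama"),
                     pop = c(10, 20), stringsAsFactors = FALSE)

prepForMerge <- makePrepForMerge(merge_map)
prepped <- prepForMerge(sloppy)   # a single call, no $ accessor needed
```

Because the factory takes the merge map as an argument, the map can be cleaned elsewhere and handed in, and the result is called directly as prepForMerge(dat).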

    Edit

    Speaking of “merge”, is this what you’re looking for? An implementation of the first example from ?merge using a closure:

    > authors <- data.frame(surname = I(c("Tukey", "Venables", "Tierney", "Ripley", "McNeil")),
    +                       nationality = c("US", "Australia", "US", "UK", "Australia"),
    +                       deceased = c("yes", rep("no", 4)))
    > books <- data.frame(name = I(c("Tukey", "Venables", "Tierney",
    +                       "Ripley", "Ripley", "McNeil", "R Core")),
    +                     title = c("Exploratory Data Analysis",
    +                       "Modern Applied Statistics ...",
    +                       "LISP-STAT",
    +                       "Spatial Statistics", "Stochastic Simulation",
    +                       "Interactive Data Analysis",
    +                       "An Introduction to R"),
    +                     other.author = c(NA, "Ripley", NA, NA, NA, NA,
    +                       "Venables & Smith"))
    > 
    > mergewithauthors <- with(list(authors=authors),function(books) 
    +   merge(authors, books, by.x = "surname", by.y = "name"))
    > 
    > mergewithauthors(books)
       surname nationality deceased                         title other.author
    1   McNeil   Australia       no     Interactive Data Analysis         <NA>
    2   Ripley          UK       no            Spatial Statistics         <NA>
    3   Ripley          UK       no         Stochastic Simulation         <NA>
    4  Tierney          US       no                     LISP-STAT         <NA>
    5    Tukey          US      yes     Exploratory Data Analysis         <NA>
    6 Venables   Australia       no Modern Applied Statistics ...       Ripley
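
One way to check that authors really is bound to the function, rather than looked up in the global workspace, is to inspect the function's environment. The sketch below uses a cut-down authors table and only base R; the object names mirror the example above, and the data are trimmed for illustration.

```r
authors <- data.frame(surname = c("Tukey", "Ripley"),
                      nationality = c("US", "UK"),
                      stringsAsFactors = FALSE)
books <- data.frame(name = c("Tukey", "Ripley", "Ripley"),
                    title = c("Exploratory Data Analysis",
                              "Spatial Statistics", "Stochastic Simulation"),
                    stringsAsFactors = FALSE)

# with() evaluates the function definition inside an environment built
# from the list, so that environment becomes the function's enclosure.
mergewithauthors <- with(list(authors = authors), function(books)
  merge(authors, books, by.x = "surname", by.y = "name"))

captured <- ls(environment(mergewithauthors))  # names bound in the enclosure

# Removing the global copy does not affect the closure's own copy.
rm(authors)
result <- mergewithauthors(books)
```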
    

    Edit 2

    To read a file into an object that will be lexically bound, you can either do

    fn <- local({
      data <- read.csv("filename.csv")
      function(...) {
        ...
      }
    })
    

    or

    fn <- with(list(data=read.csv("filename.csv")),
         function(...) {
           ...
         }
       )
    

    or

    fn <- with(local({
                 data <- read.csv("filename.csv")
                 environment()  ## return the environment so that data is bound by name
               }),
         function(...) {
           ...
         }
       )
    

    and so on. (I assume the function(...) will have to do with your “merge_map”.) You can also use evalq, given a fresh environment such as new.env(), in place of local. To “bring in” objects residing in the global workspace (or another enclosing environment), you can just do the following:

    globalobj <- value      ## could be from read.csv()
    fn <- local({
      localobj <- globalobj ## if globalobj is not locally defined, 
                            ## R will look in enclosing environment
                            ## in this case, the globalenv()
      function(...) {
        ...
      }
    })
    

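    Modifying globalobj later will then not change the localobj attached to the function, because the assignment copies the value at the time the closure is created (almost everything in R follows pass-by-value, copy-on-modify semantics; environments are a notable exception). You can also use with instead of local, as shown in the examples above.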
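
A small check of that behaviour, using made-up values in place of data read from a file:

```r
globalobj <- c(a = 1, b = 2)     # stand-in for something like read.csv() output

fn <- local({
  localobj <- globalobj          # value is captured in the closure's environment
  function() localobj
})

globalobj["a"] <- 100            # modify the global copy afterwards

# The closure still returns the value captured at creation time.
snapshot <- list(global = globalobj, captured = fn())
```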

© 2021 The Archive Base. All Rights Reserved