My site has profiles, and then pages beyond those profiles. (Example: http://www.site.com/profile, http://www.site.com/profile/settings)
I would like to block Google's crawlers from the subfolders. I want Google to index /profile/ but nothing beyond it.
Another example:
- http://twitter.com/bmull <- Allow
- http://twitter.com/bmull/favorites <- Block
You could also use
<meta name="robots" content="noindex, nofollow" /> in the pages you don't want robots to index or follow. However, always remember that everything in these files is voluntary and a robot can choose to ignore it, so I recommend IP or user-agent blocking as a better route.
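To make both options concrete, here is a rough Python sketch of how a server might decide, per request, whether to emit the noindex tag or refuse a crawler outright. It assumes profiles live one level below the site root (as in the URLs in the question), so anything with more than one path segment counts as "beyond the profile". The names path_depth, robots_meta, should_block and the CRAWLER_TOKENS list are made up for illustration and are not part of any framework.

```python
from urllib.parse import urlsplit

# Hypothetical list of substrings to look for in the User-Agent header.
CRAWLER_TOKENS = ("googlebot", "bingbot")

# The tag from the answer above, emitted only on pages below a profile.
NOINDEX_META = '<meta name="robots" content="noindex, nofollow" />'


def path_depth(url_or_path: str) -> int:
    """Count the non-empty path segments, e.g. /bmull/favorites has 2."""
    path = urlsplit(url_or_path).path
    return len([segment for segment in path.split("/") if segment])


def robots_meta(url_or_path: str) -> str:
    """Return the noindex tag for pages below a profile, else an empty string."""
    return NOINDEX_META if path_depth(url_or_path) > 1 else ""


def should_block(url_or_path: str, user_agent: str) -> bool:
    """Refuse listed crawlers on pages below a profile (user-agent blocking)."""
    agent = user_agent.lower()
    is_crawler = any(token in agent for token in CRAWLER_TOKENS)
    return is_crawler and path_depth(url_or_path) > 1
```

In a real app you would call robots_meta() when rendering the page head, or call should_block() early in request handling and return an error response if it is true. Keep in mind that the User-Agent header can be spoofed, so this check only stops crawlers that identify themselves honestly.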