
The Archive Base

Editorial Team
Asked: May 11, 2026 (2026-05-11T01:21:15+00:00)

Situation: Site with content protected by username/password (not all controlled since they can be


Situation:

  • The site’s content is protected by a username/password login (not every account is under our control, since some belong to trial/test users).
  • A normal search engine can’t get at the content because of the username/password restriction.
  • A malicious user can still log in and pass the session cookie to ‘wget -r’ or a similar tool.

What is the best way to monitor for such activity and respond to it, given that the site policy forbids crawling and scraping?

I can think of some options:

  1. Set up a traffic monitoring solution that limits the number of requests for a given user/IP.
  2. Related to the first point: automatically block some user agents.
  3. (Evil :)) Set up a hidden link that, when accessed, logs the user out and disables their account. (Presumably a normal user would never access it, since they wouldn’t see it to click it, but a bot will crawl all links.)

For point 1: do you know of a good existing solution? Any experience with it? One problem is that very active but human users might show up as false positives.

For point 3: do you think this is really evil? Or do you see any possible problems with it?

Also accepting other suggestions.

1 Answer

  1. Added an answer on May 11, 2026 at 1:21 am

    Point 1 has the problem you mentioned yourself. It also doesn’t help against a slower crawl of the site, and if the limit is set low enough to catch a slow crawl, it becomes even worse for legitimate heavy users.
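
    To make that trade-off concrete, here is a minimal sketch of a per-user sliding-window limiter. The class name, the limit of 5 requests per 10 seconds, and the user ids are all made up for illustration; a real deployment would more likely use a proxy or framework feature than hand-rolled code.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Flags a user who makes more than max_requests within window_seconds."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # user id -> timestamps of recent requests

    def allow(self, user_id, now):
        """Record a request at time `now` (in seconds); return True if within the limit."""
        hits = self._hits[user_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        hits.append(now)
        return len(hits) <= self.max_requests


# Illustrative numbers only: at most 5 requests per 10 seconds.
limiter = SlidingWindowLimiter(max_requests=5, window_seconds=10)

# A fast crawl: 10 requests within one second trips the limit after the fifth.
fast = [limiter.allow("fast-bot", t * 0.1) for t in range(10)]

# A slow crawl: one request every 3 seconds never has more than 4 hits in a window.
slow = [limiter.allow("slow-bot", t * 3.0) for t in range(100)]
```

    The slow crawler fetches all 100 pages without ever being flagged. Lowering the limit far enough to catch it would also flag a human who clicks through a handful of pages quickly.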

    You could turn point 2 around and allow only the user agents you trust. Of course, this won’t help against a tool that fakes a standard user agent.

    A variation on point 3 would just be to send a notification to the site owners, then they can decide what to do with that user.
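
    A rough sketch of that softer variation, with no particular web framework assumed. The trap path, the function name and the `notify` callback are hypothetical; `notify` stands in for whatever alerting the site already has (email, a log line, a ticket).

```python
# Hypothetical trap URL: linked from pages but hidden from human visitors.
HONEYPOT_PATH = "/internal/archive-index"


def handle_request(path, username, notify):
    """Return True if the request hit the trap. The account itself is left untouched."""
    if path != HONEYPOT_PATH:
        return False
    # Tell the site owners and let them decide what to do with this user.
    notify(f"Possible crawler: user {username!r} requested hidden link {path}")
    return True


alerts = []
handle_request("/articles/42", "alice", alerts.append)        # normal page, no alert
handle_request(HONEYPOT_PATH, "trial-user-7", alerts.append)  # trap, one alert
```

    Because nothing is disabled automatically, a false positive costs the owners a moment of review rather than costing a legitimate user their account.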

    Similarly, my variation on point 2 could be made a softer action: just notify the owners that somebody is accessing the site with an unusual user agent.
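
    As a sketch of the allowlist-plus-notification idea: the patterns below are an assumed, deliberately tiny allowlist, and the function name and the "ok"/"review" labels are invented for this example.

```python
import re

# Assumed allowlist: patterns for user agents the site owners trust.
TRUSTED_AGENT_PATTERNS = [
    re.compile(r"^Mozilla/5\.0 "),
    re.compile(r"^Opera/"),
]


def classify_user_agent(user_agent):
    """Return "ok" for a trusted agent, "review" for anything else (including none)."""
    if user_agent and any(p.search(user_agent) for p in TRUSTED_AGENT_PATTERNS):
        return "ok"
    # Soft action: the request is still served, but the owners get told about it.
    return "review"


browser = classify_user_agent("Mozilla/5.0 (X11; Linux x86_64) Gecko/20100101 Firefox/115.0")
crawler = classify_user_agent("Wget/1.21.3")
missing = classify_user_agent(None)
```

    As noted above, this only catches tools that announce themselves: wget run with `--user-agent` set to a browser string would be classified as "ok".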

    Edit: On a related note, I once had a weird issue while accessing a URL of my own that was not public (I was staging a site that I hadn’t announced or linked anywhere). Although nobody but me should have known this URL, all of a sudden I noticed hits in the logs. When I tracked them down, they came from a content-filtering site. It turned out that my mobile ISP used a third party to block content, and it intercepted my own requests: since it didn’t know the site, it fetched the page I was trying to access and (I assume) did some keyword analysis to decide whether or not to block it. This kind of thing might be an edge case you need to watch out for.
