The Archive Base

Editorial Team
Asked: May 25, 2026 (2026-05-25T00:48:32+00:00)


I’ve got a fairly new website (~3 weeks old) running on Tomcat w/so far pretty low numbers of visitors.

For the last week I’ve noticed 1,000+ active sessions, and checking Tomcat’s localhost_access* logs shows that the overwhelming majority come from IPs in the range 119.63.196.*, which all appear to belong to Baidu Japan.

Here’s a small example from the logs of them hitting the front page.
119.63.196.107 - - [24/Aug/2011:07:02:46 +0000] "GET /;jsessionid=94085F76780ACFD96C8109A29446288D HTTP/1.1" 200 10311
119.63.196.44 - - [24/Aug/2011:07:03:21 +0000] "GET /;jsessionid=943133C77BB1756CF11592115BA81725 HTTP/1.1" 200 10333
119.63.196.39 - - [24/Aug/2011:07:03:56 +0000] "GET /;jsessionid=9B4384BDECF540C8628467F7AB4AB463 HTTP/1.1" 200 10311
119.63.196.19 - - [24/Aug/2011:07:04:31 +0000] "GET /;jsessionid=A0B555C3A18377D993B97D4491DD1012 HTTP/1.1" 200 10311
119.63.196.45 - - [24/Aug/2011:07:05:10 +0000] "GET /;jsessionid=A3782FA61558BF11C4D5AC4F3DD1EC86 HTTP/1.1" 200 10311
119.63.196.23 - - [24/Aug/2011:07:05:53 +0000] "GET /;jsessionid=A3AF84EF13F21492EB47FAB001A1C2E5 HTTP/1.1" 200 10311
119.63.196.120 - - [24/Aug/2011:07:06:31 +0000] "GET /;jsessionid=A7C490CEC2C7F2969772AC4050C6D761 HTTP/1.1" 200 10311
119.63.196.108 - - [24/Aug/2011:07:07:07 +0000] "GET /;jsessionid=A7F769D354CB37E99843292D650D6367 HTTP/1.1" 200 10311

No single IP is clobbering the site, but the collective requests from this IP range are racking up active sessions. They also seem somewhat coordinated: one page at a time gets targeted and receives ~30 hits from ~30 different IPs in the 119.63.196.* range over a 20-minute period. Then it moves on to another page, and this goes on pretty much all day, racking up Tomcat sessions.
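One way to confirm the pattern described above is to count distinct session IDs per address range straight from the access log. The following is only a minimal sketch: the class name `CrawlerSessionCount` and the regular expression are illustrative assumptions, and the three sample lines are copied from the excerpt above rather than read from a real log file.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tomcat: counts distinct jsessionid values
// requested from one /24 address range in access-log lines.
class CrawlerSessionCount {
    // Group 1: first three octets of the client IP. Group 2: the session ID in the URL.
    private static final Pattern LINE =
        Pattern.compile("^(\\d+\\.\\d+\\.\\d+)\\.\\d+ .*;jsessionid=([0-9A-F]+)");

    static int distinctSessions(List<String> lines, String prefix) {
        Set<String> ids = new HashSet<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (m.find() && m.group(1).equals(prefix)) {
                ids.add(m.group(2));
            }
        }
        return ids.size();
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "119.63.196.107 - - [24/Aug/2011:07:02:46 +0000] \"GET /;jsessionid=94085F76780ACFD96C8109A29446288D HTTP/1.1\" 200 10311",
            "119.63.196.44 - - [24/Aug/2011:07:03:21 +0000] \"GET /;jsessionid=943133C77BB1756CF11592115BA81725 HTTP/1.1\" 200 10333",
            "119.63.196.39 - - [24/Aug/2011:07:03:56 +0000] \"GET /;jsessionid=9B4384BDECF540C8628467F7AB4AB463 HTTP/1.1\" 200 10311");
        // Every request carries a different session ID, so this prints 3.
        System.out.println(distinctSessions(lines, "119.63.196"));
    }
}
```

If the number of distinct IDs is close to the number of requests, as it is here, each hit is opening its own session rather than reusing one.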

I do have the inactive session timeout set pretty high (720 minutes), and maybe I need to bring that number down a lot. Maybe Baidu Japan is doing frequent checks because it thinks the page has changed due to a change in the link (i.e., the jsessionid is always different)?

Thanks for reading. I welcome any/all suggestions!

Eric

1 Answer

  1. Editorial Team
    Added an answer on May 25, 2026 at 12:48 am (2026-05-25T00:48:33+00:00)

    Spiders indeed usually do not maintain a session with the website; that’s normal. Because a spider sends no session cookie back, every request it makes starts a fresh session, which is why each logged URL carries a different jsessionid. You should ask yourself whether your website really needs to create a session on a plain GET request. Sessions are usually used to store the logged-in user and their preferences, such as locale. But spiders do not log in and do not submit any forms. Why would you create the session then?

    There are basically 2 ways to solve this “problem”:

    1. Fix your website so that it doesn’t create sessions when there’s no need to. Create one only once a user logs in or creates/updates a session-wide preference/variable. How exactly to do it depends on the APIs/frameworks used by your website.

    2. Block (specific) spiders via robots.txt, for example with a rule for the `Baiduspider` user agent.
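The first option can be sketched as follows. In a servlet-based webapp the relevant call is `HttpServletRequest.getSession(boolean)`: passing `false` returns the existing session or `null`, while `getSession()` or `getSession(true)` creates one. JSP pages also create a session implicitly unless they declare `<%@ page session="false" %>`. To keep the example compilable without a servlet container, `FakeRequest` and `FakeSession` below are hypothetical stand-ins that mirror only that one call; the method names `currentUser` and `login` are likewise assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class LazySessionDemo {
    // Read path, run on every page view: looks up the user without creating a session.
    static String currentUser(FakeRequest request) {
        FakeSession session = request.getSession(false);
        if (session == null) {
            return null; // anonymous visitor or spider: nothing was allocated
        }
        return (String) session.attributes.get("user");
    }

    // Write path: only an actual login (or a saved preference) creates the session.
    static void login(FakeRequest request, String user) {
        request.getSession(true).attributes.put("user", user);
    }

    public static void main(String[] args) {
        FakeRequest spider = new FakeRequest();
        System.out.println(currentUser(spider));              // null
        System.out.println(spider.getSession(false) == null); // true

        FakeRequest visitor = new FakeRequest();
        login(visitor, "eric");
        System.out.println(currentUser(visitor));             // eric
    }
}

// Hypothetical stand-in for javax.servlet.http.HttpSession.
class FakeSession {
    final Map<String, Object> attributes = new HashMap<>();
}

// Hypothetical stand-in for javax.servlet.http.HttpServletRequest.
class FakeRequest {
    private FakeSession session; // stays null until creation is requested

    // Mirrors the semantics of HttpServletRequest.getSession(boolean create).
    FakeSession getSession(boolean create) {
        if (session == null && create) {
            session = new FakeSession();
        }
        return session;
    }
}
```

With this split, a crawler that only issues GET requests never triggers the write path, so it never gets a session or a jsessionid in its URLs.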

    Note that session creation and the session itself are not particularly expensive. An empty session object should not allocate more than 1KB. However, I find your session timeout too high; the default of 30 minutes is already fairly generous. As a completely different alternative, you could set it to 5 minutes or so and introduce a JS/Ajax “heartbeat” that sends a poll request with the session cookie every timeout-1 minutes while the user is active on the document (click, keypress, etc.). This would keep the session alive on the server. You can find an example in this answer.
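A rough back-of-the-envelope check shows how much the timeout matters here. The sketch below assumes that every crawler request opens a new session that is never reused and lives for the full timeout, and that the spacing of the eight front-page requests in the question's log excerpt is representative of the whole day. The class and method names are illustrative.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.util.List;

class SessionEstimate {
    // Average gap in seconds between consecutive request times (sorted ascending).
    static double averageGapSeconds(List<LocalTime> times) {
        long span = Duration.between(times.get(0), times.get(times.size() - 1)).getSeconds();
        return (double) span / (times.size() - 1);
    }

    // Sessions alive at once if one new session starts every gapSeconds
    // and each expires after timeoutMinutes of inactivity.
    static long steadyStateSessions(double gapSeconds, long timeoutMinutes) {
        return Math.round(timeoutMinutes * 60 / gapSeconds);
    }

    public static void main(String[] args) {
        // Request times taken from the log excerpt in the question.
        List<LocalTime> times = List.of(
            LocalTime.parse("07:02:46"), LocalTime.parse("07:03:21"),
            LocalTime.parse("07:03:56"), LocalTime.parse("07:04:31"),
            LocalTime.parse("07:05:10"), LocalTime.parse("07:05:53"),
            LocalTime.parse("07:06:31"), LocalTime.parse("07:07:07"));
        double gap = averageGapSeconds(times); // 261 s over 7 gaps, about 37 s

        System.out.println(steadyStateSessions(gap, 720)); // 1159
        System.out.println(steadyStateSessions(gap, 30));  // 48
    }
}
```

Under these assumptions the 720-minute timeout alone accounts for roughly the 1,000+ active sessions reported, while the 30-minute default would leave about 50. At no more than 1KB per empty session, even the larger figure is only about 1 MB of memory.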
