Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 654843
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 13, 20262026-05-13T22:32:19+00:00 2026-05-13T22:32:19+00:00

I currently have a problem with my jgroups configuration, causing thousands of messages getting

  • 0

I currently have a problem with my jgroups configuration, causing thousands of messages getting stuck in the NAKACK.xmit_table. Actually all of them seem to end up in the xmit_table, and a another dump from a few hours later indicates that they never intend to leave either…

This is the protocol stack configuration

UDP(bind_addr=xxx.xxx.xxx.114;
bind_interface=bond0;
ip_mcast=true;ip_ttl=64;
loopback=false;
mcast_addr=228.1.2.80;mcast_port=45589;
mcast_recv_buf_size=80000;
mcast_send_buf_size=150000;
ucast_recv_buf_size=80000;
ucast_send_buf_size=150000):
PING(num_initial_members=3;timeout=2000):
MERGE2(max_interval=20000;min_interval=10000):
FD_SOCK:
FD(max_tries=5;shun=true;timeout=10000):
VERIFY_SUSPECT(timeout=1500):
pbcast.NAKACK(discard_delivered_msgs=true;gc_lag=50;retransmit_timeout=600,1200,2400,4800;use_mcast_xmit=true):
pbcast.STABLE(desired_avg_gossip=20000;max_bytes=400000;stability_delay=1000):UNICAST(timeout=600,1200,2400):
FRAG(frag_size=8192):pbcast.GMS(join_timeout=5000;print_local_addr=true;shun=true):
pbcast.STATE_TRANSFER

Startup message…

2010-03-01 23:40:05,358 INFO  [org.jboss.cache.TreeCache] viewAccepted(): [xxx.xxx.xxx.35:51723|17] [xxx.xxx.xxx.35:51723, xxx.xxx.xxx.36:53088, xxx.xxx.xxx.115:32781, xxx.xxx.xxx.114:32934]
2010-03-01 23:40:05,363 INFO  [org.jboss.cache.TreeCache] TreeCache local address is 10.35.191.114:32934
2010-03-01 23:40:05,393 INFO  [org.jboss.cache.TreeCache] received the state (size=32768 bytes)
2010-03-01 23:40:05,509 INFO  [org.jboss.cache.TreeCache] state was retrieved successfully (in 146 milliseconds)

… indicates that everything is fine so far.

The logs, set to warn-level does not indicate that something is wrong except for the occational

2010-03-03 09:59:01,354 ERROR [org.jgroups.blocks.NotificationBus] exception=java.lang.IllegalArgumentException: java.lang.NullPointerException

which I’m guessing is unrelated since it has been seen earlier without the memory memory issue.

I have have been digging through two memory dumps from one of the machines to find oddities but nothing so far. Except for maybe some statistics from the different protocols

UDP has

num_bytes_sent 53617832
num_bytes_received 679220174
num_messages_sent 99524
num_messages_received 99522

while NAKACK has…

num_bytes_sent 0
num_bytes_received 0
num_messages_sent 0
num_messages_received 0

… and a huge xmit_table.

Each machine has two JChannel instances, one for ehcache and one for TreeCache. A misconfiguration means that both of them share the same diagnositics mcast address, but this should not pose a problem unless I want to send diagnostics messages right? However they do of course have different mcast addresses for the messages.

Please ask for clarifications, I have lots of information but I’m a bit uncertain about what is relevant at this point.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-13T22:32:19+00:00Added an answer on May 13, 2026 at 10:32 pm

    It turns out that one of the nodes in the cluster did not receive any multicast messages at all. This caused all nodes to hang on to their own xmit_tables, since they did not get any stability messages from the ‘isolated’ node, stating that it had received their messages.

    A restart of ASs, changing the multicast address solved the issue.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

No related questions found

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.