The Archive Base


Editorial Team
Asked: June 5, 2026

I have a problem in our BTS production environment which we cannot reproduce in


I have a problem in our BTS production environment which we cannot reproduce in other environments. Bear with me here.

As part of our solution, an orchestration (orch1) sends a direct bound message to the message box and then steps into a listen shape, with the correlated receive shape on one branch and a delay shape (implementing the receive timeout) on the other branch. The delay is set to 10 minutes.

The direct bound request is processed by a different orchestration (orch2), which then returns the response (again via direct bind) to the message box so that orch1 can pick it up.

What is happening is that about once in every 50 operations of this type, the timeout in orch1 is hit, and when the response from orch2 comes back we get a routing failure (which is what you would expect, as the instance subscription on orch1 for the message has been deleted by then).
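To make the failure mode concrete, here is a minimal Python sketch of the pattern described above. It is only an analogy, not BizTalk code: the `subscriptions` dictionary stands in for the instance subscription in the message box, and `scenario`, `listen` and the correlation id `req-1` are hypothetical names made up for illustration.

```python
import asyncio


async def listen(response_queue, timeout_s):
    """Model of the listen shape: a correlated receive raced against a delay."""
    try:
        return await asyncio.wait_for(response_queue.get(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return None  # the delay branch won


async def scenario(responder_delay_s, timeout_s):
    """Run one request/response exchange and return the ordered event log."""
    subscriptions = {}  # correlation id -> queue (stand-in for the instance subscription)
    log = []
    corr_id = "req-1"   # hypothetical correlation id
    queue = asyncio.Queue()
    subscriptions[corr_id] = queue

    async def orch2():
        # The responder only produces its reply after responder_delay_s.
        await asyncio.sleep(responder_delay_s)
        target = subscriptions.get(corr_id)
        if target is None:
            log.append("routing failure")  # nobody is subscribed any more
        else:
            target.put_nowait("response")
            log.append("response delivered")

    responder = asyncio.create_task(orch2())
    result = await listen(queue, timeout_s)
    if result is None:
        subscriptions.pop(corr_id, None)  # subscription removed on timeout
        log.append("orch1 timed out")
    else:
        log.append("orch1 received response")
    await responder
    return log
```

Calling `asyncio.run(scenario(0.01, 0.5))` models the normal case, and `asyncio.run(scenario(0.2, 0.05))` models the 1-in-50 case where the responder starts too late and its reply has nowhere to go.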

The weird thing is that orch2 does not even initialise until AFTER the timeout has been hit in orch1 (see the following screenshots).

[Screenshot: Orch1 timings]

Here you can see orch1 sends the direct bound request to the message box and 10 minutes later the timeout is being hit. The request is sent at 11:26:31 and the timeout is hit at 11:36:32.

[Screenshot: Orch2 timings]

This shows the timings of orch2. As you can see, the receive shape is only hit at 11:36:45, after the timeout has already fired in orch1.
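For reference, the gaps between the three timestamps quoted above can be worked out with a few lines of Python. Only the timestamps and the 10 minute delay come from the question; the variable names are mine.

```python
from datetime import datetime, timedelta

FMT = "%H:%M:%S"
request_sent = datetime.strptime("11:26:31", FMT)   # orch1 sends the request
timeout_hit = datetime.strptime("11:36:32", FMT)    # delay branch fires in orch1
orch2_receive = datetime.strptime("11:36:45", FMT)  # orch2 receive shape is hit

delay_setting = timedelta(minutes=10)

time_to_timeout = timeout_hit - request_sent        # how long orch1 waited
time_to_activation = orch2_receive - request_sent   # how long the request sat unprocessed
gap_after_timeout = orch2_receive - timeout_hit     # orch2 start relative to the timeout

print("orch1 waited:", time_to_timeout)
print("request waited for orch2:", time_to_activation)
print("orch2 started after timeout by:", gap_after_timeout)
```

So the request sat for a little over 10 minutes before orch2 picked it up, and orch2 started 13 seconds after orch1 had already given up.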

What is strange is that both orch1 and orch2 are hosted in the same host. Moreover, we have a load balanced cluster with 2 instances of this host available to do work, so I would expect there to always be capacity to run orch2 and process incoming work. However, this appears not to be the case.

My current suspicion is thread starvation across both host instances. My questions are:

  1. Is this a sensible suspicion?
  2. Am I doing something fundamentally wrong?
  3. Is there anything about using the listen shape which affects threading?

Just to note, we have already configured the host thread settings to the recommended levels (MaxIOThreads = 100, MaxWorkerThreads = 100, MinIOThreads = 25, MinWorkerThreads = 25).
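To show what the starvation suspicion would look like in general, here is a small Python sketch using a generic thread pool. It makes no claim about how the BizTalk engine actually schedules orchestrations; it only assumes a fixed pool in which the requesters hold every worker while they wait. `POOL_SIZE`, `TIMEOUT_S`, `requester` and `responder` are all hypothetical names, and the timings are scaled down from minutes to fractions of a second.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

POOL_SIZE = 2     # hypothetical: total worker threads available
TIMEOUT_S = 0.2   # stands in for the 10 minute delay
WORK_S = 0.1      # hypothetical processing time of the responder

events = []
events_lock = threading.Lock()
all_workers_busy = threading.Barrier(POOL_SIZE)


def record(event):
    with events_lock:
        events.append(event)


def responder(name):
    """Plays the role of orch2: can only start once a worker is free."""
    record(f"{name}: responder started")
    time.sleep(WORK_S)
    return "response"


def requester(pool, name):
    """Plays the role of orch1: sends a request, then waits with a timeout."""
    all_workers_busy.wait()  # make sure every worker is held by a requester
    reply = pool.submit(responder, name)
    try:
        result = reply.result(timeout=TIMEOUT_S)
    except FutureTimeout:
        result = "timeout"
    record(f"{name}: requester got {result}")
    return result


with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
    requests = [pool.submit(requester, pool, n) for n in ("A", "B")]
    outcomes = [r.result() for r in requests]

for line in events:
    print(line)
```

In this sketch the responders sit in the queue until a requester times out and frees its worker, which is the same ordering seen in the screenshots. Whether the real host behaves this way depends on how it manages waiting orchestration instances, so treat this as a picture of the hypothesis rather than a diagnosis.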

1 Answer

  1. Editorial Team
    Added an answer on June 5, 2026 at 11:45 pm

    Sounds like a race condition but I have no idea where.

    Have you considered separating out the tasks?

    1. First part of orch1 sends request.
    2. Orch2 processes output from task 1.
    3. Second part of orch1 processes responses from orch2/Task 2.

    The drawback is that this design has no way to respond to timeouts.
    I don’t know whether that’s important to your problem or not.
