Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 6778819
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 26, 20262026-05-26T16:18:42+00:00 2026-05-26T16:18:42+00:00

First I would like to apologize, I’m giving so much information to make it

  • 0

First I would like to apologize, I’m giving so much information to make it as clear as possible what the problem is. Please let me know if there’s still anything which needs clarifying.

(Running erlang R13B04, kernel 2.6.18-194, centos 5.5)

I have a very strange problem. I have the following code to listen and process sockets:

%Opts used to make listen socket
-define(TCP_OPTS, [binary, {packet, raw}, {nodelay, true}, {reuseaddr, true}, {active, false},{keepalive,true}]).

%Acceptor loop which spawns off sock processors when connections
%come in
accept_loop(Listen) ->
    case gen_tcp:accept(Listen) of
    {ok, Socket} ->
        Pid = spawn(fun()->?MODULE:process_sock(Socket) end),
        gen_tcp:controlling_process(Socket,Pid);
    {error,_} -> do_nothing
    end,
    ?MODULE:accept_loop(Listen).

%Probably not relevant
process_sock(Sock) ->
    case inet:peername(Sock) of
    {ok,{Ip,_Port}} -> 
        case Ip of
        {172,16,_,_} -> Auth = true;
        _ -> Auth = lists:member(Ip,?PUB_IPS)
        end,
        ?MODULE:process_sock_loop(Sock,Auth);
    _ -> gen_tcp:close(Sock)
    end.

process_sock_loop(Sock,Auth) ->
    try inet:setopts(Sock,[{active,once}]) of
    ok ->
        receive
        {tcp_closed,_} -> 
            ?MODULE:prepare_for_death(Sock,[]);
        {tcp_error,_,etimedout} -> 
            ?MODULE:prepare_for_death(Sock,[]);

        %Not getting here
        {tcp,Sock,Data} ->
            ?MODULE:do_stuff(Sock,Data);

        _ ->
            ?MODULE:process_sock_loop(Sock,Auth)
        after 60000 ->
            ?MODULE:process_sock_loop(Sock,Auth)
        end;
    {error,_} ->
        ?MODULE:prepare_for_death(Sock,[]) 
    catch _:_ -> 
        ?MODULE:prepare_for_death(Sock,[])
    end.

This whole setup works wonderfully normally, and has been working for the past few months. The server operates as a message passing server with long-held tcp connections, and it holds on average about 100k connections. However now we’re trying to use the server more heavily. We’re making two long-held connections (in the future probably more) to the erlang server and making a few hundred commands every second per each of those connections. Each of those commands, in the common case, spawn off a new thread which will probably make some kind of read from mnesia, and send some messages based on that.

The strangeness comes when we try to test those two command connections. When we turn on the stream of commands, any new connection has about 50% chance of hanging. For instance, using netcat if I were to connect and send along the string “blahblahblah” the server should immediately return back an error. In doing this it won’t make any calls outside the thread (since all it’s doing is trying to parse the command, which will fail because blahblahblah isn’t a command). But about 50% of the time (when the two command connections are running) typing in blahblahblah results in the server just sitting there for 60 seconds before returning that error.

In trying to debug this I pulled up wireshark. The tcp handshake always happens immediately, and when the first packet from the client (netcat) is sent it acks immediately, telling me that the tcp stack of the kernel isn’t the bottleneck. My only guess is that the problem lies in the process_sock_loop function. It has a receive which will go back to the top of the function after 60 seconds and try again to get more from the socket. My best guess is that the following is happening:

  • Connection is made, thread moves on to process_sock_loop
  • {active,once} is set
  • Thread receives, but doesn’t get data even though it’s there
  • After 60 seconds thread goes back to the top of process_sock_loop
  • {active, once} is set again
  • This time the data comes through, things proceed as normal

Why this would be I have no idea, and when we turn those two command connections off everything goes back to normal and the problem goes away.

Any ideas?

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-26T16:18:42+00:00Added an answer on May 26, 2026 at 4:18 pm

    it’s likely that your first call to set {active,once} is failing due to a race condition between your call to spawn and your call to controlling_process

    it will be intermittent, likely based on host load.

    When doing this, I’d normally spawn a function that blocks on something like:
    {take,Sock}

    and then call your loop on the sock, setting {active,once}.

    so you’d change the acceptor to spawn, set controlling_process then Pid ! {take,Sock}

    something to that effect.
    note: I don’t know if the {active,once} call actually throws when you aren’t the controlling processes, if it doesn’t, then what I just said makes sense.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I would like to make a First Access Database Setup Process in my spring
First of all, I would like to apologize for posting this again. I am
First off I would like to apologize beforehand, in case this turns out to
First, Let me apologize as I know NOTHING about ruby. I can read through
First of all I would like to apologize for the question itself. I simply
I would like to apologize at first instant for asking some stupid question(well not
First let me apologize for the scale of this problem but I'm really trying
First I would like to say that I am new to this site, never
I've been thinking about playing with Java3D . But first I would like to
I would like to get the first item from a list matching a condition.

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.