I have two threads in a process. They race for a shared memory region, and I am trying to synchronize them with a semaphore. But I randomly get a failure with errno 4 when one thread calls semop right after the other. I did a little digging, and it seems that the call was interrupted.
EINTR: While blocked in this system call, the process caught a signal; see signal(7). Is errno 4 this one?
Please note lines 583 and 601.
Which system call interrupted it? Was it the semop() function itself? Is there any way to ignore this interruption, or to recover and restart the function?
Can semop be used in a multi-threaded environment?
[Switching to Thread -1208269120 (LWP 4501)]
GetMyQue2Wait (MyModule=RM, wait_shm_ptr=0xbf8a5cf4) at tdm_ipc.c:247
247 TDM_SEM_P( MyModule );
(gdb) s
tdm_sem_p (mid=RM) at tdm_ipc.c:579
579 sem_b.sem_num = 0;
(gdb) s
580 sem_b.sem_op = -1;
(gdb) s
581 sem_b.sem_flg = SEM_UNDO;
(gdb) s
583 if (semop(TDM_M[mid].semid, &sem_b, 1) == -1)
(gdb) s
[Switching to Thread -1208480880 (LWP 4506)]
GetMyQue2Send (MyModule=RM, send_shm_ptr=0xb7f7ff54) at tdm_ipc.c:180
180 DMINT TryTimes = SEND_TIMES;
(gdb) s
353 TDM_SEM_V( DstModule );
(gdb) s
tdm_sem_v (mid=RM) at tdm_ipc.c:597
597 sem_b.sem_num = 0;
(gdb) s
598 sem_b.sem_op = 1;
(gdb) s
599 sem_b.sem_flg = SEM_UNDO;
(gdb) s
601 if (semop(TDM_M[mid].semid, &sem_b, 1) == -1)
(gdb) s
606 return SUCC;
(gdb) s
607 }
(gdb) s
RM:4501: V operation on Semaphore .
SEND_MSG (SrcModule=51, DstModule=RM, msg_ptr=0xb7f7ff94, MsgLength=28) at tdm_ipc.c:368
368 printf("%s:%d: SEND_MSG: succeeded.\n",
(gdb) s
RM:4501: SEND_MSG: succeeded.
[Switching to Thread -1208269120 (LWP 4501)]
tdm_sem_p (mid=RM) at tdm_ipc.c:585
585 printf("thread %u: errno = %d\n", (unsigned int)pthread_self(),errno);
(gdb) s
thread 3086698176: errno = 4
[Switching to Thread -1208480880 (LWP 4506)]
main thread:
...
while(1)
{
if ((RetVal = WAIT_MSG( p1, &Msg )) !=SUCC)
{
switch ( RetVal )
{
...
}
}
}
------------------------------------
thread1:
...
send(src, dst, &msg, lenght);
/* both SEND_MSG() and WAIT_MSG() have an operation P and V on semid by calling the following */
DMINT tdm_sem_p( key_t semid )
{
struct sembuf sem_b;
sem_b.sem_num = 0;
sem_b.sem_op = -1;
sem_b.sem_flg = SEM_UNDO;
if (semop(semid, &sem_b, 1) == -1)
{
printf("thread %u: errno = %d\n", (unsigned int)pthread_self(),errno);
return S_PFAIL;
}
return SUCC;
}
DMINT tdm_sem_v( key_t semid )
{
struct sembuf sem_b;
sem_b.sem_num = 0;
sem_b.sem_op = 1;
sem_b.sem_flg = SEM_UNDO;
if (semop(semid, &sem_b, 1) == -1)
{
return S_VFAIL;
}
return SUCC;
}
/* semid is init by the following */
DMINT tdm_set_sem(key_t semid)
{
union semun sem_union;
sem_union.val = 1;
if (semctl(semid, 0, SETVAL, sem_union) == -1)
{
return FAILURE;
}
return SUCC;
}
This problem is also described at another link, which may have a poor problem description:
P semaphore failed
Thanks.
Errno 4 is indeed EINTR. When you get that error, it means the system call you were running (semop in this case) was interrupted by a signal.
You’re responsible for restarting the system call in that case. Only a limited set of system calls restart automatically, and then only if the signal handler was set up using the SA_RESTART flag. See signal(7) for the details on that, “Interruption of System Calls and Library Functions by Signal Handlers” section. You’ll notice semop is in the list of system calls that is never restarted, regardless of the disposition of the signal handler.
How you restart the call is up to you. One of the ways is to do something like:
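The code from the original answer did not survive here, so what follows is only a minimal sketch of the usual retry idiom, not necessarily what was originally posted. The helper name semop_retry is hypothetical; it simply repeats semop for as long as the call fails with EINTR and passes every other result through unchanged.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/*
 * Hypothetical helper (not part of the question's code): call semop()
 * again whenever it was interrupted by a signal (errno == EINTR).
 * Any other failure is returned to the caller with errno left intact.
 */
static int semop_retry(int semid, struct sembuf *sops, size_t nsops)
{
    int rc;

    do {
        rc = semop(semid, sops, nsops);
    } while (rc == -1 && errno == EINTR);

    return rc;
}
```

In the question's tdm_sem_p and tdm_sem_v, the direct semop(semid, &sem_b, 1) call could then be swapped for semop_retry(semid, &sem_b, 1), assuming that blindly retrying is acceptable for your program. If a signal is meant to cancel the wait, you would check a flag inside the loop instead.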
You don’t know what signal interrupted a given system call unless you have a handler for that signal.
gdb does have options for signal handling though, so you could try and find out with that. Try handle all print to start with maybe.