Deadlocked App


 
# 1  
10-13-2005

Hello All -

We have a legacy C program running non-stop on one of our servers, often with several instances running at once. Fairly regularly, one of the instances will stop writing to the log file and just deadlock/hang. I then have to 'kill' it myself.

When I gdb into one of the hung running processes, and enter the 'where' command I invariably get something like the following:

(gdb) where
#0 0x009f0402 in ?? ()
#1 0x00bdf1ce in __lll_mutex_lock_wait () from /lib/libc.so.6
#2 0x00b86abf in _L_mutex_lock_1965 () from /lib/libc.so.6
#3 0x00000000 in ?? ()

Does anyone recognise this? I'm sure it's indicative of a bug in the app, but I'm not sure how to track it down. Any suggestions would be very welcome.

Mark.
# 2  
10-13-2005
A mutex (mutual exclusion semaphore) is a gatekeeper for interprocess cooperation. It is used to allow one and only one process at a time to have access to a resource, a piece of memory, or whatever else needs protecting.

When one process sets (owns) a mutex, the other processes that want the protected resource cooperate by waiting on the mutex until it becomes free. Then one of them can get it. If the process that owns the mutex dies, or does not play fair and never releases the mutex, the waiting processes stay in a wait state forever.

It is also possible for two processes each to hold a mutex that the other one needs, and then each wait for the other's mutex without releasing its own, so neither process can go anywhere.

This is what you are seeing: a process waiting forever. Since more than one instance freezes, the second case, processes locking each other out, is probably what you are seeing.

It's a programming error.
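
The two-mutex lockout described above is usually avoided by making every caller take the locks in the same order. Here is a rough sketch, again assuming POSIX threads mutexes within one process (not confirmed for the original application). `lock_pair`, `unlock_pair`, `worker` and `run_ordered_demo` are hypothetical names; ordering by address is just one possible convention.

```c
#include <pthread.h>
#include <stdint.h>

/* Lock two DISTINCT mutexes in a fixed global order (lowest address
 * first), no matter which order the caller names them in. */
void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

/* Release order does not matter for deadlock avoidance. */
void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}

static pthread_mutex_t lock_x = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_y = PTHREAD_MUTEX_INITIALIZER;
static long shared_x, shared_y;  /* both locks are held while these change */

struct job {
    pthread_mutex_t *first;
    pthread_mutex_t *second;
};

static void *worker(void *arg)
{
    struct job *j = arg;
    int i;

    for (i = 0; i < 100000; i++) {
        lock_pair(j->first, j->second);
        shared_x++;
        shared_y++;
        unlock_pair(j->first, j->second);
    }
    return NULL;
}

/* Two threads ask for the locks in opposite orders. With plain
 * pthread_mutex_lock calls in the order requested, this can freeze;
 * with lock_pair both threads finish. */
long run_ordered_demo(void)
{
    struct job xy = { &lock_x, &lock_y };
    struct job yx = { &lock_y, &lock_x };
    pthread_t t1, t2;

    shared_x = 0;
    shared_y = 0;
    pthread_create(&t1, NULL, worker, &xy);
    pthread_create(&t2, NULL, worker, &yx);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return shared_x + shared_y;
}
```

`run_ordered_demo()` returns 400000 (two threads, 100000 rounds each, two counters), and the point is that it returns at all. Replacing `lock_pair` with two direct `pthread_mutex_lock` calls in the order each job names them is a quick way to reproduce a hang like the one in the backtrace.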
# 3  
10-13-2005
Thanks for the quick response Jim.

I'm sure it is a programming error, and probably mine. But could you give me any indication how I might track it down?

The application certainly wasn't written to do any interprocess communication, so could the mutex problem as you explained it be down to an external library (such as MySQL) or to the code that calls it (i.e. our use of the MySQL API)?

Thanks,

Mark.
# 4  
10-14-2005
I'm shaky on MySQL, but other databases provide table-level and sometimes record-level locking. Assuming MySQL does too, check whether you are doing something like 'SELECT stuff FROM mytable FOR UPDATE;', which exclusively locks the selected records. (It does in Oracle, which I do understand.)

Databases also provide for exclusive access to a resource. An Oracle example:
'LOCK TABLE mytable IN EXCLUSIVE MODE;' locks the entire table against any modification by any other Oracle session (other sessions can still query it).

If you can translate this concept to MySQL terms, that's very likely the place to start looking.