![]() |
![]() |
|
|
mi
Author / Автор: Sergey Satskiy
|
![]() |
![]() |
This is how to run mi with an application to test:
satsky.homelinux.com:~> ./mi.py ./test/test-bad-lock-order-elf
Test (improper lock order, elf) t0: m1.lock -> m2.lock -> m3.lock -> m3.unlock -> m2.unlock -> m1.unlock
t1: m1.lock -> m3.lock -> m2.lock -> m2.unlock -> m3.unlock -> m1.unlock
|
A log file is formed as the result of run:
satsky.homelinux.com:~> ll mi.log -rw-rw-r-- 1 guest guest 4411 2010-01-31 20:54 mi.log |
Now you can run the script which analyses the collected information:
swift@satsky.homelinux.com:~> ./statmi.py
Execution environment:
application: /home/guest/test/test-bad-lock-order-elf
log file: /home/guest/mi.log
libpthread.so path: /lib64/libpthread.so.0
do not print stack trace
Number of threads: 3
Number of mutexes: 4
Successfull operations: 56
Failed operations: 0
Collecting chains and statistics...
Threads legend:
t0 -> 140209742624528
t1 -> 140209742620944
t2 -> 140209732131088
Mutexes legend:
m0 -> 0x310baf7040
m1 -> 0x6013c0
m2 -> 0x601400
m3 -> 0x601440
--- E000 -- t2: m3 -> m2 -- t1: m2 -> m3
ERROR: potential dead lock detected
Thread t2 lock stack:
Operation: lock Object: 0x601400(m2) Thread: 140209732131088(t2) Return code: 0 Clocks: 0
Operation: lock Object: 0x601440(m3) Thread: 140209732131088(t2) Return code: 0 Clocks: 0
Operation: lock Object: 0x6013c0(m1) Thread: 140209732131088(t2) Return code: 0 Clocks: 0
Thread t1 lock stack:
Operation: lock Object: 0x601440(m3) Thread: 140209742620944(t1) Return code: 0 Clocks: 0
Operation: lock Object: 0x601400(m2) Thread: 140209742620944(t1) Return code: 0 Clocks: 0
Operation: lock Object: 0x6013c0(m1) Thread: 140209742620944(t1) Return code: 0 Clocks: 0
--- E000
|
Each error message is surrounded by lines which start from three dashes:
--- E000 -- t2: m3 -> m2 -- t1: m2 -> m3 . . . --- E000 |
The ‘E’ character means that this is an error. The ‘W’ character is used for warnings. The message type character is followed by a three digit number.
The message header has a short error descripton. The notation:
t2: m3 -> m2 -- t1: m2 -> m3 |
should be read as follows: thread t2 locked mutex m3 and then locked mutex m2 while thread t1, in turn, locked mutex m2 and then locked mutex m3.
The error details have information about the mutexes lock order for both threads. The lock order is given in the reverse chronological order i.e. the last locked mutex comes first.
Even more detailed information which includes call stacks for each mutex lock operation can be collected if mi is run with the --option stack option key, for example:
satsky.homelinux.com:~mi> ./mi.py --option stack ./test/test-bad-lock-order-elf
Test (improper lock order, elf) t0: m1.lock -> m2.lock -> m3.lock -> m3.unlock -> m2.unlock -> m1.unlock
t1: m1.lock -> m3.lock -> m2.lock -> m2.unlock -> m3.unlock -> m1.unlock
|
And then:
satsky.homelinux.com:~> ./statmi.py
Execution environment:
application: /home/guest/test/test-bad-lock-order-elf
log file: /home/guest/mi.log
libpthread.so path: /lib64/libpthread.so.0
print stack trace
Number of threads: 3
Number of mutexes: 4
Successfull operations: 56
Failed operations: 0
Collecting chains and statistics...
Threads legend:
t0 -> 140490701883152
t1 -> 140490701879568
t2 -> 140490691389712
Mutexes legend:
m0 -> 0x310baf7040
m1 -> 0x6013c0
m2 -> 0x601400
m3 -> 0x601440
--- E000 -- t2: m3 -> m2 -- t1: m2 -> m3
ERROR: potential dead lock detected
Thread t2 lock stack:
Operation: lock Object: 0x601400(m2) Thread: 140490691389712(t2) Return code: 0 Clocks: 0
Backtrace:
/home/guest/test/test-bad-lock-order-elf [0x400a57]
/lib64/libpthread.so.0 [0x310640685a]
/lib64/libc.so.6 : clone()+0x6d
Operation: lock Object: 0x601440(m3) Thread: 140490691389712(t2) Return code: 0 Clocks: 0
Backtrace:
/home/guest/test/test-bad-lock-order-elf [0x400a4d]
/lib64/libpthread.so.0 [0x310640685a]
/lib64/libc.so.6 : clone()+0x6d
Operation: lock Object: 0x6013c0(m1) Thread: 140490691389712(t2) Return code: 0 Clocks: 0
Backtrace:
/home/guest/test/test-bad-lock-order-elf [0x400a43]
/lib64/libpthread.so.0 [0x310640685a]
/lib64/libc.so.6 : clone()+0x6d
Thread t1 lock stack:
Operation: lock Object: 0x601440(m3) Thread: 140490701879568(t1) Return code: 0 Clocks: 0
Backtrace:
/home/guest/test/test-bad-lock-order-elf [0x4009fe]
/lib64/libpthread.so.0 [0x310640685a]
/lib64/libc.so.6 : clone()+0x6d
Operation: lock Object: 0x601400(m2) Thread: 140490701879568(t1) Return code: 0 Clocks: 0
Backtrace:
/home/guest/test/test-bad-lock-order-elf [0x4009f4]
/lib64/libpthread.so.0 [0x310640685a]
/lib64/libc.so.6 : clone()+0x6d
Operation: lock Object: 0x6013c0(m1) Thread: 140490701879568(t1) Return code: 0 Clocks: 0
Backtrace:
/home/guest/test/test-bad-lock-order-elf [0x4009ea]
/lib64/libpthread.so.0 [0x310640685a]
/lib64/libc.so.6 : clone()+0x6d
--- E000
|
The call stacks for each mutex lock operation are printed in the reverse chronological order similar to the mutex locks stacks.
mi uses the following packages:
To build the shared library which intercepts pthread_mutex_zzz(...) calls type:
make love |
mi does not require any special installation. Just copy the mi files (libmi.so, mi.py, statmi.py) to the location where you want to keep it.
You might also want to create symbolic links and update the PATH variable respectively.
mi was tested on Fedora 11 distribution.
To get help type:
./mi.py –help |
mi-0.0.1.tar.bz2 (13610 bytes)
Changelog (07-Feb-2010):
Q: How do I run mi if the application to test has to be run via a shell script?
A: It is necessary to specify a full path to the libpthread.so library using the --pthread option key at the moment of running the mi.py script, for example:
satsky.homelinux.com:~> ./mi.py --pthread /lib/libpthread-2.10.2.so my_prog.sh |
Q: Are there any extra steps to test a daemon application?
A: No. Your daemon is run the same as you run a usual application to test. It is however required to wait till the daemon is finished before you run the statmi.py script.
Q: What are the statmi.py return codes?
A: 0 – if there are no errors and no warnings. 1 – if there is at least one warning and there are no errors for the tested application. 2 – if there is at least one error in the tested application. 3 – if there are internal statmi.py errors.
Q: statmi.py prints information about mutexes which are not in my code. Is this a mi bug?
A: No. The run-time environment can use mutexes for internal purposes and mi intercepts operations with those mutexes too. Usually such mutexes have a distinguished range of addresses (see mutex m0 in the mi usage example section) and operations on those mutexes are finished before the first operation on the tested application mutexes.
mi was developed for fun though a few side goals were also persued:
If you like mi and wish to help improve it – you are welcome to!