I recently ran into an issue where a colleague complained about several processes that kept hanging on a Solaris 10 machine. On investigation, processes such as format and powermt, and even a dtrace invoked for diagnostics, hung and could not be killed:
# pkill -9 format
# ps -ef |grep -c format
2
In such cases, a good old truss session usually explains what is going on, but this time truss came back with a rather peculiar message:
# truss -p 26632
truss: unanticipated system error: 26632
#
# pstack 26632
pstack: cannot examine 26632: unanticipated system error
#
# pfiles 26632
pfiles: unanticipated system error: 26632
When the proc tools themselves fail like this, the only remaining option is to rely on the kernel debugger to determine the cause:
# mdb -k
Loading modules: [ unix genunix specfs dtrace ufs sd pcisch md ip hook neti sctp arp usba fcp fctl ssd nca lofs zfs cpc fcip random crypto logindmux ptm nfs ipc ]
> ::pgrep format
S    PID   PPID   PGID    SID    UID      FLAGS             ADDR NAME
R   1241      1    942    686      0 0x4a004900 000006001414c060 format
> 000006001414c060::thread
            ADDR      STATE  FLG PFLG SFLG   PRI  EPRI PIL             INTR
000006001414c060 inval/2000 1424 de50    0     0     0   0              n/a
> 000006001414c060::walk thread | ::findstack
stack pointer for thread 300012b7700: 2a10055cb01
[ 000002a10055cb01 cv_wait+0x38() ]
  000002a10055cbb1 PowerSleep+0x14()
  000002a10055cc71 PowerGetSema+0xe8()
  000002a10055cd31 power_open+0x364()
  000002a10055cea1 spec_open+0x4f8()
  000002a10055cf61 fop_open+0x78()
  000002a10055d011 vn_openat+0x500()
  000002a10055d1d1 copen+0x260()
  000002a10055d2e1 syscall_trap32+0xcc()
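The same lookup can also be run non-interactively, which is handy when several hung PIDs need checking. The following is a minimal sketch, not part of the original session: the function names stack_dcmd_for_pid and kernel_stacks are made up for illustration, and it assumes root access and a live kernel that mdb -k can attach to. It uses ::pid2proc to get from a PID to its process structure, then walks the threads as in the session above.

```shell
#!/bin/sh
# Hypothetical helpers, named here for illustration only.

# Build the mdb pipeline that prints the kernel stack of every thread
# belonging to the given PID. The 0t prefix marks the number as decimal,
# because mdb reads bare numbers as hexadecimal.
stack_dcmd_for_pid() {
    printf '0t%s::pid2proc | ::walk thread | ::findstack\n' "$1"
}

# Feed that pipeline to the kernel debugger on the live system.
# Assumes root privileges and that mdb is on the PATH.
kernel_stacks() {
    stack_dcmd_for_pid "$1" | mdb -k
}
```

With these loaded, kernel_stacks 26632 would print the stacks for that PID without an interactive mdb session.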
In this case, it was the PowerPath MPIO driver that was blocked on a semaphore: the stack shows the open of the device waiting in PowerGetSema. Further investigation revealed that the PowerPath driver entries had been removed from the /etc/system file. Restoring the correct version of that file and rebooting solved the problem.
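One way to spot entries that have gone missing from /etc/system is to compare its active lines against a known-good copy. The sketch below assumes such a copy exists somewhere (the path in the usage note is hypothetical) and that a POSIX grep is available; on Solaris 10 that would be /usr/xpg4/bin/grep rather than /usr/bin/grep. The function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers, named here for illustration only.

# Print the active lines of an /etc/system-style file, skipping
# blank lines and comment lines (which start with '*').
active_system_lines() {
    grep -v '^[[:space:]]*\*' "$1" | grep -v '^[[:space:]]*$'
}

# Print every active line of the known-good file ($1) that does not
# appear verbatim in the current file ($2).
missing_system_lines() {
    good=$1
    current=$2
    active_system_lines "$good" | while IFS= read -r line; do
        # -F: fixed string, -x: whole line must match, -q: no output
        grep -Fqx -- "$line" "$current" || printf '%s\n' "$line"
    done
}
```

For example, missing_system_lines /path/to/known-good/system /etc/system would list the entries to restore; an empty result means nothing from the reference copy is missing.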