Watch out for zombies!

My mother’s Yahrzeit is coming up, and her name will be on the Kaddish list this Shabbat, so perhaps it’s appropriate that I’m making a posting she would have considered complete gibberish.

For the last three weeks, my MacBook Pro has been giving me fits. When I tried to start a program, sometimes it just wouldn’t start. And, when I looked in /var/log/system.log, it was littered with lovely messages like these:

Apr 17 00:45:31 dssmac com.apple.launchd[103] ([0x0-0x2effefd].com.apple.systemevents): fork() failed, will try again in one second: Resource temporarily unavailable
Apr 17 00:45:31 dssmac com.apple.launchd[103] ([0x0-0x2effefd].com.apple.systemevents): Bug: launchd_core_logic.c:6780 (23714):35: jr->p
Apr 17 00:45:36 dssmac /usr/bin/osascript[13552]: spawn_via_launchd() failed, errno=12 label=[0x0-0x2f01eff].com.apple.systemevents path=/System/Library/CoreServices/System Events.app/Contents/MacOS/System Events flags=1
Apr 17 00:45:36 dssmac com.apple.launchd[103] ([0x0-0x2f01eff].com.apple.systemevents): fork() failed, will try again in one second: Resource temporarily unavailable
Apr 17 00:45:36 dssmac com.apple.launchd[103] ([0x0-0x2f01eff].com.apple.systemevents): Bug: launchd_core_logic.c:6780 (23714):35: jr->p
Apr 17 00:45:42 dssmac /usr/bin/osascript[13553]: spawn_via_launchd() failed, errno=12 label=[0x0-0x2f03f01].com.apple.systemevents path=/System/Library/CoreServices/System Events.app/Contents/MacOS/System Events flags=1
Apr 17 00:45:42 dssmac com.apple.launchd[103] ([0x0-0x2f03f01].com.apple.systemevents): fork() failed, will try again in one second: Resource temporarily unavailable
Apr 17 00:45:42 dssmac com.apple.launchd[103] ([0x0-0x2f03f01].com.apple.systemevents): Bug: launchd_core_logic.c:6780 (23714):35: jr->p

with the occasional

Apr 15 14:41:42 dssmac kernel[0]: proc: table is full

thrown in for bad measure.

I couldn’t figure out what was going wrong (Activity Monitor showed only 60 to 80 processes, far fewer than the system limit), so yesterday I reinstalled Mac OS X using the archive-and-install method, and it didn’t help.

I was, needless to say, unhappy. I hadn’t brought my external drive to the office, so I couldn’t do a bare-metal reinstall yet. But I could (and did) tweet about my problem:

Still getting fork(1) failures (“resource not available” — which one, dammit?), so I guess it’s time for a full reinstall. Crud.

This one caught the eye of many people who wanted to help, and I want to mention two in particular:

Ed Costello thought it might be hardware — I ran the hardware diagnostics, which showed nothing.

Rich Berlin (from Sun) made the suggestion that wound up putting me on the right path: running

sudo dtrace -n 'syscall::fork*:entry{printf("%s %d",execname,pid);}'

which showed two Eclipse-based processes forking their little hearts out. So I did a “ps” to discover what they were (unsurprisingly, Lotus Notes and Lotus Sametime), but what startled me was how many “(NotesDynConfig)” processes there were in the process table. I wondered how many, so I ran

ps -aA | wc

and was shocked to see a result of about 160, compared with the 70 processes shown in Activity Monitor. So I stopped Notes, and suddenly I was down to 70 processes by both methods.

It seems that Activity Monitor doesn’t report zombie processes. Neither does the line at the top of “top(1)”, which I’d also used while trying to troubleshoot.
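Rather than comparing two totals, the zombies can be counted directly. This is only a minimal sketch: it assumes a ps that accepts -A and -o stat= (the BSD ps on Mac OS X and procps on Linux both do) and that zombies carry a state string starting with Z. The function name count_zombies is made up for illustration.

```shell
# Hypothetical helper: count zombie processes.
# Reads one process-state string per line on stdin (the output of
# "ps -A -o stat=") and prints how many of them start with Z.
count_zombies() {
  awk '$1 ~ /^Z/ { n++ } END { print n + 0 }'
}
```

On a live system it would be fed like this: ps -A -o stat= | count_zombies. Reading from stdin keeps the function easy to try against saved ps output as well.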

Given that discrepancy, I can now understand why the system was running out of processes. I don’t know why Notes is leaving zombies around, but that’s a problem for another day (my next step is to upgrade to the latest beta and see if it helps — I’ve also reported the problem, of course).
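A zombie stays in the process table until its parent waits for it, so grouping zombies by parent PID is one way to see which program is failing to clean up. The sketch below assumes input in the form produced by ps -A -o stat= -o ppid= -o pid= -o comm= (state, parent PID, PID, command name). The function name zombies_by_parent is hypothetical.

```shell
# Hypothetical helper: tally zombies per parent PID.
# Expects lines of "STAT PPID PID COMMAND" on stdin and prints
# "count ppid" pairs, with the parent holding the most zombies first.
zombies_by_parent() {
  awk '$1 ~ /^Z/ { count[$2]++ }
       END { for (p in count) print count[p], p }' | sort -rn
}
```

On a live system: ps -A -o stat= -o ppid= -o pid= -o comm= | zombies_by_parent. The parent PID on the first line of output is the one to look up with ps.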

And I guess I probably don’t really have to do a full reinstall…though I might, anyway — it’s my Windows training coming to the fore.


7 Responses to Watch out for zombies!

  1. epc says:

    Wow.

    I hadn’t realized the errors were mostly coming from launchd, I’d thought it was a variety of apps.

    Need to learn more about dtrace, that looks handy.

  2. David says:

    Upgrading to the latest beta has solved this problem, btw. I wish I’d looked at a ‘ps’ display a few weeks ago!

  3. David says:

    @epc Actually, the errors were coming from a variety of apps; I just picked a set of launchd error messages to post here.

    I had many others I could have chosen.

  4. David Conradie says:

    You aren’t using ID Vault by any chance, are you? I am trialling ID Vault in our company AND simultaneously moved to the 8.5 Standard client on a new MBPro after happily running the Basic config client on a PowerBook. Other colleagues who use Std on MBP have never reported this issue, so I’m currently suspicious that something in the ID Vault-related code is what’s causing NotesDynConfig to keep re-running.

  5. David says:

    No; I’ve never heard of ID Vault, in fact.

    This problem was due to a bug in an internal beta of 8.5.1, and has been fixed in later builds.

  6. Clytie Siddall says:

    Thank you very much for this info. I have been struggling with this “cannot fork” problem (and its inevitable and far too frequent enforced restarts) ever since I upgraded to Leopard.

    Using “ps”, I found my culprit to be Quicksilver. I certainly don’t want to live without QS, so I will follow this up on its forum.

  7. David says:

    Glad to be able to help, Clytie — but you may need to dig deeper (sorry!).

    Did you look at the details from the ‘ps’ command to see what the zombies were? It’s quite possible that something Quicksilver is launching is leaving the zombie behind, but when you kill QS, the system reaps all of its children for you.

    ps -aA | grep '('

    will show you all of the zombie processes (of course, you can add flags to get even more details) — their names may point you at the real culprit.

    [I’ve also posted this to your thread on macrumors.com, but as a new user there, it won’t show up until a moderator gets to it.]