Watch out for zombies!

My mother’s Yahrzeit is coming up, and her name will be on the Kaddish list this Shabbat, so perhaps it’s appropriate that I’m making a posting she would have considered complete gibberish.

For the last three weeks, my MacBook Pro has been giving me fits. When I tried to start a program, sometimes it just wouldn’t start. And, when I looked in /var/log/system.log, it was littered with lovely messages like these:

Apr 17 00:45:31 dssmac com.apple.launchd[103] ([0x0-0x2effefd].com.apple.systemevents): fork() failed, will try again in one second: Resource temporarily unavailable
Apr 17 00:45:31 dssmac com.apple.launchd[103] ([0x0-0x2effefd].com.apple.systemevents): Bug: launchd_core_logic.c:6780 (23714):35: jr->p
Apr 17 00:45:36 dssmac /usr/bin/osascript[13552]: spawn_via_launchd() failed, errno=12 label=[0x0-0x2f01eff].com.apple.systemevents path=/System/Library/CoreServices/System Events.app/Contents/MacOS/System Events flags=1
Apr 17 00:45:36 dssmac com.apple.launchd[103] ([0x0-0x2f01eff].com.apple.systemevents): fork() failed, will try again in one second: Resource temporarily unavailable
Apr 17 00:45:36 dssmac com.apple.launchd[103] ([0x0-0x2f01eff].com.apple.systemevents): Bug: launchd_core_logic.c:6780 (23714):35: jr->p
Apr 17 00:45:42 dssmac /usr/bin/osascript[13553]: spawn_via_launchd() failed, errno=12 label=[0x0-0x2f03f01].com.apple.systemevents path=/System/Library/CoreServices/System Events.app/Contents/MacOS/System Events flags=1
Apr 17 00:45:42 dssmac com.apple.launchd[103] ([0x0-0x2f03f01].com.apple.systemevents): fork() failed, will try again in one second: Resource temporarily unavailable
Apr 17 00:45:42 dssmac com.apple.launchd[103] ([0x0-0x2f03f01].com.apple.systemevents): Bug: launchd_core_logic.c:6780 (23714):35: jr->p
    

with the occasional

Apr 15 14:41:42 dssmac kernel[0]: proc: table is full    
    

thrown in for bad measure.

I couldn’t figure out what was going wrong (Activity Monitor showed only 60 to 80 processes, far fewer than the system limit), so yesterday I reinstalled Mac OS X using the archive-and-install method, and it didn’t help.

I was, needless to say, unhappy. I hadn’t brought my external drive to the office, so I couldn’t do a bare-metal reinstall yet. But I could (and did) tweet about my problem:

Still getting fork(1) failures (“resource not available” — which one, dammit?), so I guess it’s time for a full reinstall. Crud.

This one caught the eye of many people who wanted to help, and I want to mention two in particular:

Ed Costello thought it might be hardware — I ran the hardware diagnostics, which showed nothing.

Rich Berlin (from Sun) made the suggestion which wound up putting me on the right path — he suggested running:

sudo dtrace -n 'syscall::fork*:entry{printf("%s %d",execname,pid);}'

which showed two Eclipse-based processes forking their little hearts out. So I did a “ps” to discover what they were (unsurprisingly, Lotus Notes and Lotus Sametime), but what startled me was how many “(NotesDynConfig)” processes there were in the process table. I wondered how many, so I ran

ps -aA | wc

and was shocked to see a result of about 160, compared with the 70 processes shown in Activity Monitor. So I stopped Notes and suddenly, I was down to 70 processes via both methods.

It seems that Activity Monitor doesn’t report zombie processes. Neither does the line at the top of “top(1)”, which I’d also used while trying to troubleshoot.
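For anyone chasing the same problem, here is a minimal sketch of how to list zombies by the parent that failed to reap them. The helper name count_zombies_by_parent is made up for illustration, and it assumes ps reports a state beginning with “Z” for zombie (defunct) processes, which is the usual convention.

```shell
# Count zombie (defunct) processes, grouped by parent PID.
# Reads "STAT PPID COMMAND" rows on stdin, so it works on live ps output
# or on a saved copy of it. Prints "<count> <parent pid>", busiest parent first.
count_zombies_by_parent() {
  awk '$1 ~ /^Z/ { n[$2]++ } END { for (p in n) print n[p], p }' | sort -rn
}

# Live usage (each "-o name=" suppresses that column's header):
#   ps -A -o stat= -o ppid= -o comm= | count_zombies_by_parent
```

The parent PID is the interesting part: zombies go away only when their parent collects them or exits, which matches what happened when Notes was stopped.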

Given that discrepancy, I can now understand why the system was running out of processes. I don’t know why Notes is leaving zombies around, but that’s a problem for another day (my next step is to upgrade to the latest beta and see if it helps — I’ve also reported the problem, of course).
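To see how close the process table is to full, something like the following sketch would do. The helper name proc_headroom is hypothetical; kern.maxproc is the system-wide process limit that sysctl reports on Mac OS X (there is also a per-user limit, kern.maxprocperuid, which may well be reached first).

```shell
# Report how much of the process table is in use.
# $1 is the limit; stdin carries one line per process (for example from ps).
proc_headroom() {
  awk -v limit="$1" 'END { printf "%d of %d process slots used, %d free\n", NR, limit, limit - NR }'
}

# Live usage on Mac OS X (left as a comment so the block is safe to paste):
#   ps -A -o pid= | proc_headroom "$(sysctl -n kern.maxproc)"
```

Because it counts ps output rather than what Activity Monitor shows, zombies are included in the total.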

And I guess I probably don’t really have to do a full reinstall…though I might, anyway — it’s my Windows training coming to the fore.

Twitter Search beats Google — malware attack averted

As I was driving to work this morning, my iPhone tinged, letting me know I had a new SMS awaiting me. And it tinged a second time as I pulled into my parking place, since I don’t check SMS messages while I’m driving.

It was a Facebook notification from an IBM colleague with a subject of “How did you manage to get on this video?”, sent to me and 19 others, with a link to a geocities.com page.

I was immediately suspicious, because the note wasn’t in my colleague’s style — but it was rather short, so perhaps that wasn’t valid. I was also suspicious because the names on the note were a rather mixed bag.

But it was vaguely possible that the video had something to do with IBM’s Smarter Planet initiative, so I didn’t want to discard the note.

Instead, I did the obvious thing: I Googled for “Facebook” and “get on this video”, looking for reports of malware. But I found nothing. I tried a few other variants, including “Facebook malware” and still found nothing.

So I went to plan B: Twitter. Nothing was obvious on my home page, so I posted a query: “Just got suspicious-looking facebook msg: ‘How did you manage to get on this video?’ with a link to GeoCities. Anyone know if it’s malware?”

While I waited for an answer, I tried Twitter Search, using “Facebook” as my query. Within seconds, I had my answer — yes, it was malware, and apparently virulent stuff.

And when I went back to my Twitter page, I’d gotten three replies from friends telling me the same thing (the first one arrived less than a minute after my tweet).

For timely questions, Twitter is my new go-to tool — sure, Google has depth, but it’s not instantaneous. Twitter gives me three paths to an answer:

  • Stumbling on it in my friends’ tweetstream without ever asking the question
  • Asking the question and hoping a friend answers
  • Using Twitter Search

My search strategy on Twitter is different from what I’d use on Google, though. On Google, it helps to be specific: a search on “Facebook” alone would be pretty useless, hence my attempts to qualify it with the hook phrase and the word “malware”.

In contrast, on Twitter, timeliness is your friend — a one-word query (“Facebook”) is just fine, because you’re going to get the current conversation, and the human eye can do a good job of picking out the pay dirt if there is any.

I guess I’ll never find out how I got on that video, though.

In my CUPS yet again!

A few days ago, I posted a reminder about how to deal with installing the latest level of CUPS and the HPIJS drivers on a Mac.

Tonight, I discovered an important caveat when I tried to install them on Jeff’s new Mac: do not try to install the HPIJS drivers before you install GhostScript (and possibly FooMatic)!

If you do, the installer will go into an uninterruptible loop, beeping at you, and the only way out is a shutdown.

Needless to say, I discovered this the hard way.

When “It Just Works” becomes inoperative

Last Friday, I discovered that I couldn’t synch my iPod with my MacBook Pro. The iPod thought it was connected, but the Mac didn’t; oddly enough, I was able to synch my iPhone just fine.

Rebooting the iPod didn’t help, so I decided to reboot the Mac. It didn’t want to go down gracefully for some reason (it kept complaining about programs not ending), so I finally brought up a terminal window and typed “sudo shutdown -r now” to force a reboot.

That was a mistake. I got a big Do-Not-Enter sign on the screen. Repeatedly. So I booted the install DVD and ran Disk Utility to verify the drive — it had no complaints.

Back to booting the disk, this time in Verbose mode (press Apple-V right after the power switch and keep it pressed until the bong sounds). The first attempt was a complete failure; it couldn’t load mach_kernel. But I persevered (it’s not like I had a choice) and got farther, to the point that I started seeing “disk0s2: 0xe0030005 (Undefined)” errors on screen, each of which was accompanied by a long pause.

A quick visit to Google told me that the disk was failing if not already dead (which undoubtedly explained my many spinning beachballs and failures to shut down over the past few days). So I decided to go home and see if I could rescue any data before taking the machine into the shop.

At home, I connected the system to my Mac mini, brought it up in FireWire Target Disk Mode (press and hold “T” right after powering on), and managed to recover most of my home directory before it was time for my appointment with a Genius at the Apple Store.

The Genius asked me what I’d done and then suggested I try a reboot while he watched, not in verbose mode. 15 minutes later, the system was up. He then suggested I:

  1. Take the machine home without rebooting
  2. Make a copy of the disk on an external drive
  3. Use Disk Utility to write zeros on the hard drive so that it would assign alternate sectors
  4. Reinstall the OS
  5. Move data back to the machine
  6. Get on with my life

He was half right.

I used SuperDuper! to clone the drive. It took three tries, extending well into Saturday night, before I was able to get a complete copy made.

Then I ran Disk Utility in “secure erase” mode to zero out the drive, reinstalled Leopard, and started the long process of moving things back from the external drive. I was suspicious of the integrity of the copy, so I didn’t move any binaries back, just my data — that meant reinstalling many programs and getting them back up to date (Microsoft Office 2004 was especially pernicious, requiring me to run the updater at least 8 times).

But by late Sunday, I was finished.

On Tuesday, though, I started seeing spinning beachballs again. By Wednesday morning, they were frequent. And a perusal of /var/log/system.log showed more “disk0s2: 0xe0030005 (Undefined)” errors. So I knew I had to have the disk replaced, which was going to be a problem, because the Genius had told me that it would take 4-7 business days, which would extend into an upcoming trip.
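A rough way to watch whether those errors are getting more frequent is to tally them per day. This is only a sketch: the helper name disk_errors_per_day is invented, and it assumes the kernel lines in system.log start with the same “Mon DD HH:MM:SS” timestamp as other syslog entries.

```shell
# Tally "disk0s2: 0xe0030005" errors per day from syslog-style lines on stdin.
# Prints "<month> <day> <count>", one line per day that had errors.
disk_errors_per_day() {
  awk '/disk0s2: 0xe0030005/ { n[$1 " " $2]++ } END { for (d in n) print d, n[d] }' | sort
}

# Live usage:
#   disk_errors_per_day < /var/log/system.log
```

A count that climbs from one day to the next is a good hint that the drive is on its way out.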

The machine was out of warranty, so I could have fixed it myself, but life is too short for that. And since it was the company’s machine, not mine, I really wanted to take it to an authorized servicer. But 4-7 days was unacceptable. Fortunately, there are alternatives to the Apple Store, listed right on the Apple site.

I called the closest one, ClickAway, and was speaking to a tech within a few minutes. He said that they’d happily install a new drive (which they’d sell me or I could pick up at Fry’s) the same day. And they’d try to recover the data, or they could sell me a SATA case for $25 so I could do it myself (and then wipe the drive afterwards).

And they did just that. They even installed Leopard for free, saving me the trouble of doing it from the DVD. And they finished two hours earlier than they’d estimated. And the price of the whole process, including the SATA case and a larger drive than I’d originally had, was just about the same as just getting the drive swapped for an identical drive at the Apple Store.

I still had to reinstall and reupdate my software again and put my data back on. But I’ve gotten good at that.

Lessons Learned:

  • Backups are good.
  • Backups before the drive fails are better.
  • Even if all your “important” data is on multiple machines, backups are good.
  • Image backups are very good.
  • Geniuses are not always right.

I now also have a Time Capsule, which is busily backing up my Mac mini as I speak (I’ve already backed up the MBP). I wish the mini were close enough to connect it through a cable, because an over-the-air backup of 300MB takes a very long time.

And I’ve registered my copy of SuperDuper! to make image backups easier in the future. $27.95 is cheap insurance, and I already have used the program to save my butt, so it’s even retroactive insurance (hey, it works for Warren Buffett, so why not for me?).

Fluid Twittering

On Monday, I read a posting on 43 Folders about using Fluid to create site-specific browsers. The author created a browser for I Want Sandy (a tool I plan to check out one of these days), but I thought it would be perfect for Twitter.

But since I was at work and mostly busy when I read the posting, I contented myself with posting it to del.icio.us for “later”.

That evening, though, I was on Twitter and noticed that Firefox was suffering from Spinning Beachball Syndrome — it didn’t die on me, but it spent a lot of time gazing at its own navel. Restarting it helped, but only for a short while. Then someone mentioned Flock, which I’d tried early in its life but hadn’t looked at since (I even managed to pass by their booth at Macworld, though it wasn’t intentional on my part). I didn’t really want to install Yet Another Browser, but the conversation made me think of Fluid.

I downloaded it and fired it up; less than a minute later, I had a Twitter-specific browser on my system. Since it’s Webkit-based, it doesn’t have the extensions and add-ons that I’ve laden Firefox with — and it’s fast. And since it’s an independent browser, it survives when I forget myself and close Firefox (or when it closes itself).

I just wish I could figure out how to make F5 the refresh key; instead, I have to remember to use Cmd-R. Which doesn’t work in Firefox.

Highly recommended, and the price is right: free (as in beer). That’s http://fluidapp.com — check it out!