Mainstage is Mac, right?
I'd say give JackOSX a try. The buffer size you set up in Jack overrides the delay setting in Pd. IIRC, I was using a buffer size of 512 samples on a CoreDuo with no problems and a SIGNIFICANT drop in CPU usage. At 48k, 512 samples comes out to about 10.7 ms. That's just long enough to notice the latency, but short enough that you can play almost anything without problems. It takes some adjustment to get used to shredding with it, but fast notes aren't so easily picked up by pitch detection anyway.
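If you want to see what other buffer sizes cost you, the arithmetic is just buffer size divided by sample rate. Here's a quick Python sketch (the function name is mine, nothing to do with Jack or Pd) that only covers the delay of a single buffer, not whatever the hardware and drivers add on top:

```python
def buffer_latency_ms(buffer_size: int, sample_rate: int) -> float:
    """Return the duration of one audio buffer in milliseconds."""
    return buffer_size / sample_rate * 1000.0


# Compare a few common buffer sizes at 48 kHz.
for size in (64, 128, 256, 512):
    print(f"{size:>4} samples @ 48 kHz: {buffer_latency_ms(size, 48000):.1f} ms")
```

The last line it prints is the 512-sample case, which rounds to the 10.7 ms figure above.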
I don't think the IAC drivers add any delay. However, Pd does have one really annoying behavior with MIDI output: it is delayed by the buffer size. I think the idea is that Pd's audio output would be synced with the MIDI messages, but the problem is that any audio software you are controlling with it is going to have another buffer layer, so it gets delayed again. So you want Pd's (and Mainstage's) buffers as low as you can get away with.
One (albeit annoyingly complicated) workaround for the MIDI delay is to open a second instance of Pd, leave audio turned off in it, and drop its delay down to the lowest setting. In your audio patch, have your guitar trigger OSC messages and send them to a patch in the second instance. In that second patch, convert the OSC messages to MIDI messages and send them out to Mainstage. OSC messages don't get the added delay, and since you are now converting them and sending them as MIDI from a patch with the lowest delay setting, they aren't bound by the buffer size of the audio patch.
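In Pd you'd do the conversion with objects in the second patch, but in case it helps to see what is actually being converted, here is a rough Python sketch of the same idea using only the standard library. It assumes a made-up OSC address `/note` carrying two integers (pitch and velocity); that layout is my own invention for illustration, so use whatever your audio patch really sends. It only builds and decodes the bytes and doesn't open any network or MIDI ports.

```python
import struct
from typing import List, Tuple


def _pad(raw: bytes) -> bytes:
    """Null-terminate and pad to a multiple of 4 bytes, as OSC strings require."""
    return raw + b"\x00" * (4 - len(raw) % 4)


def encode_osc_ints(address: str, *args: int) -> bytes:
    """Build an OSC message whose arguments are all 32-bit integers."""
    tags = "," + "i" * len(args)
    payload = struct.pack(f">{len(args)}i", *args)
    return _pad(address.encode("ascii")) + _pad(tags.encode("ascii")) + payload


def _read_string(data: bytes, offset: int) -> Tuple[str, int]:
    """Read a padded OSC string and return it with the next aligned offset."""
    end = data.index(b"\x00", offset)
    return data[offset:end].decode("ascii"), (end + 4) & ~3


def decode_osc_ints(data: bytes) -> Tuple[str, List[int]]:
    """Decode an OSC message that carries only 32-bit integer arguments."""
    address, offset = _read_string(data, 0)
    tags, offset = _read_string(data, offset)
    if not tags.startswith(",") or set(tags[1:]) - {"i"}:
        raise ValueError("this sketch only handles integer arguments")
    count = len(tags) - 1
    return address, list(struct.unpack_from(f">{count}i", data, offset))


def note_to_midi(pitch: int, velocity: int, channel: int = 0) -> bytes:
    """Turn pitch/velocity into a 3-byte MIDI note-on (or note-off if velocity is 0)."""
    status = (0x90 if velocity > 0 else 0x80) | (channel & 0x0F)
    return bytes([status, pitch & 0x7F, velocity & 0x7F])


# What the audio patch would send (hypothetical "/note" message) ...
packet = encode_osc_ints("/note", 64, 100)
# ... and what the second instance would turn it into for Mainstage.
address, (pitch, velocity) = decode_osc_ints(packet)
midi_bytes = note_to_midi(pitch, velocity)
```

The point is that the OSC message is tiny and carries everything the MIDI message needs, so the second instance has almost nothing to do before passing the note along.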