Multibox client connect latency



Otlecs
05-01-2007, 05:51 AM
Hi folks,

I'll say right off the bat that this isn't a Multibox software problem, hence posting in the "General" forum :)

In fact, it's probably better suited to a random tech forum, but I'm sure somebody here knows the answer.

Short version: Does anyone know why GetHostByAddr() would hang for 29 seconds before returning a correctly resolved hostname under Windows XP?

Long version:

I have three XP Pro machines and one Vista box. I run the Multibox Server on one of the XP Pro machines and the others (yes, including the Vista machine) all run the client software for a 4 box setup.

This has been running successfully for several weeks now.

On Saturday evening, just before hitting the sack, I decided to perform a bunch of Windows updates on all machines (oh, when will I ever learn...).

Sunday morning, I fired up my rig and... hmmm... that's not right... every time a remote client connects to the server, the server hangs, the client loses connection due to keepalive timeout, the server wakes up, rinse and repeat.

I figured it was something to do with Vista clients, but the same problem occurred for all machines EXCEPT when the local client connected. Every time a remote client connected, the server would hang.

I checked these forums, and whilst not finding a solution I did find the latest source code (THANK YOU for making that public... /hug). I downloaded it, grabbed the wxWidgets package and other things I seemed to be missing from my development environment and set about debugging it.

As an aside, it took me about 3 hours to get to a buildable state due to initial problems getting wxWidgets to build, but my mate Google managed to find answers to all my problems and I got there in the end.

I disabled the mouse hook (so as to have at least SOME control over my machine while the server was hung!) and set about debugging it.

The problem became immediately apparent. As part of the client connection code in the server, it does a "is this a local client connecting?" test.

There are a number of checks, including an implicit (through wxWidgets) reverse DNS lookup.
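
For illustration only: a "local client?" test does not strictly need a name lookup, because the peer address can be compared numerically. The sketch below is not the Multibox or wxWidgets code. `is_local_peer` and its `local_addresses` parameter are hypothetical names, and it assumes the caller already knows the server's own IP addresses.

```python
import ipaddress

def is_local_peer(peer_ip, local_addresses=()):
    """Return True if peer_ip is loopback or one of this machine's own addresses.

    Hypothetical helper: addresses are compared numerically, so no
    reverse DNS lookup (and no resolver timeout) is involved.
    """
    peer = ipaddress.ip_address(peer_ip)
    if peer.is_loopback:
        return True
    # local_addresses is assumed to be supplied by the caller.
    return any(peer == ipaddress.ip_address(addr) for addr in local_addresses)
```

A check like this trades hostname matching for speed, so it would only suit setups where the machines are identified by address.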

The server would get to GetHostByAddr and disappear inside it for 29 seconds (consistently) before returning having *successfully* resolved the address into a host name!

Now, although I'm a bit of a code monkey, my living is made programming platforms other than Windows so I'm no expert in this environment, but I do recall there are (were?) problems with GetHostByAddr because it attempts a number of address resolution methods, each with its own timeout.

If it was failing to resolve the address, I could understand this being the case, but it wasn't failing. I imagine it's possible, though, that one or more of the methods it attempted failed before a later one resolved the address successfully.
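
A quick way to confirm a delay like this outside the server is to time the reverse lookup on its own. The following is a minimal sketch using Python's standard `socket` module rather than the wxWidgets call path; `timed_reverse_lookup` is a hypothetical helper name, and the resolver is passed in so that different lookup functions can be compared.

```python
import socket
import time

def timed_reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Run a reverse lookup and return (hostname_or_None, seconds_elapsed)."""
    start = time.perf_counter()
    try:
        # Both gethostbyaddr and getnameinfo return the name as element 0.
        hostname = resolver(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are subclasses of OSError.
        hostname = None
    elapsed = time.perf_counter() - start
    return hostname, elapsed
```

Calling `timed_reverse_lookup("192.168.0.11")` (a placeholder address) on the server machine should show whether the lookup itself accounts for the pause. Passing `resolver=lambda ip: socket.getnameinfo((ip, 0), socket.NI_NAMEREQD)` would time the `getnameinfo` route instead.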

I briefly considered changing GetHostByAddr to GetNameInfo to see if that helped, but was anxious just to play by then so decided that I might as well just stick entries in my local hosts file for all the machines on my LAN.
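
For reference, a hosts file entry is one line per machine: the IP address, some whitespace, then the host name. On Windows XP and Vista the file normally lives at %SystemRoot%\system32\drivers\etc\hosts. The sketch below only formats such lines. The addresses and machine names are invented placeholders, and `format_hosts_entries` is a hypothetical helper.

```python
def format_hosts_entries(machines):
    """Turn a {hostname: ip} mapping into hosts-file lines ("<ip><TAB><hostname>")."""
    return [f"{ip}\t{name}" for name, ip in sorted(machines.items())]

# Placeholder LAN; substitute the real addresses and machine names.
lan = {"server-xp": "192.168.0.10", "client-vista": "192.168.0.13"}
print("\n".join(format_hosts_entries(lan)))
```

Entries like these only stay valid while each machine keeps the same address, so they fit best with static IPs or fixed DHCP leases.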

Hey presto, problem solved.

So, yes, I've solved the problem but I'm... a) puzzled as to what started it (maybe one of the updates disabled a previously running service?), and... b) curious as to why nobody else had any problems!

Sorry for the waffle. It's a slow day here ;)

Thanks in advance for any hints!

Micah
08-26-2007, 04:27 PM
Ugh, just started having the same problem today. Software was running great last night and when I woke up this morning it was broken. :/

Lance
08-26-2007, 04:42 PM
I used to have the same prob as well until I did what otlecs suggested. Worked perfectly ever since.

Heenan
08-27-2007, 11:59 AM
You're not alone...

I've been using MultiBox for a while now and this just started happening to me this weekend (I guess we all performed the same Windows updates). I added entries to all the hosts files and now it works like it used to :)

Ughmahedhurtz
08-27-2007, 03:16 PM
This problem also affects Multiplicity (or the version I use anyway). If I set the units up by hostname using DHCP on the boxes, it causes all sorts of issues. If I give them static IPs and configure through IP numbers, zero issues.