Weblog entry #6 for drgraefy
For the better part of the last year, I have been struggling with a frustrating svn problem. My group's svn is served through apache2+ssl. Seemingly at random, svn checkouts/updates would hang indefinitely, with a "C-c" the only way to escape. Certain files seemed to be especially problematic, but again not consistently. Most times, but not always, the hangs were accompanied by the following messages in /var/log/apache/error.log:
[Thu Feb 22 12:46:37 2007] [error] [client 220.127.116.11] Provider encountered an error while streaming a REPORT response. [500, #0]
[Thu Feb 22 12:46:37 2007] [error] [client 18.104.22.168] A failure occurred while driving the update report editor [500, #104]
[Thu Feb 22 12:46:37 2007] [error] [client 22.214.171.124] Error writing base64 data: Connection reset by peer [500, #104]
I have been trying for months to figure out what the problem is, but to no avail. Numerous Google searches turned up people with similar issues, but never with any indication of what the problem might actually be, or how to get around it. Finally, after much struggle, we had a breakthrough yesterday.
Our network is on a private NAT'd lan. We finally noticed that the hangs were only occurring on machines located on our internal lan, and not on machines on the wan. This was a curious and important revelation. Internally the fqdn of our web site (foo.bar for the sake of argument), which resides on a server in our private lan, maps to our external IP address (126.96.36.199), just as it does externally. That address in turn corresponds to the wan port of our crappy D-Link router/gateway. External ports 80 and 443 on that address are then mapped to the web server (10.0.0.5).
What does this mean? Well, it means that internally, requests for foo.bar are first routed to the wan port of the router, which then sends them back to the web server. Apparently this was causing our crappy little router to choke, and drop connections. To confirm this, we changed the internal DNS to point foo.bar to the web server 10.0.0.5 directly. Once this was done, no more svn checkout/update hangs.
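For anyone wanting to check whether their own setup has the same round trip (usually called hairpin NAT or NAT loopback), here is a rough Python sketch, not something we actually ran. The function names and the client address 10.0.0.23 are made up for illustration; 10.0.0.5 and 126.96.36.199 are the addresses from above. It only asks one question: is a client on a private address being handed a public address for the server?

```python
import ipaddress
import socket


def resolve_ipv4(hostname):
    """Return the sorted IPv4 addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(
        hostname, 443, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is (address, port).
    return sorted({info[4][0] for info in infos})


def takes_hairpin_path(resolved_addr, client_addr):
    """True if a LAN client would reach the server via a public address.

    Heuristic (an assumption of this sketch): the client sits on a private
    address, but the name resolves to a public one, so traffic has to go
    out to the router's wan side and come back in.
    """
    client = ipaddress.ip_address(client_addr)
    target = ipaddress.ip_address(resolved_addr)
    return client.is_private and not target.is_private


# Before the DNS change: LAN client is handed the external address.
before = takes_hairpin_path("126.96.36.199", "10.0.0.23")
# After the DNS change: LAN client is handed the web server directly.
after = takes_hairpin_path("10.0.0.5", "10.0.0.23")
```

On a real LAN machine you would feed it the output of resolve_ipv4("foo.bar") (with your actual host name) instead of hard-coded addresses. It says nothing about whether the router copes with the round trip, only whether the round trip happens.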
It's funny this never seemed to manifest itself elsewhere, but we don't do much heavy data transfer internally from the web server over ssl, except via svn. Basically the router couldn't turn around ssl packets fast enough. Ultimately, the problem is that we are running our network on a private, NAT'd lan. We shouldn't have to do this, and it's always a pain in the ass for one reason or another. As a wise man once quoted to me: 'NAT is not the answer. "NAT?" is the question, and the answer is "NOT!"'.