Bug 1357 - SOCKS proxy attempts fail to some servers due to DNS timeouts
Status: CLOSED WORKSFORME
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: sshd
Version: 4.6p1
Hardware: ix86 Linux
Importance: P2 normal
Assignee: Assigned to nobody
Reported: 2007-09-04 11:35 AEST by Jamie Nicolson
Modified: 2020-02-14 15:59 AEDT

Attachments
proposed patch (724 bytes, patch)
2007-09-04 11:35 AEST, Jamie Nicolson

Description Jamie Nicolson 2007-09-04 11:35:16 AEST
Created attachment 1345
proposed patch

---PROBLEM DESCRIPTION---

I use ssh as a SOCKS 5 proxy for Firefox, and I have configured Firefox to perform remote DNS lookups. That is, the SOCKS request contains the hostname rather than the IP address of the host I want to connect to.

For the vast majority of sites I connect to, this works great. However, for a few hosts, including www.etrade.com and www.vanguard.com, the connection hangs for several seconds, then times out.

Although I think it's irrelevant, my SSH client is OpenSSH 4.6p1 on MacOS 10.4.

My server is OpenSSH 4.6p1 on Linux 2.6.12.5.


---INVESTIGATION---

I ran strace on sshd and saw that the DNS lookup of www.vanguard.com was hanging: the DNS server took much longer than 5 seconds to respond. I decoded the DNS request and saw that it was requesting QTYPE 28, the AAAA record, which is the request for the IPv6 address.
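
For anyone repeating this step, the QTYPE can be read straight out of the query bytes that strace shows. The following is only a sketch in Python, not the method used in this report: build_query() and decode_question() are hypothetical helper names, and the decoder assumes an uncompressed question name, which is the normal case for a query.

```python
import struct

# QTYPE values of interest: A is defined in RFC 1035, AAAA in RFC 3596.
QTYPE_NAMES = {1: "A", 28: "AAAA"}


def build_query(hostname, qtype, query_id=0x1234):
    """Build a minimal DNS query packet: 12-byte header plus one question."""
    # Flags 0x0100 = standard query with "recursion desired"; QDCOUNT = 1.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # QCLASS 1 = IN


def decode_question(packet):
    """Return (hostname, qtype) for the first question in a DNS query."""
    offset = 12  # skip the fixed-size DNS header
    labels = []
    while True:
        length = packet[offset]
        offset += 1
        if length == 0:  # zero-length label terminates the name
            break
        labels.append(packet[offset:offset + length].decode("ascii"))
        offset += length
    qtype, _qclass = struct.unpack_from("!HH", packet, offset)
    return ".".join(labels), qtype


if __name__ == "__main__":
    name, qtype = decode_question(build_query("www.vanguard.com", 28))
    print(name, QTYPE_NAMES.get(qtype, qtype))
```

In practice the bytes passed to decode_question() would be the payload of the sendto() call captured by strace rather than a packet built locally.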

Next I tried this DNS lookup with dig. I ran "dig -t aaaa www.vanguard.com", and it hung for about 20 seconds before finally returning. I ran "dig -t aaaa www.yahoo.com", and it returned immediately.

I ran these same dig tests on a different machine, serviced by a different ISP and DNS servers, and got the same results.

My conclusion is that an AAAA lookup on some hosts will hang for a long time.

Next I downloaded portable OpenSSH, compiled my own sshd, and found the function connect_to() in channels.c. Note that the call to getaddrinfo() is passing in a hints structure consisting of ai_family=IPv4or6 and ai_socktype=SOCK_STREAM. The hints parameter is optional, and if it is not specified it still allows either IPv4 or IPv6 results. I replaced hints with NULL and recompiled. My problem went away.
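
The effect of the ai_family hint can be observed without rebuilding sshd. The sketch below uses Python's socket.getaddrinfo(), which wraps the C library call; families_for() is a hypothetical helper name. Note that Python always passes a hints structure, so this shows the AF_UNSPEC versus AF_INET difference only and does not reproduce the hints=NULL case.

```python
import socket


def families_for(host, port, family=socket.AF_UNSPEC, flags=0):
    """Return the address families that getaddrinfo() yields for host.

    family=AF_UNSPEC asks for IPv4 and IPv6 results; family=AF_INET
    restricts the lookup to IPv4.
    """
    try:
        infos = socket.getaddrinfo(
            host, port, family=family, type=socket.SOCK_STREAM, flags=flags
        )
    except socket.gaierror:
        return []  # nothing usable for the requested family
    return sorted({socket.AddressFamily(info[0]).name for info in infos})
```

On the affected server, comparing families_for("www.vanguard.com", 80) with families_for("www.vanguard.com", 80, socket.AF_INET) should show whether restricting the family avoids the slow lookup.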

---RECOMMENDATION---

I recommend that the hints parameter be omitted, as this seems to fix the hanging behavior while still working correctly on all sites I try to connect to.
Comment 1 Damien Miller 2007-09-04 11:44:03 AEST
Could you try setting "AddressFamily inet" in your /etc/ssh/sshd_config instead?

The fix is not correct and will, among other things, break the AddressFamily option.
Comment 2 Jamie Nicolson 2007-09-04 11:55:05 AEST
Yes, that fixed it. *sigh*

Would you not agree that "AddressFamily=any" is still broken in the common case (where IPv6 is not used)? It should not hang like it does.
Comment 3 Darren Tucker 2007-09-04 12:03:55 AEST
(In reply to comment #2)
> Yes, that fixed it. *sigh*
> 
> Would you not agree that "AddressFamily=any" is still broken in the
> common case (where IPv6 is not used)? It should not hang like it does.

I think the brokenness is in the DNS infrastructure in question.

Quoth RFC4074 (ftp://ftp.rfc-editor.org/in-notes/rfc4074.txt):

"4.  Problematic Behaviors

   There are some known cases at authoritative servers that do not
   conform to the expected behavior.  This section describes those
   problematic cases.

  4.1.  Ignore Queries for AAAA

   Some authoritative servers seem to ignore queries for an AAAA RR,
   causing a delay at the stub resolver to fall back to a query for an A
   RR.  This behavior may cause a fatal timeout at the resolver or at
   the application that calls the resolver.  Even if the resolver
   eventually falls back, the result can be an unacceptable delay for
   the application user, especially with interactive applications like
   web browsing."

Comment 4 Damien Miller 2007-09-04 12:06:24 AEST
Your platform's resolver should "do the right thing" when AddressFamily=any is in use, as this sets hints.ai_family to AF_UNSPEC, which should be equivalent to not setting hints at all (for the common case at least).

Perhaps your resolver is defaulting to IPv4-only lookups when no hints are specified, but doing IPv6-then-IPv4 when hints.ai_family==AF_UNSPEC. IMO this would be rather silly behaviour, but I have seen libc authors do dumber things...

IIRC we need to fill out hints for either SRV RR or SCTP support, but my memory is hazy.
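
One documented difference that could be tested here: the Linux getaddrinfo(3) manual page states that glibc treats hints == NULL as ai_flags = AI_V4MAPPED | AI_ADDRCONFIG, whereas an explicit hints structure carries only the flags the caller sets. Whether that accounts for the behaviour in this report is an assumption; nothing in this thread confirms it. A rough way to compare the two cases, sketched in Python with the hypothetical helper timed_lookup():

```python
import socket
import time


def timed_lookup(host, port, flags=0):
    """Resolve host with ai_family=AF_UNSPEC; return (families, seconds)."""
    start = time.monotonic()
    try:
        infos = socket.getaddrinfo(
            host, port, family=socket.AF_UNSPEC,
            type=socket.SOCK_STREAM, flags=flags,
        )
        families = sorted({socket.AddressFamily(i[0]).name for i in infos})
    except socket.gaierror:
        families = []  # lookup failed or timed out in the resolver
    return families, time.monotonic() - start
```

Running timed_lookup(host, 80) and timed_lookup(host, 80, socket.AI_ADDRCONFIG) against one of the slow hostnames on the affected server would show whether the flag changes how long the lookup takes.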
Comment 5 Jamie Nicolson 2007-09-04 12:18:54 AEST
Wow, great research on the AAAA problem.

So it's pretty clear that the DNS server is misbehaving. My getaddrinfo implementation (glibc 2.5 I think) is also doing something I don't understand, which may be wrong. Maybe the behavior has changed in later versions of getaddrinfo(). I'll try to explore that more.

Does this warrant a note in the FAQ or other documentation? Maybe it's a rare problem. "If you use SOCKS proxying with remote DNS lookup and connections to some hosts time out, and you don't need IPv6 support, try setting 'AddressFamily inet' in your sshd config." Unfortunately many users won't have access to their server's sshd config. Would changing it on the client side work?
Comment 6 Darren Tucker 2019-01-24 08:55:59 AEDT
Cleaning up some old bugs: I don't think there's anything else we should do here.  IPv6 has gotten better since the original report and I don't think it's OpenSSH's job to document all of its possible failure modes.
Comment 7 Damien Miller 2020-02-14 15:59:20 AEDT
Closing all resolved bugs with the release of openssh-8.2