I bought a new GSM router to get Internet access where I cannot get it otherwise. The TP-Link TL-MR6400 has much better connectivity, and hence is faster, than using my phone as a hotspot. The only trouble: it has some DNS issues when I try to access the Internet from a Raspberry Pi running Linux. In the following, I will see how far I can get in identifying the problem.
TP-Link Nameserver
Time to understand what is going on. strace is a tool that shows all the system calls made by a program, which gives us an idea of what the program is doing. Let’s run strace ping www.yahoo.com. The end of the output looks as follows:
...
socket(AF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 5
connect(5, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.1")}, 16) = 0
poll([{fd=5, events=POLLOUT}], 1, 0) = 1 ([{fd=5, revents=POLLOUT}])
sendmmsg(5, [{msg_hdr={msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\242\225\1\0\0\1\0\0\0\0\0\0\3www\5yahoo\3com\0\0\1\0\1", iov_len=31}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, msg_len=31}, {msg_hdr={msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\206\350\1\0\0\1\0\0\0\0\0\0\3www\5yahoo\3com\0\0\34\0\1", iov_len=31}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, msg_len=31}], 2, MSG_NOSIGNAL) = 2
poll([{fd=5, events=POLLIN}], 1, 5000) = 1 ([{fd=5, revents=POLLIN}])
ioctl(5, FIONREAD, [97]) = 0
recvfrom(5, "\242\225\201\200\0\1\0\3\0\0\0\0\3www\5yahoo\3com\0\0\1\0\1\300"..., 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.1")}, [28->16]) = 97
poll([{fd=5, events=POLLIN}], 1, 4937
And after some time, we get the result of the last poll operation. Essentially, the operation times out and ping tries to resend the request.
) = 0 (Timeout)
poll([{fd=5, events=POLLOUT}], 1, 0) = 1 ([{fd=5, revents=POLLOUT}])
send(5, "\242\225\1\0\0\1\0\0\0\0\0\0\3www\5yahoo\3com\0\0\1\0\1", 31, MSG_NOSIGNAL) = 31
poll([{fd=5, events=POLLIN}], 1, 5000
After a couple of retries, ping tries another strategy which again fails.
close(5) = 0
socket(AF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 5
connect(5, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.1")}, 16) = 0
poll([{fd=5, events=POLLOUT}], 1, 0) = 1 ([{fd=5, revents=POLLOUT}])
send(5, "8\365\1\0\0\1\0\0\0\0\0\0\3www\5yahoo\3com\0\0\1\0\1", 31, MSG_NOSIGNAL) = 31
poll([{fd=5, events=POLLIN}], 1, 5000) = 1 ([{fd=5, revents=POLLIN}])
ioctl(5, FIONREAD, [106]) = 0
recvfrom(5, "8\365\201\200\0\1\0\3\0\0\0\0\3www\5yahoo\3com\0\0\1\0\1\300"..., 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.1")}, [28->16]) = 106
poll([{fd=5, events=POLLOUT}], 1, 4998) = 1 ([{fd=5, revents=POLLOUT}])
send(5, "O\373\1\0\0\1\0\0\0\0\0\0\3www\5yahoo\3com\0\0\34\0\1", 31, MSG_NOSIGNAL) = 31
poll([{fd=5, events=POLLIN}], 1, 4997) = 0 (Timeout)
Eventually, ping tries a third approach which is successful. However, we do not always want to wait for all these timeouts to occur.
close(5) = 0
socket(AF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 5
connect(5, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.1")}, 16) = 0
poll([{fd=5, events=POLLOUT}], 1, 0) = 1 ([{fd=5, revents=POLLOUT}])
send(5, "8\365\1\0\0\1\0\0\0\0\0\0\3www\5yahoo\3com\0\0\1\0\1", 31, MSG_NOSIGNAL) = 31
poll([{fd=5, events=POLLIN}], 1, 5000) = 1 ([{fd=5, revents=POLLIN}])
ioctl(5, FIONREAD, [106]) = 0
recvfrom(5, "8\365\201\200\0\1\0\3\0\0\0\0\3www\5yahoo\3com\0\0\1\0\1\300"..., 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.1")}, [28->16]) = 106
close(5) = 0
socket(AF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 5
connect(5, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.1")}, 16) = 0
poll([{fd=5, events=POLLOUT}], 1, 4997) = 1 ([{fd=5, revents=POLLOUT}])
send(5, "O\373\1\0\0\1\0\0\0\0\0\0\3www\5yahoo\3com\0\0\34\0\1", 31, MSG_NOSIGNAL) = 31
poll([{fd=5, events=POLLIN}], 1, 4997) = 1 ([{fd=5, revents=POLLIN}])
ioctl(5, FIONREAD, [121]) = 0
recvfrom(5, "O\373\201\200\0\1\0\3\0\0\0\0\3www\5yahoo\3com\0\0\34\0\1\300"..., 65536, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.1")}, [28->16]) = 121
close(5) = 0
What is the difference between the different approaches?
- In the first approach, sendmmsg (send multiple messages) sends the following two messages in a single call: {iov_base="\242\225\1\0\0\1\0\0\0\0\0\0\3www\5yahoo\3com\0\0\1\0\1", iov_len=31} and {iov_base="\206\350\1\0\0\1\0\0\0\0\0\0\3www\5yahoo\3com\0\0\34\0\1", iov_len=31}. The first asks for the A record (type 1), the second for the AAAA record (type 28, shown as \34 in octal). Only the first is answered. After each timeout, the first message is resent: send(5, "\242\225\1\0\0\1\0\0\0\0\0\0\3www\5yahoo\3com\0\0\1\0\1", 31, MSG_NOSIGNAL) = 31.
- The second approach does not use sendmmsg but sends the two messages as two individual messages on the same socket: send(5, "8\365\1\0\0\1\0\0\0\0\0\0\3www\5yahoo\3com\0\0\1\0\1", 31, MSG_NOSIGNAL) = 31 and send(5, "O\373\1\0\0\1\0\0\0\0\0\0\3www\5yahoo\3com\0\0\34\0\1", 31, MSG_NOSIGNAL) = 31.
- The third approach, like the second, sends the two messages separately, but in between it closes the connection to the DNS server and opens a new one for the second message.
Essentially, the problem seems to be that the TP-Link DNS server does not answer multiple messages on the same connection.
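To see what the two messages actually ask for, here is a minimal Python sketch that decodes the payloads copied from the sendmmsg call in the trace. The names decode_query, QUERY_A and QUERY_AAAA are my own for illustration, and the decoder only handles a plain query with a single question and no name compression, which is all these packets contain.

```python
import struct

# Payloads copied from the sendmmsg() call in the trace. strace prints raw
# bytes as octal escapes, which Python bytes literals accept unchanged.
QUERY_A = b"\242\225\1\0\0\1\0\0\0\0\0\0\3www\5yahoo\3com\0\0\1\0\1"
QUERY_AAAA = b"\206\350\1\0\0\1\0\0\0\0\0\0\3www\5yahoo\3com\0\0\34\0\1"

# Only the two record types that show up in the trace.
QTYPES = {1: "A", 28: "AAAA"}


def decode_query(payload):
    """Return (transaction id, queried name, record type) of a DNS query.

    Assumes exactly one question and an uncompressed name.
    """
    # The DNS header is six 16-bit big-endian fields; only the id is needed.
    ident = struct.unpack("!H", payload[:2])[0]
    labels = []
    pos = 12  # the question section starts right after the 12-byte header
    while payload[pos] != 0:
        length = payload[pos]
        labels.append(payload[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    # After the terminating zero byte come the query type and query class.
    qtype, _qclass = struct.unpack("!2H", payload[pos + 1:pos + 5])
    return ident, ".".join(labels), QTYPES.get(qtype, str(qtype))
```

Calling decode_query(QUERY_A) and decode_query(QUERY_AAAA) shows that both messages ask for www.yahoo.com, once for the IPv4 address (A) and once for the IPv6 address (AAAA).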
Another Nameserver
Now for a quick and dirty solution. The problem seems to be that the DNS server of the TP-Link behaves in a way the Linux resolver does not like. Let’s replace the TP-Link nameserver listed in /etc/resolv.conf with a nameserver from Google (8.8.8.8) and redo the ping operation.
...
socket(AF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 5
connect(5, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("8.8.8.8")}, 16) = 0
poll([{fd=5, events=POLLOUT}], 1, 0) = 1 ([{fd=5, revents=POLLOUT}])
sendmmsg(5, [{msg_hdr={msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\222\344\1\0\0\1\0\0\0\0\0\0\3www\5yahoo\3com\0\0\1\0\1", iov_len=31}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, msg_len=31}, {msg_hdr={msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\317\205\1\0\0\1\0\0\0\0\0\0\3www\5yahoo\3com\0\0\34\0\1", iov_len=31}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, msg_len=31}], 2, MSG_NOSIGNAL) = 2
poll([{fd=5, events=POLLIN}], 1, 5000) = 1 ([{fd=5, revents=POLLIN}])
ioctl(5, FIONREAD, [97]) = 0
recvfrom(5, "\222\344\201\200\0\1\0\3\0\0\0\0\3www\5yahoo\3com\0\0\1\0\1\300"..., 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("8.8.8.8")}, [28->16]) = 97
poll([{fd=5, events=POLLIN}], 1, 4958) = 1 ([{fd=5, revents=POLLIN}])
ioctl(5, FIONREAD, [121]) = 0
recvfrom(5, "\317\205\201\200\0\1\0\3\0\0\0\0\3www\5yahoo\3com\0\0\34\0\1\300"..., 65536, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("8.8.8.8")}, [28->16]) = 121
close(5) = 0
Essentially, the Google nameservers are happy to receive the two messages as part of a send multiple messages invocation.
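To compare the two nameservers without reading strace output every time, something along the following lines could be used. It is only a rough sketch: build_query and probe are names I made up, the probe sends the A query first and the AAAA query afterwards (like the second and third approach in the trace, not like sendmmsg), and it treats any error or timeout as "no answer".

```python
import os
import socket
import struct


def build_query(name, qtype):
    """Build a minimal DNS query: random id, recursion desired, one question."""
    ident = struct.unpack("!H", os.urandom(2))[0]
    header = struct.pack("!6H", ident, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in name.split(".")
    ) + b"\0"
    # Query type (1 = A, 28 = AAAA) followed by class IN (1).
    return header + qname + struct.pack("!2H", qtype, 1)


def probe(server, name="www.yahoo.com", reopen=False, timeout=5.0):
    """Send an A query, then an AAAA query, and report which one is answered.

    reopen=False reuses one UDP socket for both queries (second approach);
    reopen=True opens a fresh socket for the AAAA query (third approach).
    """
    results = {}
    sock = None
    try:
        for label, qtype in (("A", 1), ("AAAA", 28)):
            if sock is None or reopen:
                if sock is not None:
                    sock.close()
                sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
                sock.settimeout(timeout)
                sock.connect((server, 53))
            query = build_query(name, qtype)
            try:
                sock.send(query)
                reply = sock.recv(2048)
                # A reply counts only if it carries the id of this query.
                results[label] = reply[:2] == query[:2]
            except OSError:
                # Covers timeouts as well as errors such as "port unreachable".
                results[label] = False
    finally:
        if sock is not None:
            sock.close()
    return results
```

Running probe("192.168.1.1"), probe("192.168.1.1", reopen=True) and probe("8.8.8.8") should then show, per nameserver, whether the second query gets an answer on a reused socket.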
My Solution
I would prefer to configure my local system to stick to the TP-Link DNS server and simply recreate the connection to the DNS server. I don’t believe that running to Google’s DNS server whenever there is a problem is the right solution.
Anyway, I have to admit that, after having spent enough time, this is the “solution” I am sticking to for the time being. What is the proper way of adding the Google nameservers? Looking at /etc/resolv.conf, there is a nice message that, on my system, this file is managed by resolvconf. After looking at man resolvconf and man resolvconf.conf, it turns out the proper way of specifying the nameservers is to add the following line to /etc/resolvconf.conf:
name_servers="8.8.4.4 8.8.8.8"
Figuring out the proper syntax required looking into /sbin/resolvconf, which fortunately is a shell script. To make the changes take effect, run the following command:
prompt# resolvconf -u
Too few arguments.
Too few arguments.
Yes, there are two errors, but /etc/resolv.conf has been updated and I am happy for now. If you are interested in the two nameservers, here is what Google says about them.
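To double-check that the regenerated file really lists the Google nameservers, a few lines of Python are enough. This is just a sketch: parse_resolv_conf is a helper name I made up, and it only looks at nameserver and options lines.

```python
def parse_resolv_conf(text):
    """Return (nameservers, options) found in resolv.conf-style text."""
    nameservers, options = [], []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments, which start with '#' or ';'.
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) > 1:
            nameservers.append(fields[1])
        elif fields[0] == "options":
            options.extend(fields[1:])
    return nameservers, options
```

Feeding it the contents of the file, for example parse_resolv_conf(open("/etc/resolv.conf").read()), should list 8.8.4.4 and 8.8.8.8 as nameservers if the update went through.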
Hope this helps. If there are any comments or ideas on how to solve it better, please let me know. And yes, a bug report for resolvconf and a request for a documentation update have been sent to the maintainers of my distribution.