
If you have ever worked in the networking space, or in support for that matter, you probably have good stories concerning troubleshooting a lost connection. Inevitably, some new system is installed and configured, but we cannot get traffic to pass to or from the system.

One such story I can recall goes something like this.

The use case

I was troubleshooting a back-end storage array that had just been installed by a large, well-known American financial corporation. The customer was trying to configure data replication from his production (PROD) site in Houston to a disaster recovery (DR) site in San Antonio. The disaster recovery system had long been implemented and had a reputation for being a known good config.

The production site was brand new, and of course, the cause of all of our problems. The real issue was that we were unable to get traffic moving between the two replication interfaces that we had configured. We were able to reach outside of the gateway on the DR site, but we were unable to reach outside of the PROD site. Traffic was hitting the gateway device and being dropped.

Within a half-hour of troubleshooting, I asked the client if they had a firewall at the PROD site that could be blocking traffic on the needed ports. "Of course not, there is no firewall between these sites." This response was shocking considering that this was one of the largest financial institutions in the entire country. But, as all customer-facing positions go, you have to be respectful and polite.

So, I ran through every check I could think of from my side. Internal storage firewall disabled: check. Ports open from DR to PROD: check. Ports open from PROD to DR? Nope.

As it turned out, after four hours of troubleshooting and reconfiguring interfaces, the customer said, "Let me get our firewall guy on the call."

Your what guy?

That is a weird position to have hired for, considering that you don't have a firewall between these sites. But it wasn't weird, because of course, they had a firewall. Issue solved. With the nightmare over, I can tell you that the tools I used to figure out where the issue occurred were good old-fashioned Telnet (which we can cover in a later article) and, of course, traceroute.

The command

Now that you can see a clear use case for traceroute, let's talk about the command itself, and what information you can get from it. The purpose of this article, after all, is that you come away with a little more knowledge about the utility that traceroute offers.

The syntax is rather simple. The command traceroute <x> (x here being an IP address or hostname) is the most basic version, and it will begin to send packets to the designated target. The output lets you trace the path those packets take from your machine through each of the systems between you and your desired destination.

For example, if I wanted to trace the path from my computer to google.com, I would enter something like this:

[root@rhel8dev ~]# traceroute www.google.com
traceroute to www.google.com (216.58.194.100), 30 hops max, 60 byte packets
 1  _gateway (192.168.2.1)  2.396 ms  2.726 ms  3.057 ms
 2  145.sub-66-174-43.myvzw.com (66.174.43.145)  119.355 ms  119.315 ms  119.508 ms
 3  * * *
 4  10.209.189.140 (10.209.189.140)  120.321 ms  119.836 ms  120.009 ms
 5  66.sub-69-83-106.myvzw.com (69.83.106.66)  119.042 ms  119.489 ms  119.156 ms
 6  2.sub-69-83-107.myvzw.com (69.83.107.2)  120.039 ms  125.954 ms  101.450 ms
 7  112.sub-69-83-96.myvzw.com (69.83.96.112)  110.757 ms  108.485 ms  122.108 ms
 8  112.sub-69-83-96.myvzw.com (69.83.96.112)  115.028 ms  121.073 ms  125.537 ms
 9  116.sub-69-83-96.myvzw.com (69.83.96.116)  121.793 ms  124.769 ms  124.434 ms
10  Bundle-Ether10.GW6.DFW13.ALTER.NET (140.222.237.123)  128.082 ms  128.400 ms  126.509 ms
11  google-gw.customer.alter.net (204.148.43.118)  106.276 ms  107.885 ms  105.718 ms
12  108.170.252.129 (108.170.252.129)  99.725 ms  101.797 ms 108.170.252.161 (108.170.252.161)  101.671 ms
13  108.170.230.109 (108.170.230.109)  101.207 ms  100.515 ms  99.730 ms
14  dfw06s48-in-f100.1e100.net (216.58.194.100)  99.059 ms  94.502 ms  94.015 ms
[root@rhel8dev ~]# 

The breakdown

Let's break down these results into smaller bites. This command can produce a lot of information, and as the saying goes, "The best way to eat an elephant is one bite at a time."

[root@rhel8dev ~]# traceroute www.google.com
traceroute to www.google.com (216.58.194.100), 30 hops max, 60 byte packets
 1  _gateway (192.168.2.1)  2.396 ms  2.726 ms  3.057 ms

We are only looking at the first hop here. However, we can use this hop to dissect the info that is on display. First up, we see what is actually being sent, and where:

traceroute to www.google.com (IP), 30 hops max, 60 byte packets

From this output, we gather that we are sending traffic to the desired target (www.google.com) and which IP address that name resolved to. By default, traceroute probes a maximum of 30 hops and sends 60-byte packets.
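Both defaults shown in that header line can be changed. The following is a minimal sketch, not something from the original troubleshooting session: the helper name build_traceroute_command is hypothetical, and it assumes the common traceroute syntax in which -m sets the maximum number of hops and the packet length is given as a trailing argument after the target. It only builds and prints the command; it does not run it.

```python
import shlex


def build_traceroute_command(target, max_hops=30, packet_bytes=60):
    """Build a traceroute command line as a list of arguments.

    Hypothetical helper for illustration. Assumes `-m` sets the hop limit
    and the packet length is the last positional argument.
    """
    return ["traceroute", "-m", str(max_hops), target, str(packet_bytes)]


# Limit the trace to 15 hops while keeping the 60-byte packet size.
command = build_traceroute_command("www.google.com", max_hops=15)
print(shlex.join(command))
```

If you wanted to actually execute the result, the list could be handed to subprocess.run on a machine where traceroute is installed.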

Next, we see the first hop occur. Here we are hitting the local default gateway:

 1  _gateway (192.168.2.1)  2.396 ms  2.726 ms  3.057 ms

You can tell here where hop one actually landed, and then there are three numerical values. These are round-trip times (RTTs): the time a probe packet takes to reach that hop plus the time the hop's ICMP reply takes to get back to the source. By default, traceroute sends three probe packets to each hop, which is why there are three values. You can find more information on this process online; however, the short of it is that each probe is sent with a limited Time-to-Live (TTL), and the device where the TTL runs out sends an ICMP "time exceeded" message back to the source. Traceroute uses that message to identify the hop and measure the RTT, so even though it is classed as an ICMP error message, it does not necessarily indicate an error.
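To make the probing process a little more concrete, here is a small simulation of it. This is a sketch under stated assumptions, not how the traceroute binary is implemented: no packets are sent, the path is a hypothetical list of addresses (the last entry stands in for the destination), and the names Reply, send_probe, and trace are made up for illustration. It assumes UDP probes, where the destination answers with a "port unreachable" message instead of "time exceeded."

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reply:
    kind: str    # "time-exceeded" or "port-unreachable"
    source: str  # address of the device that answered


def send_probe(path, ttl):
    """Simulate one probe crossing `path` (routers first, destination last)."""
    *routers, destination = path
    for router in routers:
        ttl -= 1  # each router decrements the TTL before forwarding
        if ttl == 0:
            # TTL expired in transit: this router reports back to the sender.
            return Reply("time-exceeded", router)
    # The probe survived every router, so the destination itself answers.
    return Reply("port-unreachable", destination)


def trace(path, max_hops=30):
    """Raise the TTL one step at a time until the destination replies."""
    hops = []
    for ttl in range(1, max_hops + 1):
        reply = send_probe(path, ttl)
        hops.append((ttl, reply.source, reply.kind))
        if reply.kind == "port-unreachable":
            break  # destination reached, stop probing
    return hops


# Hypothetical path: a home gateway, two routers, and a destination host.
HYPOTHETICAL_PATH = ["192.168.2.1", "198.51.100.1", "203.0.113.7", "203.0.113.100"]

for hop, source, kind in trace(HYPOTHETICAL_PATH):
    print(f"{hop:2d}  {source}  ({kind})")
```

Each pass through the loop in trace corresponds to one numbered line of traceroute output. The real tool repeats every TTL three times and times each reply, which is where the three RTT values come from.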

Now, let's look at hops 2 to 4:

 2  145.sub-66-174-43.myvzw.com (66.174.43.145)  119.355 ms  119.315 ms  119.508 ms
 3  * * *
 4  10.209.189.140 (10.209.189.140)  120.321 ms  119.836 ms  120.009 ms

We can see something new here. Hop 2 looks normal: a device responds with RTTs in the 100-millisecond range. Then, it gets interesting. At hop 3, we see only stars (*).

What do these stars (asterisks) mean? Were the packets dropped? Did they time out?

Let me explain. There are two possibilities when it comes to these stars. First, the device may simply not answer traceroute's probes. If the traceroute command completes successfully and you see these stars, most likely the device at that hop is configured not to send ICMP replies. This result does not mean that the traffic wasn't passed. The second possibility is that the packets were dropped due to an issue on the network. In that case, the probes typically time out because of a fault on the path or because a firewall is blocking the traffic, and the stars usually continue for the remaining hops.

As you can see in the above example, even after we see stars at hop 3, the packets continue and replies resume at hop 4. This behavior leads to a successful traceroute, as we can see in the full output that Google has been reached.
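The same reasoning can be applied to saved output with a short script. The sketch below is only one way to do it, and the function names parse_hops and summarize are hypothetical. It assumes output in the format shown in this article, flags hops that returned only stars, and checks whether the final hop is the target IP from the header line. The sample is abbreviated from the trace above.

```python
import re

HOP_LINE = re.compile(r"^\s*(\d+)\s+(.*)$")                # hop number, then the rest
ADDRESS = re.compile(r"\((\d{1,3}(?:\.\d{1,3}){3})\)")     # IPv4 address in parentheses
RTT = re.compile(r"(\d+(?:\.\d+)?) ms\b")                  # a round-trip time value


def parse_hops(output):
    """Return (hop_number, addresses, rtts) for every hop line in the output."""
    hops = []
    for line in output.splitlines():
        match = HOP_LINE.match(line)
        if not match:
            continue  # header, prompt, or blank line
        number, rest = int(match.group(1)), match.group(2)
        rtts = [float(value) for value in RTT.findall(rest)]
        hops.append((number, ADDRESS.findall(rest), rtts))
    return hops


def summarize(output, target_ip):
    """List the hops that gave no reply and say whether the target answered."""
    hops = parse_hops(output)
    silent = [number for number, _addresses, rtts in hops if not rtts]
    reached = bool(hops) and target_ip in hops[-1][1]
    return {"silent_hops": silent, "reached_target": reached}


# Abbreviated copy of the trace from this article (hops 5 to 13 left out).
SAMPLE = """\
traceroute to www.google.com (216.58.194.100), 30 hops max, 60 byte packets
 1  _gateway (192.168.2.1)  2.396 ms  2.726 ms  3.057 ms
 2  145.sub-66-174-43.myvzw.com (66.174.43.145)  119.355 ms  119.315 ms  119.508 ms
 3  * * *
 4  10.209.189.140 (10.209.189.140)  120.321 ms  119.836 ms  120.009 ms
14  dfw06s48-in-f100.1e100.net (216.58.194.100)  99.059 ms  94.502 ms  94.015 ms
"""

print(summarize(SAMPLE, "216.58.194.100"))
```

A silent hop in the middle of a trace that still reaches the target points to the first possibility. A trace that ends in nothing but stars and never shows the target is closer to the second, and to what I saw at the PROD site.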

The takeaway

Traceroute can be an invaluable tool when it comes to troubleshooting network issues. It really helps to visualize where the issue is actually occurring. Of course, there are other operations going on behind the scenes of traceroute that were not covered here.

If you want an even more in-depth look at this tool, I highly suggest that you do some research online. There is a lot of info on Time-to-Live (TTL) and RTT that, in the interest of time, was not included in this article. My goal is that you now have a better understanding of when and why you should use the traceroute tool, and how to interpret the data that it offers. For more information on networking troubleshooting and concepts, check out our related articles.


About the author

Tyler is the Sr. Community Manager at Enable Sysadmin, a submarine veteran, and an all-round tech enthusiast! He was first introduced to Red Hat in 2012 by way of a Red Hat Enterprise Linux-based combat system inside the USS Georgia Missile Control Center. Now that he has surfaced, he lives with his wife and son near Raleigh, where he worked as a data storage engineer before finding his way to the Red Hat team. He has written numerous technical documents, from military procedures to knowledgebase articles and even some training curricula. In his free time, he blends a passion for hiking, climbing, and bushcraft with video games and computer building. He loves to read and enjoys a scotch or bourbon. Find him on Twitter or on LinkedIn.
