12 Replies. Latest reply: Mar 28, 2017 11:08 PM by Sergey

    IP SLA constant state change

    Trent

      I have configured a topology similar to the one shown below. However, I'm observing a constant series of track state changes and can't figure out why. I have icmp-echo tracking applied and have run a packet capture. I see the ICMP echo requests being sent, but no replies return over the link. I configured the source-interface option with the very interface on which the packet capture shows no ICMP replies arriving.

       

      topology.jpg

       

       

       

      *Mar 27 20:43:02.689: %TRACK-6-STATE: 1 ip sla 1 state Down -> Up

      *Mar 27 20:43:12.691: %TRACK-6-STATE: 1 ip sla 1 state Up -> Down

      *Mar 27 20:43:17.693: %TRACK-6-STATE: 1 ip sla 1 state Down -> Up

      *Mar 27 20:43:27.693: %TRACK-6-STATE: 1 ip sla 1 state Up -> Down

      *Mar 27 20:43:32.697: %TRACK-6-STATE: 1 ip sla 1 state Down -> Up

      *Mar 27 20:43:42.697: %TRACK-6-STATE: 1 ip sla 1 state Up -> Down

       

      PCAP on R1 - ISP link:

      pcap.jpg

       

      R1:

      track 1 ip sla 1

      ip route 0.0.0.0 0.0.0.0 200.1.1.2 track 1

      ip route 0.0.0.0 0.0.0.0 201.1.1.2 50

      !

      ip sla 1

      icmp-echo 10.1.1.100 source-interface GigabitEthernet0/0

      frequency 5

      ip sla schedule 1 life forever start-time now

        • 1. Re: IP SLA constant state change
          Anthony

          These kinds of things are tricky. You don't really say where 10.1.1.100 lives, but here's what I know. If the purpose of the IP SLA is to test a particular link and you want to pin that track to a route, then you absolutely need to make sure that the IP SLA traffic can only traverse the link you want it to test, or you can run into goofy problems. In other words, don't let the IP SLA take the long way around (through the other provider link) to get to the target IP address, or you're defeating the purpose of the whole setup and should expect unpredictable behavior from the configuration.

          • 2. Re: IP SLA constant state change
            Sergey

            Trent,

             

            Your flapping track object state can be caused by this series of events:

             

            1. Reachability between R1 and the ISP1 router is broken, but the interfaces stay in the UP state.

            2. The probe to 10.1.1.100 gets no reply, the track object goes down, and the default route via 200.1.1.2 is removed.

            3. The default route via 201.1.1.2 is installed in the routing table.

            4. The address 10.1.1.100 becomes reachable via 201.1.1.2, so the probe succeeds and the track object comes back up.

            5. The default route via 200.1.1.2 is reinstated.

            6. Go to 1

            • 3. Re: IP SLA constant state change
              Naps

              Where is the IP address 10.1.1.100 configured in the topology?

               

              This can also happen if 10.1.1.100 is not assigned anywhere. If no interface has this IP address, we can't expect a reply from the other end.

              • 4. Re: IP SLA constant state change
                Sergey

                Sandeep, I think Trent doesn't see the reply because he is monitoring only one link, and that link is broken. The ICMP reply might come back via the other link, which is why he sees only the requests.

                • 5. Re: IP SLA constant state change
                  Naps

                  Hi Sergey,

                   

                  I agree with your point. But then how is the track object changing its state so frequently?

                   

                  As I understand it, if the link is flapping intermittently, then while one interface is failed the reply should come in on the other interface, and once the link recovers the reply should come in on the interface being captured.

                   

                  Correct me if I am wrong.

                  • 6. Re: IP SLA constant state change
                    Sergey

                    Sandeep, I think the link isn't flapping; it is broken. What is flapping is the tracking object, and it is flapping because of the process I described in my previous post. So, here is how it goes:

                    The ping packets are sent over the broken link and because it is broken, there is no reply.

                    The tracking object goes down.

                    As the tracking object is down, another default route is used.

                    The ping packets are sent over the working link and the reply is received.

                    The tracking object is happy again and the default route via broken link is reinstated.

                    The process starts all over again.

                    So, as you can see, the broken link stays broken the whole time while the tracking object flaps. I think Trent's main mistake is assuming that giving a source interface to the ping used for IP SLA guarantees that the ping will use that interface as egress. It does not. All it does is make the ping use that interface's IP address as the source address of the packets, and the router will happily send those packets out any valid egress interface that reaches the destination. The correct solution is to ping the other side of the link and make sure that this IP address can only be reached over that interface. For example:

                     

                    config t

                    ip route 200.1.1.2 255.255.255.255 GI0/0

                    ip sla 1

                    icmp-echo 200.1.1.2
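
                    A slightly fuller sketch of the same change, assuming probe 1 is already scheduled as in Trent's config. IOS does not let you edit a probe that has been scheduled, so the sketch deletes and recreates it. The timers are simply copied from the original post, and the "!" lines are comments:

                    config t
                    ! host route pins the probe target to the Gi0/0 link
                    ip route 200.1.1.2 255.255.255.255 GigabitEthernet0/0
                    ! a scheduled probe cannot be edited in place, so remove and recreate it
                    no ip sla 1
                    ip sla 1
                     icmp-echo 200.1.1.2 source-interface GigabitEthernet0/0
                     frequency 5
                    ip sla schedule 1 life forever start-time now
                    end

                    The existing "track 1 ip sla 1" line can stay as it is. The track object may drop briefly while the probe is recreated, so it is worth checking it afterwards with "show track 1".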

                    • 7. Re: IP SLA constant state change
                      Trent

                      Sergey, et al.:

                       

                      I agree with what you are saying. However, I was under the assumption that adding source-interface GigabitEthernet0/0 to the IP SLA object would make it base the state decision only on icmp-echo replies that traverse the broken link, which in my case is GigabitEthernet0/0. I intentionally broke the link from ISP1 to VLAN 10.1.1.0, the destination for the track. This VLAN is intentionally reachable via both interfaces of R1. However, with my configuration the IP SLA track decision should only consider source-interface GigabitEthernet0/0, correct? Therefore, the route should remain removed and the track state should stay down, correct?

                       

                      The training video I watched stated that, to avoid route flapping caused by IP addresses that are reachable via both your tracked route and your secondary default route, you must configure the source-interface command, which I have done above. Perhaps I can run a debug command that will tell me whether the route addition decision is based on an ICMP reply being received on GigabitEthernet0/1, and then determine from there why that is the case when I configured source-interface on GigabitEthernet0/0.

                      • 8. Re: IP SLA constant state change
                        Anthony

                        As long as Gi0/0 is up, the IP SLA can source from that interface and still send the traffic the other way. You need to figure out how to prevent that from happening.

                        • 9. Re: IP SLA constant state change
                          Sergey

                          Trent,

                           

                          It works slightly differently. The source-interface is only that - an interface from which the SLA probes are sourced. You can confirm it by running this command from your R1 router:

                          ping 10.1.1.100 source Gi0/0

                          and you'll see that ping packets happily make it across using Gi0/1, but with the source IP address of Gi0/0.
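
                          If it helps, a few show commands can also indicate which way R1 would forward the probe and what the probe and track object currently report. This is only a suggested checklist; the output is left out because it depends on the platform and IOS version:

                          show ip route 10.1.1.100
                          show ip cef 10.1.1.100
                          show ip sla statistics 1
                          show track 1

                          The first two show the next hop and egress interface chosen for 10.1.1.100, which is what decides where the probe leaves the router, regardless of the source-interface setting.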

                          • 10. Re: IP SLA constant state change
                            Anthony

                            Trent wrote:

                             

                            The training video I watched stated that, to avoid route flapping caused by IP addresses that are reachable via both your tracked route and your secondary default route, you must configure the source-interface command, which I have done above. Perhaps I can run a debug command that will tell me whether the route addition decision is based on an ICMP reply being received on GigabitEthernet0/1, and then determine from there why that is the case when I configured source-interface on GigabitEthernet0/0.

                            The part you're missing here is that your interface isn't going down. Even if you shut down the interface on the ISP side, it will probably stay up on the CE side unless you have keepalives configured. You need to account for all possibilities. The most straightforward approach, given your topology, is to place an inbound ACL on the ISP2-facing interface to block ICMP echo replies from 10.1.1.100 to 12.0.0.1 (or whatever IP is ultimately the source of the IP SLA traffic). You could also experiment with CoPP (class maps can match the input interface) to drop the traffic if it comes back over the wrong interface.
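
                            A rough sketch of that ACL, under a few assumptions: the ACL name SLA-REPLY-FILTER is made up for illustration, GigabitEthernet0/1 is taken to be the ISP2-facing interface, and 12.0.0.1 stands in for whatever address the probe is really sourced from (the IP address of Gi0/0):

                            ip access-list extended SLA-REPLY-FILTER
                             ! drop probe replies that come back over the wrong link
                             deny icmp host 10.1.1.100 host 12.0.0.1 echo-reply
                             ! leave all other traffic alone
                             permit ip any any
                            !
                            interface GigabitEthernet0/1
                             ip access-group SLA-REPLY-FILTER in

                            With something like this in place the probe should time out whenever its reply returns via ISP2, so the track object stays down for as long as the ISP1 path is broken. Keep in mind that it also drops ordinary ping replies from 10.1.1.100 to that address when they arrive on this interface.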

                            • 11. Re: IP SLA constant state change
                              Naps

                              Hi Sergey,

                               

                              But sending the ping to the next-hop IP can sometimes lead to a network connectivity issue.

                               

                              For example, suppose you are trying to reach a server or site beyond your ISP connection, and you are tracking your next-hop IP, which in our case is 200.1.1.2. As long as this link is up and working, the track object will not go down, even if there is a problem somewhere between the ISP and the server. In that case users will not be able to reach the server, but because the track object is up, all packets will still go to ISP1 and there will be no failover.

                               

                              As I understand it, tracking an IP address that lies beyond our ISP is the better practice.

                              • 12. Re: IP SLA constant state change
                                Sergey

                                Sandeep,

                                 

                                You are right, but as you can see from Trent's example, it can lead to other complications. What I've usually seen done in production to test an ISP's connectivity to the rest of the world is this: an unused host on the internet is chosen, for example 23.235.37.144. A static route to that host IP is configured via one link only, which ensures that the probes are sent only via that link. It also reveals the case where connectivity to the SP itself is up and healthy but the SP's transit links are down.
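
                                As a rough sketch only, reusing the addresses and interface names from Trent's config, that approach could look something like the lines below. The Null0 route and its distance of 250 are my own assumption, not something from the production setups mentioned above; the idea is that if the host route via 200.1.1.2 is ever withdrawn (for example because Gi0/0 goes down), the probes are discarded instead of following the backup default route:

                                ! pin the probe target to the ISP1 link
                                ip route 23.235.37.144 255.255.255.255 200.1.1.2
                                ! assumed extra: floating discard route for the probe target
                                ip route 23.235.37.144 255.255.255.255 Null0 250
                                !
                                ip sla 1
                                 icmp-echo 23.235.37.144 source-interface GigabitEthernet0/0
                                 frequency 5
                                ip sla schedule 1 life forever start-time now
                                !
                                track 1 ip sla 1
                                ip route 0.0.0.0 0.0.0.0 200.1.1.2 track 1
                                ip route 0.0.0.0 0.0.0.0 201.1.1.2 50

                                The side effect is that 23.235.37.144 can never be reached through ISP2, which is why a host that nobody needs to reach is chosen as the target.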