10 Replies Latest reply: Feb 21, 2012 2:07 AM by Terry

    BGP strange issue

    Calin Chiorean

      Hello all!

       

      Even though this is more related to troubleshooting than to CCIE RS, I believe it can be good practice for the upcoming v4.0 exam, which will include such a section.

       

      I have 2 routers (PE <-> CE) connected directly with a crossover cable. On the PE router I have 2 VRFs (let's call them DODO and BOBO), and on the physical interface I have 2 subinterfaces, one for each VRF. Connectivity is OK (e.g. ping is working).

       

      Now, under the BGP configuration, on each VRF I have a peering between PE and CE:

       

      #### PE ####

      address-family ipv4 vrf BOBO

      neighbor 1.1.1.1 remote-as 65300
      neighbor 1.1.1.1 description CE
      neighbor 1.1.1.1 password 7 123A2C243124
      neighbor 1.1.1.1 version 4
      neighbor 1.1.1.1 timers 5 15
      neighbor 1.1.1.1 activate
      neighbor 1.1.1.1 send-community both
      neighbor 1.1.1.1 as-override

       

      #### CE ####

      address-family ipv4 vrf BOBO

      neighbor 1.1.1.2 remote-as 65000
      neighbor 1.1.1.2 description PE
      neighbor 1.1.1.2 password 7 123A2C243124
      neighbor 1.1.1.2 version 4
      neighbor 1.1.1.2 timers 5 15
      neighbor 1.1.1.2 activate
      neighbor 1.1.1.2 send-community both

       

      Apart from the obvious differences (IP subnet, password), BGP is configured identically for the BOBO and DODO VRFs. DODO is working, but in the BOBO VRF the BGP peering keeps flapping. Some logs are below:

       

      #### LOGS #####

      May 28 07:21:34 UTC: %BGP-5-ADJCHANGE: neighbor 1.1.1.1 Up
      May 28 07:21:50 UTC: %BGP-5-ADJCHANGE: neighbor 1.1.1.1 Down BGP Notification sent
      May 28 07:21:50 UTC: %BGP-3-NOTIFICATION: sent to neighbor 1.1.1.1 4/0 (hold time expired) 0 bytes
      May 28 07:22:36 UTC: %BGP-5-ADJCHANGE: neighbor 1.1.1.1 Up
      May 28 07:22:51 UTC: %BGP-5-ADJCHANGE: neighbor 1.1.1.1 Down BGP Notification sent
      May 28 07:22:51 UTC: %BGP-3-NOTIFICATION: sent to neighbor 1.1.1.1 4/0 (hold time expired) 0
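A quick way to read the log above is to measure how long each session stays up. The following is only a minimal Python sketch: the function name `session_lifetimes` and the placeholder year are assumptions made for illustration, and the input is just the four ADJCHANGE lines copied from the log. With `timers 5 15` the hold time is 15 seconds, so a lifetime of roughly 15 seconds on every flap is consistent with the session reaching Established and then receiving nothing further from the peer.

```python
import re
from datetime import datetime

# ADJCHANGE lines copied from the log above.
LOG = """\
May 28 07:21:34 UTC: %BGP-5-ADJCHANGE: neighbor 1.1.1.1 Up
May 28 07:21:50 UTC: %BGP-5-ADJCHANGE: neighbor 1.1.1.1 Down BGP Notification sent
May 28 07:22:36 UTC: %BGP-5-ADJCHANGE: neighbor 1.1.1.1 Up
May 28 07:22:51 UTC: %BGP-5-ADJCHANGE: neighbor 1.1.1.1 Down BGP Notification sent
"""

ADJ = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) UTC: %BGP-5-ADJCHANGE: neighbor (\S+) (Up|Down)"
)


def session_lifetimes(log_text):
    """Return (neighbor, seconds) for each Up followed by a Down."""
    up_at = {}
    lifetimes = []
    for line in log_text.splitlines():
        match = ADJ.match(line.strip())
        if not match:
            continue
        stamp, neighbor, state = match.groups()
        # The log carries no year; 2000 is an arbitrary placeholder (assumption).
        when = datetime.strptime("2000 " + stamp, "%Y %b %d %H:%M:%S")
        if state == "Up":
            up_at[neighbor] = when
        elif neighbor in up_at:
            delta = when - up_at.pop(neighbor)
            lifetimes.append((neighbor, int(delta.total_seconds())))
    return lifetimes


print(session_lifetimes(LOG))
```

Both flaps last 15 to 16 seconds, which matches the "hold time expired" notification rather than a failure during session setup.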

       

      #### DEBUG IP BGP ... #####

       

       

      May 28 07:34:10 UTC: BGP: 1.1.1.1 open active, local address 1.1.1.2
      May 28 07:34:10 UTC: BGP: 1.1.1.1 went from Active to OpenSent
      May 28 07:34:10 UTC: BGP: 1.1.1.1 sending OPEN, version 4, my as: 65300, holdtime 15 seconds
      May 28 07:34:10 UTC: BGP: 1.1.1.1 send message type 1, length (incl. header) 45
      May 28 07:34:10 UTC: BGP: 1.1.1.1 rcv message type 1, length (excl. header) 26
      May 28 07:34:10 UTC: BGP: 1.1.1.1 rcv OPEN, version 4, holdtime 15 seconds
      May 28 07:34:10 UTC: BGP: 1.1.1.1 rcv OPEN w/ OPTION parameter len: 16
      May 28 07:34:10 UTC: BGP: 1.1.1.1 rcvd OPEN w/ optional parameter type 2 (Capability) len 6
      May 28 07:34:10 UTC: BGP: 1.1.1.1 OPEN has CAPABILITY code: 1, length 4
      May 28 07:34:10 UTC: BGP: 1.1.1.1 OPEN has MP_EXT CAP for afi/safi: 1/1
      May 28 07:34:10 UTC: BGP: 1.1.1.1 rcvd OPEN w/ optional parameter type 2 (Capability) len 2
      May 28 07:34:10 UTC: BGP: 1.1.1.1 OPEN has CAPABILITY code: 128, length 0
      May 28 07:34:10 UTC: BGP: 1.1.1.1 OPEN has ROUTE-REFRESH capability(old) for all address-families
      May 28 07:34:10 UTC: BGP: 1.1.1.1 rcvd OPEN w/ optional parameter type 2 (Capability) len 2
      May 28 07:34:10 UTC: BGP: 1.1.1.1 OPEN has CAPABILITY code: 2, length 0
      May 28 07:34:10 UTC: BGP: 1.1.1.1 OPEN has ROUTE-REFRESH capability(new) for all address-families
      BGP: 1.1.1.1 rcvd OPEN w/ remote AS 65000
      May 28 07:34:10 UTC: BGP: 1.1.1.1 went from OpenSent to OpenConfirm
      May 28 07:34:10 UTC: BGP: 1.1.1.1 went from OpenConfirm to Established
      CE #undebug all
      May 28 07:34:10 UTC: %BGP-5-ADJCHANGE: neighbor 1.1.1.1 Up
      CE #undebug all
      May 28 07:34:26 UTC: BGP: 1.1.1.1 connection timed out 15816ms (last update) 15000ms (hold time)
      May 28 07:34:26 UTC: BGP: 1.1.1.1 went from Established to Closing
      May 28 07:34:26 UTC: %BGP-5-ADJCHANGE: neighbor 1.1.1.1 Down BGP Notification sent
      CE #undebug all
      May 28 07:34:26 UTC: %BGP-3-NOTIFICATION: sent to neighbor 1.1.1.1 4/0 (hold time expired) 0 bytes
      May 28 07:34:27 UTC: BGP: 1.1.1.1 connection timed out - has not accepted a message from us for 15000ms (hold time), 1 messages pending transmition
      CE #undebug all
      May 28 07:34:28 UTC: BGP: 1.1.1.1 connection timed out - has not accepted a message from us for 15000ms (hold time), 1 messages pending transmition
      May 28 07:34:28 UTC: BGP: 1.1.1.1 connection timed out - has not accepted a message from us for 15000ms (hold time), 1 messages pending transmition
      May 28 07:34:28 UTC: BGP: 1.1.1.1 connection timed out - has not accepted a message from us for 15000ms (hold time), 1 messages pending transmition
      May 28 07:34:28 UTC: BGP: 1.1.1.1 connection timed out - has not accepted a message from us for 15000ms (hold time), 1 messages pending transmition

       

       

      What I have tried:

      -reconfigure BGP for VRF BOBO from zero

      -change the p2p subnet

      -change BGP parameters (e.g. timers)

      -reload both devices

      -change the subinterface configuration (e.g. different dot1q vlan ID)

      -hard / soft peering reset

       

      If you can help me with additional troubleshooting tips, I will be grateful. One more piece of information: all these issues started after a power outage, when both devices reloaded... but if this is hardware related, then it's a strange problem.

       

      Thanks and cheers,

      Calin

        • 1. Re: BGP strange issue
          Scott Morris - CCDE/4xCCIE/2xJNCIE

          FYI, CEs don't typically run VRFs.

           

          Neither side appears to have any routes to send. 

           

          I'd try removing your address family from the CE side to see if that simplifies your message output there (it should work, but rule out oddities), and try adding some networks or redistributing something.

           

          A lot of the time this comes down to a recursive routing problem.  Try turning on "debug ip routing" and see if anything strange shows up there.

           

          HTH,

           

          Scott

          • 2. Re: BGP strange issue
            Conwyn

            Hi Scott

             

            Here is a version with Dynamips. It looks OK.

            I suspect they had something in the running config and lost it when they rebooted.

             

            Regards Conwyn

             

             

            version 12.4
            service timestamps debug datetime msec
            service timestamps log datetime msec
            no service password-encryption
            !
            hostname PE
            !
            boot-start-marker
            boot-end-marker
            !
            !
            no aaa new-model
            memory-size iomem 5
            ip cef
            ip vrf AAAA
            rd 11:11
            !
            ip vrf BBBB
            rd 12:12
            !
            ip vrf forwarding
            !
            ip auth-proxy max-nodata-conns 3
            ip admission max-nodata-conns 3

            !
            interface Loopback0
            ip vrf forwarding AAAA
            ip address 10.0.11.21 255.255.255.255
            !
            interface Loopback11
            no ip address
            !
            interface Loopback12
            ip vrf forwarding BBBB
            ip address 10.0.12.24 255.255.255.255
            !
            interface FastEthernet0/0
            no ip address
            duplex auto
            speed auto
            !
            interface FastEthernet0/0.11
            encapsulation dot1Q 11
            ip vrf forwarding AAAA
            ip address 10.0.11.2 255.255.255.252
            !
            interface FastEthernet0/0.12
            encapsulation dot1Q 12
            ip vrf forwarding BBBB
            ip address 10.0.12.2 255.255.255.252
            !
            interface FastEthernet0/1
            no ip address
            shutdown
            duplex auto
            speed auto
            !
            router bgp 1
            no synchronization
            bgp log-neighbor-changes
            no auto-summary
            !
            address-family ipv4 vrf BBBB
              redistribute connected
              neighbor 10.0.12.1 remote-as 1
              neighbor 10.0.12.1 activate
              neighbor 10.0.12.1 send-community both
              no synchronization
            exit-address-family
            !
            address-family ipv4 vrf AAAA
              redistribute connected
              neighbor 10.0.11.1 remote-as 1
              neighbor 10.0.11.1 activate
              no synchronization
              network 0.0.0.0
            exit-address-family
            !
            ip forward-protocol nd
            !
            !
            ip http server
            no ip http secure-server
            !
            control-plane
            !
            line con 0
            line aux 0
            line vty 0 4
            login
            !
            !
            end

            PE#show ip bgp all summary
            For address family: VPNv4 Unicast
            BGP router identifier 10.0.12.2, local AS number 1
            BGP table version is 12, main routing table version 12
            5 network entries using 685 bytes of memory
            6 path entries using 408 bytes of memory
            3/2 BGP path/bestpath attribute entries using 372 bytes of memory
            0 BGP route-map cache entries using 0 bytes of memory
            0 BGP filter-list cache entries using 0 bytes of memory
            BGP using 1465 total bytes of memory
            BGP activity 5/0 prefixes, 6/0 paths, scan interval 15 secs

            Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
            10.0.11.1       4     1      14      15       12    0    0 00:07:39        0
            10.0.12.1       4     1       6       7       12    0    0 00:01:52        2
            PE#show ip route vrf AAAA

            Routing Table: AAAA
            Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
                   D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
                   N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
                   E1 - OSPF external type 1, E2 - OSPF external type 2
                   i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
                   ia - IS-IS inter area, * - candidate default, U - per-user static route
                   o - ODR, P - periodic downloaded static route

            Gateway of last resort is not set

                 10.0.0.0/8 is variably subnetted, 2 subnets, 2 masks
            C       10.0.11.0/30 is directly connected, FastEthernet0/0.11
            C       10.0.11.21/32 is directly connected, Loopback0
            PE#show ip route vrf BBBB

            Routing Table: BBBB
            Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
                   D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
                   N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
                   E1 - OSPF external type 1, E2 - OSPF external type 2
                   i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
                   ia - IS-IS inter area, * - candidate default, U - per-user static route
                   o - ODR, P - periodic downloaded static route

            Gateway of last resort is not set

                 10.0.0.0/8 is variably subnetted, 3 subnets, 2 masks
            C       10.0.12.0/30 is directly connected, FastEthernet0/0.12
            B       10.0.12.12/32 [200/0] via 10.0.12.1, 00:06:03
            C       10.0.12.24/32 is directly connected, Loopback12

            *********************************************************************************
            *********************************************************************************
            !
            version 12.4
            service timestamps debug datetime msec
            service timestamps log datetime msec
            no service password-encryption
            !
            hostname CE
            !
            boot-start-marker
            boot-end-marker
            !
            !
            no aaa new-model
            memory-size iomem 5
            ip cef
            !
            !
            !
            !
            ip vrf AAAA
            rd 11:11
            !
            ip vrf BBBB
            rd 12:12
            !
            ip auth-proxy max-nodata-conns 3
            ip admission max-nodata-conns 3

            !
            interface Loopback11
            ip vrf forwarding AAAA
            ip address 10.0.11.11 255.255.255.255
            !
            interface Loopback12
            ip vrf forwarding BBBB
            ip address 10.0.12.12 255.255.255.255
            !
            interface FastEthernet0/0
            no ip address
            duplex auto
            speed auto
            !
            interface FastEthernet0/0.1
            ip vrf forwarding AAAA
            !
            interface FastEthernet0/0.11
            encapsulation dot1Q 11
            ip vrf forwarding AAAA
            ip address 10.0.11.1 255.255.255.252
            !
            interface FastEthernet0/0.12
            encapsulation dot1Q 12
            ip vrf forwarding BBBB
            ip address 10.0.12.1 255.255.255.252
            !
            interface FastEthernet0/1
            no ip address
            shutdown
            duplex auto
            speed auto
            !
            router bgp 1
            no synchronization
            bgp log-neighbor-changes
            no auto-summary
            !
            address-family ipv4 vrf BBBB
              redistribute connected
              neighbor 10.0.12.2 remote-as 1
              neighbor 10.0.12.2 activate
              neighbor 10.0.12.2 send-community both
              no synchronization
            exit-address-family
            !
            address-family ipv4 vrf AAAA
              neighbor 10.0.11.2 remote-as 1
              neighbor 10.0.11.2 activate
              no synchronization
            exit-address-family
            !
            ip forward-protocol nd
            !
            !
            ip http server
            no ip http secure-server
            !
            control-plane

            line con 0
            line aux 0
            line vty 0 4
            login
            !
            !
            end

            CE#show ip bgp all summ
            For address family: VPNv4 Unicast
            BGP router identifier 10.0.12.1, local AS number 1
            BGP table version is 9, main routing table version 9
            5 network entries using 685 bytes of memory
            6 path entries using 408 bytes of memory
            3/2 BGP path/bestpath attribute entries using 372 bytes of memory
            0 BGP route-map cache entries using 0 bytes of memory
            0 BGP filter-list cache entries using 0 bytes of memory
            BGP using 1465 total bytes of memory
            BGP activity 5/0 prefixes, 6/0 paths, scan interval 15 secs

            Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
            10.0.11.2       4     1      15      14        9    0    0 00:10:09        2
            10.0.12.2       4     1      10       9        9    0    0 00:04:21        2
            CE#


            CE#show ip route vrf BBBB

            Routing Table: BBBB
            Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
                   D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
                   N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
                   E1 - OSPF external type 1, E2 - OSPF external type 2
                   i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
                   ia - IS-IS inter area, * - candidate default, U - per-user static route
                   o - ODR, P - periodic downloaded static route

            Gateway of last resort is not set

                 10.0.0.0/8 is variably subnetted, 3 subnets, 2 masks
            C       10.0.12.0/30 is directly connected, FastEthernet0/0.12
            C       10.0.12.12/32 is directly connected, Loopback12
            B       10.0.12.24/32 [200/0] via 10.0.12.2, 00:07:09
            CE#

            • 3. Re: BGP strange issue
              Scott Morris - CCDE/4xCCIE/2xJNCIE

              In general it'll work, but I'd still be curious about the need for running VRFs on the CE side.  This is typically done in an Inter-AS environment when "stitching" two VRFs together between the SPs.

               

              Scott

              • 4. Re: BGP strange issue
                Conwyn

                Hi Scott

                 

                It is useful for separating Joint Venture Companies from the Main Corporate.

                 

                Regards Conwyn

                • 5. Re: BGP strange issue
                  Calin Chiorean

                  Hi Scott!

                   

                  Since I have this issue in the company's intranet and it's confidential stuff I cannot tell you why we are using VRFs on CE, but I can tell you it's working in more than 1 place and it was working here as well, until this power outage came into play.

                   

                  I will run debug ip routing as advised and hope to find something to help me crack this problem.

                   

                  Thank you all for replying to my problem.

                   

                  Cheers,

                  Calin

                   

                  P.S. I already tried removing the VRF from the CE and keeping everything in the "global" table, but the issue is still there.

                  • 6. Re: BGP strange issue
                    tnewshott

                    If this occurred after a power outage, I would open a TAC case and have them start looking at it.  You pay for that support and should leverage it!

                     

                    Have you tried removing the VRFs, powering down the box, powering it back up, and then reconfiguring them?

                    • 7. Re: BGP strange issue
                      Scott Morris - CCDE/4xCCIE/2xJNCIE

                      If it's something specific to a power outage, I'd be looking at damage to the DRAM on the router causing oddball things to happen.  I guess the damage could be anyplace, but DRAM seems the most logical place to look first.

                       

                      And I understand about internal needs.  I thought it was just labbing.  So as long as you understand why you're separating things, it's all good.  Should work perfectly fine, I just didn't want to add an extra layer of complexity if not needed!

                       

                      Power outages are no fun.  But on the other hand, things like this stress the need for good UPSs in your network design!

                       

                      Scott

                      • 8. Re: BGP strange issue
                        Conwyn

                        Hi Calin

                         

                        Check your configuration register (show ver) and make sure you are picking up the IOS image you are expecting from the correct compact flash.

                        When you run show ip bgp all sum and nei, what is the state: Idle or Active?

                         

                        Regards Conwyn

                        • 9. Re: BGP strange issue
                          suman

                          Hi there,

                           

                          How was this problem solved? I'm having exactly the same problem as you.

                           

                          Regards

                          Lek

                          • 10. Re: BGP strange issue
                            Terry

                            Hey

                            Just had this same issue and the problem was related to IP addressing.

                            Check the IP addressing on the interface where you have vrf forwarding configured and make sure it doesn't overlap with another interface. In theory IOS shouldn't have accepted the commands and should have warned you about the overlapping address space. In my case that was the cause, and using the correct mask solved the problem...

                            HTH
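
The overlap check described in this reply can also be done offline against a copy of the configuration. Below is a minimal Python sketch using the standard `ipaddress` module. The helper name `find_overlaps` is an assumption, and so is the interface list: the first three entries reuse addresses from the Dynamips example earlier in the thread, while `FastEthernet0/1.99` with its /27 mask is a hypothetical mistyped entry added purely to show a hit. Only interfaces in the same VRF are compared, on the assumption that separate VRFs are allowed to reuse address space.

```python
from ipaddress import ip_interface
from itertools import combinations

# (interface name, VRF, address, mask). The last entry is hypothetical.
INTERFACES = [
    ("FastEthernet0/0.11", "AAAA", "10.0.11.2", "255.255.255.252"),
    ("FastEthernet0/0.12", "BBBB", "10.0.12.2", "255.255.255.252"),
    ("Loopback12", "BBBB", "10.0.12.24", "255.255.255.255"),
    ("FastEthernet0/1.99", "BBBB", "10.0.12.30", "255.255.255.224"),
]


def find_overlaps(interfaces):
    """Return (vrf, name_a, name_b) for every overlapping pair within a VRF."""
    by_vrf = {}
    for name, vrf, address, mask in interfaces:
        # ip_interface accepts "address/netmask" and exposes the enclosing network.
        network = ip_interface(f"{address}/{mask}").network
        by_vrf.setdefault(vrf, []).append((name, network))

    hits = []
    for vrf, entries in by_vrf.items():
        for (name_a, net_a), (name_b, net_b) in combinations(entries, 2):
            if net_a.overlaps(net_b):
                hits.append((vrf, name_a, name_b))
    return hits


for vrf, name_a, name_b in find_overlaps(INTERFACES):
    print(f"VRF {vrf}: {name_a} overlaps {name_b}")
```

In this made-up data the /27 on the hypothetical subinterface covers 10.0.12.0 through 10.0.12.31, so it collides with both the /30 link and the /32 loopback in VRF BBBB, which is the kind of wrong-mask mistake described above.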