OSPF Troubleshooting on Cisco IOS-XE: Adjacency Failures, Flapping, and Fixes

Why OSPF Adjacency Matters (and Why It Breaks)

OSPF is the backbone routing protocol for most enterprise and service provider networks, but when adjacencies go down — or refuse to come up — troubleshooting it can feel like chasing ghosts. The protocol is unforgiving about mismatches: a single wrong parameter and two routers will sit in EXSTART forever, generating log noise and black-holing traffic.

This post walks through the most common OSPF adjacency failures on Cisco IOS-XE, with real show command output, root causes, and fixes. Whether you’re dealing with a flapping neighbor, a stuck EXSTART, or a mysterious DR election gone wrong, there’s a systematic path to the answer.

If you’re newer to how IOS-XE differs from classic IOS, check out our Cisco IOS vs IOS-XE vs IOS-XR breakdown before diving in.

OSPF Neighbor State Machine: A Quick Refresher

Before troubleshooting, you need to know what “normal” looks like. A fully adjacent OSPF neighbor goes through these states:

  1. Down — No hellos received
  2. Init — Hello received, but router doesn’t see itself in the neighbor’s hello
  3. 2-Way — Bidirectional communication confirmed; DR/BDR election happens here
  4. ExStart — Master/slave negotiation for DBD exchange
  5. Exchange — Database Description packets being exchanged
  6. Loading — LSR/LSU exchange to fill in missing LSAs
  7. Full — Adjacency complete; databases are synchronized

On point-to-point links, you want Full. On broadcast networks (Ethernet), non-DR/BDR routers will be 2-Way with each other but Full with the DR and BDR. Seeing anything else persistently is a problem.

The First Command: Always Start Here

R1# show ip ospf neighbor

Neighbor ID     Pri   State           Dead Time   Address         Interface
10.0.0.2          1   FULL/DR         00:00:37    192.168.1.2     GigabitEthernet0/0
10.0.0.3          1   EXSTART/  -     00:00:34    192.168.1.3     GigabitEthernet0/0
10.0.0.4          0   2WAY/DROTHER    00:00:38    192.168.1.4     GigabitEthernet0/0

That EXSTART state on 10.0.0.3 is your first flag. Something is blocking the DBD exchange. Read on.
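If you collect this output from many routers, the "anything other than Full or 2-Way/DROTHER" rule is easy to automate. The following is a minimal Python sketch, not a full parser: the function name and regex are illustrative, it only understands the column layout shown above, and it treats 2WAY/DROTHER as healthy on the assumption that the local router is itself a DROTHER on that segment.

```python
import re

# Matches one data row of `show ip ospf neighbor` in the layout shown above.
ROW = re.compile(
    r"^(?P<rid>\d+\.\d+\.\d+\.\d+)\s+(?P<pri>\d+)\s+"
    r"(?P<state>[A-Z0-9-]+)/\s*(?P<role>\S+)\s+"
    r"(?P<dead>\d+:\d+:\d+)\s+(?P<addr>\S+)\s+(?P<intf>\S+)\s*$"
)

def suspicious_neighbors(output):
    """Return (router_id, state, role) for neighbors in an unexpected state."""
    flagged = []
    for line in output.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue  # header, blank line, or prompt
        state, role = m["state"], m["role"]
        # Assumption: 2WAY is only normal toward another DROTHER.
        healthy = state == "FULL" or (state == "2WAY" and role == "DROTHER")
        if not healthy:
            flagged.append((m["rid"], state, role))
    return flagged

SAMPLE = """\
Neighbor ID     Pri   State           Dead Time   Address         Interface
10.0.0.2          1   FULL/DR         00:00:37    192.168.1.2     GigabitEthernet0/0
10.0.0.3          1   EXSTART/  -     00:00:34    192.168.1.3     GigabitEthernet0/0
10.0.0.4          0   2WAY/DROTHER    00:00:38    192.168.1.4     GigabitEthernet0/0
"""

print(suspicious_neighbors(SAMPLE))
```

Run against the sample table, only 10.0.0.3 is flagged. A neighbor can pass through EXSTART briefly during normal formation, so in practice you would poll twice and flag only states that persist.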

Issue 1: Stuck in EXSTART — MTU Mismatch

This is the single most common OSPF adjacency failure in real networks. EXSTART/EXCHANGE stalls are almost always an MTU mismatch. OSPF uses the interface MTU in DBD packets, and if both sides don’t agree, the exchange never completes.

Symptoms

R1# show ip ospf neighbor detail | begin 10.0.0.3
 Neighbor 10.0.0.3, interface address 192.168.1.3
    In the area 0.0.0.0 via interface GigabitEthernet0/1
    Neighbor priority is 1, State is EXSTART, 6 transitions
    Dead timer due in 00:00:33
    Neighbor is up for 00:12:47
    Number of DBD retrans during last exchange 8

“DBD retrans” climbing is the giveaway. The router with the smaller MTU discards every DBD packet that advertises a larger one, so master/slave negotiation never completes and both sides keep retransmitting.

Confirm It

R1# show interface GigabitEthernet0/1 | include MTU
  MTU 1500 bytes, BW 1000000 Kbit/sec

R2# show interface GigabitEthernet0/0 | include MTU
  MTU 9000 bytes, BW 1000000 Kbit/sec

There it is — 1500 vs 9000. Jumbo frames on one side, standard on the other.
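To make the asymmetry concrete, here is a small Python sketch of the check as RFC 2328 describes it: a router rejects a DBD whose Interface MTU field is larger than what it can receive on that interface. The function names are illustrative, and the parsing assumes the exact "MTU N bytes" wording shown in the output above.

```python
import re

def parse_mtu(show_interface_line):
    """Extract the MTU in bytes from a `show interface ... | include MTU` line."""
    m = re.search(r"MTU (\d+) bytes", show_interface_line)
    if m is None:
        raise ValueError("no MTU found in: %r" % show_interface_line)
    return int(m.group(1))

def accepts_dbd(local_mtu, advertised_mtu):
    """RFC 2328 rule: reject a DBD advertising more than we can receive."""
    return advertised_mtu <= local_mtu

r1 = parse_mtu("  MTU 1500 bytes, BW 1000000 Kbit/sec")
r2 = parse_mtu("  MTU 9000 bytes, BW 1000000 Kbit/sec")

print("R1 accepts R2's DBD:", accepts_dbd(r1, r2))
print("R2 accepts R1's DBD:", accepts_dbd(r2, r1))
```

Only one direction fails (R1 refuses R2's DBD), which is why the neighbor gets as far as EXSTART instead of disappearing entirely.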

Fix Option 1: Correct the MTU (Preferred)

R2(config)# interface GigabitEthernet0/0
R2(config-if)# ip mtu 1500

Note: ip mtu sets the Layer 3 (IP) MTU, which is the value OSPF advertises in DBD packets, without changing the interface's physical MTU. Use this when you can’t change the underlying interface MTU (common with tunnel interfaces or when jumbo frames are needed for other traffic).

Fix Option 2: Ignore MTU (Use Carefully)

R1(config)# interface GigabitEthernet0/1
R1(config-if)# ip ospf mtu-ignore

This tells OSPF to stop checking MTU during DBD exchange. It works, but you’re masking the mismatch — OSPF will come up but you may still drop packets larger than the smaller MTU. Use only when the MTU difference is intentional and you understand the fragmentation implications.

Issue 2: Hello/Dead Timer Mismatch

OSPF neighbors must agree on hello and dead intervals. By default, IOS-XE uses 10s hello / 40s dead on broadcast and point-to-point interfaces, and 30s/120s on NBMA and point-to-multipoint interfaces. If someone changed one side, the mismatched hellos are discarded and the neighbors never even reach Init, let alone 2-Way.

Diagnose

R1# show ip ospf interface GigabitEthernet0/0

GigabitEthernet0/0 is up, line protocol is up
  Internet Address 192.168.1.1/24, Area 0, Attached via Network Statement
  Process ID 1, Router ID 10.0.0.1, Network Type BROADCAST, Cost: 1
  Transmit Delay is 1 sec, State DR, Priority 1
  Designated Router (ID) 10.0.0.1, Interface address 192.168.1.1
  Timer intervals configured, Hello 10, Dead 40, Wait 40, Retransmit 5
  oob-resync timeout 40
  Hello due in 00:00:04

R2# show ip ospf interface GigabitEthernet0/0
  Timer intervals configured, Hello 5, Dead 20, Wait 20, Retransmit 5

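R1 is sending hellos every 10 seconds; R2 is sending every 5. The hello and dead intervals carried in each hello must match the receiving interface's own values, so each router discards the other's hellos on arrival and neither ever lists the other as a neighbor.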
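Comparing the two "Timer intervals configured" lines by eye works for one link; for a fleet, a short script helps. This is a rough Python sketch under the assumption that the line format matches the output above. The helper names are made up for illustration.

```python
import re

TIMERS = re.compile(r"Hello (\d+), Dead (\d+)")

def hello_dead(line):
    """Return (hello, dead) in seconds from a 'Timer intervals configured' line."""
    m = TIMERS.search(line)
    if m is None:
        raise ValueError("no timers found in: %r" % line)
    return int(m.group(1)), int(m.group(2))

def timer_mismatches(local_line, remote_line):
    """List the timers that differ between the two ends of a link."""
    local, remote = hello_dead(local_line), hello_dead(remote_line)
    problems = []
    for name, a, b in zip(("hello", "dead"), local, remote):
        if a != b:
            problems.append(f"{name}: local {a}s vs remote {b}s")
    return problems

R1_LINE = "  Timer intervals configured, Hello 10, Dead 40, Wait 40, Retransmit 5"
R2_LINE = "  Timer intervals configured, Hello 5, Dead 20, Wait 20, Retransmit 5"

for problem in timer_mismatches(R1_LINE, R2_LINE):
    print(problem)
```

An empty list means the hello and dead intervals agree; anything else is enough on its own to keep the neighbor from appearing.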

Fix

R2(config)# interface GigabitEthernet0/0
R2(config-if)# ip ospf hello-interval 10
R2(config-if)# ip ospf dead-interval 40

Or standardize both to the faster timers for quicker convergence:

! On both routers:
R1(config-if)# ip ospf hello-interval 5
R1(config-if)# ip ospf dead-interval 20

Issue 3: Area Type Mismatch

Mixing regular areas with stub or NSSA areas breaks adjacency. A router configured as stub and its neighbor configured as normal will discard each other's hellos and never form an adjacency.

Diagnose

R1# show ip ospf | include Area
    Area BACKBONE(0)
    Area 10
        Number of interfaces in this area is 1
        It is a stub area

R2# show ip ospf | include Area
    Area BACKBONE(0)
    Area 10
        Number of interfaces in this area is 1
        It is a normal area

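Area 10 is stub on R1, normal on R2. Each hello carries an Options field whose E-bit (external routing capability) is set in normal areas and clear in stub and NSSA areas, and whose N-bit marks an NSSA. A hello whose bits disagree with the receiving interface's area type is dropped.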
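The bit logic is small enough to sketch in Python. The bit positions (E = 0x02, N = 0x08) come from RFC 2328 and RFC 3101; the function names and the sample option values are illustrative only, and real hellos usually carry additional bits that this check deliberately masks out.

```python
E_BIT = 0x02  # external routing capability: set in normal areas, clear in stub/NSSA
N_BIT = 0x08  # NSSA capability (RFC 3101): set only in NSSA hellos

def area_type(options):
    """Infer the area type a hello was sent for from its Options field."""
    e, n = bool(options & E_BIT), bool(options & N_BIT)
    if e and not n:
        return "normal"
    if n and not e:
        return "nssa"
    if not e and not n:
        return "stub"
    return "invalid"  # E and N must never both be set

def hello_accepted(local_options, remote_options):
    """A hello is only accepted if the E and N bits agree on both ends."""
    mask = E_BIT | N_BIT
    return (local_options & mask) == (remote_options & mask)

# Illustrative values: R1 (stub) sends 0x00, R2 (normal) sends 0x02.
print(area_type(0x00), area_type(0x02), hello_accepted(0x00, 0x02))
```

Because the comparison happens on every hello, an area type mismatch keeps the neighbor out of the table entirely instead of leaving it stuck in a later state.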

Fix

R2(config)# router ospf 1
R2(config-router)# area 10 stub

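Every router in the area must agree on the basic area type: normal, stub, or NSSA. The no-summary keyword (totally stubby) only matters on the ABR and doesn't affect adjacency formation, so an ABR configured with area 10 stub no-summary still peers with internal routers configured with plain area 10 stub. If you want NSSA: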

! Both routers in the area:
router ospf 1
 area 10 nssa

Issue 4: Authentication Mismatch

OSPF on IOS-XE supports plain-text, MD5, and HMAC-SHA authentication. A key ID mismatch or wrong key value silently drops hellos: the neighbor never comes up and you see nothing obvious in logs.

Diagnose

R1# show ip ospf interface GigabitEthernet0/0 | include auth|key
  Simple password authentication enabled

R2# show ip ospf interface GigabitEthernet0/0 | include auth|key
  Message digest authentication enabled
    Youngest key id is 1

One is using plain-text, the other MD5. Also check for key ID mismatches:

R1# show ip ospf interface GigabitEthernet0/0 | include auth|key
  Message digest authentication enabled
    Youngest key id is 2

R2# show ip ospf interface GigabitEthernet0/0 | include auth|key
  Message digest authentication enabled
    Youngest key id is 1

Both sides have MD5 but different key IDs, so neither can validate the other's packets.

Fix — Standardize on MD5 Key ID 1

R1(config)# interface GigabitEthernet0/0
R1(config-if)# ip ospf authentication message-digest
R1(config-if)# ip ospf message-digest-key 1 md5 MySecureKey123

R2(config)# interface GigabitEthernet0/0
R2(config-if)# ip ospf authentication message-digest
R2(config-if)# ip ospf message-digest-key 1 md5 MySecureKey123

For modern IOS-XE (16.x+), consider using the newer key chain-based authentication:

key chain OSPF-KEYS
 key 1
  key-string MySecureKey123
  cryptographic-algorithm hmac-sha-256

interface GigabitEthernet0/0
 ip ospf authentication key-chain OSPF-KEYS

Issue 5: Duplicate Router IDs

Two routers with the same Router ID will have intermittent, hard-to-diagnose adjacency issues — one may come up but database exchange will be corrupted.

Diagnose

R1# show ip ospf neighbor
! Neighbor appears and disappears repeatedly

R1# show log | include OSPF
*Apr 27 14:23:11.445: %OSPF-4-DUP_RTRID_NBR: OSPF detected duplicate router-id 10.0.0.2
  from GigabitEthernet0/0

That syslog message is definitive. Check Router IDs:

R1# show ip protocols | include Router ID
  Router ID 10.0.0.2

R2# show ip protocols | include Router ID
  Router ID 10.0.0.2

Fix

R2(config)# router ospf 1
R2(config-router)# router-id 10.0.0.3
R2(config-router)# do clear ip ospf process
Reset ALL OSPF processes? [no]: yes

Always explicitly set Router IDs in production — don’t rely on the highest loopback IP election. This prevents surprises when interfaces change.
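If you keep Router IDs in an inventory or source-of-truth file, you can catch duplicates before they ever reach the network. Below is a minimal Python sketch; the inventory dictionary is hypothetical (R3 is invented for the example) and how you populate it is up to your tooling.

```python
from collections import defaultdict

def duplicate_router_ids(router_ids):
    """Map each router ID used more than once to the sorted hostnames using it."""
    by_id = defaultdict(list)
    for host, rid in router_ids.items():
        by_id[rid].append(host)
    return {rid: sorted(hosts) for rid, hosts in by_id.items() if len(hosts) > 1}

# Hypothetical inventory: hostname -> configured OSPF router ID.
INVENTORY = {"R1": "10.0.0.2", "R2": "10.0.0.2", "R3": "10.0.0.3"}

for rid, hosts in duplicate_router_ids(INVENTORY).items():
    print(f"router-id {rid} is configured on: {', '.join(hosts)}")
```

Running a check like this in a pre-deployment pipeline is cheaper than finding the duplicate through a flapping neighbor.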

Issue 6: Network Type Mismatch (Broadcast vs Point-to-Point)

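This one is subtle. If one end of an Ethernet link is set to point-to-point and the other is broadcast, the default timers still match, so the neighbors typically reach Full. The two routers describe the link differently in their router LSAs, though, so SPF can't use it and no routes are installed across the adjacency.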

R1# show ip ospf interface GigabitEthernet0/0 | include Network Type
  Network Type POINT_TO_POINT, Cost: 1

R2# show ip ospf interface GigabitEthernet0/0 | include Network Type
  Network Type BROADCAST, Cost: 1

Fix

! Both sides must match:
R1(config)# interface GigabitEthernet0/0
R1(config-if)# ip ospf network point-to-point

R2(config)# interface GigabitEthernet0/0
R2(config-if)# ip ospf network point-to-point

Point-to-point is often preferred on Ethernet links between two routers — it skips DR/BDR election and converges faster.

Troubleshooting Flapping Adjacencies

A neighbor that comes up and goes down repeatedly is harder to catch. Use these tools:

Check Neighbor History

R1# show ip ospf neighbor detail | begin 10.0.0.2
 Neighbor 10.0.0.2, interface address 192.168.1.2
    Neighbor is up for 00:02:14
    Number of state changes: 14
    Last state change 00:02:14 ago, State Full, Event HelloReceived

14 state changes on a neighbor that’s been up 2 minutes — it’s flapping hard. Now check why:

R1# show log | include OSPF|LINEPROTO
*Apr 27 14:01:22: %OSPF-5-ADJCHG: Process 1, Nbr 10.0.0.2 on Gi0/0 from FULL to DOWN, Neighbor Down: Dead timer expired
*Apr 27 14:01:47: %OSPF-5-ADJCHG: Process 1, Nbr 10.0.0.2 on Gi0/0 from LOADING to FULL

“Dead timer expired” means hellos are being dropped or delayed. Check for:

  • High CPU: show processes cpu sorted | exclude 0.00% (OSPF hello processing can starve under load)
  • Interface errors: show interface Gi0/0 | include error|drop|reset
  • QoS dropping control plane traffic: OSPF multicast (224.0.0.5 and 224.0.0.6) must be in a high-priority queue. See our QoS on Cisco IOS-XE guide for proper CoPP and QoS policy design
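When the log buffer is long, counting FULL-to-DOWN transitions per neighbor shows quickly which adjacency is flapping. Here is a rough Python sketch that assumes the %OSPF-5-ADJCHG message format shown above; the second log entry in the sample is invented to give the counter something to count.

```python
import re
from collections import Counter

ADJ_DOWN = re.compile(
    r"%OSPF-5-ADJCHG: Process \d+, Nbr (\S+) on (\S+) from FULL to DOWN"
)

def count_flaps(log_text):
    """Count FULL -> DOWN transitions per (neighbor, interface)."""
    flaps = Counter()
    for m in ADJ_DOWN.finditer(log_text):
        flaps[(m.group(1), m.group(2))] += 1
    return flaps

SAMPLE_LOG = """\
*Apr 27 14:01:22: %OSPF-5-ADJCHG: Process 1, Nbr 10.0.0.2 on Gi0/0 from FULL to DOWN, Neighbor Down: Dead timer expired
*Apr 27 14:01:47: %OSPF-5-ADJCHG: Process 1, Nbr 10.0.0.2 on Gi0/0 from LOADING to FULL
*Apr 27 14:03:10: %OSPF-5-ADJCHG: Process 1, Nbr 10.0.0.2 on Gi0/0 from FULL to DOWN, Neighbor Down: Dead timer expired
"""

for (nbr, intf), n in count_flaps(SAMPLE_LOG).most_common():
    print(f"{nbr} on {intf}: {n} drops")
```

The recovery line (LOADING to FULL) is ignored on purpose, so the count reflects outages and not every state change.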

EEM Script: Alert on OSPF Flaps

Catch flapping neighbors automatically and log details when it happens:

event manager applet OSPF-FLAP-ALERT
 event syslog pattern "OSPF-5-ADJCHG.*from FULL to DOWN"
 action 1.0 syslog priority critical msg "OSPF adjacency dropped - capturing state"
 action 2.0 cli command "enable"
 action 3.0 cli command "show ip ospf neighbor detail"
 action 4.0 cli command "show ip interface brief"
 action 5.0 cli command "show log | include OSPF"
 action 5.5 info type routername
 action 6.0 mail server "10.0.0.10" to "noc@company.com" from "router@company.com" subject "OSPF Flap Detected on $_info_routername" body "$_cli_result"

This EEM applet fires the moment OSPF logs an adjacency drop and emails a snapshot before the adjacency recovers and the evidence disappears. Note: $_cli_result captures the output of the last CLI action only. If you need all outputs, save each to a variable with action N set VAR "$_cli_result" and concatenate them in the body. The hostname comes from action info type routername, which populates $_info_routername.

Useful Debug Commands (Use Carefully in Production)

Debug commands generate significant output. Prefer the targeted debug ip ospf adj over broader options such as debug ip ospf packet or debug ip ospf events, which can overwhelm a router under load.

! Targeted adjacency debugging only:
R1# debug ip ospf adj
OSPF adjacency debugging is on

! What you'll see during a normal adjacency formation:
*Apr 27 14:05:01.123: OSPF-1 ADJ   Gi0/0: 2 Way Communication to 10.0.0.2, state 2WAY
*Apr 27 14:05:01.234: OSPF-1 ADJ   Gi0/0: Backup seen Event before WAIT timer
*Apr 27 14:05:01.345: OSPF-1 ADJ   Gi0/0: DR/BDR election
*Apr 27 14:05:01.456: OSPF-1 ADJ   Gi0/0: Elect BDR 10.0.0.2
*Apr 27 14:05:01.567: OSPF-1 ADJ   Gi0/0: Elect DR 10.0.0.1
*Apr 27 14:05:02.123: OSPF-1 ADJ   Gi0/0: Send DBD to 10.0.0.2 seq 0x2E9 opt 0x52 flag 0x7 len 32
*Apr 27 14:05:02.234: OSPF-1 ADJ   Gi0/0: Rcv DBD from 10.0.0.2 seq 0x1B3 opt 0x52 flag 0x7 len 32
*Apr 27 14:05:02.345: OSPF-1 ADJ   Gi0/0: NBR Negotiation Done. We are the SLAVE
*Apr 27 14:05:03.456: OSPF-1 ADJ   Gi0/0: Neighbor change Event
*Apr 27 14:05:03.567: OSPF-1 ADJ   Gi0/0: 10.0.0.2 is now Full
! Always turn off debug when done:
R1# undebug all
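The flag value in the DBD lines is worth decoding when you read this output. Per RFC 2328 the three low-order bits are I (Init, 0x04), M (More, 0x02), and MS (Master, 0x01), so flag 0x7 is the opening DBD in which a router sets all three. A tiny Python sketch with an illustrative function name:

```python
I_BIT, M_BIT, MS_BIT = 0x04, 0x02, 0x01  # RFC 2328 DBD flag bits

def decode_dbd_flags(flag):
    """Translate the 'flag 0x..' value from debug ip ospf adj into bit names."""
    names = []
    if flag & I_BIT:
        names.append("Init")
    if flag & M_BIT:
        names.append("More")
    if flag & MS_BIT:
        names.append("Master")
    return names

print(decode_dbd_flags(0x7))
```

If you see both routers repeating flag 0x7 DBDs with no "NBR Negotiation Done" line, negotiation is not completing, which matches the stuck EXSTART symptom from Issue 1.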

OSPF Troubleshooting Checklist

When an OSPF neighbor won’t come up, run through this in order:

  1. show ip ospf neighbor — What state are they stuck in?
  2. show ip ospf interface [int] — Timers, network type, authentication, area
  3. show interface [int] | include MTU — MTU match on both sides?
  4. show ip protocols | include Router ID — Duplicate Router IDs?
  5. show log | include OSPF — Any error messages?
  6. ping [neighbor-ip] — Layer 3 reachability?
  7. show ip access-list — ACL blocking OSPF multicast 224.0.0.5/6?
  8. debug ip ospf adj — Only if above doesn’t reveal the issue

Most OSPF problems are caught at step 2 or 3. MTU mismatches and timer differences account for the majority of real-world adjacency failures.

Verifying a Healthy OSPF Topology

Once adjacencies are up, verify the full topology is correct:

! Check all neighbors are Full:
R1# show ip ospf neighbor | include FULL
10.0.0.2          1   FULL/DR         00:00:38    192.168.1.2     Gi0/0
10.0.0.3          1   FULL/BDR        00:00:35    192.168.1.3     Gi0/0

! Verify routes are being learned:
R1# show ip route ospf
Codes: O - OSPF, IA - OSPF inter area

O     10.1.1.0/24 [110/2] via 192.168.1.2, 00:04:22, GigabitEthernet0/0
O     10.2.2.0/24 [110/3] via 192.168.1.2, 00:04:22, GigabitEthernet0/0
O IA  172.16.0.0/24 [110/11] via 192.168.1.3, 00:04:22, GigabitEthernet0/0

! Check LSDB is complete:
R1# show ip ospf database

            OSPF Router with ID (10.0.0.1) (Process ID 1)

                Router Link States (Area 0)
Link ID         ADV Router      Age         Seq#       Checksum Link count
10.0.0.1        10.0.0.1        412         0x80000004 0x00A3B2 3
10.0.0.2        10.0.0.2        398         0x80000003 0x00C4D1 2
10.0.0.3        10.0.0.3        445         0x80000002 0x00B2E4 2

If you’re also managing BGP on the same devices, internal OSPF is often used as the IGP underneath — understanding how the two interact is covered in our BGP fundamentals guide.

Final Thoughts

OSPF troubleshooting is methodical: the protocol is strict about its parameters, but it’s also very transparent about what’s wrong when you know which commands to run. The majority of real-world failures fall into four buckets — MTU mismatches, timer mismatches, authentication problems, and area type mismatches. Nail those four and you’ll resolve 90% of adjacency issues before ever reaching for debug.

Set explicit Router IDs, standardize hello/dead timers across your environment, and consider deploying the EEM flap alert script — catching adjacency drops automatically saves hours of reactive troubleshooting during an outage.
