Category: Networking

  • Can consumer AI troubleshoot networks?

    Other members of the IPA team and I were pleased to attend NANOG 94, where we found plenty of food, excellent drinks, great friends and colleagues, and state-of-the-art talks on many networking topics. This edition focused on datacenters, protocols, automation, platforms and, of course, AI.

    AI development and adoption are still riding the hype, and rather than slowing down any time soon they seem to be accelerating. One of the talks on this topic was “GenAI powered Network Automation”, which asked a very interesting question:

    • Can LLM Agents be Network Operators?

    Network troubleshooting can (and should) follow a structured approach based on standards and best practices, ideally leading to a standardized methodology that applies to most vendors, systems and platforms. This structured methodology not only improves consistency in how issues are diagnosed and resolved but also enhances collaboration across teams.

    By basing our efforts on established standards and industry best practices, we can reduce guesswork, minimize downtime, and increase efficiency. I mean, hey, I would really like to keep your phone from ringing with emergencies.

    But all of us know that there will be bumps in the road. Is AI up to this task? Or can it at least give us a helping hand?

    Of course, as with many new technologies over the years, there was excitement, consensus and disbelief, so we thought we would give it a try.

    Preparing our test environment

    Our objective here is to challenge some consumer AI tools with a very simple scenario.

    Two routers running IOS-XE, a few IPv4 and IPv6 addresses, OSPF and OSPFv3 as IGP, and iBGP between them.

    By breaking protocols and operational state, we expect the agents to restore the network to a functioning state with no previous information about the network.

    The configuration for both test routers is as simple as it gets.

    Router 1

    hostname R1
    ipv6 unicast-routing
    
    interface lo0
    ip address 192.168.1.1 255.255.255.255
    ipv6 address 2001:db8::192:168:1:1/128
    ipv6 ospf 1 area 0
    
    interface gi1
    ip address 10.1.2.1 255.255.255.252
    ipv6 address 2001:db8::2001:10:1:2:1/64
    ipv6 ospf 1 area 0
    
    router ospf 1
    router-id 192.168.1.1
    passive-interface default
    no passive-interface gi1
    network 192.168.1.1 0.0.0.0 area 0
    network 10.1.2.0 0.0.0.3 area 0
    
    router ospfv3 1
    router-id 192.168.1.1
    passive-interface default
    no passive-interface gi1
    ! no network statements here: OSPFv3 is enabled per interface
    

    Router 2

    hostname R2
    ipv6 unicast-routing
    
    interface lo0
    ip address 192.168.1.2 255.255.255.255
    ipv6 address 2001:db8::192:168:1:2/128
    ipv6 ospf 1 area 0
    
    interface gi1
    ip address 10.1.2.2 255.255.255.252
    ipv6 address 2001:db8::2001:10:1:2:2/64
    ipv6 ospf 1 area 0
    
    router ospf 1
    router-id 192.168.1.2
    passive-interface default
    no passive-interface gi1
    network 192.168.1.2 0.0.0.0 area 0
    network 10.1.2.0 0.0.0.3 area 0
    
    router ospfv3 1
    router-id 192.168.1.2
    passive-interface default
    no passive-interface gi1
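
    As a quick sanity check of the addressing in the two configs above, here is a minimal Python sketch that confirms both ends of the gi1 link share a subnet in each address family. The addresses are copied from the configs. The helper names (`same_subnet`, `check_links`) are illustrative and not part of the lab.

```python
import ipaddress

# Addresses copied from the R1 and R2 configs above
# (255.255.255.252 is written as /30).
LINKS = {
    "gi1 IPv4": ("10.1.2.1/30", "10.1.2.2/30"),
    "gi1 IPv6": ("2001:db8::2001:10:1:2:1/64", "2001:db8::2001:10:1:2:2/64"),
}


def same_subnet(a: str, b: str) -> bool:
    """True when both interface addresses are in the same network and differ."""
    ia, ib = ipaddress.ip_interface(a), ipaddress.ip_interface(b)
    return ia.network == ib.network and ia.ip != ib.ip


def check_links(links: dict) -> dict:
    """Map each link name to the result of the same-subnet check."""
    return {name: same_subnet(a, b) for name, (a, b) in links.items()}


if __name__ == "__main__":
    for name, ok in check_links(LINKS).items():
        print(f"{name}: {'ok' if ok else 'MISMATCH'}")
```

    The same check could be pointed at any pair of interface addresses before blaming a routing protocol for a problem that is really an addressing mistake.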

    IPv4 and IPv6 Routing tables

    The previous config leaves the network in the following state. We know this, but we won’t feed it to the AI agents. We want to challenge them.

    And of course, we need a prompt so the models understand what we are doing. We will use the following:

    You are a network engineer. Your task is to help in tshoot and hopefully fix issues on your own. The only thing we know is something is broken between the communication between 2 routers. We dont have any knowledge of the previous state of the network, and lucky us, we can get CLI into both devices and check and change config. I dont want you to give me instructions but rather act as the engineer itself. I will just be your interface between your train of thought and the command line.

    How would you like to proceed?

    Gemini as a network operator – Full conversation

    ‎Gemini – Router Communication Troubleshooting Session


    Okay, understood. I'm the network engineer, and you're my interface to the devices. My goal is to troubleshoot and fix the communication issue between these two routers, starting from a blank slate regarding their previous state.
    First, I need to establish a baseline. I'll start by checking the basic connectivity and configuration on both routers.
    My initial approach will be:
    1. Verify Physical Connectivity and Interface Status:

    Are the interfaces up/up?
    Are there any obvious errors on the interfaces (CRC, input errors, etc.)?
    2. Check IP Addressing:

    Are the interfaces on the same subnet?
    Is the subnet mask correct?
    3. Test Basic Reachability (Ping):

    Can each router ping its directly connected neighbor's interface IP?
    4. Check Routing Table:

    Do both routers have routes to reach each other's networks?
    Are there any default routes that might be misdirecting traffic?
    What routing protocols are in use, if any?
    5. Inspect for ACLs or Firewall Rules:

    Are there any access control lists or firewall rules blocking traffic between the two routers?
    6. Review Log Messages:

    Are there any relevant error messages in the router logs that indicate a problem?
    Given this, my first step will be to log into both routers and check the interface status and IP configuration.
    What is the hostname or IP address of the first router you'd like me to investigate? Let's call them Router A and Router B for now. I'll need to know which one I'm on when I issue commands

    first router is R1, the other is R2

    I can get cli now into R1


    Excellent. I'm now on R1.
    My first command will be to check the interface status and IP configuration.
    Please issue the following command on R1:
    show ip interface brief
    This will give me a quick overview of all interfaces, their IP addresses, and their current status (up/down, administratively down).

    Okay, this is a critical piece of information.
    On R1, the GigabitEthernet1 interface, which has the IP address 10.1.2.1, is in an "administratively down" state. This means it's been manually shut down. The other GigabitEthernet interfaces are also administratively down and unassigned. The Loopback0 interface is up, but that's not the interface connecting to R2.
    This is likely the root cause of the communication issue. We need to bring up the interface that connects to R2. Assuming GigabitEthernet1 is the interface intended to connect to R2 (given its IP 10.1.2.1), we need to enable it.
    My next step is to bring up GigabitEthernet1 on R1.
    Please enter the following commands on R1:
    configure terminal
    interface GigabitEthernet1
    no shutdown
    end
    copy running-config startup-config (to save the change)
    After this, I'll need to re-check the interface status on R1

    done, what should i do next

    Great! Now that we've issued the no shutdown command on GigabitEthernet1 on R1 and saved the configuration, the immediate next step is to verify if the interface has come up.
    Please issue the show ip interface brief command on R1 again.
    I'm looking for the "Status" and "Protocol" columns for GigabitEthernet1 to both show "up". If they do, it means Layer 1 and Layer 2 are operational on R1's side.

    Excellent! On R1, GigabitEthernet1 is now "up up". This means the physical layer and data link layer are operational on R1's side for that interface.
    Now we need to do the same investigation on R2. The communication is between two routers, so both ends of the link need to be functional.
    My next step is to investigate R2. Please get CLI access to R2.
    Once on R2, I'll need you to issue the same command we started with on R1:
    show ip interface brief
    This will give me the status of its interfaces and IP configuration, which is crucial for determining the next steps for R2.

    R2 looks like this


    Alright, I see I have console access to R2.
    Please press RETURN as prompted to get to the command prompt.
    Once you're at the prompt (it should look something like R2> or R2#), please issue the following command:
    show ip interface brief
    This will allow me to assess the status of R2's interfaces, similar to what we did on R1.

    Okay, this is excellent news!
    On R2, GigabitEthernet1 is also "up up" and has the IP address 10.1.2.2.
    Let's summarize:

    R1's GigabitEthernet1 is 10.1.2.1, status "up up".
    R2's GigabitEthernet1 is 10.1.2.2, status "up up".
    Both interfaces are in the same 10.1.2.0/24 subnet, and both are up at Layer 1 and Layer 2. This suggests our physical and data link layers are working correctly between these two interfaces.
    Now that we have both ends of the link up and configured with IP addresses in the same subnet, the next logical step is to test basic IP reachability between them.
    My next step is to attempt a ping from R2 to R1's GigabitEthernet1 interface.
    Please issue the following command on R2:
    ping 10.1.2.1
    Tell me the results of the ping. This will confirm Layer 3 connectivity.

    Excellent! We have 100% success rate on the ping from R2 to R1's GigabitEthernet1 interface (10.1.2.1). The !!!!! output confirms that all 5 ICMP echo requests were successful.
    This means that basic Layer 3 connectivity is established and working correctly between the GigabitEthernet1 interfaces of R1 and R2. The immediate communication issue between the two directly connected interfaces is resolved.
    Now, the initial problem description was "something is broken between the communication between 2 routers." While we've fixed the direct link, this doesn't necessarily mean all communication between the routers (e.g., between their internal networks or loopbacks) is resolved if there are other networks involved.
    My next step is to investigate the routing tables on both R1 and R2 to ensure they know how to reach each other's other networks (like the Loopback0 interfaces we saw earlier) or any other networks that might be behind them. Without knowledge of the network topology, I'll start with the Loopback0 interfaces as an example.
    First, let's check the routing table on R1.
    Please issue the following command on R1:
    show ip route
    I'll be looking for routes to 192.168.1.2 (R2's Loopback0) and any other expected networks.

    Okay, I see R1's routing table.
    Looking at the output, I see the following relevant entries:

    C 10.1.2.1/32 is directly connected, GigabitEthernet1: This is R1's own IP address on the link.
    L 10.1.2.1/32 is directly connected, GigabitEthernet1: This is the local route for the G1 interface.
    C 10.1.2.0/30 is directly connected, GigabitEthernet1: This is the directly connected network between R1 and R2.
    C 192.168.1.1 is directly connected, Loopback0: This is R1's own Loopback0 interface.
    O 192.168.1.2 [110/2] via 10.1.2.2, 00:11:45, GigabitEthernet1: This is a very important entry! It shows that R1 has learned a route to 192.168.1.2 (which we know is R2's Loopback0) via 10.1.2.2 (R2's GigabitEthernet1) through OSPF. The 00:11:45 indicates this route has been learned recently.
    The presence of the O (OSPF) route to 192.168.1.2 on R1 is a good sign. It tells me that OSPF is likely running and exchanging routing information between R1 and R2.
    Now, I need to check the routing table on R2 to confirm it also has a route back to R1's Loopback0 (192.168.1.1).
    Please issue the following command on R2:
    show ip route
    This will help us confirm if two-way routing is established for these non-directly connected networks.

    Excellent! Looking at R2's routing table, I see the following:

    C 10.1.2.0/30 is directly connected, GigabitEthernet1: The directly connected link network.
    L 10.1.2.2/32 is directly connected, GigabitEthernet1: R2's own IP on the link.
    C 192.168.1.2 is directly connected, Loopback0: R2's own Loopback0 interface.
    O 192.168.1.1 [110/2] via 10.1.2.1, 00:12:45, GigabitEthernet1: This confirms that R2 has also learned a route to R1's Loopback0 (192.168.1.1) via R1's GigabitEthernet1 IP (10.1.2.1) through OSPF.
    This is exactly what we want to see. Both routers have routes to each other's directly connected networks and to each other's Loopback0 interfaces, and these routes are being learned via OSPF. This indicates that OSPF is functioning correctly and exchanging routes between R1 and R2.
    Since we've:

    Brought up the interfaces on both sides.
    Verified direct Layer 3 connectivity with ping.
    Confirmed that OSPF is exchanging routes for the Loopback interfaces.
    The next logical step is to attempt to ping from one router's Loopback0 to the other router's Loopback0. This will test end-to-end routing between the two routers beyond just the directly connected link.
    My next step: I'd like to attempt a ping from R1's Loopback0 to R2's Loopback0.
    Please issue the following command on R1:
    ping 192.168.1.2 source Loopback0
    Tell me the results. If this ping is successful, it means the fundamental communication between the two routers is fully restored and routing is working as expected.

    Fantastic! We have a 100% success rate on the ping from R1's Loopback0 to R2's Loopback0. The !!!!! confirms all 5 echo requests were successful.
    This is a strong indication that the communication issue between the two routers has been resolved. We've gone from an "administratively down" interface to full end-to-end IP connectivity and verified routing protocol operation (OSPF).
    To summarize what we've done and the current state:

    Identified Root Cause on R1: Found GigabitEthernet1 on R1 was administratively down.
    Remediation on R1: Brought GigabitEthernet1 on R1 up with no shutdown.
    Verification on R1: Confirmed GigabitEthernet1 on R1 was "up up".
    Investigation on R2: Confirmed GigabitEthernet1 on R2 was already "up up" with an appropriate IP.
    Basic Connectivity Test: Successfully pinged from R2's GigabitEthernet1 to R1's GigabitEthernet1, confirming Layer 3 reachability on the direct link.
    Routing Table Verification: Confirmed both R1 and R2 had OSPF-learned routes to each other's Loopback0 interfaces.
    End-to-End Connectivity Test: Successfully pinged from R1's Loopback0 to R2's Loopback0, demonstrating full routing capability between the two routers beyond the direct link.
    At this point, based on the information provided and the successful pings, the core communication issue between R1 and R2 appears to be fixed.
    Is there anything else specific you'd like me to check, or any other aspects of their communication that were originally failing? Otherwise, my assessment is that the problem is resolved.

    Gemini’s approach

    • Gemini simply assumed we were talking about Cisco routers. Not good.
    • It never asked for the config or the intended state of the network. It just wanted to bring up whatever was down. What if it was down for a good reason?
    • It also assumed we were running only IPv4.
    • It did a great job identifying the down interfaces, validating L3 reachability with pings and confirming OSPF routes.
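
    The second point can be made concrete. Below is a minimal Python sketch that parses `show ip interface brief` output and only proposes `no shutdown` when a record of the intended state says the interface should be up. The sample output and the `INTENDED` table are hypothetical: they are modeled on what the assistants described for R1, not captured from the lab.

```python
import re
from typing import NamedTuple


class Interface(NamedTuple):
    name: str
    address: str
    status: str
    protocol: str


# Hypothetical output, modeled on the IOS "show ip interface brief" layout.
SAMPLE = """
Interface              IP-Address      OK? Method Status                Protocol
GigabitEthernet1       10.1.2.1        YES manual administratively down down
GigabitEthernet2       unassigned      YES unset  administratively down down
GigabitEthernet3       unassigned      YES unset  administratively down down
Loopback0              192.168.1.1     YES manual up                    up
"""

# Hypothetical source of truth: True means the interface is meant to be up.
INTENDED = {"GigabitEthernet1": True, "GigabitEthernet2": False}

# name, address, OK? (YES/NO), method, status (may be two words), protocol
ROW = re.compile(r"^(\S+)\s+(\S+)\s+(?:YES|NO)\s+\S+\s+(.+?)\s+(\S+)\s*$")


def parse_brief(output: str) -> list:
    """Return one Interface per data row; the header and blank lines are skipped."""
    rows = []
    for line in output.splitlines():
        match = ROW.match(line)
        if match:
            rows.append(Interface(*match.groups()))
    return rows


def plan(interfaces: list, intended: dict) -> list:
    """Propose an action for each administratively down interface."""
    actions = []
    for intf in interfaces:
        if intf.status != "administratively down":
            continue
        want_up = intended.get(intf.name)
        if want_up is True:
            actions.append(f"{intf.name}: no shutdown")
        elif want_up is False:
            actions.append(f"{intf.name}: leave shut (intended)")
        else:
            actions.append(f"{intf.name}: ask the operator before changing")
    return actions


if __name__ == "__main__":
    for action in plan(parse_brief(SAMPLE), INTENDED):
        print(action)
```

    The point of the design is the third branch: when nothing is known about the intended state, the safe action is a question, not a config change.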

    ChatGPT as a network operator – Full conversation

    https://chatgpt.com/share/685a0d29-4960-8006-9511-7429aa97598b


    Perfect. Since we don’t know anything about the previous state, we’ll treat this as a black-box scenario and go step by step, starting from basic reachability and progressively checking control plane and data plane.
    Let’s start with these assumptions:

    Two routers, A and B.
    Something’s broken between them.
    I can interact with their CLI through you.
    Let’s begin with Router A.
    Please run the following commands on Router A and share the outputs:

    show ip interface brief 
    show ip route 
    ping <Router B interface IP> (if known, or else we’ll find it from #1) 
    show cdp neighbors or show lldp neighbors (depending on what’s supported) 
    Let’s gather these basics, and we’ll go from there.

    Thanks. Here's what I see on Router A (R1):

    All physical interfaces (Gi1 through Gi4) are administratively down. 
    Only the loopback interface Loopback0 is up with IP 192.168.1.1. 
    There are no routes other than the directly connected loopback.
    CDP and LLDP are both disabled, so we can't auto-discover neighbors.
    There's no interface on this router currently capable of communicating with anything external.
    So at this point, R1 is isolated—it’s not connected to anything, so there's no way it can reach Router B or any other device.
    Let’s now check Router B.
    Please run the same set of commands on Router B:

    show ip interface brief 
    show ip route 
    show cdp neighbors or show lldp neighbors 
    Once I see the status of B, I’ll decide the next best step.

    Excellent. Here's what we can conclude from R2:

    GigabitEthernet1 is up and assigned 10.1.2.2/30. 
    Loopback0 is up with 192.168.1.2. 
    Routing table confirms that 10.1.2.0/30 is directly connected. 
    This router is ready and active on Gi1. 
    Now, going back to R1, it has 10.1.2.1 configured on its GigabitEthernet1, but that interface is administratively down. That’s our problem.
    To bring communication up:
    Enable GigabitEthernet1 on R1.

    Please run this on R1:

    conf t
    interface GigabitEthernet1
    no shutdown
    end

    Then, confirm with:

    show ip interface brief


    Once that’s done, I’ll have you test connectivity.

    Loopback0. That tells us:

    Layer 1 and 2 are good (interface is up/up).
    Layer 3 is functional (OSPF adjacency established).
    OSPF is already configured and working between R1 and R2.
    Now let’s validate full connectivity.
    From R1, please run:

    ping 10.1.2.2
    ping 192.168.1.2
    show ip route


    This will confirm end-to-end reachability and whether OSPF has populated the routing table properly. Let’s check that next.

    Excellent — here’s the current status:

    ICMP to both 10.1.2.2 (R2's interface) and 192.168.1.2 (R2's loopback) is successful. 
    OSPF route to 192.168.1.2/32 is present and marked as O (intra-area), with a metric of 2. 
    R1 has full Layer 3 connectivity to R2.
    OSPF adjacency is up and functioning.
    Conclusion: The communication issue between the two routers is resolved.
    The root cause was that GigabitEthernet1 on R1 was administratively shut down. Once brought up, OSPF automatically established adjacency and exchanged routes.
    Let me know if you want to validate any specific protocol or service over this path now — or if we consider this troubleshooting session complete.

    ChatGPT’s approach

    • ChatGPT also assumed we were talking about Cisco routers. Are there no other platforms around?
    • It did not ask for the config or the intended state of the network either. It just wanted to bring up whatever was down.
    • It also assumed we were running only IPv4.
    • It tried to figure out the network topology with CDP and LLDP, which I consider a good attempt.
    • It also figured out we were running OSPF, probably from the informational messages.
    • It did a great job identifying the down interfaces, validating L3 reachability with pings and confirming OSPF routes.
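
    Both assistants stopped at IPv4, while the lab runs dual stack with OSPF, OSPFv3 and iBGP. As a rough sketch of what a dual-stack pass could look like, the Python below builds a bottom-up checklist that asks the same question for both address families on every router. The layer names and the `checklist` helper are illustrative. The commands are standard IOS-XE show commands, but treat the exact set as an assumption to adapt to your platform.

```python
# Each layer pairs the IPv4 command with its IPv6 counterpart, so neither
# address family can be silently skipped.
LAYERS = [
    ("interfaces", {"ipv4": "show ip interface brief",
                    "ipv6": "show ipv6 interface brief"}),
    ("igp neighbors", {"ipv4": "show ip ospf neighbor",
                       "ipv6": "show ipv6 ospf neighbor"}),
    ("routing table", {"ipv4": "show ip route",
                       "ipv6": "show ipv6 route"}),
    ("bgp sessions", {"ipv4": "show bgp ipv4 unicast summary",
                      "ipv6": "show bgp ipv6 unicast summary"}),
]


def checklist(routers, families=("ipv4", "ipv6")):
    """Return (router, layer, command) steps, lowest layer first."""
    steps = []
    for layer, commands in LAYERS:
        for router in routers:
            for family in families:
                steps.append((router, layer, commands[family]))
    return steps


if __name__ == "__main__":
    for router, layer, command in checklist(["R1", "R2"]):
        print(f"{router:<3} {layer:<14} {command}")
```

    Walking the list layer by layer keeps the session structured: there is little point in checking BGP sessions on either router before the interfaces and the IGP look healthy in both families.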
  • An Overview of Network Problems

    This is a rehash of an old live stream
    With a bit more experience in the editor now, I hope it comes across better

    “Houston, we have a problem”
    https://youtu.be/jIv2_Jho8lo

    I run many of my instances on Digital Ocean
    If you want to create your own, use this referral link and you will get $200 in credit to spend on whatever you like
    https://m.do.co/c/47b87a73eb62

    Find us on
    – YouTube https://www.youtube.com/@cceste
    – Instagram https://www.instagram.com/ccesteok
    – Facebook https://www.facebook.com/ccesteok
    – Discord https://discord.gg/ZWQVg7cgdR

  • The Case of the IPv6 Point-to-Point Link: /64 or /127?

    Which is the better choice when connecting two devices over IPv6?
    A /64? Or a /127?
    I explain the technical reasons why you should use a /64 to make your life easier
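
    To put numbers on the two options, here is a small Python sketch that compares the address space of a /64 and a /127 carved from the same prefix. The prefix `2001:db8:0:1::` comes from the documentation range and is only an example. The `describe` helper is illustrative.

```python
import ipaddress


def describe(prefix: str) -> dict:
    """Summarize the size and boundaries of an IPv6 prefix."""
    net = ipaddress.ip_network(prefix)
    return {
        "prefix": str(net),
        "addresses": net.num_addresses,
        "first": str(net[0]),
        "last": str(net[-1]),
    }


# Example prefixes from the documentation range, not from a real network.
p2p_64 = describe("2001:db8:0:1::/64")
p2p_127 = describe("2001:db8:0:1::/127")

if __name__ == "__main__":
    for info in (p2p_64, p2p_127):
        print(info)
```

    A /127 holds exactly two addresses, one per router, while a /64 holds 2^64. The sketch only shows the size difference. The operational trade-offs are what the video and the documents below discuss.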

    RFC6164 Using 127-Bit IPv6 Prefixes on Inter-Router Links
    https://datatracker.ietf.org/doc/html/rfc6164

    Internet-Draft draft-palet-v6ops-p2p-links-04 IPv6 Point-to-Point Links
    https://datatracker.ietf.org/doc/html/draft-palet-v6ops-p2p-links-04

    Configuring an LPM Table With Junos OS
    https://www.juniper.net/documentation/en_US/junos/topics/task/configuration/ipv6-addresses-65-127-qfx-series.html

  • MPLS 09 – L3VPN on MikroTik RouterOS v6

    Continuing with VPNs over MPLS, we take a quick look at L3VPN on MikroTik RouterOS v6

    RouterOS v6 bug
    https://forum.mikrotik.com/viewtopic.php?t=78816&start=100

  • Racks and devices in Netbox

    One of the main functions of Netbox is to act as a DCIM, or Data Center Infrastructure Management, tool
    In other words, where is my equipment!

    In this video we create a rack and a switch to serve a customer

    This Netbox instance runs on Digital Ocean

  • MPLS 08 – What is L3VPN or IP VPN

    Continuing with VPNs over MPLS, we take a quick look at the theory behind IP VPN, or L3VPN

    00:00 Intro and a little music
    00:09 Welcome
    01:00 Agenda
    02:00 Terminology and Concepts
    05:24 Control Plane vs Data Plane
    05:55 P Routers vs PE Routers
    06:23 Different routing tables on the PE
    07:31 What is a VRF?
    09:21 Control Plane and BGP Update
    12:28 RD or Route Distinguisher
    13:20 RT or Route Target
    14:14 VPN Label
    15:26 Putting it all together!
    17:50 Forwarding Plane or Data Plane
    18:50 Packet Forwarding
    20:24 Label Stack
    20:51 Farewell

  • Assigning contacts to sites in Netbox

    If you have equipment installed at a site, you need to know whom to contact so you can get on site and gain access, perform maintenance, or respond to an emergency

    This Netbox instance runs on Digital Ocean

  • Netbox DCIM Tenants and Sites

    In our network or datacenter infrastructure, devices exist at some site. Come on, not everything is cloud, and even when it is cloud, that also lives at some site
    Netbox lets us record each of these details so that no piece of equipment goes missing

    This Netbox instance runs on Digital Ocean

  • Using Cloudflare as CDN and Proxy

    Even though we are serving our service with Nginx, it doesn’t hurt to put an extra layer on top, which Cloudflare’s free plan gives us
    This lets us use its CDN and DDoS protection if needed

  • Automatic SSL with Let’s Encrypt in Docker

    We can’t leave the service exposed over HTTP; sensitive information is exchanged, such as credentials and network infrastructure data
    We will use a helper container that manages SSL termination for our Netbox

    The project repo is at
    https://github.com/nginx-proxy/acme-companion
