VM to VM communication, same network, different compute hosts

In the last post, we spoke about VM to VM communication when the VMs belong to the same network and happen to get deployed on the same compute host. That is a convenient scenario, but in a big OpenStack deployment it’s unlikely that all the VMs belonging to one network will end up on the same compute host. The more likely scenario is that VMs will be spread across multiple compute hosts.

When the VMs lived on the same host, unicast traffic was handled by br-int. But we have to remember that br-int is local to the compute host, so when VMs are deployed on multiple compute hosts another technique is needed: traffic has to flow between the compute hosts over an overlay network. br-tun is responsible for establishing and handling the overlay network, which can be VXLAN or GRE depending on your choice.
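
If you want to see the overlay with your own eyes, a quick check on a compute host is sketched below. This is only an illustration; the exact port names depend on your deployment.

    # List the ports on br-tun: there should be one vxlan-* port per peer
    # host, plus the patch port towards br-int (names vary per deployment)
    ovs-vsctl list-ports br-tun
    # Full topology view, including the patch ports between br-int and br-tun
    ovs-vsctl show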

Let’s look at what this looks like.

[diagram vm-t-vm2: VM to VM traffic across two compute hosts]

When VM1 wants to send unicast traffic to VM2, traffic flows down from the vNIC to the tap device, to the qbr bridge, then through the qvb-qvo veth pair. This time it gets VLAN tagged on br-int but has to exit through the patch interface to the br-tun bridge. br-tun strips the VLAN ID from the traffic and pushes it to every compute host in the environment over a dedicated lane, the VXLAN tunnel ID. You can think of the VXLAN tunnel ID as a way to segregate traffic from different networks when it enters the overlay network (VXLAN in our case).

Let’s look again at the same example of the VMs test and test2, but this time they are on different compute hosts. Their logical diagram remains unchanged.

[diagram vm1-vm23: logical layout of the test and test2 VMs]

Now let’s look at Compute host 1 that hosts the “test” VM.

[screenshot: compute host 1, hosting the “test” VM]

We will focus on the qvo portion and the br-tun bridge, since we already know how the traffic flows until it reaches br-int. So let’s see the VLAN tag for the traffic from the “test” VM.

[screenshot: br-int definition showing the qvo port of “test” with VLAN tag 1]

As you can see from the br-int definition, this traffic is tagged with VLAN ID 1.
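
A quick way to read that tag directly is sketched below; the qvo port name is made up for the example, so use the one belonging to your own instance.

    # Print the VLAN tag assigned to the instance's qvo port on br-int
    # (qvo4874fb2a is an illustrative name)
    ovs-vsctl get Port qvo4874fb2a tag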

We also know that this traffic has to exit the compute host via br-tun. So let’s look at the br-tun OpenFlow rules. We’re expecting to see the VLAN tag being stripped and the VXLAN tunnel ID being added before the traffic is sent over the overlay network.

[screenshot: br-tun OpenFlow rules on compute host 1]

As you can see, outbound traffic with VLAN tag 1 gets its VLAN ID stripped and gets loaded onto tunnel ID 0x39. So we know that the traffic will be sent to all compute hosts in the environment (and the network hosts as well) over tunnel ID 0x39.
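
If you want to find that rule yourself, a rough sketch is below. The flow syntax is approximate and the exact table numbers vary between releases.

    # Dump br-tun's OpenFlow rules and look for the VLAN 1 -> tunnel 0x39
    # translation described above
    ovs-ofctl dump-flows br-tun | grep dl_vlan=1
    # We expect an outbound flow of roughly this shape:
    #   ... dl_vlan=1 actions=strip_vlan,set_tunnel:0x39,output:...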

Let’s see what compute host 2, which hosts the “test2” VM, looks like.

[screenshot: compute host 2, hosting the “test2” VM]

Let’s look at the qvo portion and the br-tun definition

[screenshot: br-int definition on compute host 2 showing the qvo port with VLAN tag 2]

We see that the VLAN tag for the qvo interface is 2, which is different from compute host 1. This is expected. Since the two VMs live on different hosts, there is no guarantee that their qvo interfaces will have a common VLAN ID.

So let’s look at the br-tun flow rules

[screenshot: br-tun flow rules on compute host 2]

As you can see, incoming traffic on VXLAN tunnel ID 0x39 gets tagged with VLAN tag 2 and is sent to br-int. In other words, a path is opened for it to reach the qvo of the “test2” instance. Conversely, traffic outbound from the “test2” VM, i.e. with VLAN tag 2, gets its VLAN tag stripped and is sent over the overlay network with VXLAN tunnel ID 0x39.
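
The inbound direction can be checked the same way on compute host 2. Again this is only a sketch; the flow format is approximate.

    # Traffic arriving with tunnel ID 0x39 should get tagged with the local
    # VLAN 2 before being handed to br-int
    ovs-ofctl dump-flows br-tun | grep tun_id=0x39
    # Expected shape: ... tun_id=0x39 actions=mod_vlan_vid:2,...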

So the basic idea is that traffic gets VLAN tagged on br-int, and if it is destined to leave the host it is sent through br-tun. br-tun does a VLAN ID to VXLAN tunnel ID translation, where a dedicated VXLAN tunnel ID is assigned to every tenant network in your environment.

One point to mention here is that br-tun is smart: instead of always sending the traffic over the overlay network to every compute host and network host in the environment, it gradually learns what sits where. In other words, the next time the “test” instance sends traffic to the “test2” instance, the traffic will be sent only to compute host 2, not to all compute hosts in the environment. This is done by adding an OpenFlow rule for the MAC address of the “test2” instance’s interface to the br-tun flows.
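
You can watch this learning happen by filtering the br-tun flows on the MAC address of the “test2” interface. The MAC address below is a placeholder; use the real one from your instance.

    # After the first exchange a unicast flow for test2's MAC should appear,
    # pointing only at compute host 2's tunnel (MAC address is illustrative)
    ovs-ofctl dump-flows br-tun | grep fa:16:3e:00:00:02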


VM to VM communication: Same network & same compute host

In the physical world, machines that belong to the same network communicate with each other without routers. The same is true in OpenStack: VMs on the same network communicate without routers.

When two VMs belonging to the same network happen to get deployed on the same compute host, their logical diagram looks like this

[diagram vm-part8: two VMs on the same compute host]

As we can see above, each VM has its own tap device, qbr bridge and qvb-qvo veth pair, and both connect to br-int. br-int is in charge of VLAN tagging the traffic, and in this case it will tag the traffic of both VMs with the same VLAN ID, since they belong to the same network.

We can verify this in the following example: two VMs, test and test2, belong to the same network and the same subnet.

[screenshot vm-part9: the test and test2 VMs on the same network and subnet]

One thing to mention here: VLAN tags for the same network on the same host are the same. This applies regardless of whether the VMs are on the same subnet or different subnets. Now let’s look into the test & test2 logical diagram and focus on the qbr bridge definitions and the integration bridge definition.

[diagram vm-part12: logical layout of the test and test2 VMs]

Using brctl show, we can see the qbr bridge for every VM and the associated interfaces.

[screenshot vm-part10: brctl show output listing the qbr bridges and their interfaces]

Now let’s look at the definition of the integration bridge using ovs-vsctl show

[screenshot vm-part11: ovs-vsctl show output for br-int]

As we see in the previous image, there are two qvo interfaces with VLAN tag “1”. So the idea is that since the VMs are on the same network, their qvo interfaces get the same VLAN tag on the same host. This way traffic can flow normally, just as in the physical world, where switch ports are segregated using VLAN tags.
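
To double check, you can ask OVS for the tag of each qvo port explicitly. The port names below are placeholders for the two instances’ ports.

    # Both ports should return the same tag, 1 in this example
    ovs-vsctl get Port qvoAAAAAAAA tag
    ovs-vsctl get Port qvoBBBBBBBB tag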

Unicast traffic flows between the test and test2 VMs within the same host using the br-int bridge, over the VLAN tag dedicated to this particular network.

In OpenStack, as in the physical world, switches have no idea whether your machines/VMs are on different IP subnets. Switches operate at layer 2, so for them subnets are not visible. This is the reason that VLAN tag IDs are dedicated per network, not per subnet. So if you have a network with two subnets and a VM on each, their qvo interfaces will have the same VLAN tag if they end up on the same compute host.

The next post will be about VM to VM communication on the same network but different compute hosts.


Traffic flows from an Openstack VM

As we mentioned in the last post, after leaving the VM, traffic flows through a set of Linux virtual devices/switches to reach its destination. Outbound traffic moves downwards through the diagram, while inbound traffic moves upwards.

The flow of traffic from the VM goes through the following steps

[diagram VM-network: virtual devices between the vNIC and the OVS bridges]

  • A VM generates traffic that goes through its internal vNIC
  • Traffic reaches the tap device, where it is filtered by iptables rules that implement the security group attached to this VM
  • Traffic leaves the tap device and goes through the qbr bridge
  • qbr bridge hands the traffic over to qvb
  • qvb hands the traffic over to qvo
  • Traffic reaches the br-int. br-int VLAN tags the traffic and either
    • sends it to another port on br-int if the traffic is destined locally
    • sends it through the patch-tun interface to br-tun if the traffic is destined outside the host
  • br-tun receives the traffic on patch-int interface. It sends it through the established tunnels to other compute hosts and network hosts in the environment
    • br-tun is smart: it adds specific OpenFlow rules that reduce wasted traffic (i.e. traffic sent to every host). It learns where VMs and routers live and sends traffic only to the hosts that actually have those VMs or routers on them
    • Traffic is segregated in the tunnels using a dedicated tunnel_id for each tenant network. This way one network’s traffic doesn’t get mixed with other tenant networks’ traffic

The above is the logical layout of the traffic flow. Now let’s map everything to what physically exists on the compute host. The first piece is the running VM itself. To view the running process of the VM, we can do a ps -ef

[screenshots vm-part1, vm-part2: ps -ef output showing the qemu-kvm process and its tap device]

So, as we see above, the VM is a qemu-kvm process (if you are using this hypervisor) that is attached to a tap device with a certain MAC address.
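
If you want to reproduce this, something along these lines works on most KVM compute hosts (the instance name below is hypothetical):

    # Find the qemu-kvm process of the instance; its -netdev/-device arguments
    # show the tap device and the vNIC MAC address it is attached to
    ps -ef | grep qemu | grep instance-00000001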

If we go a bit further, we can see how the qbr bridge is implemented using brctl show

[screenshots vm-part3, vm-part4: brctl show output for the qbr bridge]

As shown above, a qbr bridge exists with two interfaces: the tap device and the qvb interface
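
The equivalent command line check is sketched below; the bridge name is only an example built from the tap name above.

    # Each instance gets its own qbr bridge holding the tap and qvb interfaces
    brctl show
    # On hosts without bridge-utils, this gives a similar view
    ip link show master qbr4874fb2a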

To verify that iptables rules are implemented on the tap device to reflect the security group rules, we can use iptables -L | grep tap4874fb2a (tap4874fb2a being the tap device name mentioned above)

[screenshot vm-part5: iptables rules matching the tap device]

The last step is to view the OVS switches using ovs-vsctl show

[screenshot vm-part6: running ovs-vsctl show]

The output of ovs-vsctl show

[screenshot vm-part7: ovs-vsctl show output listing br-int and br-tun]

As shown above, two switches exist: br-tun and br-int

  • br-int has two ports
    • patch-tun to connect br-int to br-tun
    • qvo interface which is VLAN tagged with tag 1
  • br-tun has two ports
    • patch-int to connect br-tun to br-int
    • vxlan interface which establishes the VXLAN tunnel to other hosts in the environment; there should be one such interface per remote host (a quick way to inspect one is sketched below)
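
A quick way to inspect one of those tunnel ports is shown below; the port name is a placeholder, and the remote_ip/local_ip options identify the two hosts the tunnel connects.

    # Show the configuration of one vxlan port on br-tun
    # (vxlan-0a000a02 is an example name)
    ovs-vsctl list interface vxlan-0a000a02
    # The "options" column should contain something like:
    #   {local_ip="10.0.10.1", remote_ip="10.0.10.2", key=flow, ...}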

One thing to remember is that a VXLAN tunnel is the highway that connects multiple compute and network hosts. The real segregation happens using tunnel IDs, which act as lanes dedicated to every tenant network.

Neutron: How a VM communicates

In order to understand how a VM communicates in OpenStack, we need to look into how it is connected logically when it’s created. This will show us the steps that the VM’s traffic has to go through before reaching its destination.

Before we speak about Neutron, we first have to explain six main concepts in Linux networking

1- TAP device: A tap device is a software-only interface that a userspace program can attach to in order to send and receive packets. TAP devices are the way KVM/QEMU implement the virtual NICs (vNICs) attached to the VMs

2- veth pair: A veth pair is a pair of virtual NICs connected by a virtual cable. A packet sent into one end comes out of the other end, and vice versa. veth pairs are usually used to connect two entities.
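
If the concept is new to you, you can create a throwaway pair and play with it; the interface names below are arbitrary demo names.

    # Create a veth pair: whatever enters veth-a comes out of veth-b
    ip link add veth-a type veth peer name veth-b
    ip link set veth-a up
    ip link set veth-b up
    # Deleting one end removes the whole pair
    ip link del veth-a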

3- Linux bridge: Linux bridge is a virtual switch implemented in Linux

4- Openvswitch: openvswitch is a more sophisticated virtual switch implemented in Linux. It allows openflow rules to be applied to traffic at layer 2, so decisions can be made based on the MAC addresses and VLAN IDs of a traffic flow. Openvswitch provides native support for VXLAN tunnels

5- Patch interfaces in openvswitch: a special kind of interface that is used to connect two openvswitch switches

6- Network namespaces: An isolated network stack in Linux, where you can have isolated interfaces, routing tables and iptables rules. Network namespaces do not “see” each other’s traffic. This is vital for OpenStack, since you let your users create their own VM networks and you need this level of isolation at layer 3.
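
A minimal illustration of that isolation (the namespace name is arbitrary):

    # A fresh namespace has only its own loopback interface, its own routing
    # table and its own iptables rules
    ip netns add demo
    ip netns exec demo ip addr
    ip netns exec demo ip route
    ip netns del demo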

Now that we understand this first set of concepts, we can move on to the second set.

An instance in OpenStack runs on a hypervisor. KVM is one of the most popular hypervisors in OpenStack deployments. An instance running on KVM has a virtual NIC (vNIC) attached to it. This vNIC is the interface through which the applications in the instance communicate with the outside world. But for this vNIC to be operational, it has to connect to something on the other end that gets it to the outside world. This is the purpose of the rest of the network architecture that I will explain next.

A VM in OpenStack logically looks like this

[diagram VM-network: logical network layout of a VM on a compute node]

On the compute node, the following virtual network architecture exists to allow the vNIC to communicate:

tap-uuid: Virtual interface that the instance connects to. iptables rules are applied on this tap device to implement the security groups associated with your instance. For example, if you allow HTTP ingress to your instance in a security group, you will find an iptables rule along the lines of ‘-i tap-xxxx …. --dport 80 -j ACCEPT’ specifying that port 80 ingress is allowed. This tap device is connected to the qbr bridge explained below.
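
To see those rules for a given instance, you can grep iptables for the tap device. The device name below is illustrative, and the rule shape in the comment is only indicative; the real rules live in Neutron-managed chains whose exact names will differ.

    # List the rules that reference this instance's tap device; if HTTP ingress
    # is allowed, expect a rule roughly of the shape shown in the comment
    iptables -S | grep tap4874fb2a
    #   ... -i tap4874fb2a ... -p tcp --dport 80 -j ACCEPT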

qbr-uuid: A standard Linux bridge with two ports, tap-xxxx and qvb-yyyy

qvb-uuid: The bridge (qbr) side of the veth pair; its peer is the qvo interface listed below

qvo-uuid: The openvswitch side of the veth pair. qvb (mentioned above) and qvo are basically connected via a virtual cable and exist solely to connect qbr to the openvswitch bridge (br-int) mentioned below

br-int: An openvswitch virtual switch which acts as the integration point for all the running instances on the compute host. VLAN IDs are assigned per tenant network. It is important to remember that VLAN IDs are not assigned per tenant (user), but per tenant network. This means that if a tenant has multiple networks with instances on each, the ports attached to those instances will have different VLAN IDs. The VLAN IDs are only significant to the same host, i.e. two VMs on two different hosts and on the same network may have different VLAN IDs (since br-int is different between hosts)

br-tun: An openvswitch virtual switch. As its name suggests, it is in charge of creating tunnels to the rest of the compute and network hosts in your OpenStack deployment. Tunnels act as highways for traffic between the compute/network hosts. The most common tunnel technology is VXLAN, which is based on UDP.
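
Since VXLAN is just UDP, you can also watch the encapsulated tenant traffic on the underlying NIC. The interface name below is an example, and 4789 is the standard VXLAN port (deployments may use a different one).

    # Capture VXLAN-encapsulated traffic between hosts
    tcpdump -n -i eth1 udp port 4789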

This was basically the logical layout of VM networking in OpenStack. In the next few posts we will look into the physical implementation and the traffic flows from and to the VMs in OpenStack.