[Rdo-list] Trying out Neutron Quickstart running into issues with netns (l2 agent and dhcp agent)

Maru Newby marun at redhat.com
Mon Aug 5 17:31:33 UTC 2013


On Aug 5, 2013, at 10:23 AM, Brent Eagles <beagles at redhat.com> wrote:

> On 08/04/2013 11:27 AM, Perry Myers wrote:
>> Hi,
>> 
>> I followed the instructions at:
>> http://openstack.redhat.com/Neutron-Quickstart
>> http://openstack.redhat.com/Running_an_instance_with_Neutron
>> 
>> I ran this on a RHEL 6.4 VM with latest updates from 6.4.z.  I made sure
>> to install the netns enabled kernel from RDO repos and reboot with that
>> kernel before running packstack so that I didn't need to reboot the VM
>> after the packstack install (and have br-ex disappear)
>> 
>> The packstack install went without incident, and I was able to follow
>> the instructions for launching an instance.
>> 
>> I noticed that the cirros VM took a long time to get to a login prompt
>> on the VNC console.  From looking at the console output it appears that
>> the instance was waiting for a dhcp address.
>> 
>> Once the VNC session got me to a login prompt, I logged in (as the
>> cirros user) and confirmed that eth0 did not have an ip address.
>> 
>> So something networking-related prevented the instance from getting an
>> IP address, which of course means that ssh'ing into the instance via the
>> floating IP, later in the instructions, does not work.
>> 
>> I tried ifup'ing eth0 and dhcp discovers were sent out but not responded to.
>> 
>> One thing I noticed is that on the host running OpenStack services (the
>> VM I ran packstack on), I don't see dnsmasq running except for the
>> default libvirt network:
>> 
>>> [admin at rdo-mgmt ~(keystone_demo)]$ ps -ef | grep dnsmas
>>> nobody    1968     1  0 08:59 ?        00:00:00 /usr/sbin/dnsmasq --strict-order --local=// --domain-needed --pid-file=/var/run/libvirt/network/default.pid --conf-file= --except-interface lo --bind-interfaces --listen-address 192.168.122.1 --dhcp-range 192.168.122.2,192.168.122.254 --dhcp-leasefile=/var/lib/libvirt/dnsmasq/default.leases --dhcp-lease-max=253 --dhcp-no-override --dhcp-hostsfile=/var/lib/libvirt/dnsmasq/default.hostsfile --addn-hosts=/var/lib/libvirt/dnsmasq/default.addnhosts
>> 
>> So... that seems to be a problem :)
>> 
>> Just to confirm, I am running the right kernel:
>>> [root at rdo-mgmt log(keystone_demo)]# uname -a
>>> Linux rdo-mgmt 2.6.32-358.114.1.openstack.el6.x86_64 #1 SMP Wed Jul 3 02:11:25 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux
>> 
>>> [root at rdo-mgmt log(keystone_demo)]# rpm -q iproute kernel
>>> iproute-2.6.32-23.el6_4.netns.1.x86_64
>>> kernel-2.6.32-358.114.1.openstack.el6.x86_64
>> 
>> From quantum server.log:
>>> 2013-08-04 09:10:48    ERROR [keystoneclient.common.cms] Verify error: Error opening certificate file /var/lib/quantum/keystone-signing/signing_cert.pem
>>> 140222780139336:error:02001002:system library:fopen:No such file or directory:bss_file.c:126:fopen('/var/lib/quantum/keystone-signing/signing_cert.pem','r')
>>> 140222780139336:error:2006D080:BIO routines:BIO_new_file:no such file:bss_file.c:129:
>>> 
>>> 2013-08-04 09:10:48    ERROR [keystoneclient.common.cms] Verify error: Error loading file /var/lib/quantum/keystone-signing/cacert.pem
>>> 140279285741384:error:02001002:system library:fopen:No such file or directory:bss_file.c:126:fopen('/var/lib/quantum/keystone-signing/cacert.pem','r')
>>> 140279285741384:error:2006D080:BIO routines:BIO_new_file:no such file:bss_file.c:129:
>>> 140279285741384:error:0B084002:x509 certificate routines:X509_load_cert_crl_file:system lib:by_file.c:279:
>> 
>> From quantum dhcp-agent.log:
>> 
>>> 2013-08-04 09:08:05    ERROR [quantum.openstack.common.rpc.amqp] Timed out waiting for RPC response.
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 495, in __iter__
>>>     data = self._dataqueue.get(timeout=self._timeout)
>>>   File "/usr/lib/python2.6/site-packages/eventlet/queue.py", line 298, in get
>>>     return waiter.wait()
>>>   File "/usr/lib/python2.6/site-packages/eventlet/queue.py", line 129, in wait
>>>     return get_hub().switch()
>>>   File "/usr/lib/python2.6/site-packages/eventlet/hubs/hub.py", line 177, in switch
>>>     return self.greenlet.switch()
>>> Empty
>>> 2013-08-04 09:08:05    ERROR [quantum.agent.dhcp_agent] Failed reporting state!
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/dhcp_agent.py", line 702, in _report_state
>>>     self.agent_state)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/rpc.py", line 66, in report_state
>>>     topic=self.topic)
>>>   File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/proxy.py", line 80, in call
>>>     return rpc.call(context, self._get_topic(topic), msg, timeout)
>>>   File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/__init__.py", line 140, in call
>>>     return _get_impl().call(CONF, context, topic, msg, timeout)
>>>   File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/impl_qpid.py", line 611, in call
>>>     rpc_amqp.get_connection_pool(conf, Connection))
>>>   File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 614, in call
>>>     rv = list(rv)
>>>   File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 500, in __iter__
>>>     raise rpc_common.Timeout()
>>> Timeout: Timeout while waiting on RPC response.
>>> 2013-08-04 09:08:05  WARNING [quantum.openstack.common.loopingcall] task run outlasted interval by 56.853869 sec
>>> 2013-08-04 09:08:06     INFO [quantum.agent.dhcp_agent] Synchronizing state
>>> 2013-08-04 09:32:34    ERROR [quantum.agent.dhcp_agent] Unable to enable dhcp.
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/dhcp_agent.py", line 131, in call_driver
>>>     getattr(driver, action)()
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/linux/dhcp.py", line 124, in enable
>>>     reuse_existing=True)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/dhcp_agent.py", line 554, in setup
>>>     namespace=namespace)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/linux/interface.py", line 181, in plug
>>>     ns_dev.link.set_address(mac_address)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 180, in set_address
>>>     self._as_root('set', self.name, 'address', mac_address)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 167, in _as_root
>>>     kwargs.get('use_root_namespace', False))
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 47, in _as_root
>>>     namespace)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 58, in _execute
>>>     root_helper=root_helper)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/linux/utils.py", line 61, in execute
>>>     raise RuntimeError(m)
>>> RuntimeError:
>>> Command: ['sudo', 'quantum-rootwrap', '/etc/quantum/rootwrap.conf', 'ip', 'link', 'set', 'tap07d8cc77-fc', 'address', 'fa:16:3e:da:66:28']
>>> Exit code: 2
>>> Stdout: ''
>>> Stderr: 'RTNETLINK answers: Device or resource busy\n'
>>> 2013-08-04 09:32:36     INFO [quantum.agent.dhcp_agent] Synchronizing state
>>> 2013-08-04 09:32:41    ERROR [quantum.agent.dhcp_agent] Unable to enable dhcp.
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/dhcp_agent.py", line 131, in call_driver
>>>     getattr(driver, action)()
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/linux/dhcp.py", line 124, in enable
>>>     reuse_existing=True)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/dhcp_agent.py", line 554, in setup
>>>     namespace=namespace)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/linux/interface.py", line 181, in plug
>>>     ns_dev.link.set_address(mac_address)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 180, in set_address
>>>     self._as_root('set', self.name, 'address', mac_address)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 167, in _as_root
>>>     kwargs.get('use_root_namespace', False))
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 47, in _as_root
>>>     namespace)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 58, in _execute
>>>     root_helper=root_helper)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/linux/utils.py", line 61, in execute
>>>     raise RuntimeError(m)
>> 
>> The RTNETLINK errors just repeat indefinitely
>> 
>> From openvswitch-agent.log:
>> 
>>> 2013-08-04 09:08:29    ERROR [quantum.openstack.common.rpc.amqp] Timed out waiting for RPC response.
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 495, in __iter__
>>>     data = self._dataqueue.get(timeout=self._timeout)
>>>   File "/usr/lib/python2.6/site-packages/eventlet/queue.py", line 298, in get
>>>     return waiter.wait()
>>>   File "/usr/lib/python2.6/site-packages/eventlet/queue.py", line 129, in wait
>>>     return get_hub().switch()
>>>   File "/usr/lib/python2.6/site-packages/eventlet/hubs/hub.py", line 177, in switch
>>>     return self.greenlet.switch()
>>> Empty
>>> 2013-08-04 09:08:29    ERROR [quantum.plugins.openvswitch.agent.ovs_quantum_agent] Failed reporting state!
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.6/site-packages/quantum/plugins/openvswitch/agent/ovs_quantum_agent.py", line 201, in _report_state
>>>     self.agent_state)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/rpc.py", line 66, in report_state
>>>     topic=self.topic)
>>>   File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/proxy.py", line 80, in call
>>>     return rpc.call(context, self._get_topic(topic), msg, timeout)
>>>   File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/__init__.py", line 140, in call
>>>     return _get_impl().call(CONF, context, topic, msg, timeout)
>>>   File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/impl_qpid.py", line 611, in call
>>>     rpc_amqp.get_connection_pool(conf, Connection))
>>>   File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 614, in call
>>>     rv = list(rv)
>>>   File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 500, in __iter__
>>>     raise rpc_common.Timeout()
>>> Timeout: Timeout while waiting on RPC response.
>> 
>> Do we have a race condition wrt various Quantum agents connecting to the
>> qpid bus that is just generating initial qpid connection error messages
>> that can be safely ignored?
>> 
>> If so, is there any way we can clean this up?
>> 
>> From l3-agent.log:
>> 
>>> 2013-08-04 09:08:06    ERROR [quantum.openstack.common.rpc.amqp] Timed out waiting for RPC response.
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 495, in __iter__
>>>     data = self._dataqueue.get(timeout=self._timeout)
>>>   File "/usr/lib/python2.6/site-packages/eventlet/queue.py", line 298, in get
>>>     return waiter.wait()
>>>   File "/usr/lib/python2.6/site-packages/eventlet/queue.py", line 129, in wait
>>>     return get_hub().switch()
>>>   File "/usr/lib/python2.6/site-packages/eventlet/hubs/hub.py", line 177, in switch
>>>     return self.greenlet.switch()
>>> Empty
>>> 2013-08-04 09:08:06    ERROR [quantum.agent.l3_agent] Failed reporting state!
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 723, in _report_state
>>>     self.agent_state)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/rpc.py", line 66, in report_state
>>>     topic=self.topic)
>>>   File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/proxy.py", line 80, in call
>>>     return rpc.call(context, self._get_topic(topic), msg, timeout)
>>>   File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/__init__.py", line 140, in call
>>>     return _get_impl().call(CONF, context, topic, msg, timeout)
>>>   File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/impl_qpid.py", line 611, in call
>>>     rpc_amqp.get_connection_pool(conf, Connection))
>>>   File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 614, in call
>>>     rv = list(rv)
>>>   File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 500, in __iter__
>>>     raise rpc_common.Timeout()
>>> Timeout: Timeout while waiting on RPC response.
>>> 2013-08-04 09:08:06  WARNING [quantum.openstack.common.loopingcall] task run outlasted interval by 56.554131 sec
>>> 2013-08-04 09:08:10    ERROR [quantum.openstack.common.rpc.amqp] Timed out waiting for RPC response.
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 495, in __iter__
>>>     data = self._dataqueue.get(timeout=self._timeout)
>>>   File "/usr/lib/python2.6/site-packages/eventlet/queue.py", line 298, in get
>>>     return waiter.wait()
>>>   File "/usr/lib/python2.6/site-packages/eventlet/queue.py", line 129, in wait
>>>     return get_hub().switch()
>>>   File "/usr/lib/python2.6/site-packages/eventlet/hubs/hub.py", line 177, in switch
>>>     return self.greenlet.switch()
>>> Empty
>>> 2013-08-04 09:08:10    ERROR [quantum.agent.l3_agent] Failed synchronizing routers
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 637, in _sync_routers_task
>>>     context, router_id)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 77, in get_routers
>>>     topic=self.topic)
>>>   File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/proxy.py", line 80, in call
>>>     return rpc.call(context, self._get_topic(topic), msg, timeout)
>>>   File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/__init__.py", line 140, in call
>>>     return _get_impl().call(CONF, context, topic, msg, timeout)
>>>   File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/impl_qpid.py", line 611, in call
>>>     rpc_amqp.get_connection_pool(conf, Connection))
>>>   File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 614, in call
>>>     rv = list(rv)
>>>   File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 500, in __iter__
>>>     raise rpc_common.Timeout()
>>> Timeout: Timeout while waiting on RPC response.
>>> 2013-08-04 09:08:10  WARNING [quantum.openstack.common.loopingcall] task run outlasted interval by 20.022704 sec
>>> 2013-08-04 09:11:33    ERROR [quantum.agent.l3_agent] Failed synchronizing routers
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 638, in _sync_routers_task
>>>     self._process_routers(routers, all_routers=True)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 621, in _process_routers
>>>     self.process_router(ri)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 319, in process_router
>>>     self.external_gateway_added(ri, ex_gw_port, internal_cidrs)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 410, in external_gateway_added
>>>     prefix=EXTERNAL_DEV_PREFIX)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/linux/interface.py", line 181, in plug
>>>     ns_dev.link.set_address(mac_address)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 180, in set_address
>>>     self._as_root('set', self.name, 'address', mac_address)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 167, in _as_root
>>>     kwargs.get('use_root_namespace', False))
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 47, in _as_root
>>>     namespace)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 58, in _execute
>>>     root_helper=root_helper)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/linux/utils.py", line 61, in execute
>>>     raise RuntimeError(m)
>>> RuntimeError:
>>> Command: ['sudo', 'quantum-rootwrap', '/etc/quantum/rootwrap.conf', 'ip', 'link', 'set', 'qg-46ed452c-5e', 'address', 'fa:16:3e:e7:d8:30']
>>> Exit code: 2
>>> Stdout: ''
>>> Stderr: 'RTNETLINK answers: Device or resource busy\n'
>>> 2013-08-04 09:12:11    ERROR [quantum.agent.l3_agent] Failed synchronizing routers
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 638, in _sync_routers_task
>>>     self._process_routers(routers, all_routers=True)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 621, in _process_routers
>>>     self.process_router(ri)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 319, in process_router
>>>     self.external_gateway_added(ri, ex_gw_port, internal_cidrs)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 410, in external_gateway_added
>>>     prefix=EXTERNAL_DEV_PREFIX)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/linux/interface.py", line 181, in plug
>>>     ns_dev.link.set_address(mac_address)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 180, in set_address
>>>     self._as_root('set', self.name, 'address', mac_address)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 167, in _as_root
>>>     kwargs.get('use_root_namespace', False))
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 47, in _as_root
>>>     namespace)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 58, in _execute
>>>     root_helper=root_helper)
>>>   File "/usr/lib/python2.6/site-packages/quantum/agent/linux/utils.py", line 61, in execute
>>>     raise RuntimeError(m)
>> 
>> Same qpid connection issue, which I'm assuming can just be ignored at
>> this point.  But there are also similar device busy errors when creating
>> the namespace for the l3 agent.
>> 
>> It appears that the issue with both the l3 agent and the dhcp agent is
>> that the namespace can't be created, which causes both of them to fail.
>> 
>> Anyone have any thoughts on what to look at next here?
>> 
>> Perry
> 
> I ran into these issues as well. I noticed that ovs_use_veth was commented out in dhcp_agent.ini and l3_agent.ini. I uncommented it in both files, set it to True, and restarted. The VM now has an IP address.
> 
> I noticed something else peculiar, though: the public network (the one set as the gateway for the router) has DHCP enabled. I'm not sure why we would do that.

Good catch - an omission on my part.  I'll update packstack accordingly and make sure there weren't any other deviations.


m.

> Cheers,
> 
> Brent
> 
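
The logs quoted above mix two different failures: RPC timeouts and
"Device or resource busy" errors from RTNETLINK. As a rough aid for telling
them apart, here is a minimal Python sketch that counts each signature in one
or more log files. The function name classify and the PATTERNS table are
hypothetical; the two strings matched are copied from the excerpts in this
thread, and log paths are supplied by the caller.

```python
import re
import sys
from collections import Counter

# Error signatures copied from the log excerpts quoted in this thread.
PATTERNS = {
    "rpc_timeout": re.compile(r"Timeout while waiting on RPC response"),
    "device_busy": re.compile(r"RTNETLINK answers: Device or resource busy"),
}


def classify(lines):
    """Count how many lines match each known error signature."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    # Usage: python classify_logs.py <logfile> [<logfile> ...]
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            print(path, dict(classify(handle)))
```

Timeouts that appear only around agent start-up, with device busy errors that
keep recurring afterwards, would match the pattern Perry describes.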

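Brent's workaround was to uncomment ovs_use_veth and set it to True in
dhcp_agent.ini and l3_agent.ini. The following Python sketch checks whether
that option is active in a given file. It assumes the option lives in the
[DEFAULT] section and that the files parse as standard INI; the function name
ovs_use_veth_enabled is hypothetical, and file paths are passed on the command
line rather than assumed.

```python
import configparser
import sys


def ovs_use_veth_enabled(ini_text):
    """Return True if ovs_use_veth is set to a true value in [DEFAULT].

    A commented-out line (for example "# ovs_use_veth = False") is ignored
    by the parser, so the function falls back to False in that case.
    """
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)
    # Assumption: the option is defined under [DEFAULT].
    return parser.getboolean("DEFAULT", "ovs_use_veth", fallback=False)


if __name__ == "__main__":
    # Usage: python check_veth.py <path to dhcp_agent.ini> <path to l3_agent.ini>
    for path in sys.argv[1:]:
        with open(path) as handle:
            enabled = ovs_use_veth_enabled(handle.read())
        print("%s: ovs_use_veth enabled = %s" % (path, enabled))
```

The check only reads the files; the agents still have to be restarted after
the setting is changed, as Brent did.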



