Posts Tagged ‘opennebula’

Opennebula: dhcpd contextualization magic

February 17, 2011

One of the most frequent questions on the OpenNebula lists relates to network contextualization of Virtual Machines (VMs). Specifically, unlike Eucalyptus or Nimbus, OpenNebula does not directly manage a DHCP server. Instead OpenNebula:

  • manages just MAC addresses
  • suggests using a simple rule for extracting the IPv4 address from the MAC address within the VM

This moves the burden of IPv4 configuration to the VM operating system, which has to dynamically derive the IPv4 address details from each interface's MAC address. While OpenNebula provides a relevant sample VM template and script to do this, it comes up a little short: the script is Linux-specific and will probably not work with other Unix OSes such as Solaris or FreeBSD, let alone Windows. In addition, extra work is required to configure other necessary network parameters, such as a default router or a DNS server.
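
For reference, the suggested convention encodes the IPv4 address in the last four octets of the MAC address, so the in-VM logic boils down to something like the following minimal, Linux-flavored and purely illustrative sketch:

## Derive the IPv4 address from the last four octets of the MAC address,
## assuming the usual OpenNebula convention, e.g. 02:00:0a:00:00:01 -> 10.0.0.1
MAC=$(cat /sys/class/net/eth0/address)
IFS=: read -r _ _ o1 o2 o3 o4 <<< "$MAC"
IP=$(printf "%d.%d.%d.%d" "0x$o1" "0x$o2" "0x$o3" "0x$o4")
echo "eth0 -> ${IP}"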


Opennebula, ZFS and Xen – Part 3 (Oracle VM server)

September 27, 2010

Read OpenNebula, ZFS and Xen, Part 1
Read OpenNebula, ZFS and Xen, Part 2

Oracle VM server is Oracle's virtualization platform. As with most Oracle Linux offerings, it is essentially Red Hat Enterprise Linux, albeit bundled with a more recent version of Xen.

# more /etc/enterprise-release
Oracle VM server release 2.2.1
# /usr/sbin/xm info | egrep 'major|minor|extra'
xen_major              : 3
xen_minor              : 4
xen_extra              : .0

Installation

Installing Oracle VM server is pretty straightforward: download it, burn the ISO image to a CD/DVD and boot the hypervisor host from it. The installer will feel familiar if you've installed RHEL or CentOS before, the only catch being that you don't really get to select any packages to install.

Integrating with Opennebula

Integrating Oracle VM server with our OpenNebula setup is similarly simple; a condensed command sketch follows the list.

  • Add Oracle’s public repo to YUM
  • Use yum to install ruby
  • Remove requiretty from the sudo configuration
  • Add an entry for the Shared FS to /etc/fstab
  • # grep cloud /etc/fstab
    storage-server:/export/home/cloud /srv/cloud       nfs4    auto          0 0
    
  • Tweak /etc/idmapd.conf so that its NFSv4 domain matches that of the storage server
  • Create the oneadmin user and cloud group
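
A rough sketch of the above on the Oracle VM server host; the uid/gid values are illustrative assumptions, just keep them in sync with the frontend and storage nodes:

# yum install -y ruby                    ## after adding Oracle's public YUM repository
# visudo                                 ## comment out the "Defaults requiretty" line
# echo 'storage-server:/export/home/cloud /srv/cloud nfs4 auto 0 0' >> /etc/fstab
# mkdir -p /srv/cloud && mount /srv/cloud
# vi /etc/idmapd.conf                    ## set Domain = <NFSv4 domain of the storage node>
# groupadd -g 1001 cloud
# useradd -u 1001 -g cloud -d /srv/cloud/one oneadmin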

Opennebula, ZFS and Xen – Part 2 (Instant cloning)

September 26, 2010

The basic NFS-powered setup suggested in the OpenNebula documentation works, but has a few deficiencies:

  • It doesn't allow for decoupled storage and frontend nodes, unless one mixes multiple protocols (i.e. iSCSI between the storage and frontend nodes, NFS between the cluster and frontend nodes)
  • In the case of a decoupled storage node it can be quite slow. Consider, for example, the time it takes to copy a 25G sparse image over a 1Gbps LAN:
frontend-node$ ls -lh disk.0
-rw-r--r-- 1 oneadmin cloud 25G Sep 14 17:45 disk.0
frontend-node$ ls -sh disk.0
803M disk.0
frontend-node$ time cp disk.0 disk.test
real    37m49.798s
user    0m7.141s
sys     16m51.376s
  • It is grossly inefficient. If each VM is a mostly identical copy of a master image, you shouldn't need to store yet another full copy of that image for every single VM.

One may alleviate the above problems with a ridiculously expensive setup that combines FCP or iSCSI over a dedicated 10Gbps storage VLAN with an expensive SAN offering deduplication capabilities. Alternatively, one may leverage the power of ZFS, which provides lightweight clones at practically zero cost.

Intro

Instant cloning relies on ZFS clones, a capability that allows creating a new ZFS dataset (clone) based on an existing snapshot. Consider for example a ZFS dataset that contains the disk image of the OpenNebula sample VM:

# zfs list rpool/export/home/cloud/images/ttylinux
NAME                                      USED  AVAIL  REFER  MOUNTPOINT
rpool/export/home/cloud/images/ttylinux  28.0M   109G  27.9M  /srv/cloud/images/ttylinux

One can grab a snapshot of this ZFS dataset in less than a second:

# time zfs snapshot rpool/export/home/cloud/images/ttylinux@test

real    0m0.477s
user    0m0.004s
sys     0m0.008s
root@qa-x2100-3:~# zfs list rpool/export/home/cloud/images/ttylinux@test
NAME                                           USED  AVAIL  REFER  MOUNTPOINT
rpool/export/home/cloud/images/ttylinux@test      0      -  27.9M  -

And create a clone of this snapshot again in an instant:

# time zfs clone rpool/export/home/cloud/images/ttylinux@test \
   rpool/export/home/cloud/one/var/testclone

real    0m0.294s
user    0m0.024s
sys     0m0.048s
# zfs list rpool/export/home/cloud/one/var/testclone
NAME                                        USED  AVAIL  REFER  MOUNTPOINT
rpool/export/home/cloud/one/var/testclone     1K   109G  27.9M  /srv/cloud/one/var/testclone

Besides being “instant”, snapshots and clones share a couple of extra interesting properties:

  • they initially consume virtually no additional space (note the USED column in the listings above)
  • they grow only as their contents diverge from the snapshot they were created from, thanks to ZFS's copy-on-write design

The above properties make ZFS the optimal choice for a VM storage subsystem.

Preparing the storage node

It should be clear by now that in a ZFS-based OpenNebula setup the optimal storage approach is to keep a snapshot of each “master image” and clone it for every VM based upon it.

1. Delegate the appropriate ZFS permissions to the oneadmin user

storage-node# zfs allow oneadmin clone,create,mount,share,sharenfs rpool/export/home/cloud

2. Create a separate ZFS dataset for each master image you need to support. The following example creates one dataset for a Solaris 10 Update 8 master image and one for the OpenNebula sample VM:

storage-node# zfs create rpool/export/home/cloud/images/S10U8
storage-node# zfs create rpool/export/home/cloud/images/ttylinux

3. Copy the master disk image into the dataset directory as “disk.0”; it is important not to use a different filename.

storage-node# ls -lh /srv/cloud/images/S10U8/
total 803M
-rw-r--r-- 1 oneadmin cloud 670 Aug 16 20:17 S10U8.template
-rw-r--r-- 1 oneadmin cloud 25G Sep 14 17:45 disk.0
-rw-r--r-- 1 oneadmin cloud 924 Sep 23 15:50 s10u8.one

4. Grab a “golden snapshot” of your image once it's ready.

storage-node# zfs snapshot rpool/export/home/cloud/images/S10U8@gold

5. Create the ZFS dataset that will store the ZFS clones.

storage-node# zfs create rpool/export/home/cloud/one/var

6. Make sure that the oneadmin user can do a non-interactive login from the frontend to the storage node with SSH key-based authentication.


oneadmin@frontend-node$ ssh storage-node echo > /dev/null && echo success
success
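
If it cannot, a minimal sketch of setting that up with the stock OpenSSH tools (default key paths assumed):

oneadmin@frontend-node$ ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
oneadmin@frontend-node$ cat ~/.ssh/id_rsa.pub | \
    ssh storage-node 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys'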

Instant cloning

Having prepared the storage server, it's time to customize the NFS transfer driver so that it creates the VM instance image as a ZFS clone instead of a cp(1) over NFS. The NFS driver's clone script essentially takes two arguments similar to the following:

  • frontend-node:/srv/cloud/images/S10U8/disk.0
  • cluster-node:/srv/cloud/one/var/${VMID}/images/disk.0

and after some straightforward parsing executes the following commands:

oneadmin@frontend$ mkdir -p /srv/cloud/one/var/${VMID}/images/
oneadmin@frontend$ cp /srv/cloud/images/S10U8/disk.0 /srv/cloud/one/var/${VMID}/images/disk.0

Essentially we want to tweak the cloning script to run the following commands instead:

oneadmin@frontend$ ssh storage-server zfs create rpool/export/home/cloud/one/var/${VMID}
oneadmin@frontend$ ssh storage-server zfs clone rpool/export/home/cloud/images/S10U8@gold \
                   rpool/export/home/cloud/one/var/${VMID}/images

Unfortunately, the above commands will not work. The reason is that OpenNebula uses mkdir(1) to create “/srv/cloud/one/var/${VMID}” before calling the cloning script (I'm not really certain whether I should file a bug for it), so the clone dataset would have to be mounted over an already existing directory, something that may lead to funny behavior.

This is the reason we created a separate dataset to host our clones. Having done that we can slightly revise the above commands:

oneadmin@frontend$ ssh storage-server zfs clone rpool/export/home/cloud/images/S10U8@gold \
                   rpool/export/home/cloud/one/var/images/${VMID}
oneadmin@frontend$ ln -s /srv/cloud/one/var/images/${VMID} /srv/cloud/one/var/${VMID}/images

As simple as that. It does require a little extra parsing logic to figure out the ZFS dataset paths, but the results, as evidenced by oned.log, are astounding:


Thu Sep 23 11:06:15 2010 [TM][D]: Message received: LOG - 73 tm_clone.sh: opennebula.sil.priv:/srv/cloud/images/S10U8/disk.0 10.8.3.218:/srv/cloud/one/var/73/images/disk.0
Thu Sep 23 11:06:15 2010 [TM][D]: Message received: LOG - 73 tm_clone.sh: Cloning ZFS rpool/export/home/cloud/images/S10U8@gold to
Thu Sep 23 11:06:15 2010 [TM][D]: Message received: LOG - 73 tm_clone.sh: Executed "chmod a+w /srv/cloud/one/var/73/images/disk.0".

Thu Sep 23 11:06:15 2010 [TM][D]: Message received: TRANSFER SUCCESS 73 -

Cloning speed of one second. ONE!

The tm_clone.sh that implements the above commands follows:

#!/bin/bash

# -------------------------------------------------------------------------- #
# Copyright 2002-2009, Distributed Systems Architecture Group, Universidad   #
# Complutense de Madrid (dsa-research.org)                                   #
#                                                                            #
# Licensed under the Apache License, Version 2.0 (the "License"); you may    #
# not use this file except in compliance with the License. You may obtain    #
# a copy of the License at                                                   #
#                                                                            #
# http://www.apache.org/licenses/LICENSE-2.0                                 #
#                                                                            #
# Unless required by applicable law or agreed to in writing, software        #
# distributed under the License is distributed on an "AS IS" BASIS,          #
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.   #
# See the License for the specific language governing permissions and        #
# limitations under the License.                                             #
#--------------------------------------------------------------------------- #

SRC=$1
DST=$2

ZFS_HOST=10.8.30.191
ZFS_POOL=rpool
ZFS_BASE_PATH=/export/home/cloud        ## this is the path that maps to /srv/cloud
ZFS_IMAGES_PATH=one/var/images          ## relative to ZFS_BASE_PATH
ZFS_CMD=/usr/sbin/zfs
ZFS_SNAPSHOT_NAME=gold

if [ -z "${ONE_LOCATION}" ]; then
    TMCOMMON=/usr/lib/one/mads/tm_common.sh
else
    TMCOMMON=$ONE_LOCATION/lib/mads/tm_common.sh
fi

. $TMCOMMON

get_vmdir

function arg_zfs_path
{
    dirname `echo $1 | sed -e "s#^/srv/cloud#${ZFS_POOL}${ZFS_BASE_PATH}#"`
}
function zfs_strip_pool
{
    echo $1 | sed -e "s/${ZFS_POOL}//"
}

function get_vm_path
{
    dirname `dirname $1`
}

SRC_PATH=`arg_path $SRC`
DST_PATH=`arg_path $DST`
VM_PATH=`get_vm_path $DST_PATH`
VM_ID=`basename $VM_PATH`

fix_paths

ZFS_SRC_PATH=`arg_zfs_path $SRC_PATH`
TMPPP=`arg_zfs_path $DST_PATH`
ZFS_DST_MNT_PATH=`zfs_strip_pool $TMPPP`
ZFS_DST_PATH=${ZFS_POOL}${ZFS_BASE_PATH}/${ZFS_IMAGES_PATH}/${VM_ID}

DST_DIR=`dirname $DST_PATH`

log "Cloning ZFS ${ZFS_SRC_PATH}@${ZFS_SNAPSHOT_NAME} to ${ZFS_DST_CLONE_PATH}"
exec_and_log "ssh ${ZFS_HOST} ${ZFS_CMD} clone ${ZFS_SRC_PATH}@${ZFS_SNAPSHOT_NAME} ${ZFS_DST_PATH}"
exec_and_log "ln -s ${VMDIR}/images/${VM_ID} ${VMDIR}/${VM_ID}/images"

exec_and_log "chmod a+w $DST_PATH"

VM deletion

Once a VM is torn down, one may dispose of its image files. The NFS transfer driver does so by removing the images directory altogether, executing a command similar to the following:

oneadmin@frontend-node$ rm -rf /srv/cloud/one/var/${VMID}/images

This kind of works but is suboptimal, since the ZFS clone dataset holding the VM instance image is left behind on the storage node. Ideally, the transfer driver deletion script should run the following command instead:

oneadmin@frontend-node$ ssh storage-server zfs destroy rpool/export/home/cloud/one/var/images/${VMID}

The script implementing the above follows:


#!/bin/bash

# -------------------------------------------------------------------------- #
# Copyright 2002-2009, Distributed Systems Architecture Group, Universidad   #
# Complutense de Madrid (dsa-research.org)                                   #
#                                                                            #
# Licensed under the Apache License, Version 2.0 (the "License"); you may    #
# not use this file except in compliance with the License. You may obtain    #
# a copy of the License at                                                   #
#                                                                            #
# http://www.apache.org/licenses/LICENSE-2.0                                 #
#                                                                            #
# Unless required by applicable law or agreed to in writing, software        #
# distributed under the License is distributed on an "AS IS" BASIS,          #
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.   #
# See the License for the specific language governing permissions and        #
# limitations under the License.                                             #
#--------------------------------------------------------------------------- #

SRC=$1
DST=$2

ZFS_HOST=10.8.30.191
ZFS_POOL=rpool
ZFS_BASE_PATH=/export/home/cloud        ## this is the path that maps to /srv/cloud
ZFS_IMAGES_PATH=one/var/images          ## relative to ZFS_BASE_PATH
ZFS_CMD=/usr/sbin/zfs
ZFS_SNAPSHOT_NAME=gold

if [ -z "${ONE_LOCATION}" ]; then
    TMCOMMON=/usr/lib/one/mads/tm_common.sh
else
    TMCOMMON=$ONE_LOCATION/lib/mads/tm_common.sh
fi

. $TMCOMMON

get_vmdir

function arg_zfs_path
{
    echo $1 | sed -e "s#^/srv/cloud#${ZFS_POOL}${ZFS_BASE_PATH}#"
}
function zfs_strip_pool
{
    echo $1 | sed -e "s/${ZFS_POOL}//"
}

function get_vm_path
{
    dirname `dirname $1`
}

fix_src_path

log $SRC_PATH
VM_ID=`basename \`dirname $SRC_PATH\``
DST_PATH=${VMDIR}/images/${VM_ID}
ZFS_DST_PATH=`arg_zfs_path ${DST_PATH}`

log "Destroying ${ZFS_DST_PATH} dataset"
exec_and_log "ssh ${ZFS_HOST} ${ZFS_CMD} destroy ${ZFS_DST_PATH}"

Miscellaneous notes

There are various miscellaneous enhancements one could make to the above scripts (for instance, the common ZFS variables and functions should be defined in a single location). They are provided as-is, without any commitment that they will work in your environment (though they probably will with minimal changes).
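
As a sketch of that particular cleanup, the shared settings could live in a single file that both tm_clone.sh and tm_delete.sh source; the file path below is an arbitrary choice:

## /srv/cloud/one/etc/tm_zfs.conf -- settings shared by the ZFS-aware TM scripts
ZFS_HOST=10.8.30.191
ZFS_POOL=rpool
ZFS_BASE_PATH=/export/home/cloud        ## the path that maps to /srv/cloud
ZFS_IMAGES_PATH=one/var/images          ## relative to ZFS_BASE_PATH
ZFS_CMD=/usr/sbin/zfs
ZFS_SNAPSHOT_NAME=gold

## ...and in each script, replace the hardcoded variable block with:
. /srv/cloud/one/etc/tm_zfs.conf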

References

Opennebula, ZFS and Xen – Part 1 (Get going)

ZFS admin guide

Opennebula, ZFS and Xen – Part 1 (Get going)

September 26, 2010

I've been reading about and wanting to try OpenNebula for months. Ten days ago I managed to get a basic setup going and thought I'd document things a little bit on my blog as a reference.

Disclaimer: This is not an OpenNebula how-to but rather a series of posts capturing my own customizations. Hence it is intended as a supplement to the official documentation, not a replacement or a step-by-step HOW-TO.

For my setup I opted for CentOS and OpenNebula 1.4 for the frontend node (2.0 entered RC just a couple of days ago, so I may be upgrading soon), OpenSolaris/Illumos to provide the Shared FS (hereafter the storage node) and Xen for the cluster nodes.

Installing the storage node

The storage node setup is pretty straightforward.

  • Grab a decent 64-bit system with a reasonable amount of RAM, since ZFS is well known to be memory-hungry. For my setup I chose a Sun X2100 M2 server lying unused in the datacenter, with a Dual-Core AMD Opteron 1214 CPU and 8GB of RAM.
  • Install the latest development build of OpenSolaris, b134. One may either install OpenSolaris 2009.06 and upgrade from the dev repository or just pick up the latest ISO image from genunix.org
  • Upgrade to Illumos; while not really necessary this will mean that you will have the latest and greatest ZFS bits on your system
  • Create the ZFS datasets required for OpenNebula (Solaris die-hards may prefer “pfexec zfs” as a non-root user ;)) and assign ownership to the oneadmin user


# zfs create rpool/export/home/cloud
# zfs set mountpoint=/srv/cloud rpool/export/home/cloud
# zfs create rpool/export/home/cloud/images
# zfs create rpool/export/home/cloud/one
# zfs create rpool/export/home/cloud/one/var
# chown -R oneadmin:cloud /srv/cloud

  • Create the cloud group and oneadmin user (a sketch follows below). Make sure that you note down the uid and gid so that they are identical on your frontend and cluster nodes.
  • Share the top-level cloud ZFS dataset. Ideally one should share the top-level dataset read-write only for the frontend node and the “one/var” sub-dataset read-write for the cluster nodes. To keep things simple, this example showcases a read-write share for the entire cluster subnet:


# zfs set sharenfs='rw=@192.168.1.0/24' rpool/export/home/cloud
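
For the user and group creation step above, a minimal sketch on the storage node; the uid/gid values are illustrative, just reuse whatever you pick verbatim on the frontend and cluster nodes:

storage-node# groupadd -g 1001 cloud
storage-node# useradd -u 1001 -g cloud -d /srv/cloud/one oneadmin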

The above should be enough to get things going with the default out-of-the-box NFS setup that OpenNebula uses from a storage perspective, the only difference being that the Shared FS does not “live” on the frontend but on an independent system.

Installing the frontend and cluster nodes

Once the storage node is properly setup follow the OpenNebula documentation to setup the frontend and cluster nodes, using an NFS transfer driver.

The only “catch” is that the Shared FS lives on a Solaris NFSv4 server, and for that matter one that behaves funnily with NFSv3 clients. Hence, you need to mount the Shared FS as NFSv4 in /etc/fstab on your frontend and cluster nodes, and make sure that the NFSv4 domain of the clients and the server match:

frontend-node$ grep cloud /etc/fstab
nfs-server-IP:/srv/cloud /srv/cloud/ nfs4 noauto 0 0
frontend-node$ cat /etc/idmapd.conf
[General]

Verbosity = 0
Pipefs-Directory = /var/lib/nfs/rpc_pipefs
Domain = mydomain.priv

[Mapping]

Nobody-User = nobody
Nobody-Group = nobody

# [Translation]
# Method = nsswitch
storage-node$ cat /var/run/nfs4_domain
mydomain.priv

Note that the NFS4 domain picked by the server is normally the DNS domain.

Validating the setup

Once the setup is complete, one may use the sample VM to validate it. Just:

  • copy it under “/srv/cloud/images/ttylinux” on the storage server
  • use onevm create on the frontend node to create a new instance of it
  • figure out the cluster node it got deployed to with “onevm list”
  • use virt-viewer on the cluster node to verify that the VM has been properly launched and is running (a condensed transcript follows)
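
A condensed transcript of the above; the sample template filename, the VM ID and the host names are illustrative assumptions:

oneadmin@frontend-node$ onevm create /srv/cloud/images/ttylinux/ttylinux.one
oneadmin@frontend-node$ onevm list              ## note the HOST the VM got deployed to
root@cluster-node# xm list                      ## the domain shows up as one-<VMID>
root@cluster-node# virt-viewer one-42           ## attach to the VM console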