NetScaler fun with OpenStack keys and userdata

April 17, 2016

One of the things that’s been bugging me about NetScaler and OpenStack is the lack of basic integration. Its management network is configured via DHCP on first boot, or via config drive and userdata if DHCP is not available, but it doesn’t import SSH keys or run userdata scripts for its initial configuration.

Thankfully, the above limitation may be easily alleviated using the nsbefore.sh and nsafter.sh boot-time configuration backdoors. Here is a sample nsbefore.sh for VPX, based on the OpenStack docs, that can handle the import of SSH keys:

root@ns# cat /nsconfig/nsbefore.sh
#!/usr/bin/bash
# Fetch public key using HTTP
ATTEMPTS=10
FAILED=0
while [ ! -f /nsconfig/ssh/authorized_keys ]; do
  curl -f http://169.254.169.254/latest/meta-data/public-keys/0/openssh-key > /tmp/metadata-key 2>/dev/null
  if [ $? -eq 0 ]; then
    cat /tmp/metadata-key >> /nsconfig/ssh/authorized_keys
    chmod 0600 /nsconfig/ssh/authorized_keys
    rm -f /tmp/metadata-key
    echo "Successfully retrieved public key from instance metadata"
    echo "*****************"
    echo "AUTHORIZED KEYS"
    echo "*****************"
    cat /nsconfig/ssh/authorized_keys
    echo "*****************"
  else
    FAILED=`expr $FAILED + 1`
    if [ $FAILED -ge $ATTEMPTS ]; then
      echo "Failed to retrieve public key from instance metadata after $FAILED attempts, quitting"
      break
    fi
    echo "Could not retrieve public key from instance metadata (attempt #$FAILED/$ATTEMPTS), retrying in 5 seconds..."
    ifconfig 0/1
    sleep 5
  fi
done
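
The loop above only fetches the key while authorized_keys does not exist yet. Here is a minimal sketch of an idempotent alternative that appends a key only when that exact line is missing. The add_key_once name and its two arguments are made up for illustration and are not part of NetScaler:

```shell
#!/bin/sh
# add_key_once KEYFILE AUTHFILE  (hypothetical helper, an assumption)
# Appends each line of KEYFILE to AUTHFILE unless that exact line is
# already present, so running it on every boot does not duplicate keys.
add_key_once() {
  keyfile=$1
  authfile=$2
  touch "$authfile" && chmod 0600 "$authfile" || return 1
  while IFS= read -r key || [ -n "$key" ]; do
    [ -z "$key" ] && continue
    # -x: whole-line match, -F: fixed string, -q: no output
    grep -qxF -- "$key" "$authfile" || printf '%s\n' "$key" >> "$authfile"
  done < "$keyfile"
}
```

In nsbefore.sh this would replace the "cat /tmp/metadata-key >> /nsconfig/ssh/authorized_keys" line, called as: add_key_once /tmp/metadata-key /nsconfig/ssh/authorized_keys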

Courtesy of the Red Hat documentation, here is a simple nsafter.sh that retrieves and runs a userdata script:

#!/usr/bin/bash

# Fetch userdata using HTTP
ATTEMPTS=10
FAILED=0
while [ ! -f /nsconfig/userdata ]; do
  curl -f http://169.254.169.254/openstack/2012-08-10/user_data > /tmp/userdata 2>/dev/null
  if [ $? -eq 0 ]; then
    cat /tmp/userdata >> /nsconfig/userdata
    chmod 0700 /nsconfig/userdata
    rm -f /tmp/userdata
    echo "Successfully retrieved userdata"
    echo "*****************"
    echo "USERDATA"
    echo "*****************"
    cat /nsconfig/userdata
    echo "*****************"
    /nsconfig/userdata
  else
    FAILED=`expr $FAILED + 1`
    if [ $FAILED -ge $ATTEMPTS ]; then
      echo "Failed to retrieve userdata from instance metadata after $FAILED attempts, quitting"
      break
    fi
    echo "Could not retrieve userdata from instance metadata (attempt #$FAILED/$ATTEMPTS), retrying in 5 seconds..."
    sleep 5
  fi
done
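
Both scripts repeat the same retry loop. As a rough sketch, the loop could be factored into a small helper. The retry function below is my own naming, not something NetScaler or OpenStack ships:

```shell
#!/bin/sh
# retry ATTEMPTS DELAY COMMAND [ARGS...]  (hypothetical helper, an assumption)
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted, sleeping DELAY
# seconds between tries. Returns 0 on success, 1 otherwise.
retry() {
  attempts=$1
  delay=$2
  shift 2
  failed=0
  until "$@"; do
    failed=$((failed + 1))
    if [ "$failed" -ge "$attempts" ]; then
      echo "Giving up after $failed attempts: $*" >&2
      return 1
    fi
    echo "Attempt #$failed/$attempts failed, retrying in $delay seconds..." >&2
    sleep "$delay"
  done
  return 0
}

# Possible use in nsafter.sh, with the same URL as above:
# retry 10 5 curl -f -s -o /tmp/userdata http://169.254.169.254/openstack/2012-08-10/user_data
```

The messages go to stderr so that the output of the wrapped command stays clean.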

Simple enough. Now to put these to the test:

  1. Create a simple HEAT template

    # more template
    ################################################################################
    heat_template_version: 2015-10-15
    
    ################################################################################
    
    description: >
      Simple template to deploy a NetScaler with floating IP
    
    ################################################################################
    
    resources:
      testvpx:
        type: OS::Nova::Server
        properties:
          key_name: mysshkey
          image: NS_userdata
          flavor: m1.vpx
          networks:
            - network: private_network
          user_data_format: "RAW"
          user_data:
            get_file: provision.sh
    
      testvpx_floating_ip:
        type: OS::Neutron::FloatingIP
        properties:
          floating_network: external_network
    
      testvpx_float_association:
        type: OS::Neutron::FloatingIPAssociation
        properties:
          floatingip_id: { get_resource: testvpx_floating_ip }
          port_id: { get_attr: [testvpx, addresses, private_network, 0, port] }
    
  2. Import into Glance a NetScaler image with the above changes for nsbefore.sh and nsafter.sh; name it NS_userdata
  3. Create a simple test provisioning script

    # cat provision.sh
    #!/usr/bin/bash
    
    echo foo
    touch /var/tmp/foobar
    echo bar >> /var/tmp/foobar
    
    nscli -U :nsroot:nsroot add ns ip 172.16.30.40 255.255.255.0
    
  4. Create a stack and identify the NetScaler floating IP address

    # heat stack-create -f template vpx__userdata
    +--------------------------------------+------------------+--------------------+---------------------+--------------+
    | id                                   | stack_name       | stack_status       | creation_time       | updated_time |
    +--------------------------------------+------------------+--------------------+---------------------+--------------+
    | 540cb3d2-3b21-443c-a43b-10c745d28498 | vpx__userdata    | CREATE_IN_PROGRESS | 2016-04-17T16:49:49 | None         |
    +--------------------------------------+------------------+--------------------+---------------------+--------------+
    # nova list | grep testvpx
    | 77388ebc-97e8-4a74-b863-40e822cb88c7 | vpx__userdata-testvpx-t3r3avxl7unc        | ACTIVE | -          | Running     | private_network=192.168.100.200, 10.78.16.139
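
To pick the floating IP out of the listing without eyeballing it, something along these lines might do. It is only a sketch: the last_ip_from_nova_row name is invented, and it assumes the floating IP is the last address in the networks column, as in the row above:

```shell
#!/bin/sh
# last_ip_from_nova_row  (hypothetical helper, an assumption)
# Reads "nova list" table rows on stdin and prints the last address found
# in the networks column of every row that carries a network assignment.
last_ip_from_nova_row() {
  awk -F'|' '
    /=/ {
      # The networks column is the last non-empty "|"-separated field.
      col = $NF
      if (col ~ /^[ \t\r]*$/) col = $(NF - 1)
      gsub(/^[ \t]+|[ \t]+$/, "", col)
      n = split(col, parts, /[ ,]+/)
      ip = parts[n]
      sub(/^.*=/, "", ip)   # drop "network_name=" if only one address exists
      print ip
    }'
}

# Possible use:
# nova list | grep testvpx | last_ip_from_nova_row
```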
    

This should be it. To verify that everything went smoothly, SSH into the instance using your private SSH key and run “sh ns ip” to check that the provisioning script executed properly.

# ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i privatekey.pem nsroot@10.78.16.139
Warning: Permanently added '10.78.16.139' (RSA) to the list of known hosts.
###############################################################################
#                                                                             #
#        WARNING: Access to this system is for authorized users only          #
#         Disconnect IMMEDIATELY if you are not an authorized user!           #
#                                                                             #
###############################################################################

Last login: Sun Apr 17 16:51:06 2016 from 10.78.16.59
 Done
> sh ns ip
        Ipaddress        Traffic Domain  Type             Mode     Arp      Icmp     Vserver  State
        ---------        --------------  ----             ----     ---      ----     -------  ------
1)      192.168.100.200  0               NetScaler IP     Active   Enabled  Enabled  NA       Enabled
2)      172.16.30.40     0               SNIP             Active   Enabled  Enabled  NA       Enabled

Raspberry Pi: fun with docker

January 3, 2016

Today’s fun with docker

  1. Start with the resin.io Raspbian image

    docker pull resin/rpi-raspbian

  2. Install transmission and clean up

    docker run -i -t resin/rpi-raspbian /bin/bash
    apt-get update
    apt-get install transmission-daemon vim-tiny
    mkdir /var/lib/transmission-daemon/incomplete
    rm -rf /var/lib/apt/lists/*
    apt-get clean
    exit

  3. Create a new image (note: use “docker ps -a” to identify your container ID)

    docker commit c072cafc6d18 mperedim/rpi-raspbian-transmission

  4. Fire up a new docker instance based on the image created. Replace the “/media/bluedisk” paths below with the ones where you want your transmission downloads and incomplete files to be located

    docker run -d -p 9092:9091 --name downloads -v /media/bluedisk/transmission-daemon/downloads:/var/lib/transmission-daemon/downloads -v /media/bluedisk/transmission-daemon/incomplete:/var/lib/transmission-daemon/incomplete mperedim/rpi-raspbian-transmission /bin/bash -c "/usr/bin/transmission-daemon -a '*.*.*.*' -f --download-dir /var/lib/transmission-daemon/downloads --incomplete-dir /var/lib/transmission-daemon/incomplete --log-error"

  5. Hit port 9092 of your Raspberry Pi and start downloading stuff

TODO: autocreate from docker file, autostart on boot
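
As a first stab at the TODO, the manual steps could be captured in a Dockerfile. The sketch below just writes one out; the write_dockerfile name is made up, and the Dockerfile itself is an untested guess that mirrors the commands above:

```shell
#!/bin/sh
# write_dockerfile DIR  (hypothetical helper, an assumption)
# Writes a Dockerfile into DIR that repeats the manual steps above.
write_dockerfile() {
  cat > "$1/Dockerfile" <<'EOF'
FROM resin/rpi-raspbian
RUN apt-get update \
 && apt-get install -y transmission-daemon vim-tiny \
 && mkdir -p /var/lib/transmission-daemon/incomplete \
 && rm -rf /var/lib/apt/lists/* \
 && apt-get clean
EXPOSE 9091
CMD ["/usr/bin/transmission-daemon", "-a", "*.*.*.*", "-f", \
     "--download-dir", "/var/lib/transmission-daemon/downloads", \
     "--incomplete-dir", "/var/lib/transmission-daemon/incomplete", \
     "--log-error"]
EOF
}

# Possible use:
# write_dockerfile . && docker build -t mperedim/rpi-raspbian-transmission .
```

For the autostart part, adding --restart=always to the docker run invocation is probably the simplest option, assuming the docker daemon itself starts on boot.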

Netscaler VPX: DHCP support

February 4, 2015

This is a quick recipe for enabling DHCP for your Netscaler VPX on KVM:

  1. Boot the KVM VPX instance per the instructions on the Citrix site.
  2. Create /nsconfig/nsbefore.sh

    #!/bin/sh

    mkdir /var/db # to store lease files
    mkdir /var/empty
    /sbin/dhclient -l /var/db/dhclient.leases.1 0/1

  3. Create /nsconfig/nsafter.sh

    #!/bin/sh

    ADDR=`grep fixed-address /var/db/dhclient.leases.1 | awk '{print $NF}' | sed -e 's/;//' | uniq | tail -1`
    SUBNET=`grep subnet-mask /var/db/dhclient.leases.1 | awk '{print $NF}' | sed -e 's/;//' | uniq | tail -1`
    GATEWAY=`grep routers /var/db/dhclient.leases.1 | awk '{print $NF}' | sed -e 's/;//' | uniq | tail -1`

    # Set the NSIP (and reboot) only if the leased address is not configured yet
    grep "$ADDR" /nsconfig/ns.conf
    if [ $? != 0 ]; then
      nscli -U :nsroot:nsroot "set ns config -IPAddress $ADDR -netmask $SUBNET"
      nscli -U :nsroot:nsroot "savec"
      sleep 5
      yes | nscli -U :nsroot:nsroot "reboot"
    fi

    # Add the default route only if the gateway is not configured yet
    grep "$GATEWAY" /nsconfig/ns.conf
    if [ $? != 0 ]; then
      nscli -U :nsroot:nsroot "add route 0.0.0.0 0.0.0.0 $GATEWAY"
      nscli -U :nsroot:nsroot "savec"
    fi

  4. Power off your Netscaler VPX and save the disk image as a DHCP-enabled template.
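
The three grep/awk/sed pipelines in nsafter.sh do the same job, so they could be collapsed into one helper. This is a sketch only; lease_value is a name I made up, and it assumes the usual dhclient lease layout where each setting ends in a semicolon:

```shell
#!/bin/sh
# lease_value KEY LEASEFILE  (hypothetical helper, an assumption)
# Prints the value from the last line in LEASEFILE that mentions KEY,
# with the trailing ";" removed. Prints nothing if KEY never appears.
lease_value() {
  awk -v key="$1" '
    index($0, key) { v = $NF; sub(/;$/, "", v); last = v }
    END { if (last != "") print last }
  ' "$2"
}

# Possible use in nsafter.sh:
# ADDR=`lease_value fixed-address /var/db/dhclient.leases.1`
# SUBNET=`lease_value subnet-mask /var/db/dhclient.leases.1`
# GATEWAY=`lease_value routers /var/db/dhclient.leases.1`
```

Like the original pipelines, it picks the most recent entry when the lease file holds several leases.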

2012 Nexus 7 with F2FS

January 9, 2015

This turned out to be easier than anticipated, having largely followed the guide available at gadgetreactor. Some quick notes on fun facts I ran into:

  1. If you were thinking of using a VM and USB passthrough, think again. It may or may not work. Do yourself a favor and don’t waste a few hours trying to make it work with VirtualBox, VMware Player or whatever.
  2. If you are a decent cloud netizen, most of your stuff is already in Dropbox/GDrive/GMail/etc. Thus, if your tablet is predominantly a couch computing device you probably don’t need a backup in the first place. Worst case scenario, you’ll lose your Angry Birds high scores.
  3. TWRP, once booted, seems to expose a USB media device for read/write purposes. Hence, you don’t really need a USB OTG thumbdrive. Just make sure that TWRP is installed, boot into it and try to create/copy a file prior to wiping everything.
  4. There is some kind of issue with the current slim Gapps which requires the mini-gapps to be flashed before full-gapps. If you just install the full-gapps, Google Play Store and Services will be unavailable and your N7 largely useless.

This was it. In retrospect it was rather easy, with most of the time and effort spent on trying to make USB passthrough work rather than using my spare Windows laptop.

Device is quite responsive with the latest firmware. I really like the new “cards” layout when I view the open apps but other than that the new interface is a little bit heavy on visuals. Overall it’s probably a welcome upgrade.

Greek private sector

September 22, 2014

29/7: you order a new laptop (via Amplus)
12/9: the laptop finally arrives at the company's offices late in the afternoon.
16/9: Geniki Taxydromiki picks it up on behalf of UPS. It hands it over to UPS in Athens three days later. When I contact them they swear it was an isolated incident (but of course they made no effort to cover any cost of a weekend delivery, nor did they offer any refund in recognition of the abysmal level of service)
22/9: after a gigantic mixup, UPS, because we are good customers, delivers it on Monday rather than on Tuesday, when it would normally have arrived.
22/9: I discover that the box movers^W^W^W the IT company not only took 45 days to deliver a laptop, but the extra memory that was ordered was never installed.

All of the above would be really funny if it weren't typical Greek private sector and if, after 55 days, I finally had the laptop I ordered.

ceph: a quick critique

November 21, 2013

The other day I went ahead and had a short rant about Ceph on Twitter:

This prompted a response by Ian Colle and I somehow managed to get myself to write a short blog post explaining my thoughts.

A good place to start is the ceph-deploy tool. I think this tweet sums up how I feel about the existence of the tool in the first place:

Now the tool itself could be great (more on that later). And it’s OK to involve it in a quick start guide of sorts. But I would have hoped that the deep dive sections provided some more insight on what is happening under the hood.

That said, the ceph guys have decided to go ahead with ceph-deploy. Maybe it cut the docs size by half (bundling what used to be 10+ steps into a single ceph-deploy invocation), maybe it makes user errors fewer and support much easier. So I bit the bullet and went ahead with it. I installed Ubuntu 13.10, typed “apt-get install ceph*” on my admin and my two test nodes and tried to start hacking away. One day later I was no nearer to having a working cluster, with my monitor health displaying 2 OSDs, 0 in, 0 up. It wasn’t a full day of work but it was frustrating. At the end of the day I gave up and decided to declare the Ubuntu Saucy packages broken.

Now I appreciate that InkTank may have nothing to do with the packages in the default Ubuntu repos. It may not provide them, it may not test against them. In fact most of their guides recommend using the repositories at ceph.com. But they’re there. And if something is in the repo, people expect it to work.

Having finally bitten the bullet I decided to go ahead with the “official” ceph-deploy and packages. This was not without its problems. Locating the packages for Ubuntu Saucy took a little more time than it had to. Even after resolving that, I kept running into issues. Turns out that if at any point “you want to start over”, purgedata is not enough. Turns out that this is a known problem too. “apt-get install --reinstall” fixed things for me and voila, I had a ceph cluster.

Neat. “ceph health” indicated my 2 OSDs up and running, I could mount the pool from a client, etc. Let me take a look at ceph.conf:


# cat /etc/ceph/ceph.conf
[global]
fsid = 2e36c280-4b7f-4474-aa87-9fe317388060
mon_initial_members = foo
mon_host = W.X.Y.Z
auth_supported = cephx
osd_journal_size = 1024
filestore_xattr_use_omap = true

This is it. No sections for my one monitor, my one MDS, my 2 OSDs. If you have read Configuring Ceph, congrats. You are still none the wiser as to where all these configuration settings are stored. I’ll find out. Eventually.

Was this the end of my problems? Almost. I went ahead, rebooted my small test cluster (2 servers; 1x MON/MDS/OSD, 1x OSD) and noticed the following:


ceph> health
HEALTH_WARN mds cluster is degraded

Thankfully that was an easy one. Clickety-click:


osd-0# mount /dev/sdb3 /var/lib/ceph/osd/ceph-0
osd-1# mount /dev/sdb3 /var/lib/ceph/osd/ceph-1
# ceph
ceph> health
HEALTH_OK
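
To spot this condition after a reboot without guessing, a check like the following might help. It is a sketch under assumptions: unmounted_osd_dirs is an invented name, it relies on the Linux mountpoint utility, and it assumes each OSD keeps its data on a dedicated filesystem under /var/lib/ceph/osd as in my setup:

```shell
#!/bin/sh
# unmounted_osd_dirs [BASE]  (hypothetical helper, an assumption)
# Prints every directory under BASE (default /var/lib/ceph/osd) that is not
# a mount point, i.e. an OSD data directory whose disk was not mounted.
unmounted_osd_dirs() {
  base=${1:-/var/lib/ceph/osd}
  for dir in "$base"/*/; do
    [ -d "$dir" ] || continue
    mountpoint -q "$dir" || printf '%s\n' "${dir%/}"
  done
}

# Possible use on each OSD node:
# unmounted_osd_dirs
```

Any path it prints is a candidate for the manual mount shown above.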

Does it work? Like a charm. But the frustration in the process over what seem to be silly bugs was constantly mounting. And it shouldn’t have. This is the simplest setup one could possibly come up with, with just a couple of nodes. I was using the latest Ubuntu, not some niche distribution like Gentoo or Arch nor a distro with outdated packages like CentOS/RHEL/Debian-stable. I should have had this up and running in an hour, not a couple of days, so that I could hack at stuff of more direct interest to me.

Getting back to my original tweet: I exaggerated. You can certainly grab an expert from InkTank to help you set up Ceph. Or you can invest some time on your own. But I still wanted this to be simpler.

my history of linux

November 6, 2013
  • Tried various Linux distros between 1997-2003; being a mostly Windows guy at the time (yeah, I did start out as a Windows sysadmin) I ended up not investing any time trying to figure out why the mouse would not work, X didn’t start etc.
  • I ended up installing Gentoo stage 1 back in 2004. Having to manually configure pretty much everything in the system essentially forced me to learn a bunch of new stuff, allowing me to actually land a job that required Unix-fu a year later.
  • I ended up uninstalling Gentoo from my desktop system a year later when I decided that wasting 3 hours to fix my LVM setup was too much for what should be a simple emerge update.
  • In $dayjob I’ve been dutifully running, maintaining and otherwise working with a bunch of Linux systems since 2005: CentOS, Fedora, Ubuntu, RHEL 5.x and RHEL 6.x, OpenSUSE, CoreOS and maybe others I forget. Having been a mostly Solaris fanboy from 2005 onwards I have a love/hate relationship with Linux, but I am willing to admit that it gets the job done most of the time.
  • I still think that Linux on the desktop is not worth my time and effort. If it’s worth yours then great.
  • Oh, I had an Android phone for a year or so and still love my Google gen-1 Nexus 7. Do Android devices count? 🙂

There, happy @ebalaskas?

Netscaler 10.x, XenServer 6.2 and Cloudstack

November 1, 2013

Quick tech note. After my Cloudstack testbed upgrade my Netscaler VMs no longer booted. The issue was similar to the one described in CTX135226 but affected both Netscaler 9.3 and 10.1 VMs.

After wasting a few good hours it turns out that this is caused by the presence of a DVD drive in the VM. The mere presence of a DVD drive, even if no ISO is loaded, somehow messes up the Netscaler boot process under XenServer 6.2. The workaround is trivial:

  1. Spot the DVD drive (hdd) using the vbd-list command

    # xe vbd-list vm-name-label=NetScaler\ Virtual\ Appliance
    uuid ( RO) : 5e218b95-68e9-37b2-860f-8b97678e2c02
    vm-uuid ( RO): ab229b8f-2b18-4931-b80b-30ab8265b843
    vm-name-label ( RO): NetScaler Virtual Appliance
    vdi-uuid ( RO): 5bcf6d76-1d2f-411a-9dd7-469aad0f86ae
    empty ( RO): false
    device ( RO): hdd

    uuid ( RO) : 9d6ff62c-93fa-3af5-b4b5-c5c66a0556e4
    vm-uuid ( RO): ab229b8f-2b18-4931-b80b-30ab8265b843
    vm-name-label ( RO): NetScaler Virtual Appliance
    vdi-uuid ( RO): 7b1fa1ff-7388-4e5a-8c67-00e4d943a470
    empty ( RO): false
    device ( RO): hda

  2. Destroy it

    # xe vbd-destroy uuid=5e218b95-68e9-37b2-860f-8b97678e2c02

  3. Power on the Netscaler VM again; this time it should work like a charm
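
If you have more than a handful of VMs to fix, picking the uuid out by hand gets old. A small sketch that parses the vbd-list output shown above; vbd_uuid_for_device is a name I invented, and it assumes the record layout printed by xe stays as in the listing:

```shell
#!/bin/sh
# vbd_uuid_for_device DEVICE  (hypothetical helper, an assumption)
# Reads "xe vbd-list" records on stdin and prints the uuid of each VBD
# whose device field equals DEVICE (for example hdd).
vbd_uuid_for_device() {
  awk -v dev="$1" '
    $1 == "uuid" { uuid = $NF }                   # remember the record uuid
    $1 == "device" && $NF == dev { print uuid }   # emit it on a device match
  '
}

# Possible use:
# xe vbd-list vm-name-label=NetScaler\ Virtual\ Appliance | vbd_uuid_for_device hdd
```

The printed uuid can then be handed to xe vbd-destroy as in step 2.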

This is good enough if you have access to the XenServer running the VM. It is not as good a workaround for a Cloudstack environment, which requires you to jump through a few extra hoops. Specifically, if you are using XenServer you can update the execute method that returns a StartAnswer in CitrixResourceBase.java so that it invokes the createVbd function selectively: if the guest OS type matches “Other install media” and the disk type is ISO, skip VBD creation. Here is the relevant sample code snippet:


public StartAnswer execute(StartCommand cmd) {
    ...
    String guestOsTypeName = getGuestOsType(vmSpec.getOs(), vmSpec.getBootloader() == BootloaderType.CD);
    for (DiskTO disk : vmSpec.getDisks()) {
        // Skip the DVD drive for VMs mapped to the "Other install media" template
        if (disk.getType() == Volume.Type.ISO && guestOsTypeName.toLowerCase().contains("other install media")) {
            s_logger.debug("mperedim: skipping VBD creation");
        } else {
            createVbd(conn, disk, vmName, vm, vmSpec.getBootloader());
        }
    }
    ...

One can improve this workaround by mapping Netscaler VPXs to a dedicated OS template, so that DVD drives are still created for other VMs that get mapped to the “Other install media” XS template.

Xenserver: fake xen tools for Solaris 10 guests

March 31, 2013

Note: you hopefully appreciate that this is completely unsupported.

Xenserver doesn’t enable Shutdown / Reboot buttons for VMs that don’t have the XenServer tools installed. This is an issue for my Solaris 10 guests since tools are not available for this platform. Which has been bugging me for some time.

So I went ahead and dug into the XenServer tools for Linux. Turns out that the only thing they’re doing is updating a bunch of parameters on XenStore. However, Solaris 10 doesn’t have a device path for XenStore, putting us back to square one. Or not?

Not really. Turns out that Xen Tools installation is like orgasm memory. Sure, it’s a lot better if one has them, but if not one can modify the appropriate XenStore parameters from dom0 and fake it. XenServer couldn’t care less how the parameters were modified, as long as they are tweaked in the proper order the Suspend/Reboot/Shutdown buttons are enabled. So just get the dom-id of your Solaris VM:


[root@dom0 log]# xe vm-list name-label=i-2-323-VM params=dom-id
dom-id ( RO) : 154

and

[root@dom0 log]# xenstore-write /local/domain/154/attr/PVAddons/Installed 1
[root@dom0 log]# xenstore-write /local/domain/154/attr/PVAddons/MajorVersion 6
[root@dom0 log]# xenstore-write /local/domain/154/attr/PVAddons/MinorVersion 1
[root@dom0 log]# xenstore-write /local/domain/154/attr/PVAddons/MicroVersion 0
[root@dom0 log]# xenstore-write /local/domain/154/attr/PVAddons/BuildVersion 59235
[root@dom0 log]# xenstore-write /local/domain/154/attr/PVAddons/os/class "SunOS"
[root@dom0 log]# xenstore-write /local/domain/154/attr/PVAddons/os/major "5"
[root@dom0 log]# xenstore-write /local/domain/154/attr/PVAddons/os/minor "10"
[root@dom0 log]# xenstore-write /local/domain/154/data/updated 1
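
Since the dom-id changes every time the VM restarts, typing these by hand gets tedious. A sketch that prints the same commands for any dom-id, in the same order, without running them; fake_pv_tools_cmds is an invented name and the values are simply the ones used above:

```shell
#!/bin/sh
# fake_pv_tools_cmds DOMID  (hypothetical helper, an assumption)
# Prints, without executing them, the xenstore-write commands listed above
# for the given dom-id. data/updated is written last, as in the original.
fake_pv_tools_cmds() {
  base="/local/domain/$1"
  while read -r key value; do
    printf 'xenstore-write %s/%s %s\n' "$base" "$key" "$value"
  done <<EOF
attr/PVAddons/Installed 1
attr/PVAddons/MajorVersion 6
attr/PVAddons/MinorVersion 1
attr/PVAddons/MicroVersion 0
attr/PVAddons/BuildVersion 59235
attr/PVAddons/os/class SunOS
attr/PVAddons/os/major 5
attr/PVAddons/os/minor 10
data/updated 1
EOF
}

# Possible use on dom0, after reviewing the output:
# fake_pv_tools_cmds 154 | sh
```

Printing instead of executing keeps it a dry run by default, which seems prudent for something this unsupported.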

The above is enough to enable the shutdown/reboot/suspend buttons. Unfortunately, in the process it also sets the “platform: viridian: true” parameter, which doesn’t play nicely with Solaris VMs. So look up the VM uuid and remove that key:

[root@dom0 log]# xe vm-list name-label=i-2-323-VM params=uuid
uuid ( RO)    : 5dc51848-bc9c-dd70-b670-2c7d263a7fe5
[root@dom0 log]# xe vm-param-remove param-name=platform param-key=viridian uuid=[...]

… and see the “Force shutdown”, “Force reboot” buttons disappearing.

So what works?

  1. Reboot: this does a clean reboot of the Solaris 10 domU
  2. Live migrate: not extensively tested, but a test VM does keep network connectivity after a single live migration.

Unfortunately shutdown only kind of works. Hitting the button does initiate a clean shutdown of the Solaris domU but the guest never seems to do an ACPI poweroff and gets stuck at “Press any key to reboot”. This is proving a slightly tougher nut to crack.

Update 2013/04/01: I’ve wasted a few too many hours on “shutdown” not working. Maybe I’ll revisit this in the future but calling it quits for now.

Cloudstack: OS type & xenserver templates

March 6, 2013

I’ve been using cloudstack for circa a month now for virtualising Solaris workloads. It has been mostly working like a charm, once I applied the appropriate workarounds (cf. my relevant findings, courtesy of my IoannisB citrix identity). However one thing has been bugging me for some time:


# xe vm-list name-label=i-2-271-VM params=name-description
name-description ( RW) : Template which allows VM installation from install media

My Solaris VMs are launched using the generic Xenserver template. This is not really to my liking for two reasons. Firstly, I have to apply the viridian:false modification to the default template. Secondly, there is no way to tell whether a VM is a Solaris one or not using the Xenserver CLI.

The fix is to have Cloudstack use the “Solaris 10 (experimental)” template for my Solaris workloads.

  1. Download the cloudstack source code and uncompress to a folder of your choice.
  2. Apply a rather simple diff to the CitrixHelper.java file:
    $ diff CitrixHelper.java.orig CitrixHelper.java
    425a426
    > _xenServer600GuestOsMap.put("Sun Solaris 10(64-bit)", "Solaris 10 (experimental)");
    542a544
    > _xenServer602GuestOsMap.put("Sun Solaris 10(64-bit)", "Solaris 10 (experimental)");
  3. Build the JAR files, per the instructions in the Cloudstack installation guide page 16. No need to build DEB or RPM packages
  4. Replace /usr/share/java/cloud-plugin-hypervisor-xen.jar with cloud-plugin-hypervisor-xen-4.0.0-incubating-SNAPSHOT.jar that was built in the step above
  5. Restart the management server.

Slashdot geeks may want to add a Step-6: Profit. Launch again a Solaris 10 64-bit template and enjoy:


# xe vm-list name-label=i-2-272-VM params=name-description
name-description ( RW) : Clones of this template will automatically provision their storage when first booted and then reconfigure themselves with the optimal settings for Solaris 10 (experimental).