Posts Tagged ‘sysadmin’

Raspberry Pi: fun with docker

January 3, 2016

Today’s fun with docker

  1. Start with the resin.io Raspbian image
    docker pull resin/rpi-raspbian

  2. Install transmission and clean up
    docker run -i -t resin/rpi-raspbian /bin/bash
    apt-get update
    apt-get install transmission-daemon vim-tiny
    mkdir /var/lib/transmission-daemon/incomplete
    rm -rf /var/lib/apt/lists/*
    apt-get clean
    exit

  3. Create a new image (note: use “docker ps -a” to identify your container ID)
    docker commit c072cafc6d18 mperedim/rpi-raspbian-transmission

  4. Fire up a new docker instance based on the image created. Replace the “media/bluedisk” paths below with the ones where you want your transmission downloads and incomplete files to be located
    docker run -d -p 9092:9091 --name downloads -v /media/bluedisk/transmission-daemon/downloads:/var/lib/transmission-daemon/downloads -v /media/bluedisk/transmission-daemon/incomplete:/var/lib/transmission-daemon/incomplete mperedim/rpi-raspbian-transmission /bin/bash -c "/usr/bin/transmission-daemon -a *.*.*.* -f --download-dir /var/lib/transmission-daemon/downloads --incomplete-dir /var/lib/transmission-daemon/incomplete --log-error"

  5. Hit port 9092 of your Raspberry Pi and start downloading stuff

TODO: autocreate from docker file, autostart on boot
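
Something along these lines ought to cover both TODOs; an untested sketch, where the RUN line simply mirrors the manual steps above and a restart policy handles starting on boot:

cat > Dockerfile <<'EOF'
FROM resin/rpi-raspbian
RUN apt-get update && \
    apt-get install -y transmission-daemon vim-tiny && \
    mkdir -p /var/lib/transmission-daemon/incomplete && \
    rm -rf /var/lib/apt/lists/* && \
    apt-get clean
EOF
docker build -t mperedim/rpi-raspbian-transmission .
# then add --restart=always to the "docker run" invocation of step 4 so that
# the docker daemon brings the container back up after a reboot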

ceph: a quick critique

November 21, 2013

The other day I went ahead and had a short rant about Ceph on Twitter:

This prompted a response by Ian Colle, and I somehow managed to get myself to write a short blog post explaining my thoughts.

A good place to start is the ceph-deploy tool. I think this tweet sums up how I feel about the existence of the tool in the first place:

Now the tool itself could be great (more on that later). And it’s OK to involve it in a quick start guide of sorts. But I would have hoped that the deep dive sections provided some more insight on what is happening under the hood.

That said, the ceph guys have decided to go ahead with ceph-deploy. Maybe it cuts the docs size by half (bundling what used to be 10+ steps into a single ceph-deploy invocation); maybe it makes user errors fewer and support much easier. So I bit the bullet and went ahead with it. I installed Ubuntu 13.10, typed “apt-get install ceph*” on my admin and my two test nodes and tried to hack away. A day later I was no closer to having a working cluster, my monitor health displaying 2 OSDs, 0 in, 0 up. It wasn’t a full day of work, but it was frustrating. At the end of the day I gave up and declared the Ubuntu Saucy packages broken.

Now, I appreciate that InkTank may have nothing to do with the packages in the default Ubuntu repos. It may not provide them; it may not test against them. In fact, most of their guides recommend using the repositories at ceph.com. But they’re there. And if something is in the repo, people expect it to work.

Having finally bitten the bullet, I decided to go ahead with the “official” ceph-deploy and packages. This was not without its problems. Locating the packages for Ubuntu Saucy took a little more time than it had to, and even with that resolved I kept running into issues. Turns out that if at any point “you want to start over”, purgedata is not enough; this is a known problem too. “apt-get install --reinstall” fixed things for me and voila, I had a ceph cluster.

Neat. “ceph health” indicated my 2 OSDs up and running, I could mount the pool from a client, etc. Let me take a look at ceph.conf:


# cat /etc/ceph/ceph.conf
[global]
fsid = 2e36c280-4b7f-4474-aa87-9fe317388060
mon_initial_members = foo
mon_host = W.X.Y.Z
auth_supported = cephx
osd_journal_size = 1024
filestore_xattr_use_omap = true

This is it. No sections for my one monitor, my one MDS, my two OSDs. If you have read Configuring Ceph, congrats: you are still none the wiser about where all these configuration settings are stored. I’ll find out. Eventually.
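
For what it’s worth, a daemon’s running configuration can apparently be dumped through its admin socket; a hedged sketch, assuming the default socket path and the monitor name “foo” from the ceph.conf above:

# ceph --admin-daemon /var/run/ceph/ceph-mon.foo.asok config show | head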

Was this the end of my problems? Almost. I went ahead, rebooted my small test cluster (2 servers; 1x MON/MDS/OSD, 1x OSD) and noticed the following:


ceph> health
HEALTH_WARN mds cluster is degraded

Thankfully that was an easy one. Clickety-click:


osd-0# mount /dev/sdb3 /var/lib/ceph/osd/ceph-0
osd-1# mount /dev/sdb3 /var/lib/ceph/osd/ceph-1
# ceph
ceph> health
HEALTH_OK
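
Of course these mounts won’t survive the next reboot on their own. The obvious follow-up is an /etc/fstab entry on each node; a sketch, assuming the OSD filesystems are xfs (adjust to taste):

# /etc/fstab on osd-0; osd-1 gets the equivalent ceph-1 line
/dev/sdb3   /var/lib/ceph/osd/ceph-0   xfs   defaults,noatime   0 0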

Does it work? Like a charm. But the frustration over what seem to be silly bugs was constantly mounting in the process. And it shouldn’t have been. This is the simplest setup one could possibly come up with, just a couple of nodes. I was using the latest Ubuntu, not some niche distribution like Gentoo or Arch, nor a distro with outdated packages like CentOS/RHEL/Debian-stable. I should have had this up and running in an hour, not a couple of days, so that I could hack at stuff of more direct interest to me.

Getting back to my original tweet: I exaggerated. You can certainly grab an expert from InkTank to help you set up Ceph. Or you can invest some time on your own. But I still wanted this to be simpler.

Xenserver: fake xen tools for Solaris 10 guests

March 31, 2013

Note: you hopefully appreciate that this is completely unsupported.

XenServer doesn’t enable the Shutdown / Reboot buttons for VMs that don’t have the XenServer tools installed. This is an issue for my Solaris 10 guests, since the tools are not available for this platform, and it has been bugging me for some time.

So I went ahead and dug into the XenServer tools for Linux. Turns out that the only thing they do is update a bunch of parameters in XenStore. However, Solaris 10 doesn’t have a device path for XenStore, putting us back at square one. Or not?

Not really. Turns out that Xen tools installation is like orgasm memory: sure, it’s a lot better if one has them, but if not, one can modify the appropriate XenStore parameters from dom0 and fake it. XenServer couldn’t care less how the parameters were modified; as long as they are tweaked in the proper order, the Suspend/Reboot/Shutdown buttons are enabled. So just get the dom-id of your Solaris VM:


[root@dom0 log]# xe vm-list name-label=i-2-323-VM params=dom-id
dom-id ( RO) : 154

and

[root@dom0 log]# xenstore-write /local/domain/154/attr/PVAddons/Installed 1
[root@dom0 log]# xenstore-write /local/domain/154/attr/PVAddons/MajorVersion 6
[root@dom0 log]# xenstore-write /local/domain/154/attr/PVAddons/MinorVersion 1
[root@dom0 log]# xenstore-write /local/domain/154/attr/PVAddons/MicroVersion 0
[root@dom0 log]# xenstore-write /local/domain/154/attr/PVAddons/BuildVersion 59235
[root@dom0 log]# xenstore-write /local/domain/154/attr/PVAddons/os/class "SunOS"
[root@dom0 log]# xenstore-write /local/domain/154/attr/PVAddons/os/major "5"
[root@dom0 log]# xenstore-write /local/domain/154/attr/PVAddons/os/minor "10"
[root@dom0 log]# xenstore-write /local/domain/154/data/updated 1

The above is enough to enable the shutdown/reboot/suspend buttons. Unfortunately, in the process it also sets the “platform: viridian: true” parameter, which doesn’t play nicely with Solaris VMs. Remove it:

[root@dom0 log]# xe vm-list name-label=i-2-323-VM params=uuid
uuid ( RO)    : 5dc51848-bc9c-dd70-b670-2c7d263a7fe5
[root@dom0 log]# xe vm-param-remove param-name=platform param-key=viridian uuid=[...]

… and watch the “Force shutdown” and “Force reboot” buttons disappear.
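
Note that the dom-id changes every time the VM reboots, so the xenstore-write dance has to be repeated after each boot. A quick-and-dirty dom0 wrapper (a hypothetical, untested sketch; pass the VM name-label as the argument):

#!/bin/sh
# re-apply the fake Xen tools keys for the given VM
DOMID=$(xe vm-list name-label="$1" params=dom-id --minimal)
P=/local/domain/$DOMID/attr/PVAddons
xenstore-write $P/Installed 1 $P/MajorVersion 6 $P/MinorVersion 1 \
    $P/MicroVersion 0 $P/BuildVersion 59235 \
    $P/os/class SunOS $P/os/major 5 $P/os/minor 10 \
    /local/domain/$DOMID/data/updated 1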

So what works?

  1. Reboot: this does a clean reboot of the Solaris 10 domU
  2. Live migrate: not extensively tested, but a test VM does keep network connectivity after a single live migration.

Unfortunately, shutdown only kind of works. Hitting the button does initiate a clean shutdown of the Solaris domU, but the guest never seems to do an ACPI poweroff and gets stuck at “Press any key to reboot”. This is proving a slightly tougher nut to crack.

Update 2013/04/01: I’ve wasted a few too many hours on “shutdown” not working. Maybe I’ll revisit this in the future, but I’m calling it quits for now.

Solaris + xenserver + ovswitch

February 28, 2013

This has been troubling me for quite some time; hopefully someone else can save a few hours by bumping into this post.

For some reason my Solaris 10 virtual machines on XenServer failed when the Distributed Virtual Switch Controller was also running. I didn’t really troubleshoot the issue until recently, since I could live without cross-server private networks. This no longer being the case, I decided to look into it again.

Fast-forward a couple of hours: after losing quite some time trying various tricks on the VM (disabling NIC checksum offload, lowering MTUs etc.) to no avail, I concluded that it must be a hypervisor issue. Digging into the openvswitch tools revealed something interesting.


[root@xenserver ~]# ovs-vsctl list-ports xapi25
vif42.0
tap47.0
vif47.0
vif6.0

Specifically, for my Linux VMs only a vifX.Y interface was being added to the bridge, while for my Solaris ones both a tapX.Y and a vifX.Y were. Clickety-click.


[root@xenserver ~]# ovs-vsctl del-port xapi25 tap47.0

Et voila! Network connectivity to the Solaris VM works like a charm. Now to make this change permanent:


[root@xenserver ~]# diff /etc/xensource/scripts/vif.orig /etc/xensource/scripts/vif
134c134,138
< $vsctl --timeout=30 -- --if-exists del-port $dev -- add-port $bridge $dev $vif_details
---
> if [[ $dev != tap* ]]; then
>     $vsctl --timeout=30 -- --if-exists del-port $dev -- add-port $bridge $dev $vif_details
> else
>     echo Skipping command $vsctl --timeout=30 -- --if-exists del-port $dev -- add-port $bridge $dev $vif_details
> fi

I am not really certain what ugly side-effects this may have. But it does the trick for me.

Update 2013/03/10: A better workaround is to have the above behavior apply only to Solaris VMs. For example, assuming that these are based on the “Solaris 10 (experimental)” template, the following snippet skips the offending command only for the Solaris VMs:

if [[ $dev != tap* ]]; then
    $vsctl --timeout=30 -- --if-exists del-port $dev -- add-port $bridge $dev $vif_details
else
    xe vm-list dom-id=$DOMID params=name-description | grep 'Solaris 10' >/dev/null 2>&1 || \
        $vsctl --timeout=30 -- --if-exists del-port $dev -- add-port $bridge $dev $vif_details
fi

Solaris, VMWare & VGT mode

February 16, 2012

Today I had the strangest of problems. In a VMWare-based testbed with a bunch of mixed systems (F5 virtual appliances, a Linux host, 3 Solaris servers) I was facing severe connectivity issues with the Solaris hosts. Specifically, with all systems connected on VLAN 162 (L3 addressing: 172.16.2.0/24), anything TCP-related from the Solaris hosts failed. The F5 and Linux virtual machines had no problem whatsoever.

I quickly fired up my trusty tcpdump to figure out what was wrong. Then I issued a simple ICMP echo from a Solaris host to the load balancer to see what happens:

solaris-1# ping 172.16.2.1
172.16.2.1 is alive


[root@loadbalancer-1:Active] config # tcpdump -q -n -i ingress-lb not arp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ingress-lb, link-type EN10MB (Ethernet), capture size 108 bytes
19:43:05.048167 IP 172.16.2.21 > 172.16.2.1: ICMP echo request, id 12766, seq 0, length 64
19:43:05.048215 IP 172.16.2.1 > 172.16.2.21: ICMP echo reply, id 12766, seq 0, length 64

Nice. ICMP works. Everything looks nice in the packet capture. Now let’s try some TCP traffic for a change:

solaris-1:/root# telnet 172.16.2.1 22
Trying 172.16.2.1...


[root@loadbalancer-1:Active] config # tcpdump -q -n -i ingress-lb not arp
19:44:06.816663 IP 172.16.233.49.windb > 172.16.2.1.ssh: tcp 0
19:44:07.949006 IP 172.16.233.48.windb > 172.16.2.1.ssh: tcp 0
19:44:10.215576 IP 172.16.233.47.windb > 172.16.2.1.ssh: tcp 0
19:44:14.730324 IP 172.16.233.46.windb > 172.16.2.1.ssh: tcp 0
19:44:23.739898 IP 172.16.85.195.windb > 172.16.2.1.ssh: tcp 0

Du-oh. The packet reaches the load balancer alright, but the source IP is corrupted. Googling didn’t really help: other people have run into this or similar issues, but no solution. Pinging my skilled colleague Leonidas didn’t help either; he was as baffled as I was. And then it hit me.

solaris-1# echo "Clickety-click; disabling checksum offload" && echo "set ip:dohwcksum=0" >> /etc/system
Clickety-click; disabling checksum offload

solaris-1:/root# telnet 172.16.2.1 22
Trying 172.16.2.1...
Connected to 172.16.2.1.
Escape character is '^]'.
SSH-2.0-OpenSSH_4.3

Uh! The joy! Too bad that two minutes after I figured this out Leonidas had signed off for the day, so I can only brag about this in my blog 🙂
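
One footnote: /etc/system is only read at boot, so strictly speaking the setting needs a reboot to stick. If a reboot is not an option, the same kernel variable can, I believe, be flipped at runtime with mdb (a hedged, untested sketch):

solaris-1# echo 'dohwcksum/W 0' | mdb -kw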

Apparmor (synonyms: selinux, crap)

February 8, 2012

Today’s fun was with apparmor. What should have been a simple MySQL statement to load a bunch of data from a file into a database:


mysql> LOAD DATA INFILE '/var/tmp/some_log_file'
-> INTO TABLE entries
-> FIELDS TERMINATED BY ',';
ERROR 29 (HY000): File '/var/tmp/cosmote.ro.osn1z0.web_access.log.0.4' not found (Errcode: 13)

… was constantly failing for no good reason. It took something like 30 minutes of pointless online searching until it hit me:


# tail -0f /var/log/syslog
Feb 8 19:11:44 hs21-a kernel: [15359.215686] type=1400 audit(1328721104.742:113): apparmor="DENIED" operation="open" parent=1 profile="/usr/sbin/mysqld" name="/var/tmp/cosmote.ro.osn1z0.web_access.log.0.4" pid=15623 comm="mysqld" requested_mask="r" denied_mask="r" fsuid=105 ouid=0
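
For the record, the clean fix is to teach the mysqld profile about the extra path and reload it; a sketch, assuming the stock Ubuntu profile location:

# add a read rule inside the /usr/sbin/mysqld profile block, e.g.:
#   /var/tmp/* r,
vi /etc/apparmor.d/usr.sbin.mysqld
apparmor_parser -r /etc/apparmor.d/usr.sbin.mysqld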

Well I guess it’s just like SELinux. There is a parallel universe out there where apparmor just works. Just not this one.

ARGH!


Selinux & POLA

July 21, 2011

Selinux is crap. Sorry redhat fun boys but its true. Not even in redhat’s documentation doesnt have enough info.

via E.Balaskas

My own experience with SELinux today? A virtual machine with a forgotten root password. OK, that’s easy: boot in single-user mode, type passwd(1), enter the new root password, reboot. I mean, the process is documented in a shitload of pages (example) and has been working like that since… I don’t know, 1996? Should be a piece of cake, right?

NOOOOOOOOOOOOOOOOO!

You see, this is SELinux. There are procedures to follow; “passwd root” just won’t work in single-user mode and will exit immediately without a prompt. A well-defined procedure that has been working for ages is now broken. Oh well …


# echo 0 >/selinux/enforce
# passwd root
Changing password for user root.
New password:
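
For the record, the other common escape hatch is booting with selinux=0 on the kernel command line. If you change the password that way, with SELinux fully disabled, the standard advice is to queue a full relabel before rebooting, since the freshly written /etc/shadow will be unlabelled:

# touch /.autorelabel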

Oh well, I am fairly certain that there is one out of more than a billion parallel universes where SELinux just works. Just one though.

References: POLA

vpnc & windows 7: sleep a little bit

February 18, 2011

For quite some time I’ve been using vpnc within Cygwin to connect to the aging Cisco VPN 3000 Series Concentrator at dayjob (thank you Cisco for not supporting 64-bit users; as Ilias points out in the comments, Cisco has finally added partial support for Windows 7 64-bit). However, I kept running into an erratic problem where my split-tunnel routes were created incorrectly and didn’t work. Specifically, once a VPN connection got created, route print indicated routes similar to the following:

# route print
Network Destination        Netmask          Gateway       Interface  Metric
         10.0.0.0        255.0.0.0      10.8.11.245     192.168.1.65     31

instead of the proper one:

Network Destination        Netmask          Gateway       Interface  Metric
         10.0.0.0        255.0.0.0         On-link       10.8.11.245     31

I conveniently ignored the problem for some time, using a custom script to tear down and re-create the problematic routing entries, till today. Some well-placed “echos” in /etc/vpnc/vpnc-script-win.js indicated that vpnc properly constructed the required route add commands, yet the routing table entries were still wrong. Clickety-click:

$ diff -U 1 /etc/vpnc/vpnc-script-win.js /etc/vpnc/vpnc-script-win-BEDC.js
--- /etc/vpnc/vpnc-script-win.js        2010-09-18 13:13:25.778339100 +0300
+++ /etc/vpnc/vpnc-script-win-BEDC.js   2011-02-18 21:35:53.279264500 +0200
@@ -80,2 +80,4 @@
         if (env("CISCO_SPLIT_INC")) {
+               echo("sleeping a little bit; don't ask why but this is needed");
+               run("sleep 5");
                for (var i = 0 ; i < parseInt(env("CISCO_SPLIT_INC")); i++) {

It seems that a timing issue of some sort causes these route add commands to run prematurely, before the TAP tunnel interface is properly configured, resulting in a problematic configuration. Holding them back for just 5 seconds consistently does the trick for me.

Update: if generally interested in configuring VPNC with Windows, check out Alessio Molteni’s detailed post.
Update 2: Corrected status of the official Cisco VPN client.

Opennebula: dhcpd contextualization magic

February 17, 2011

One of the most frequent questions on the Opennebula lists relates to network contextualization of virtual machines (VMs). Specifically, contrary to Eucalyptus or Nimbus, Opennebula does not directly manage a DHCP server. Instead, Opennebula:

  • manages just MAC addresses
  • suggests using a simple rule for extracting the IPv4 address from the MAC address within the VM

This moves the burden of IPv4 configuration to the VM operating system, which has to dynamically calculate the IPv4 address details based on each interface’s MAC address. While Opennebula provides a relevant sample VM template and script to do this, it comes up a little bit short. Specifically, the script is Linux-specific; it will probably not work with other Unix OSes like Solaris or FreeBSD, let alone Windows. In addition, extra work is required to configure additional required network parameters, like a default router or a DNS server.
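
To give a flavour: on a Linux guest the rule boils down to something like the following sketch, assuming Opennebula’s default convention of encoding the IPv4 address in the last four octets of the MAC (e.g. 02:00:c0:a8:01:05 maps to 192.168.1.5):

# extract eth0's MAC and decode the trailing four octets into an IPv4 address (bash)
MAC=$(ip link show eth0 | awk '/link\/ether/ {print $2}')
IFS=: read -r p1 p2 o1 o2 o3 o4 <<< "$MAC"
printf '%d.%d.%d.%d\n' "0x$o1" "0x$o2" "0x$o3" "0x$o4"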
(more…)

Solaris: cloning an iSCSI LUN

October 21, 2010

While I eventually nailed down a combination of ramdisk and golden Solaris container images for a diskless-boot architectural prototype I had to implement for dayjob, I did initially toy around with iSCSI.

I ended up rejecting iSCSI mainly due to the additional requirements it places on the storage subsystem. A single ramdisk may be used by multiple nodes in the cluster: each client loads the ramdisk and then self-customizes the filesystem for host-specific parameters in local RAM. Contrast this with iSCSI, which requires a separate iSCSI LUN per client. The cost is not just about extra storage (which could be minimal in the presence of cloning and deduplication); there is an increased management cost (maintaining 10 LUNs vs. a single ramdisk) as well as increased CAPEX and OPEX due to the presence of an extra SAN. Specifically, you can’t really expect to have a highly available iSCSI solution with non-dedicated h/w, whereas a similar HA solution with ramdisks is trivial to set up and just needs two DHCP + TFTP servers (coupled with NIC bonding for extra redundancy).

That said, I thought I’d write down some high-level notes on the pain of cloning an iSCSI LUN containing a Solaris installation. I can use them as a reference in the future, or (if I’m lucky) someone will run into this blog post and suggest a more graceful approach.

  1. Set up an iSCSI LUN: it doesn’t really matter how you’ll do it. For my setup I used the Solaris iSCSI target (greetz to @c0t0d0s0 for yet another excellent tutorial)
  2. Install Solaris on the iSCSI LUN: Captain Jack provides a thorough step-by-step guide with screenshots of the relevant steps (I will admit wondering whether one can automate the process with Jumpstart and pre-install scripts, but I never got there)
  3. Boot the newly installed node for the first time, make any site-specific changes you need and then shut it down. Forget this LUN from now on; it will be your “golden image”
  4. Clone the iSCSI LUN to a new one. This step is really dependent on your SAN. If you are using ZFS the steps are probably something as simple as the following:

    # zfs snapshot rpool/iscsi/lun0@golden
    # zfs clone rpool/iscsi/lun0@golden rpool/iscsi/lun1

  5. Add the LUN to an existing or new iSCSI target and get its GUID:

    # iscsitadm create target -u 1 -b /dev/zvol/rdsk/rpool/iscsi/lun1 -t mytarget
    # iscsitadm list target -v mytarget
    Target: mytarget
        iSCSI Name: iqn.1986-03.com.sun:02:9c23130f-1d8e-6b20-8e95-a6ab8a227924.mytarget
        Connections: 1
            Initiator:
                iSCSI Name: iqn.1986-03.com.sun:01:ba78c2f3ffff.49b911ad
                Alias: unknown
        ACL list:
        TPGT list:
        LUN information:
    ...
            LUN: 1
                GUID: 600144f04caf16fb00000c29324dee00
                VID: SUN
                PID: SOLARIS
                Type: disk
                Size: 4.0G
                Backing store: /dev/zvol/rdsk/rpool/iscsi/lun1
                Status: online
    ...

  6. Configure a new system to boot from your newly created iSCSI LUN. Here is how a DHCP reservation for gPXE looks:

    host  {
      hardware ethernet ;
      fixed-address                   ;
      option routers                  ;
      option subnet-mask              ;
      option domain-name-servers      ;
      filename                      "";
      # iscsi root-path format        iscsi::[protocol]:[port]:[LUN]:
      option root-path
        "iscsi::::1:iqn.1986-03.com.sun:02:9c23130f-1d8e-6b20-8e95-a6ab8a227924.mytarget";
    }
    
    

Neat. You installed Solaris on a LUN and you cloned the LUN. One would expect that you can repeat this process as many times as necessary and, by changing just the LUN id in gPXE, boot as many Solaris systems as you want, right? WRONG!

Turns out that the Solaris installer “burns” the iSCSI boot device identifier into the root filesystem during installation. In fact, it does a pretty good job of “burning” it all over the place, making your life miserable when it comes to cloning an iSCSI LUN and re-using it for another system. So you have to jump through some extra hoops, otherwise you will just get a nice kernel panic. The following steps assume that you are using UFS (don’t ask!), but they would probably work similarly with ZFS as well.

  1. Mount the newly cloned iSCSI LUN from a Solaris system. This could be the iSCSI target itself if you are using Solaris for that task. Do notice the slight difference between the iSCSI target device and the device we are actually mounting:

    # iscsiadm modify discovery -t enable
    # iscsiadm list target -S
    Target: iqn.1986-03.com.sun:02:9c23130f-1d8e-6b20-8e95-a6ab8a227924.mytarget
            Alias: asmrootufs
            TPGT: 1
            ISID: 4000002a0000
            Connections: 1
            LUN: 0
                 Vendor:  SUN
                 Product: SOLARIS
                 OS Device Name: /dev/rdsk/c2t600144F04CADE09C00000C29324DEE00d0s2
            LUN: 1
                 Vendor:  SUN
                 Product: SOLARIS
                 OS Device Name: /dev/rdsk/c2t600144F04CAF16FB00000C29324DEE00d0s2
    ...
    # ls -l /dev/rdsk/c2t600144F04CAF16FB00000C29324DEE00d0s2
    lrwxrwxrwx  -> ../../devices/scsi_vhci/disk@g600144f04caf16fb00000c29324dee00:c,raw
    # mount /devices/scsi_vhci/disk\@g600144f04caf16fb00000c29324dee00\:a /mnt/foo/

  2. Keep a note of the disk path above: “/devices/scsi_vhci/disk@g600144f04caf16fb00000c29324dee00:a”. You’re going to need it
  3. Edit the files boot/solaris/bootenv.rc, etc/path_to_inst and etc/vfstab (relative to /mnt/foo). In them you will find references to the iSCSI LUN 0 device which was used as our golden image (cf. the iscsiadm command above). Change these to the “/devices” path corresponding to our iSCSI LUN 1
  4. Do a recursive grep (find /mnt/foo -type f | xargs grep <old-device-path>) for any other occurrences of the old iSCSI LUN. I think the above step covers everything, but I played it from an old note and it may miss something
  5. Update the boot archive in the new LUN:

    # bootadm update-archive -R /mnt/foo

  6. Manually create the required symlink under /dev/dsk:

    # cd /mnt/foo/dev/dsk
    # ln -s ../../devices/scsi_vhci/disk\@g600144f04caf16fb00000c29324dee00\:a c2t600144F04CAF16FB00000C29324DEE00d0s0

  7. Unmount “/mnt/foo” and reboot your target node; now everything should work like a charm
  8. Profit!
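
Bonus: if you need to stamp out several of these, the cloning half at least scripts nicely with the commands from the first list. A sketch, assuming the ZFS layout and target naming used above (the /devices fixups of the second list unfortunately remain manual):

#!/bin/sh
# clone-lun.sh N: clone the golden image into lunN and expose it as LUN N of mytarget
N=$1
zfs clone rpool/iscsi/lun0@golden rpool/iscsi/lun$N
iscsitadm create target -u $N -b /dev/zvol/rdsk/rpool/iscsi/lun$N -t mytarget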