Jan 31 2010

The abomination known as IPV6_V6ONLY

Let me say it straight away: I strongly disagree with section 5.3 of RFC 3493 that IPV6_V6ONLY should be disabled by default. With this option disabled it's impossible to create a server socket in a truly protocol-independent way.

When RFC 2553 was published in 1999 it described a set of functions which can be used to enumerate and resolve socket addresses in a protocol-independent way. But with the release of RFC 3493, and its introduction of IPV6_V6ONLY, the protocol-independent aspect of these functions has been largely lost. The following example shows why:

One of the functions described in RFC 2553 is getaddrinfo. This function can be used to get a list of socket addresses suitable for binding a server-side socket. On dual-stacked hosts it will return two addresses: one for IPv4 and one for IPv6. But when IPV6_V6ONLY is disabled, you can't have two sockets bound to the two addresses: the IPv6 wildcard socket also claims the IPv4 side, so the second bind will fail with EADDRINUSE. You can either find out which of the returned addresses are IPv6 and use only those, or you can enable IPV6_V6ONLY on the IPv6 socket. But as you can see, neither of these solutions is really protocol-independent anymore.
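To make the second workaround concrete, here is a minimal sketch in Python (chosen only for brevity; the socket calls map one-to-one onto the C API). The function name bind_all and the port number are made up for illustration. Note the single AF_INET6 check in the middle: that is exactly the protocol-specific wart I'm complaining about.

```python
import socket

def bind_all(port, host=None):
    """Bind one listening socket per address returned by getaddrinfo.

    bind_all is a hypothetical helper, not part of any library.
    """
    sockets = []
    infos = socket.getaddrinfo(host, port, socket.AF_UNSPEC,
                               socket.SOCK_STREAM, 0, socket.AI_PASSIVE)
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            sock = socket.socket(family, socktype, proto)
        except OSError:
            continue  # address family not usable on this host
        try:
            if family == socket.AF_INET6:
                # The protocol-specific step: without it, the IPv6 socket
                # and the IPv4 socket fight over the same port.
                sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 1)
            sock.bind(sockaddr)
            sock.listen(5)
        except OSError:
            sock.close()
            continue  # skip addresses we cannot bind
        sockets.append(sock)
    return sockets

if __name__ == "__main__":
    # 8080 is an arbitrary example port.
    for s in bind_all(8080):
        print(s.family, s.getsockname())
        s.close()
```

If IPV6_V6ONLY were enabled by default, the if block could simply be dropped and the loop would not need to know anything about address families.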

From what I gathered, most *BSD distributions have IPV6_V6ONLY enabled by default; only Solaris and Linux don't. At least Debian is starting to break the ice and has enabled IPV6_V6ONLY in their latest netbase package (4.40). That has broken a few other packages, but as the bug headline says, those are buggy and thus need to be fixed anyway.


Dec 29 2009

What are the monkeys at Sun smoking?

The root pool in my OpenSolaris box is only 12GB. All the important data, as well as everything that needs lots of disk space (home directories, mysql database, mangos, web server etc) is on a separate pool. 12GB is plenty, I thought: the root partitions on my Linux boxes are between six and eight GB, and that includes the full Gnome desktop environment.

But the free space was getting smaller because OpenSolaris keeps regular snapshots, so I thought I could free a bit of space by removing unnecessary packages. After all, the box acts as a server and doesn't need Firefox, Gnome or the xserver. I started with something simple: uninstall Firefox. But first let's check what pkg wants to uninstall so it doesn't remove something I need.

$ pfexec pkg uninstall -rnv SUNWfirefox
None            
Package version changes:
pkg://opensolaris.org/SUNWfirefox@0.5.11,5.11-0.129:20091205T083114Z -> None
pkg://opensolaris.org/SUNWlibproxy@0.5.11,5.11-0.129:20091205T110309Z -> None
pkg://opensolaris.org/SUNWsvn@1.6.5,5.11-0.129:20091205T124829Z -> None
pkg://opensolaris.org/SUNWsvn-perl@1.6.5,5.11-0.129:20091205T124852Z -> None
pkg://opensolaris.org/SUNWneon@0.29.0,5.11-0.129:20091205T113450Z -> None
Actuators:
      restart_fmri: svc:/application/desktop-cache/gconf-cache:default
     reboot-needed: false

Uhm, WTF? It wants to remove Subversion? SUNWsvn depends on SUNWfirefox? Apparently Sun engineers thought that would be a good idea, so here we are. And it gets even more fucked up if you try to remove SUNWxorg-server (the xserver). Amongst all the gnome packages (which have no reason to depend on the xserver, but I'm fine with removing those) you'll find SUNWbash, SUNWsshd, SUNWgzip, SUNWmysql51 - yes, you did read that right, in OpenSolaris all these packages depend on the xserver. What are the monkeys at Sun smoking? Whatever it is, it must be really good!


Dec 03 2009

Why you should use alpha software

Quite often I run into problems when trying to build software on my OpenSolaris box. Usually it has to do with either the build system (GNU autotools, gettext etc) or with the compiler (gcc or Sun's own compiler). Today again. And even though the bug is known - and shockingly trivial to fix (basically a packaging error) - Sun hasn't managed to fix the package in almost a year.

Another similar issue is the git package that Sun supplies with OpenSolaris. The version is stuck at v1.5.6.5, which was released about 14 months ago. In case you don't follow git development, v1.5 is regarded as ancient and, more importantly, lacks many useful features. I know stability is very important for Sun, but:

  • Sun explicitly set the Interface Stability attribute of git to Uncommitted, which means that Sun doesn't guarantee source or binary compatibility (see man attributes).
  • Sun has no problems with releasing a new ON (kernel) build every two weeks, but they haven't released a new git package in over a year. There have been various regressions in ON builds since I started using OpenSolaris, but I have yet to come across a single issue when using git, which I'm compiling directly from its source repository every few days.

Sometimes the cost of using alpha software is far less than that of putting up with ancient versions. This is especially true when using open source software in an otherwise locked down environment. Even though you might need to fix one or two bugs before you can compile and install the software, it could give you an advantage later on due to the new features that the software provides. And if you don't want to fix it yourself, just submit a bug report: most projects respond rather quickly and fixes are pushed into the repositories quite fast.

My OpenSolaris box already feels like Gentoo because I have to compile so many packages from source: git, cvsps, mediatomb, notmuch, perl, python, ruby, rrdtool and screen, just to name a few that I have installed in ~/local. Today I had to add autoconf, automake and gettext, which I installed into /opt/sbe. SBE stands for Sane Build Environment, which, in contrast to the OpenSolaris Common Build Environment, actually tries to create a sane build environment for my software.


Jul 05 2009

DTrace parser for gprof2dot

In an earlier blog post I mentioned that I use gprof2dot to visualize data generated with dtrace. I've just released the code and it's available from the github repository. Use the supplied dtrace script (gprof2dot.d) to start the application you want to profile and when you are done, feed the output to gprof2dot.py:

$ ./gprof2dot.d -c 'git log -C -M -M master' -o /tmp/dtrace.out
$ ./gprof2dot.py -f dtrace -o graph.dot /tmp/dtrace.out

DTrace is very flexible and allows many different forms of profiling. The script I wrote creates a probe for every single function and records who called that function, how many times that function was called and how much time was spent in it. This kind of profile is very accurate but also lowers the performance of the application because a considerable amount of time is spent in dtrace. However, if that is an issue, it would be trivial to change the dtrace script to use the profile provider or to limit the number of probes in some other way.


Jun 24 2009

MacOSX Time Machine meets ZFS/iSCSI

I would much rather use ZFS than Time Machine, but Apple seems to have decided to remove ZFS from their next Mac OSX release. One feature in Time Machine is particularly intriguing, though: backup over a wireless network. Apple Time Capsule is a wireless access point with a built-in hard drive. Whenever your Mac sees the Time Capsule, it will send the backup there. But there is another way to back up with Time Machine wirelessly: iSCSI.

There are quite a few blog posts and discussions about how to set up Time Machine to back up to ZFS/iSCSI. Setting up iSCSI on OpenSolaris was surprisingly easy. I only had to install one package (SUNWiscsitgt) and then set shareiscsi=on on the ZFS volume. And voila, I was able to mount the disk from my MacBook using the globalSAN iSCSI Initiator. However, I was really disappointed by the performance. After starting the initial Time Machine backup the throughput was less than 1 MByte/s, but iSCSI should be much faster! It turns out that iSCSI issues many small sync writes, and those make ZFS very slow on conventional hard drives. Some folks in #opensolaris on irc.freenode.net suggested that I should put the ZFS Intent Log (ZIL) on a faster disk such as an SSD or a RAM disk. I know enterprise grade storage systems do this for performance reasons, but I don't really have space in the server to put another disk in (it's a small Shuttle barebone). My solution was to temporarily disable the ZIL. With ZIL disabled I was getting 8-10 MByte/s, which is about the practical limit of my 100 Mbit/s network (12.5 MByte/s before protocol overhead). It's unfortunate that OpenSolaris needs such a workaround; I was hoping for ZFS/iSCSI to be usable without enterprise-grade hardware setups.