<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>the back burner</title>
	<atom:link href="http://jang.blogs.ilrt.org/feed/" rel="self" type="application/rss+xml" />
	<link>http://jang.blogs.ilrt.org</link>
	<description></description>
	<lastBuildDate>Wed, 28 Oct 2009 16:28:20 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Oracle calendar to iPhone external calendar (.ics format)</title>
		<link>http://jang.blogs.ilrt.org/2009/10/28/oracle-calendar-to-iphone-external-calendar-ics-format/</link>
		<comments>http://jang.blogs.ilrt.org/2009/10/28/oracle-calendar-to-iphone-external-calendar-ics-format/#comments</comments>
		<pubDate>Wed, 28 Oct 2009 16:02:45 +0000</pubDate>
		<dc:creator>jang</dc:creator>
				<category><![CDATA[haste]]></category>
		<category><![CDATA[calendar]]></category>
		<category><![CDATA[iPhone]]></category>
		<category><![CDATA[oracle calendar]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://jang.blogs.ilrt.org/?p=59</guid>
		<description><![CDATA[More here:
https://svn.cse.bris.ac.uk/svn/jan/trunk/calendar/
This represents a bit of a rejigging of the oracle_calendar.py to give Events the ability to turn themselves into iCalendar-style records. They also pick up vAlarms.

The Event output isn&#8217;t (yet) complete; most importantly, attendees aren&#8217;t re-serialized. However, it suffices to be able to provide a one-way output to an unjailbroken iPhone &#8211; or anything [...]]]></description>
			<content:encoded><![CDATA[<p>More here:</p>
<p><a href="https://svn.cse.bris.ac.uk/svn/jan/trunk/calendar/">https://svn.cse.bris.ac.uk/svn/jan/trunk/calendar/</a></p>
<p>This represents a bit of a rejigging of the <code>oracle_calendar.py</code> to give Events the ability to turn themselves into iCalendar-style records. They also pick up vAlarms.<br />
<span id="more-59"></span><br />
The Event output isn&#8217;t (yet) complete; most importantly, attendees aren&#8217;t re-serialized. However, it suffices to be able to provide a one-way output to an unjailbroken iPhone &#8211; or anything else that can consume the <code>.ics</code> format.</p>
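<p>For a feel of what "turning themselves into iCalendar-style records" involves, here is a minimal sketch in Python 3. It is not the code from <code>oracle_calendar.py</code>: the <code>Event</code> and <code>Alarm</code> classes, their constructor arguments and the PRODID string are all made up for illustration. Attendees are left out, as above, and long lines are not folded.</p>

```python
from datetime import datetime, timezone


def ical_time(moment):
    """Format an aware datetime as an iCalendar UTC timestamp."""
    return moment.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def ical_text(value):
    """Escape backslash, semicolon, comma and newline in a TEXT value."""
    for raw, escaped in (("\\", "\\\\"), (";", "\\;"), (",", "\\,"), ("\n", "\\n")):
        value = value.replace(raw, escaped)
    return value


class Alarm:
    """Hypothetical stand-in for a vAlarm that fires before the event starts."""

    def __init__(self, minutes_before, description="Reminder"):
        self.minutes_before = minutes_before
        self.description = description

    def to_ical_lines(self):
        return [
            "BEGIN:VALARM",
            "ACTION:DISPLAY",
            "DESCRIPTION:" + ical_text(self.description),
            "TRIGGER:-PT%dM" % self.minutes_before,
            "END:VALARM",
        ]


class Event:
    """Hypothetical event that can serialise itself; attendees are omitted."""

    def __init__(self, uid, summary, start, end, alarms=()):
        self.uid = uid
        self.summary = summary
        self.start = start
        self.end = end
        self.alarms = list(alarms)

    def to_ical_lines(self, stamp=None):
        stamp = stamp or datetime.now(timezone.utc)
        lines = [
            "BEGIN:VEVENT",
            "UID:" + self.uid,
            "DTSTAMP:" + ical_time(stamp),
            "DTSTART:" + ical_time(self.start),
            "DTEND:" + ical_time(self.end),
            "SUMMARY:" + ical_text(self.summary),
        ]
        for alarm in self.alarms:
            lines.extend(alarm.to_ical_lines())
        lines.append("END:VEVENT")
        return lines


def to_calendar(events):
    """Wrap events in a VCALENDAR, using the CRLF line endings iCalendar expects."""
    lines = ["BEGIN:VCALENDAR", "VERSION:2.0", "PRODID:-//example//calendar-sketch//EN"]
    for event in events:
        lines.extend(event.to_ical_lines())
    lines.append("END:VCALENDAR")
    return "\r\n".join(lines) + "\r\n"
```

<p>The sketch emits ACTION:DISPLAY alarms because they need the fewest properties; the ACTION:EMAIL alarms that Oracle produces carry more fields, which this does not attempt.</p>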
<p>This is coupled with a little flup-based AJP service. I originally used the flup fcgi WSGIServer instead; however, I was having trouble configuring it with Apache 2.2 and mod_fastcgi (I had multiple FastCGI services, and wanted to pass HTTP_AUTHORIZATION through to only some of them; it looks like mod_fcgid manages all this a little better), so I wound up using the AJP connector instead &#8211; without a hitch.</p>
<p>The service requires HTTP-Basic authentication, which it passes through sight unseen to the Oracle Calendar &#8220;soap&#8221; service.</p>
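<p>Passing the credentials through "sight unseen" might look roughly like the WSGI sketch below (Python 3, standard library only). This is an assumption-laden illustration rather than the real <code>calendar.fcgi</code>: the backend URL, the <code>make_app</code> and <code>build_backend_request</code> names and the <code>fetch_ics</code> callback are all hypothetical.</p>

```python
import urllib.request

# Hypothetical backend address; the real Oracle Calendar endpoint is not given here.
BACKEND_URL = "http://calendar.example.invalid/soap"


def build_backend_request(environ, body, backend_url=BACKEND_URL):
    """Copy the client's Basic credentials onto the backend request, undecoded."""
    auth = environ.get("HTTP_AUTHORIZATION", "")
    if not auth.lower().startswith("basic "):
        return None
    request = urllib.request.Request(backend_url, data=body, method="POST")
    request.add_header("Authorization", auth)
    request.add_header("Content-Type", "text/xml; charset=utf-8")
    return request


def make_app(fetch_ics, soap_body=b""):
    """Build a WSGI app; fetch_ics(request) must return the .ics payload as bytes."""

    def application(environ, start_response):
        request = build_backend_request(environ, soap_body)
        if request is None:
            # No Basic credentials supplied: ask the client for some.
            start_response("401 Unauthorized", [
                ("WWW-Authenticate", 'Basic realm="calendar"'),
                ("Content-Type", "text/plain"),
            ])
            return [b"authentication required\n"]
        payload = fetch_ics(request)
        start_response("200 OK", [
            ("Content-Type", "text/calendar; charset=utf-8"),
            ("Content-Length", str(len(payload))),
        ])
        return [payload]

    return application
```

<p>The service never decodes the header; it only checks that it is a Basic one and hands it on. The resulting <code>application</code> is the kind of object a WSGI server (flup's AJP one, in my case) would be given to serve.</p>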
<p>I&#8217;ve probably complained about that before now. It&#8217;s not really a SOAP service. It&#8217;s a mostly-incomplete access point that will let a user see only their own calendar (no proxy or delegate authentication options, no way to choose another visible calendar) that accepts a precisely-formatted sublanguage of valid SOAP requests; mostly by dint of being half-implemented with a hand-written parser, as far as I can tell. The parser, amongst other things, is extremely sensitive to having namespace declarations in the right place with the right prefixes, and so on. Oh, and if you see an empty calendar (and your server&#8217;s throwing <code>expat</code> exceptions) that&#8217;s because Oracle calendar won&#8217;t always return well-formed XML. Mmm, taste the quality.</p>
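<p>One way to stop malformed replies from silently turning into an empty calendar is to catch the parser error and report it as such. A small sketch, assuming the reply is parsed with the standard library's <code>minidom</code> (which raises <code>ExpatError</code>); <code>parse_reply</code> and <code>MalformedReply</code> are names invented for this example.</p>

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


class MalformedReply(Exception):
    """Raised when the backend reply is not well-formed XML."""


def parse_reply(payload):
    """Parse a reply, converting expat failures into an explicit error."""
    try:
        return minidom.parseString(payload)
    except ExpatError as error:
        raise MalformedReply(
            "backend returned malformed XML at line %d, column %d"
            % (error.lineno, error.offset)
        ) from error
```

<p>The caller can then decide whether to retry, return an error status, or serve the last good copy, rather than an empty feed.</p>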
<p>Anyway, this is enough to get simple calendar details slurpable through to an <em>external calendar</em> on the iPhone. It&#8217;s set up as follows:</p>
<pre>
&lt;Proxy balancer://fcgi&gt;
    BalancerMember ajp://cmjg.localhost:8642
&lt;/Proxy&gt;
</pre>
<p>and then</p>
<pre>
    ProxyPass /some/url/path balancer://fcgi/some/url/path
    &lt;Location /some/url/path&gt;
       Allow from all
    &lt;/Location&gt;
</pre>
<p>The <code>calendar.fcgi</code> is launched externally to Apache.</p>
<p>I&#8217;ve run one of these up for testing and it suffices for an aide memoire; the iPhone turns vAlarms (of any kind, the ACTION:EMAIL ones that Oracle produces in particular) into bleeps and dings.</p>
]]></content:encoded>
			<wfw:commentRss>http://jang.blogs.ilrt.org/2009/10/28/oracle-calendar-to-iphone-external-calendar-ics-format/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Solaris cluster, MPxIO, Zpools.</title>
		<link>http://jang.blogs.ilrt.org/2009/06/26/solaris-cluster-mpxio-zpools/</link>
		<comments>http://jang.blogs.ilrt.org/2009/06/26/solaris-cluster-mpxio-zpools/#comments</comments>
		<pubDate>Fri, 26 Jun 2009 11:24:49 +0000</pubDate>
		<dc:creator>jang</dc:creator>
				<category><![CDATA[haste]]></category>
		<category><![CDATA[cluster]]></category>
		<category><![CDATA[mpxio]]></category>
		<category><![CDATA[san]]></category>
		<category><![CDATA[satabeast]]></category>
		<category><![CDATA[solaris]]></category>

		<guid isPermaLink="false">http://jang.blogs.ilrt.org/?p=56</guid>
		<description><![CDATA[This is remarkably straightforward.
Two nodes, alike in dignity. We preconfigured MPxIO on them prior to installing the cluster software. We also allocated a small LUN that was visible to both nodes prior to the installation, and made sure they could see it: this was intended to act as a quorum device.

That just worked. Did the [...]]]></description>
			<content:encoded><![CDATA[<p>This is remarkably straightforward.</p>
<p>Two nodes, alike in dignity. We preconfigured MPxIO on them prior to installing the cluster software. We also allocated a small LUN that was visible to both nodes prior to the installation, and made sure they could see it: this was intended to act as a quorum device.<br />
<span id="more-56"></span><br />
That just worked. Did the install on the first node (<em>moose</em>); it allocated a DID to the multipathed device, no problems. Bring in the second node (<em>moron</em>) and <code>scinstall</code> wound up identifying the to-be-quorum device as the same DID.</p>
<p>Finished the setup using <code>scsetup</code> to drop out of installmode and we were done. Next steps.</p>
<p>We zoned up the two nodes to be able to see our SATABeast pair that the first data LUNs would be coming off. (Note: I&#8217;d prefer that the quorum device actually hold some data too, but we were getting started first.)</p>
<p>We allocated a single LUN and presented that to <em>moose</em> only. On that host:</p>
<ol>
<li>Use fcinfo to scan for the new LUNs and get their multipathed device nodes created; <code>/nfs/bin/luns</code> is a handy script that does this for you.</li>
<li>Use <code>cldev refresh</code> to allocate a new DID.</li>
</ol>
<p>Fine so far. At that point (we were going a step at a time to check this) we LUN masked <em>moron</em> to be able to see the new LUN too, and repeated the process there. The existing DID now showed its availability on both nodes.</p>
<p>This worked so smoothly that in future I think we could simply mask all nodes to see the LUN, run <code>/nfs/bin/luns</code> on each node, then use <code>cldev refresh</code> all in one go.</p>
]]></content:encoded>
			<wfw:commentRss>http://jang.blogs.ilrt.org/2009/06/26/solaris-cluster-mpxio-zpools/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Zenoss (general Zope) behind an Apache proxy.</title>
		<link>http://jang.blogs.ilrt.org/2009/05/28/zenoss-general-zope-behind-an-apache-proxy/</link>
		<comments>http://jang.blogs.ilrt.org/2009/05/28/zenoss-general-zope-behind-an-apache-proxy/#comments</comments>
		<pubDate>Thu, 28 May 2009 13:53:01 +0000</pubDate>
		<dc:creator>jang</dc:creator>
				<category><![CDATA[haste]]></category>
		<category><![CDATA[apache]]></category>
		<category><![CDATA[monitoring]]></category>
		<category><![CDATA[proxy]]></category>
		<category><![CDATA[zenoss]]></category>
		<category><![CDATA[zope]]></category>

		<guid isPermaLink="false">http://jang.blogs.ilrt.org/?p=51</guid>
		<description><![CDATA[Zenoss&#8217; web server lives on Zope. This doesn&#8217;t sit perfectly well behind apache because we&#8217;ve configured it to listen to localhost:8080; that is the address it will default to stashing in its response pages.

That means that the Zenoss login page contains a form that submits to localhost:8080 (ie, a server that isn&#8217;t running on the [...]]]></description>
			<content:encoded><![CDATA[<p>Zenoss&#8217; web server lives on Zope. This doesn&#8217;t sit perfectly well behind apache because we&#8217;ve configured it to listen to localhost:8080; that is the address it will default to stashing in its response pages.<br />
<span id="more-51"></span><br />
That means that the Zenoss login page contains a form that submits to localhost:8080 (ie, a server that isn&#8217;t running on the same machine as the web browser, usually) and you don&#8217;t get very far.</p>
<p>Zope (modern Zope) comes with a configured object that lets your proxy supply alternative server host and port settings. The following suffices in /etc/httpd/conf.d/zenoss-proxy.conf:</p>
<p><code>      Order deny,allow<br />
      Deny from all<br />
      Allow from 137.222.12.<br />
      Allow from 127.0.0.1<br />
      ProxyPass http://127.0.0.1:8080/VirtualHostBase/http/hita.cse.bris.ac.uk:80/VirtualHostRoot/</code></p>
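<p>The backend URL in that ProxyPass line follows a fixed pattern: the real Zope address, then <code>VirtualHostBase</code>, the public scheme, the public host and port, and <code>VirtualHostRoot</code>. A small Python sketch that assembles it (the function name and parameters are made up for illustration):</p>

```python
def vhm_backend_url(backend, scheme, public_host, public_port, path=""):
    """Build the rewritten backend URL for a proxy sitting in front of Zope."""
    return "%s/VirtualHostBase/%s/%s:%d/VirtualHostRoot/%s" % (
        backend.rstrip("/"),
        scheme,
        public_host,
        public_port,
        path.lstrip("/"),
    )


# Rebuilds the URL used in the configuration above.
url = vhm_backend_url("http://127.0.0.1:8080", "http", "hita.cse.bris.ac.uk", 80)
```

<p>Zope then uses the scheme, host and port embedded in the path, instead of localhost:8080, when it writes URLs into its pages.</p>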
]]></content:encoded>
			<wfw:commentRss>http://jang.blogs.ilrt.org/2009/05/28/zenoss-general-zope-behind-an-apache-proxy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Braindump of FC multipathing on Linux</title>
		<link>http://jang.blogs.ilrt.org/2009/03/27/braindump-of-fc-multipathing-on-linux/</link>
		<comments>http://jang.blogs.ilrt.org/2009/03/27/braindump-of-fc-multipathing-on-linux/#comments</comments>
		<pubDate>Fri, 27 Mar 2009 11:35:01 +0000</pubDate>
		<dc:creator>jang</dc:creator>
				<category><![CDATA[haste]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[multipathing]]></category>
		<category><![CDATA[san]]></category>
		<category><![CDATA[scsi]]></category>

		<guid isPermaLink="false">http://jang.blogs.ilrt.org/?p=44</guid>
		<description><![CDATA[Start with the cards:
# /sbin/lspci
04:00.0 Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA (rev 03)
04:00.1 Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA (rev 03)
05:00.0 Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA (rev 03)
05:00.1 Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel [...]]]></description>
			<content:encoded><![CDATA[<p>Start with the cards:</p>
<p><code># /sbin/lspci<br />
04:00.0 Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA (rev 03)<br />
04:00.1 Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA (rev 03)<br />
05:00.0 Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA (rev 03)<br />
05:00.1 Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA (rev 03)</code></p>
<p>Two dual-port cards, effectively.</p>
<p>Kernel module for these:</p>
<p><code># /sbin/lsmod | grep ql<br />
qla2xxx              1079969  0<br />
scsi_transport_fc      73673  1 qla2xxx<br />
scsi_mod              188665  7<br />
sg,qla2xxx,scsi_transport_fc,mptsas,mptscsih,scsi_transport_sas,sd_mod</code></p>
<p>qla2xxx, already loaded.</p>
<p>To find WWNs on RHEL 5.x (CentOS 5.x): WWN information and other FC stuff is under<br />
        <code>/sys/class/scsi_host/hostN/device/fc_host:hostN/port_name</code><br />
for values of N.</p>
<p><code># ls /sys/class/scsi_host/host*/device/fc_host*/port_name<br />
/sys/class/scsi_host/host1/device/fc_host:host1/port_name<br />
/sys/class/scsi_host/host2/device/fc_host:host2/port_name<br />
/sys/class/scsi_host/host3/device/fc_host:host3/port_name<br />
/sys/class/scsi_host/host4/device/fc_host:host4/port_name</code></p>
<p>(there&#8217;s a host0 on this box which is the onboard SAS controller)</p>
<p><code># cat /sys/class/scsi_host/host*/device/fc_host*/port_name<br />
0x2100001bxxxxxxxx<br />
0x2101001bxxxxxxxx<br />
0x2100001dxxxxxxxx<br />
0x2101001dxxxxxxxx</code></p>
<p>So these are the addresses that need zoning.</p>
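<p>Collecting those by hand gets old quickly. A sketch of the same lookup in Python, assuming the RHEL 5 sysfs layout shown above; <code>read_wwpns</code> and <code>format_wwn</code> are names invented here, and the colon-separated form is just the notation most switch zoning tools display.</p>

```python
import glob
import os


def format_wwn(raw):
    """Turn a sysfs value such as 0x2100001b32abcdef into colon-separated pairs."""
    digits = raw.strip().lower()
    if digits.startswith("0x"):
        digits = digits[2:]
    digits = digits.zfill(16)
    return ":".join(digits[i:i + 2] for i in range(0, 16, 2))


def read_wwpns(sysfs_root="/sys/class/scsi_host"):
    """Map each hostN that has an fc_host entry to its formatted port WWN."""
    pattern = os.path.join(sysfs_root, "host*", "device", "fc_host*", "port_name")
    found = {}
    for path in sorted(glob.glob(pattern)):
        # The first path component below the root is the hostN name.
        host = os.path.relpath(path, sysfs_root).split(os.sep)[0]
        with open(path) as handle:
            found[host] = format_wwn(handle.read())
    return found
```

<p>Hosts without an <code>fc_host</code> entry (the onboard SAS controller, say) simply do not match the glob and are skipped.</p>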
<p>Scanning for LUNs:</p>
<p><code># echo "- - -" &gt; /sys/class/scsi_host/host1/scan<br />
[wait a bit]<br />
# more /proc/scsi/scsi</code><br />
[new paths to luns show up, mpaths appear under /dev/mapper if that's configured]</p>
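<p>The three dashes are wildcards for channel, target and LUN. If there are several HBA ports to rescan, the same write can be looped; a minimal Python sketch, with function names of my own choosing and the sysfs root as a parameter so it can be tried against a scratch directory first (on a real box it needs root):</p>

```python
import os


def rescan_host(host, sysfs_root="/sys/class/scsi_host"):
    """Write the wildcard triple to one host's scan file and return its path."""
    scan_file = os.path.join(sysfs_root, host, "scan")
    with open(scan_file, "w") as handle:
        handle.write("- - -\n")
    return scan_file


def rescan_all(sysfs_root="/sys/class/scsi_host"):
    """Rescan every hostN that exposes a scan file; return the hosts touched."""
    touched = []
    for host in sorted(os.listdir(sysfs_root)):
        if os.path.exists(os.path.join(sysfs_root, host, "scan")):
            rescan_host(host, sysfs_root)
            touched.append(host)
    return touched
```

<p>Note that <code>rescan_all</code> would also poke non-FC hosts that have a scan file; pass individual host names to <code>rescan_host</code> if that matters.</p>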
<p>After scanning for luns (so they show up in /proc/scsi/scsi)&#8230;</p>
<p>- <code>/sbin/chkconfig multipathd on</code><br />
- edit /etc/multipath.conf to look like the one on bastet. I made the following changes:</p>
<p><code>  blacklist {<br />
        devnode "*"<br />
  }<br />
  blacklist_exceptions {<br />
        devnode "^sd[b-z].*"<br />
  }</code></p>
<p>(since sda was always the local SAS root device) and modified the defaults {} section to use &#8220;failover&#8221; rather than &#8220;multipath&#8221;, &#8220;multibus&#8221; etc, which appears to be fine.</p>
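<p>To sanity-check which device nodes that blacklist/exception pair lets through, the logic can be mimicked in a few lines of Python. This is only an approximation: it assumes exceptions win over the blacklist, that Python's regex behaves like multipath's for this pattern, and it special-cases the bare <code>"*"</code> (which Python's <code>re</code> would reject) as "match everything". The function name is invented.</p>

```python
import re


def is_multipath_candidate(devnode, blacklist=("*",), exceptions=("^sd[b-z].*",)):
    """Return True if devnode survives the blacklist, exceptions taking priority."""

    def matches(pattern, name):
        # Assumption: a bare "*" stands for "everything".
        if pattern == "*":
            return True
        return re.search(pattern, name) is not None

    if any(matches(pattern, devnode) for pattern in exceptions):
        return True
    return not any(matches(pattern, devnode) for pattern in blacklist)
```

<p>One thing this shows up: <code>sdb</code> through <code>sdz</code> pass, but a box with enough paths to reach <code>sdaa</code> and beyond would find those excluded along with <code>sda</code>, since the second letter has to be in b-z.</p>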
<p>After you&#8217;ve done this you should be able to do the following:</p>
<p><code>/sbin/multipath -v2 -d</code></p>
<p>-d is for dry-run (make no changes). It&#8217;ll tell you that it&#8217;ll make mpath0, mpath1, mpathxxx, tell you the ID of that volume (a long hex string) and what paths are available to it, what SCSI vectors those use, etc.</p>
<p>You can then add a multipath{} section to /etc/multipath.conf which lists that WWID and gives it an alias (ie, doesn&#8217;t use &#8220;mpath0&#8221; etc) &#8211; we used &#8220;sb2cc-lun5&#8221; as an example alias.</p>
<p>Start multipathd if it&#8217;s not running: <code>/etc/init.d/multipathd start</code> and (with a possible <code>/sbin/multipath -v2</code> to search for new paths and make them available or to rescan multipath.conf) you&#8217;ll find /dev/mapper/sb2cc-lun5, etc, as new block devices.</p>
<p>We labelled these with e2label, made an ext3 filesystem (takes several minutes for a multi-TB filesystem) on there and put entries into <code>/etc/fstab</code> as follows:</p>
<p><code>/dev/mapper/sb2cc-lun5  /sb2cc-lun5             ext3    defaults        1 3</code></p>
<p>You probably don&#8217;t want to use ext3 for a large filesystem.</p>
<p>These come back happily AND IN A STABLE FASHION on a reboot, which is just as well because the raw &#8220;path&#8221; devices, sdb..sdi were juggled on the reboot &#8211; this is just down to how fast the various HBA scans come back.</p>
<p>- Listing current multipath settings: <code>multipath -v2 -l</code><br />
- Listing what would be changed (no changes made, dry run): <code>multipath -v2 -d</code><br />
- Making those changes live: <code>multipath -v2</code></p>
<p>Getting rid of unwanted LUNs:</p>
<p><code># /sbin/multipath -v3 -d<br />
[nothing changes, need to rescan]<br />
# echo "- - -" &gt; /sys/class/scsi_host/host1/scan<br />
[wait a bit]<br />
# more /proc/scsi/scsi</code><br />
[new paths to luns show up, mpaths appear under /dev/mapper]</p>
<p>At that point I noticed that the <em>backup</em> host group on sb3cc has access to some stuff it shouldn't; turned that off. Now need to flush the paths to the LUNs that've disappeared:</p>
<p><code># /sbin/multipath -F<br />
# /sbin/multipath -v2 -l</code></p>
<p>only two paths show, scan for the LUNs on the other controller...<br />
<code># echo "- - -" &gt; /sys/class/scsi_host/host2/scan</code></p>
<p>Now you can create the multipath devices.<br />
<code># /sbin/multipath -v2<br />
# /sbin/multipath -v2 -l</code></p>
<p>two large LUNs show up...</p>
<p><code>mpath6 (36000402001fc475761ee919c00000000) dm-5 NEXSAN,SATABeast<br />
[size=6.4T][features=0][hwhandler=0]</code><br />
^^^ this is sb3mvb (...4757 in WWID)</p>
<p><code>mpath5 (36000402001fc46db60ef903200000000) dm-2 NEXSAN,SATABeast<br />
[size=6.4T][features=0][hwhandler=0]</code><br />
^^^ this is SB3cc (....46db in WWID)</p>
<p>At that point I edit <code>/etc/multipath.conf</code> to give these useful device names, and run <code>/sbin/multipath -v2</code>.</p>
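<p>Copying 33-character WWIDs by hand is error-prone, so here is a rough Python sketch that pulls them out of listing output like the above and renders a <code>multipaths</code> stanza. The helper names and the alias names in the usage note are hypothetical, and the regular expression is only fitted to the header lines shown in this post.</p>

```python
import re

# Matches header lines such as: mpath6 (3600...) dm-5 NEXSAN,SATABeast
HEADER = re.compile(r"^(\S+) \(([0-9a-f]+)\) (dm-\d+) (.+)$")


def parse_maps(listing):
    """Return a dict of map name to WWID from multipath listing output."""
    maps = {}
    for line in listing.splitlines():
        match = HEADER.match(line.strip())
        if match:
            name, wwid, _dm, _vendor = match.groups()
            maps[name] = wwid
    return maps


def render_multipaths(aliases):
    """Render a multipaths section from a dict of WWID to alias."""
    lines = ["multipaths {"]
    for wwid, alias in sorted(aliases.items(), key=lambda item: item[1]):
        lines += [
            "        multipath {",
            "                wwid  " + wwid,
            "                alias " + alias,
            "        }",
        ]
    lines.append("}")
    return "\n".join(lines) + "\n"
```

<p>Feed <code>parse_maps</code> the listing text, build a dict from each WWID to whatever alias you want (for example something like <code>sb3mvb-lun0</code>), and paste the output of <code>render_multipaths</code> into the config after checking it by eye.</p>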
]]></content:encoded>
			<wfw:commentRss>http://jang.blogs.ilrt.org/2009/03/27/braindump-of-fc-multipathing-on-linux/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Kerberos patch finally in the JRE.</title>
		<link>http://jang.blogs.ilrt.org/2008/12/11/kerberos-patch-finally-in-the-jre/</link>
		<comments>http://jang.blogs.ilrt.org/2008/12/11/kerberos-patch-finally-in-the-jre/#comments</comments>
		<pubDate>Thu, 11 Dec 2008 10:42:14 +0000</pubDate>
		<dc:creator>jang</dc:creator>
				<category><![CDATA[haste]]></category>
		<category><![CDATA[bug]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[patch]]></category>

		<guid isPermaLink="false">http://jang.blogs.ilrt.org/?p=43</guid>
		<description><![CDATA[http://secunia.com/advisories/32991/
]]></description>
			<content:encoded><![CDATA[<p>http://secunia.com/advisories/32991/</p>
]]></content:encoded>
			<wfw:commentRss>http://jang.blogs.ilrt.org/2008/12/11/kerberos-patch-finally-in-the-jre/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Using oracle_calendar from within Zope.</title>
		<link>http://jang.blogs.ilrt.org/2008/08/15/using-oracle_calendar-from-within-zope/</link>
		<comments>http://jang.blogs.ilrt.org/2008/08/15/using-oracle_calendar-from-within-zope/#comments</comments>
		<pubDate>Fri, 15 Aug 2008 09:21:14 +0000</pubDate>
		<dc:creator>jang</dc:creator>
				<category><![CDATA[haste]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[zope]]></category>

		<guid isPermaLink="false">http://jang.blogs.ilrt.org/?p=42</guid>
		<description><![CDATA[Zope (2) wraps a proxy around objects returned from an External Method (or behaves in a similar way) that protects member access from other python scripts.
This is exactly not what I&#8217;m after. Apparently this is fixable by adding
__allow_access_to_unprotected_subobjects__ = 1
to the class definition. It&#8217;s irritating to have to modify external methods in order to make [...]]]></description>
			<content:encoded><![CDATA[<p>Zope (2) wraps a proxy around objects returned from an External Method (or behaves in a similar way) that protects member access from other python scripts.</p>
<p>This is exactly not what I&#8217;m after. Apparently this is fixable by adding</p>
<p><code>__allow_access_to_unprotected_subobjects__ = 1</code></p>
<p>to the class definition. It&#8217;s irritating to have to modify external methods in order to make them Zope-ready; at least the fix is (apparently) a one-liner.</p>
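<p>In context, that looks something like the sketch below. The <code>Event</code> class and its attributes are invented for illustration; only the class attribute itself is the point. Outside Zope the attribute is inert, so the same class keeps working in plain Python.</p>

```python
class Event(object):
    """Hypothetical object handed back from an External Method."""

    # Asks Zope 2's security machinery to allow restricted code
    # (Python Scripts, templates) to read this object's attributes.
    __allow_access_to_unprotected_subobjects__ = 1

    def __init__(self, summary, organiser):
        self.summary = summary
        self.organiser = organiser
```

<p>Bear in mind this opens up every attribute on the instances, so it suits simple data-carrying objects better than anything with sensitive members.</p>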
]]></content:encoded>
			<wfw:commentRss>http://jang.blogs.ilrt.org/2008/08/15/using-oracle_calendar-from-within-zope/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>oracle_calendar in python.</title>
		<link>http://jang.blogs.ilrt.org/2008/08/13/oracle_calendar-in-python/</link>
		<comments>http://jang.blogs.ilrt.org/2008/08/13/oracle_calendar-in-python/#comments</comments>
		<pubDate>Wed, 13 Aug 2008 15:35:39 +0000</pubDate>
		<dc:creator>jang</dc:creator>
				<category><![CDATA[haste]]></category>
		<category><![CDATA[calendar]]></category>
		<category><![CDATA[oracle calendar]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[soap]]></category>

		<guid isPermaLink="false">http://jang.blogs.ilrt.org/?p=41</guid>
		<description><![CDATA[There are several blog posts out there decrying the woeful state of SOAP in the Python world. I shall not echo these in detail; suffice to say, they&#8217;re absolutely right.
Mind you, there is also a bunch of lack in the Oracle calendar SOAP interface: no WSDL, no querying of delegate calendars, a limited SOAP implementation [...]]]></description>
			<content:encoded><![CDATA[<p>There are several blog posts out there decrying the woeful state of SOAP in the Python world. I shall not echo these in detail; suffice to say, they&#8217;re absolutely right.</p>
<p>Mind you, there is also a bunch of lack in the Oracle calendar SOAP interface: no WSDL, no querying of delegate calendars, a limited SOAP implementation that appears to have been written in terms of SAX events and that&#8217;s particularly sensitive to namespace declarations, and so on.</p>
<p>Anyway: I got enough of this working that I can query a user&#8217;s calendar in a fairly Pythonic fashion &#8211; <a href="https://svn.cse.bris.ac.uk/svn/jan/trunk/calendar/">https://svn.cse.bris.ac.uk/svn/jan/trunk/calendar/</a> for details. It&#8217;ll suffice to reimplement the guts of Shirley, the ILRT&#8217;s automatic receptionist.</p>
<p>This also marked the first time I&#8217;ve tried using new-style classes to augment the Python <em>str</em> class with a secondary attribute.</p>
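<p>The trick is that <em>str</em> is immutable, so the extra attribute has to be attached in <code>__new__</code> rather than <code>__init__</code>. A minimal sketch in Python 3 syntax; the class name <code>TaggedStr</code> and the <code>params</code> attribute are illustrative and not taken from <code>oracle_calendar.py</code>.</p>

```python
class TaggedStr(str):
    """A string that carries a dict of secondary parameters alongside its value."""

    def __new__(cls, value, params=None):
        # str builds its value in __new__, so the attribute is attached here.
        instance = super().__new__(cls, value)
        instance.params = dict(params or {})
        return instance


# Hypothetical organiser value: behaves as the mailto string, carries the cn.
organiser = TaggedStr("mailto:someone@example.invalid", {"cn": "Some One"})
```

<p>The object compares and hashes as an ordinary string, so existing code does not notice the difference. String methods such as <code>upper()</code> return a plain <em>str</em>, though, so the extra attribute does not survive transformations.</p>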
<p>Having tested this tonight, we can indeed pull out a list of <em>ilrt-visitor</em> meetings, together with their organiser&#8217;s details (<em>mailto</em> and <em>cn</em>).</p>
]]></content:encoded>
			<wfw:commentRss>http://jang.blogs.ilrt.org/2008/08/13/oracle_calendar-in-python/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>On the production status of the Departmental Filestore</title>
		<link>http://jang.blogs.ilrt.org/2008/06/10/on-the-production-status-of-the-departmental-filestore/</link>
		<comments>http://jang.blogs.ilrt.org/2008/06/10/on-the-production-status-of-the-departmental-filestore/#comments</comments>
		<pubDate>Tue, 10 Jun 2008 19:14:09 +0000</pubDate>
		<dc:creator>jang</dc:creator>
				<category><![CDATA[haste]]></category>
		<category><![CDATA[production]]></category>
		<category><![CDATA[project management]]></category>
		<category><![CDATA[rant]]></category>

		<guid isPermaLink="false">http://jang.blogs.ilrt.org/2008/06/10/on-the-production-status-of-the-departmental-filestore/</guid>
		<description><![CDATA[The Google TechTalk on the subject of Scrum, given by Ken Schwaber, contains one of my favourite quotes. You can see the whole thing here; and if you haven&#8217;t, it&#8217;s worthwhile devoting an hour to watching it. Can&#8217;t be bothered? Then don&#8217;t bother reading on. And the quote? To paraphrase,
our discipline has a tried and [...]]]></description>
			<content:encoded><![CDATA[<p>The Google TechTalk on the subject of Scrum, given by Ken Schwaber, contains one of my favourite quotes. You can see <a href="http://video.google.com/videoplay?docid=-7230144396191025011">the whole thing here</a>; and if you haven&#8217;t, it&#8217;s worthwhile devoting an hour to watching it. Can&#8217;t be bothered? Then don&#8217;t bother reading on. And the quote? To paraphrase,</p>
<blockquote><p>our discipline has a tried and tested way of going faster. Cut corners, cut quality. That way you can produce more crap.
</p></blockquote>
<p>So, how does this relate to the DFS?<br />
<span id="more-40"></span><br />
Well, I&#8217;m coming under pressure to declare that the DFS is production-ready. This isn&#8217;t a technical thing, this is purely one of PR. Where does this pressure come from? From &#8220;on high&#8221; &#8211; ie, from someone who is one hop away from seeing anything other than that we have a working cluster, what&#8217;s holding things up? (Bob&#8217;s actually pretty reasonable about this &#8211; he&#8217;s stuck between a rock and an opinionated git.)</p>
<p>There have been unavoidable supplier and technical delays involved in getting this far. The trouble is that dates have been randomly selected on no basis whatsoever (actually, on the basis of having a meeting and me saying, &#8220;it will take <em>n</em> days of uninterrupted work by all involved with nothing else getting in the way, assuming no impediments, no unforeseen hitches, and the continued availability of the emotional energy required to sustain that velocity by all involved&#8221; &#8211; and then <em>n</em> being added to the date of that meeting); and then those dates have been missed because, for example, we require additional FC ports in order to plug in our development array and the vendor arbitrarily cancelled our order and didn&#8217;t tell us about it. That kind of thing. I work hard on it, at a rate that I consider to be sustainable &#8211; I&#8217;ve already managed to get to the point where I looked like a corpse and couldn&#8217;t focus my eyes whilst fixing the mess that the previous kit had put us in. It&#8217;s about expectation management. So stuff slips.</p>
<p>So, I&#8217;ve repeatedly resisted that pressure. Why?</p>
<p>First, I should point out that there is no difference in what we do <em>now</em> compared to what we would do with a system that <em>is</em> production &#8211; <em>providing nothing goes wrong</em>. The difference happens when something <em>does</em> go wrong.</p>
<p>What would happen now is that we would run ourselves ragged making stuff up on the fly, recovering the system as quickly as possible, but basically choosing our path on the basis of our best expectation (which would be reasonable except that empirically, we&#8217;ve come to understand that Windows clustering seldom meets our best expectations).</p>
<p>In a production system, we would have already simulated that problem, developed and practised the recovery process, and have it documented and understood by at least two people.</p>
<p>That&#8217;s the difference.</p>
<p>Now, in a recent meeting, I was challenged with this:</p>
<blockquote><p>If we had held any of our current production systems up to the same standards of delivery, they wouldn&#8217;t be in production.</p></blockquote>
<p>That may be true, but it isn&#8217;t a reason to cut corners and produce crap. It&#8217;s a comment on the other services that leak into production, not the state of the DFS.</p>
<p>Why is this the case?</p>
<ul>
<li><strong>A lack of project management.</strong> We lack well-defined milestones. We&#8217;re not in the habit of setting them. Instead, our teams tend to operate in silos, interrupt-driven, incrementally doing development work when the stoking of production systems isn&#8217;t in the way. We don&#8217;t run clean iterations with clean, measurable, achievable milestones.</li>
<li><strong>A lack of project teams.</strong> We&#8217;re not geared up to fix the first problem because of the way that our department is organised. People live under fixed organisational structures. It means that sorting out the logistics (find a DBA; find a sysadmin; find some kit; find a project manager; etc) is difficult because you are naturally trying to squeeze attention out of a small number of vital people who don&#8217;t directly live in the same organisational branch that you do.</li>
<li><strong>A lack of capability to fix the above.</strong> The kind of short-term, well-defined, project-related work needs effort from middle-management. It needs a willingness to devote people to well-identified pieces of work for a fixed period, to let them get on. It needs up-front planning and project-management skills. It needs a bit of vision and a bit of courage.</li>
</ul>
<p>So that&#8217;s where we are and what I think is wrong. We&#8217;re departmentally in a rut. It&#8217;s the role of the departmental directors to sort this out. I&#8217;m not sure it&#8217;s perceived as a problem; but it&#8217;s certainly the case that we could use people better, in more varied ways, and identify the key resource shortages if we were to stop spreading those key people so thinly. We should be giving everyone a more varied, more rewarding experience at work.</p>
<p>And if, when it comes to resourcing a project milestone, 60 people are left in the room because the key people have already been earmarked for working on a particular thing that month: well, then that&#8217;s a result, not a failure. Better to identify that problem than to settle for a working practice that permits you to ignore it.</p>
<p>The good news is that entropy is letting me fill in additional sections in my list of &#8220;what to do when part X craps out&#8221; anyway: we had a fairly hard switch failure last week and that went swimmingly well; although obviously the nodes that lost paths to the array behind it needed a reboot to find them again afterwards.</p>
]]></content:encoded>
			<wfw:commentRss>http://jang.blogs.ilrt.org/2008/06/10/on-the-production-status-of-the-departmental-filestore/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ZFS snapshotting and mirroring</title>
		<link>http://jang.blogs.ilrt.org/2008/06/02/zfs-snapshotting-and-mirroring/</link>
		<comments>http://jang.blogs.ilrt.org/2008/06/02/zfs-snapshotting-and-mirroring/#comments</comments>
		<pubDate>Mon, 02 Jun 2008 14:37:26 +0000</pubDate>
		<dc:creator>jang</dc:creator>
				<category><![CDATA[haste]]></category>
		<category><![CDATA[solaris]]></category>
		<category><![CDATA[zfs]]></category>

		<guid isPermaLink="false">http://jang.blogs.ilrt.org/2008/06/02/zfs-snapshotting-and-mirroring/</guid>
		<description><![CDATA[Although ZFS seamlessly will support synchronous mirrors over multiple backend storage arrays, there are some advantages in keeping the mirror process asynchronous.

This is a pretty trivial process to script up. Here are the basics:
To set up the initial mirror of the source/root ZFS to the dest/root ZFS:
# zfs snapshot -r source/root@0001
# zfs send source/root@0001 &#124;
 [...]]]></description>
			<content:encoded><![CDATA[<p>Although ZFS seamlessly will support synchronous mirrors over multiple backend storage arrays, there are some advantages in keeping the mirror process asynchronous.<br />
<span id="more-39"></span><br />
This is a pretty trivial process to script up. Here are the basics:</p>
<p>To set up the initial mirror of the source/root ZFS to the dest/root ZFS:</p>
<p><code># zfs snapshot -r source/root@0001<br />
# zfs send source/root@0001 |<br />
    zfs receive -d dest</code></p>
<p>This&#8217;ll create a <code>dest/root</code> and a <code>dest/root@0001</code> snapshot.</p>
<p>To update the mirror:</p>
<p><code># zfs snapshot -r source/root@0002<br />
# zfs send -i 0001 source/root@0002 |<br />
    zfs receive -Fd dest</code></p>
<p>This sends an incremental set of changes and applies them to the <code>dest/root</code> filesystem. The <code>-F</code> first rolls back <code>dest/root</code> to the latest snapshot &#8211; resetting any local changes (including atimes). The result of this is <code>dest/root</code> with two snapshots &#8211; the latest, 0002, and the first, 0001.</p>
<p>We can remove the earlier snapshots once the mirror is complete:</p>
<p><code># zfs destroy -r dest/root@0001<br />
# zfs destroy -r source/root@0001</code></p>
<p>This leaves just the latest mirror snapshot available at both ends, permitting the next incremental to run using the same pattern.</p>
<p>Note: there is no recursive option for <code>zfs send</code>; where there are multiple dependent filesystems under <code>source/root</code>, we need to send them one after the other. Assuming the initial filesystem transfer has been done, that looks like this:</p>
<p><code># zfs snapshot -r source/root@0003<br />
# zfs send -i 0002 source/root@0003 |<br />
    zfs receive -Fd dest<br />
# zfs send -i 0002 source/root/x@0003 |<br />
    zfs receive -Fd dest<br />
# zfs send -i 0002 source/root/y@0003 |<br />
    zfs receive -Fd dest</code></p>
<p>The recursive flag on <code>zfs destroy</code> takes care of all the dependent snapshots in one go.</p>
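<p>Scripted up, one mirror pass is just a matter of generating that command sequence from a serial number. The Python sketch below only builds the command strings in order and does not run them; the function name, the four-digit serial convention and the <code>children</code> argument are assumptions made for this example.</p>

```python
def mirror_commands(source, dest_pool, serial, children=()):
    """Return, in order, the shell commands for mirror pass number `serial`."""
    if serial not in range(1, 10000):
        raise ValueError("serial must be between 1 and 9999")
    _pool, _, path = source.partition("/")
    if not path:
        raise ValueError("source must be a filesystem below a pool, e.g. source/root")
    new = "%04d" % serial
    filesystems = [source] + ["%s/%s" % (source, child) for child in children]
    commands = ["zfs snapshot -r %s@%s" % (source, new)]
    if serial == 1:
        # First pass: full streams, nothing to roll back or destroy yet.
        for fs in filesystems:
            commands.append("zfs send %s@%s | zfs receive -d %s" % (fs, new, dest_pool))
        return commands
    old = "%04d" % (serial - 1)
    for fs in filesystems:
        commands.append(
            "zfs send -i %s %s@%s | zfs receive -Fd %s" % (old, fs, new, dest_pool)
        )
    # receive -d keeps the path below the source pool: source/root lands at dest/root.
    commands.append("zfs destroy -r %s/%s@%s" % (dest_pool, path, old))
    commands.append("zfs destroy -r %s@%s" % (source, old))
    return commands
```

<p>Calling <code>mirror_commands("source/root", "dest", 3, ["x", "y"])</code> yields the snapshot, the three incremental sends shown above, and then the two destroys for 0002. A real script would want to stop at the first failing command, so that the old snapshots are only destroyed once every send has succeeded.</p>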
]]></content:encoded>
			<wfw:commentRss>http://jang.blogs.ilrt.org/2008/06/02/zfs-snapshotting-and-mirroring/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ZFS haul-over in the face of a box shutdown.</title>
		<link>http://jang.blogs.ilrt.org/2008/05/06/zfs-haul-over-in-the-face-of-a-box-shutdown/</link>
		<comments>http://jang.blogs.ilrt.org/2008/05/06/zfs-haul-over-in-the-face-of-a-box-shutdown/#comments</comments>
		<pubDate>Tue, 06 May 2008 12:53:59 +0000</pubDate>
		<dc:creator>jang</dc:creator>
				<category><![CDATA[haste]]></category>
		<category><![CDATA[solaris]]></category>
		<category><![CDATA[zfs]]></category>

		<guid isPermaLink="false">http://jang.blogs.ilrt.org/2008/05/06/zfs-haul-over-in-the-face-of-a-box-shutdown/</guid>
		<description><![CDATA[I&#8217;m now using default zpool paths (which implies an automatic import and mount on reboot).
I just imported the bb-archive.isys zpool onto the first host, then rebooted it. After it had shut down, I forcibly imported the zpool onto the second host. Now waiting for the first to come back up. It should, perhaps, complain that [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m now using default zpool paths (which implies an automatic import and mount on reboot).</p>
<p>I just imported the <code>bb-archive.isys</code> zpool onto the first host, then rebooted it. After it had shut down, I forcibly imported the zpool onto the second host. Now waiting for the first to come back up. It should, perhaps, complain that the zpool is owned by someone else, but should not do a forcible reimport&#8230;</p>
<p>And alas, that&#8217;s not what happens. So:</p>
<p>I&#8217;m going to use altroots for all zpools. Unfortunately, this forces the subdirs to appear only mounted under the altroot. Not quite the combination I was after.</p>
]]></content:encoded>
			<wfw:commentRss>http://jang.blogs.ilrt.org/2008/05/06/zfs-haul-over-in-the-face-of-a-box-shutdown/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>