Interfaces as Ball of Mud protection

A response to a post in which Edmund Kirwan hypothesizes that using interfaces delays the onset of Mud.

A few observations:

* Any well-factored system will have more direct dependencies than methods. More methods than direct dependencies indicates that code re-use is very low.

* For any well-structured system, the relationship between direct dependencies and indirect dependencies will be linear, not exponential. The buttons and string experimental result is not surprising, but would only apply to software systems where people really do create interconnections at random. The whole purpose of modular program structure is explicitly to prevent this.

* Abstract interfaces are in no way a necessary condition for modular programming.

* Finally, the notion that interfaces act as a termination point for dependencies seems a little odd. An interface merely represents a point at which a dependency chain becomes potentially undiscoverable by static analysis. Unquestionably the dependencies are still there, otherwise your call to that method at runtime wouldn’t do anything.
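To make that last point concrete, here is a minimal Java sketch. All of the names in it (Notifier, EmailNotifier, OrderService) are made up for illustration. OrderService only ever mentions the interface, so a static dependency graph stops at Notifier, but the call still ends up running EmailNotifier code at runtime.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; nothing here comes from a real library.
interface Notifier {
    void send(String message);
}

class EmailNotifier implements Notifier {
    // Records what was "sent" so the effect of the call is observable.
    static final List<String> SENT = new ArrayList<>();

    @Override
    public void send(String message) {
        SENT.add("email: " + message);
    }
}

class OrderService {
    private final Notifier notifier;

    OrderService(Notifier notifier) {
        this.notifier = notifier;
    }

    void placeOrder(String item) {
        // Statically, this line depends only on Notifier.
        // At runtime it executes whichever implementation was passed in.
        notifier.send("ordered " + item);
    }
}

class InterfaceDependencyDemo {
    public static void main(String[] args) {
        new OrderService(new EmailNotifier()).placeOrder("book");
        System.out.println(EmailNotifier.SENT);
    }
}
```

The dependency from OrderService to EmailNotifier has not gone away. It has just moved to wherever the two get wired together.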

So I suspect that what Edmund has discovered is a correlation between the use of interfaces and modular program structure. But that is just a correlation. A few years back there was an unfortunate vogue for creating an interface for each and every class, a practice which turned out to be entirely compatible with a Big Ball of Mud architecture. The buttons and string experiment provides interesting support for modular programming, but I don’t know that it says much about interfaces.

Blackboard and software complexity

A comment on Blackboard’s complexity problems.

If either the author of this article or the otherwise knowledgeable Feldstein have ever worked in software development, it’s not apparent from this article and the ensuing comments thread. The list of architectural scare factors – multiple deployment environments, wide use of 3rd party libraries, legacy code – is simply business as usual for any substantial software product. And the assertion that “few other companies support this sheer raw complexity of configuration combinations” is just plain wrong. Many, many companies deal with exactly this. Cross-platform release engineering is a demanding but well-understood discipline.

To pick on a couple more representative points: “All enterprise software ages poorly”. No, all software ages. Whether it ages poorly or well depends on whether it’s worth the vendor’s time to manage its aging. Go and ask the IBM shops running 1960’s-vintage System 360 applications on modern virtualized environments whether they’re happy with 50 years of ROI on those applications. And then: “Microsoft control their entire ecosystem”. Please, please, go and talk to a Microsoft release test engineer about how controlled their release targets are. Make sure you have a very comfortable seat and lots of beer money, because you’ll be buying and you’ll be there for a looong time.

I don’t challenge the author’s underlying premise that Blackboard has mismanaged its software assets – I don’t have the inside knowledge to confirm or deny that. And the notion that Blackboard, like every software developer, needs to actively manage and reduce complexity is incontestable. But I don’t accept the notion that the architectural factors listed are any kind of indicator. I would bet that inside Blackboard there are some very frustrated developers who know exactly how to support that range of configurations, led by a management group who is telling them not to spend time refactoring and reducing technical debt, but rather to crack on with adding to the feature list smorgasbord. As if that’s an either/or choice.

Gradle – copy to multiple destinations

TL;DR (edited):

def deployTargets = ["my/dest/ination/path/1","my/other/desti/nation"]
def zipFile = file("${buildDir}/distributions/")

task deploy (dependsOn: distZip) {
	inputs.file zipFile
	deployTargets.each { outputDir ->
		outputs.dir outputDir
	}
	doLast {
		deployTargets.each { outputDir ->
			copy {
				from zipTree(zipFile).files
				into outputDir
			}
		}
	}
}

My specific use case is to copy the jars from a java library distribution to tomcat web contexts, so you can see the distZip dependency in there, along with zip file manipulation.

The multiple destination copy seems to be a bit of an FAQ for gradle newcomers like myself. Gradle has a cool copy task, and lots of options to specify how to copy multiple sources into one destination. What about copying one source into multiple destinations? There’s a fair bit of confusion around the fact that the copy task supports multiple “from” properties, but only one “into” property.

The answers I’ve found seem to fall into one of two classes. The first is to just do the copy imperatively, like so:

task justDoit {
  doLast {
    destinations.each { dest ->
      copy {
        from 'src'
        into dest
      }
    }
  }
}

which gives up up-to-date checking. The solution I’ve settled on fixes that by using the inputs and outputs properties. Unlike the copy task type’s “into” property, a generic task can have multiple outputs.

The other advice given is to create multiple copy tasks, one for each destination. That seems a little unsatisfactory, and un-dynamic. What if I have 100 destinations? Must I really clutter up my build script with 100 copy tasks? The following is my attempt to handle it dynamically.

def deployTargets = ["my/dest/ination/path/1","my/other/desti/nation"]
def zipFile = file("${buildDir}/distributions/")

task deploy

// Set up a copy task for each deployment target
deployTargets.eachWithIndex { outputDir, index ->
	task "deploy${index}" (type: Copy, dependsOn: distZip) {
		from zipTree(zipFile).files
		into outputDir
	}
	deploy.dependsOn tasks["deploy${index}"]
}

This one suffers from the problem that it will not execute on the same build when the zip file changes, but it will execute on the next build. So in sequence:

  • Change a source file
  • Run “gradle deploy”
  • Sources compile, distZip executes, zip file is produced, but deploy tasks do not execute
  • Run “gradle deploy” again
  • Deploy tasks execute

Why is this so? I don’t know. This thread seems to imply that there could be some race condition in gradle, but beyond that – *shrug*. The multiple copy task approach is recommended by a lot of smart people, so I assume there’s a better way to do it, but for now the single custom task is working for me.

Java logging fun facts

I finally bit the bullet. I pulled out all the nice System.out.println() calls from my slideshow app and set up proper Java logging. I didn’t expect it to be easy – in all the hours I’ve spent debugging Java frameworks, the hardest thing is always trying to work out how to get things to actually appear in logs – and, sure enough, I accumulated a little list of things that were non-obvious. I’ve used slf4j as the logging façade, and java.util.logging (aka JDK 1.4 logging) as the implementation.

Handler levels vs logger levels

If you have:

.level = SEVERE
<package>.level = FINE
java.util.logging.FileHandler.level = INFO

what’s the actual log level? The first line gives the default log level. For classes in <package>, this is overridden by the second line. The third line then gives the finest level that will be logged by the file handler. So in this case, the log file will actually only contain messages of level INFO and coarser. Another handler might accept the FINE messages that will be emitted by my app.
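Here is a small Java sketch of the same logger-versus-handler split, set up in code rather than in a properties file. The logger name com.example.slideshow and the CollectingHandler class are my own inventions for the example. The logger lets FINE through, and then each handler applies its own level on top of that.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

class HandlerLevelDemo {
    /** Hypothetical handler that keeps the messages passing its own level check. */
    static class CollectingHandler extends Handler {
        final List<String> messages = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            // isLoggable() applies the handler's level (and filter, if any).
            if (isLoggable(record)) {
                messages.add(record.getMessage());
            }
        }

        @Override
        public void flush() { }

        @Override
        public void close() { }
    }

    /** Logs at FINE, INFO and SEVERE; returns what a handler at handlerLevel kept. */
    static List<String> run(Level handlerLevel) {
        Logger logger = Logger.getLogger("com.example.slideshow"); // assumed name
        logger.setUseParentHandlers(false); // keep the console handler out of it
        logger.setLevel(Level.FINE);        // logger level: FINE and coarser

        CollectingHandler handler = new CollectingHandler();
        handler.setLevel(handlerLevel);     // handler level: a second gate
        logger.addHandler(handler);

        logger.fine("fine message");
        logger.info("info message");
        logger.severe("severe message");

        logger.removeHandler(handler);
        return handler.messages;
    }

    public static void main(String[] args) {
        System.out.println("INFO handler: " + run(Level.INFO));
        System.out.println("ALL handler:  " + run(Level.ALL));
    }
}
```

With the handler at INFO the FINE message is dropped, even though the logger emitted it. With the handler at ALL, all three messages come through.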

Logger specifications don’t need wildcards

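Line 2 in the above snippet sets the log level for all classes in the named package and all its subpackages. So, if I want to get FINE logging in a package and everything under it:

# THIS
<package>.level = FINE

# OR THIS (if I don't mind being that inclusive)
<parent package>.level = FINE

# WRONG!!
<package>.*.level = FINE

The properties file does NOT use slf4j level names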

If you use SLF4J, you use calls like logger.debug(), logger.info(), etc., to send strings to the logger. If you use java.util.logging (JDK 1.4 logging) as the logging provider, you configure the logger using a properties file.

These guys do NOT use the same log level names. Doing a bunch of logger.debug() calls? In the properties file:

# This shows logger.debug() messages
.level = FINE

# This is an error
.level = DEBUG

See here for the full translation between slf4j and j.u.l. log levels.
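For quick reference, here is the translation as I understand it for the slf4j-jdk14 binding, written out as a small Java lookup table. Treat it as a sketch and check it against the binding version you actually use. The class name Slf4jToJulLevels is just something I made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.logging.Level;

class Slf4jToJulLevels {
    /** slf4j method name -> java.util.logging level (assumed slf4j-jdk14 mapping). */
    static Map<String, Level> mapping() {
        Map<String, Level> m = new LinkedHashMap<>();
        m.put("trace", Level.FINEST);
        m.put("debug", Level.FINE);
        m.put("info", Level.INFO);
        m.put("warn", Level.WARNING);
        m.put("error", Level.SEVERE);
        return m;
    }

    public static void main(String[] args) {
        // Print the name to use in the properties file for each slf4j call.
        mapping().forEach((slf4j, jul) ->
                System.out.println("logger." + slf4j + "() -> " + jul.getName()));
    }
}
```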

Specifying the properties file in Eclipse

When launching from the command line, I specify the properties file in the first command line parameter, like this:

java -Djava.util.logging.config.file=./ -cp .:./*:lib/*

When launching in Eclipse, -Djava.util.logging.config.file goes into the VM arguments box of the Run configuration, NOT the application arguments box.

Where on earth is the log file?

OK, this is in the documentation, but I can tell you that if you google “java logging log file location” it will be many, many pages before you find an answer you can use. Before you get there, you’ll have to wade through gems of the documentation writer’s art such as:

java.util.logging.FileHandler.pattern: The log file name pattern.

So here’s a hot tip: java.util.logging.FileHandler.pattern actually is the log file name. It’s not a pattern, at least not in the regex sense. There are some handy placeholders for variables that can be interpolated into it, but you don’t have to use any of them. Just type the path you want. If you want to know about the placeholders, have a look at the Javadoc for FileHandler.
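The same thing works when you build the handler in code. In this sketch the "pattern" handed to FileHandler is just a plain path with no placeholders, and the log file turns up at exactly that path. The logger name, the temp directory prefix and the file name slideshow.log are all assumptions I've made for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.FileHandler;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

class LogFileLocationDemo {
    /** Logs one message to a plain path and returns the file's contents. */
    static String logAndReadBack() {
        try {
            // Hypothetical location: a fresh temp directory plus a fixed file name.
            Path logFile = Files.createTempDirectory("jul-demo").resolve("slideshow.log");

            // No % placeholders, so the "pattern" is simply the file name.
            FileHandler handler = new FileHandler(logFile.toString());
            handler.setFormatter(new SimpleFormatter());

            Logger logger = Logger.getLogger("com.example.slideshow.filedemo");
            logger.setUseParentHandlers(false);
            logger.addHandler(handler);
            logger.info("hello from the slideshow");

            // Closing the handler flushes the file and releases its lock file.
            logger.removeHandler(handler);
            handler.close();

            return new String(Files.readAllBytes(logFile), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(logAndReadBack());
    }
}
```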

What’s wrong with photo slideshow apps?

I’ve been dissatisfied with the photo slideshow applications I’ve been using. Like most people, I take a lot of photos, especially when I’m travelling. Unlike most people, I use a high-res camera with good lenses, not a camera phone. That means my photos have a lot of detail, and are worth looking at for a while (for me, anyway). And because there are a lot of them, I often find myself wondering exactly where and when an image was taken. So, here’s my feature wish list for a slideshow program:

  1. Recursive directory searching. I don’t have time to put together special collections. Even if I did, 10,000 files is too many for one folder. I just want to point the slideshow at a large folder tree and have it find everything.
  2. Configurable delay. I like to look at a photo for a while, focussing on different details. I took one photo of Dunedin Harbour with an entire penguin colony in one corner, that I didn’t see until I’d looked at it for several minutes.
  3. Metadata display. I don’t have time to caption every photo, but I tag pretty much everything with at least the occasion (e.g. “Christmas 2005”) and the place (e.g. “Dunedin”). So when I see a ten-year-old photo pop up on my screen saver, I’d like to see that metadata so I have some clue as to what I’m looking at.
  4. Forward and back controls. How often do you catch a great photo out of the corner of your eye and think “Wow, what’s that?” just as the slideshow transitions to the next photo. If you’re on shuffle in a collection of tens of thousands of files, I guarantee you’ll never find that photo again. Wouldn’t it be nice to just hit a key and get it back? Or, conversely, if you’ve got a nice long delay time so you can savour every detail, you’ll occasionally spend two minutes staring at a photo of a lens cap that you forgot to delete. Unless you can just hit a key and skip to the next photo.
  5. No fussy transitions. In fact, I really want to be able to turn transitions off. I pay a fair bit of attention to framing, so having my photos sliding and zooming around the place isn’t my cup of tea. I can live with a fade-in fade-out, but I really don’t need to see Grandma spinning off into space on the side of a cube. Slideshows that recrop 4:3 photos to 16:9 are a no-no as well.

Now, I have by no means done an exhaustive search of all slideshow applications. However, it’s not a crowded category. I suspect it’s one of those software categories where the tools bundled with the operating system, while inadequate, are still functional enough to take all the oxygen out of the market. For example, the Windows 8 lock screen slideshow is a pretty nice looking slideshow, but it doesn’t include a single one of my wishlist features. Still, given that it’s there, how many people have even gone looking for something better? I have, and I can tell you that the Windows Photo Gallery slideshow changes photos way too fast (non-configurable), Photo Slideshow has no forward/back and no metadata display, some other app I forget only lets you have one non-recursive photo folder – etc, etc.

So – what else? – I wrote my own. Here it is. Fair warning, though, it’s nothing like production quality code, and it’s a java program so you’ll need to have java installed. Check out the readme for more details, and stay tuned for a future post on the technical nitty gritty.

Java image scaling performance

I’ve been homebrewing a slideshow application – more on that later – and learning lots about Java’s AWT graphics libraries. This is a post on a rookie error I made with image scaling.

TL;DR – The real performance hit is in creating image buffers and moving data between them. The actual calculations are almost trivial in comparison. So, creating one image buffer and then just drawing everything into it will be fastest.

OK, some code. First some common setup code:

		BufferedImage img = ... // Get an image from somewhere
		BufferedImage after = new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);
		Graphics g = after.getGraphics();
		// This just defines the transform, it doesn't apply it
		AffineTransform at = new AffineTransform();
		at.setToTranslation(translateX, translateY);
		at.scale(1/ratio, 1/ratio);
		AffineTransformOp scaleOp = 
		   new AffineTransformOp(at, AffineTransformOp.TYPE_NEAREST_NEIGHBOR);

This is slow:

		scaleOp.filter(img, after);

Why? Because it creates a whole new BufferedImage, then copies it into after.

This is about 100 times faster:

		g.drawImage(img, (int) translateX, (int) translateY, (int) scaledWidth, (int) scaledHeight, this);

Because the scary affine transform is gone, right? Wrong. This is fast too, about 50 times faster than the first version:

		after = scaleOp.filter(img, null);

And this is just as fast as drawImage():

		((Graphics2D) g).drawRenderedImage(img,at);

How would you choose between them? Well, if you want rotations you’d need to use one of the affine transform versions. Snippet 3 may involve you in color model shenanigans you’d rather avoid, so my conclusion is: if you don’t need rotations, use drawImage(), otherwise use drawRenderedImage(). However, I’m sure there are a thousand other ways to do this, and I’m nowhere near having a handle on best practice.
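For anyone who wants to try the two recommended options side by side, here is a self-contained sketch. The class and method names (ScaleDemo, scaleWithDrawImage, scaleWithTransform, solid) are mine, and the sizes in main are arbitrary. Both methods draw straight into a destination buffer that the caller allocates once, which is the point of the TL;DR above. I haven't put any timing code in, so measure on your own images before trusting my numbers.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.geom.AffineTransform;
import java.awt.image.BufferedImage;

class ScaleDemo {
    /** Scales src to fill dst with drawImage(); no rotation possible. */
    static void scaleWithDrawImage(BufferedImage src, BufferedImage dst) {
        Graphics2D g = dst.createGraphics();
        try {
            g.drawImage(src, 0, 0, dst.getWidth(), dst.getHeight(), null);
        } finally {
            g.dispose();
        }
    }

    /** Scales src to fill dst via an affine transform; could also rotate. */
    static void scaleWithTransform(BufferedImage src, BufferedImage dst) {
        AffineTransform at = AffineTransform.getScaleInstance(
                (double) dst.getWidth() / src.getWidth(),
                (double) dst.getHeight() / src.getHeight());
        Graphics2D g = dst.createGraphics();
        try {
            g.drawRenderedImage(src, at);
        } finally {
            g.dispose();
        }
    }

    /** Test helper: an image filled with a single colour. */
    static BufferedImage solid(int w, int h, Color c) {
        BufferedImage img = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = img.createGraphics();
        g.setColor(c);
        g.fillRect(0, 0, w, h);
        g.dispose();
        return img;
    }

    public static void main(String[] args) {
        BufferedImage src = solid(200, 100, Color.RED);
        // One destination buffer, allocated once and reused for both draws.
        BufferedImage dst = new BufferedImage(100, 50, BufferedImage.TYPE_INT_ARGB);
        scaleWithDrawImage(src, dst);
        scaleWithTransform(src, dst);
        System.out.println(dst.getWidth() + "x" + dst.getHeight());
    }
}
```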

mod_cfml wrangling

In a previous post I talked about how Railo’s default install with mod_cfml can cause problems when you have a lot of virtual hosts. This post will deal with some details of various configurations to deal with those problems. I’ll assume an Apache + Linux install for these config examples.

Config file locations

Tomcat config is in /opt/railo/tomcat/conf
Tomcat context files are in /opt/railo/tomcat/conf/Catalina/<hostname>/ROOT.xml, where “hostname” is the hostname from the Tomcat host configuration. Mod_cfml uses the hostname from the URL for this purpose, but you can define it to be anything.
Apache config is in /etc/apache2

1. Explicitly define Tomcat hosts and aliases

The first thing you can do is to explicitly handle most or all of your Tomcat context creation, rather than let mod_cfml do it for you. Because of the way mod_cfml works, it stays completely out of the loop and adds no overhead for explicitly defined contexts.

For each distinct Railo application, add a <Host> element to Tomcat’s server.xml, like so:

<Host name="" appBase="webapps">
</Host>

Important: this configuration is in addition to your virtual host setup in your web server.

Then create the context file, which will be called /opt/railo/tomcat/conf/Catalina/, and should contain:

<?xml version='1.0' encoding='utf-8'?>
<Context docBase="/var/www/myapps/lotsahosts_webroot">
</Context>

TBH I don’t think the WatchedResource element is needed. It’s just come along for the cut-and-paste ride since forever.

To reiterate, you’ll need a Host element and a context file for every distinct Railo application. How do you know when you need a new Host element, as opposed to just adding an Alias? It comes down to what your ColdFusion code is expecting. If it’s OK with handling multiple hostnames, then go ahead and use an alias. Otherwise, add a Host element and context file.

2. Disable the mod_cfml Tomcat valve

After the above config change, mod_cfml will stay out of the picture until your web server sends through a host header that you haven’t explicitly handled. At that point, mod_cfml springs into action and creates the context for you.

This can be a reasonable way to operate if you frequently need to provision new virtual hosts that are all just aliases into the one web app. You can let mod_cfml dynamically create the contexts, but keep the total context count down by periodically sweeping them up into your static configuration (i.e. add an <Alias> element to server.xml and then just delete the context folder from conf/Catalina). However, if your new virtual hosts are not just aliases, your context count will unavoidably increase, and you’ll run into mod_cfml’s startup overhead.

So, to disable the mod_cfml Tomcat valve, just comment out these lines in Tomcat’s server.xml:

	<Valve className="mod_cfml.core"

Once you’ve done that (and restarted Railo), any host header that you haven’t explicitly handled will result in a 404 error.

3. Remove the web server’s mod_cfml component

If you’ve read the mod_cfml documentation (which I’d recommend), you’ll know that mod_cfml is actually a matched pair of components, one on the web server side and one on the Tomcat side. The web server component works differently depending on which web server you run. On Apache, it’s a very lightweight component that runs on top of the usual mod_proxy or mod_jk setup and adds some headers to help the Tomcat valve know how to set the docBase for new contexts.

I’m not sure why you’d need or want to remove the web server component, as the startup and memory overheads are all on the Tomcat side. But, for completeness, here’s how to do it for Apache 2.2: simply remove the PerlRequire, PerlHeaderParserHandler, and PerlSetVar directives from /etc/apache2/apache2.conf. Note that on older Apaches those directives might be in httpd.conf.

On IIS, mod_cfml uses the Boncode connector. If you want to remove that, you’ll have to replace it with another connector, but I’m no IIS guru, so I’ll leave it at that.

Railo and mod_cfml

The way Railo works with Tomcat is quite different to Adobe ColdFusion (ACF), and the differences can be pretty important if you have a lot of virtual hosts.

I don’t want to go into the gory details about how Tomcat works, but a few background points:

  • Tomcat has a virtual host concept analogous to web server virtual hosts, where there’s a default host and then specific host configurations tied to defined hostnames.
  • Tomcat’s virtual host configuration is completely independent of the front end web server (e.g. Apache or IIS)
  • Within each virtual host is at least one “context”, which essentially maps to a classloader and a set of resources (e.g. a directory on the filesystem).

All of this is quite different to the way webservers work, so Railo and ACF both try to hide it from you, in different ways.

ACF sets up Tomcat with just the default virtual host and a single default context. The default host will handle any request not bound to some other host, so with no further configuration every hostname ends up in the default Tomcat host and context. Simple and easy. In effect, your entire ColdFusion environment, multiple hostnames, multiple applications, the whole shebang, lives inside a single Tomcat context.

This is decidedly not idiomatic from a Java servlet container point of view, where each application gets at least its own context. Among other things, this allows the application to define its own classpath, thus isolating applications from issues like JAR conflicts with other applications. So Railo takes this path – each ColdFusion application maps to a Tomcat application, with its own context. That means Tomcat must be configured for each and every hostname and application folder.

But wait – ColdFusion is supposed to hide all unpleasant details, isn’t it? CF developers shouldn’t need to know or care about Java-specific stuff like servlet container configuration, right? So Railo introduces another piece, mod_cfml. The Railo installer configures this by default, and what it does is watch for unrecognised hostnames and create new Tomcat contexts for them on the fly. Pretty neat trick really, and it makes Railo just as seamless and noob-friendly as ACF. Until…

Until you migrate an environment with hundreds of virtual hosts from ACF to Railo. At which point, several things might start to cause problems:

  1. Context memory overhead: it may be only 1-2MB per context, but that’s enough that your site that hummed along nicely in a 512MB JVM is now completely unresponsive with memory stress.
  2. Context startup time: on my anemic dev box, it takes mod_cfml 30 seconds to create a new context, and over a minute to validate an existing context after a server restart. Even a fairly beefy staging box only brings that down to 10 seconds per context. Multiply that by 500, and you’ve got a problem.
  3. Context creation throttling: because of these overheads, mod_cfml throttles context creation to avoid becoming a DoS vector. By default, you get one context creation per 30 seconds up to a maximum of 200 contexts per 24 hour period. You’ll simply get a 503 error on every virtual host after the first 200.
  4. Context restart: this deserves a separate dot point – mod_cfml “creates” every context the first time it is hit after a restart, even if it already exists. That means that even if you have fewer than 200 virtual hosts, you can hit the limit simply by restarting the Railo service. Sites that worked before the restart suddenly become unresponsive after the restart.

What to do? There are, as always, options:

You can beef up your environment. Make sure you have enough memory, enough CPU, adjust the mod_cfml throttling settings, and preload all the contexts by hitting them once after each server restart (so your users don’t get the delay). Of course you’re now busy fiddling with servlet container configuration, so mod_cfml isn’t earning its keep from a simplification point of view.

You can just ditch mod_cfml and use mod_proxy or whatever-it-is-on-IIS, and then configure Tomcat manually. This lets you group virtual hosts into shared contexts (as per ACF), thus avoiding the issues that come with uncontrolled proliferation of contexts. To be honest, when I installed Railo I took one look at mod_cfml and said “No thanks” – and that was before I knew about the gotchas listed above. I reckon if you can configure a web server, you can configure a servlet container, so mod_cfml just isn’t solving any problem that I have.

Or, you can manually configure Tomcat to recognise all your existing virtual hosts, but keep mod_cfml around to pick up any new hostnames. This can be handy if you add hostnames (say, one hostname per client) on a daily basis. The new ones will all get a context each, but to control this you can periodically add the new hostnames to your shared context and delete the standalone contexts.

Leave a comment if you want me to post a detailed how-to for any of these options.

Edit: more detail on configuration tweaks

Woocommerce – testing if a product is in a descendent category

To test whether a WordPress post is within a particular category or any of its subcategories, there’s a handy snippet in the WordPress codex here:

For WordPress noobs like me (and what environment has more noobs than WordPress?), though, it’s not obvious how to make that work for Woocommerce product categories. There are two key things you need to know:

  1. Woocommerce product categories do not use the normal category system. They use a custom taxonomy called ‘product_cat’
  2. The in_category function used in that snippet only works with the normal categories. For custom taxonomies, you have to use has_term() instead.

The revised snippet looks like this:

if ( ! function_exists( 'post_is_in_descendant_product_cat' ) ) {
	function post_is_in_descendant_product_cat( $cats, $_post = null ) {
		foreach ( (array) $cats as $cat ) {
			// get_term_children() accepts integer ID only
			$descendants = get_term_children( (int) $cat, 'product_cat' );
			if ( $descendants && has_term($descendants, 'product_cat', $_post) )
				return true;
		}
		return false;
	}
}

jQuery .validate – what does optional really mean?

When you look at customising validation rules and/or handlers for the jQuery validate plugin, you run across a lot of code that looks like this:

return this.optional( element ) || !/Invalid|NaN/.test( new Date( value ).toString() );

So obviously, that means “Don’t bother checking this field if it’s optional”, right? Wrong!

There are two important things you need to know about jQuery validate’s optional() function:

  1. It is NOT telling you whether the field is optional. It is just telling you whether the field is empty.
  2. It returns false if the field is NOT empty, and it returns the string “dependency-mismatch” if the field IS empty.

If you’re quietly thinking “WTF?” at this point, you’re not alone. The semantics do make a bent kind of sense once you know what’s going on, though. Obviously there’s no point checking any validation but “required” on an empty field. So the code above is saying “If this field is OK, it will be either because it’s empty but optional, or because this rule passes”. Note that “If” – it’s not saying the field IS optional, it’s saying it will have to be optional to be OK. The actual optional check is left up to the “required” validation handler.

You can stop now and just remember the two points above. But if you’re keen, let me bend your mind a little further. If you dive into the plugin source code, you see the implementation of optional looks like this:

optional: function( element ) {
			var val = this.elementValue( element );
			return !$.validator.methods.required.call( this, val, element ) && "dependency-mismatch";
}

Surely the first part of line 3 is testing whether the field is required? Actually, no it’s not. It’s just calling the same function that the real “required” rule would call – in other words (leaving aside complications with dependencies), testing whether the field is empty.

Summary: in this code, “optional” means “empty”, “required” may mean “empty” or “required” depending on the context, and “dependency-mismatch” means “true”. In the immortal words of Phil Karlton