mod_cfml wrangling

In a previous post I talked about how Railo’s default install with mod_cfml can cause problems when you have a lot of virtual hosts. This post covers several configuration changes that address those problems. I’ll assume an Apache + Linux install for these config examples.

Config file locations

Tomcat config is in /opt/railo/tomcat/conf
Tomcat context files are in /opt/railo/tomcat/conf/Catalina/<hostname>/ROOT.xml, where “hostname” is the hostname from the Tomcat host configuration. Mod_cfml uses the hostname from the URL for this purpose, but you can define it to be anything.
Apache config is in /etc/apache2

1. Explicitly define Tomcat hosts and aliases

The first thing you can do is to explicitly handle most or all of your Tomcat context creation, rather than let mod_cfml do it for you. Because of the way mod_cfml works, it stays completely out of the loop and adds no overhead for explicitly defined contexts.

For each distinct Railo application, add a <Host> element to Tomcat’s server.xml, like so:

<Host name="lotsahosts.me" appBase="webapps">
   <Alias>myalias1.lotsahosts.me</Alias>
   <Alias>myalias2.lotsahosts.me</Alias>
   <Alias>myalias3.lotsahosts.me</Alias>
</Host>

Important: this configuration is in addition to your virtual host setup in your web server.

Then create the context file, which will be called /opt/railo/tomcat/conf/Catalina/lotsahosts.me/ROOT.xml, and should contain:

<?xml version='1.0' encoding='utf-8'?>
<Context docBase="/var/www/myapps/lotsahosts_webroot">
  <WatchedResource>WEB-INF/web.xml</WatchedResource>
</Context>

TBH I don’t think the WatchedResource element is needed. It’s just come along for the cut-and-paste ride since forever.

To reiterate, you’ll need a Host element and a context file for every distinct Railo application. How do you know when you need a new Host element, as opposed to just adding an Alias? It comes down to what your ColdFusion code is expecting. If it’s OK with handling multiple hostnames, then go ahead and use an alias. Otherwise, add a Host element and context file.
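
With many applications, writing each Host element and context file by hand gets tedious. The following is a minimal Python sketch, not part of Railo or mod_cfml, that renders the two fragments shown above from a hostname, a list of aliases and a webroot. The function names (host_element, context_file, context_path) are made up for illustration, and the default conf directory is the /opt/railo/tomcat/conf location assumed earlier in this post.

```python
from xml.sax.saxutils import escape, quoteattr


def host_element(name, aliases, app_base="webapps"):
    """Render the <Host> element for one Railo application."""
    lines = [f"<Host name={quoteattr(name)} appBase={quoteattr(app_base)}>"]
    lines += [f"   <Alias>{escape(alias)}</Alias>" for alias in aliases]
    lines.append("</Host>")
    return "\n".join(lines)


def context_file(doc_base):
    """Render the ROOT.xml content pointing at the application's webroot."""
    return "\n".join([
        "<?xml version='1.0' encoding='utf-8'?>",
        f"<Context docBase={quoteattr(doc_base)}>",
        "  <WatchedResource>WEB-INF/web.xml</WatchedResource>",
        "</Context>",
    ])


def context_path(name, conf_dir="/opt/railo/tomcat/conf"):
    """Return where Tomcat expects the context file for this host."""
    return f"{conf_dir}/Catalina/{name}/ROOT.xml"


if __name__ == "__main__":
    print(host_element("lotsahosts.me", ["myalias1.lotsahosts.me"]))
    print(context_path("lotsahosts.me"))
    print(context_file("/var/www/myapps/lotsahosts_webroot"))
```

The Host text still has to go inside the Engine element of server.xml, and Railo needs a restart afterwards. The sketch only produces the text; it does not touch any files.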

2. Disable the mod_cfml Tomcat valve

After the above config change, mod_cfml will stay out of the picture until your web server sends through a host header that you haven’t explicitly handled. At that point, mod_cfml springs into action and creates the context for you.

This can be a reasonable way to operate if you frequently need to provision new virtual hosts that are all just aliases into the one web app. You can let mod_cfml dynamically create the contexts, but keep the total context count down by periodically sweeping them up into your static configuration (i.e. add an <Alias> element to server.xml and then just delete the context folder from conf/Catalina). However, if your new virtual hosts are not just aliases, your context count will unavoidably increase, and you’ll run into mod_cfml’s startup overhead.
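
To find the contexts that are candidates for sweeping up, you can compare the folder names under conf/Catalina with the hostnames already declared in server.xml. Here is a rough Python sketch, assuming (as described above) that dynamically created contexts show up as folders named after the hostname and are not listed in server.xml. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def declared_hostnames(server_xml_text):
    """Return every Host name and Alias found in the server.xml text."""
    root = ET.fromstring(server_xml_text)
    names = set()
    for host in root.iter("Host"):
        if host.get("name"):
            names.add(host.get("name"))
        for alias in host.iter("Alias"):
            if alias.text and alias.text.strip():
                names.add(alias.text.strip())
    return names


def undeclared_contexts(server_xml_text, context_folders):
    """Return the context folder names that server.xml does not mention."""
    declared = declared_hostnames(server_xml_text)
    return sorted(name for name in context_folders if name not in declared)
```

Feed it the contents of server.xml and the folder names from os.listdir("/opt/railo/tomcat/conf/Catalina"). Each name it returns is a context that was not defined statically; whether it should become an <Alias> or get its own <Host> is still your call.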

So, to disable the mod_cfml Tomcat valve, just comment out these lines in Tomcat’s server.xml:

	<Valve className="mod_cfml.core"
		loggingEnabled="true"
		waitForContext="60"
		maxContexts="200"
		timeBetweenContexts="1"
		/>	

Once you’ve done that (and restarted Railo), any host header that you haven’t explicitly handled will result in a 404 error.

3. Remove the web server’s mod_cfml component

If you’ve read the mod_cfml documentation (which I’d recommend), you’ll know that mod_cfml is actually a matched pair of components, one on the web server side and one on the Tomcat side. The web server component works differently depending on which web server you run. On Apache, it’s a very lightweight component that runs on top of the usual mod_proxy or mod_jk setup and adds some headers to help the Tomcat valve know how to set the docBase for new contexts.

I’m not sure why you’d need or want to remove the web server component, as the startup and memory overheads are all on the Tomcat side. But, for completeness, here’s how to do it for Apache 2.2: simply remove the PerlRequire, PerlHeaderParserHandler, and PerlSetVar directives from /etc/apache2/apache2.conf. Note that on older Apaches those directives might be in httpd.conf.
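
If you'd rather comment the directives out than delete them, so the change is easy to reverse, something like this Python sketch would do it. It assumes each directive starts its own line, and it will also catch PerlSetVar lines belonging to any other mod_perl setup you have, so review the result before using it. The function name is hypothetical.

```python
MOD_CFML_DIRECTIVES = ("PerlRequire", "PerlHeaderParserHandler", "PerlSetVar")


def comment_out_directives(conf_text, directives=MOD_CFML_DIRECTIVES):
    """Prefix every line that starts with one of the directives with '# '."""
    # Apache directive names are case-insensitive, so compare in lower case.
    wanted = {directive.lower() for directive in directives}
    output = []
    for line in conf_text.splitlines():
        words = line.split()
        if words and words[0].lower() in wanted:
            output.append("# " + line)
        else:
            output.append(line)
    # Preserve the trailing newline if the input had one.
    trailing = "\n" if conf_text.endswith("\n") else ""
    return "\n".join(output) + trailing
```

Run it over a copy of apache2.conf, diff the result against the original, and only then swap the file in and restart Apache.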

On IIS, mod_cfml uses the Boncode connector. If you want to remove that, you’ll have to replace it with another connector, but I’m no IIS guru, so I’ll leave it at that.

Railo and mod_cfml

The way Railo works with Tomcat is quite different to Adobe ColdFusion (ACF), and the differences can be pretty important if you have a lot of virtual hosts.

I don’t want to go into the gory details about how Tomcat works, but a few background points:

  • Tomcat has a virtual host concept analogous to web server virtual hosts, where there’s a default host and then specific host configurations tied to defined hostnames.
  • Tomcat’s virtual host configuration is completely independent of the front end web server (e.g. Apache or IIS)
  • Within each virtual host is at least one “context”, which essentially maps to a classloader and a set of resources (e.g. a directory on the filesystem).

All of this is quite different to the way webservers work, so Railo and ACF both try to hide it from you, in different ways.

ACF sets up Tomcat with just the default virtual host and a single default context. The default host will handle any request not bound to some other host, so with no further configuration every hostname ends up in the default Tomcat host and context. Simple and easy. In effect, your entire ColdFusion environment, multiple hostnames, multiple applications, the whole shebang, lives inside a single Tomcat context.

This is decidedly not idiomatic from a Java servlet container point of view, where each application gets at least its own context. Among other things, this allows the application to define its own classpath, thus isolating applications from issues like JAR conflicts with other applications. So Railo takes this path – each ColdFusion application maps to a Tomcat application, with its own context. That means Tomcat must be configured for each and every hostname and application folder.

But wait – ColdFusion is supposed to hide all unpleasant details, isn’t it? CF developers shouldn’t need to know or care about Java-specific stuff like servlet container configuration, right? So Railo introduces another piece, mod_cfml. The Railo installer configures this by default, and what it does is watch for unrecognised hostnames and create new Tomcat contexts for them on the fly. Pretty neat trick really, and it makes Railo just as seamless and noob-friendly as ACF. Until…

Until you migrate an environment with hundreds of virtual hosts from ACF to Railo. At which point, several things might start to cause problems:

  1. Context memory overhead: it may be only 1-2MB per context, but that’s enough that your site that hummed along nicely in a 512MB JVM is now completely unresponsive with memory stress.
  2. Context startup time: on my anemic dev box, it takes mod_cfml 30 seconds to create a new context, and over a minute to validate an existing context after a server restart. Even a fairly beefy staging box only brings that down to 10 seconds per context. Multiply that by 500, and you’ve got a problem.
  3. Context creation throttling: because of these overheads, mod_cfml throttles context creation to avoid becoming a DoS vector. By default, you get one context creation per 30 seconds up to a maximum of 200 contexts per 24 hour period. You’ll simply get a 503 error on every virtual host after the first 200.
  4. Context restart: this deserves a separate dot point – mod_cfml “creates” every context the first time it is hit after a restart, even if it already exists. That means that even if you have fewer than 200 virtual hosts, you can hit the limit simply by restarting the Railo service. Sites that worked before the restart suddenly become unresponsive after the restart.
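
To put rough numbers on the list above, here is a back-of-envelope calculation in Python. The inputs are the figures quoted in this post (1-2MB and 10 seconds per context, 200 contexts per 24 hours, 500 virtual hosts). The function names are just for illustration, and the startup figure assumes contexts are started one after another.

```python
import math


def total_memory_mb(contexts, mb_per_context):
    """Memory overhead if every context costs mb_per_context."""
    return contexts * mb_per_context


def total_startup_minutes(contexts, seconds_per_context):
    """Startup time if contexts are brought up one after another."""
    return contexts * seconds_per_context / 60


def days_to_create_all(contexts, max_contexts_per_day=200):
    """Days needed before the daily throttle has let every context through."""
    return math.ceil(contexts / max_contexts_per_day)


if __name__ == "__main__":
    hosts = 500
    print(total_memory_mb(hosts, 1), "to", total_memory_mb(hosts, 2), "MB")
    print(round(total_startup_minutes(hosts, 10)), "minutes")
    print(days_to_create_all(hosts), "days")
```

For 500 hosts that is 500 to 1000MB of context overhead, roughly 83 minutes of startup at 10 seconds each, and three days before the default throttle has let every context through.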

What to do? There are, as always, options:

You can beef up your environment. Make sure you have enough memory, enough CPU, adjust the mod_cfml throttling settings, and preload all the contexts by hitting them once after each server restart (so your users don’t get the delay). Of course you’re now busy fiddling with servlet container configuration, so mod_cfml isn’t earning its keep from a simplification point of view.
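
The preloading step can be scripted. Below is a minimal Python sketch, assuming every hostname resolves from the machine running it and serves plain HTTP on port 80. The names warm_up and fetch_status are hypothetical, and the generous timeout is a guess based on the context start times mentioned above.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=120):
    """Return the HTTP status for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 or 503 still counts as a response
    except OSError:
        return None  # connection refused, DNS failure, timeout


def warm_up(hostnames, fetch=fetch_status, pause_seconds=0):
    """Request the root page of each host once and record the status."""
    results = {}
    for hostname in hostnames:
        results[hostname] = fetch(f"http://{hostname}/")
        if pause_seconds:
            time.sleep(pause_seconds)  # stay under any creation throttle
    return results
```

Call warm_up with your list of hostnames after each restart. Any host that comes back as None or 503 deserves a second look, and pause_seconds can be raised if the mod_cfml throttle is still in play.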

You can just ditch mod_cfml and use mod_proxy or whatever-it-is-on-IIS, and then configure Tomcat manually. This lets you group virtual hosts into shared contexts (as per ACF), thus avoiding the issues that come with uncontrolled proliferation of contexts. To be honest, when I installed Railo I took one look at mod_cfml and said “No thanks” – and that was before I knew about the gotchas listed above. I reckon if you can configure a web server, you can configure a servlet container, so mod_cfml just isn’t solving any problem that I have.

Or, you can manually configure Tomcat to recognise all your existing virtual hosts, but keep mod_cfml around to pick up any new hostnames. This can be handy if you add hostnames (say, one hostname per client) on a daily basis. The new ones will all get a context each, but to control this you can periodically add the new hostnames to your shared context and delete the standalone contexts.

Leave a comment if you want me to post a detailed how-to for any of these options.

Edit: more detail on configuration tweaks

Adobe ColdFusion 10 to Railo 4

I’ve recently ported a moderately large ColdFusion application from Adobe ColdFusion 10 (ACF) to Railo 4.0.2. The whole process took about two days, including regression testing and fixes. Overall, I was very impressed with the level of compatibility, with almost all the issues arising from one of two things: the app server configuration, and very recent code written to the latest CFML standards. Even the latter was by no means bad. For example, anonymous functions declared in struct literals, something even ACF can be a little twitchy about, gave Railo no problems at all.

By far the most time-consuming issue was the Apache mod_alias issue, listed first. I spent the whole of a long day on this, with many a blind alley due to my imperfect understanding of how Apache and Tomcat work together. I finally pulled the plug when I caught myself about to download the Tomcat source code. I learnt a lot, but at the end of the day my preferred approach (which is to define all virtual directories in just one place, the Apache configuration) simply doesn’t work with Railo. Plan B works just fine, so all’s well that ends well.

So, here’s the list of (mostly minor) issues I encountered. Some of these may become bug reports when they grow up, some are just for noting.

Apache mod_alias

Apache Alias directives are ignored by Railo – which is completely expected behavior for an application server. Less understandably, Tomcat context aliases are also ignored, which leaves us with just Railo mappings. In ACF you can manage everything with Apache Alias directives (or IIS virtual directories), and ACF then uses these for all path resolution. Which is a pretty neat trick, really. I’d love to know how ACF does it, but I’m guessing it goes back and queries the Apache virtual host for its mappings. In effect, ACF completely hides the distinction between file-oriented web serving and URI-oriented application serving.

Function variables implicit toString()

ACF can interpolate function variables into a string. Railo crashes.

Use case: I use this for writing a trace log of what controller functions are invoked, useful for debugging routing.

	// test is a function defined earlier in the template
	myfunc = test;

	writeoutput("my func is #myfunc#"); // crash on Railo, on ACF prints something like "my func is cftest2ecfm1234911533$funcTEST@95a943"


Fix: use #myfunc.toString()#. ACF output is unchanged, Railo output is “test()”.

Nested functions

Railo crashes with a cryptic java.lang.ArrayIndexOutOfBoundsException, evidently straight out of the compiler.
Use case: nested functions are handy because they have access to the enclosing function’s arguments scope and local variables.

#test()#


Fix: refactor by promoting the nested functions to top-level functions and explicitly passing local variables to them as arguments.

Component search paths

ACF seems to search for components in the customtags directories. Railo by default does not.
Use case: no real use case. I suspect for most of us, the various flavours of name and path resolution are something we don’t comprehensively understand, we just tweak until things work.

Fix: There may be a way to get Railo to do this, but it’s just as easy (and a lot less magical) to add a cfimport tag.

“Is” test with java objects

ACF allows you to use java objects in “is” statements where strict type checking would suggest they can’t be used.
Use case: One way to deal with optional arguments is to set a default value of “”, and then check to see if the argument passed is “” or not.

	a string

	not a string

ACF will give “not a string”. Railo throws the “can’t compare complex object types as simple value” error. Note that for some built-in Java types (e.g. java.util.ArrayList) ACF will respond with the same error as Railo, no doubt because those types have special meaning in CFML.

Fix: Do the test using isObject(tmp) instead of tmp is "". Or, stop using empty string to mean undefined.

Equality of java objects

Even where java objects are type compatible, Railo won’t allow the use of “is”.


a = CreateObject("java", "com.medeserv.meseduframe.InteractionType").init("test");
b = a;

if (a is b) {
	writeoutput("same");
}

Crashes in Railo but works in ACF.

Fix: use a.equals(b) instead.

For/in loop with list


	for (i in "a,b,c") {
		writeoutput(i);
	}

Crashes in Railo, works in ACF 10.

Fix: Replace "a,b,c" with ListToArray("a,b,c")

Non-standard syntax for function argument default values


public void function test(
	required string a,
	any b default=""
) {
	variables.a = arguments.a;
	variables.b = arguments.b;
}

test("a");

if (variables.b is "") {
	writeoutput("empty");
}

Crashes in Railo, works in ACF. This is not the documented syntax for ACF, so the fix is obvious.

Fix: use the documented syntax, i.e. any b="" rather than any b default="".

Whitespace in functions

(Added 30/5/2014)

If I do, for example:

#blah()#

Railo will output all the whitespace in the function. That is, the actual output will be:

.. 
.. 
..bogus

(using dots for spaces so you can see them). ACF by default will just output “bogus”. Add the output="false" attribute to get rid of the whitespace. See https://issues.jboss.org/browse/RAILO-142 for some discussion of whose bug this is.

This is no biggy, but it can be very hard to debug if you don’t know about it. Log the return value of the function – no whitespace. Assign the return value of the function to a variable then output the variable – no whitespace. Just to mess with your head even more, if you do this:

#Len(blah())#

the output is

.. 
.. 
..5

So basically, functions are potentially doing two things at once: the usual function-y things; and, if they are called in a cfoutput-like context, emitting bytes to the browser output. I’m a little surprised that being evaluated inside a parameter list of another function can be regarded as a cfoutput-like context, but really I only found this one because I got slack with my output attributes.

ColdFusion: selectively handling missing methods

If you invoke a method on a ColdFusion object (aka CFC) that doesn’t exist, you get a handy exception telling you so, as you’d expect.

If, however, you implement the onMissingMethod method in your CFC (see the cfcomponent doc page, right down the bottom), all invocations of non-existent methods will be routed to it. You will never see another missing method exception for that object.

That’s fine and dandy, but what if you only want to handle specific method names, and have all others throw the missing method exception?

The short answer is that you’re on your own. Specifically:

  1. There is no equivalent of rethrow (remissing?)
  2. There is no way to manually recreate exactly the same exception and stack trace that the default missing method handler would give you.

So, if you don’t want to handle a method name, you can either throw an error of your choice, or fail silently.

BTW, I don’t think #2 is a big deal. I mention this only because what little discussion exists on this topic tends to suggest throwing the “same exception” that the ColdFusion engine does, and as far as I know that’s hard to impossible.

ColdFusion 9 behaviour with java.lang.Error

It seems that a java.lang.Error can cause problems for a CF9 server.

The test case looks like this:



A couple of interesting things happen when I run this:

Firstly, I get no error page. No response body at all, just a response header with a status of “500 blah”. Fair enough, java.lang.Error is supposed to mean a serious unrecoverable error so when receiving it ColdFusion is quite right in just stopping. File that under “things I should have known but for some reason never came across”.

Secondly, and more worryingly, if I refresh the page a couple of times I pretty quickly run into this:

Server Error
The server encountered an internal error and was unable to complete your request.

Application server is busy. Either there are too many concurrent requests or the server still is starting up.

with the accompanying entry in the Apache error log

returning error page for Jrun too busy or out of memory

In fact I might get that “server busy” error for any other page on the server, not just a refresh of my test page. After a while it “fixes” itself, as in no more server busy errors. Until the next time I run my test page.

What does this all mean? My guess would be that Jrun quite reasonably decides to nuke the thread that has reported the serious and unrecoverable error, but then quite unreasonably sends the next request into the post-nuke radioactive wasteland of that thread. The thread, in a fit of post-holocaust dudgeon, refuses to respond at all, and Apache can do nothing but convey the bad news to the long-suffering user. Once the thread has expired out of the thread pool the problem vanishes.

It’d be interesting to know if this happens in CF10. In the meantime, I’m off to work out why my java code is throwing a java.lang.Error in the first place.

Environment stuff: CF 9.02 running whatever version of Apache2 is current for Ubuntu 12.04; CF9.01 on ditto ditto Apache2 ditto Ubuntu 9.04

Speaking at cfObjective(ANZ) 2011

I’ve had the honour of being accepted as a speaker at this year’s cfObjective(ANZ) conference in Melbourne. My topic will be “Why bother with OOP?”, which is a question that needs to be asked from time to time. By the way, in case you think I might be either a procedural Luddite or a functional zealot, I think we should bother with OOP – but we should know why we are doing it.

It’s a live issue for ColdFusion in a way that doesn’t apply to, say, Java, because in ColdFusion we have some very effective ways to write simple but powerful apps without writing any OO code at all. Object orientation, like most software design techniques, is a way to manage complexity. What if your platform has abstracted away so much of the complexity that there’s not much left to manage? That’s the situation some simple ColdFusion apps are in.

If you can’t make it to the conference, I’ll blog a bit more about the talk after the fact (i.e. once I’ve written it).

Model-Glue

This is a response to Jeffry Houser’s critique of Model-Glue. You should read Jeffry’s post before this one, as I directly respond to some of his points. To cut to the chase – me too, Jeffry, me too!

I’ve used M-G for two medium size projects. Like Jeffry, I can’t see a case where I’d use it again.

I couldn’t agree more about the event/view structure – this is just a global scope by another name, which as a way of passing variables takes us back about 40 years in programming language evolution. Yes, there are intelligent ways to use it, but the fact is that a robust mechanism for defining APIs already exists in the language (public function parameter lists) and a really great argument needs to be mounted for disregarding it. So does the rest of M-G mount that argument?

It seems to me that the heart of M-G is the implicit invocation mechanism. Essentially this is an event-driven programming model, and like all event-driven programming it supports very strong decoupling. At the point where you raise an event, you have no control [but see Brian’s comment below] over who will handle the event or what they will do with it. This is a powerful technique with applicability where system behaviour needs to remain loosely specified until load-time or even run-time (this is why you can change the menu structure in Microsoft Word while it is running). The tradeoff is increased opacity, increased debugging difficulty, and greater emphasis on good design – or to put it conversely, it’s much easier to make a mess of it.

As a programming model, it absolutely is not what I want when I’m setting up an average web app. 99% of the time I know exactly what controller function I want to invoke, I know exactly what data I need, and stating that with clarity is good design. Adding several layers of indirection adds no value at all – rather it greatly increases the risk of regression during future changes. As mentioned above, M-G tends to obscure the APIs of the various components rather than help define them. This is not to say everything should be hardwired. My beef with M-G is that it pervades the whole application, unlike techniques such as dependency injection and aspect-oriented programming, which let me introduce extra abstraction and complexity only where I get the payoff.

As a piece of software, M-G is a great achievement. It’s just the wrong tool for pretty much every job I have. The tragic thing is that, even if I did have to write a complex event-driven GUI, I’m pretty sure I wouldn’t be using ColdFusion to do it.

P.S.
A minor disagreement – I don’t think there’s anything wrong with the view having a dependency on the model. In fact it’s kind of absurd to think that a view can avoid having a dependency on the data it is representing. The important thing is that the model doesn’t have a dependency on the view. (Trygve Reenskaug’s original MVC pattern is instructive in this regard, although it’s not directly applicable to the web). So having to pipe all data via the controller is another layer of useless indirection. Having said that, there’s a fair bit of confusion about where the boundaries between the controller and the view are, so maybe this is just an issue of definition.

Speaking at cf.objective (ANZ)

I’m delighted to have been accepted to speak at cf.objective (ANZ) again. The topic will be that timeless old favourite, design patterns. Timeless? Well, since about (turns around to check jacket of GoF book) 1994, anyway. Not entirely coincidentally, the centre where I work has got into design patterns in the education space, so I’ll be able to draw on that to show just how widely relevant the design pattern approach is.
Excited! Come and heckle if you’re in Melbourne Nov 18 and 19.

More fun with the (ColdFusion) truth

In my last post I discovered that ColdFusion true is the string "true". You can compare it to a java boolean true using “is”, but not using .equals(). Fair enough, one is a string and the other isn’t. “is” is ColdFusion magic, and .equals() is just poor dumb java. And if you’re wondering, “eq” is just as magic as “is”.

Does that mean all ColdFusion booleans are actually strings? No! The result of a boolean expression in ColdFusion is actually a java boolean.


a = true; // this is a string
b = "true"; // this is the same string - literally the same in-memory object
c = a is true; // this is a boolean
d = b is true; // this is literally the same boolean

a is c; // this is true
a.equals(c); // this is false
// etc.

So it turns out that true is not, as I thought, a built-in boolean constant, it’s actually a special syntactical case for creating a string – that can then be used in ColdFusion boolean expressions. This is reminding me of my perl days (that’s a good thing – I loved perl).

If you never had to interoperate with Java none of this would matter. The booleans and strings intermix seamlessly within ColdFusion. If you do, it’s important to know that ColdFusion actually does not have any built-in boolean constants that Java will recognise.


myJavaFunc(true);  // this won't work

myJavaFunc(true is true); // Somewhat disturbingly, this will

myJavaFunc(javacast("boolean", true)); // so will this

myJavaFunc(CreateObject("java","java.lang.Boolean").init("true")); // and so will this

I’d tend to go with the third form using javacast. Use the second form only if you want to mess with your junior programmers’ heads 🙂 Reminds me of those crazy funsters who put “and 1=1” at the end of all their SQL queries.