Running Tomcat from daemontools

I’m a big fan of the daemontools server framework for quite a few reasons. For one, it’s incredibly stable. So stable, I use it to watch Apache and restart it when it crashes because daemontools never crashes. “Depend in the direction of stability” is my mantra.

Your server might crash, but daemontools will ruthlessly restart it, and since it runs from inittab, the OS will restart daemontools if it ever crashes. But, like I said, daemontools never crashes.

Another reason I love daemontools is that it does automatic logging and log rotation, so you can create as many servers as you need and you don’t have to worry that the one last server you threw in there in a hurry won’t get its logs rotated, causing your hard drive to fill up.

Tomcat, on the other hand, has never been very friendly to the system administrator. It creates many different log files at once and rotates them by renaming them with date stamps; this makes it really annoying when you’re trying to tail the log files. Log messages tend to be large and span multiple lines, so it’s very hard to see what’s going on with all the noise. Killing Tomcat requires running a shell script that sends a shutdown command to a socket, completely breaking UNIX convention for no good reason.

I decided to bite the bullet and get Tomcat to run under daemontools instead, and I’ve never been happier (well, given that I’m still talking about Java here, let’s just say I’ve never been less unhappy…). Here’s how I did it:

My run script looks like this:

#!/bin/sh
exec 2>&1
exec envdir ./env setuidgid tomcat /usr/local/tomcat/bin/catalina.sh run

In my env directory I have two files that set up my environment: JAVA_HOME and CLASSPATH. JAVA_HOME contains the following:

/usr

CLASSPATH contains this (I’ll explain why in a bit):

/usr/local/tomcat/bin/tiny-formatter.jar

I had to comment out the following line at the top of tomcat/bin/setclasspath.sh to keep it from clobbering the CLASSPATH environment variable:

# First clear out the user classpath
# CLASSPATH=

Now, tiny-formatter.jar is a little hack that makes Tomcat’s logger use only one line per log message and removes the datestamp, since multilog adds one already. It contains a class file, TinyFormatter.class, generated by compiling the following source file, TinyFormatter.java:

import java.io.PrintWriter;
import java.io.StringWriter;

import java.util.logging.Formatter;
import java.util.logging.LogRecord;

public class TinyFormatter extends Formatter
{
    static final String lineSep = System.getProperty("line.separator");

    // Emits "LEVEL: message (Class.method)" on one line, with no timestamp.
    public String format(LogRecord record)
    {
        StringBuffer buf = new StringBuffer(180);
        buf.append(record.getLevel());
        buf.append(": ");
        buf.append(formatMessage(record));
        buf.append(" (");
        buf.append(record.getSourceClassName());
        buf.append('.');
        buf.append(record.getSourceMethodName());
        buf.append(')');
        buf.append(lineSep);

        // If an exception is attached, its stack trace follows on its own lines.
        Throwable throwable = record.getThrown();
        if (throwable != null) {
            StringWriter sink = new StringWriter();
            throwable.printStackTrace(new PrintWriter(sink, true));
            buf.append(sink.toString());
        }

        return buf.toString();
    }
}

Okay, maybe it’s not so tiny. That’s Java for ya.

To install this custom formatter, I edited tomcat/conf/logging.properties and replaced its contents with the following:

handlers = java.util.logging.ConsoleHandler
.handlers = java.util.logging.ConsoleHandler

java.util.logging.ConsoleHandler.level = FINE
java.util.logging.ConsoleHandler.formatter = TinyFormatter

The run script for the daemontools logger is pretty standard:

#!/bin/sh
exec setuidgid tomcat multilog t /var/multilog/tomcat

I can now start and stop Tomcat with “svc -u /service/tomcat” and “svc -d /service/tomcat”, and restarting is the usual “svc -t /service/tomcat”. To learn all the details I’ve left out, I highly recommend reading the djb way. I’m out of time. ;)
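
One detail worth spelling out: the “t” in the multilog command prefixes every log line with a TAI64N label, which looks like an “@” followed by 24 hex digits. The tai64nlocal program that ships with daemontools makes these readable, but if you ever need to decode one in a script, here is a minimal Python sketch. The function name is my own, and it assumes the fixed 10-second TAI offset that daemontools itself applies, ignoring later leap seconds.

```python
from datetime import datetime, timezone

# 2**62 is the TAI64 base for 1970; the extra 10 seconds is the fixed
# offset daemontools adds to the Unix clock (assumption: no leap-second table).
TAI64_OFFSET = 2**62 + 10


def parse_tai64n(label):
    """Convert a label such as '@400000000000000a00000000' to a UTC datetime."""
    if not label.startswith("@") or len(label) != 25:
        raise ValueError("not a TAI64N label: %r" % (label,))
    seconds = int(label[1:17], 16) - TAI64_OFFSET   # 16 hex digits of seconds
    nanoseconds = int(label[17:25], 16)             # 8 hex digits of nanoseconds
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.replace(microsecond=nanoseconds // 1000)
```

Splitting a log line on the first space and passing the first field to parse_tai64n is enough to sort or filter entries by time.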

PMA Scanbots

I can’t think of any good reason why you’d want to put your phpMyAdmin installation in any of the following locations:

  1. /MYADMIN/
  2. /MYadmin/
  3. /MyAdmin/
  4. /PHPMYADMIN/
  5. /PHPMYadmin/
  6. /PHPmyadmin/
  7. /PMA/
  8. /PhPmYaDmIn/
  9. /admin/
  10. /admin/mysql/
  11. /admin/phpmyadmin/
  12. /admin/pma/
  13. /db/
  14. /dbadmin/
  15. /myADMIN/
  16. /myadmin/
  17. /mysql-admin/
  18. /mysql/
  19. /mysqladmin/
  20. /pHpMyAdMiN/
  21. /phpMYadmin/
  22. /phpMyAdmin-2.2.0/
  23. /phpMyAdmin-2.2.3/
  24. /phpMyAdmin-2.2.6/
  25. /phpMyAdmin-2.2.7-pl1/
  26. /phpMyAdmin-2.2.7/
  27. /phpMyAdmin-2.5.1/
  28. /phpMyAdmin-2.5.4/
  29. /phpMyAdmin-2.5.6/
  30. /phpMyAdmin-2.6.4-pl4/
  31. /phpMyAdmin-2.6.4/
  32. /phpMyAdmin-2.7.0-pl2/
  33. /phpMyAdmin-2.7.0/
  34. /phpMyAdmin-2.8.1/
  35. /phpMyAdmin-2.8.2.1/
  36. /phpMyAdmin-2.8.2.2/
  37. /phpMyAdmin-2.8.2.4/
  38. /phpMyAdmin-2.9.0.1/
  39. /phpMyAdmin-2.9.0.2/
  40. /phpMyAdmin-2.9.0/
  41. /phpMyAdmin-2.9.1/
  42. /phpMyAdmin/
  43. /phpmyADMIN/
  44. /phpmyadmin/
  45. /phpmyadmin2/
  46. /pma/
  47. /pmamy/
  48. /web/phpMyAdmin/

It’s a jungle out there.

I did my taxes with MySQL

For the second year now, I used MySQL to do my taxes. I find that even with GainsKeeper, it’s tedious to do investment taxes with TurboTax because it takes a considerable amount of research (even with a small portfolio like mine) to provide the information needed to calculate the cost basis: When did I buy this stock or fund? Did I buy multiple lots? Were there reinvested dividends? Was this the first sale? If not, when else did I sell?

As geeky as it sounds, SQL is a powerful tool to answer these kinds of ad-hoc questions. It can transform investment taxes from a several-week project into something you can do in half a day (which I did, by the way). Here are the tables I use, which are pretty close to E*Trade’s CSV export format:

CREATE TABLE `security` (
`cusip` varchar(32) NOT NULL,
`symbol` varchar(32) NOT NULL,
`description` varchar(255) NOT NULL,
PRIMARY KEY (`cusip`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

CREATE TABLE `trade` (
`trade_date` date NOT NULL,
`order_type` enum('BUY','SELL') NOT NULL,
`cusip` varchar(32) NOT NULL,
`description` varchar(255) NOT NULL,
`quantity` decimal(16,8) NOT NULL,
`executed` decimal(16,8) NOT NULL,
`commision` decimal(16,8) NOT NULL,
`net_amount` decimal(16,8) NOT NULL,
PRIMARY KEY (`trade_date`,`order_type`,`cusip`,`quantity`,`executed`),
KEY `cusip` (`cusip`),
CONSTRAINT `trade_ibfk_1` FOREIGN KEY (`cusip`) REFERENCES `security` (`cusip`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

I also have a couple of views that show just buys or sells for convenience:

CREATE VIEW `buys` AS select `t`.`trade_date` AS `trade_date`,`t`.`order_type` AS `order_type`,`t`.`cusip` AS `cusip`,`t`.`description` AS `description`,`t`.`quantity` AS `quantity`,`t`.`executed` AS `executed`,`t`.`commision` AS `commision`,`t`.`net_amount` AS `net_amount`,`s`.`symbol` AS `symbol`,`s`.`description` AS `symbol_description` from (`trade` `t` join `security` `s` on((`t`.`cusip` = `s`.`cusip`))) where (`t`.`order_type` = 'BUY')

CREATE VIEW `sells` AS select `t`.`trade_date` AS `trade_date`,`t`.`order_type` AS `order_type`,`t`.`cusip` AS `cusip`,`t`.`description` AS `description`,`t`.`quantity` AS `quantity`,`t`.`executed` AS `executed`,`t`.`commision` AS `commision`,`t`.`net_amount` AS `net_amount`,`s`.`symbol` AS `symbol`,`s`.`description` AS `symbol_description` from (`trade` `t` join `security` `s` on((`t`.`cusip` = `s`.`cusip`))) where (`t`.`order_type` = 'SELL')

Finding all the purchases I made for a particular stock is as simple as this:

mysql> select * from buys where cusip = '007903107' order by trade_date \G
*************************** 1. row ***************************
trade_date: 2005-12-06
order_type: BUY
cusip: 007903107
description: ADV MICRO DEVICES
quantity: 5.00000000
executed: 27.59000000
commision: 12.99000000
net_amount: 150.94000000
symbol: AMD
symbol_description: ADV MICRO DEVICES
1 row in set (0.00 sec)

I decided to use CUSIP numbers as primary keys for securities since they are more stable and work for bonds as well as stocks. This makes things a little bit annoying because I always have to look the CUSIP up from the security table. That’s what I get for trying to “do the right thing” I guess. =)
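
To show the kind of ad-hoc question I mean, here is a small self-contained sketch that answers “which lots did I buy before this sale?” It is only an illustration: it uses Python’s sqlite3 module in place of MySQL, a cut-down version of the trade table, amounts stored as integer cents (SQLite has no exact decimal type), and every row except the first is made up.

```python
import sqlite3


def open_demo_db():
    """Build an in-memory database with a simplified trade table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE trade (
               trade_date TEXT NOT NULL,
               order_type TEXT NOT NULL CHECK (order_type IN ('BUY', 'SELL')),
               cusip      TEXT NOT NULL,
               quantity   REAL NOT NULL,
               net_cents  INTEGER NOT NULL
           )"""
    )
    conn.executemany(
        "INSERT INTO trade VALUES (?, ?, ?, ?, ?)",
        [
            ("2005-12-06", "BUY", "007903107", 5.0, 15094),   # the lot shown above
            ("2006-03-01", "BUY", "007903107", 10.0, 30000),  # hypothetical lot
            ("2007-01-15", "SELL", "007903107", 5.0, 9000),   # hypothetical sale
        ],
    )
    return conn


def lots_before(conn, cusip, sale_date):
    """Return (date, quantity, cost in cents) for each buy made before sale_date."""
    cursor = conn.execute(
        "SELECT trade_date, quantity, net_cents FROM trade "
        "WHERE order_type = 'BUY' AND cusip = ? AND trade_date < ? "
        "ORDER BY trade_date",
        (cusip, sale_date),
    )
    return cursor.fetchall()
```

Calling lots_before(open_demo_db(), "007903107", "2007-01-15") returns both purchase lots in date order, which is the raw material for working out the cost basis of that sale.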

Ten things that XML-RPC does… that REST leaves unspecified

  1. Standard, cross-language, typeful serialization of data
  2. User-defined error codes and messages
  3. “Boxcarring” of requests to reduce overhead
  4. Serialization of binary content
  5. Serialization of date-time values
  6. Standardized parameter passing
  7. Introspection allowing for straightforward code generation
  8. High-level APIs for just about every language
  9. No manual parsing of XML, ever
  10. Only three lines to call a function in Python and several other languages:

>>> import xmlrpclib
>>> s = xmlrpclib.Server('http://localhost/xmlrpc')
>>> s.demo.addTwoNumbers(3, 4)
7

Not that REST doesn’t have its benefits, but someone ought to be saying this. XML-RPC isn’t complicated like SOAP, it runs just about everywhere, and it lets you get on with your work rather than arguing about semicolons versus slashes or XML versus JSON or countless other things. Besides, when your goal is to support as many languages as possible, you want to minimize the amount of code you write for each language. As far as I’ve seen, nothing else accomplishes literally no-code binding like XML-RPC.
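
Here is a small sketch of a few items from the list (typed serialization, binary content, date-time values, and error codes) that runs without any server. It uses Python 3, where the xmlrpclib module is called xmlrpc.client; the method name “demo.echo” and the fault code 42 are made up for the example.

```python
import datetime
import xmlrpc.client


def round_trip(params, methodname="demo.echo"):
    """Encode a call as XML-RPC and decode it again, as a server would."""
    payload = xmlrpc.client.dumps(params, methodname=methodname)
    decoded, name = xmlrpc.client.loads(payload, use_builtin_types=True)
    return decoded, name


def fault_round_trip(code, message):
    """Encode an error response and return the (code, message) a client sees."""
    payload = xmlrpc.client.dumps(xmlrpc.client.Fault(code, message))
    try:
        xmlrpc.client.loads(payload)
    except xmlrpc.client.Fault as fault:
        return fault.faultCode, fault.faultString
    raise AssertionError("a fault payload should raise Fault when decoded")


if __name__ == "__main__":
    args = (3, "four", 2.5, True, b"\x00\x01",
            datetime.datetime(2008, 1, 2, 3, 4, 5), [1, 2], {"key": "value"})
    decoded, name = round_trip(args)
    print(name, decoded == args)
    print(fault_round_trip(42, "no such doodad"))
```

Integers, strings, floats, booleans, bytes, dates, lists, and dictionaries all come back with their types intact, and nobody had to look at the XML.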

Date by example

Here’s a script I wrote a while ago. It parses an example of how you want your date/time values formatted, and builds a format string you can pass to date() or strftime(). Soon after I wrote it, I posted it to reddit, and I don’t think anyone understood it because they all modded me down for it. I ended up removing it to save my karma. As if that matters. Anyway, it is useful, and I still use it from time to time. Much less work than consulting the manual.

If you want the source code, drop me a line.

spoomusic.com redesign for 2008

I launched the new WordPress-driven spoomusic.com website this month, which was a fun and rewarding experience. The old site was written without a database, using the filesystem for nearly everything. As crazy as it sounds, I think this was exactly the right thing to do at the time because the content was primarily MP3s and images. In addition, since the design was about as simple as it could possibly be, it was rather straightforward to bulk import all of the content (news, interviews, album text, etc.) using WordPress’s XML-RPC interface. (And I did it in OCaml using my library, xmlrpc-light, so I even had fun doing it!)

There are many advantages to using WordPress that I already enjoy: all albums and other content can get comments, which is a feature I wanted for a long time but never had the energy to write from scratch; albums can now be “drafts”, which is so awesome, since we never get albums right the first time, and it’s nice that the public doesn’t have to endure our accidents; artists can edit their own album content; artists each get their own blog… I could go on…

By the way, I’m writing this using the latest SVN trunk of WordPress, which has a totally redesigned administration interface. It looks like it’s still a work in progress, and I’m not sure what I think of it yet. It sure is different. Try it out sometime.

To laugh or cry?

I came up with the following whopper this week. It’s a thread-local LRU cache using Java 5’s generics and two nested anonymous subclasses.

protected static final int MAX_CACHE_ENTRIES = 10;
protected static ThreadLocal<Map<String, List<Doodad>>> doodadCache =
    new ThreadLocal<Map<String, List<Doodad>>>() {
    protected Map<String, List<Doodad>> initialValue() {
        return new LinkedHashMap<String, List<Doodad>>(MAX_CACHE_ENTRIES + 1, 0.75f, true) {
            protected boolean removeEldestEntry(Map.Entry eldest) {
                return size() > MAX_CACHE_ENTRIES;
            }
        };
    }
};

I told Al I didn’t know whether to laugh or cry. He said, “There are some who would find that sort of thing elegant. It’s almost like lisp,” to which I responded, “yeah, like lisp with way too much syntax.”
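
For comparison, here is roughly the same thing sketched in Python: an access-ordered map that evicts its eldest entry, with one instance per thread. The class and function names are made up for the example, and it is a sketch rather than a drop-in replacement for the Java version.

```python
import threading
from collections import OrderedDict

MAX_CACHE_ENTRIES = 10


class LRUCache:
    """A small least-recently-used cache built on an ordered dictionary."""

    def __init__(self, max_entries=MAX_CACHE_ENTRIES):
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def get(self, key, default=None):
        if key not in self._entries:
            return default
        self._entries.move_to_end(key)          # mark as most recently used
        return self._entries[key]

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)   # drop the least recently used

    def __contains__(self, key):
        return key in self._entries

    def __len__(self):
        return len(self._entries)


_local = threading.local()


def doodad_cache():
    """Return the calling thread's own cache, creating it on first use."""
    cache = getattr(_local, "cache", None)
    if cache is None:
        cache = _local.cache = LRUCache()
    return cache
```

Because each thread gets its own cache, no locking is needed, which is the same trade-off the ThreadLocal makes in the Java version.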

Easy rotating database backups with logrotate

“logrotate” is a utility that comes with most Linux distributions. Its intended purpose is to rename, compress, and delete old log files. If you look in /var/log, you’ll probably see files like the following:

/var/log/auth.log
/var/log/auth.log.0
/var/log/auth.log.1.gz
/var/log/auth.log.2.gz
/var/log/auth.log.3.gz

This is “logrotate” in action. Recently, I realized that this is the same kind of behavior I’d like for my database backups. With a little creative misuse I was able to turn it into a backup rotator for MySQL backups. Here’s how to do it:

Create the directories /usr/local/mysql-backup and /var/backups/mysql. Paste the following into /usr/local/mysql-backup/mysql-backup.logrotate.conf:

/var/backups/mysql/*.sql {
rotate 7
daily
compress
missingok
nocreate
}

To read about these and other configuration options, type “man logrotate”. Now, create a shell script in /usr/local/mysql-backup/mysql-backup with the following:

#!/bin/sh
cd /usr/local/mysql-backup
/usr/sbin/logrotate -f -s mysql-backup.logrotate.state mysql-backup.logrotate.conf
/usr/bin/mysqldump -uUSERNAME -pPASSWORD -A > /var/backups/mysql/database.sql

If you want weekly backups as well, create a directory called /var/backups/mysql-weekly, and add the following line to the mysql-backup script:

gzip -c /var/backups/mysql/database.sql > /var/backups/mysql-weekly/database-`date +%G-%V`.sql.gz

Now all you need to do is “chmod +x mysql-backup” and “ln -s /usr/local/mysql-backup/mysql-backup /etc/cron.daily/mysql-backup” (the link target must be an absolute path, because a relative one would be resolved inside /etc/cron.daily) and you’ll get nicely rotated and compressed database backups every day. If you want hourly backups, symlink it to “cron.hourly” instead, and change “rotate 7” to taste in the logrotate configuration.
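
A note on the “date +%G-%V” in the weekly file name: %G is the ISO 8601 week-numbering year and %V is the ISO week number, and they have to be used together. Around New Year the ISO year can differ from the calendar year, so mixing %Y with %V would produce misleading names. The following Python sketch (the function name is my own) reproduces the naming scheme so you can see what happens at the boundary.

```python
import datetime


def weekly_backup_name(day):
    """Mirror `date +%G-%V`: ISO week-numbering year plus two-digit ISO week."""
    iso_year, iso_week, _weekday = day.isocalendar()
    return "database-%04d-%02d.sql.gz" % (iso_year, iso_week)


if __name__ == "__main__":
    # Monday 29 December 2008 already belongs to ISO week 1 of 2009.
    print(weekly_backup_name(datetime.date(2008, 12, 29)))  # database-2009-01.sql.gz
```

Every backup taken within the same ISO week overwrites the same file, so the weekly directory ends up holding the last backup of each week.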

Better password hashing for PHP

I’ve been reading and re-reading this article, which explains the problems with MD5 as a password hashing technique and gives alternatives that are more secure. The author declares bcrypt the winner, though it seems to be practically available only on BSD. I did some searching around for a PHP solution and found a library called phpass, which will use the Blowfish-based bcrypt method if available and otherwise fall back on a hardened MD5- or crypt-based approach. The library is really easy to use, and I think I will start using it on future projects where I can control the password hashing scheme.

Also, apparently, I’ve had the wrong idea about what a “salt” is. Appending a constant string to the password before hashing it is not a salt; it just creates a different hash function that is just as easy to attack with a rainbow table, assuming the attacker knows what that constant string is (i.e. security through obscurity). To use a salt correctly, you need to generate a random salt for each password and store it in the clear along with the hashed password. This is what crypt has been doing for decades.
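
To make the difference concrete, here is a minimal Python sketch of per-password random salts. It is not phpass and not bcrypt: Python’s standard library has no bcrypt, so PBKDF2-HMAC-SHA256 stands in as the slow hash, and the iteration count and the “salt$digest” storage format are assumptions picked for the example.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; tune it for your hardware


def hash_password(password, salt=None):
    """Return 'salt$digest' in hex, generating a fresh random salt if none is given."""
    if salt is None:
        salt = os.urandom(16)  # a new random salt for every password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt.hex() + "$" + digest.hex()


def verify_password(password, stored):
    """Re-hash the candidate with the stored salt and compare in constant time."""
    salt_hex, _digest_hex = stored.split("$")
    candidate = hash_password(password, bytes.fromhex(salt_hex))
    return hmac.compare_digest(candidate, stored)
```

Hashing the same password twice gives two different stored values because the salts differ, which is exactly what defeats a precomputed rainbow table; the salt sits in the clear next to the digest and that is fine.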