Email Boxes need to be stored in DB, but also call IMAP, APIs, etc.

I often find myself modelling a situation where there are rows in the database (e.g. “email boxes” for a user), and these rows represent things that exist elsewhere as well (e.g. IMAP accounts to back up these email boxes). There can be multiple ways of accessing these external resources, e.g. to delete an email box one deletes files on some server, while to find out how much space is used there is an HTTP-based protocol. And in the case of creation and deletion (and changing of password) these operations should not be done synchronously from the web interface, but are queued. This is not a contrived example; I am programming exactly this right now. All of the above are givens.

To avoid stuffing all the various API clients and other functionality into one huge class, there need to be different objects representing:

  • The “email box” row in the database (and a persistence mechanism)
  • A “filesystem” object to represent operations on the filesystem such as “delete email box”. This object knows the directory layout used. This object can be shared between other objects which need to perform filesystem operations, such as a filestore accessible via FTP accounts (in this case). It’s convenient to program all these filesystem operations in one object.
  • A client for the HTTP-based protocol, to find out the box’s used size. In this case the protocol can do other functions, such as finding the space used in the FTP filestore. Again, it’s convenient to put all these operations in one class: one can create private methods to connect to the server, or for common API requirements such as response parsing which will be the same for all the commands, etc.
  • Persistable Queue objects, and QueueProcessor objects representing the programs or tasks to change the password, create/delete the boxes, etc.
  • Some Facade object to simplify access to all the above?

Once one has come up with these objects in the system, there are a number of possibilities for how to combine them. E.g.

  • When one asks the HTTP protocol client object to find out the space used for a box, should one pass the parameter (of which box) as a Box object, or the name of the box and password as a String?
  • Should an application program (e.g. web interface) instantiate and use the HTTP protocol client object directly, to find the space used? Or should it call a method on the Box object, which calls the HTTP protocol object? Should both possibilities be available?
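
To make the first question concrete, here is a minimal sketch of the two candidate signatures side by side. All names (Box, TypedUsageClient, StringUsageClient, SignatureDemo) are hypothetical and chosen for illustration only, and the lambdas are stubs standing in for the real HTTP call.

```java
/** Hypothetical value object holding what is needed to address a box. */
final class Box {
    private final String name;
    private final String password;

    Box(String name, String password) {
        this.name = name;
        this.password = password;
    }

    String getName() { return name; }
    String getPassword() { return password; }
}

/** Alternative 1: the protocol client takes the Box itself. */
interface TypedUsageClient {
    long getUsedSizeBytes(Box box);
}

/** Alternative 2: the protocol client takes plain Strings and knows nothing about Box. */
interface StringUsageClient {
    long getUsedSizeBytes(String boxName, String password);
}

class SignatureDemo {
    public static void main(String[] args) {
        Box box = new Box("alice", "secret");

        // Stub implementations; a real client would talk to the server here.
        TypedUsageClient typed = b -> 1024L;
        StringUsageClient plain = (name, password) -> 1024L;

        // Same question asked through each signature.
        System.out.println(typed.getUsedSizeBytes(box));
        System.out.println(plain.getUsedSizeBytes(box.getName(), box.getPassword()));
    }
}
```

With the first signature the client depends on Box; with the second, the caller has to unpack the Box before calling.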

On the one hand, to simplify all objects, it would make sense to have the application program talk to the HTTP protocol object directly, and not to have this code in the Box object at all; and to always pass a Box object, as this encourages strict typing.

However, I have found time and time again that the following solution works best:

  • Not have multiple ways of performing the same action.
  • Have a main “Box” object, which acts as a Facade. This represents a particular box. (i.e. not a BoxService stateless facade object, which each time takes a BoxId as a parameter to every function.)
  • Optionally have other objects to delegate to, concerning persistence of the box and its attributes to the database (although I prefer not)
  • A Box object knows the life cycle of a Box, and knows when to write things to queues etc. This will also need to be exposed in its interface (e.g. addCreationRequestToQueue) and explained in the class Javadoc. If this lifecycle changes (e.g. queue introduced for a certain operation) the interface will change and clients will have to be updated. But that’s OK, as probably there will be a requirement in the front-end to display “performing…” as long as the operation is in the queue. So lots will have to change if you change the life cycle.
  • This object also knows how to perform the operations which are normally queued, e.g. “delete”, in terms of simply calling the “filesystem” object. It may also need to update some internal flags to note that the filesystem no longer exists. These methods are normally only called from QueueProcessor objects, but are also handy to call from JUnit tests (e.g. in the case of “create”), to put the system in some state that is necessary for further tests. The QueueProcessor does not do much, apart from calling the methods on the Box to perform the operation.
  • Applications call Box for all their requests and never call Filesystem. That way if the implementation changes (no longer direct “rm” but now over the HTTP API) the application does not need to change (note that such changes are ones which do not affect the life cycle of the Box, or introduce extra states such as “in queue but not done yet”). But more importantly I just think it’s a lot more readable to say “Box b = getBox(); b.getUsedSizeBytes(); b.deleteFromFilesystem()”.
  • The individual objects such as the “filesystem” object take Strings not Boxes as parameters. This makes those classes marginally simpler. More importantly it doesn’t feel right when there’s a two-way dependency, i.e. Box needs Filesystem (to call it to implement “delete” methods) and Filesystem needs Box (in its method signatures). And the only place that the Filesystem is going to be called is from Box instance methods, and the Box has all the information such as username, password, and any other information, within its instance variables.
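
The arrangement described in the list above could be sketched roughly as follows. This is only an illustration under assumptions, not the actual code: addCreationRequestToQueue, getUsedSizeBytes and deleteFromFilesystem are the method names mentioned above, while everything else (Filesystem, UsageClient, InMemoryFilesystem, CreationQueueProcessor, createOnFilesystem, existsOnFilesystem) is a hypothetical name, and in-memory stand-ins replace the real filesystem, HTTP API and persisted queue.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

/** Filesystem operations; takes Strings, so it has no dependency on Box. */
interface Filesystem {
    void createBox(String boxName);
    void deleteBox(String boxName);
    boolean exists(String boxName);
}

/** Client for the HTTP-based protocol; also takes Strings only. */
interface UsageClient {
    long getUsedSizeBytes(String boxName, String password);
}

/** In-memory stand-in (assumption) for the object that knows the directory layout. */
class InMemoryFilesystem implements Filesystem {
    private final Set<String> boxes = new HashSet<>();
    public void createBox(String boxName) { boxes.add(boxName); }
    public void deleteBox(String boxName) { boxes.remove(boxName); }
    public boolean exists(String boxName) { return boxes.contains(boxName); }
}

/** Facade for one particular box; the only class applications talk to. */
class Box {
    private final String name;
    private final String password;
    private final Filesystem filesystem;
    private final UsageClient usageClient;
    private final Queue<Box> creationQueue; // stand-in for a persisted queue
    private boolean onFilesystem = false;   // internal flag, updated by the operations below

    Box(String name, String password, Filesystem filesystem,
        UsageClient usageClient, Queue<Box> creationQueue) {
        this.name = name;
        this.password = password;
        this.filesystem = filesystem;
        this.usageClient = usageClient;
        this.creationQueue = creationQueue;
    }

    /** Called by applications: creation is queued, not performed synchronously. */
    void addCreationRequestToQueue() { creationQueue.add(this); }

    /** Normally called only by a queue processor, or directly from tests. */
    void createOnFilesystem() {
        filesystem.createBox(name);
        onFilesystem = true;
    }

    void deleteFromFilesystem() {
        filesystem.deleteBox(name);
        onFilesystem = false;
    }

    boolean existsOnFilesystem() { return onFilesystem; }

    /** Delegates to the protocol client, passing Strings from instance variables. */
    long getUsedSizeBytes() { return usageClient.getUsedSizeBytes(name, password); }
}

/** Does little beyond calling the Box method that performs the operation. */
class CreationQueueProcessor {
    private final Queue<Box> creationQueue;

    CreationQueueProcessor(Queue<Box> creationQueue) { this.creationQueue = creationQueue; }

    void processAll() {
        Box box;
        while ((box = creationQueue.poll()) != null) {
            box.createOnFilesystem();
        }
    }
}

class BoxDemo {
    public static void main(String[] args) {
        Filesystem fs = new InMemoryFilesystem();
        Queue<Box> queue = new ArrayDeque<>();
        UsageClient usage = (name, password) -> 0L; // stub for the HTTP call

        Box b = new Box("alice", "secret", fs, usage, queue);
        b.addCreationRequestToQueue();                  // web interface
        new CreationQueueProcessor(queue).processAll(); // background task
        System.out.println(b.getUsedSizeBytes());
        b.deleteFromFilesystem();
    }
}
```

Here the dependency runs one way only: Box knows Filesystem and UsageClient, and neither of them mentions Box in its signatures. A test can call createOnFilesystem() directly to skip the queue.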
