Several years ago now, I had an idea of how to calculate absolute binding free energies by mutating ligands (or solutes) into water. The method would use two simulation boxes, one containing just bulk water, and one containing the solvated protein-ligand complex. Both boxes would be connected to the same temperature, pressure and particle baths, so they would be in perfect equilibrium with one another (think Gibbs ensemble). A lambda coordinate would then allow the ligand to be transferred from the protein box to the water box while, simultaneously, a corresponding volume of water was transferred from the water box to the protein box. This would only work if the water that occupied the same volume as the ligand could be identified, and that identity constrained during the simulation. Lambda would switch off the ligand in the protein box while switching it on in the water box, and at the same time switch off the identified water in the water box while switching it on in the protein box. The effect is to use lambda to transfer the ligand directly from the protein to bulk solvent, and so to calculate the absolute binding free energy directly. For this to work, a simulation program would have to be written that allowed multiple boxes with different volumes (spaces) to be run together, that allowed an identity constraint to be created to constrain the identity of the waters occupying the volume corresponding to the ligand, and that provided multiple forcefield types to handle the softening of the intermolecular interactions as the ligand and identified waters were swapped. In addition, replica exchange moves would be needed to enhance sampling of the protein as the ligand was removed, and flexible parallel libraries would be needed to allow large numbers of lambda values to be run efficiently over a cluster. Essentially, I needed to write Sire.
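To make the lambda coupling described above concrete, here is a minimal sketch in plain Python. It is not Sire code and does not use the Sire API: the names `CouplingWeights`, `coupling_weights` and `total_energy` are hypothetical, and the linear scaling is an assumption made purely for illustration (the real method softens the intermolecular interactions rather than scaling them linearly).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CouplingWeights:
    """Interaction scale factors for the four coupled groups at one lambda."""
    ligand_in_protein_box: float
    ligand_in_water_box: float
    water_in_protein_box: float
    water_in_water_box: float


def coupling_weights(lam: float) -> CouplingWeights:
    """Return the scale factors at a given lambda.

    Assumption: simple linear scaling, used only to show the bookkeeping.
    lam = 0: ligand on in the protein box, identified waters on in the water box.
    lam = 1: ligand on in the water box, identified waters on in the protein box.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lambda must lie in [0, 1]")
    return CouplingWeights(
        ligand_in_protein_box=1.0 - lam,  # ligand switched off in the protein box
        ligand_in_water_box=lam,          # ligand switched on in the water box
        water_in_protein_box=lam,         # identified water switched on in the protein box
        water_in_water_box=1.0 - lam,     # identified water switched off in the water box
    )


def total_energy(lam: float, e_lig_prot: float, e_lig_wat: float,
                 e_wat_prot: float, e_wat_wat: float) -> float:
    """Combine the four interaction energies using the lambda-dependent weights."""
    w = coupling_weights(lam)
    return (w.ligand_in_protein_box * e_lig_prot
            + w.ligand_in_water_box * e_lig_wat
            + w.water_in_protein_box * e_wat_prot
            + w.water_in_water_box * e_wat_wat)
```

At lambda = 0 only the ligand-in-protein and water-in-bulk terms contribute, and at lambda = 1 only the swapped pair does, which is the direct transfer the method relies on.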
Over the last few weeks, I have finally been running these simulations, and Sire has been excellent. As is now normal, the code itself has no idea about the simulation - everything is handled by combining simulation building blocks within a Python script. This has made method and protocol development really easy, as I can play with the intermolecular interactions and reconfigure the simulation with the minimum of code. Sire is now finally allowing me to develop simulations that are completely beyond anything that can be run in any other package (the method requires three simulation spaces - a periodic box for the protein, a periodic box for the bulk water, and an infinite Cartesian space for the ligand and identified waters, all connected to the same pressure piston, with forcefields that use different boxes for different interactions - could you imagine trying to implement this in ProtoMS?!?). The dream has been realised, as Sire is now finally the tool I set out to write. While it still needs debugging, using it is now sweet.
Which brings me back to ProtoMS. This was the Fortran 77 simulation program I wrote during my PhD. It has aged surprisingly well, with several users and publications (and, as I am sure I have plugged enough, it now has its own website). This week I was in Southampton helping to run a workshop that taught how to run free energy simulations using ProtoMS. This was a lot of fun, and it reminded me how easy the code is to use, and how much life there still is in it. I suppose I orphaned the code too early as I rushed off to develop Sire, and also because my own research couldn't use ProtoMS (I only last year published my first paper that used ProtoMS, if only tangentially, and I have never really used the code in anger - Julien has always been the major user and is now the major developer). However, now that Sire is maturing, I have decided to give ProtoMS a little attention: to re-merge all of the forks, make it fully public, and put it on a course that keeps it maintainable for the future (even if I am not the one doing the day-to-day maintenance). Basically, I need to give ProtoMS a little TLC.