It's been a busy couple of weeks, but Sire has now been ported to our supercomputer cluster and is running some production jobs! There are still a few issues to sort out, but the code is finally moving towards completion (well, at least beta-type completion).
Along with the new Sire, I've also written a new RETI script which uses Sire to run QM/MM RETI free energy simulations, with all replicas running within a single Python script (either sequentially on one node, or in parallel over several nodes using MPI(*)). The script is pretty bullet-proof - it has neither crashed nor complained despite now having run thousands of iterations. Part of the reason is that I no longer need to use NFS, SSH, network sockets or anything else that is potentially dodgy. I've also written everything to be robust, so if a job fails or something else goes wrong, the code will automatically catch the exception triggered by the failure and retry the calculation.
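For the curious, the catch-and-retry idea looks roughly like the sketch below. This is not the actual RETI script and it makes no Sire calls: run_with_retry, flaky_calculation, the attempt limit and the delay are all made-up names and values, used only to illustrate the pattern.

```python
import time


def run_with_retry(job, max_attempts=3, delay_seconds=0.0):
    """Call job() and retry if it raises, up to max_attempts times.

    Hypothetical helper for illustration only; not part of Sire.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return job()
        except Exception as error:
            # Remember the failure, report it, and try again if attempts remain.
            last_error = error
            print(f"attempt {attempt} failed: {error}")
            if attempt < max_attempts:
                time.sleep(delay_seconds)
    # Every attempt failed, so give up and surface the last error as the cause.
    raise RuntimeError(f"job failed after {max_attempts} attempts") from last_error


# Hypothetical stand-in for one replica's calculation: fails twice, then succeeds.
calls = {"count": 0}


def flaky_calculation():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("simulated failure")
    return "done"


result = run_with_retry(flaky_calculation)
print(result, "after", calls["count"], "calls")
```

Capping the number of attempts is a choice made for this sketch, so that a job which can never succeed eventually fails loudly rather than looping forever.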
The best thing though is that now that Sire is in production, I'm able to run experiments and am feeling like a scientist again :-)
(*) This script is set up to work sequentially - to run in parallel the following lines are necessary:

    nodes = MPINodes()
    repexmove.move(nodes, replicas, 1, True)

instead of:

    repexmove.move(replicas, 1, True)

Not too hard ;-)