XSEDE resources are made available by the US National Science Foundation as part of its mission to provide cyberinfrastructure for the advancement of Science. Gateways are unique in offering any scientist the opportunity to easily access some of the best computational resources in the world. While free to users, the XSEDE resources are costly, and they should be used in a manner consistent with conservation as well as with scientific achievement.
PLEASE NOTE: Demand for the CIPRES Science Gateway has been very high. As a result, we have implemented a set of policies to ensure equal access for all users. Please review our use policies here.
We have set up some guidelines to help community Systematics codes give the best wall time performance (you get the results faster) while at the same time making efficient use of the machine.
The addition of these new resources will expose you to unfamiliar behaviors when things go wrong. There is no way to avoid it, but we will keep a list of warnings and messages that might seem strange to a regular user. If you see a message you don't understand, let us know. We provide some tools to monitor job progress. If your job fails, the most efficient way to describe the problem to us is shown here:
Configuring Your Job for optimal running:
XSEDE resources improve performance by dividing jobs into multiple individual processes and mapping them across many processors. We have benchmarked the codes offered in the CIPRES Science Gateway to determine how jobs of various sizes can be run most efficiently, and we have distilled this information into a few user parameters that describe your data set. At present we rely on our users to manage this manually. To use the resources efficiently, users must enter the following parameters in the interface before submitting. Using the correct settings will help ensure our ability to continue this work.
Parameters the user must set:
Setting the time of your job:
The time you enter in this box sets the maximum number of clock hours the job will be allowed to run. The clock does not start when you submit, but when your job reaches the front of the queue and begins to execute. This value is one of the parameters that affects how your job is queued: long jobs usually sit in the queue for longer periods of time. The best strategy is to enter the shortest time that still lets your run finish without having your job “time out.”
How long can I run?:
Most tools may be configured to run for as much as 168 hours. MrBayes may be configured to run for as much as 334 hours. If you need a longer run than the interface allows, please let us know.
How long will my job run?:
For many programs, you can estimate how long your run will take by performing some test runs with the data set you are analyzing. You can, for example, make two test MrBayes runs that require only a few generations (I usually use 1000 and 10000), record the run times, and draw a line through the two points. The slope gives you the time per generation for this data set on this resource. Similarly, you can run bootstrapping tools for a few replicates to get an idea of how long the full run will take. Use the test queue (see below) for this purpose.
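The two-point estimate described above can be sketched in a few lines of Python. This is only an illustration: the function name and the timings in the example are hypothetical, not measurements from any CIPRES resource, and the sketch assumes that run time grows linearly with the number of generations.

```python
def estimate_run_hours(gens_a, hours_a, gens_b, hours_b, target_gens):
    """Extrapolate wall time for target_gens from two short test runs.

    Assumes run time is a linear function of the generation count.
    """
    if gens_a == gens_b:
        raise ValueError("the two test runs must use different generation counts")
    # Slope of the line through the two points: hours per generation.
    slope = (hours_b - hours_a) / (gens_b - gens_a)
    # Intercept: fixed startup cost that does not depend on generations.
    intercept = hours_a - slope * gens_a
    return slope * target_gens + intercept


# Hypothetical timings: 1000 generations took 0.02 h, 10000 took 0.11 h.
estimate = estimate_run_hours(1000, 0.02, 10000, 0.11, 5_000_000)
print(f"Estimated run time: {estimate:.2f} hours")
```

The estimate is a starting point for the time limit, not a guarantee. It makes sense to enter a value somewhat above it so the job does not time out shortly before finishing.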
Use the Test Queue to your advantage:
The test queue is a common feature of large HPC resources, and each of the XSEDE resources we use has one. The idea is to allow very short runs just to be sure you have the command line correct. That way, you avoid waiting in the queue for 8 hours only to have the job fail, when it finally begins to execute, on a trivial error such as a duplicate taxon label or a Nexus format matrix submitted to RAxML. So, the first time you create a job, set the time to some number less than 0.5 hours and submit. The test queue is (almost) always pretty fast, and you will discover your errors immediately. Once you have it worked out, you can use the “Clone” feature to open the job up, set a long time limit, and deploy.
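The two trivial errors mentioned above can also be caught on your own machine before you submit anything. The following is a minimal sketch, not part of the gateway: both helper names are hypothetical, the first relies only on the fact that Nexus files begin with a #NEXUS header, and the second assumes you have already extracted the taxon labels from your matrix into a list.

```python
from collections import Counter


def looks_like_nexus(text):
    """Return True if the file content begins with the #NEXUS header."""
    return text.lstrip().upper().startswith("#NEXUS")


def duplicate_labels(labels):
    """Return the labels that occur more than once, in first-seen order."""
    counts = Counter(labels)
    return [label for label in counts if counts[label] > 1]


# Hypothetical taxon labels taken from an alignment.
labels = ["Homo_sapiens", "Pan_troglodytes", "Homo_sapiens"]
for label in duplicate_labels(labels):
    print(f"Duplicate taxon label: {label}")
```

A check like this does not replace a run in the test queue, which also validates the command line itself.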