During my last three shifts I worked on a new auto-cruncher,
based on the experience (desperation?) we had with the old one.
Although it may sound a little late for a new tool, it should be
useful for re-crunching after the blast run ends.
Here are the improvements:
- Prevents multiple 'simultaneous' crunching by checking the
running/submitted processes.
- Provides useful information about the server status
- Provides a history of recent runs to ease the shift personnel's job
- Properly shows the running lrn processes (sorted by run number),
with the number of events processed so far (and the total),
hostname, pid, percent done... for each process.
- Fast response: you see what's going on almost instantly.
Network overhead is reduced (it does not require ssh to get
the cruncher status).
- Should require no restarts (if it does, bug me!)
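
To illustrate the first and fourth items, here is a minimal sketch
in Python. It is not the auto-cruncher's actual code. The names
CrunchJob, can_submit and status_lines are hypothetical, and it
assumes that "simultaneous" crunching means two jobs for the same
run number and that the list of active jobs has already been
collected somehow.

```python
from dataclasses import dataclass


@dataclass
class CrunchJob:
    """Hypothetical record for one running/submitted crunch process."""
    run_number: int
    hostname: str
    pid: int
    events_done: int
    events_total: int

    @property
    def percent_done(self) -> float:
        # Guard against a job whose total event count is not known yet.
        if self.events_total <= 0:
            return 0.0
        return 100.0 * self.events_done / self.events_total


def can_submit(run_number, active_jobs):
    """Return True only if no active job already handles this run
    (assumption: a duplicate is a job with the same run number)."""
    return all(job.run_number != run_number for job in active_jobs)


def status_lines(active_jobs):
    """Format one status line per job, sorted by run number."""
    lines = []
    for job in sorted(active_jobs, key=lambda j: j.run_number):
        lines.append(
            f"run {job.run_number}  {job.hostname}  pid {job.pid}  "
            f"{job.events_done}/{job.events_total}  "
            f"{job.percent_done:.1f}%"
        )
    return lines
```

A controller along these lines would call can_submit() before
queuing a run and skip the run if it returns False.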
Also, as discussed in today's analysis meeting, the following
features are designed but not yet tested (so do not start using
them yet). They are useful mainly for the re-crunch jobs, which
will become more important after the run ends at the end of May.
- Transfers the ANALDIR and DATADIR env to the servers
- Several auto-cruncher controllers can co-exist.
Beyond these technical improvements, I'll be available to
debug and improve it if needed. (I think we haven't had a
maintainer for the old one for quite a while.)
I started testing it as the online auto-cruncher today.
The usage manual will be put into the elog-sticky region
tomorrow. There is a short note attached to the dblast09 screen
about the usage.
I M P O R T A N T !!!
If you have to use the old one, clear the entries in the
old runlist.txt file (~elog/crunch_log/Config/runlist.txt)
before starting the old cruncher. Otherwise it will
REsubmit all of the runs accumulated in this file!
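
One possible way to clear the old runlist is sketched below in
Python. The helper name clear_runlist is hypothetical, and keeping
a ".bak" copy is my own precaution rather than anything the old
cruncher needs. It assumes an empty file is what the old cruncher
expects when there is nothing to submit.

```python
import shutil
from pathlib import Path


def clear_runlist(runlist_path):
    """Back up the run list next to the original, then empty it.

    Returns the path of the backup copy.
    """
    path = Path(runlist_path).expanduser()
    backup = path.with_name(path.name + ".bak")
    shutil.copy2(path, backup)   # keep the accumulated entries, just in case
    path.write_text("")          # truncate the run list to zero entries
    return backup
```

It would be called as
clear_runlist("~elog/crunch_log/Config/runlist.txt")
before starting the old cruncher.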
-- 
---=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=---
 Taylan Akdogan          Massachusetts Institute of Technology
 akdogan@mit.edu         Department of Physics
 Phn:+1-617-258-0801     Laboratory for Nuclear Science
 Fax:+1-617-258-5440     Room 26-559
---=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=---