Re: [BLAST_ANAWARE] changes to ANALDIR & DATADIR

From: Tancredi Botto (tancredi@lns.mit.edu)
Date: Thu May 13 2004 - 16:51:52 EDT


Hi guys,
I think the problem is overstated (?). All the spud disks are visible
from any of the buds now. If I linked /net/data/4/Daq to another
directory right now, we could still crunch right away from any spud/bud.
There is no problem in changing DATADIR (of course, we'll leave a link).
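
To illustrate the "leave a link" idea, here is a minimal sketch in Python of moving a directory and leaving a symbolic link at the old location, so that anything still using the old path keeps working. The helper name relocate_with_link and the temporary directories are hypothetical stand-ins for illustration only; this is not the actual procedure used on the spuds.

```python
import os
import shutil
import tempfile


def relocate_with_link(old_dir, new_dir):
    """Move old_dir to new_dir, then leave a symlink at the old path."""
    # Make sure the parent of the new location exists (assumption: it may not).
    os.makedirs(os.path.dirname(new_dir), exist_ok=True)
    shutil.move(old_dir, new_dir)
    # The old path now resolves to the new location.
    os.symlink(new_dir, old_dir, target_is_directory=True)
    return new_dir


def demo():
    # Temporary directories stand in for paths such as /net/data/4/Daq.
    root = tempfile.mkdtemp()
    old = os.path.join(root, "data4", "Daq")
    new = os.path.join(root, "data5", "Daq")
    os.makedirs(old)
    with open(os.path.join(old, "run0001.dat"), "w") as f:
        f.write("raw")
    relocate_with_link(old, new)
    return old, new
```

After calling demo(), a file opened through the old path is read from the new location, which is why a changed DATADIR need not break existing scripts.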

What I am also trying to say is that the disks are full now. With double
copies we had 6 full disks (of 8). /net/data/4 has room for another 30-40
runs, then what? Do we zap the crunched data? Or let it grow until the raw
data directory only contains the last 10 runs (!), continuously moving
data back and forth? I am not just being sadistic here...

Chi's concern about speed is more realistic. Yes, it turns out that the
spuds are faster, and they can still be used for recrunches. Of course
user blast has priority during the run.

If you crunch your own runs on spuds and then move the data to your
area, you have more work but lose none of the cpu speed (if left
available).

The idea is that you should really work from behind your bud01-24.
Then all the raw data and all the crunched data are available. While
the CVS checkout can stay on scratch, all your config files and libBlasts
should stay in the home directory.

You will only lose if you need more than one cpu to work on the same
file. But really, if that is a requirement you can work from
scratch24. Or spud4, if we split raw/anal directories across 2 disks...

P.S.
When we bought the buds, cpu and memory were the priorities. Now it turns
out that we are IO limited. Well, you will always be limited by something.

-- 
________________________________________________________________________________
Tancredi Botto,  		phone: +1-617-253-9204  mobile: +1-978-490-4124
research scientist		MIT/Bates, 21 Manning Av    Middleton MA, 01949
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

On Thu, 13 May 2004, Chi Zhang wrote:

> Hi, just like to share my experience about moving things to buds.
>
> 1. only bud24 disc is visible to all other buds. So after I moved all my
> stuff to my bud10 (on the next day I saw the assignment list), I found I
> have to move them again to bud24 because I frequently use one or two
> other buds.
>
> 2. buds are still slow in IO to data. I met situations when bud10 CPU
> usage by lrn is below 5% due to IO limit. so sometimes I'd like to use
> spuds or dblast. At the beginning of lrn, it processes epics/scaler which
> is very IO intense. doing this from buds sometimes cost more than half an
> hour just to finish all the epics and scalers. Crunching on physics events
> is CPU intense and is not a big problem on buds. It's that long wait at
> the beginning that drives me nuts.
>
> Note this year we never had back log on spuds because of blast crunching,
> I checked a few times, there are always free spuds. all dblast are
> essentially free all the time. So I hope it is not a crime to use spuds.
>
> 3. none of the bud discs are visible to spuds and dblast. So when I want
> to use spuds, I realize I have to maintain a copy of at least BlastLib2 on
> spud discs.
>
> So there I go, three copies of files on bud24, bud10 and spuds. Is it
> still possible to make IO to hard drives faster among buds and spuds so
> all discs can be cross mounted without huge performance penalties?
>
> Chi
>
> On Thu, 13 May 2004, Tancredi Botto wrote:
>
> > Hello,
> > I would like to propose to split $ANALDIR and $DATADIR
> >
> > currently they point to the same disk (net/data/4) which is not a good
> > idea. At this rate by the end of the year we will surely need one full
> > spud disk to store crunched data alone. Which means less and less raw
> > data on DATADIR... it is a vicious cycle. So I propose the following
> >
> > _ Use /net/data/4 for crunched and montecarlo data only
> >
> > We should organize montecarlo(a) files per event generator such
> > as /net/data/4/Analysis/blastmc/eep (ep, epi+..). In this way you do
> > not need to change $ANALDIR
> >
> > _ Use /net/data/5
> >
> > with /net/data/6 for the mirror copy. The mirror copy is kept until all
> > tape backups are done. When /net/data/5 is full, data is moved and linked
> > to the next data disk (currently we use 3)
> >
> > In this way $DATADIR should now point to /net/data/5/Daq/data
> >
> > _ You are kindly urged to move *ALL* of your data from the spuds to the
> > buds.
> >
> > **All of the /net/data/?/scratch directories should disappear**
> >
> > Again the spuds should really be used primarily for the raw data. All
> > students now have 100 Gb of space elsewhere. Also note that we need at
> > least one empty disk should we ever need to dump data back on from tape.
> >
> > --
> > ________________________________________________________________________________
> > Tancredi Botto, phone: +1-617-253-9204 mobile: +1-978-490-4124
> > research scientist MIT/Bates, 21 Manning Av Middleton MA, 01949
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^



This archive was generated by hypermail 2.1.2 : Mon Feb 24 2014 - 14:07:31 EST