Re: [BLAST_ANAWARE] buds stuff

From: Tancredi Botto (tancredi@lns.mit.edu)
Date: Fri Apr 09 2004 - 17:25:11 EDT


Hi hauke,
network connectivity has been very slow all around. I do not know
if this is because of the many people looking at /net/data/4. For
instance, spud2 was at a crawl, with a lot of sendmail clogger (?),
but after a reboot it definitely became responsive again.
Also, in your test there is a second root process running, already
taking away 20% CPU, and of course it is only a one-second snapshot.
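
A minimal sketch of how the two top snapshots quoted further down could be
compared side by side. The process lines are copied from Hauke's message; the
helper names (parse_top_line, summarize) are hypothetical and exist only for
this illustration. In top's STAT column, D means uninterruptible sleep
(usually waiting on I/O) and R means running, so the sketch reports the
states next to the summed %CPU. A single snapshot like this is suggestive at
best, not a measurement.

```python
# Process lines copied from the top output quoted in this thread.
# Column positions (STAT at index 7, %CPU at index 8, COMMAND at index 11)
# are read off the header line of that output.
TOP_LINES = {
    "bud20": [
        "30238 blast 16 0 35924 32M 10608 D 21.7 3.6 267:08 root.exe",
        "32560 blast 11 0 200M 195M 10496 D 10.5 22.1 0:55 root.exe",
    ],
    "spud7": [
        "16404 blast 25 0 232M 232M 10688 R 97.5 11.5 2:56 root.exe",
    ],
}


def parse_top_line(line):
    """Split one top process line into the few fields used here."""
    fields = line.split()
    return {
        "pid": int(fields[0]),
        "stat": fields[7],
        "cpu": float(fields[8]),
        "command": fields[11],
    }


def summarize(lines):
    """Sum %CPU over the listed processes and collect their states."""
    procs = [parse_top_line(line) for line in lines]
    return {
        "total_cpu": round(sum(p["cpu"] for p in procs), 1),
        "states": sorted({p["stat"] for p in procs}),
    }


SUMMARY = {host: summarize(lines) for host, lines in TOP_LINES.items()}

for host, info in SUMMARY.items():
    print(host, info["total_cpu"], info["states"])
```

On these numbers both root.exe processes on bud20 sit in state D while the
one on spud7 is in state R, which would be consistent with bud20 waiting on
I/O rather than lacking CPU.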

HOWEVER:

_ this should not affect the Monte Carlo. We have enough CPU
  power to crunch all data on the spuds, including filters.

_ what chris suggests makes sense anyways: that is the point of giving
  everybody disk space...
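
For reference, the disk-space arithmetic behind this disagreement can be
sketched as below. The 15 G and the factor of 24 come from the quoted
messages; reading the 120G in "15G/120G" as the local disk size of each bud
is an assumption, and the variable names are made up for the example.

```python
# Figures taken from the quoted thread; the interpretation of 120 GB as the
# local disk size per bud is an assumption.
FLR_SIZE_GB = 15       # present size of all flr ntuples
N_BUDS = 24            # bud01-24
LOCAL_DISK_GB = 120    # assumed local disk per bud ("15G/120G")

# Total space consumed if every bud keeps its own copy of the flr ntuples.
total_mirrored_gb = FLR_SIZE_GB * N_BUDS

# Share of each bud's local disk that one copy would occupy.
fraction_per_bud = FLR_SIZE_GB / LOCAL_DISK_GB

print(f"total mirrored: {total_mirrored_gb} GB")
print(f"per-bud share:  {fraction_per_bud:.1%}")
```

Both views are correct: the mirrors add up to 360 GB overall, yet each copy
takes only one eighth of a single bud's disk under the assumption above.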

I hope there are no more problems.
 
________________________________________________________________________________
Tancredi Botto, phone: +1-617-253-9204 mobile: +1-978-490-4124
research scientist MIT/Bates, 21 Manning Av Middleton MA, 01949
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

On Fri, 9 Apr 2004, Hauke Kolster wrote:

>
> Hi,
> Here is a comparison of the same root macro run at the same time
> on spud7 and bud20. Is this difference in performance caused by the
> network bandwidth as Chris mentioned? Spud7 was able to give
> me the full processor speed, bud20 only about 10% of the total.
>
> Hauke
>
>
> bud20:
> ------
>
> see process 32560
>
> 3:01pm up 9 days, 20:14, 3 users, load average: 2.02, 1.77, 1.36
> 54 processes: 52 sleeping, 2 running, 0 zombie, 0 stopped
> CPU states: 31.2% user, 1.8% system, 0.0% nice, 67.0% idle
> Mem: 903116K av, 887444K used, 15672K free, 0K shrd, 85512K buff
> Swap: 2097136K av, 22088K used, 2075048K free 547072K cached
>
> PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
> 30238 blast 16 0 35924 32M 10608 D 21.7 3.6 267:08 root.exe
> 32560 blast 11 0 200M 195M 10496 D 10.5 22.1 0:55 root.exe
>
>
> spud7:
> ------
> 3:01pm up 74 days, 4:10, 9 users, load average: 1.39, 0.60, 0.22
> 91 processes: 89 sleeping, 2 running, 0 zombie, 0 stopped
> CPU0 states: 0.0% user, 0.2% system, 0.0% nice, 99.3% idle
> CPU1 states: 97.1% user, 2.1% system, 0.0% nice, 0.2% idle
> Mem: 2065132K av, 2053036K used, 12096K free, 0K shrd, 84616K buff
> Swap: 505944K av, 46412K used, 459532K free 1548932K cached
>
> PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
> 16404 blast 25 0 232M 232M 10688 R 97.5 11.5 2:56 root.exe
>
>
>
> Chris Crawford wrote:
> > hi tancredi,
> >
> > Tancredi Botto wrote:
> >
> >>> we really need a local $ANALDIR with at least the 'flr' ntuples on
> >>> each of the buds. also could it be set up in the same place in each
> >>> bud, so that you can log into any machine without changing the
> >>> variable?
> >>>
> >>
> >>
> >> It could be done, but then again, even for the flr's alone that would
> >> multiply 15 G (the present size of all flr) x 24 of "mirrored" disk
> >> space while everything is already available on /net/data/4. what a
> >> waste !
> >>
> > i disagree. it would be a terrible waste not to use 15G/120G for local
> > data storage, both in terms of analysis time and network load. the
> > point is that analysis from ntuples is highly i/o-intensive, and not
> > only do you face the network bandwidth vs. the local bus, but also
> > everyone who is using /net/data/4 has to wait in line for access to the
> > disk. and it will be essential to achieve any gains via parallel
> > processing with root's PROOF functionality.
> >
> > as an example, the processing time for the december data already went
> > from 2+minutes down to a few seconds by consolidating all of the
> > flr-###.root into a single file using filter.C
> > --chris
> >
> >>
> >> I don't understand why we "really need" to change something, and I
> >> hope that we can survive with the nfs as is. The system should not be
> >> any slower now (in principle, but I guess it may depend on usage), and
> >> the way /net/data/4 is mounted on bud01-24 is the same way it is
> >> mounted on spud1-3, spud5-8 !
> >>
> > it is very nice to finally have all the spuds cross mounted.
> >
> >>
> >> Please check yourself. Ernie did all the cross-mounts only yesterday !
> >>
> >>
> >>
> >>
> >>> --thanks, chris
> >>>
> >>> pete, for the mean-time, i suggest copying the ntuples you are using
> >>> to your local scratch disk; your analysis should go a lot faster.
> >>>
> >>> ps. /home/blast is full again...
> >>>
> >>> Peter Karpius wrote:
> >>>
> >>>
> >>>
> >>>> Whenever I try to use the buds I get:
> >>>>
> >>>> Processing show_VectorEL_asym.C...
> >>>> warning: dataset 0 has no data
> >>>>
> >>>> ERROR: NO CHARGE
> >>>>
> >>>> ...if I go back to the spuds and use the exact same runs everything
> >>>> works fine - what am I doing wrong?
> >>>>
> >>>> Pete
> >>>>
> >>>>
> >>>> ----------------------------------------------
> >>>> Pete Karpius
> >>>> Graduate Research Assistant
> >>>> Nuclear Physics Group
> >>>> University of New Hampshire
> >>>> phone: (603)862-1220
> >>>> FAX: (603)862-2998
> >>>> email: karpiusp@einstein.unh.edu
> >>>> http://pubpages.unh.edu/~pkarpius/homepage.htm
> >>>> ----------------------------------------------
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>
> >
>
>
>


