Re: [BLAST_SHIFTS] [BLAST_ANAWARE] autocruncher double crunch 100kC worth of data affected

From: Adrian T Sindile (asindile@cisunix.unh.edu)
Date: Sat Oct 16 2004 - 20:58:09 EDT


Hi, Chi!
I took a look at the recrunched H2 files... I believe they were done on
October 6th and 7th, so after this problem started.
I only had one day to look at them, as all my analysis for Chicago was
done from a previous crunch. Anyway, from what I saw, the files looked
good, very good I would say (from the resolution point of view). Of
course, I need more time to play with cuts, and we cannot yet conclude
they are fine, but I think Taylan is right that we should not panic now...
Thanks!

Adrian

-------------------------------
Adrian Sindile
Research Assistant
Nuclear Physics Group
University of New Hampshire
phone: (603)862-1691
FAX: (603)862-2998
email: asindile@alberti.unh.edu
http://einstein.unh.edu/~adrian/

On Sat, 16 Oct 2004, Chi Zhang wrote:

>
> Hi all,
>
> Sorry to break this bad news, but ever since September 22nd the
> auto-cruncher has been crunching the same runs multiple times.
>
> The symptom: in status_list.txt multiple entries appear for the same run,
> and they are ON DIFFERENT CPUs!!!!!!!!!! When the dst is opened and the
> command dst->Scan("fNEvent") is issued in root, one can see the same
> CODA event number appear multiple times!!!!!!!!! This continues until
> all but one lrn have crashed out.
>
> see this section of status_list.txt:
> 11922 spud2.bates.daq 30
> 11923 bud06.bates.daq 1
> 11923 spud4.bates.daq 30
> 11924 bud23.bates.daq 1
> 11924 spud1.bates.daq 30
> 11925 spud1.bates.daq 30
> 11925 spud3.bates.daq 1
> 11926 spud2.bates.daq 30
> 11927 spud3.bates.daq 1
> 11927 spud5.bates.daq 30
> 11928 bud22.bates.daq 1
>
> The last run crunched normally is run 11297, which finished at 3:05 on
> Sep 22nd. The following runs and all later ones were crunched multiple
> times and, unfortunately, at the same time:
>
> 143635915 Sep 22 03:50 /net/data/4/Analysis/data//dst-11296.root
> 249868274 Sep 22 07:22 /net/data/4/Analysis/data//dst-11298.root
> 253659856 Sep 22 07:36 /net/data/4/Analysis/data//dst-11293.root
> 251871484 Sep 22 07:56 /net/data/4/Analysis/data//dst-11295.root
> 253803099 Sep 22 08:19 /net/data/4/Analysis/data//dst-11294.root
> 254026042 Sep 22 22:12 /net/data/4/Analysis/data//dst-11299.root
>
> I stopped the cruncher daemon on dlbast09 and there does not seem to be
> another cruncher running at the same time since the runlist is modified
> only by elog, not cruncher (run numbers being written in, not taken out).
>
> All these runs up to 11960 will have to be recrunched with
> lrn!!!!!!!!!!!!!!!!!!!
>
> For people going to Chicago, we need to figure out what we shall present.
> For people who went to Trieste, hope your "PRELIMINARY" stamps are BIG enough.
>
> Chi
>
>
> keywords: FAILURE
>
> P.S. I don't have the stomach to debug the cruncher, I turned it off and
> am crunching runs from 11962 manually. cruncher experts please
> investigate.
>


