Re: [BLAST_ANAWARE] Network interruption (rescheduled)

From: Chi Zhang (zhangchi@MIT.EDU)
Date: Tue Apr 05 2005 - 17:04:04 EDT


Hi, just to confirm that I have also experienced extreme difficulties
connecting to blast05 and blast02 over the last couple of days. I lost my
session every few minutes. This occurred at the worst possible time, with a
"big" analysis meeting pending. Luckily I have everything in a VNC session
on bud10, so I am able to preserve my results, including graphical
interfaces. But still my keyboard got a few heavy bangs.

It also occurred to me that there are many defunct processes on blast05
and blast02, such as netscape, mozilla and wish instances, lasting a few
hundred minutes, and some of them are taking significant CPU resources.

Chi

On Tue, 5 Apr 2005, Nikolas Meitanis wrote:

> Hello Ernie,
> perhaps the rest of us have accepted the ontological characteristics of
> the network with fatalistic complacency
> (and periodic trips to our mental consultants). I have certainly
> experienced such problems as Tavi intimated. In fact, in recent days
> (last month, maybe more), a rather frequent unapologetic "freezing" has
> become particularly aggravating. I suspect others connecting from MIT
> have experienced similar meltdowns.
> n
>
> Ernie Bisson, MIT Bates Linear Accelerator wrote:
>
> >>I have lost my data continuously in the last month!
> >>I use Baris' former computer (spinspin) to log in to my bud (bud07).
> >>One run takes on average three days (I use only my bud, due to some
> >>settings to my env.). Last Saturday I was on shift, my data was lost, I
> >>started it again, now, Monday: both runs:
> >>
> >> Read from remote host blast05: Connection reset by peer
> >> Connection to blast05 closed.
> >> <spinspin.lns.mit.edu:103>
> >>
> >>Since this was impossible to do from UNH, it looks like it is
> >>impossible to do from Bates also! Normally I lose my histograms, thus
> >>I have started printing all the data in order to be able to reproduce it
> >>later, but this doesn't work either! Do I have to use a special computer
> >>to stay connected?!
> >>
> >>
> >
> >Tavi,
> >
> >Your most recent problems (Saturday and Monday) probably occurred because of
> >problems we were having with the network switch blast05 is connected to. I
> >don't know if there had been problems with this switch before the last several
> >days. It's been replaced now, so that should no longer be a cause. To diagnose
> >problems you had earlier in the month I would need to know dates and times.
> >I've been unaware that you (or anyone else) had been periodically losing
> >their connections. The log files on blast05 indicate others may have as well,
> >but you have been the first to complain. In the future, please bring such
> >problems to my attention more promptly.
> >
> >Regardless, you should run your jobs in such a way that network disruptions
> >do not cause them to terminate. I asked Chris Crawford to E-mail you about
> >the best ways to do this in conjunction with data analysis.
> >
> >Ernie
> >
> >
> >
>
>



This archive was generated by hypermail 2.1.2 : Mon Feb 24 2014 - 14:07:32 EST