no Job ID was provided in the time required (Was: Please rebuild your packages in Fedora Extras development)

Wed Mar 1 15:00:09 UTC 2006

On Wed, 2006-03-01 at 00:31 -0500, seth vidal wrote:
> On Wed, 2006-03-01 at 00:18 -0500, Dan Williams wrote:
> > On Wed, 2006-03-01 at 06:03 +0100, Ralf Corsepius wrote:
> > > >   Please check to make sure the job did
> > > > not actually get enqueued.
> > > How about you fixing your code to not produce false alarms, if you are
> > > sure these are false alarms?
> > 
> > To be perfectly fair, the error message doesn't say "The buildsys is
> > hung", it says just what it means; that the build system didn't give you
> > a job ID.  The only reason that this message attained the connotations
> > it did, was because I kept asking people both on this list and on
> > #fedora-extras to report instances of this message.  Since that's no
> > longer the case, I'd like to officially change the connotation that this
> > error message has to "check the web UI for your job for 5 minutes after
> > you submit your job, then report."
> > 
> > If you'd like, I can increase the plague-client timeout to 20s such that
> > it will almost always give you a job ID.  However, that's really just
> > papering over the problem.  In the long run, we need to evaluate how to
> > deal with sqlite's table locking issues,
> 
> Would dumping sqlite to use the postgresql instance on the buildmaster
> help the locking issue?

Well, we already use postgres on the buildmaster, but the architecture
we started with when using sqlite isn't the best for postgres, because
the limitations of sqlite drag down everyone.  That
limitation is essentially that sqlite doesn't have row-locking, and
therefore all access to the database must be synchronized.  I can think
of a couple of ways to fix this, but they of course take time to work
out...
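
To make the limitation concrete, here is a minimal sketch (not the
actual plague code) of what "all access must be synchronized" looks like
in practice.  It assumes Python's standard sqlite3 and threading
modules; the SerializedDB class, the jobs table, and run_demo are
hypothetical names made up for illustration.  Every query, read or
write, goes through one lock, so a slow query holds up every other
caller, including a client waiting for its job ID.

```python
import sqlite3
import threading


class SerializedDB:
    """Hypothetical wrapper: every statement runs under a single lock."""

    def __init__(self, path=":memory:"):
        # check_same_thread=False lets several threads share one connection;
        # the lock below is then the only thing keeping access safe.
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._lock = threading.Lock()

    def execute(self, sql, params=()):
        # Only one thread at a time gets past this point.
        with self._lock:
            cur = self._conn.execute(sql, params)
            rows = cur.fetchall()
            self._conn.commit()
            return rows


def enqueue_jobs(db, owner, count):
    # Each insert waits its turn behind whatever else holds the lock.
    for i in range(count):
        db.execute(
            "INSERT INTO jobs (owner, package) VALUES (?, ?)",
            (owner, f"pkg-{i}"),
        )


def run_demo(num_threads=4, jobs_per_thread=25):
    db = SerializedDB()
    db.execute(
        "CREATE TABLE jobs (id INTEGER PRIMARY KEY, owner TEXT, package TEXT)"
    )
    threads = [
        threading.Thread(
            target=enqueue_jobs, args=(db, f"user{n}", jobs_per_thread)
        )
        for n in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Returns the number of jobs that made it into the table.
    return db.execute("SELECT COUNT(*) FROM jobs")[0][0]
```

The design is correct but coarse: nothing is lost, yet nothing runs in
parallel either.  With row-level locking in the database, the single
lock would no longer be needed for unrelated rows.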

Dan