plague: Job waited too long for repo to unlock. Killing it...

Michael Schwendt bugs.michael at gmx.net
Thu Jan 3 21:56:26 UTC 2008


On Sun, 30 Dec 2007 17:54:51 +0100, Michael Schwendt wrote:

> If in a failed job.log you see the message
> 
>     Job waited too long for repo to unlock. Killing it...
> 
> please notify me.
> 
> It's a problem in the plague server code that results in a denial of
> service for subsequent build jobs. I have a traceback from Dec 28th, but
> in the context of the source code it doesn't make sense yet (because a few
> lines earlier the code ensures that the files to be copied exist and are
> readable). Buildsys runs a slightly modified version that adds a bit more
> debug output in this area.

Certainly related: job #37767 just failed with

  Failed to copy /srv/rpmbuild/server_work/fedora-5-epel/37767-php-pecl-memcache-2.2.1-1.el5/ppc/php-pecl-memcache-2.2.1-1.el5.ppc.rpm to the repository directory.

and that should not happen either. I assume this is not the first time
this has happened. The additional debug details are interesting.

In the server log, the individual arch jobs print the list of files
downloaded from the builders. At that point the files should already be
available on the server:

  37767 (php-pecl-memcache/ppc): Build result files - [ 'php-pecl-memcache-2.2.1-1.el5.ppc.rpm', 'php-pecl-memcache-2.2.1-1.el5.src.rpm', 'root.log', 'php-pecl-memcache-debuginfo-2.2.1-1.el5.ppc.rpm', 'state.log', 'job.log', 'build.log' ]

  37767 (php-pecl-memcache/x86_64): Build result files - [ 'root.log', 'build.log', 'state.log', 'php-pecl-memcache-2.2.1-1.el5.src.rpm', 'php-pecl-memcache-debuginfo-2.2.1-1.el5.x86_64.rpm', 'php-pecl-memcache-2.2.1-1.el5.x86_64.rpm', 'job.log' ]

  37767 (php-pecl-memcache/i386): Build result files - [ 'root.log', 'php-pecl-memcache-debuginfo-2.2.1-1.el5.i386.rpm', 'state.log', 'php-pecl-memcache-2.2.1-1.el5.i386.rpm', 'build.log', 'php-pecl-memcache-2.2.1-1.el5.src.rpm', 'job.log' ]

Further down in the log, the fedora-5-epel repo controller is asked to
install the files into the needsign repo:

  Repo 'fedora-5-epel': updating repository metadata...

At that time, the last of the ppc files is not yet accessible (i.e. it
either doesn't exist or cannot be read):

  [...]
  Repo: /srv/rpmbuild/server_work/fedora-5-epel/37767-php-pecl-memcache-2.2.1-1.el5/x86_64/php-pecl-memcache-2.2.1-1.el5.x86_64.rpm is accessible.
  Repo: /srv/rpmbuild/server_work/fedora-5-epel/37767-php-pecl-memcache-2.2.1-1.el5/ppc/php-pecl-memcache-2.2.1-1.el5.ppc.rpm is inaccessible.
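
The debug output above only says "inaccessible", which lumps together a
missing file and an unreadable one. As a minimal sketch of how the two cases
could be told apart (this is not the actual plague code; classify_path and
report are hypothetical names introduced here for illustration):

```python
import os


def classify_path(path):
    """Return 'missing', 'unreadable', or 'accessible' for a file path.

    Hypothetical helper, not part of plague. Note that os.access() checks
    permissions for the real uid/gid, so a process running as root will
    normally never see 'unreadable'.
    """
    if not os.path.exists(path):
        return "missing"
    if not os.access(path, os.R_OK):
        return "unreadable"
    return "accessible"


def report(paths):
    """Map every path to its classification, one entry per file."""
    return {path: classify_path(path) for path in paths}
```

Logging the result of report() for the files of a job right before the copy
step would show whether the ppc rpm had not arrived yet or had arrived with
wrong permissions.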

Further proofreading of the source code is necessary: the server should not
populate needsign before all files are downloaded.
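
The intended ordering could be sketched roughly like this. It is only an
illustration under the assumption that the server knows up front which arches
a job builds for; JobResults and its methods are hypothetical and do not
correspond to names in the plague source:

```python
class JobResults:
    """Hypothetical tracker for per-arch downloads of one build job."""

    def __init__(self, arches):
        # Arches the job is expected to deliver results for.
        self.expected = set(arches)
        # arch -> list of file names downloaded from the builder.
        self.done = {}

    def mark_downloaded(self, arch, files):
        """Record that all result files for one arch have been downloaded."""
        if arch not in self.expected:
            raise ValueError("unexpected arch: %s" % arch)
        self.done[arch] = list(files)

    def ready_for_repo(self):
        """True only once every expected arch has reported its files."""
        return set(self.done) == self.expected

    def files_to_install(self):
        """Return the rpm files to copy, refusing if any arch is pending."""
        if not self.ready_for_repo():
            pending = sorted(self.expected - set(self.done))
            raise RuntimeError("downloads pending for: %s" % ", ".join(pending))
        return [name
                for arch in sorted(self.done)
                for name in self.done[arch]
                if name.endswith(".rpm")]
```

With a guard like ready_for_repo() in front of the copy step, a request to
the repo controller would be rejected while the ppc download is still in
progress, instead of failing halfway through the copy.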



