[katello-devel] Katello on 16 core box - concurrency

Jeff Weiss jweiss at redhat.com
Thu Oct 20 13:29:05 UTC 2011


On Thu, 2011-10-20 at 10:30 +0200, Lukas Zapletal wrote:
> On 10/19/2011 11:03 PM, Jeff Weiss wrote:
> > Browser2, click1 ->  thin2
> > Browser3, click1 ->  thin3
> > Browser2, click2 ->  thin4
> > Browser3, click2 ->  thin1  (Browser3 is now stuck in queue)
> > Browser2, click3 ->  thin2
> > Browser2, click4 ->  thin3
> > Browser2, click5 ->  thin4
> > Browser2, click6 ->  thin1  (Browser2 is now stuck in queue)
> 
> Can you please try increasing the timeout value in thin.yml?
> Our default setting is 30 seconds, so if a request takes 3 minutes I would
> expect the connection between Apache and Thin to be terminated after that
> period. The Apache load balancer module could then "think" the resource is
> free again (while it obviously is not, which leads to the block). This would
> work only if the Request Counting algorithm tracks the connection state. I
> think it does not.
> 
> If that does not help, we could try a **different** mod_proxy_balancer 
> method. At present, there are 3 load balancer scheduler algorithms 
> available for use: Request Counting, Weighted Traffic Counting and 
> Pending Request Counting. These are controlled via the lbmethod value of 
> the Balancer definition. See the ProxyPass directive for more information.
> 
> I think **Pending Request Counting** (lbmethod=bybusyness) is what we 
> want. This scheduler keeps track of how many requests each worker is 
> assigned at present. A new request is automatically assigned to the 
> worker with the **lowest number** of active requests. This is useful in 
> the case of workers that queue incoming requests independently of 
> Apache, to ensure that queue length stays even and a request is always 
> given to the worker most likely to service it fastest.

Will this solve the problem, though?  If thin1 has 1 pending request
that is going to take 3 minutes, and thin2 has 5 pending requests that it
will service in 0.5 seconds total, thin1 is going to get the next request,
right?  That is still the wrong choice: requests will still get stacked up
behind long-running tasks.
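
To make the scenario concrete, here is a minimal Python sketch. It does
not use any Apache code: the worker names and numbers are the ones from
the paragraph above, and pick_bybusyness is a hypothetical stand-in for
"choose the worker with the fewest active requests".

```python
# Snapshot from the example above: thin1 holds one 3-minute request,
# thin2 holds five requests that finish in 0.5 seconds in total.
pending = {"thin1": 1, "thin2": 5}                    # active requests per worker
remaining_seconds = {"thin1": 180.0, "thin2": 0.5}    # time until each worker is idle


def pick_bybusyness(pending_counts):
    """Hypothetical model of bybusyness: fewest active requests wins."""
    return min(pending_counts, key=pending_counts.get)


def pick_soonest_free(remaining):
    """What we would actually want: the worker that becomes idle first."""
    return min(remaining, key=remaining.get)


chosen = pick_bybusyness(pending)
ideal = pick_soonest_free(remaining_seconds)
print("bybusyness picks:", chosen, "| soonest free:", ideal)
```

Counting pending requests says nothing about how long each one will
take, so in this snapshot the new request lands behind the 3-minute one.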

The "correct" algorithm is for requests to be held in queue until at
least one of the thin processes is ready.  I think that is what HAProxy
does.
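
A rough simulation of that idea, as a sketch only: it assumes all
requests arrive at t=0, each worker handles one request at a time, and
the workload (one 180 s import plus eight 0.5 s clicks on 4 workers) is
made up for illustration. It does not model HAProxy or mod_proxy_balancer
themselves.

```python
import heapq


def simulate_per_worker(durations, workers):
    """Round-robin on arrival; each worker drains its own private queue."""
    free_at = [0.0] * workers          # time at which each worker becomes idle
    waits = []
    for i, duration in enumerate(durations):
        w = i % workers                # assigned immediately, idle or not
        waits.append(free_at[w])       # request waits until that worker is free
        free_at[w] += duration
    return waits


def simulate_shared_queue(durations, workers):
    """One central queue; the next idle worker takes the head of the queue."""
    free_at = [0.0] * workers
    heapq.heapify(free_at)             # min-heap keyed on "idle at" time
    waits = []
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest worker to become idle
        waits.append(start)
        heapq.heappush(free_at, start + duration)
    return waits


durations = [180.0] + [0.5] * 8        # one long import, eight quick clicks
per_worker = simulate_per_worker(durations, 4)
shared = simulate_shared_queue(durations, 4)
print("worst wait, per-worker queues:", max(per_worker))
print("worst wait, shared queue:     ", max(shared))
```

With per-worker queues, two of the quick clicks sit behind the import for
about 3 minutes; with the shared queue, the long request ties up one
worker while the other three drain the clicks.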

> In the case of multiple least-busy workers, the statistics (and 
> weightings) used by the Request Counting method are used to break the 
> tie. Over time, the distribution of work will come to resemble that 
> characteristic of byrequests.
> 
> This algorithm is available in Apache HTTP Server 2.2.10 and later. We 
> should make sure both RHEL and Fedora have that. (*)
> 
> By the way, since Thin does not use threads by default, a single instance 
> is able to handle only one connection at a time. Sooner or later we will 
> hit a concurrency issue, because the maximum number of Thin processes that 
> can be used on a single 4 GB RAM box is about twenty (250 MB each). This is 
> poor scalability; I would expect more than 20 people importing their 
> manifests in a multi-tenant environment. Even 10 people importing 
> with another 10 clicking would block it.
> 
> I would suggest trying with the threading option enabled. It is 
> unfortunately an experimental Thin feature (and it also has some issues 
> with Ruby 1.9). We could possibly also have some threading issues in our 
> codebase. But since Katello is not a CPU-intensive application and most of 
> the time it just waits for backend engines, I would expect us to be able to 
> run many more than CPU+1 processes. The only option is threads (see above).

Yeah, this is an unfortunate reality of the Ruby runtime.  I read that
1.9 uses native threads, but is still subject to the GIL.  I have no idea
what effect the GIL would have on our app though, if any.  Maybe we should
take another look at JRuby for post 1.0.


> If threading works, we should review the load balancer method, because in 
> that case the Request Counting method would be the best again, IMHO.
> 
> I know we are on the way to making imports faster, but this shows our weak 
> spot. Good finding, Jeff!
> 
> (*) - http://httpd.apache.org/docs/2.2/mod/mod_proxy_balancer.html
> 




