
[dm-devel] Re: [PATCH-v2 2/2] Initialize mempool and elevator only for request-based dm devices



Hi Nikanth,

On 08/12/2009 05:47 PM +0900, Nikanth Karthikesan wrote:
> Hi Kiyoshi Ueda,
> 
> On Wednesday 12 August 2009 07:45:56 Kiyoshi Ueda wrote:
>> Hi Nikanth,
>>
>> On 08/11/2009 06:05 PM +0900, Nikanth Karthikesan wrote:
>>> On Tuesday 11 August 2009 13:36:24 Kiyoshi Ueda wrote:
>>>> On 08/10/2009 07:48 PM +0900, Nikanth Karthikesan wrote:
>>>>> +
>>>>> +		/*
>>>>> +		 * reinitialize make_request_fn as it was reset to the
>>>>> +		 * default __make_request by blk_init_allocated_queue
>>>>> +		 */
>>>>> +		md->saved_make_request_fn = md->queue->make_request_fn;
>>>>> +		blk_queue_make_request(md->queue, dm_request);
>>>>> +
>>>>> +		blk_queue_softirq_done(md->queue, dm_softirq_done);
>>>>> +		blk_queue_prep_rq(md->queue, dm_prep_fn);
>>>>> +		blk_queue_lld_busy(md->queue, dm_lld_busy);
>>>>> +	}
>>>>> +
>>>>>  	__unbind(md);
>>>>>  	r = __bind(md, table, &limits);
>>>> The queue has been registered at the device creation time by
>>>> add_disk() in alloc_dev().
>>>> Since the queue is reconfigured (elevator is attached), you have to
>>>> update the queue registration (e.g. unregister, then re-register).
>>>> But it may not be easy.  At least, there is no exported interface to
>>>> unregister/re-register queue.
>>> Ah, yes. The scheduler attributes will not be exported in
>>> /sys/block/dm*/queue/iosched. Exporting elv_register_queue() and calling
>>> it here solves it. Something like..
>>>
>>> @@ -2203,6 +2199,29 @@ int dm_swap_table(struct mapped_device *md, struct
>>> dm_table *table)
>>>  		goto out;
>>>  	}
>>>
>>> +	/* new device is being marked as request-based */
>>> +	if (!md->map && dm_table_request_based(table)) {
>>> +		/* initialize queue for request-based dm */
>>> +		r = blk_init_allocated_queue(md->queue, dm_request_fn, NULL);
>>> +		if (r)
>>> +			goto out;
>>> +
>>> +		r = elv_register_queue(md->queue);
>>> +		/* if (r)
>>> +		 *	 goto out; Better to ignore, just like add_disk does ;-)
>>> +		 */
>>> +		/*
>>> +		 * reinitialize make_request_fn as it was reset to the
>>> +		 * default __make_request by blk_init_allocated_queue
>>> +		 */
>>> +		md->saved_make_request_fn = md->queue->make_request_fn;
>>> +		blk_queue_make_request(md->queue, dm_request);
>>> +
>>> +		blk_queue_softirq_done(md->queue, dm_softirq_done);
>>> +		blk_queue_prep_rq(md->queue, dm_prep_fn);
>>> +		blk_queue_lld_busy(md->queue, dm_lld_busy);
>>> +	}
>>> +
>>>  	__unbind(md);
>>>  	r = __bind(md, table, &limits);
>>>
>>> I would post the v3 of the patches with this change. Do you see any
>>> problems in this?
>> Humm, it might work for now, but I disagree with that.
>>
>> Since elevator is block internal and dm doesn't really care
>> (its initialization is actually hidden in blk_init_allocated_queue()),
>> directly calling elv_register_queue() from dm seems not right.
>> It will likely introduce a bug by future changes in block layer.
>>
>> I think the right approach is to define a proper block layer interface
>> to reflect the queue configuration change.
>> That's why I said "Updating the queue registration may not be easy".
> 
> I do not see too much of overhead in the future with this approach,
> at least no more than "proper block layer interface".

I don't think so.
Just exporting elv_register_queue() will impose maintenance costs
on request-based dm developers, as described below.

Although the elevator is currently the only queue feature that is
needed only by request-based dm, other such features may be added
to the queue in the future.
Then, the developer who adds such a feature may not notice that
request-based dm needs to register it here, as long as nothing
critical (e.g. a compile error or a panic) happens without it.
As a result, only request-based dm would lack such features.
Therefore, request-based dm developers would always have to watch
block-layer changes and keep the registration-related code up to date.
I think that is a fairly big maintenance cost.

So we should write the code so that block-layer changes take effect
automatically in request-based dm, too, as much as possible.
In this case, you should add and call an interface for the whole queue,
not only for the elevator, since dm can't/shouldn't know how
blk_init_allocated_queue() initializes the queue.
(And the interface should also be used in the other generic paths
(e.g. add_disk()).)
That's the proper block-layer interface I mentioned, and this
approach should have less overhead than yours in the long run.
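
As a rough illustration of the point above, here is a small user-space
model (not kernel code). Every name in it is made up for this sketch
and is not an existing block-layer interface. It only shows why a
whole-queue update call keeps working when the block layer later
attaches a feature that the caller has never heard of.

```c
/* User-space model only; none of these names are real kernel interfaces. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MODEL_MAX_FEATURES = 8 };

struct model_feature {
	const char *name;
	bool registered;	/* stands in for "exported to user space" */
};

struct model_queue {
	struct model_feature features[MODEL_MAX_FEATURES];
	size_t nr_features;
};

/* Block-layer side: attach a feature to the queue during initialization. */
static int model_queue_add_feature(struct model_queue *q, const char *name)
{
	assert(q != NULL);
	if (q->nr_features == MODEL_MAX_FEATURES)
		return -1;
	q->features[q->nr_features].name = name;
	q->features[q->nr_features].registered = false;
	q->nr_features++;
	return 0;
}

/*
 * Hypothetical whole-queue interface: register every feature that is
 * attached but not yet registered.  The caller does not need to know
 * which features exist.  Returns the number of newly registered ones.
 */
static size_t model_queue_update_registration(struct model_queue *q)
{
	size_t i, newly = 0;

	assert(q != NULL);
	for (i = 0; i < q->nr_features; i++) {
		if (!q->features[i].registered) {
			q->features[i].registered = true;
			newly++;
		}
	}
	return newly;
}

/* Count the features that are attached but still unregistered. */
static size_t model_queue_unregistered(const struct model_queue *q)
{
	size_t i, n = 0;

	assert(q != NULL);
	for (i = 0; i < q->nr_features; i++)
		if (!q->features[i].registered)
			n++;
	return n;
}
```

In this model, a caller that registered only the "elevator" entry by
name would silently leave any later-added feature unregistered, while
a caller of model_queue_update_registration() would not.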


> IMHO, unregistering the queue and registering the queue again with
> the elevator, is basically wasting CPU cycles and possibly would
> confuse the user-space, which may be watching the sysfs... 

Right, that's why I said "Updating may not be easy."
(By the way, wasting CPU cycles doesn't matter here, since it happens
 only when we initialize the device and it shouldn't be too much.)


> Or asking block layer to recheck and find what we have changed
> in the request_queue. It does not sound like the best solution.

I think this is a better solution than exposing a part of the queue
internals, as I described above.


> It is better to tell the block-layer that we have added a q->request_fn 
> function, so initialize the elevator.

I don't think it's better, for the reason I described above.
(dm can't/shouldn't know how blk_init_allocated_queue() initializes
 the queue.)



By the way, another approach to optimizing the memory usage would be
to determine whether the dm device is bio-based or request-based
at device creation time, instead of at table binding time.
We want the delayed allocation because the kernel can't decide the
device type until the first table is bound, owing to the auto-detection
mechanism.  The auto-detection is good for keeping compatibility with
existing user-space tools.  But once user-space tools are changed to
specify the device type at device creation time, we can eventually
remove the auto-detection.
Then the kernel can decide the device type in alloc_dev(), so
the initialization code in the kernel will become very simple.
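
A minimal user-space sketch of that idea follows (again not kernel
code; the types, fields and functions are invented for illustration,
and rejecting a mismatched table at bind time is my assumption, not
something decided in this thread).  Once the type is known at
creation, the request-based resources are set up exactly once and
table binding has nothing left to initialize.

```c
/* User-space model only; none of these names are real dm interfaces. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_dm_type { MODEL_DM_TYPE_BIO, MODEL_DM_TYPE_REQUEST };

struct model_mapped_device {
	enum model_dm_type type;
	bool has_rq_mempool;	/* needed only for request-based devices */
	bool has_elevator;	/* needed only for request-based devices */
};

/* Creation time: the type is given, so allocate only what it needs. */
static void model_alloc_dev(struct model_mapped_device *md,
			    enum model_dm_type type)
{
	assert(md != NULL);
	md->type = type;
	md->has_rq_mempool = (type == MODEL_DM_TYPE_REQUEST);
	md->has_elevator = (type == MODEL_DM_TYPE_REQUEST);
}

/*
 * Bind time: no queue reconfiguration, only a type check.
 * Returns 0 on success, -1 if the table type does not match (assumed).
 */
static int model_bind_table(const struct model_mapped_device *md,
			    enum model_dm_type table_type)
{
	assert(md != NULL);
	return md->type == table_type ? 0 : -1;
}
```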

FYI, I actually took this approach at a very early stage of
request-based dm development:
    [kernel]     http://marc.info/?l=dm-devel&m=116656637419846&w=2
    [kernel]     http://marc.info/?l=dm-devel&m=116656689701459&w=2
    [kernel]     http://marc.info/?l=dm-devel&m=116656689707043&w=2
    [user-space] http://marc.info/?l=dm-devel&m=116656689906056&w=2
Now you can change user-space first, before the kernel, since
request-based dm is already available.

Thanks,
Kiyoshi Ueda

