[dm-devel] [PATCH v10 4/8] block: Add bio_reset()

Jens Axboe axboe at kernel.dk
Fri Sep 7 22:44:00 UTC 2012


On 2012-09-07 16:25, Kent Overstreet wrote:
> On Fri, Sep 07, 2012 at 04:06:45PM -0600, Jens Axboe wrote:
>> On 2012-09-07 15:55, Jens Axboe wrote:
>>> On 2012-09-07 14:58, Kent Overstreet wrote:
>>>> On Thu, Sep 06, 2012 at 07:34:18PM -0600, Jens Axboe wrote:
>>>>> On 2012-09-06 16:34, Kent Overstreet wrote:
>>>>>> Reusing bios is something that's been highly frowned upon in the past,
>>>>>> but driver code keeps doing it anyways. If it's going to happen anyways,
>>>>>> we should provide a generic method.
>>>>>>
>>>>>> This'll help with getting rid of bi_destructor - drivers/block/pktcdvd.c
>>>>>> was open coding it, by doing a bio_init() and resetting bi_destructor.
>>>>>>
>>>>>> This required reordering struct bio, but the block layer is not yet
>>>>>> nearly fast enough for any cacheline effects to matter here.
>>>>>
>>>>> That's an odd and misplaced comment. Was just doing testing today at 5M
>>>>> IOPS, and even years back we've had cache effects for O_DIRECT in higher
>>>>> speed setups.
>>>>
>>>> Ah, I wasn't aware that you were pushing that many iops through the
>>>> block layer - most I've tested myself was around 1M. It wouldn't
>>>> surprise me if cache effects in struct bio mattered around 5M...
>>>
>>> 5M is nothing, just did 13.5M :-)
>>>
>>> But we can reshuffle for now. As mentioned, we're way overdue for a
>>> decent look at cache profiling in any case.
>>
>> No ill effects seen so far, fwiw:
>>
>>   read : io=1735.8GB, bw=53690MB/s, iops=13745K, runt= 33104msec
> 
> Cool!
> 
> I'd be really curious to see a profile. Of the patches I've got queued
> up I don't think anything's going to significantly affect performance
> yet, but I'm hoping the cleanups/immutable bvec stuff/efficient bio
> splitting enables some performance gains.

Got more work to do, but certainly not a problem sharing.

> Well, it certainly will for stacking drivers, but I'm less sure what
> it's going to look like running on just a raw flash device.
> 
> My end goal is making generic_make_request handle arbitrary sized bios,
> and have (efficient) splitting happen as required. This'll get rid of a
> bunch of code and complexity in the upper layers, in bio_add_page() and
> elsewhere. More in the stacking drivers - merge_bvec_fn is horrendous to
> support.

It is a nasty interface, in retrospect probably a mistake. As long as we
don't split ever on non-stacking drivers, I don't care too much. And it
would get rid of complexity in those drivers, so that's a nice win.
merge_bvec_fn is not only a bad interface, it's also pretty slow...

> I think I might be able to efficiently get rid of the
> segments-after-merging precalculating, and just have segments merged
> once. That'd get rid of a couple fields in struct bio, and get it under
> 2 cachelines last I counted.

It's 2 cachelines now, but reducing is always a great thing. Getting rid
of the repeated recalculation after merge would be a nice win.

> Course, all this doesn't matter as much for 4k bios so it may just be a
> wash for you.

Right, for me it doesn't matter. As long as you don't slow me down :-)

-- 
Jens Axboe

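Note on the quoted commit message: it says bio_reset() required
reordering struct bio. One plausible reason for such a reorder is to
group the fields that must be cleared at the front of the structure, so
that a single memset() up to a fixed offset resets them and leaves the
rest alone. The following is a minimal user-space sketch of that idea
only. struct demo_bio, DEMO_RESET_BYTES and demo_bio_reset() are
hypothetical names made up for illustration; they are not the kernel's
definitions and not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a bio-like structure (NOT the kernel's struct bio). */
struct demo_bio {
    /* Per-I/O state: cleared on every reset, so it is grouped first. */
    unsigned long sector;
    unsigned int  size;
    unsigned int  idx;
    unsigned long flags;

    /* Allocation state: must survive a reset, so it is placed last. */
    unsigned int  max_vecs;
    void         *vec_table;
    void         *pool;
};

/* Everything before max_vecs is wiped by a reset (assumed layout). */
#define DEMO_RESET_BYTES offsetof(struct demo_bio, max_vecs)

/* Zero the per-I/O fields in one call; fields from max_vecs on are untouched. */
static void demo_bio_reset(struct demo_bio *b)
{
    assert(b != NULL);
    memset(b, 0, DEMO_RESET_BYTES);
}
```

With this layout, adding a new per-I/O field means placing it before the
first preserved member; the reset function itself does not change.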

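Note on the benchmark line quoted above: the bandwidth, IOPS and runtime
figures can be cross-checked against each other, and they are consistent
with the 4k bios mentioned at the end of the message. The sketch below
assumes the tool reports MB and GB in binary units (MiB, GiB) and that
the "K" in the IOPS figure means 1000; both are assumptions, since the
message does not say. The helper names are hypothetical.

```c
/* Average bytes per I/O implied by a bandwidth (MiB/s) and an IOPS figure. */
static double bytes_per_io(double bw_mib_per_s, double iops)
{
    return bw_mib_per_s * 1024.0 * 1024.0 / iops;
}

/* Total data moved in GiB for a bandwidth (MiB/s) sustained over runtime_ms. */
static double total_gib(double bw_mib_per_s, double runtime_ms)
{
    return bw_mib_per_s * (runtime_ms / 1000.0) / 1024.0;
}
```

Under those assumptions, bytes_per_io(53690, 13745e3) comes out just
under 4096, and total_gib(53690, 33104) lands close to the reported
io=1735.8GB.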