[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [Libguestfs] Proposed new file apis

On 23/08/10 17:04, Richard W.M. Jones wrote:
On Mon, Aug 23, 2010 at 04:58:38PM +0100, Matthew Booth wrote:
On 23/08/10 16:23, Richard W.M. Jones wrote:
BTW you need two spaces after full stops.

Out of curiosity is that a technical or a stylistic requirement?
Stylistically, double spaces after a full stop have always seemed
anachronistic to me.

It's a stylistic one.

Meh. Done anyway.

  +hpread returns a NULL C<content>   on error.");
  +  ("hwrite", (RErr, [Int "handle"; BufferIn "content"]), 273,
  +   [ProtocolLimitWarning], [],
  +   "write data to an open handle",
  +   "\
  +This command writes all the data in C<content>   to C<handle>. It
  +returns an error if this fails.");
So we're definitely disallowing partial writes, even though in some
future version we might allow writes to non-file handles?  This API
won't allow partial writes.

Yes. I think reads and writes are quite different in this respect.
You always know exactly what you want to write, but you may not know
what can be read. If this assumption is wrong, now's the time to

I think we should change pwrite in that case.  pwrite only works on
files, and AFAICT there is never a situation where a partial write is
possible for a file, so we should turn pwrite into a full write and
change the documentation accordingly.


Are you saying that a String can be magically turned into an
enumerated type somehow? I was trying to work out how to use an
enumerated type.

Sadly not.  However we have a long history of using strings instead of
enumerated types throughout the API.  In any case, either 0/1/2 needs
to be documented, or this should be a string.  It should not be an
undocumented int.


  +On success, hseek returns the new offset from the beginning of
  +the file. It returns an offset of -1 on failure.");
  +  ("htruncate", (RErr, [Int "handle"; Int64 "size"]), 276, [],
  +   [],
  +   "truncate a file",
  +   "\
  +This command sets the size of the file referred to by C<handle>
  +to C<size>. If the file was previously larger than C<size>, the
  +extra data will be lost. If the file was previously smaller than
  +C<size>, the file will be zero-filled. The latter case will
  +typically create a sparse file.");
Since we already have truncate and allocate calls, I wonder what the
benefit is of having handle versions.  I mean, these are extremely
infrequent operations compared to hread/hwrite, so I doubt there's any
optimization benefit in adding these, and if we add these, why not
other very infrequent ops.

htruncate is required when streaming a file with sparse sections. i.e.

h = hopen_file("/foo");
hwrite(h, "foo");
htruncate(h, hseek(h, 1024, CUR));

Technically you only need this for a sparse section at the end of a
file. You could do it by mixing paths and file handles, but that
would be a bit jarring.

I don't think I understand this.  If you're writing to a file, you can
truncate it once (before any writing) to set its size:

   truncate-size "/output" 100M;
   hopen-file "/output";
   hwrite ...;
   hwrite ...;
   hwrite ...;
   hclose fd;

Why is htruncate needed during writes?

You assume above that you know in advance how large the file you're writing will be. As I said, you could move that truncate-size to the end, but mixing paths and handles is nasty and occasionally error-prone.


h = hopen_file("/foo");

hwrite ...;
hwrite ...;

  mv("/foo", "/bar");

hwrite ...;
hwrite ...;
truncate("/foo", x);

You've now created 2 files rather than 1. If you use htruncate rather than truncate here it's robust to any changes in paths.

hallocate is similarly there to avoid a jarring mix of paths and
file handles when operating on a single file.  I could also add
fchmod and fchown, I guess. Any more?

My point was we don't need to duplicate all of these.

hallocate has the same argument as above: it's a non-path based alternative. The argument for it is the same as the argument for all the posix f* calls which take file descriptors instead of path names. There may be a performance benefit to it, but principally it's harder to make a class of stupid mistake.

Matthew Booth, RHCA, RHCSS
Red Hat Engineering, Virtualisation Team

GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490

[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]