rpm-guide rpm-guide-package-structure-en.xml,NONE,1.1
Stuart Ellis (elliss)
fedora-docs-commits at redhat.com
Tue Oct 4 01:56:33 UTC 2005
Author: elliss
Update of /cvs/docs/rpm-guide
In directory cvs-int.fedora.redhat.com:/tmp/cvs-serv802
Added Files:
rpm-guide-package-structure-en.xml
Log Message:
--- NEW FILE rpm-guide-package-structure-en.xml ---
<!-- $Id: -->
<chapter id="ch-package-structure">
<title>RPM Package File Structure</title>
<para>
Copyright (c) 2005 by Eric Foster-Johnson. This material may be
distributed only subject to the terms and conditions set forth in
the Open Publication License, v1.0 or later (the latest version is
presently available at http://www.opencontent.org/openpub/).
</para>
<para>
In This Appendix
</para>
<para>
*RPM package file structure
</para>
<para>
*RPM header entry formats
</para>
<para>
*Payload format
</para>
<para>
This appendix describes the format of RPM package files. You can
combine this information with C, Perl, or Python data structures to
read the contents of a package. In all cases, you should access
elements in an RPM file using one of the available programming
libraries. Do not attempt to manipulate the files directly, as you
may inadvertently damage the RPM file.
</para>
<para>
Cross Reference
</para>
<para>
Chapters 16, 17, and 18 cover programming with C, Python, and Perl,
respectively.
</para>
<para>
The RPM package format described here has been standardized as part
of the Linux Standard Base, or LSB, version 1.3.
</para>
<para>
Cross Reference
</para>
<para>
The LSB 1.3 section on package file formats is available at
www.linuxbase.org/spec/refspecs/LSB_1.3.0/gLSB/gLSB.html#PACKAGEFMT.
</para>
<sect1>
<title>The Package File</title>
<para>
Each RPM package is delivered as a single file. Every RPM file
consists of four sections, in the following order:
</para>
<para>
*A lead or file identifier
</para>
<para>
*A signature
</para>
<para>
*Header information
</para>
<para>
*Archive of the payload, the files to install
</para>
<para>
All values are encoded in network byte order, for portability to
multiple processor architectures.
</para>
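<para>
As a minimal sketch of what network (big-endian) byte order means in
practice, the following Python fragment decodes the same four bytes
both ways. The helper name read_u32 and the sample bytes are
illustrative assumptions, not part of any RPM library.
</para>
<programlisting>
```python
import struct

def read_u32(data, offset=0):
    """Read one unsigned 32-bit integer stored in network byte order."""
    return struct.unpack_from("!I", data, offset)[0]

# Four sample bytes, most significant byte first.
raw = bytes([0x00, 0x00, 0x01, 0x02])

value = read_u32(raw)                    # big-endian interpretation: 258
swapped = int.from_bytes(raw, "little")  # what a little-endian read would yield

print(value, swapped)
```
</programlisting>
<para>
Because every multi-byte value in the file is stored this way, code
that reads a package on a little-endian processor must convert each
value explicitly, as the "!" format prefix does here.
</para>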
<sect2>
<title>The file identifier</title>
<para>
Also called the lead or the rpmlead, the identifier marks the
file as an RPM file. It contains a magic number that the file
command uses to detect RPM files. It also contains version and
architecture information.
</para>
<para>
The start of the identifier is the so-called magic number. The
file command reads the first few bytes of a file and compares
the values found with the contents of /usr/share/magic
(/etc/magic on many UNIX systems), a database of magic numbers.
This allows the file command to quickly identify files.
</para>
<para>
The identifier includes the RPM version number, that is, the
version of the RPM file format used for the package. The
identifier also has a flag that gives the type of the RPM file:
binary package or source package. An architecture flag allows
RPM software to double-check that you are not trying to install
a package built for an incompatible architecture.
</para>
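<para>
The following Python sketch decodes a lead to show how the magic
number, version, type, and architecture fields fit together. It
assumes the 96-byte lead layout given in the LSB package format
description; the function name parse_lead and the sample values are
hypothetical, and the sample lead is built in memory rather than read
from a real package. For real packages, use the RPM libraries.
</para>
<programlisting>
```python
import struct

# Assumed lead layout (96 bytes, network byte order):
# magic(4) major(1) minor(1) type(2) archnum(2) name(66)
# osnum(2) signature_type(2) reserved(16)
LEAD_FORMAT = "!4sBBhh66shh16s"
LEAD_MAGIC = bytes([0xED, 0xAB, 0xEE, 0xDB])

def parse_lead(data):
    """Decode the lead from the first 96 bytes of an RPM file."""
    if struct.calcsize(LEAD_FORMAT) > len(data):
        raise ValueError("data is shorter than the lead")
    (magic, major, minor, pkg_type, archnum,
     name, osnum, sig_type, _reserved) = struct.unpack_from(LEAD_FORMAT, data)
    if magic != LEAD_MAGIC:
        raise ValueError("not an RPM file: bad magic number")
    return {
        "version": (major, minor),
        # Type 0 marks a binary package, type 1 a source package.
        "type": "source" if pkg_type == 1 else "binary",
        "archnum": archnum,
        "name": name.split(b"\0", 1)[0].decode("ascii", "replace"),
        "osnum": osnum,
        "signature_type": sig_type,
    }

# Synthetic lead for illustration only; all values are made up.
sample = struct.pack(LEAD_FORMAT, LEAD_MAGIC, 3, 0, 0, 1,
                     b"example-1.0-1", 1, 5, b"")
lead = parse_lead(sample)
print(lead["version"], lead["type"], lead["name"])
```
</programlisting>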
</sect2>
<sect2>
<title>The signature</title>
<para>
The signature appears after the lead or identifier section. The
RPM signature helps verify the integrity of the package, and
optionally the authenticity.
</para>
<para>
The signature is computed by applying a mathematical function to
the header and archive sections of the file. The function can be
a cryptographic signature, such as one produced by PGP (Pretty
Good Privacy), or a message digest such as MD5.
</para>
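<para>
To illustrate the message digest case, the sketch below computes an
MD5 digest over a stand-in for the header and archive bytes and
compares it with a stored value. The function name digest_matches and
the sample bytes are assumptions made for this example; it shows only
the integrity check and does not reproduce how the RPM tools store or
verify signatures.
</para>
<programlisting>
```python
import hashlib

def digest_matches(header_and_payload, expected_md5):
    """Compare the MD5 of the header+payload bytes with a stored digest."""
    return hashlib.md5(header_and_payload).digest() == expected_md5

# Stand-in bytes; a real check would use the bytes that follow
# the signature section of the package file.
body = b"header bytes" + b"payload bytes"
stored = hashlib.md5(body).digest()

intact = digest_matches(body, stored)            # unchanged data
damaged = digest_matches(body + b"x", stored)    # one extra byte appended

print(intact, damaged)
```
</programlisting>
<para>
A digest of this kind detects accidental corruption. Establishing who
produced the package requires the cryptographic signature.
</para>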
</sect2>
<sect2>
<title>The header</title>
<para>
The identifier section no longer contains enough information to
describe modern RPMs. Furthermore, the identifier section is
nowhere near as flexible as today's packages require. To
counter these deficiencies, the header section was introduced to
include more information about the package.
</para>
<para>
The header structure contains three parts:
</para>
<para>
*Header record
</para>
<para>
*One or more header index record structures
</para>
<para>
*Data for the index record structures
</para>
<para>
The header record identifies this as the RPM header. It also
contains a count of the number of index records and the size of
the index record data.
</para>
<para>
Each index record uses a structure that contains a tag number
for the data it contains. This includes tag IDs for the
copyright message, name of the package, version number, and so
on. A type number identifies the type of the item. An offset
indicates where in the data section the data for this header
item begins. A count indicates how many items of the given type
are in this header entry. You can multiply the count by the size
of the type to get the number of bytes used for the header
entry.
</para>
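<para>
The relationship between the header record, the index records, and
the count-times-size calculation can be sketched in Python as follows.
The sketch assumes the layout given in the LSB description: a 16-byte
header record followed by 16-byte index records, each holding four
32-bit values (tag, type, offset, count). The names parse_header and
FIXED_SIZES, and the sample tag number, are hypothetical, and the
header is built in memory instead of being read from a package.
</para>
<programlisting>
```python
import struct

# Assumed header record start: three magic bytes plus a version byte.
HEADER_MAGIC = bytes([0x8E, 0xAD, 0xE8, 0x01])

# Type number -> size in bytes of one item, for fixed-size types only.
FIXED_SIZES = {1: 1, 2: 1, 3: 2, 4: 4, 5: 8, 7: 1}

def parse_header(data):
    """Return the index records and the size of the data section."""
    magic, _reserved, nindex, hsize = struct.unpack_from("!4s4sII", data, 0)
    if magic != HEADER_MAGIC:
        raise ValueError("bad header magic")
    entries = []
    for i in range(nindex):
        tag, kind, offset, count = struct.unpack_from(
            "!IIII", data, 16 + 16 * i)
        # String types have no fixed item size, so nbytes stays None.
        size = FIXED_SIZES.get(kind)
        nbytes = None if size is None else count * size
        entries.append((tag, kind, offset, count, nbytes))
    return entries, hsize

# Synthetic header: one index record describing two 32-bit integers.
store = struct.pack("!II", 10, 20)
index = struct.pack("!IIII", 1009, 4, 0, 2)   # tag 1009 is only an example
header = (struct.pack("!4s4sII", HEADER_MAGIC, b"", 1, len(store))
          + index + store)

entries, hsize = parse_header(header)
print(entries, hsize)
```
</programlisting>
<para>
The multiplication applies only to types with a fixed item size.
String data is NUL-terminated, so its length has to be found by
scanning the data section.
</para>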
<para>
Table D-1 lists the type identifiers.
</para>
<para>
Table D-1 Header type identifiers
</para>
<informaltable frame="all">
<tgroup cols="3">
<tbody>
<row>
<entry>
<para>
Constant
</para>
</entry>
<entry>
<para>
Value
</para>
</entry>
<entry>
<para>
Size in Bytes
</para>
</entry>
</row>
<row>
<entry>
<para>
RPM_NULL_TYPE
</para>
</entry>
<entry>
<para>
0
</para>
</entry>
<entry>
<para>
No size
[...2175 lines suppressed...]
1.3
</para>
</entry>
<entry>
<para>
The package conforms to the Linux Standard Base RPM
format.
</para>
</entry>
</row>
<row>
<entry>
<para>
rpmlib(VersionedDependencies)
</para>
</entry>
<entry>
<para>
3.0.3-1
</para>
</entry>
<entry>
<para>
The package holds dependencies or prerequisites that
have versions associated with them.
</para>
</entry>
</row>
<row>
<entry>
<para>
rpmlib(PayloadFilesHavePrefix)
</para>
</entry>
<entry>
<para>
4.0-1
</para>
</entry>
<entry>
<para>
File names in the archive have a "." prepended to
the names.
</para>
</entry>
</row>
<row>
<entry>
<para>
rpmlib(CompressedFileNames)
</para>
</entry>
<entry>
<para>
3.0.4-1
</para>
</entry>
<entry>
<para>
The package uses the RPMTAG_DIRINDEXES,
RPMTAG_DIRNAMES and RPMTAG_BASENAMES tags for
specifying file names.
</para>
</entry>
</row>
<row>
<entry>
<para>
/bin/sh
</para>
</entry>
<entry>
<para>
NA
</para>
</entry>
<entry>
<para>
Indicates a requirement for the Bourne shell to run
the installation scripts.
</para>
</entry>
</row>
</tbody>
</tgroup>
</informaltable>
</sect3>
</sect2>
<sect2>
<title>The payload</title>
<para>
The payload, or archive, section contains the actual files used
in the package. These are the files that the rpm command
installs when you install the package. To save space, data in
the archive section is compressed in GNU gzip format.
</para>
<para>
Once uncompressed, the data is in cpio format, which is how the
rpm2cpio command can do its work. In cpio format, the payload is
made up of records, one per file. Table D-10 lists the record
structure.
</para>
<para>
Table D-10 cpio file record structure
</para>
<informaltable frame="all">
<tgroup cols="2">
<tbody>
<row>
<entry>
<para>
Element
</para>
</entry>
<entry>
<para>
Holds
</para>
</entry>
</row>
<row>
<entry>
<para>
cpio header
</para>
</entry>
<entry>
<para>
Information on the file, such as the file mode
(permissions)
</para>
</entry>
</row>
<row>
<entry>
<para>
File name
</para>
</entry>
<entry>
<para>
NUL-terminated string
</para>
</entry>
</row>
<row>
<entry>
<para>
Padding
</para>
</entry>
<entry>
<para>
0 to 3 bytes, as needed, to align the next element on
a 4-byte boundary
</para>
</entry>
</row>
<row>
<entry>
<para>
File data
</para>
</entry>
<entry>
<para>
The contents of the file
</para>
</entry>
</row>
<row>
<entry>
<para>
Padding
</para>
</entry>
<entry>
<para>
0 to 3 bytes, as needed, to align the next file record
on a 4-byte boundary
</para>
</entry>
</row>
</tbody>
</tgroup>
</informaltable>
<para>
The information in the cpio header duplicates that of the RPM
file-information header elements.
</para>
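<para>
As a rough illustration of the record structure and its 4-byte
alignment, the following Python sketch builds one record, compresses
it with gzip, and reads it back. It assumes the cpio "new ASCII"
header form (a 6-byte magic of 070701 followed by thirteen 8-digit
hexadecimal fields); the function names pad4, make_record, and
read_record and the sample file are hypothetical. To unpack a real
package, use rpm2cpio or the RPM libraries.
</para>
<programlisting>
```python
import gzip

def pad4(n):
    """Bytes of padding needed to reach the next 4-byte boundary (0 to 3)."""
    return -n % 4

def make_record(name, mode, data):
    """Build one synthetic cpio record for the example."""
    name_bytes = name.encode() + b"\0"
    # ino, mode, uid, gid, nlink, mtime, filesize, devmajor, devminor,
    # rdevmajor, rdevminor, namesize, check
    fields = [0, mode, 0, 0, 1, 0, len(data), 0, 0, 0, 0, len(name_bytes), 0]
    head = b"070701" + b"".join(b"%08X" % f for f in fields) + name_bytes
    head += b"\0" * pad4(len(head))
    return head + data + b"\0" * pad4(len(data))

def read_record(buf, pos=0):
    """Parse one cpio record; return (name, mode, data, next_pos)."""
    if buf[pos:pos + 6] != b"070701":
        raise ValueError("unexpected cpio magic")
    fields = [int(buf[pos + 6 + 8 * i:pos + 14 + 8 * i], 16)
              for i in range(13)]
    mode, filesize, namesize = fields[1], fields[6], fields[11]
    name_start = pos + 110                      # header is 110 bytes long
    name = buf[name_start:name_start + namesize - 1].decode()  # drop the NUL
    data_start = name_start + namesize
    data_start += pad4(data_start)              # padding after the file name
    data = buf[data_start:data_start + filesize]
    next_pos = data_start + filesize
    next_pos += pad4(next_pos)                  # padding after the file data
    return name, mode, data, next_pos

# Compress a one-record archive, then read it back.
payload = gzip.compress(
    make_record("./etc/example.conf", 0o100644, b"key=value\n"))
archive = gzip.decompress(payload)
name, mode, data, next_pos = read_record(archive)
print(name, oct(mode), data, next_pos)
```
</programlisting>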
</sect2>
</sect1>
</chapter>
<!--
Local variables:
mode: xml
sgml-parent-document:("rpm-guide-en.xml" "book" "chapter")
fill-column: 72
End:
-->