
Re: Ideal Swap Partition Size

On Sat, 2009-01-24 at 11:43 -0430, Patrick O'Callaghan wrote:
> On Sat, 2009-01-24 at 09:03 -0600, Aaron Konstam wrote:
> > On Fri, 2009-01-23 at 15:42 -0800, Gordon Messmer wrote:
> > > Aaron Konstam wrote:
> > > > 
> > > > This is explained in nearly all textbooks on Computer Architecture. So
> > > > the question remains, where is the address space in Linux.
> > > 
> > > Patrick isn't the only one confused by your question.  I can't make 
> > > heads or tails of it.  Are you asking where the mapping between the 
> > > virtual address space and physical memory is done, or what?
> > > 
> > No I am asking where the virtual address space resides of the machine.
> No, sorry, nothing coming through. The question as phrased makes no
> sense.
> If you're asking where does a given address in the virtual address space
> map to, it depends on whether the corresponding page of the process
> address space is currently in RAM, or on backing store (disk or
> whatever), or nowhere because it hasn't been allocated, but the question
> "where is the address space" has no meaning.
> Furthermore, the *machine* as such has no "virtual address space".
> poc
Ok, let me give this a try. Feel free to correct or disregard this if my
explanation is in error or doesn't match what some of you know.

A computer consists of:

	CPU... the processor.  It addresses memory via bits on its output
address lines.
	Cache... This is the fastest memory and these days is part of the
processor's internal hardware.  This is where the processor actually
fetches the code it executes.  The cache is loaded a line at a time
(typically 64 bytes), while memory as a whole is managed in pages; the
size of a page is determined by hardware design in a special block of
the processor called the MMU, or memory management unit.
	Main memory... This is the bulk memory, commonly installed as SIMM or
DIMM modules carrying DDR memory.

	The MMU has a limited addressing range.  If the total number of address
lines is 32, it can only access 0x00000000 to 0xFFFFFFFF (4G), and that
is somewhat more limited by certain requirements for the system.  For
example, locations 0...XXX, where XXX is determined by the processor
design, are where the interrupt vectors reside on an INTEL processor in
real mode.  At the same time, some of the address range is reserved for
memory shared with other hardware, such as video memory, audio memory or
in some cases array processor memory.  Swap is different: it lives on
disk rather than in this address range.  The MMU only knows whether a
page is currently in main memory; it is the kernel that keeps track of
where a page that is not in memory has been put in swap.  The other
hardware pieces often have their own MMUs that control how the main CPU
views their memory, and in some cases the MMUs have a control or flag
access space that lets them communicate to whatever degree they need to.
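
As a quick check on the addressing range mentioned above, here is a
small Python sketch.  The function name address_range is made up for
illustration, and it assumes a byte-addressed machine where every
address line contributes one bit.

```python
def address_range(address_lines):
    """Return (lowest, highest, size_in_bytes) for a byte-addressed bus."""
    size = 1 << address_lines      # 2**n distinct addresses
    return 0, size - 1, size


low, high, size = address_range(32)
print(f"0x{low:08X} to 0x{high:08X}: {size // 2**30} GiB")
# prints "0x00000000 to 0xFFFFFFFF: 4 GiB"
```

Each extra address line doubles the range, so 41 lines would be enough
to cover 2 terabytes.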

	That is just background.  Now let's talk about applications.  An
application has data space and program space.  They seldom overlap, and
the locations within data space are typically addressed in bursts.  That
is, when the application is running, it will typically address a
relatively small portion of the data address space for several
operations, then move to another portion of the data address space and
work there for a while.  In addition, the data space is usually quite
large compared to the program size.  So a memory architecture that only
keeps a portion of the data in place at one time is beneficial, and
yields a cheaper system that can still handle a very large data space.

	For example, the typical INTEL system today has 2 CPUs (dual core), 2
to 8M of embedded cache memory, sometimes some external cache memory,
and 512M to 2G of main memory.  Attached to this is one or more disks
giving up to a few hundred gigabytes of storage, and occasionally a
server with up to 2 terabytes of storage.  Now how do you access, say, a
database of 2G of data?  One way is to create a "virtual address space"
that runs from, say, 0x00000000000 to 0x1FFFFFFFFFF.  This address space
spans 2 terabytes, but most of the data is actually on the disk.

	So we add another form of memory cache, using a swap space and a bit
of software to map the "virtual address space" in blocks (pages) onto
the actual memory.  Each block of the virtual address space is mapped
either to a block of main memory, which is the physical address space,
or to a set of disk sectors in swap, and those disk sectors are the
blocks that get moved in and out.  The MMU, together with the kernel's
paging code, operates by simply detecting whether the block behind a
given virtual address is there in main memory or not.  If not, that is
called a page fault (by analogy with a cache miss), and it generates an
interrupt.  The processor then sends the software module into action,
requesting that the oldest block be written to the disk and the new
block be read in.
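
The "write out the oldest block, read in the new one" step can be
sketched as a toy simulation.  This is only an illustration: the class
name DemandPager, the 4096-byte page size and the strict
oldest-first (FIFO) eviction are assumptions, and a real kernel uses
more refined replacement policies.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size in bytes


class DemandPager:
    """Toy model: a few RAM frames backed by a much larger swap area."""

    def __init__(self, ram_frames):
        self.ram_frames = ram_frames
        self.resident = OrderedDict()  # page number -> True, oldest first
        self.faults = 0
        self.evicted = []              # pages written back to "disk"

    def access(self, virtual_address):
        page = virtual_address // PAGE_SIZE
        if page in self.resident:
            return "hit"
        # Page fault: the page is not in RAM and must be brought in.
        self.faults += 1
        if len(self.resident) >= self.ram_frames:
            # RAM is full: push the oldest resident page out to swap.
            oldest, _ = self.resident.popitem(last=False)
            self.evicted.append(oldest)
        self.resident[page] = True
        return "fault"


pager = DemandPager(ram_frames=2)
for addr in (0x0000, 0x0010, 0x1000, 0x2000, 0x0000):
    print(hex(addr), pager.access(addr))
```

With only two frames, the second access to page 0 is a hit, but by the
time page 0 is touched again it has already been pushed out to make
room for page 2, so it faults a second time.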

So literally the "virtual address" doesn't exist in the machine except
as a block reference number that is mapped either to a block of main
memory or to a specific place on disk (which is why swap space is not
set up like a regular disk partition), and it can be referenced more or
less like standard memory.  Further, when the page fault happens, the
local cache (the cache in the processor) is also refilled to match the
new block, which may or may not generate additional cache misses,
requiring more memory be moved from main RAM into the cache RAM.  To
keep the processor working at max efficiency, most MMUs and processor
systems can do a bit of look ahead and hopefully get the memory stuff
done before it is actually needed.
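
To make the "block reference number" idea concrete, here is a rough
Python sketch of a lookup.  The page_table contents, the frame and
swap slot numbers and the 4096-byte page size are all invented for
illustration; real page tables are multi-level hardware structures.

```python
PAGE_SIZE = 4096  # assumed page size in bytes

# Hypothetical page table: virtual page number -> (where, index)
page_table = {
    0: ("ram", 7),    # resident in physical frame 7
    1: ("swap", 42),  # paged out to swap slot 42
}


def translate(virtual_address):
    """Split an address into page and offset, then look the page up."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    entry = page_table.get(page)
    if entry is None:
        return ("unmapped", None)      # never allocated
    where, index = entry
    if where == "ram":
        # Physical address = frame base + offset within the page.
        return ("ram", index * PAGE_SIZE + offset)
    return ("swap", index)             # must be paged in before use


print(translate(0x0123))  # page 0: in RAM
print(translate(0x1ABC))  # page 1: in swap
print(translate(0x5000))  # page 5: not allocated
```

The three results correspond to the three cases poc listed: the page
is in RAM, on backing store, or nowhere because it was never
allocated.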

Most systems today implement these memory moves by DMA (direct memory
access), which means they can occur during the non-memory access cycles
of the processor.  For example, a jump instruction without look ahead
takes roughly 5 to 10 processor cycles (depending on the architecture):
one cycle to read the instruction, one to four more to read the address
(depending on the exact form of jump and associated actions), 2 or 3
cycles to decode the instruction, then one to two cycles to change the
instruction pointer to the destination.  Since the cycles to decode the
instruction and the cycles to set the instruction pointer do not require
memory access, a DMA process can use those cycles to perform the cache
and swap operations, often simultaneously (by managing the addressing
appropriately) and without interrupting the flow of the processor,
making the whole thing virtually seamless.  However, as processors have
become pipelined and memory access more cycle intensive, DMA is not as
efficient as it once was.  Fortunately the changes in memory
architecture have had some effect on that as well.  DDR means Double
Data Rate: data is transferred on both the rising and the falling edge
of the clock, so more data moves per cycle, and some overlapping of
memory operations is possible if the memory controller, board
addressing and so forth are designed to support it.  This also requires
some fancy footwork on the RAM boards and/or the RAM chips themselves.
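
For what it is worth, the cycle budget for the jump instruction
described above can be added up in a few lines of Python.  The stage
names and the JUMP_STAGES table are just my own labels for the
per-stage figures quoted in the text, and the numbers are illustrative
rather than taken from any particular processor manual.

```python
# (min cycles, max cycles, needs memory access) per stage.
JUMP_STAGES = {
    "read instruction": (1, 1, True),
    "read address": (1, 4, True),
    "decode instruction": (2, 3, False),
    "set instruction pointer": (1, 2, False),
}


def cycle_budget(stages):
    """Return ((total_min, total_max), (free_min, free_max))."""
    total_min = sum(lo for lo, _, _ in stages.values())
    total_max = sum(hi for _, hi, _ in stages.values())
    # Cycles with no memory access are the ones DMA could borrow.
    free_min = sum(lo for lo, _, mem in stages.values() if not mem)
    free_max = sum(hi for _, hi, mem in stages.values() if not mem)
    return (total_min, total_max), (free_min, free_max)


total, free = cycle_budget(JUMP_STAGES)
print(f"whole jump: {total[0]} to {total[1]} cycles")
print(f"free for DMA: {free[0]} to {free[1]} cycles")
```

So on these figures the stages add up to 5 to 10 cycles, of which 3 to
5 leave the memory bus idle.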

I hope this is fairly clear and mostly correct.  Please feel free to
adjust my input if not.

The only thing constant in high tech is the rate of change.

Les H
