/proc/sys/net/ipv4 parameters (see sysctl) (LONG, can be ignored)

Jeff Kinz jkinz at kinz.org
Thu Oct 13 18:47:51 UTC 2005


Apologies in advance if this post bothers anyone.

/proc/sys/net/ipv4 parameters

Just gave an answer to someone on this stuff and realized that this info, 
although readily available in the lartc manual, seems to be difficult for
many to find. (Including me :-) )

Hopefully having this info in one more place will help.

The lartc manual is :
"Linux Advanced Routing & Traffic Control HOWTO"

available here:
http://lartc.org/howto/   

And here : (copy at TLDP.org )
http://www.ibiblio.org/pub/Linux/docs/HOWTO/other-formats/html_single/Adv-Routing-HOWTO.html

This is a great document; I recommend reading it cover to cover.


Chapter 13. Kernel network parameters  ( /proc/sys/net/ipv4 )
13.2. Obscure settings

Ok, there are a lot of parameters which can be modified. We try to list
them all. Also documented (partly) in Documentation/ip-sysctl.txt.

Some of these settings have different defaults based on whether you
answered 'Yes' to 'Configure as router and not host' while compiling
your kernel.

Oskar Andreasson also has a page on all these flags and it appears to be
better than ours, so also check http://ipsysctl-tutorial.frozentux.net/.

13.2.1. Generic ipv4

As a generic note, most rate limiting features don't work on loopback,
so don't test them locally. The limits are supplied in 'jiffies', and
are enforced using the earlier mentioned token bucket filter.

The kernel has an internal clock which runs at 'HZ' ticks (or 'jiffies')
per second. On Intel, 'HZ' is mostly 100. So setting a *_rate file to,
say 50, would allow for 2 packets per second. The token bucket filter is
also configured to allow for a burst of at most 6 packets, if enough
tokens have been earned.
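
The arithmetic above can be sketched in a few lines of Python. This is only an illustration, not the kernel's code: the names packets_per_second and TokenBucket are made up for this example, HZ=100 and the burst of 6 are the values quoted in the text, and the bucket is a simplified model that earns credit in jiffies and spends one *_rate worth of credit per packet.

```python
HZ = 100   # ticks ('jiffies') per second; the typical Intel value from the text
BURST = 6  # maximum burst size mentioned in the text


def packets_per_second(rate_jiffies, hz=HZ):
    """Sustained packet rate allowed by a *_rate value given in jiffies."""
    return hz / rate_jiffies


class TokenBucket:
    """Toy token bucket: each packet costs `rate_jiffies` of elapsed time."""

    def __init__(self, rate_jiffies, burst=BURST):
        self.cost = rate_jiffies
        self.capacity = rate_jiffies * burst  # credit is capped at `burst` packets
        self.credit = 0                       # earned credit, in jiffies
        self.last = 0                         # time of the previous call, in jiffies

    def allow(self, now):
        """Return True if a packet may be sent at time `now` (in jiffies)."""
        self.credit = min(self.capacity, self.credit + (now - self.last))
        self.last = now
        if self.credit >= self.cost:
            self.credit -= self.cost
            return True
        return False


if __name__ == "__main__":
    print(packets_per_second(50))
    bucket = TokenBucket(rate_jiffies=50)
    # After a long idle period, only BURST packets pass back to back.
    print([bucket.allow(1000) for _ in range(8)])
```

With a *_rate of 50 the sketch allows six packets in a row after a long idle period and then one more every 50 jiffies.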

Several entries in the following list have been copied from
/usr/src/linux/Documentation/networking/ip-sysctl.txt, written by Alexey
Kuznetsov <kuznet at ms2.inr.ac.ru> and Andi Kleen <ak at muc.de>

/proc/sys/net/ipv4/icmp_destunreach_rate

    If the kernel decides that it can't deliver a packet, it will drop
it, and send the source of the packet an ICMP notice to this effect.
/proc/sys/net/ipv4/icmp_echo_ignore_all

    Don't act on echo packets at all. Please don't set this by default,
but if you are used as a relay in a DoS attack, it may be useful.
/proc/sys/net/ipv4/icmp_echo_ignore_broadcasts [Useful]

    If you ping the broadcast address of a network, all hosts are
supposed to respond. This makes for a dandy denial-of-service tool. Set
this to 1 to ignore these broadcast messages.
/proc/sys/net/ipv4/icmp_echoreply_rate

    The rate at which echo replies are sent to any one destination.
/proc/sys/net/ipv4/icmp_ignore_bogus_error_responses

    Set this to ignore ICMP errors caused by hosts in the network
reacting badly to frames sent to what they perceive to be the broadcast
address.
/proc/sys/net/ipv4/icmp_paramprob_rate

    A relatively unknown ICMP message, which is sent in response to
incorrect packets with broken IP or TCP headers. With this file you can
control the rate at which it is sent.
/proc/sys/net/ipv4/icmp_timeexceed_rate

    This is the famous cause of the 'Solaris middle star' in
traceroutes. Limits the rate of ICMP Time Exceeded messages sent. 
/proc/sys/net/ipv4/igmp_max_memberships

    Maximum number of listening igmp (multicast) sockets on the host.
FIXME: Is this true?
/proc/sys/net/ipv4/inet_peer_gc_maxtime

    FIXME: Add a little explanation about the inet peer storage? Maximum
interval between garbage collection passes. This interval is in effect
under low (or absent) memory pressure on the pool. Measured in jiffies.
/proc/sys/net/ipv4/inet_peer_gc_mintime

    Minimum interval between garbage collection passes. This interval is
in effect under high memory pressure on the pool. Measured in jiffies.
/proc/sys/net/ipv4/inet_peer_maxttl

    Maximum time-to-live of entries. Unused entries will expire after
this period of time if there is no memory pressure on the pool (i.e.
when the number of entries in the pool is very small). Measured in
jiffies.
/proc/sys/net/ipv4/inet_peer_minttl

    Minimum time-to-live of entries. Should be enough to cover fragment
time-to-live on the reassembling side. This minimum time-to-live is
guaranteed if the pool size is less than inet_peer_threshold. Measured
in jiffies.
/proc/sys/net/ipv4/inet_peer_threshold

    The approximate size of the INET peer storage. Starting from this
threshold entries will be thrown out aggressively. This threshold also
determines entries' time-to-live and time intervals between garbage
collection passes. More entries, less time-to-live, less GC interval.
/proc/sys/net/ipv4/ip_autoconfig

    This file contains the number one if the host received its IP
configuration by RARP, BOOTP, DHCP or a similar mechanism. Otherwise it
is zero.
/proc/sys/net/ipv4/ip_default_ttl

    Time To Live of packets. Set to a safe 64. Raise it if you have a
huge network. Don't do so for fun - routing loops cause much more damage
that way. You might even consider lowering it in some circumstances.
/proc/sys/net/ipv4/ip_dynaddr

    You need to set this if you use dial-on-demand with a dynamic
interface address. Once your demand interface comes up, any local TCP
sockets which haven't seen replies will be rebound to have the right
address. This solves the problem that the connection that brings up your
interface itself does not work, but the second try does.
/proc/sys/net/ipv4/ip_forward

    Whether the kernel should attempt to forward packets. Off by default.
/proc/sys/net/ipv4/ip_local_port_range

    Range of local ports for outgoing connections. Actually quite small
by default, 1024 to 4999.
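
To see how small that default range is, here is a minimal Python sketch. It assumes the file holds two whitespace-separated numbers (low and high port); the function names are invented for this example, and the sample string stands in for the real file so the snippet runs anywhere.

```python
def parse_port_range(text):
    """Parse the two whitespace-separated numbers from ip_local_port_range."""
    low, high = (int(field) for field in text.split())
    return low, high


def port_count(low, high):
    """Number of local ports in the inclusive range low..high."""
    return high - low + 1


if __name__ == "__main__":
    # On a Linux box the text would come from:
    #   open("/proc/sys/net/ipv4/ip_local_port_range").read()
    low, high = parse_port_range("1024\t4999\n")
    print(low, high, port_count(low, high))  # 1024 4999 3976
```

So the default quoted above leaves fewer than 4000 ports for outgoing connections.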
/proc/sys/net/ipv4/ip_no_pmtu_disc

    Set this if you want to disable Path MTU discovery - a technique to
determine the largest Maximum Transfer Unit possible on your path. See
also the section on Path MTU discovery in the Cookbook chapter.
/proc/sys/net/ipv4/ipfrag_high_thresh

    Maximum memory used to reassemble IP fragments. When
ipfrag_high_thresh bytes of memory is allocated for this purpose, the
fragment handler will toss packets until ipfrag_low_thresh is reached.
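
The high/low threshold pair works like a pair of watermarks. The following Python toy model shows the idea only: FragmentPool is a hypothetical name, the byte values in the usage note are invented, and dropping the oldest fragments first is an assumption of this sketch, not a statement about which packets the kernel picks.

```python
class FragmentPool:
    """Toy model of the high/low watermark behaviour described above."""

    def __init__(self, high_thresh, low_thresh):
        self.high = high_thresh
        self.low = low_thresh
        self.queue = []  # sizes of stored fragments, oldest first
        self.used = 0    # bytes currently held

    def add(self, size):
        """Store a fragment; return how many stored fragments were dropped."""
        self.queue.append(size)
        self.used += size
        dropped = 0
        if self.used >= self.high:
            # Over the high mark: evict oldest entries down to the low mark.
            while self.queue and self.used > self.low:
                self.used -= self.queue.pop(0)
                dropped += 1
        return dropped


if __name__ == "__main__":
    pool = FragmentPool(high_thresh=4000, low_thresh=2000)
    print([pool.add(1500) for _ in range(3)], pool.used)
```

With a high mark of 4000 and a low mark of 2000, the third 1500-byte fragment pushes usage over the high mark and the model drops entries until usage is at or below 2000.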
/proc/sys/net/ipv4/ip_nonlocal_bind

    Set this if you want your applications to be able to bind to an
address which doesn't belong to a device on your system. This can be
useful when your machine is on a non-permanent (or even dynamic) link,
so your services are able to start up and bind to a specific address
when your link is down.
/proc/sys/net/ipv4/ipfrag_low_thresh

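    Lower memory threshold for reassembling IP fragments. Once
ipfrag_high_thresh has been reached, fragments are tossed until memory
use falls back to this value.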
/proc/sys/net/ipv4/ipfrag_time

    Time in seconds to keep an IP fragment in memory.
/proc/sys/net/ipv4/tcp_abort_on_overflow

    A boolean flag controlling the behaviour under lots of incoming
connections. When enabled, this causes the kernel to actively send RST
packets when a service is overloaded.
/proc/sys/net/ipv4/tcp_fin_timeout

    Time to hold a socket in state FIN-WAIT-2, if it was closed by our
side. The peer can be broken and never close its side, or even die
unexpectedly. Default value is 60 seconds. The usual value used in 2.2
was 180 seconds; you may restore it, but remember that even if your
machine is a lightly loaded web server, you risk overflowing memory with
kilotons of dead sockets. FIN-WAIT-2 sockets are less dangerous than
FIN-WAIT-1, because they eat at most 1.5K of memory, but they tend to
live longer. Cf. tcp_max_orphans.
/proc/sys/net/ipv4/tcp_keepalive_time

    How often TCP sends out keepalive messages when keepalive is
enabled. Default: 2 hours.
/proc/sys/net/ipv4/tcp_keepalive_intvl

    How frequently probes are retransmitted when a probe isn't
acknowledged. Default: 75 seconds.
/proc/sys/net/ipv4/tcp_keepalive_probes

    How many keepalive probes TCP will send, until it decides that the
connection is broken. Default value: 9. Multiplied with
tcp_keepalive_intvl, this gives the time a link can be non-responsive
after a keepalive has been sent.
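
The multiplication mentioned above, worked through in Python with the defaults quoted in these three entries (2 hours, 75 seconds, 9 probes). The function names are invented for this sketch, and the second function simply adds the idle time on top, which is my own reading of how the three values combine.

```python
def unresponsive_seconds(probes=9, intvl=75):
    """Time a link can stay silent after the first keepalive probe."""
    return probes * intvl


def detection_seconds(idle=2 * 60 * 60, probes=9, intvl=75):
    """Idle time before the first probe plus the probing period."""
    return idle + unresponsive_seconds(probes, intvl)


if __name__ == "__main__":
    print(unresponsive_seconds())  # 675 seconds, i.e. 11.25 minutes
    print(detection_seconds())     # 7875 seconds
```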
/proc/sys/net/ipv4/tcp_max_orphans

    Maximal number of TCP sockets not attached to any user file handle,
held by the system. If this number is exceeded, orphaned connections are
reset immediately and a warning is printed. This limit exists only to
prevent simple DoS attacks. You _must_ not rely on this or lower the
limit artificially, but rather increase it (probably after increasing
installed memory) if network conditions require more than the default
value, and tune network services to linger and kill such states more
aggressively. Let me remind you again: each orphan eats up to 64K of
unswappable memory.
/proc/sys/net/ipv4/tcp_orphan_retries

    How many times to retry before killing a TCP connection closed by our
side. Default value 7 corresponds to 50sec-16min depending on RTO. If
your machine is a loaded web server, you should think about lowering
this value, as such sockets may consume significant resources. Cf.
tcp_max_orphans.
/proc/sys/net/ipv4/tcp_max_syn_backlog

    Maximal number of remembered connection requests which have not yet
received an acknowledgment from the connecting client. Default value is
1024 for systems with more than 128 MB of memory, and 128 for low memory
machines. If the server suffers from overload, try increasing this number.
Warning! If you make it greater than 1024, it would be better to change
TCP_SYNQ_HSIZE in include/net/tcp.h to keep
TCP_SYNQ_HSIZE*16<=tcp_max_syn_backlog and to recompile the kernel.
/proc/sys/net/ipv4/tcp_max_tw_buckets

    Maximal number of time-wait sockets held by the system simultaneously.
If this number is exceeded, the time-wait socket is immediately destroyed
and a warning is printed. This limit exists only to prevent simple DoS
attacks. You _must_ not lower the limit artificially, but rather
increase it (probably after increasing installed memory) if network
conditions require more than the default value.
/proc/sys/net/ipv4/tcp_retrans_collapse

    Bug-to-bug compatibility with some broken printers. On retransmit
try to send bigger packets to work around bugs in certain TCP stacks.
/proc/sys/net/ipv4/tcp_retries1

    How many times to retry before deciding that something is wrong and
that it is necessary to report this suspicion to the network layer. The
minimal RFC value is 3, which is the default and corresponds to
3sec-8min depending on RTO.
/proc/sys/net/ipv4/tcp_retries2

    How many times to retry before killing a live TCP connection. RFC 1122
says that the limit should be longer than 100 sec, which is too small a
number. Default value 15 corresponds to 13-30min depending on RTO.
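
Where do figures like "3sec" and "13-30min" come from? A rough Python sketch of exponential backoff gives the lower ends. The assumptions are mine and not from the text: the retransmission timeout starts at 0.2 seconds, doubles after every attempt, is capped at 120 seconds, and the wait after the last retransmission is counted too. The function name is made up for this example.

```python
def retry_timeout_seconds(retries, initial_rto=0.2, max_rto=120.0):
    """Total wait for `retries` retransmissions with doubling, capped RTO.

    Assumed model: one wait per retransmission plus the wait after the
    last one, i.e. retries + 1 intervals in total.
    """
    total = 0.0
    rto = initial_rto
    for _ in range(retries + 1):
        total += min(rto, max_rto)
        rto *= 2
    return total


if __name__ == "__main__":
    for retries in (3, 7, 15):
        seconds = retry_timeout_seconds(retries)
        print(retries, round(seconds, 1), "sec =", round(seconds / 60, 1), "min")
```

Under these assumptions 3 retries come to about 3 seconds and 15 retries to roughly 15 minutes, which sits inside the ranges quoted above; a larger starting RTO pushes the totals towards the upper ends.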
/proc/sys/net/ipv4/tcp_rfc1337

    This boolean enables a fix for 'time-wait assassination hazards in
tcp', described in RFC 1337. If enabled, this causes the kernel to drop
RST packets for sockets in the time-wait state. Default: 0
/proc/sys/net/ipv4/tcp_sack

    Use Selective ACK which can be used to signify that specific packets
are missing - therefore helping fast recovery.
/proc/sys/net/ipv4/tcp_stdurg

    Use the Host requirements interpretation of the TCP urg pointer
field. Most hosts use the older BSD interpretation, so if you turn this
on Linux might not communicate correctly with them. Default: FALSE 
/proc/sys/net/ipv4/tcp_syn_retries

    Number of SYN packets the kernel will send before giving up on the
new connection.
/proc/sys/net/ipv4/tcp_synack_retries

    To open the other side of the connection, the kernel sends a SYN
with a piggybacked ACK on it, to acknowledge the earlier received SYN.
This is part 2 of the threeway handshake. This setting determines the
number of SYN+ACK packets sent before the kernel gives up on the
connection.
/proc/sys/net/ipv4/tcp_timestamps

    Timestamps are used, amongst other things, to protect against
wrapping sequence numbers. A 1 gigabit link might conceivably
re-encounter a previous sequence number with an out-of-line value,
because it was of a previous generation. The timestamp will let it
recognize this 'ancient packet'.
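
How quickly can sequence numbers wrap? A small Python calculation, assuming only that TCP sequence numbers are 32 bits wide and count bytes, and that the link runs flat out. The function name is invented for this sketch.

```python
SEQUENCE_SPACE = 2 ** 32  # TCP sequence numbers are 32 bits wide and count bytes


def wrap_seconds(bits_per_second):
    """Seconds needed to send 2**32 bytes at the given line rate."""
    return SEQUENCE_SPACE * 8 / bits_per_second


if __name__ == "__main__":
    for name, rate in (("100 Mbit", 100_000_000), ("1 Gbit", 1_000_000_000)):
        print(name, round(wrap_seconds(rate), 1), "seconds")
```

At 1 gigabit per second the whole sequence space goes by in a bit over half a minute, so an old packet turning up late is not that far-fetched.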
/proc/sys/net/ipv4/tcp_tw_recycle

    Enable fast recycling of TIME-WAIT sockets. Default value is 0. It
should not be changed without advice/request of technical experts.
/proc/sys/net/ipv4/tcp_window_scaling

    TCP/IP normally allows windows up to 65535 bytes big. For really
fast networks, this may not be enough. The window scaling option allows
for almost gigabyte windows, which is good for high bandwidth*delay
products.
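
The "almost gigabyte" figure and the bandwidth*delay product can be checked with a little Python. The sketch assumes the maximum window scale shift of 14 from RFC 1323; the function names and the example link (1 Gbit/s with a 100 ms round trip) are made up for illustration.

```python
MAX_UNSCALED_WINDOW = 65535  # largest window without scaling, from the text
MAX_SHIFT = 14               # largest window scale shift (RFC 1323)


def scaled_window(shift, window=MAX_UNSCALED_WINDOW):
    """Window size in bytes after applying the window scale shift."""
    return window << shift


def bandwidth_delay_product(bits_per_second, rtt_ms):
    """Bytes that must be in flight to keep the link full."""
    return bits_per_second * rtt_ms // 8000


if __name__ == "__main__":
    print(scaled_window(MAX_SHIFT))                       # 1073725440 bytes
    print(bandwidth_delay_product(1_000_000_000, 100))    # 12500000 bytes
```

On that example link about 12.5 million bytes need to be in flight, far more than an unscaled 65535-byte window can cover.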

13.2.2. Per device settings

DEV can either stand for a real interface, or for 'all' or 'default'.
Default also changes settings for interfaces yet to be created.
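
A short Python sketch of how DEV slots into the path. The helper names are invented for this example, 'eth0' is just a placeholder interface name, and the root argument exists only so the helpers can be pointed at a scratch directory instead of the real /proc.

```python
import posixpath

CONF_ROOT = "/proc/sys/net/ipv4/conf"


def conf_path(dev, setting, root=CONF_ROOT):
    """Build the path for a per device setting.

    `dev` is a real interface name, or 'all', or 'default'.
    """
    for part in (dev, setting):
        if not part or "/" in part or part in (".", ".."):
            raise ValueError("bad path component: %r" % (part,))
    return posixpath.join(root, dev, setting)


def read_setting(dev, setting, root=CONF_ROOT):
    """Return the current value of a per device setting as a string."""
    with open(conf_path(dev, setting, root)) as handle:
        return handle.read().strip()


if __name__ == "__main__":
    for dev in ("eth0", "all", "default"):
        print(conf_path(dev, "rp_filter"))
```

Writing works the same way with the file opened for writing, but needs root privileges.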

/proc/sys/net/ipv4/conf/DEV/accept_redirects

    If a router decides that you are using it for a wrong purpose (i.e.
it needs to resend your packet on the same interface), it will send you
an ICMP Redirect. This is a slight security risk however, so you may want
to turn it off, or use secure redirects.
/proc/sys/net/ipv4/conf/DEV/accept_source_route

    Not used very much anymore. You used to be able to give a packet a
list of IP addresses it should visit on its way. Linux can be made to
honor this IP option.
/proc/sys/net/ipv4/conf/DEV/bootp_relay

    Accept packets with source address 0.b.c.d with destinations not to
this host as local ones. It is supposed that a BOOTP relay daemon will
catch and forward such packets.

    The default is 0, since this feature is not implemented yet (kernel
version 2.2.12).
/proc/sys/net/ipv4/conf/DEV/forwarding

    Enable or disable IP forwarding on this interface.
/proc/sys/net/ipv4/conf/DEV/log_martians

    See the section on Reverse Path Filtering.
/proc/sys/net/ipv4/conf/DEV/mc_forwarding

    Whether we do multicast forwarding on this interface.
/proc/sys/net/ipv4/conf/DEV/proxy_arp

    If you set this to 1, this interface will respond to ARP requests
for addresses the kernel has routes to. Can be very useful when building
'ip pseudo bridges'. Do take care that your netmasks are very correct
before enabling this! Also be aware that the rp_filter, mentioned
elsewhere, also operates on ARP queries!
/proc/sys/net/ipv4/conf/DEV/rp_filter

    See the section on Reverse Path Filtering.
/proc/sys/net/ipv4/conf/DEV/secure_redirects

    Accept ICMP redirect messages only for gateways, listed in default
gateway list. Enabled by default.
/proc/sys/net/ipv4/conf/DEV/send_redirects

    Whether we send the above-mentioned redirects.
/proc/sys/net/ipv4/conf/DEV/shared_media

    If it is not set the kernel does not assume that different subnets
on this device can communicate directly. Default setting is 'yes'.
/proc/sys/net/ipv4/conf/DEV/tag

    FIXME: fill this in

13.2.3. Neighbor policy

DEV can either stand for a real interface, or for 'all' or 'default'.
Default also changes settings for interfaces yet to be created.

/proc/sys/net/ipv4/neigh/DEV/anycast_delay

    Maximum for random delay of answers to neighbor solicitation
messages in jiffies (1/100 sec). Not yet implemented (Linux does not
have anycast support yet).
/proc/sys/net/ipv4/neigh/DEV/app_solicit

    Determines the number of requests to send to the user level ARP
daemon. Use 0 to turn off.
/proc/sys/net/ipv4/neigh/DEV/base_reachable_time

    A base value used for computing the random reachable time value as
specified in RFC2461.
/proc/sys/net/ipv4/neigh/DEV/delay_first_probe_time

    Delay for the first time probe if the neighbor is reachable. (see
gc_stale_time)
/proc/sys/net/ipv4/neigh/DEV/gc_stale_time

    Determines how often to check for stale ARP entries. After an ARP
entry is stale it will be resolved again (which is useful when an IP
address migrates to another machine). When ucast_solicit is greater than
0 it first tries to send an ARP packet directly to the known host. When
that fails and mcast_solicit is greater than 0, an ARP request is
broadcast.
/proc/sys/net/ipv4/neigh/DEV/locktime

    An ARP/neighbor entry is only replaced with a new one if the old is
at least locktime old. This prevents ARP cache thrashing.
/proc/sys/net/ipv4/neigh/DEV/mcast_solicit

    Maximum number of retries for multicast solicitation.
/proc/sys/net/ipv4/neigh/DEV/proxy_delay

    Maximum time (real time is random [0..proxytime]) before answering
an ARP request for which we have a proxy ARP entry. In some cases,
this is used to prevent network flooding.
/proc/sys/net/ipv4/neigh/DEV/proxy_qlen

    Maximum queue length of the delayed proxy arp timer. (see
proxy_delay).
/proc/sys/net/ipv4/neigh/DEV/retrans_time

    The time, expressed in jiffies (1/100 sec), between retransmitted
Neighbor Solicitation messages. Used for address resolution and to
determine if a neighbor is unreachable.
/proc/sys/net/ipv4/neigh/DEV/ucast_solicit

    Maximum number of retries for unicast solicitation.
/proc/sys/net/ipv4/neigh/DEV/unres_qlen

    Maximum queue length for a pending ARP request - the number of
packets which are accepted from other layers while the ARP address is
still being resolved.

13.2.4. Routing settings

/proc/sys/net/ipv4/route/error_burst and
/proc/sys/net/ipv4/route/error_cost

    These parameters are used to limit the warning messages written to
the kernel log from the routing code. The higher the error_cost factor
is, the fewer messages will be written. Error_burst controls when
messages will be dropped. The default settings limit warning messages to
one every five seconds.
/proc/sys/net/ipv4/route/flush

    Writing to this file results in a flush of the routing cache.
/proc/sys/net/ipv4/route/gc_elasticity

    Values to control the frequency and behavior of the garbage
collection algorithm for the routing cache. This can be important
when doing failover. At least gc_timeout seconds will elapse before
Linux will skip to another route because the previous one has died. By
default set to 300, you may want to lower it if you want to have a
speedy failover.

    Also see this post by Ard van Breemen.
/proc/sys/net/ipv4/route/gc_interval

    See /proc/sys/net/ipv4/route/gc_elasticity.
/proc/sys/net/ipv4/route/gc_min_interval

    See /proc/sys/net/ipv4/route/gc_elasticity.
/proc/sys/net/ipv4/route/gc_thresh

    See /proc/sys/net/ipv4/route/gc_elasticity.
/proc/sys/net/ipv4/route/gc_timeout

    See /proc/sys/net/ipv4/route/gc_elasticity.
/proc/sys/net/ipv4/route/max_delay

    Maximum delay for flushing the routing cache.
/proc/sys/net/ipv4/route/max_size

    Maximum size of the routing cache. Old entries will be purged once
the cache has reached this size.
/proc/sys/net/ipv4/route/min_adv_mss

    FIXME: fill this in
/proc/sys/net/ipv4/route/min_delay

    Minimum delay for flushing the routing cache.
/proc/sys/net/ipv4/route/min_pmtu

    FIXME: fill this in
/proc/sys/net/ipv4/route/mtu_expires

    FIXME: fill this in
/proc/sys/net/ipv4/route/redirect_load

    Factors which determine if more ICMP redirects should be sent to a
specific host. No redirects will be sent once the load limit or the
maximum number of redirects has been reached.
/proc/sys/net/ipv4/route/redirect_number

    See /proc/sys/net/ipv4/route/redirect_load.
/proc/sys/net/ipv4/route/redirect_silence

    Timeout for redirects. After this period redirects will be sent
again, even if this has been stopped, because the load or number limit
has been reached.

-- 
speech recognition software was not used in the composition of this e-mail
Jeff Kinz, Emergent Research, Hudson, MA.
¡Ya no mas!



