- Feb 22, 2016
Robert Shearman authored
The lwt implementations using net devices can autoload using the existing mechanism based on IFLA_INFO_KIND. However, there is no equivalent mechanism for lwt modules that don't use a net device. Therefore, add the ability to autoload modules registering lwt operations for lwt implementations not using a net device, so that users don't have to load the modules manually. Only users with the CAP_NET_ADMIN capability can cause modules to be loaded, which is ensured by rtnetlink_rcv_msg rejecting non-RTM_GETxxx messages for users without this capability, and by lwtunnel_build_state not being called in response to RTM_GETxxx messages.
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- Feb 18, 2016
Benjamin Poirier authored
This follows up commit 45f6fad8 ("ipv6: add complete rcu protection around np->opt"), which added mixed rcu/refcount protection to np->opt. Given the current implementation of rcu_pointer_handoff(), this has no effect at runtime.
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Benc authored
Part of skb_scrub_packet was open coded in iptunnel_pull_header. Let it call skb_scrub_packet directly instead.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Benc authored
The tun_id field in struct ip_tunnel_key is __be64, not __be32. We need to convert the vni to tun_id correctly.

Fixes: 54bfd872 ("vxlan: keep flags and vni in network byte order")
Reported-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Westphal authored
This reverts commit bb9b18fb ("genl: Add genlmsg_new_unicast() for unicast message allocation"). Nothing is wrong with it; it's just no longer needed, since it existed only for mmapped netlink support.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Benc authored
Prevent repeated conversions from and to network order in the fast path. To achieve this, define all flag constants in big endian order and store the VNI as __be32. To prevent confusion between the actual VNI value and the VNI field from the header (which contains an additional reserved byte), strictly distinguish between "vni" and "vni_field".
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Benc authored
Currently, a pointer to the vxlan header is kept in a local variable. It has to be reloaded whenever pskb pull operations are performed, which usually happens somewhere deep in called functions. Create a vxlan_hdr function and use it to reference the vxlan header instead.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
John Fastabend authored
By packing the structure we can remove a few holes, as Jamal suggests.

before:

    struct tc_cls_u32_knode {
            struct tcf_exts *       exts;                 /*     0     8 */
            u8                      fshift;               /*     8     1 */

            /* XXX 3 bytes hole, try to pack */

            u32                     handle;               /*    12     4 */
            u32                     val;                  /*    16     4 */
            u32                     mask;                 /*    20     4 */
            u32                     link_handle;          /*    24     4 */

            /* XXX 4 bytes hole, try to pack */

            struct tc_u32_sel *     sel;                  /*    32     8 */

            /* size: 40, cachelines: 1, members: 7 */
            /* sum members: 33, holes: 2, sum holes: 7 */
            /* last cacheline: 40 bytes */
    };

after:

    struct tc_cls_u32_knode {
            struct tcf_exts *       exts;                 /*     0     8 */
            struct tc_u32_sel *     sel;                  /*     8     8 */
            u32                     handle;               /*    16     4 */
            u32                     val;                  /*    20     4 */
            u32                     mask;                 /*    24     4 */
            u32                     link_handle;          /*    28     4 */
            u8                      fshift;               /*    32     1 */

            /* size: 40, cachelines: 1, members: 7 */
            /* padding: 7 */
            /* last cacheline: 40 bytes */
    };

Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- Feb 17, 2016
Xin Long authored
Since commit 8b570dc9 ("sctp: only drop the reference on the datamsg after sending a msg") switched sctp_sendmsg to sctp_datamsg_put instead of sctp_datamsg_free, this function has no users left in sctp, so remove it.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
John Fastabend authored
This is a helper function drivers can use to learn if the action type is a drop action.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
John Fastabend authored
This patch allows netdev drivers to consume cls_u32 offloads via the ndo_setup_tc ndo op. This aligns with how network drivers have been doing qdisc offloads for mqprio.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
In the case of UDP traffic with datagram length below the MTU, this gives about a 2% performance increase when tunneling over ipv4 and about 60% when tunneling over ipv6.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
In the case of UDP traffic with datagram length below the MTU, this gives about a 3% performance increase when tunneling over ipv4 and about 70% when tunneling over ipv6.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
The current ip_tunnel cache implementation is prone to a race that will cause the wrong dst to be cached on a concurrent dst cache miss and ip tunnel update via netlink. Replacing it with the generic implementation fixes the issue.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
This also fixes a potential race in the existing tunnel code, which could lead to the wrong dst being permanently cached:

    CPU1:                                CPU2:
      <xmit on ip6_tunnel>
      <cache lookup fails>
      dst = ip6_route_output(...)
                                         <tunnel params are changed via nl>
                                         dst_cache_reset() // no effect,
                                                           // the cache is empty
      dst_cache_set() // the wrong dst
                      // is permanently stored
                      // into the cache

With the new dst cache implementation the above race is not possible, since the first cache lookup after dst_cache_reset will fail due to the timestamp check.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
This patch adds a generic, lockless dst cache implementation. The need for a lock is avoided by updating the dst cache fields only in per-CPU scope, and by requiring that the cache manipulation functions be invoked with local bh disabled. The refresh_ts and reset_ts fields are used to ensure cache consistency in case of a concurrent cache update (dst_cache_set*) and reset operation (dst_cache_reset). Consider the following scenario:

    CPU1:                                      CPU2:
      <cache lookup with empty cache: it fails>
      <get dst via uncached route lookup>
                                               <related configuration changes>
                                               dst_cache_reset()
      dst_cache_set()

The dst entry passed to dst_cache_set() should not be used for later dst cache lookups, because it was obtained using old configuration values. Since refresh_ts is updated only on dst_cache lookup, the cached value in the above scenario will be discarded on the next lookup.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- Feb 12, 2016
Edward Cree authored
All users now pass false, so we can remove it, and remove the code that was conditional upon it.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
The only protocol affected at present is Geneve.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- Feb 11, 2016
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
This was initially introduced in df2cf4a7 ("IGMP: Inhibit reports for local multicast groups") by defining the sysctl in the ipv4_net_table array; however, it was never made namespace aware. Fix this by changing the code accordingly.
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Craig Gallek authored
This change extends the fast SO_REUSEPORT socket lookup implemented for UDP to TCP. Listener sockets with SO_REUSEPORT and the same receive address are additionally added to an array for faster random access. This means that only a single socket from the group must be found in the listener list before any socket in the group can be used to receive a packet. Previously, every socket in the group needed to be considered before handing off the incoming packet. This feature also exposes the ability to use a BPF program when selecting a socket from a reuseport group.
Signed-off-by: Craig Gallek <kraig@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Craig Gallek authored
This is a preliminary step to allow fast socket lookup of SO_REUSEPORT groups. Doing so with a BPF filter will require access to the skb in question. This change plumbs the skb (and offset to payload data) through the call stack to the listening socket lookup implementations where it will be used in a following patch.
Signed-off-by: Craig Gallek <kraig@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Craig Gallek authored
In order to support fast lookups for TCP sockets with SO_REUSEPORT, the function that adds sockets to the listening hash set needs to be able to check receive address equality. Since this equality check is different for IPv4 and IPv6, we will need two different socket hashing functions. This patch adds inet6_hash, identical to the existing inet_hash function, and updates the appropriate references. A following patch will differentiate the two by passing different comparison functions to __inet_hash. Additionally, in order to use the IPv6 address equality function from inet6_hashtables (which is compiled as a built-in object when IPv6 is enabled), it also needs to be in a built-in object file as well. This moves ipv6_rcv_saddr_equal into inet_hashtables to accomplish this.
Signed-off-by: Craig Gallek <kraig@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Craig Gallek authored
In order to support fast reuseport lookups in TCP, the hash function defined in struct proto must be capable of returning an error code. This patch changes the function signature of all related hash functions to return an integer and handles or propagates this return value at all call sites.
Signed-off-by: Craig Gallek <kraig@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- Feb 09, 2016
Nikolay Aleksandrov authored
Currently bonding allows setting ad_actor_system and prio while the bond device is down, but these are actually applied only if there aren't any slaves yet (applied to the bond device when the first slave shows up, and to slaves at 3ad bind time). After this patch, changes are applied immediately and the new values can be used/seen after the bond is brought up, so it's no longer necessary to release all slaves and enslave them again to see the changes.
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- Feb 07, 2016
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-