- Feb 22, 2016
Robert Shearman authored
The lwt implementations using net devices can autoload using the existing mechanism based on IFLA_INFO_KIND. However, there is no equivalent mechanism for lwt modules that don't use a net device. Therefore, add the ability to autoload modules registering lwt operations for lwt implementations not using a net device, so that users don't have to load the modules manually. Only users with the CAP_NET_ADMIN capability can cause modules to be loaded, which is ensured by rtnetlink_rcv_msg rejecting non-RTM_GETxxx messages for users without this capability, and by lwtunnel_build_state not being called in response to RTM_GETxxx messages.
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- Feb 18, 2016
Benjamin Poirier authored
This follows up commit 45f6fad8 ("ipv6: add complete rcu protection around np->opt"), which added mixed rcu/refcount protection to np->opt. Given the current implementation of rcu_pointer_handoff(), this has no effect at runtime.
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Benc authored
Part of skb_scrub_packet was open coded in iptunnel_pull_header. Let it call skb_scrub_packet directly instead.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Benc authored
The tun_id field in struct ip_tunnel_key is __be64, not __be32. We need to convert the vni to tun_id correctly.

Fixes: 54bfd872 ("vxlan: keep flags and vni in network byte order")
Reported-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Westphal authored
This reverts commit bb9b18fb ("genl: Add genlmsg_new_unicast() for unicast message allocation"). Nothing is wrong with it; it's just no longer needed, since it existed only for mmapped netlink support.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Benc authored
Prevent repeated conversions from and to network order in the fast path. To achieve this, define all flag constants in big endian order and store the VNI as __be32. To prevent confusion between the actual VNI value and the VNI field from the header (which contains an additional reserved byte), strictly distinguish between "vni" and "vni_field".
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Benc authored
Currently, a pointer to the vxlan header is kept in a local variable. It has to be reloaded whenever pskb pull operations are performed, which usually happens somewhere deep in called functions. Create a vxlan_hdr function and use it to reference the vxlan header instead.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
John Fastabend authored
By packing the structure we can remove a few holes, as Jamal suggests.

before:

    struct tc_cls_u32_knode {
            struct tcf_exts *       exts;                 /*     0     8 */
            u8                      fshift;               /*     8     1 */

            /* XXX 3 bytes hole, try to pack */

            u32                     handle;               /*    12     4 */
            u32                     val;                  /*    16     4 */
            u32                     mask;                 /*    20     4 */
            u32                     link_handle;          /*    24     4 */

            /* XXX 4 bytes hole, try to pack */

            struct tc_u32_sel *     sel;                  /*    32     8 */

            /* size: 40, cachelines: 1, members: 7 */
            /* sum members: 33, holes: 2, sum holes: 7 */
            /* last cacheline: 40 bytes */
    };

after:

    struct tc_cls_u32_knode {
            struct tcf_exts *       exts;                 /*     0     8 */
            struct tc_u32_sel *     sel;                  /*     8     8 */
            u32                     handle;               /*    16     4 */
            u32                     val;                  /*    20     4 */
            u32                     mask;                 /*    24     4 */
            u32                     link_handle;          /*    28     4 */
            u8                      fshift;               /*    32     1 */

            /* size: 40, cachelines: 1, members: 7 */
            /* padding: 7 */
            /* last cacheline: 40 bytes */
    };

Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- Feb 17, 2016
Xin Long authored
Since commit 8b570dc9 ("sctp: only drop the reference on the datamsg after sending a msg") switched sctp_sendmsg to sctp_datamsg_put instead of sctp_datamsg_free, this function has no users left in sctp, so remove it.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
John Fastabend authored
This is a helper function drivers can use to learn if the action type is a drop action.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
John Fastabend authored
This patch allows netdev drivers to consume cls_u32 offloads via the ndo_setup_tc ndo op. This aligns with how network drivers have been doing qdisc offloads for mqprio.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
In the case of UDP traffic with datagram length below the MTU, this gives about a 2% performance increase when tunneling over ipv4 and about 60% when tunneling over ipv6.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
In the case of UDP traffic with datagram length below the MTU, this gives about a 3% performance increase when tunneling over ipv4 and about 70% when tunneling over ipv6.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
The current ip_tunnel cache implementation is prone to a race that will cause the wrong dst to be cached on a concurrent dst cache miss and ip tunnel update via netlink. Replacing it with the generic implementation fixes the issue.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
This also fixes a potential race in the existing tunnel code, which could lead to the wrong dst being permanently cached:

    CPU1:                                CPU2:
      <xmit on ip6_tunnel>
      <cache lookup fails>
      dst = ip6_route_output(...)
                                         <tunnel params are changed via nl>
                                         dst_cache_reset() // no effect,
                                                           // the cache is empty
      dst_cache_set() // the wrong dst
                      // is permanently stored
                      // into the cache

With the new dst cache implementation the above race is not possible, since the first cache lookup after dst_cache_reset will fail due to the timestamp check.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
This patch adds a generic, lockless dst cache implementation. The need for a lock is avoided by updating the dst cache fields only in per-CPU scope, and by requiring that the cache manipulation functions be invoked with local bh disabled. The refresh_ts and reset_ts fields are used to ensure cache consistency in case of a concurrent cache update (dst_cache_set*) and reset operation (dst_cache_reset). Consider the following scenario:

    CPU1:                                      CPU2:
      <cache lookup with empty cache: it fails>
      <get dst via uncached route lookup>
                                               <related configuration changes>
                                               dst_cache_reset()
      dst_cache_set()

The dst entry passed to dst_cache_set() should not be used for later dst cache lookups, because it was obtained using old configuration values. Since refresh_ts is updated only on dst_cache lookup, the cached value in the above scenario will be discarded on the next lookup.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- Feb 12, 2016
Edward Cree authored
All users now pass false, so we can remove it, and remove the code that was conditional upon it.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
The only protocol affected at present is Geneve.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- Feb 11, 2016
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
This was initially introduced in df2cf4a7 ("IGMP: Inhibit reports for local multicast groups") by defining the sysctl in the ipv4_net_table array; however, it was never made namespace aware. Fix this by changing the code accordingly.
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Craig Gallek authored
This change extends the fast SO_REUSEPORT socket lookup implemented for UDP to TCP. Listener sockets with SO_REUSEPORT and the same receive address are additionally added to an array for faster random access. This means that only a single socket from the group must be found in the listener list before any socket in the group can be used to receive a packet. Previously, every socket in the group needed to be considered before handing off the incoming packet. This feature also exposes the ability to use a BPF program when selecting a socket from a reuseport group.
Signed-off-by: Craig Gallek <kraig@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Craig Gallek authored
This is a preliminary step to allow fast socket lookup of SO_REUSEPORT groups. Doing so with a BPF filter will require access to the skb in question. This change plumbs the skb (and offset to payload data) through the call stack to the listening socket lookup implementations where it will be used in a following patch.
Signed-off-by: Craig Gallek <kraig@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Craig Gallek authored
In order to support fast lookups for TCP sockets with SO_REUSEPORT, the function that adds sockets to the listening hash set needs to be able to check receive address equality. Since this equality check is different for IPv4 and IPv6, we will need two different socket hashing functions. This patch adds inet6_hash, identical to the existing inet_hash function, and updates the appropriate references. A following patch will differentiate the two by passing different comparison functions to __inet_hash. Additionally, in order to use the IPv6 address equality function from inet6_hashtables (which is compiled as a built-in object when IPv6 is enabled), it also needs to be in a built-in object file as well. This moves ipv6_rcv_saddr_equal into inet_hashtables to accomplish this.
Signed-off-by: Craig Gallek <kraig@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Craig Gallek authored
In order to support fast reuseport lookups in TCP, the hash function defined in struct proto must be capable of returning an error code. This patch changes the function signature of all related hash functions to return an integer and handles or propagates this return value at all call sites.
Signed-off-by: Craig Gallek <kraig@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- Feb 09, 2016
Nikolay Aleksandrov authored
Currently bonding allows setting ad_actor_system and prio while the bond device is down, but these are actually applied only if there aren't any slaves yet (applied to the bond device when the first slave shows up, and to slaves at 3ad bind time). After this patch, changes are applied immediately and the new values can be used/seen after the bond is brought up, so it's no longer necessary to release all slaves and enslave them again to see the changes.
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- Feb 07, 2016
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Borisov authored
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-