mirror of
https://github.com/Fishwaldo/linux-bl808.git
synced 2025-06-17 20:25:19 +00:00
bpf: Support socket migration by eBPF.
This patch introduces a new bpf_attach_type for BPF_PROG_TYPE_SK_REUSEPORT
to check if the attached eBPF program is capable of migrating sockets. When
the eBPF program is attached, we run it for socket migration if the
expected_attach_type is BPF_SK_REUSEPORT_SELECT_OR_MIGRATE or
net.ipv4.tcp_migrate_req is enabled.
Currently, the expected_attach_type is not enforced for the
BPF_PROG_TYPE_SK_REUSEPORT type of program. Thus, this commit follows the
earlier idea in the commit aac3fc320d
("bpf: Post-hooks for sys_bind") to
fix up the zero expected_attach_type in bpf_prog_load_fixup_attach_type().
Moreover, this patch adds a new field (migrating_sk) to sk_reuseport_md to
select a new listener based on the child socket. migrating_sk varies
depending on if it is migrating a request in the accept queue or during
3WHS.
- accept_queue : sock (ESTABLISHED/SYN_RECV)
- 3WHS : request_sock (NEW_SYN_RECV)
In the eBPF program, we can select a new listener by
BPF_FUNC_sk_select_reuseport(). Also, we can cancel migration by returning
SK_DROP. This feature is useful when listeners have different settings at
the socket API level or when we want to free resources as soon as possible.
- SK_PASS with selected_sk, select it as a new listener
- SK_PASS with selected_sk NULL, fallbacks to the random selection
- SK_DROP, cancel the migration.
There is a noteworthy point. We select a listening socket in three places,
but we do not have struct skb at closing a listener or retransmitting a
SYN+ACK. On the other hand, some helper functions do not expect skb is NULL
(e.g. skb_header_pointer() in BPF_FUNC_skb_load_bytes(), skb_tail_pointer()
in BPF_FUNC_skb_load_bytes_relative()). So we allocate an empty skb
temporarily before running the eBPF program.
Suggested-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/netdev/20201123003828.xjpjdtk4ygl6tg6h@kafai-mbp.dhcp.thefacebook.com/
Link: https://lore.kernel.org/netdev/20201203042402.6cskdlit5f3mw4ru@kafai-mbp.dhcp.thefacebook.com/
Link: https://lore.kernel.org/netdev/20201209030903.hhow5r53l6fmozjn@kafai-mbp.dhcp.thefacebook.com/
Link: https://lore.kernel.org/bpf/20210612123224.12525-10-kuniyu@amazon.co.jp
This commit is contained in:
parent
e061047684
commit
d5e4ddaeb6
7 changed files with 88 additions and 5 deletions
|
@ -994,6 +994,8 @@ enum bpf_attach_type {
|
|||
BPF_SK_LOOKUP,
|
||||
BPF_XDP,
|
||||
BPF_SK_SKB_VERDICT,
|
||||
BPF_SK_REUSEPORT_SELECT,
|
||||
BPF_SK_REUSEPORT_SELECT_OR_MIGRATE,
|
||||
__MAX_BPF_ATTACH_TYPE
|
||||
};
|
||||
|
||||
|
@ -5416,7 +5418,20 @@ struct sk_reuseport_md {
|
|||
__u32 ip_protocol; /* IP protocol. e.g. IPPROTO_TCP, IPPROTO_UDP */
|
||||
__u32 bind_inany; /* Is sock bound to an INANY address? */
|
||||
__u32 hash; /* A hash of the packet 4 tuples */
|
||||
/* When reuse->migrating_sk is NULL, it is selecting a sk for the
|
||||
* new incoming connection request (e.g. selecting a listen sk for
|
||||
* the received SYN in the TCP case). reuse->sk is one of the sk
|
||||
* in the reuseport group. The bpf prog can use reuse->sk to learn
|
||||
* the local listening ip/port without looking into the skb.
|
||||
*
|
||||
* When reuse->migrating_sk is not NULL, reuse->sk is closed and
|
||||
* reuse->migrating_sk is the socket that needs to be migrated
|
||||
* to another listening socket. migrating_sk could be a fullsock
|
||||
* sk that is fully established or a reqsk that is in-the-middle
|
||||
* of 3-way handshake.
|
||||
*/
|
||||
__bpf_md_ptr(struct bpf_sock *, sk);
|
||||
__bpf_md_ptr(struct bpf_sock *, migrating_sk);
|
||||
};
|
||||
|
||||
#define BPF_TAG_SIZE 8
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue