Exploring eBPF Probes: Observing TCP Connections with Kernel-Level Instrumentation
Last Update : 07 August, 2023 | Published : 17 April, 2023 | 23 Min ReadThe provided eBPF kernel-side program is a kprobe program
attached to the __x64_sys_tcp_connect
function, which is a system call handler for tcp_connect on x64 systems. Let’s go through the program step by step:
SEC("kprobe/__x64_sys_tcp_connect")
int kprobe__tcp_connect(struct pt_regs *ctx)
{
struct sock *sk = (struct sock *)PT_REGS_PARM1(ctx);
struct sock_common conn = READ_KERN(sk->__sk_common);
struct sockaddr_in sockv4;
struct sockaddr_in6 sockv6;
sys_context_t context = {};
args_t args = {};
u64 types = ARG_TYPE0(STR_T) | ARG_TYPE1(SOCKADDR_T);
init_context(&context);
context.argnum = get_arg_num(types);
context.retval = PT_REGS_RC(ctx);
if (context.retval >= 0 && drop_syscall(_NETWORK_PROBE))
{
return 0;
}
if (get_connection_info(&conn, &sockv4, &sockv6, &context, &args, _TCP_CONNECT) != 0)
{
return 0;
}
args.args[0] = (unsigned long)conn.skc_prot->name;
set_buffer_offset(DATA_BUF_TYPE, sizeof(sys_context_t));
bufs_t *bufs_p = get_buffer(DATA_BUF_TYPE);
if (bufs_p == NULL)
return 0;
save_context_to_buffer(bufs_p, (void *)&context);
save_args_to_buffer(types, &args);
events_perf_submit(ctx);
return 0;
}
The program starts with the definition of the kprobe function kprobe__tcp_connect
, which takes a pointer to struct pt_regs (ctx)
as an argument.
The program declares and initializes several variables:
struct sock *sk:
struct sock *sk = (struct sock *)PT_REGS_PARM1(ctx);
It represents the socket object obtained from the first parameter of the system call.It retrieves the socket object from the function arguments and stores it in the sk variable for further processing within the eBPF program.
-
PT_REGS_PARM1(ctx):
This macro is used to access the value of the second parameter (parm1) of the function, which is represented by the ctx pointer to struct pt_regs. -
(struct sock *):
The obtained value from PT_REGS_PARM1(ctx) is cast to a pointer of type struct sock *. This is done to interpret the value as a socket object. -
struct sock *sk:
This line declares a variable named sk of type struct sock * and assigns the obtained socket object to it.
struct sock_common conn:
struct sock_common conn = READ_KERN(sk->__sk_common);
- It is used to read the value of the
__sk_common
field from thesk
socket object and store it in astruct sock_common
variable namedconn
.
Let’s break down the code:
-
sk->__sk_common:
This accesses the__sk_common
field within the sk socket object. The__sk_common
field is a member of the sk structure and represents the common data shared by different socket types. -
READ_KERN(...):
TheREAD_KERN
macro is used to read kernel memory. In this case, it reads the value ofsk->__sk_common
from the kernel memory space. -
struct sock_common conn:
This line declares a variable named conn of typestruct sock_common
and assigns the value read fromsk->__sk_common
to it.
This allows subsequent access to the common socket information for further processing within the eBPF program.
struct sockaddr_in sockv4
andstruct sockaddr_in6 sockv6
struct sockaddr_in sockv4;
struct sockaddr_in6 sockv6;
- The lines
struct sockaddr_in sockv4
; andstruct sockaddr_in6 sockv6
; declare two variables,sockv4
andsockv6
, respectively. These variables are of types struct sockaddr_in and struct sockaddr_in6, which are used to represent socket addresses for IPv4 and IPv6 protocols.
Here’s a brief explanation of these structures:
-
struct sockaddr_in
:- This structure is defined in the <netinet/in.h> header file.
- It represents an
IPv4 socket address.
- It has the following members:
sin_family:
The address family, which is typically set to AF_INET for IPv4.sin_port:
The port number associated with the socket address.sin_addr:
The IPv4 address, stored as an in_addr structure.sin_zero:
Padding to ensure structure alignment.
-
struct sockaddr_in6
:- This structure is also defined in the <netinet/in.h> header file.
- It represents an
IPv6 socket address.
- It has the following members:
sin6_family:
The address family, usually set to AF_INET6 for IPv6.sin6_port:
The port number associated with the socket address.sin6_flowinfo:
IPv6 flow information.sin6_addr:
The IPv6 address, stored as an in6_addr structure.sin6_scope_id:
Scope ID for link-local addresses.sin6_padding:
Padding for future use.
In the given code, the variables sockv4 and sockv6 are declared to store socket addresses of type IPv4 and IPv6, respectively. These variables used in the get_connection_info function to populate the socket address information based on the connection details being processed by the eBPF program.
- Here’s a brief explanation of these lines:
sys_context_t context = {};
args_t args = {};
The lines sys_context_t context = {};
and args_t args = {};
declare and initialize variables context and args, respectively, with empty or zero-initialized values.
sys_context_t context = {};:
- This line declares a variable named
context
of typesys_context_t.
- The
sys_context_t
type is a user-defined structure or typedef representing the context or state of the system. - The {} initializer initializes the context variable, setting all its members to their default or zero values.
This line is commonly used to ensure that the context variable starts with default values before being populated or used further in the program.
args_t args = {};:
- This line declares a variable named
args
of typeargs_t.
- The
args_t
type is a user-defined structure or typedef representing arguments or parameters used in a certain context or function. - The {} initializer initializes the args variable, setting all its members to their default or zero values.
Similar to the previous line, this initialization ensures that the args variable starts with default values before being assigned or utilized in subsequent program logic. By initializing these variables to empty or zero values, it provides a clean and consistent starting point for the context and args structures, allowing them to be populated with specific data as required by the program’s logic.
- Here’s a brief explanation of these lines:
u64 types = ARG_TYPE0(STR_T) | ARG_TYPE1(SOCKADDR_T);
The line u64 types = ARG_TYPE0(STR_T) | ARG_TYPE1(SOCKADDR_T)
; defines a variable types of type u64
(unsigned 64-bit integer) and assigns it a value calculated using the macros ARG_TYPE0()
and ARG_TYPE1()
.
// Definition of ENC_ARG_TYPE,ARG_TYPE0
#define MAX_ARGS 6
#define ENC_ARG_TYPE(n, type) type << (8 * n)
#define ARG_TYPE0(type) ENC_ARG_TYPE(0, type)
#define ARG_TYPE1(type) ENC_ARG_TYPE(1, type)
#define ARG_TYPE2(type) ENC_ARG_TYPE(2, type)
#define ARG_TYPE3(type) ENC_ARG_TYPE(3, type)
#define ARG_TYPE4(type) ENC_ARG_TYPE(4, type)
#define ARG_TYPE5(type) ENC_ARG_TYPE(5, type)
#define DEC_ARG_TYPE(n, type) ((type >> (8 * n)) & 0xFF)
Here
#define ARG_TYPE0(type) ENC_ARG_TYPE(0, type)
The macro ENC_ARG_TYPE(n, type)
is defined with the purpose of encoding an argument type type at a specific position n within a 64-bit value. Here’s a breakdown of how it works:
The macro definition ENC_ARG_TYPE(n, type) type << (8 * n)
consists of two parts:
-
type:
It represents the argument type that needs to be encoded. It is provided as an argument to the macro. -
<< (8 * n):
It performs a left shift operation on thetype
value by8 * n
bits. This shift operation moves the bits oftype
to the left by a certain number of positions determined byn
.
- The value 8 corresponds to the number of bits occupied by one byte.
- The variable n determines the position within the 64-bit value where the type will be encoded.
- Multiplying 8 by n calculates the number of bits by which the type value should be shifted to the left to occupy the desired position. The resulting value of type after the left shift operation represents the encoded argument type at the specified position n within the 64-bit value.
#define ARG_TYPE0(type) ENC_ARG_TYPE(0, type)
The macro ARG_TYPE0(type)
is defined as a convenience macro that uses the ENC_ARG_TYPE(n, type)
macro to encode an argument type type
at position 0 within a 64-bit value. Here’s how it works:
The macro definition ARG_TYPE0(type) ENC_ARG_TYPE(0, type)
expands to the following:
-
ENC_ARG_TYPE(0, type):
This macro is called with arguments0
andtype
, which represents the position and type of the argument to be encoded. -
ENC_ARG_TYPE(n, type) type << (8 * n)
: TheENC_ARG_TYPE(n, type)
macro is invoked with 0 as the positionn
andtype
as the argument type. It performs a left shift operation on thetype
value by8 * 0
bits, effectively encoding the type at position 0.
In summary, ARG_TYPE0(type)
is a shorthand macro that encodes the argument type
type at position 0 within a 64-bit value. It simplifies the encoding process by providing a clear and concise way to specify the position of the argument type. The resulting encoded value can be used as a bitmask or flag to represent the argument type within the program.
These macros allow for convenient encoding and decoding of argument types used in the program. Here’s a breakdown of the relevant macros and their corresponding type values:
ARG_TYPE0(type):
This macro takes an argument type(type)
and encodes it at position 0. It uses the ENC_ARG_TYPE(n, type) macro to calculate the encoded value by shifting the type value type by 8 * n bits, where n is the position.ARG_TYPE1(type):
Similar to ARG_TYPE0, but encodes the type at position 1. In this specific case,STR_T
andSOCKADDR_T
are type values associated with string and socket address types, respectively.
The | operator is used to perform a bitwise OR
operation between the encoded values of STR_T
and SOCKADDR_T
, resulting in a combined type value.
Finally, the combined type value is assigned to the types variable of type u64. The types variable can now be used as a bitmask or flag to represent multiple argument types within the program.
-
init_context(&context);
-
The line
init_context(&context)
; is a function call that invokes the init_context function and passes the address of the context variable as an argument. This call initializes thesys_context_t
object context by populating its members with relevant information based on the current task. -
By calling
init_context(&context)
;, the context object is prepared to store context-specific information such as the timestamp, process IDs, parent process ID, user ID, and command name associated with the current task. Once the function call completes, the context object will hold the initialized values, ready for further use or analysis in the program.
#define TASK_COMM_LEN 16
typedef struct __attribute__((__packed__)) sys_context
{
u64 ts;
u32 pid_id;
u32 mnt_id;
u32 host_ppid;
u32 host_pid;
u32 ppid;
u32 pid;
u32 uid;
u32 event_id;
u32 argnum;
s64 retval;
char comm[TASK_COMM_LEN];
} sys_context_t;
- The code snippet provided defines a data structure called
sys_context_t
, which represents the context information associated with a system event. Thesys_context_t
structure has the following members:
- ts: Represents the
timestamp
of the event. - pid_id and mnt_id: Store the
process ID
,namespace ID
andmount namespace ID
, respectively. - host_ppid and host_pid: Store the
parent process ID
andprocess ID
of the host. - ppid and pid: Store the
parent process ID
andprocess ID
within the container or host, depending on the configuration. - uid: Represents the
user ID
associated with the event. - event_id: Represents the
ID of the event.
- argnum: Indicates the
number of arguments
associated with the event. - retval: Stores the
return value of the event.
- comm: Represents the command name associated with the task, stored as a character array with a maximum length defined by
TASK_COMM_LEN.
// == Context Management == //
static __always_inline u32 init_context(sys_context_t *context)
{
struct task_struct *task = (struct task_struct *)bpf_get_current_task();
context->ts = bpf_ktime_get_ns();
context->host_ppid = get_task_ppid(task);
context->host_pid = bpf_get_current_pid_tgid() >> 32;
#if defined(MONITOR_HOST)
context->pid_id = 0;
context->mnt_id = 0;
context->ppid = get_task_ppid(task);
context->pid = bpf_get_current_pid_tgid() >> 32;
#else // MONITOR_CONTAINER or MONITOR_CONTAINER_AND_HOST
u32 pid = get_task_ns_tgid(task);
if (context->host_pid == pid)
{ // host
context->pid_id = 0;
context->mnt_id = 0;
context->ppid = get_task_ppid(task);
context->pid = bpf_get_current_pid_tgid() >> 32;
}
else
{ // container
context->pid_id = get_task_pid_ns_id(task);
context->mnt_id = get_task_mnt_ns_id(task);
context->ppid = get_task_ns_ppid(task);
context->pid = pid;
}
#endif /* MONITOR_CONTAINER || MONITOR_HOST */
context->uid = bpf_get_current_uid_gid();
bpf_get_current_comm(&context->comm, sizeof(context->comm));
return 0;
}
The provided code snippet is a continuation of the previous code. It includes the implementation of the init_context function
, which is responsible for initializing the sys_context_t
structure with relevant context information.
Let’s go through the code step by step:
-
struct task_struct *task = (struct task_struct *)bpf_get_current_task();:
This line retrieves the current task (process) using thebpf_get_current_task()
BPF helper function. It casts the task to thetask_struct
structure. -
context->ts = bpf_ktime_get_ns();:
This line assigns thecurrent timestamp
, obtained using thebpf_ktime_get_ns()
BPF helper function, to the ts member of the context structure.
3.context->host_ppid = get_task_ppid(task);
: This line retrieves the parent process ID (PPID)
of the current task using the get_task_ppid()
function and assigns it to the host_ppid member
of the context structure.
-
context->host_pid = bpf_get_current_pid_tgid() >> 32;
: This line retrieves the process ID (PID) and thread group ID (TGID) of the current task using thebpf_get_current_pid_tgid()
BPF helper function. It shifts the64-bit
value right by 32 bits to obtain the PID and assigns it to thehost_pid member
of the context structure. -
The code block starting with
#if defined(MONITOR_HOST)
and ending with #else or #endif is conditional compilation based on the presence of the MONITOR_HOST macro. Depending on the configuration, the code inside the corresponding block will be included in the final program. -
If
MONITOR_HOST
is defined, the code block inside #if defined(MONITOR_HOST) is executed. It setspid_id
andmnt_id
to 0, indicating that the current task is running in the host environment. It also assigns thehost PPID
andPID
toppid and pid members
, respectively. -
If
MONITOR_HOST
is not defined, the code block inside #else or #endif is executed. It checks if thehost_pid
is equal to the PID obtained from the task’s namespace (pid). If they are equal, it means the current task is running in the host environment. In that case,pid_id
andmnt_id
are set to 0, and thehost PPID
andPID
are assigned to ppid and pid members, respectively. -
If the condition in step 7 is not met, it means the current task is running in a container. The code inside the else block assigns the
PID namespace ID
andmount namespace ID
topid_id
andmnt_id
members, respectively. It also retrieves thecontainer's PPID
andPID from the task's namespace
and assigns them toppid
andpid members
, respectively. -
context->uid = bpf_get_current_uid_gid();:
This line retrieves theuser ID (UID)
associated with the current task using thebpf_get_current_uid_gid()
BPF helper function and assigns it to the uid member of the context structure. -
bpf_get_current_comm(&context->comm, sizeof(context->comm));
: This line retrieves the command name (executable name) associated with the current task and copies it to the comm member of the context structure. The sizeof(context->comm) specifies the size of the destination buffer.
Finally, the function returns 0, indicating successful initialization of the context.
Overall, the init_context function initializes the sys_context_t structure by populating its members with relevant context information obtained from the current task.
-
context.argnum = get_arg_num(types); context.retval = PT_REGS_RC(ctx);
-
context.argnum = get_arg_num(types);:
This line calls theget_arg_num
function with the types argument and assigns the returned value to theargnum
member of the context structure. The types variable represents a bitmask of argument types, which determines the number of arguments present in the function call. Theget_arg_num
function calculates the number of arguments based on the bitmask and returns the result.static __always_inline int get_arg_num(u64 types) { unsigned int i, argnum = 0; #pragma unroll for (i = 0; i < MAX_ARGS; i++) { if (DEC_ARG_TYPE(i, types) != NONE_T) argnum++; } return argnum; }
The code snippet shows the implementation of the get_arg_num
function. This function takes a bitmask types as an argument and returns the number of arguments based on the bitmask.
Here’s how the function works:
-
It initializes two variables:
i
for the loop counter andargnum
to keep track of the number of arguments. -
The
for
loop iterates overMAX_ARGS
number of times.MAX_ARGS
is a constant defined as 6 code,representing the maximum number of arguments
. -
Inside the loop, it checks the argument type for each position
i
in the bitmask using theDEC_ARG_TYPE(i, types)
macro. TheDEC_ARG_TYPE
macro extracts the argument type from the bitmask based on the positioni
. -
If the argument type is not
NONE_T
(indicating that there is an argument at that position), it increments theargnum
counter. -
After the loop, it returns the final value of
argnum
, which represents the total number of arguments present in the bitmask.
In summary, the get_arg_num function iterates over the bitmask of argument types and counts the number of non-zero argument types, returning the total count as the result.
-
if (context.retval >= 0 && drop_syscall(_NETWORK_PROBE)) { return 0; }
-
context.retval >= 0:
This condition checks if the value ofcontext.retval
(which represents the return value of a system call) is greater than or equal to 0. This condition ensures that the system call executed successfully. -
drop_syscall(_NETWORK_PROBE):
This condition calls thedrop_syscall
function with the_NETWORK_PROBE
scope
as an argument. If this function returns a non-zero value, indicating that the system call should be dropped, the condition evaluates to true.
If both conditions are true, meaning the system call executed successfully and should be dropped based on the provided scope, the code block within the if statement will be executed.
enum
{
_FILE_PROBE = 0,
_PROCESS_PROBE = 1,
_NETWORK_PROBE = 2,
_CAPS_PROBE = 3,
_TRACE_SYSCALL = 0,
_IGNORE_SYSCALL = 1,
};
struct outer_key
{
u32 pid_ns;
u32 mnt_ns;
};
static __always_inline u32 drop_syscall(u32 scope)
{
struct outer_key okey;
struct task_struct *task = (struct task_struct *)bpf_get_current_task();
get_outer_key(&okey, task);
u32 *ns_visibility = bpf_map_lookup_elem(&kubearmor_visibility, &okey);
if (!ns_visibility)
{
return _TRACE_SYSCALL;
}
u32 *on_off_switch = bpf_map_lookup_elem(ns_visibility, &scope);
if (!on_off_switch)
{
return _TRACE_SYSCALL;
}
if (*on_off_switch)
return _IGNORE_SYSCALL;
return _TRACE_SYSCALL;
}
static __always_inline void get_outer_key(struct outer_key *pokey,
struct task_struct *t)
{
pokey->pid_ns = get_task_pid_ns_id(t);
pokey->mnt_ns = get_task_mnt_ns_id(t);
if (pokey->pid_ns == PROC_PID_INIT_INO)
{
pokey->pid_ns = 0;
pokey->mnt_ns = 0;
}
}
The drop_syscall
function is used to determine whether a syscall should be dropped or traced based on the provided scope.
Here’s a breakdown of the function:
-
It starts by defining a structure
struct outer_key
and obtaining thecurrent task
usingbpf_get_current_task()
. -
The function then calls
get_outer_key
to populate the okey structure with relevant information based on the current task. -
It looks up the value associated with the
okey
in thekubearmor_visibility
map usingbpf_map_lookup_elem
. If the lookup fails (!ns_visibility), it returns _TRACE_SYSCALL, indicating that the syscall should be traced. -
Next, it looks up the value associated with the scope in the
ns_visibility
map using bpf_map_lookup_elem. If the lookup fails (!on_off_switch), it returns _TRACE_SYSCALL, again indicating that the syscall should be traced. -
If the lookup succeeds, it checks the value pointed to by
on_off_switch
. If it is non-zero (*on_off_switch is true), it returns_IGNORE_SYSCALL
, indicating that the syscall should be dropped.
If none of the previous conditions are met, it returns _TRACE_SYSCALL, indicating that the syscall should be traced.
Overall, this function is responsible for determining whether a syscall should be dropped or traced based on the provided scope and the information stored in the kubearmor_visibility map.
10.Here’s a brief explanation of these lines:
if (get_connection_info(&conn, &sockv4, &sockv6, &context, &args, _TCP_CONNECT) != 0)
{
return 0;
}
This snippet checks the return value of the function get_connection_info
against zero. If the return value is not equal to zero, the code block within the if statement is executed, and the function or block of code that contains this snippet returns 0.
- Here’s a brief explanation of these lines:
args.args[0] = (unsigned long)conn.skc_prot->name;
The code snippet assigns the value of conn.skc_prot->name
to args.args[0]
. It appears that args
is a structure or array with a member called args, which is an array or a structure itself.
By using (unsigned long)conn.skc_prot->name
, it converts the value of conn.skc_prot->name
to an unsigned long type and assigns it to args.args[0].
- Here’s a brief explanation of these lines:
set_buffer_offset(DATA_BUF_TYPE, sizeof(sys_context_t));
In this specific case, the function call set_buffer_offset(DATA_BUF_TYPE, sizeof(sys_context_t))
updates the value of the element in the bufs_offset
array with the index DATA_BUF_TYPE (0)
. The new value assigned to that element is the size of the sys_context_t
structure, obtained using sizeof(sys_context_t)
.
Required definiton
#define DATA_BUF_TYPE 0
#define EXEC_BUF_TYPE 1
#define FILE_BUF_TYPE 2
static __always_inline void set_buffer_offset(int buf_type, u32 off)
{
bpf_map_update_elem(&bufs_offset, &buf_type, &off, BPF_ANY);
}
The set_buffer_offset function takes two arguments: buf_type (the buffer type)
and off (the offset value)
. It updates the corresponding element in the bufs_offset
array with the provided off
value using the bpf_map_update_elem
function.
BPF_PERCPU_ARRAY(bufs_offset, u32, 3);
There is a BPF per-CPU array named bufs_offset
defined with a size of 3. This array is used to store the offset values for different buffer types.
This code is useful for maintaining and accessing offset values associated with different buffer types in the BPF program. It allows the BPF program to efficiently calculate the memory locations for specific buffer types based on the provided offsets.
- Here’s a brief explanation of these lines:
bufs_t *bufs_p = get_buffer(DATA_BUF_TYPE);
It declares a pointer variable bufs_p of type bufs_t*, and assigns it the value returned by the get_buffer function when called with the DATA_BUF_TYPE parameter.
The get_buffer
function, retrieves an element from the bufs
map based on the provided buffer type. Since bufs_p
is assigned the returned pointer, it will point to the bufs_t
structure corresponding to the DATA_BUF_TYPE
in the bufs map.
This allows you to access and manipulate the data stored in the buffer through the bufs_p
pointer.
Required definition
typedef struct buffers
{
u8 buf[MAX_BUFFER_SIZE];
} bufs_t;
static __always_inline bufs_t *get_buffer(int buf_type)
{
return bpf_map_lookup_elem(&bufs, &buf_type);
}
The code snippet defines a structure bufs_t
that contains a byte array buf
with a maximum size defined by MAX_BUFFER_SIZE
.
The get_buffer
function is declared as an inline function, which is always inlined at the call site to optimize performance. This function takes an integer parameter buf_type
and returns a pointer to a bufs_t structure.
Within the get_buffer
function, bpf_map_lookup_elem
is used to retrieve an element from the bufs
map.The first parameter of bpf_map_lookup_elem is the map object (&bufs)
, and the second parameter is the key used to lookup the element (&buf_type)
.
The function returns a pointer to the retrieved bufs_t structure.
BPF_PERCPU_ARRAY(bufs, bufs_t, 3);
The code snippet declares a BPF per-CPU array
named bufs
that can store elements of type bufs_t
. The array has a size of 3, indicating that it can hold three elements.
A BPF per-CPU array is an array data structure in eBPF that allows each CPU to have its own private copy of the array. This is useful in scenarios where concurrent access to the array from multiple CPUs needs to be managed efficiently.
In this case, the bufs
array is defined to store elements of type bufs_t
, which is a structure containing a byte array buf
with a maximum size defined by MAX_BUFFER_SIZE.
By declaring a per-CPU array, the BPF program can efficiently store and access buffers of type bufs_t per CPU
. Each CPU will have its own private copy of the array
,` enabling concurrent access without requiring synchronization mechanisms.
- Here’s a brief explanation of these lines:
if (bufs_p == NULL)
return 0;
Checks if the bufs_p pointer is NULL, indicating that the corresponding buffer was not found in the bufs map.
If bufs_p is NULL, it means that the buffer retrieval failed, and the code returns 0 to indicate an error or failure condition.
- Here’s a brief explanation of these lines:
save_context_to_buffer(bufs_p, (void *)&context);
save_context_to_buffer(bufs_p, (void *)&context)
; calls the save_context_to_buffer
function with bufs_p
as the buffer pointer
and (void *)&context
as the pointer to the context data
.
The function save_context_to_buffer
attempts to save the context data pointed to by (
void *)&contextinto the buffer pointed to by
bufs_p.` If the save operation is successful, it returns the size of the context data (sizeof(sys_context_t).
Where , save_context_to_buffer
is defined as
static __always_inline int save_context_to_buffer(bufs_t *bufs_p, void *ptr)
{
if (bpf_probe_read(&(bufs_p->buf[0]), sizeof(sys_context_t), ptr) == 0)
{
return sizeof(sys_context_t);
}
return 0;
}
The function save_context_to_buffer
is used to save the context
data pointed to by ptr
into the buffer bufs_p
. Here’s how the function works:
-
It attempts to read the data from
ptr
usingbpf_probe_read
and stores it in the bufferbufs_p->buf[0]
. The size of the data being read issizeof(sys_context_t)
. -
If the read operation succeeds (indicated by
bpf_probe_read
returning0
), it returnssizeof(sys_context_t)
to indicate the number of bytes saved to the buffer. -
If the read operation fails (indicated by
bpf_probe_read
returning a non-zero value), it returns 0 to indicate that the save operation failed.
Overall, the function attempts to save the context data to the buffer and returns the number of bytes saved if successful, or 0 if it fails.
- Here’s a brief explanation of these lines:
save_args_to_buffer(types, &args);
save_args_to_buffer(types, &args)
; is a function call to the save_args_to_buffer
function, which is responsible for saving the arguments to the buffer.
In this function call, types
is a variable representing the types of arguments, and args
is a pointer to the args_t
structure that holds the argument values.
By calling save_args_to_buffer(types, &args)
, the function will process the arguments based on their types and save them to the buffer.
static __always_inline int save_args_to_buffer(u64 types, args_t *args)
{
if (types == 0)
{
return 0;
}
bufs_t *bufs_p = get_buffer(DATA_BUF_TYPE);
if (bufs_p == NULL)
{
return 0;
}
#pragma unroll
for (int i = 0; i < MAX_ARGS; i++)
{
switch (DEC_ARG_TYPE(i, types))
{
case NONE_T:
break;
case INT_T:
save_to_buffer(bufs_p, (void *)&(args->args[i]), sizeof(int), INT_T);
break;
case OPEN_FLAGS_T:
save_to_buffer(bufs_p, (void *)&(args->args[i]), sizeof(int), OPEN_FLAGS_T);
break;
case FILE_TYPE_T:
save_file_to_buffer(bufs_p, (void *)args->args[i]);
break;
case PTRACE_REQ_T:
save_to_buffer(bufs_p, (void *)&(args->args[i]), sizeof(int), PTRACE_REQ_T);
break;
case MOUNT_FLAG_T:
save_to_buffer(bufs_p, (void *)&(args->args[i]), sizeof(int), MOUNT_FLAG_T);
break;
case UMOUNT_FLAG_T:
save_to_buffer(bufs_p, (void *)&(args->args[i]), sizeof(int), UMOUNT_FLAG_T);
break;
case STR_T:
save_str_to_buffer(bufs_p, (void *)args->args[i]);
break;
case SOCK_DOM_T:
save_to_buffer(bufs_p, (void *)&(args->args[i]), sizeof(int), SOCK_DOM_T);
break;
case SOCK_TYPE_T:
save_to_buffer(bufs_p, (void *)&(args->args[i]), sizeof(int), SOCK_TYPE_T);
break;
case SOCKADDR_T:
if (args->args[i])
{
short family = 0;
bpf_probe_read(&family, sizeof(short), (void *)args->args[i]);
switch (family)
{
case AF_UNIX:
save_to_buffer(bufs_p, (void *)(args->args[i]), sizeof(struct sockaddr_un), SOCKADDR_T);
break;
case AF_INET:
save_to_buffer(bufs_p, (void *)(args->args[i]), sizeof(struct sockaddr_in), SOCKADDR_T);
break;
case AF_INET6:
save_to_buffer(bufs_p, (void *)(args->args[i]), sizeof(struct sockaddr_in6), SOCKADDR_T);
break;
default:
save_to_buffer(bufs_p, (void *)&family, sizeof(short), SOCKADDR_T);
}
}
break;
case UNLINKAT_FLAG_T:
save_to_buffer(bufs_p, (void *)&(args->args[i]), sizeof(int), UNLINKAT_FLAG_T);
break;
}
}
return 0;
}
The code snippet is the definition of the save_args_to_buffer
function. This function is responsible for saving the arguments (args) to the buffer.
Here’s a breakdown of what the code does:
- It checks if the
types
value iszero
. If it is, it returns 0 indicating that there are no arguments to save. - It obtains a pointer to the buffer of type
bufs_t
by calling theget_buffer function
withDATA_BUF_TYPE
as the argument. If the buffer pointer is NULL, it returns 0 indicating a failure. - It then iterates over the arguments using a loop.
Inside the loop, it switches on the argument type determined by
DEC_ARG_TYPE(i, types)
where i is the current iteration index. Based on the argument type, different actions are taken:
- For argument types such as
INT_T
,OPEN_FLAGS_T
,PTRACE_REQ_T
,MOUNT_FLAG_T
,UMOUNT_FLAG_T
,SOCK_DOM_T
,SOCK_TYPE_T
, andUNLINKAT_FLAG_T
, it calls thesave_to_buffer
function to save the argument value to the buffer with the corresponding size and type. - For
FILE_TYPE_T
andSTR_T
, it calls thesave_file_to_buffer
andsave_str_to_buffer
functions respectively to save the argument values to the buffer. - For
SOCKADDR_T
, it checks theaddress family
and based on thefamily value
, it saves the appropriatestruct sockaddr data
to the buffer. - After iterating over all the arguments, it returns 0 indicating a successful save.
- Overall, the
save_args_to_buffer
function is responsible for saving the arguments to the buffer based on their types.
17.Here’s a brief explanation of these lines:
events_perf_submit(ctx);
static __always_inline int events_perf_submit(struct pt_regs *ctx)
{
bufs_t *bufs_p = get_buffer(DATA_BUF_TYPE);
if (bufs_p == NULL)
return -1;
u32 *off = get_buffer_offset(DATA_BUF_TYPE);
if (off == NULL)
return -1;
void *data = bufs_p->buf;
int size = *off & (MAX_BUFFER_SIZE - 1);
return bpf_perf_event_output(ctx, &sys_events, BPF_F_CURRENT_CPU, data, size);
}
The code is the implementation of the events_perf_submit
function. This function is responsible for submitting events to a BPF perf event output buffer.
Here’s a breakdown of the code:
-
bufs_t *bufs_p = get_buffer(DATA_BUF_TYPE);
: It retrieves a pointer to the buffer of typeDATA_BUF_TYPE
using theget_buffer
function. If the buffer is not found or is NULL, it returns -1. -
u32 *off = get_buffer_offset(DATA_BUF_TYPE);
: It retrieves the offset value for theDATA_BUF_TYPE
buffer using theget_buffer_offset
function. If the offset is not found or is NULL, it returns -1. -
void *data = bufs_p->buf;
: It assigns the starting address of the buffer to the data pointer. -
int size = *off & (MAX_BUFFER_SIZE - 1);
: It calculates the size of the data in the buffer by masking the offset value withMAX_BUFFER_SIZE - 1
. This ensures that the size is within the maximum buffer size. -
return
bpf_perf_event_output(ctx, &sys_events, BPF_F_CURRENT_CPU, data, size);
: It submits the event to the BPF perf event output buffer using thebpf_perf_event_output
function. Thectx
parameter is a pointer tostruct pt_regs
,&sys_events
is the BPF map representing theperf event output buffer
,BPF_F_CURRENT_CPU
specifies the CPU to submit the event to,data
is the pointer to the data buffer, andsize
is the size of the data. The function returns the result of the submission.
Overall, the events_perf_submit
function retrieves the buffer and offset, prepares the data and size, and then submits the event to the perf event output buffer.
Looking for Cloud-Native Implementation?
Finding the right talent is pain. More so, keeping up with concepts, culture, technology and tools. We all have been there. Our AI-based automated solutions helps eliminate these issues, making your teams lives easy.
Contact Us