I played a little with different libseccomp versions and the experimental branch, and wanted to share some findings.
I rebuilt 2.3.3, 2.4.1, master and the bintree branch from https://github.com/drakenclimber/libseccomp/tree/rfc-bintree-v2 , then compiled BPF profiles with each version.
The disassembly of the profile generated with version 2.4.1 can be found here: https://paste.ubuntu.com/p/wRTDmq39Sw/ and the bintree one is here: https://paste.ubuntu.com/p/Q9VYyhchRR/.
The results obtained running volatility (without compiled Python modules) on the same image as @mvo are rather disappointing:
- @unrestricted: ~18s
- 2.4.1: ~20.53s
- bintree: ~20.25s
The gains from the binary tree are negligible for this particular snap: roughly 12.5% overhead over @unrestricted with bintree, versus roughly 14% with 2.4.1. I then hacked a bit on the BPF simulator built into libseccomp and added a count of the instructions executed before reaching a ret <action> instruction. The patch is included below.
BPF simulator patch
diff --git a/tools/scmp_bpf_sim.c b/tools/scmp_bpf_sim.c
index 73d056b..2eebcec 100644
--- a/tools/scmp_bpf_sim.c
+++ b/tools/scmp_bpf_sim.c
@@ -155,7 +155,7 @@ static void end_action(uint32_t action, unsigned int line)
static void bpf_execute(const struct bpf_program *prg,
const struct seccomp_data *sys_data)
{
- unsigned int ip, ip_c;
+ unsigned int ip, ip_c, inst_cnt;
struct sim_state state;
bpf_instr_raw *bpf;
unsigned char *sys_data_b = (unsigned char *)sys_data;
@@ -167,9 +167,11 @@ static void bpf_execute(const struct bpf_program *prg,
/* initialize the machine state */
ip_c = 0;
ip = 0;
+ inst_cnt = 0;
memset(&state, 0, sizeof(state));
while (ip < prg->i_cnt) {
+ inst_cnt++;
/* get the instruction and bump the ip */
ip_c = ip;
bpf = &prg->i[ip++];
@@ -215,6 +217,9 @@ static void bpf_execute(const struct bpf_program *prg,
ip += jf;
break;
case BPF_RET+BPF_K:
+ if (opt_verbose) {
+ fprintf(stderr, "instruction count: %u\n", inst_cnt);
+ }
end_action(k, ip_c);
break;
default:
The results are summarized in the table:
| syscall | <nr> | 2.4.1 #inst | bintree #inst |
|--------------+------+-------------+----------------|
| read() | 0 | 6 | 14 |
| lseek() | 8 | 14 | 14 |
| mmap() | 9 | 15 | 14 |
| getppid() | 110 | 108 | 16 |
| epoll_wait() | 232 | 192 | 16 |
| openat() | 257 | 204 | 16 |

(getppid() is the syscall from Oracle's blog post.)
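One way to read the table: the 2.4.1 counts grow with how far down the filter a syscall sits, which is what a linear chain of comparisons would produce, while the bintree counts stay nearly flat, as a binary search would. The sketch below is only a toy model of that difference, not libseccomp's actual code generator. `linear_steps` and `bsearch_steps` are hypothetical helpers that count comparisons over a sorted array of allowed syscall numbers.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not libseccomp code: comparisons needed to reach `nr`
 * when each allowed syscall is checked in turn. */
size_t linear_steps(const int *allowed, size_t n, int nr)
{
	size_t steps = 0;

	assert(allowed != NULL);
	for (size_t i = 0; i < n; i++) {
		steps++;
		if (allowed[i] == nr)
			break;
	}
	return steps;
}

/* Same lookup done as a binary search over the sorted array. */
size_t bsearch_steps(const int *allowed, size_t n, int nr)
{
	size_t lo = 0, hi = n, steps = 0;

	assert(allowed != NULL);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		steps++;
		if (allowed[mid] == nr)
			break;
		if (allowed[mid] < nr)
			lo = mid + 1;
		else
			hi = mid;
	}
	return steps;
}
```

With 300 allowed syscalls numbered 0..299 (an arbitrary choice for illustration), looking up 257 takes 258 comparisons in the linear model and at most 9 in the binary search model. The real filters add a few fixed instructions for the architecture check and argument handling, so the absolute numbers differ from the table, but the shape is the same.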
That largely explains why the gain from the experimental branch is so small. Strace data shows that lseek() and read() account for 97% of all the syscalls executed by the benchmarked command. With lseek() costing the same 14 instructions under both versions (and read() actually costing more with bintree, 14 versus 6), the results for volatility are influenced by the other syscalls. For instance, openat() (probably due to Python compiling modules?) shows up high on my list, and it has a significantly shorter instruction count with the bintree profile. By the looks of it, the experimental branch seems capable of delivering better average performance.
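To put a number on "average performance" for a given workload, the per-syscall instruction counts from the table can be weighted by how often each syscall is made. The helper below is a minimal sketch of that arithmetic. `avg_insts` is a hypothetical name, and the per-syscall shares are an input you would have to derive yourself (for example from `strace -c` counts), since the post only gives the combined 97% figure for lseek() and read().

```c
#include <assert.h>
#include <stddef.h>

/* Expected BPF instructions per syscall for a given syscall mix.
 * share[i] is the fraction of all calls made to syscall i (assumed
 * input, must be measured separately); insts[i] is the instruction
 * count for that syscall, e.g. taken from the table above. */
double avg_insts(const double *share, const unsigned int *insts, size_t n)
{
	double avg = 0.0;

	assert(share != NULL && insts != NULL);
	for (size_t i = 0; i < n; i++)
		avg += share[i] * insts[i];
	return avg;
}
```

As a purely hypothetical mix of 50% read(), 47% lseek() and 3% openat(), the table's counts give about 15.7 instructions per call for 2.4.1 and about 14.06 for bintree. A more read()-heavy mix would tilt the result toward 2.4.1, so the real split matters.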