• Daniel Wagner's avatar
    bpf: BPF based latency tracing · 0fb1170e
    Daniel Wagner authored
    BPF offers another way to generate latency histograms. We attach
    kprobes at trace_preempt_off and trace_preempt_on and calculate the
    time it takes to from seeing the off/on transition.
    
    The first array is used to store the start time stamp. The key is the
    CPU id. The second array stores the log2(time diff). We need to use
    static allocation here (array and not hash tables). The kprobes
    hooking into trace_preempt_on|off should not calling any dynamic
    memory allocation or free path. We need to avoid recursivly
    getting called. Besides that, it reduces jitter in the measurement.
    
    CPU 0
          latency        : count     distribution
           1 -> 1        : 0        |                                        |
           2 -> 3        : 0        |                                        |
           4 -> 7        : 0        |                                        |
           8 -> 15       : 0        |                                        |
          16 -> 31       : 0        |                                        |
          32 -> 63       : 0        |                                        |
          64 -> 127      : 0        |                                        |
         128 -> 255      : 0        |                                        |
         256 -> 511      : 0        |                                        |
         512 -> 1023     : 0        |                                        |
        1024 -> 2047     : 0        |                                        |
        2048 -> 4095     : 166723   |*************************************** |
        4096 -> 8191     : 19870    |***                                     |
        8192 -> 16383    : 6324     |                                        |
       16384 -> 32767    : 1098     |                                        |
       32768 -> 65535    : 190      |                                        |
       65536 -> 131071   : 179      |                                        |
      131072 -> 262143   : 18       |                                        |
      262144 -> 524287   : 4        |                                        |
      524288 -> 1048575  : 1363     |                                        |
    CPU 1
          latency        : count     distribution
           1 -> 1        : 0        |                                        |
           2 -> 3        : 0        |                                        |
           4 -> 7        : 0        |                                        |
           8 -> 15       : 0        |                                        |
          16 -> 31       : 0        |                                        |
          32 -> 63       : 0        |                                        |
          64 -> 127      : 0        |                                        |
         128 -> 255      : 0        |                                        |
         256 -> 511      : 0        |                                        |
         512 -> 1023     : 0        |                                        |
        1024 -> 2047     : 0        |                                        |
        2048 -> 4095     : 114042   |*************************************** |
        4096 -> 8191     : 9587     |**                                      |
        8192 -> 16383    : 4140     |                                        |
       16384 -> 32767    : 673      |                                        |
       32768 -> 65535    : 179      |                                        |
       65536 -> 131071   : 29       |                                        |
      131072 -> 262143   : 4        |                                        |
      262144 -> 524287   : 1        |                                        |
      524288 -> 1048575  : 364      |                                        |
    CPU 2
          latency        : count     distribution
           1 -> 1        : 0        |                                        |
           2 -> 3        : 0        |                                        |
           4 -> 7        : 0        |                                        |
           8 -> 15       : 0        |                                        |
          16 -> 31       : 0        |                                        |
          32 -> 63       : 0        |                                        |
          64 -> 127      : 0        |                                        |
         128 -> 255      : 0        |                                        |
         256 -> 511      : 0        |                                        |
         512 -> 1023     : 0        |                                        |
        1024 -> 2047     : 0        |                                        |
        2048 -> 4095     : 40147    |*************************************** |
        4096 -> 8191     : 2300     |*                                       |
        8192 -> 16383    : 828      |                                        |
       16384 -> 32767    : 178      |                                        |
       32768 -> 65535    : 59       |                                        |
       65536 -> 131071   : 2        |                                        |
      131072 -> 262143   : 0        |                                        |
      262144 -> 524287   : 1        |                                        |
      524288 -> 1048575  : 174      |                                        |
    CPU 3
          latency        : count     distribution
           1 -> 1        : 0        |                                        |
           2 -> 3        : 0        |                                        |
           4 -> 7        : 0        |                                        |
           8 -> 15       : 0        |                                        |
          16 -> 31       : 0        |                                        |
          32 -> 63       : 0        |                                        |
          64 -> 127      : 0        |                                        |
         128 -> 255      : 0        |                                        |
         256 -> 511      : 0        |                                        |
         512 -> 1023     : 0        |                                        |
        1024 -> 2047     : 0        |                                        |
        2048 -> 4095     : 29626    |*************************************** |
        4096 -> 8191     : 2704     |**                                      |
        8192 -> 16383    : 1090     |                                        |
       16384 -> 32767    : 160      |                                        |
       32768 -> 65535    : 72       |                                        |
       65536 -> 131071   : 32       |                                        |
      131072 -> 262143   : 26       |                                        |
      262144 -> 524287   : 12       |                                        |
      524288 -> 1048575  : 298      |                                        |
    
    All this is based on the trace3 examples written by
    Alexei Starovoitov <ast@plumgrid.com>.
    Signed-off-by: 's avatarDaniel Wagner <daniel.wagner@bmw-carit.de>
    Cc: Alexei Starovoitov <ast@plumgrid.com>
    Cc: Alexei Starovoitov <ast@plumgrid.com>
    Cc: "David S. Miller" <davem@davemloft.net>
    Cc: Daniel Borkmann <daniel@iogearbox.net>
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: linux-kernel@vger.kernel.org
    Cc: netdev@vger.kernel.org
    Acked-by: 's avatarAlexei Starovoitov <ast@plumgrid.com>
    Acked-by: 's avatarDaniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: 's avatarDavid S. Miller <davem@davemloft.net>
    0fb1170e
lathist_kern.c 2.12 KB