    perf record: Add --tail-synthesize option · 4ea648ae
    Wang Nan authored
    
    
    When working with an overwritable ring buffer there is an inconvenient
    problem: if perf dumps data long after it starts, non-sample events
    may be lost, which leaves the following 'perf report' unable to
    identify process names and mmap layout. For example:
    
     # perf record -m 4 -e raw_syscalls:* -g --overwrite --switch-output \
            dd if=/dev/zero of=/dev/null
    
    Send SIGUSR2 after dd has run long enough. The resulting perf.data has
    lost the correct comm and mmap events:
    
     # perf script -i perf.data.2016061522374354
     perf 24478 [004] 2581325.601789:  raw_syscalls:sys_exit: NR 0 = 512
     ^^^^
     Should be 'dd'
                       27b2e8 syscall_slow_exit_work+0xfe2000e3 (/lib/modules/4.6.0-rc3+/build/vmlinux)
                       203cc7 do_syscall_64+0xfe200117 (/lib/modules/4.6.0-rc3+/build/vmlinux)
                       b18d83 return_from_SYSCALL_64+0xfe200000 (/lib/modules/4.6.0-rc3+/build/vmlinux)
                 7f47c417edf0 [unknown] ([unknown])
                 ^^^^^^^^^^^^
                 Fail to unwind
    
    This patch provides a '--tail-synthesize' option, which allows perf to
    collect system status when finalizing the output file. In the resulting
    output file, the non-sample events reflect the system status at the
    time the data is dumped.
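    
    The effect can be illustrated with a toy model. This is only a sketch,
    not perf's implementation: the helper names (record, tail_synthesize,
    resolve_comm), the event tuples and the fallback name "perf" are all
    assumptions made for illustration. It shows an early COMM event being
    overwritten by later samples, and COMM events synthesized at dump time
    restoring the process name.

```python
from collections import deque


def record(events, capacity):
    """Keep only the newest `capacity` events, like an overwritable buffer."""
    ring = deque(maxlen=capacity)  # older entries are dropped automatically
    for ev in events:
        ring.append(ev)
    return list(ring)


def tail_synthesize(dumped, live_comms):
    """Append COMM events for the processes that are alive at dump time."""
    synthesized = [("COMM", pid, name) for pid, name in sorted(live_comms.items())]
    return dumped + synthesized


def resolve_comm(events, pid, fallback="perf"):
    """Return the comm for `pid`, or a stale fallback if no COMM event exists."""
    for kind, ev_pid, payload in events:
        if kind == "COMM" and ev_pid == pid:
            return payload
    return fallback


# Hypothetical trace: one COMM event at startup, then many samples.
trace = [("COMM", 24478, "dd")] + [("SAMPLE", 24478, i) for i in range(10)]

dumped = record(trace, capacity=4)               # the COMM event is overwritten
fixed = tail_synthesize(dumped, {24478: "dd"})   # state collected at dump time

before = resolve_comm(dumped, 24478)
after = resolve_comm(fixed, 24478)
```

    In this model 'before' falls back to the stale name while 'after'
    resolves to 'dd'. A pid that is missing from the live-process table at
    dump time would still fall back.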
    
    After this patch:
     # perf record -m 4 -e raw_syscalls:* -g --overwrite --switch-output --tail-synthesize \
            dd if=/dev/zero of=/dev/null
    
     # perf script -i perf.data.2016061600544998
     dd 27364 [004] 2583244.994464: raw_syscalls:sys_enter: NR 1 (1, ...
     ^^
     Correct comm
                       203a18 syscall_trace_enter_phase2+0xfe2001a8 ([kernel.kallsyms])
                       203aa5 syscall_trace_enter+0xfe200055 ([kernel.kallsyms])
                       203caa do_syscall_64+0xfe2000fa ([kernel.kallsyms])
                       b18d83 return_from_SYSCALL_64+0xfe200000 ([kernel.kallsyms])
                        d8e50 __GI___libc_write+0xffff01d9639f4010 (/tmp/oxygen_root-w00229757/lib64/libc-2.18.so)
                        ^^^^^
                        Correct unwind
    
    This option doesn't aim to solve this problem completely. If a process
    terminates before SIGUSR2, we still lose its COMM and MMAP events. For
    example, we can't unwind correctly from the final perf.data we get from
    the previous example, because when perf collects the final output file
    (when we press C-c), 'dd' has already terminated, so its
    '/proc/<pid>/maps' is no longer available.
    
    However, this is the cheaper choice. To solve this problem completely
    we would need to output non-sample events continuously. To satisfy the
    requirements of daemonization, we would also need to merge them
    periodically. That is possible but requires much more code and cycles.
    
    Automatically select --tail-synthesize when --overwrite is provided.
    
    Signed-off-by: Wang Nan <wangnan0@huawei.com>
    Cc: He Kuang <hekuang@huawei.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Masami Hiramatsu <mhiramat@kernel.org>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Nilay Vaish <nilayvaish@gmail.com>
    Cc: Zefan Li <lizefan@huawei.com>
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1468485287-33422-16-git-send-email-wangnan0@huawei.com
    
    
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>