path: root/kernel/trace
author		Linus Torvalds <torvalds@linux-foundation.org>	2025-12-05 09:51:37 -0800
committer	Linus Torvalds <torvalds@linux-foundation.org>	2025-12-05 09:51:37 -0800
commit		69c5079b49fa120c1a108b6e28b3a6a8e4ae2db5 (patch)
tree		d3b2ecb61bcbf9d9d9a8f9fa7f620af0030b514d /kernel/trace
parent		36492b7141b9abc967e92c991af32c670351dc16 (diff)
parent		f6ed9c5d3190cf18382ee75e0420602101f53586 (diff)
Merge tag 'trace-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing updates from Steven Rostedt:

 - Extend tracing option mask to 64 bits

   The trace options were defined by a 32 bit variable. This limits the
   tracing instances to a total of 32 different options. As that limit has
   been hit, and more options are being added, increase the option mask to
   a 64 bit number, doubling the number of options available. As this is
   required for the kprobe topic branches as well as the tracing topic
   branch, a separate branch was created and merged into both.

 - Make trace_user_fault_read() available for the rest of tracing

   The function trace_user_fault_read() is used by the trace_marker file
   read to allow reading user space fast and without locking or
   allocations. Make this available so that the system call trace events
   can use it too.

 - Have system call trace events read user space values

   Now that the system call trace event callbacks are called in a faultable
   context, take advantage of this and read the user space buffers for
   various system calls. For example, show the path name of the openat
   system call instead of just showing the pointer to that path name in
   user space. Also show the contents of the buffer of the write system
   call. Several system call trace events are updated to make tracing into
   a lightweight strace tool for all applications in the system.

 - Update perf system call tracing to do the same

 - Add a config option and a syscall_user_buf_size file to control the
   size of the buffer

   Limit the amount of data that can be read from user space. The default
   size is 63 bytes but that can be expanded to 165 bytes.

 - Allow the persistent ring buffer to print system calls normally

   The persistent ring buffer prints trace events by their type and ignores
   the print_fmt. This is because the print_fmt may change from kernel to
   kernel. As the system call output is fixed by the system call ABI
   itself, there's no reason to limit that. This makes reading the system
   call events in the persistent ring buffer much nicer and easier to
   understand.

 - Add an option to show text offset in the function profiler

   The function profiler that counts the number of times a function is hit
   currently lists all functions by name and offset. But this becomes
   ambiguous when there are several functions with the same name. Add a
   tracing option that changes the output to '_text+offset' instead. A user
   space tool can then use this information to map the '_text+offset' to
   the unique function it is counting.

 - Report bad dynamic event command

   If a bad command is passed to the dynamic_events file, report it
   properly in the error log.

 - Clean up tracer options

   Clean up the tracer option code a bit by removing some useless code and
   by using switch statements instead of a series of if statements.

 - Have tracing options be instance specific

   Tracers can have their own options (function tracer, irqsoff tracer,
   function graph tracer, etc). But now that the same tracer can be enabled
   in multiple trace instances, their options are still global. The API is
   per instance, yet changing an option in one instance affects the others.
   This isn't even consistent, as the options take effect differently
   depending on when a tracer was started in an instance. Make the options
   for instances only affect the instance they are changed under.

 - Optimize pid_list lock contention

   Whenever the pid_list is read, it takes a spin lock. This happens at
   every sched switch. Taking the lock at sched switch can be removed by
   instead using a seqlock counter.

 - Clean up the trace trigger structures

   The trigger code uses two different structures to implement a single
   trigger. This was due to trying to reuse code for the two different
   types of triggers (always on triggers and count limited triggers). But
   by adding a single field to one structure, the other structure could be
   absorbed into the first, making the code easier to understand.

 - Create a bulk garbage collector for trace triggers

   If user space has triggers for several hundreds of events and then
   removes them, it can take several seconds to complete. This is because
   each removal calls tracepoint_synchronize_unregister(), which can take
   hundreds of milliseconds to complete. Instead, create a helper thread
   that will do the clean up. When a trigger is removed, it will create the
   kthread if it isn't already created, and then add the trigger to a
   llist. The kthread will take the items off the llist, call
   tracepoint_synchronize_unregister(), and then remove the items it took
   off. It will then check if there are more items to free before sleeping.
   This lets user space remove all these triggers in less than a second.

 - Allow function tracing of some of the tracing infrastructure code

   Because the tracing code can cause recursion issues if it is traced by
   the function tracer, the entire tracing directory disables function
   tracing. But not all of the tracing code causes issues when traced,
   namely the event tracing code. Add a config that enables some of the
   tracing code to be traced to help in debugging it. Note, when this is
   enabled, it does add noise to general function tracing, especially if
   events are enabled as well (which is a common case).

 - Add boot-time backup instance for the persistent buffer

   The persistent ring buffer is used mostly for kernel crash analysis in
   the field. One issue is that if there's a crash, the data in the
   persistent ring buffer must be read before tracing can begin using it.
   This slows down the boot process. Once tracing starts in the persistent
   ring buffer, the old data must be freed, the addresses no longer match,
   and old events can't be in the buffer with new events. Create a way to
   make a backup buffer that copies the persistent ring buffer at boot up.
   Then after a crash, the always-on tracer and the normal boot process can
   begin immediately, while the crash analysis tooling uses the backup
   buffer. After the backup buffer is finished being read, it can be
   removed.

 - Enable function graph args and return address options at the same time

   Currently, when reading of arguments in the function graph tracer is
   enabled, the option to record the parent function in the entry event
   cannot be enabled. Update the code so that it can.

 - Add new struct_offset() helper macro

   Add a new macro that takes a pointer to a structure and the name of one
   of its members and returns the offset of that member. This allows the
   ring buffer code to simplify the following:

     From: size = struct_size(entry, buf, cnt - sizeof(entry->id));
     To:   size = struct_offset(entry, id) + cnt;

   There should be other simplifications that this macro can help with as
   well (a minimal sketch of the idea follows the commit list below).

* tag 'trace-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (42 commits)
  overflow: Introduce struct_offset() to get offset of member
  function_graph: Enable funcgraph-args and funcgraph-retaddr to work simultaneously
  tracing: Add boot-time backup of persistent ring buffer
  ftrace: Allow tracing of some of the tracing code
  tracing: Use strim() in trigger_process_regex() instead of skip_spaces()
  tracing: Add bulk garbage collection of freeing event_trigger_data
  tracing: Remove unneeded event_mutex lock in event_trigger_regex_release()
  tracing: Merge struct event_trigger_ops into struct event_command
  tracing: Remove get_trigger_ops() and add count_func() from trigger ops
  tracing: Show the tracer options in boot-time created instance
  ftrace: Avoid redundant initialization in register_ftrace_direct
  tracing: Remove unused variable in tracing_trace_options_show()
  fgraph: Make fgraph_no_sleep_time signed
  tracing: Convert function graph set_flags() to use a switch() statement
  tracing: Have function graph tracer option sleep-time be per instance
  tracing: Move graph-time out of function graph options
  tracing: Have function graph tracer option funcgraph-irqs be per instance
  trace/pid_list: optimize pid_list->lock contention
  tracing: Have function graph tracer define options per instance
  tracing: Have function tracer define options per instance
  ...
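A minimal sketch of the struct_offset() idea, assuming it reduces to
offsetof() on the pointed-to type; the raw_data_entry layout below is
illustrative, not the kernel's:

	#include <stddef.h>

	#define struct_offset(p, member)	offsetof(typeof(*(p)), member)

	struct raw_data_entry {
		struct { int pid; unsigned short type; } ent;	/* common header, illustrative */
		unsigned int id;				/* user-supplied tag */
		char buf[];					/* payload follows the id */
	};

	static inline size_t raw_event_size(struct raw_data_entry *entry, size_t cnt)
	{
		/*
		 * 'cnt' covers the id plus the payload, so the total event
		 * size is the offset of 'id' plus 'cnt':
		 *
		 *   before: struct_size(entry, buf, cnt - sizeof(entry->id));
		 *   after:  struct_offset(entry, id) + cnt;
		 */
		return struct_offset(entry, id) + cnt;
	}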
Diffstat (limited to 'kernel/trace')
-rw-r--r--  kernel/trace/Kconfig                  |  28
-rw-r--r--  kernel/trace/Makefile                 |  17
-rw-r--r--  kernel/trace/blktrace.c               |   6
-rw-r--r--  kernel/trace/fgraph.c                 |  10
-rw-r--r--  kernel/trace/ftrace.c                 |  32
-rw-r--r--  kernel/trace/pid_list.c               |  30
-rw-r--r--  kernel/trace/pid_list.h               |   1
-rw-r--r--  kernel/trace/trace.c                  | 893
-rw-r--r--  kernel/trace/trace.h                  | 230
-rw-r--r--  kernel/trace/trace_dynevent.c         |  11
-rw-r--r--  kernel/trace/trace_entries.h          |  15
-rw-r--r--  kernel/trace/trace_eprobe.c           |  19
-rw-r--r--  kernel/trace/trace_events.c           |   4
-rw-r--r--  kernel/trace/trace_events_hist.c      | 143
-rw-r--r--  kernel/trace/trace_events_synth.c     |   2
-rw-r--r--  kernel/trace/trace_events_trigger.c   | 408
-rw-r--r--  kernel/trace/trace_fprobe.c           |   6
-rw-r--r--  kernel/trace/trace_functions.c        |  10
-rw-r--r--  kernel/trace/trace_functions_graph.c  | 220
-rw-r--r--  kernel/trace/trace_irqsoff.c          |  30
-rw-r--r--  kernel/trace/trace_kdb.c              |   2
-rw-r--r--  kernel/trace/trace_kprobe.c           |   6
-rw-r--r--  kernel/trace/trace_output.c           |  45
-rw-r--r--  kernel/trace/trace_output.h           |  11
-rw-r--r--  kernel/trace/trace_sched_wakeup.c     |  24
-rw-r--r--  kernel/trace/trace_syscalls.c         | 935
26 files changed, 2242 insertions(+), 896 deletions(-)
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 4661b9e606e0..bfa2ec46e075 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -342,6 +342,20 @@ config DYNAMIC_FTRACE_WITH_JMP
depends on DYNAMIC_FTRACE_WITH_DIRECT_CALLS
depends on HAVE_DYNAMIC_FTRACE_WITH_JMP
+config FUNCTION_SELF_TRACING
+ bool "Function trace tracing code"
+ depends on FUNCTION_TRACER
+ help
+ Normally all the tracing code is set to notrace, where the function
+ tracer will ignore all the tracing functions. Sometimes it is useful
+ for debugging to trace some of the tracing infrastructure itself.
+ Enable this to allow some of the tracing infrastructure to be traced
+ by the function tracer. Note, this will likely add noise to function
+ tracing if events and other tracing features are enabled along with
+ function tracing.
+
+ If unsure, say N.
+
config FPROBE
bool "Kernel Function Probe (fprobe)"
depends on HAVE_FUNCTION_GRAPH_FREGS && HAVE_FTRACE_GRAPH_FUNC
@@ -587,6 +601,20 @@ config FTRACE_SYSCALLS
help
Basic tracer to catch the syscall entry and exit events.
+config TRACE_SYSCALL_BUF_SIZE_DEFAULT
+ int "System call user read max size"
+ range 0 165
+ default 63
+ depends on FTRACE_SYSCALLS
+ help
+ Some system call trace events will record the data from a user
+ space address that one of the parameters points to. The amount of
+ data per event is limited. That limit is set by this config and
+ this config also affects how much user space data perf can read.
+
+ For a tracing instance, this size may be changed by writing into
+ its syscall_user_buf_size file.
+
config TRACER_SNAPSHOT
bool "Create a snapshot trace buffer"
select TRACER_MAX_TRACE
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index dcb4e02afc5f..fc5dcc888e13 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -16,6 +16,23 @@ obj-y += trace_selftest_dynamic.o
endif
endif
+# Allow some files to be function traced
+ifdef CONFIG_FUNCTION_SELF_TRACING
+CFLAGS_trace_output.o = $(CC_FLAGS_FTRACE)
+CFLAGS_trace_seq.o = $(CC_FLAGS_FTRACE)
+CFLAGS_trace_stat.o = $(CC_FLAGS_FTRACE)
+CFLAGS_tracing_map.o = $(CC_FLAGS_FTRACE)
+CFLAGS_synth_event_gen_test.o = $(CC_FLAGS_FTRACE)
+CFLAGS_trace_events.o = $(CC_FLAGS_FTRACE)
+CFLAGS_trace_syscalls.o = $(CC_FLAGS_FTRACE)
+CFLAGS_trace_events_filter.o = $(CC_FLAGS_FTRACE)
+CFLAGS_trace_events_trigger.o = $(CC_FLAGS_FTRACE)
+CFLAGS_trace_events_synth.o = $(CC_FLAGS_FTRACE)
+CFLAGS_trace_events_hist.o = $(CC_FLAGS_FTRACE)
+CFLAGS_trace_events_user.o = $(CC_FLAGS_FTRACE)
+CFLAGS_trace_dynevent.o = $(CC_FLAGS_FTRACE)
+endif
+
ifdef CONFIG_FTRACE_STARTUP_TEST
CFLAGS_trace_kprobe_selftest.o = $(CC_FLAGS_FTRACE)
obj-$(CONFIG_KPROBE_EVENTS) += trace_kprobe_selftest.o
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index af8cbc8e1a7c..d031c8d80be4 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -1738,7 +1738,7 @@ static enum print_line_t print_one_line(struct trace_iterator *iter,
t = te_blk_io_trace(iter->ent);
what = (t->action & ((1 << BLK_TC_SHIFT) - 1)) & ~__BLK_TA_CGROUP;
- long_act = !!(tr->trace_flags & TRACE_ITER_VERBOSE);
+ long_act = !!(tr->trace_flags & TRACE_ITER(VERBOSE));
log_action = classic ? &blk_log_action_classic : &blk_log_action;
has_cg = t->action & __BLK_TA_CGROUP;
@@ -1803,9 +1803,9 @@ blk_tracer_set_flag(struct trace_array *tr, u32 old_flags, u32 bit, int set)
/* don't output context-info for blk_classic output */
if (bit == TRACE_BLK_OPT_CLASSIC) {
if (set)
- tr->trace_flags &= ~TRACE_ITER_CONTEXT_INFO;
+ tr->trace_flags &= ~TRACE_ITER(CONTEXT_INFO);
else
- tr->trace_flags |= TRACE_ITER_CONTEXT_INFO;
+ tr->trace_flags |= TRACE_ITER(CONTEXT_INFO);
}
return 0;
}
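The hunks above switch from TRACE_ITER_VERBOSE style constants to a
TRACE_ITER(VERBOSE) macro so that the option mask can grow to 64 bits. A
rough sketch of how such a macro can map option names to 64-bit bits,
assuming a BIT_ULL-style definition (the real macro lives in
kernel/trace/trace.h and is not part of this hunk):

	#include <stdint.h>

	enum trace_iter_bits {
		TRACE_ITER_VERBOSE_BIT,
		TRACE_ITER_CONTEXT_INFO_BIT,
		TRACE_ITER_STACKTRACE_BIT,
		/* ...up to 64 options now fit in the mask... */
	};

	#define BIT_ULL(nr)	(1ULL << (nr))
	#define TRACE_ITER(opt)	BIT_ULL(TRACE_ITER_##opt##_BIT)

	static int verbose_enabled(uint64_t trace_flags)
	{
		/* same test as the blktrace hunk above, on the 64-bit mask */
		return !!(trace_flags & TRACE_ITER(VERBOSE));
	}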
diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
index 484ad7a18463..7fb9b169d6d4 100644
--- a/kernel/trace/fgraph.c
+++ b/kernel/trace/fgraph.c
@@ -498,9 +498,6 @@ found:
return get_data_type_data(current, offset);
}
-/* Both enabled by default (can be cleared by function_graph tracer flags */
-bool fgraph_sleep_time = true;
-
#ifdef CONFIG_DYNAMIC_FTRACE
/*
* archs can override this function if they must do something
@@ -1023,11 +1020,6 @@ void fgraph_init_ops(struct ftrace_ops *dst_ops,
#endif
}
-void ftrace_graph_sleep_time_control(bool enable)
-{
- fgraph_sleep_time = enable;
-}
-
/*
* Simply points to ftrace_stub, but with the proper protocol.
* Defined by the linker script in linux/vmlinux.lds.h
@@ -1098,7 +1090,7 @@ ftrace_graph_probe_sched_switch(void *ignore, bool preempt,
* Does the user want to count the time a function was asleep.
* If so, do not update the time stamps.
*/
- if (fgraph_sleep_time)
+ if (!fgraph_no_sleep_time)
return;
timestamp = trace_clock_local();
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index bbb37c0f8c6c..3ec2033c0774 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -534,7 +534,9 @@ static int function_stat_headers(struct seq_file *m)
static int function_stat_show(struct seq_file *m, void *v)
{
+ struct trace_array *tr = trace_get_global_array();
struct ftrace_profile *rec = v;
+ const char *refsymbol = NULL;
char str[KSYM_SYMBOL_LEN];
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
static struct trace_seq s;
@@ -554,7 +556,29 @@ static int function_stat_show(struct seq_file *m, void *v)
return 0;
#endif
- kallsyms_lookup(rec->ip, NULL, NULL, NULL, str);
+ if (tr->trace_flags & TRACE_ITER(PROF_TEXT_OFFSET)) {
+ unsigned long offset;
+
+ if (core_kernel_text(rec->ip)) {
+ refsymbol = "_text";
+ offset = rec->ip - (unsigned long)_text;
+ } else {
+ struct module *mod;
+
+ guard(rcu)();
+ mod = __module_text_address(rec->ip);
+ if (mod) {
+ refsymbol = mod->name;
+ /* Calculate offset from module's text entry address. */
+ offset = rec->ip - (unsigned long)mod->mem[MOD_TEXT].base;
+ }
+ }
+ if (refsymbol)
+ snprintf(str, sizeof(str), " %s+%#lx", refsymbol, offset);
+ }
+ if (!refsymbol)
+ kallsyms_lookup(rec->ip, NULL, NULL, NULL, str);
+
seq_printf(m, " %-30.30s %10lu", str, rec->counter);
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
@@ -838,6 +862,8 @@ static int profile_graph_entry(struct ftrace_graph_ent *trace,
return 1;
}
+bool fprofile_no_sleep_time;
+
static void profile_graph_return(struct ftrace_graph_ret *trace,
struct fgraph_ops *gops,
struct ftrace_regs *fregs)
@@ -863,7 +889,7 @@ static void profile_graph_return(struct ftrace_graph_ret *trace,
calltime = rettime - profile_data->calltime;
- if (!fgraph_sleep_time) {
+ if (fprofile_no_sleep_time) {
if (current->ftrace_sleeptime)
calltime -= current->ftrace_sleeptime - profile_data->sleeptime;
}
@@ -6075,7 +6101,7 @@ int register_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
new_hash = NULL;
ops->func = call_direct_funcs;
- ops->flags = MULTI_FLAGS;
+ ops->flags |= MULTI_FLAGS;
ops->trampoline = FTRACE_REGS_ADDR;
ops->direct_call = addr;
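With the PROF_TEXT_OFFSET option set, the profiler hunk above prints
"_text+0x..." (or "module+0x...") instead of a bare symbol name. A hedged
user-space sketch of how a tool might resolve such a line back to a unique
symbol by walking /proc/kallsyms (requires root or kptr_restrict=0 so the
addresses are not zeroed; error handling kept minimal):

	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>

	int main(int argc, char **argv)
	{
		unsigned long long offset = argc > 1 ? strtoull(argv[1], NULL, 0) : 0;
		unsigned long long text = 0, best_addr = 0, addr;
		char best[256] = "?", sym[256];
		char type;
		FILE *f = fopen("/proc/kallsyms", "r");

		if (!f)
			return 1;

		/* first pass: find the base address of _text */
		while (fscanf(f, "%llx %c %255s%*[^\n]", &addr, &type, sym) == 3) {
			if (!strcmp(sym, "_text")) {
				text = addr;
				break;
			}
		}

		/*
		 * Second pass: the symbol containing _text + offset is the one
		 * with the greatest start address that is still <= the target.
		 */
		rewind(f);
		while (fscanf(f, "%llx %c %255s%*[^\n]", &addr, &type, sym) == 3) {
			if (addr <= text + offset && addr >= best_addr) {
				best_addr = addr;
				strcpy(best, sym);
			}
		}
		fclose(f);

		printf("_text+%#llx -> %s+%#llx\n", offset, best,
		       text + offset - best_addr);
		return 0;
	}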
diff --git a/kernel/trace/pid_list.c b/kernel/trace/pid_list.c
index 090bb5ea4a19..dbee72d69d0a 100644
--- a/kernel/trace/pid_list.c
+++ b/kernel/trace/pid_list.c
@@ -3,6 +3,7 @@
* Copyright (C) 2021 VMware Inc, Steven Rostedt <rostedt@goodmis.org>
*/
#include <linux/spinlock.h>
+#include <linux/seqlock.h>
#include <linux/irq_work.h>
#include <linux/slab.h>
#include "trace.h"
@@ -126,7 +127,7 @@ bool trace_pid_list_is_set(struct trace_pid_list *pid_list, unsigned int pid)
{
union upper_chunk *upper_chunk;
union lower_chunk *lower_chunk;
- unsigned long flags;
+ unsigned int seq;
unsigned int upper1;
unsigned int upper2;
unsigned int lower;
@@ -138,14 +139,16 @@ bool trace_pid_list_is_set(struct trace_pid_list *pid_list, unsigned int pid)
if (pid_split(pid, &upper1, &upper2, &lower) < 0)
return false;
- raw_spin_lock_irqsave(&pid_list->lock, flags);
- upper_chunk = pid_list->upper[upper1];
- if (upper_chunk) {
- lower_chunk = upper_chunk->data[upper2];
- if (lower_chunk)
- ret = test_bit(lower, lower_chunk->data);
- }
- raw_spin_unlock_irqrestore(&pid_list->lock, flags);
+ do {
+ seq = read_seqcount_begin(&pid_list->seqcount);
+ ret = false;
+ upper_chunk = pid_list->upper[upper1];
+ if (upper_chunk) {
+ lower_chunk = upper_chunk->data[upper2];
+ if (lower_chunk)
+ ret = test_bit(lower, lower_chunk->data);
+ }
+ } while (read_seqcount_retry(&pid_list->seqcount, seq));
return ret;
}
@@ -178,6 +181,7 @@ int trace_pid_list_set(struct trace_pid_list *pid_list, unsigned int pid)
return -EINVAL;
raw_spin_lock_irqsave(&pid_list->lock, flags);
+ write_seqcount_begin(&pid_list->seqcount);
upper_chunk = pid_list->upper[upper1];
if (!upper_chunk) {
upper_chunk = get_upper_chunk(pid_list);
@@ -199,6 +203,7 @@ int trace_pid_list_set(struct trace_pid_list *pid_list, unsigned int pid)
set_bit(lower, lower_chunk->data);
ret = 0;
out:
+ write_seqcount_end(&pid_list->seqcount);
raw_spin_unlock_irqrestore(&pid_list->lock, flags);
return ret;
}
@@ -230,6 +235,7 @@ int trace_pid_list_clear(struct trace_pid_list *pid_list, unsigned int pid)
return -EINVAL;
raw_spin_lock_irqsave(&pid_list->lock, flags);
+ write_seqcount_begin(&pid_list->seqcount);
upper_chunk = pid_list->upper[upper1];
if (!upper_chunk)
goto out;
@@ -250,6 +256,7 @@ int trace_pid_list_clear(struct trace_pid_list *pid_list, unsigned int pid)
}
}
out:
+ write_seqcount_end(&pid_list->seqcount);
raw_spin_unlock_irqrestore(&pid_list->lock, flags);
return 0;
}
@@ -340,8 +347,10 @@ static void pid_list_refill_irq(struct irq_work *iwork)
again:
raw_spin_lock(&pid_list->lock);
+ write_seqcount_begin(&pid_list->seqcount);
upper_count = CHUNK_ALLOC - pid_list->free_upper_chunks;
lower_count = CHUNK_ALLOC - pid_list->free_lower_chunks;
+ write_seqcount_end(&pid_list->seqcount);
raw_spin_unlock(&pid_list->lock);
if (upper_count <= 0 && lower_count <= 0)
@@ -370,6 +379,7 @@ static void pid_list_refill_irq(struct irq_work *iwork)
}
raw_spin_lock(&pid_list->lock);
+ write_seqcount_begin(&pid_list->seqcount);
if (upper) {
*upper_next = pid_list->upper_list;
pid_list->upper_list = upper;
@@ -380,6 +390,7 @@ static void pid_list_refill_irq(struct irq_work *iwork)
pid_list->lower_list = lower;
pid_list->free_lower_chunks += lcnt;
}
+ write_seqcount_end(&pid_list->seqcount);
raw_spin_unlock(&pid_list->lock);
/*
@@ -419,6 +430,7 @@ struct trace_pid_list *trace_pid_list_alloc(void)
init_irq_work(&pid_list->refill_irqwork, pid_list_refill_irq);
raw_spin_lock_init(&pid_list->lock);
+ seqcount_raw_spinlock_init(&pid_list->seqcount, &pid_list->lock);
for (i = 0; i < CHUNK_ALLOC; i++) {
union upper_chunk *chunk;
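The pid_list.c hunks above drop the reader-side spin lock in favor of a
seqcount retry loop: writers still serialize on the lock and bump the
sequence count, while readers simply re-read if a writer slipped in. A
distilled user-space model of that pattern, using C11 atomics in place of
the kernel's seqcount_raw_spinlock_t (sequentially consistent operations
for simplicity; the flat bitmap is a stand-in for the real chunked one,
pids below 4096 assumed):

	#include <pthread.h>
	#include <stdatomic.h>
	#include <stdbool.h>

	static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
	static atomic_uint seq;			/* even = stable, odd = write in progress */
	static _Atomic unsigned long bits[64];	/* stand-in for the pid bitmap chunks */

	static void pid_set(unsigned int pid)
	{
		pthread_mutex_lock(&lock);	/* writers still serialize on the lock */
		atomic_fetch_add(&seq, 1);	/* write_seqcount_begin() */
		atomic_fetch_or(&bits[pid / 64], 1UL << (pid % 64));
		atomic_fetch_add(&seq, 1);	/* write_seqcount_end() */
		pthread_mutex_unlock(&lock);
	}

	static bool pid_is_set(unsigned int pid)
	{
		unsigned int start;
		bool ret;

		do {
			/* read_seqcount_begin(): wait until no write is in progress */
			while ((start = atomic_load(&seq)) & 1)
				;
			ret = atomic_load(&bits[pid / 64]) & (1UL << (pid % 64));
			/* read_seqcount_retry(): redo the read if a writer slipped in */
		} while (atomic_load(&seq) != start);

		return ret;
	}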
diff --git a/kernel/trace/pid_list.h b/kernel/trace/pid_list.h
index 62e73f1ac85f..0b45fb0eadb9 100644
--- a/kernel/trace/pid_list.h
+++ b/kernel/trace/pid_list.h
@@ -76,6 +76,7 @@ union upper_chunk {
};
struct trace_pid_list {
+ seqcount_raw_spinlock_t seqcount;
raw_spinlock_t lock;
struct irq_work refill_irqwork;
union upper_chunk *upper[UPPER1_SIZE]; // 1 or 2K in size
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 304e93597126..ed5eddb08ef3 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -20,6 +20,7 @@
#include <linux/security.h>
#include <linux/seq_file.h>
#include <linux/irqflags.h>
+#include <linux/syscalls.h>
#include <linux/debugfs.h>
#include <linux/tracefs.h>
#include <linux/pagemap.h>
@@ -93,17 +94,13 @@ static bool tracepoint_printk_stop_on_boot __initdata;
static bool traceoff_after_boot __initdata;
static DEFINE_STATIC_KEY_FALSE(tracepoint_printk_key);
-/* For tracers that don't implement custom flags */
-static struct tracer_opt dummy_tracer_opt[] = {
- { }
+/* Store tracers and their flags per instance */
+struct tracers {
+ struct list_head list;
+ struct tracer *tracer;
+ struct tracer_flags *flags;
};
-static int
-dummy_set_flag(struct trace_array *tr, u32 old_flags, u32 bit, int set)
-{
- return 0;
-}
-
/*
* To prevent the comm cache from being overwritten when no
* tracing is active, only save the comm when a trace event
@@ -512,22 +509,23 @@ EXPORT_SYMBOL_GPL(unregister_ftrace_export);
/* trace_flags holds trace_options default values */
#define TRACE_DEFAULT_FLAGS \
- (FUNCTION_DEFAULT_FLAGS | \
- TRACE_ITER_PRINT_PARENT | TRACE_ITER_PRINTK | \
- TRACE_ITER_ANNOTATE | TRACE_ITER_CONTEXT_INFO | \
- TRACE_ITER_RECORD_CMD | TRACE_ITER_OVERWRITE | \
- TRACE_ITER_IRQ_INFO | TRACE_ITER_MARKERS | \
- TRACE_ITER_HASH_PTR | TRACE_ITER_TRACE_PRINTK | \
- TRACE_ITER_COPY_MARKER)
+ (FUNCTION_DEFAULT_FLAGS | FPROFILE_DEFAULT_FLAGS | \
+ TRACE_ITER(PRINT_PARENT) | TRACE_ITER(PRINTK) | \
+ TRACE_ITER(ANNOTATE) | TRACE_ITER(CONTEXT_INFO) | \
+ TRACE_ITER(RECORD_CMD) | TRACE_ITER(OVERWRITE) | \
+ TRACE_ITER(IRQ_INFO) | TRACE_ITER(MARKERS) | \
+ TRACE_ITER(HASH_PTR) | TRACE_ITER(TRACE_PRINTK) | \
+ TRACE_ITER(COPY_MARKER))
/* trace_options that are only supported by global_trace */
-#define TOP_LEVEL_TRACE_FLAGS (TRACE_ITER_PRINTK | \
- TRACE_ITER_PRINTK_MSGONLY | TRACE_ITER_RECORD_CMD)
+#define TOP_LEVEL_TRACE_FLAGS (TRACE_ITER(PRINTK) | \
+ TRACE_ITER(PRINTK_MSGONLY) | TRACE_ITER(RECORD_CMD) | \
+ TRACE_ITER(PROF_TEXT_OFFSET) | FPROFILE_DEFAULT_FLAGS)
/* trace_flags that are default zero for instances */
#define ZEROED_TRACE_FLAGS \
- (TRACE_ITER_EVENT_FORK | TRACE_ITER_FUNC_FORK | TRACE_ITER_TRACE_PRINTK | \
- TRACE_ITER_COPY_MARKER)
+ (TRACE_ITER(EVENT_FORK) | TRACE_ITER(FUNC_FORK) | TRACE_ITER(TRACE_PRINTK) | \
+ TRACE_ITER(COPY_MARKER))
/*
* The global_trace is the descriptor that holds the top-level tracing
@@ -558,9 +556,9 @@ static void update_printk_trace(struct trace_array *tr)
if (printk_trace == tr)
return;
- printk_trace->trace_flags &= ~TRACE_ITER_TRACE_PRINTK;
+ printk_trace->trace_flags &= ~TRACE_ITER(TRACE_PRINTK);
printk_trace = tr;
- tr->trace_flags |= TRACE_ITER_TRACE_PRINTK;
+ tr->trace_flags |= TRACE_ITER(TRACE_PRINTK);
}
/* Returns true if the status of tr changed */
@@ -573,7 +571,7 @@ static bool update_marker_trace(struct trace_array *tr, int enabled)
return false;
list_add_rcu(&tr->marker_list, &marker_copies);
- tr->trace_flags |= TRACE_ITER_COPY_MARKER;
+ tr->trace_flags |= TRACE_ITER(COPY_MARKER);
return true;
}
@@ -581,7 +579,7 @@ static bool update_marker_trace(struct trace_array *tr, int enabled)
return false;
list_del_init(&tr->marker_list);
- tr->trace_flags &= ~TRACE_ITER_COPY_MARKER;
+ tr->trace_flags &= ~TRACE_ITER(COPY_MARKER);
return true;
}
@@ -1139,7 +1137,7 @@ int __trace_array_puts(struct trace_array *tr, unsigned long ip,
unsigned int trace_ctx;
int alloc;
- if (!(tr->trace_flags & TRACE_ITER_PRINTK))
+ if (!(tr->trace_flags & TRACE_ITER(PRINTK)))
return 0;
if (unlikely(tracing_selftest_running && tr == &global_trace))
@@ -1205,7 +1203,7 @@ int __trace_bputs(unsigned long ip, const char *str)
if (!printk_binsafe(tr))
return __trace_puts(ip, str, strlen(str));
- if (!(tr->trace_flags & TRACE_ITER_PRINTK))
+ if (!(tr->trace_flags & TRACE_ITER(PRINTK)))
return 0;
if (unlikely(tracing_selftest_running || tracing_disabled))
@@ -2173,6 +2171,7 @@ static int save_selftest(struct tracer *type)
static int run_tracer_selftest(struct tracer *type)
{
struct trace_array *tr = &global_trace;
+ struct tracer_flags *saved_flags = tr->current_trace_flags;
struct tracer *saved_tracer = tr->current_trace;
int ret;
@@ -2203,6 +2202,7 @@ static int run_tracer_selftest(struct tracer *type)
tracing_reset_online_cpus(&tr->array_buffer);
tr->current_trace = type;
+ tr->current_trace_flags = type->flags ? : type->default_flags;
#ifdef CONFIG_TRACER_MAX_TRACE
if (type->use_max_tr) {
@@ -2219,6 +2219,7 @@ static int run_tracer_selftest(struct tracer *type)
ret = type->selftest(type, tr);
/* the test is responsible for resetting too */
tr->current_trace = saved_tracer;
+ tr->current_trace_flags = saved_flags;
if (ret) {
printk(KERN_CONT "FAILED!\n");
/* Add the warning after printing 'FAILED' */
@@ -2311,10 +2312,23 @@ static inline int do_run_tracer_selftest(struct tracer *type)
}
#endif /* CONFIG_FTRACE_STARTUP_TEST */
-static void add_tracer_options(struct trace_array *tr, struct tracer *t);
+static int add_tracer(struct trace_array *tr, struct tracer *t);
static void __init apply_trace_boot_options(void);
+static void free_tracers(struct trace_array *tr)
+{
+ struct tracers *t, *n;
+
+ lockdep_assert_held(&trace_types_lock);
+
+ list_for_each_entry_safe(t, n, &tr->tracers, list) {
+ list_del(&t->list);
+ kfree(t->flags);
+ kfree(t);
+ }
+}
+
/**
* register_tracer - register a tracer with the ftrace system.
* @type: the plugin for the tracer
@@ -2323,6 +2337,7 @@ static void __init apply_trace_boot_options(void);
*/
int __init register_tracer(struct tracer *type)
{
+ struct trace_array *tr;
struct tracer *t;
int ret = 0;
@@ -2354,31 +2369,25 @@ int __init register_tracer(struct tracer *type)
}
}
- if (!type->set_flag)
- type->set_flag = &dummy_set_flag;
- if (!type->flags) {
- /*allocate a dummy tracer_flags*/
- type->flags = kmalloc(sizeof(*type->flags), GFP_KERNEL);
- if (!type->flags) {
- ret = -ENOMEM;
- goto out;
- }
- type->flags->val = 0;
- type->flags->opts = dummy_tracer_opt;
- } else
- if (!type->flags->opts)
- type->flags->opts = dummy_tracer_opt;
-
/* store the tracer for __set_tracer_option */
- type->flags->trace = type;
+ if (type->flags)
+ type->flags->trace = type;
ret = do_run_tracer_selftest(type);
if (ret < 0)
goto out;
+ list_for_each_entry(tr, &ftrace_trace_arrays, list) {
+ ret = add_tracer(tr, type);
+ if (ret < 0) {
+ /* The tracer will still exist but without options */
+ pr_warn("Failed to create tracer options for %s\n", type->name);
+ break;
+ }
+ }
+
type->next = trace_types;
trace_types = type;
- add_tracer_options(&global_trace, type);
out:
mutex_unlock(&trace_types_lock);
@@ -2391,7 +2400,7 @@ int __init register_tracer(struct tracer *type)
printk(KERN_INFO "Starting tracer '%s'\n", type->name);
/* Do we want this tracer to start on bootup? */
- tracing_set_tracer(&global_trace, type->name);
+ WARN_ON(tracing_set_tracer(&global_trace, type->name) < 0);
default_bootup_tracer = NULL;
apply_trace_boot_options();
@@ -3078,7 +3087,7 @@ static inline void ftrace_trace_stack(struct trace_array *tr,
unsigned int trace_ctx,
int skip, struct pt_regs *regs)
{
- if (!(tr->trace_flags & TRACE_ITER_STACKTRACE))
+ if (!(tr->trace_flags & TRACE_ITER(STACKTRACE)))
return;
__ftrace_trace_stack(tr, buffer, trace_ctx, skip, regs);
@@ -3139,7 +3148,7 @@ ftrace_trace_userstack(struct trace_array *tr,
struct ring_buffer_event *event;
struct userstack_entry *entry;
- if (!(tr->trace_flags & TRACE_ITER_USERSTACKTRACE))
+ if (!(tr->trace_flags & TRACE_ITER(USERSTACKTRACE)))
return;
/*
@@ -3484,7 +3493,7 @@ int trace_array_printk(struct trace_array *tr,
if (tr == &global_trace)
return 0;
- if (!(tr->trace_flags & TRACE_ITER_PRINTK))
+ if (!(tr->trace_flags & TRACE_ITER(PRINTK)))
return 0;
va_start(ap, fmt);
@@ -3521,7 +3530,7 @@ int trace_array_printk_buf(struct trace_buffer *buffer,
int ret;
va_list ap;
- if (!(printk_trace->trace_flags & TRACE_ITER_PRINTK))
+ if (!(printk_trace->trace_flags & TRACE_ITER(PRINTK)))
return 0;
va_start(ap, fmt);
@@ -3791,7 +3800,7 @@ const char *trace_event_format(struct trace_iterator *iter, const char *fmt)
if (WARN_ON_ONCE(!fmt))
return fmt;
- if (!iter->tr || iter->tr->trace_flags & TRACE_ITER_HASH_PTR)
+ if (!iter->tr || iter->tr->trace_flags & TRACE_ITER(HASH_PTR))
return fmt;
p = fmt;
@@ -4113,7 +4122,7 @@ static void print_event_info(struct array_buffer *buf, struct seq_file *m)
static void print_func_help_header(struct array_buffer *buf, struct seq_file *m,
unsigned int flags)
{
- bool tgid = flags & TRACE_ITER_RECORD_TGID;
+ bool tgid = flags & TRACE_ITER(RECORD_TGID);
print_event_info(buf, m);
@@ -4124,7 +4133,7 @@ static void print_func_help_header(struct array_buffer *buf, struct seq_file *m,
static void print_func_help_header_irq(struct array_buffer *buf, struct seq_file *m,
unsigned int flags)
{
- bool tgid = flags & TRACE_ITER_RECORD_TGID;
+ bool tgid = flags & TRACE_ITER(RECORD_TGID);
static const char space[] = " ";
int prec = tgid ? 12 : 2;
@@ -4197,7 +4206,7 @@ static void test_cpu_buff_start(struct trace_iterator *iter)
struct trace_seq *s = &iter->seq;
struct trace_array *tr = iter->tr;
- if (!(tr->trace_flags & TRACE_ITER_ANNOTATE))
+ if (!(tr->trace_flags & TRACE_ITER(ANNOTATE)))
return;
if (!(iter->iter_flags & TRACE_FILE_ANNOTATE))
@@ -4219,6 +4228,22 @@ static void test_cpu_buff_start(struct trace_iterator *iter)
iter->cpu);
}
+#ifdef CONFIG_FTRACE_SYSCALLS
+static bool is_syscall_event(struct trace_event *event)
+{
+ return (event->funcs == &enter_syscall_print_funcs) ||
+ (event->funcs == &exit_syscall_print_funcs);
+
+}
+#define syscall_buf_size CONFIG_TRACE_SYSCALL_BUF_SIZE_DEFAULT
+#else
+static inline bool is_syscall_event(struct trace_event *event)
+{
+ return false;
+}
+#define syscall_buf_size 0
+#endif /* CONFIG_FTRACE_SYSCALLS */
+
static enum print_line_t print_trace_fmt(struct trace_iterator *iter)
{
struct trace_array *tr = iter->tr;
@@ -4233,7 +4258,7 @@ static enum print_line_t print_trace_fmt(struct trace_iterator *iter)
event = ftrace_find_event(entry->type);
- if (tr->trace_flags & TRACE_ITER_CONTEXT_INFO) {
+ if (tr->trace_flags & TRACE_ITER(CONTEXT_INFO)) {
if (iter->iter_flags & TRACE_FILE_LAT_FMT)
trace_print_lat_context(iter);
else
@@ -4244,17 +4269,19 @@ static enum print_line_t print_trace_fmt(struct trace_iterator *iter)
return TRACE_TYPE_PARTIAL_LINE;
if (event) {
- if (tr->trace_flags & TRACE_ITER_FIELDS)
+ if (tr->trace_flags & TRACE_ITER(FIELDS))
return print_event_fields(iter, event);
/*
* For TRACE_EVENT() events, the print_fmt is not
* safe to use if the array has delta offsets
* Force printing via the fields.
*/
- if ((tr->text_delta) &&
- event->type > __TRACE_LAST_TYPE)
+ if ((tr->text_delta)) {
+ /* ftrace and system call events are still OK */
+ if ((event->type > __TRACE_LAST_TYPE) &&
+ !is_syscall_event(event))
return print_event_fields(iter, event);
-
+ }
return event->funcs->trace(iter, sym_flags, event);
}
@@ -4272,7 +4299,7 @@ static enum print_line_t print_raw_fmt(struct trace_iterator *iter)
entry = iter->ent;
- if (tr->trace_flags & TRACE_ITER_CONTEXT_INFO)
+ if (tr->trace_flags & TRACE_ITER(CONTEXT_INFO))
trace_seq_printf(s, "%d %d %llu ",
entry->pid, iter->cpu, iter->ts);
@@ -4298,7 +4325,7 @@ static enum print_line_t print_hex_fmt(struct trace_iterator *iter)
entry = iter->ent;
- if (tr->trace_flags & TRACE_ITER_CONTEXT_INFO) {
+ if (tr->trace_flags & TRACE_ITER(CONTEXT_INFO)) {
SEQ_PUT_HEX_FIELD(s, entry->pid);
SEQ_PUT_HEX_FIELD(s, iter->cpu);
SEQ_PUT_HEX_FIELD(s, iter->ts);
@@ -4327,7 +4354,7 @@ static enum print_line_t print_bin_fmt(struct trace_iterator *iter)
entry = iter->ent;
- if (tr->trace_flags & TRACE_ITER_CONTEXT_INFO) {
+ if (tr->trace_flags & TRACE_ITER(CONTEXT_INFO)) {
SEQ_PUT_FIELD(s, entry->pid);
SEQ_PUT_FIELD(s, iter->cpu);
SEQ_PUT_FIELD(s, iter->ts);
@@ -4398,27 +4425,27 @@ enum print_line_t print_trace_line(struct trace_iterator *iter)
}
if (iter->ent->type == TRACE_BPUTS &&
- trace_flags & TRACE_ITER_PRINTK &&
- trace_flags & TRACE_ITER_PRINTK_MSGONLY)
+ trace_flags & TRACE_ITER(PRINTK) &&
+ trace_flags & TRACE_ITER(PRINTK_MSGONLY))
return trace_print_bputs_msg_only(iter);
if (iter->ent->type == TRACE_BPRINT &&
- trace_flags & TRACE_ITER_PRINTK &&
- trace_flags & TRACE_ITER_PRINTK_MSGONLY)
+ trace_flags & TRACE_ITER(PRINTK) &&
+ trace_flags & TRACE_ITER(PRINTK_MSGONLY))
return trace_print_bprintk_msg_only(iter);
if (iter->ent->type == TRACE_PRINT &&
- trace_flags & TRACE_ITER_PRINTK &&
- trace_flags & TRACE_ITER_PRINTK_MSGONLY)
+ trace_flags & TRACE_ITER(PRINTK) &&
+ trace_flags & TRACE_ITER(PRINTK_MSGONLY))
return trace_print_printk_msg_only(iter);
- if (trace_flags & TRACE_ITER_BIN)
+ if (trace_flags & TRACE_ITER(BIN))
return print_bin_fmt(iter);
- if (trace_flags & TRACE_ITER_HEX)
+ if (trace_flags & TRACE_ITER(HEX))
return print_hex_fmt(iter);
- if (trace_flags & TRACE_ITER_RAW)
+ if (trace_flags & TRACE_ITER(RAW))
return print_raw_fmt(iter);
return print_trace_fmt(iter);
@@ -4436,7 +4463,7 @@ void trace_latency_header(struct seq_file *m)
if (iter->iter_flags & TRACE_FILE_LAT_FMT)
print_trace_header(m, iter);
- if (!(tr->trace_flags & TRACE_ITER_VERBOSE))
+ if (!(tr->trace_flags & TRACE_ITER(VERBOSE)))
print_lat_help_header(m);
}
@@ -4446,7 +4473,7 @@ void trace_default_header(struct seq_file *m)
struct trace_array *tr = iter->tr;
unsigned long trace_flags = tr->trace_flags;
- if (!(trace_flags & TRACE_ITER_CONTEXT_INFO))
+ if (!(trace_flags & TRACE_ITER(CONTEXT_INFO)))
return;
if (iter->iter_flags & TRACE_FILE_LAT_FMT) {
@@ -4454,11 +4481,11 @@ void trace_default_header(struct seq_file *m)
if (trace_empty(iter))
return;
print_trace_header(m, iter);
- if (!(trace_flags & TRACE_ITER_VERBOSE))
+ if (!(trace_flags & TRACE_ITER(VERBOSE)))
print_lat_help_header(m);
} else {
- if (!(trace_flags & TRACE_ITER_VERBOSE)) {
- if (trace_flags & TRACE_ITER_IRQ_INFO)
+ if (!(trace_flags & TRACE_ITER(VERBOSE))) {
+ if (trace_flags & TRACE_ITER(IRQ_INFO))
print_func_help_header_irq(iter->array_buffer,
m, trace_flags);
else
@@ -4682,7 +4709,7 @@ __tracing_open(struct inode *inode, struct file *file, bool snapshot)
* If pause-on-trace is enabled, then stop the trace while
* dumping, unless this is the "snapshot" file
*/
- if (!iter->snapshot && (tr->trace_flags & TRACE_ITER_PAUSE_ON_TRACE))
+ if (!iter->snapshot && (tr->trace_flags & TRACE_ITER(PAUSE_ON_TRACE)))
tracing_stop_tr(tr);
if (iter->cpu_file == RING_BUFFER_ALL_CPUS) {
@@ -4876,7 +4903,7 @@ static int tracing_open(struct inode *inode, struct file *file)
iter = __tracing_open(inode, file, false);
if (IS_ERR(iter))
ret = PTR_ERR(iter);
- else if (tr->trace_flags & TRACE_ITER_LATENCY_FMT)
+ else if (tr->trace_flags & TRACE_ITER(LATENCY_FMT))
iter->iter_flags |= TRACE_FILE_LAT_FMT;
}
@@ -5139,21 +5166,26 @@ static int tracing_trace_options_show(struct seq_file *m, void *v)
{
struct tracer_opt *trace_opts;
struct trace_array *tr = m->private;
+ struct tracer_flags *flags;
u32 tracer_flags;
int i;
guard(mutex)(&trace_types_lock);
- tracer_flags = tr->current_trace->flags->val;
- trace_opts = tr->current_trace->flags->opts;
-
for (i = 0; trace_options[i]; i++) {
- if (tr->trace_flags & (1 << i))
+ if (tr->trace_flags & (1ULL << i))
seq_printf(m, "%s\n", trace_options[i]);
else
seq_printf(m, "no%s\n", trace_options[i]);
}
+ flags = tr->current_trace_flags;
+ if (!flags || !flags->opts)
+ return 0;
+
+ tracer_flags = flags->val;
+ trace_opts = flags->opts;
+
for (i = 0; trace_opts[i].name; i++) {
if (tracer_flags & trace_opts[i].bit)
seq_printf(m, "%s\n", trace_opts[i].name);
@@ -5169,9 +5201,10 @@ static int __set_tracer_option(struct trace_array *tr,
struct tracer_opt *opts, int neg)
{
struct tracer *trace = tracer_flags->trace;
- int ret;
+ int ret = 0;
- ret = trace->set_flag(tr, tracer_flags->val, opts->bit, !neg);
+ if (trace->set_flag)
+ ret = trace->set_flag(tr, tracer_flags->val, opts->bit, !neg);
if (ret)
return ret;
@@ -5185,37 +5218,41 @@ static int __set_tracer_option(struct trace_array *tr,
/* Try to assign a tracer specific option */
static int set_tracer_option(struct trace_array *tr, char *cmp, int neg)
{
- struct tracer *trace = tr->current_trace;
- struct tracer_flags *tracer_flags = trace->flags;
+ struct tracer_flags *tracer_flags = tr->current_trace_flags;
struct tracer_opt *opts = NULL;
int i;
+ if (!tracer_flags || !tracer_flags->opts)
+ return 0;
+
for (i = 0; tracer_flags->opts[i].name; i++) {
opts = &tracer_flags->opts[i];
if (strcmp(cmp, opts->name) == 0)
- return __set_tracer_option(tr, trace->flags, opts, neg);
+ return __set_tracer_option(tr, tracer_flags, opts, neg);
}
return -EINVAL;
}
/* Some tracers require overwrite to stay enabled */
-int trace_keep_overwrite(struct tracer *tracer, u32 mask, int set)
+int trace_keep_overwrite(struct tracer *tracer, u64 mask, int set)
{
- if (tracer->enabled && (mask & TRACE_ITER_OVERWRITE) && !set)
+ if (tracer->enabled && (mask & TRACE_ITER(OVERWRITE)) && !set)
return -1;
return 0;
}
-int set_tracer_flag(struct trace_array *tr, unsigned int mask, int enabled)
+int set_tracer_flag(struct trace_array *tr, u64 mask, int enabled)
{
- if ((mask == TRACE_ITER_RECORD_TGID) ||
- (mask == TRACE_ITER_RECORD_CMD) ||
- (mask == TRACE_ITER_TRACE_PRINTK) ||
- (mask == TRACE_ITER_COPY_MARKER))
+ switch (mask) {
+ case TRACE_ITER(RECORD_TGID):
+ case TRACE_ITER(RECORD_CMD):
+ case TRACE_ITER(TRACE_PRINTK):
+ case TRACE_ITER(COPY_MARKER):
lockdep_assert_held(&event_mutex);
+ }
/* do nothing if flag is already set */
if (!!(tr->trace_flags & mask) == !!enabled)
@@ -5226,7 +5263,8 @@ int set_tracer_flag(struct trace_array *tr, unsigned int mask, int enabled)
if (tr->current_trace->flag_changed(tr, mask, !!enabled))
return -EINVAL;
- if (mask == TRACE_ITER_TRACE_PRINTK) {
+ switch (mask) {
+ case TRACE_ITER(TRACE_PRINTK):
if (enabled) {
update_printk_trace(tr);
} else {
@@ -5243,45 +5281,59 @@ int set_tracer_flag(struct trace_array *tr, unsigned int mask, int enabled)
if (printk_trace == tr)
update_printk_trace(&global_trace);
}
- }
+ break;
- if (mask == TRACE_ITER_COPY_MARKER)
+ case TRACE_ITER(COPY_MARKER):
update_marker_trace(tr, enabled);
+ /* update_marker_trace updates the tr->trace_flags */
+ return 0;
+ }
if (enabled)
tr->trace_flags |= mask;
else
tr->trace_flags &= ~mask;
- if (mask == TRACE_ITER_RECORD_CMD)
+ switch (mask) {
+ case TRACE_ITER(RECORD_CMD):
trace_event_enable_cmd_record(enabled);
+ break;
- if (mask == TRACE_ITER_RECORD_TGID) {
+ case TRACE_ITER(RECORD_TGID):
if (trace_alloc_tgid_map() < 0) {
- tr->trace_flags &= ~TRACE_ITER_RECORD_TGID;
+ tr->trace_flags &= ~TRACE_ITER(RECORD_TGID);
return -ENOMEM;
}
trace_event_enable_tgid_record(enabled);
- }
+ break;
- if (mask == TRACE_ITER_EVENT_FORK)
+ case TRACE_ITER(EVENT_FORK):
trace_event_follow_fork(tr, enabled);
+ break;
- if (mask == TRACE_ITER_FUNC_FORK)
+ case TRACE_ITER(FUNC_FORK):
ftrace_pid_follow_fork(tr, enabled);
+ break;
- if (mask == TRACE_ITER_OVERWRITE) {
+ case TRACE_ITER(OVERWRITE):
ring_buffer_change_overwrite(tr->array_buffer.buffer, enabled);
#ifdef CONFIG_TRACER_MAX_TRACE
ring_buffer_change_overwrite(tr->max_buffer.buffer, enabled);
#endif
- }
+ break;
- if (mask == TRACE_ITER_PRINTK) {
+ case TRACE_ITER(PRINTK):
trace_printk_start_stop_comm(enabled);
trace_printk_control(enabled);
+ break;
+
+#if defined(CONFIG_FUNCTION_PROFILER) && defined(CONFIG_FUNCTION_GRAPH_TRACER)
+ case TRACE_GRAPH_GRAPH_TIME:
+ ftrace_graph_graph_time_control(enabled);
+ break;
+#endif
}
return 0;
@@ -5311,7 +5363,7 @@ int trace_set_options(struct trace_array *tr, char *option)
if (ret < 0)
ret = set_tracer_option(tr, cmp, neg);
else
- ret = set_tracer_flag(tr, 1 << ret, !neg);
+ ret = set_tracer_flag(tr, 1ULL << ret, !neg);
mutex_unlock(&trace_types_lock);
mutex_unlock(&event_mutex);
@@ -6215,11 +6267,6 @@ int tracing_update_buffers(struct trace_array *tr)
return ret;
}
-struct trace_option_dentry;
-
-static void
-create_trace_option_files(struct trace_array *tr, struct tracer *tracer);
-
/*
* Used to clear out the tracer before deletion of an instance.
* Must have trace_types_lock held.
@@ -6235,26 +6282,15 @@ static void tracing_set_nop(struct trace_array *tr)
tr->current_trace->reset(tr);
tr->current_trace = &nop_trace;
+ tr->current_trace_flags = nop_trace.flags;
}
static bool tracer_options_updated;
-static void add_tracer_options(struct trace_array *tr, struct tracer *t)
-{
- /* Only enable if the directory has been created already. */
- if (!tr->dir && !(tr->flags & TRACE_ARRAY_FL_GLOBAL))
- return;
-
- /* Only create trace option files after update_tracer_options finish */
- if (!tracer_options_updated)
- return;
-
- create_trace_option_files(tr, t);
-}
-
int tracing_set_tracer(struct trace_array *tr, const char *buf)
{
- struct tracer *t;
+ struct tracer *trace = NULL;
+ struct tracers *t;
#ifdef CONFIG_TRACER_MAX_TRACE
bool had_max_tr;
#endif
@@ -6272,18 +6308,20 @@ int tracing_set_tracer(struct trace_array *tr, const char *buf)
ret = 0;
}
- for (t = trace_types; t; t = t->next) {
- if (strcmp(t->name, buf) == 0)
+ list_for_each_entry(t, &tr->tracers, list) {
+ if (strcmp(t->tracer->name, buf) == 0) {
+ trace = t->tracer;
break;
+ }
}
- if (!t)
+ if (!trace)
return -EINVAL;
- if (t == tr->current_trace)
+ if (trace == tr->current_trace)
return 0;
#ifdef CONFIG_TRACER_SNAPSHOT
- if (t->use_max_tr) {
+ if (trace->use_max_tr) {
local_irq_disable();
arch_spin_lock(&tr->max_lock);
ret = tr->cond_snapshot ? -EBUSY : 0;
@@ -6294,14 +6332,14 @@ int tracing_set_tracer(struct trace_array *tr, const char *buf)
}
#endif
/* Some tracers won't work on kernel command line */
- if (system_state < SYSTEM_RUNNING && t->noboot) {
+ if (system_state < SYSTEM_RUNNING && trace->noboot) {
pr_warn("Tracer '%s' is not allowed on command line, ignored\n",
- t->name);
+ trace->name);
return -EINVAL;
}
/* Some tracers are only allowed for the top level buffer */
- if (!trace_ok_for_array(t, tr))
+ if (!trace_ok_for_array(trace, tr))
return -EINVAL;
/* If trace pipe files are being read, we can't change the tracer */
@@ -6320,8 +6358,9 @@ int tracing_set_tracer(struct trace_array *tr, const char *buf)
/* Current trace needs to be nop_trace before synchronize_rcu */
tr->current_trace = &nop_trace;
+ tr->current_trace_flags = nop_trace.flags;
- if (had_max_tr && !t->use_max_tr) {
+ if (had_max_tr && !trace->use_max_tr) {
/*
* We need to make sure that the update_max_tr sees that
* current_trace changed to nop_trace to keep it from
@@ -6334,7 +6373,7 @@ int tracing_set_tracer(struct trace_array *tr, const char *buf)
tracing_disarm_snapshot(tr);
}
- if (!had_max_tr && t->use_max_tr) {
+ if (!had_max_tr && trace->use_max_tr) {
ret = tracing_arm_snapshot_locked(tr);
if (ret)
return ret;
@@ -6343,18 +6382,21 @@ int tracing_set_tracer(struct trace_array *tr, const char *buf)
tr->current_trace = &nop_trace;
#endif
- if (t->init) {
- ret = tracer_init(t, tr);
+ tr->current_trace_flags = t->flags ? : t->tracer->flags;
+
+ if (trace->init) {
+ ret = tracer_init(trace, tr);
if (ret) {
#ifdef CONFIG_TRACER_MAX_TRACE
- if (t->use_max_tr)
+ if (trace->use_max_tr)
tracing_disarm_snapshot(tr);
#endif
+ tr->current_trace_flags = nop_trace.flags;
return ret;
}
}
- tr->current_trace = t;
+ tr->current_trace = trace;
tr->current_trace->enabled++;
trace_branch_enable(tr);
@@ -6532,7 +6574,7 @@ static int tracing_open_pipe(struct inode *inode, struct file *filp)
/* trace pipe does not show start of buffer */
cpumask_setall(iter->started);
- if (tr->trace_flags & TRACE_ITER_LATENCY_FMT)
+ if (tr->trace_flags & TRACE_ITER(LATENCY_FMT))
iter->iter_flags |= TRACE_FILE_LAT_FMT;
/* Output in nanoseconds only if we are using a clock in nanoseconds. */
@@ -6593,7 +6635,7 @@ trace_poll(struct trace_iterator *iter, struct file *filp, poll_table *poll_tabl
if (trace_buffer_iter(iter, iter->cpu_file))
return EPOLLIN | EPOLLRDNORM;
- if (tr->trace_flags & TRACE_ITER_BLOCK)
+ if (tr->trace_flags & TRACE_ITER(BLOCK))
/*
* Always select as readable when in blocking mode
*/
@@ -6912,6 +6954,43 @@ out_err:
}
static ssize_t
+tracing_syscall_buf_read(struct file *filp, char __user *ubuf,
+ size_t cnt, loff_t *ppos)
+{
+ struct inode *inode = file_inode(filp);
+ struct trace_array *tr = inode->i_private;
+ char buf[64];
+ int r;
+
+ r = snprintf(buf, 64, "%d\n", tr->syscall_buf_sz);
+
+ return simple_read_from_buffer(ubuf, cnt, ppos, buf, r);
+}
+
+static ssize_t
+tracing_syscall_buf_write(struct file *filp, const char __user *ubuf,
+ size_t cnt, loff_t *ppos)
+{
+ struct inode *inode = file_inode(filp);
+ struct trace_array *tr = inode->i_private;
+ unsigned long val;
+ int ret;
+
+ ret = kstrtoul_from_user(ubuf, cnt, 10, &val);
+ if (ret)
+ return ret;
+
+ if (val > SYSCALL_FAULT_USER_MAX)
+ val = SYSCALL_FAULT_USER_MAX;
+
+ tr->syscall_buf_sz = val;
+
+ *ppos += cnt;
+
+ return cnt;
+}
+
+static ssize_t
tracing_entries_read(struct file *filp, char __user *ubuf,
size_t cnt, loff_t *ppos)
{
@@ -7145,7 +7224,7 @@ tracing_free_buffer_release(struct inode *inode, struct file *filp)
struct trace_array *tr = inode->i_private;
/* disable tracing ? */
- if (tr->trace_flags & TRACE_ITER_STOP_ON_FREE)
+ if (tr->trace_flags & TRACE_ITER(STOP_ON_FREE))
tracer_tracing_off(tr);
/* resize the ring buffer to 0 */
tracing_resize_ring_buffer(tr, 0, RING_BUFFER_ALL_CPUS);
@@ -7223,52 +7302,43 @@ struct trace_user_buf {
char *buf;
};
-struct trace_user_buf_info {
- struct trace_user_buf __percpu *tbuf;
- int ref;
-};
-
-
static DEFINE_MUTEX(trace_user_buffer_mutex);
static struct trace_user_buf_info *trace_user_buffer;
-static void trace_user_fault_buffer_free(struct trace_user_buf_info *tinfo)
+/**
+ * trace_user_fault_destroy - free up allocated memory of a trace user buffer
+ * @tinfo: The descriptor to free up
+ *
+ * Frees any data allocated in the trace info descriptor.
+ */
+void trace_user_fault_destroy(struct trace_user_buf_info *tinfo)
{
char *buf;
int cpu;
+ if (!tinfo || !tinfo->tbuf)
+ return;
+
for_each_possible_cpu(cpu) {
buf = per_cpu_ptr(tinfo->tbuf, cpu)->buf;
kfree(buf);
}
free_percpu(tinfo->tbuf);
- kfree(tinfo);
}
-static int trace_user_fault_buffer_enable(void)
+static int user_fault_buffer_enable(struct trace_user_buf_info *tinfo, size_t size)
{
- struct trace_user_buf_info *tinfo;
char *buf;
int cpu;
- guard(mutex)(&trace_user_buffer_mutex);
-
- if (trace_user_buffer) {
- trace_user_buffer->ref++;
- return 0;
- }
-
- tinfo = kmalloc(sizeof(*tinfo), GFP_KERNEL);
- if (!tinfo)
- return -ENOMEM;
+ lockdep_assert_held(&trace_user_buffer_mutex);
tinfo->tbuf = alloc_percpu(struct trace_user_buf);
- if (!tinfo->tbuf) {
- kfree(tinfo);
+ if (!tinfo->tbuf)
return -ENOMEM;
- }
tinfo->ref = 1;
+ tinfo->size = size;
/* Clear each buffer in case of error */
for_each_possible_cpu(cpu) {
@@ -7276,42 +7346,165 @@ static int trace_user_fault_buffer_enable(void)
}
for_each_possible_cpu(cpu) {
- buf = kmalloc_node(TRACE_MARKER_MAX_SIZE, GFP_KERNEL,
+ buf = kmalloc_node(size, GFP_KERNEL,
cpu_to_node(cpu));
- if (!buf) {
- trace_user_fault_buffer_free(tinfo);
+ if (!buf)
return -ENOMEM;
- }
per_cpu_ptr(tinfo->tbuf, cpu)->buf = buf;
}
- trace_user_buffer = tinfo;
-
return 0;
}
-static void trace_user_fault_buffer_disable(void)
+/* For internal use. Free and reinitialize */
+static void user_buffer_free(struct trace_user_buf_info **tinfo)
{
- struct trace_user_buf_info *tinfo;
+ lockdep_assert_held(&trace_user_buffer_mutex);
- guard(mutex)(&trace_user_buffer_mutex);
+ trace_user_fault_destroy(*tinfo);
+ kfree(*tinfo);
+ *tinfo = NULL;
+}
- tinfo = trace_user_buffer;
+/* For internal use. Initialize and allocate */
+static int user_buffer_init(struct trace_user_buf_info **tinfo, size_t size)
+{
+ bool alloc = false;
+ int ret;
+
+ lockdep_assert_held(&trace_user_buffer_mutex);
- if (WARN_ON_ONCE(!tinfo))
+ if (!*tinfo) {
+ alloc = true;
+ *tinfo = kzalloc(sizeof(**tinfo), GFP_KERNEL);
+ if (!*tinfo)
+ return -ENOMEM;
+ }
+
+ ret = user_fault_buffer_enable(*tinfo, size);
+ if (ret < 0 && alloc)
+ user_buffer_free(tinfo);
+
+ return ret;
+}
+
+/* For internal use, dereference and free if necessary */
+static void user_buffer_put(struct trace_user_buf_info **tinfo)
+{
+ guard(mutex)(&trace_user_buffer_mutex);
+
+ if (WARN_ON_ONCE(!*tinfo || !(*tinfo)->ref))
return;
- if (--tinfo->ref)
+ if (--(*tinfo)->ref)
return;
- trace_user_fault_buffer_free(tinfo);
- trace_user_buffer = NULL;
+ user_buffer_free(tinfo);
+}
+
+/**
+ * trace_user_fault_init - Allocate or reference a per CPU buffer
+ * @tinfo: A pointer to the trace buffer descriptor
+ * @size: The size to allocate each per CPU buffer
+ *
+ * Create a per CPU buffer that can be used to copy from user space
+ * in a task context. When calling trace_user_fault_read(), preemption
+ * must be disabled, and it will enable preemption and copy user
+ * space data to the buffer. If any schedule switches occur, it will
+ * retry until it succeeds without a schedule switch knowing the buffer
+ * is still valid.
+ *
+ * Returns 0 on success, negative on failure.
+ */
+int trace_user_fault_init(struct trace_user_buf_info *tinfo, size_t size)
+{
+ int ret;
+
+ if (!tinfo)
+ return -EINVAL;
+
+ guard(mutex)(&trace_user_buffer_mutex);
+
+ ret = user_buffer_init(&tinfo, size);
+ if (ret < 0)
+ trace_user_fault_destroy(tinfo);
+
+ return ret;
+}
+
+/**
+ * trace_user_fault_get - up the ref count for the user buffer
+ * @tinfo: A pointer to a pointer to the trace buffer descriptor
+ *
+ * Ups the ref count of the trace buffer.
+ *
+ * Returns the new ref count.
+ */
+int trace_user_fault_get(struct trace_user_buf_info *tinfo)
+{
+ if (!tinfo)
+ return -1;
+
+ guard(mutex)(&trace_user_buffer_mutex);
+
+ tinfo->ref++;
+ return tinfo->ref;
+}
+
+/**
+ * trace_user_fault_put - dereference a per cpu trace buffer
+ * @tinfo: The @tinfo that was passed to trace_user_fault_get()
+ *
+ * Decrement the ref count of @tinfo.
+ *
+ * Returns the new refcount (negative on error).
+ */
+int trace_user_fault_put(struct trace_user_buf_info *tinfo)
+{
+ guard(mutex)(&trace_user_buffer_mutex);
+
+ if (WARN_ON_ONCE(!tinfo || !tinfo->ref))
+ return -1;
+
+ --tinfo->ref;
+ return tinfo->ref;
}
-/* Must be called with preemption disabled */
-static char *trace_user_fault_read(struct trace_user_buf_info *tinfo,
- const char __user *ptr, size_t size,
- size_t *read_size)
+/**
+ * trace_user_fault_read - Read user space into a per CPU buffer
+ * @tinfo: The @tinfo allocated by trace_user_fault_get()
+ * @ptr: The user space pointer to read
+ * @size: The size of user space to read.
+ * @copy_func: Optional function to use to copy from user space
+ * @data: Data to pass to copy_func if it was supplied
+ *
+ * Preemption must be disabled when this is called, and must not
+ * be enabled while using the returned buffer.
+ * This does the copying from user space into a per CPU buffer.
+ *
+ * The @size must not be greater than the size passed in to
+ * trace_user_fault_init().
+ *
+ * If @copy_func is NULL, trace_user_fault_read() will use copy_from_user(),
+ * otherwise it will call @copy_func. It will call @copy_func with:
+ *
+ * buffer: the per CPU buffer of the @tinfo.
+ * ptr: The pointer @ptr to user space to read
+ * size: The @size of the ptr to read
+ * data: The @data parameter
+ *
+ * It is expected that @copy_func will return 0 on success and non zero
+ * if there was a fault.
+ *
+ * Returns a pointer to the buffer with the content read from @ptr.
+ * Preemption must remain disabled while the caller accesses the
+ * buffer returned by this function.
+ * Returns NULL if there was a fault, or the size passed in is
+ * greater than the size passed to trace_user_fault_init().
+ */
+char *trace_user_fault_read(struct trace_user_buf_info *tinfo,
+ const char __user *ptr, size_t size,
+ trace_user_buf_copy copy_func, void *data)
{
int cpu = smp_processor_id();
char *buffer = per_cpu_ptr(tinfo->tbuf, cpu)->buf;
@@ -7319,9 +7512,14 @@ static char *trace_user_fault_read(struct trace_user_buf_info *tinfo,
int trys = 0;
int ret;
- if (size > TRACE_MARKER_MAX_SIZE)
- size = TRACE_MARKER_MAX_SIZE;
- *read_size = 0;
+ lockdep_assert_preemption_disabled();
+
+ /*
+ * It's up to the caller to not try to copy more than it said
+ * it would.
+ */
+ if (size > tinfo->size)
+ return NULL;
/*
* This acts similar to a seqcount. The per CPU context switches are
@@ -7361,7 +7559,14 @@ static char *trace_user_fault_read(struct trace_user_buf_info *tinfo,
*/
preempt_enable_notrace();
- ret = __copy_from_user(buffer, ptr, size);
+ /* Make sure preemption is enabled here */
+ lockdep_assert_preemption_enabled();
+
+ if (copy_func) {
+ ret = copy_func(buffer, ptr, size, data);
+ } else {
+ ret = __copy_from_user(buffer, ptr, size);
+ }
preempt_disable_notrace();
migrate_enable();
@@ -7378,7 +7583,6 @@ static char *trace_user_fault_read(struct trace_user_buf_info *tinfo,
*/
} while (nr_context_switches_cpu(cpu) != cnt);
- *read_size = size;
return buffer;
}
@@ -7389,13 +7593,12 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
struct trace_array *tr = filp->private_data;
ssize_t written = -ENODEV;
unsigned long ip;
- size_t size;
char *buf;
if (tracing_disabled)
return -EINVAL;
- if (!(tr->trace_flags & TRACE_ITER_MARKERS))
+ if (!(tr->trace_flags & TRACE_ITER(MARKERS)))
return -EINVAL;
if ((ssize_t)cnt < 0)
@@ -7407,13 +7610,10 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
/* Must have preemption disabled while having access to the buffer */
guard(preempt_notrace)();
- buf = trace_user_fault_read(trace_user_buffer, ubuf, cnt, &size);
+ buf = trace_user_fault_read(trace_user_buffer, ubuf, cnt, NULL, NULL);
if (!buf)
return -EFAULT;
- if (cnt > size)
- cnt = size;
-
/* The selftests expect this function to be the IP address */
ip = _THIS_IP_;
@@ -7442,7 +7642,7 @@ static ssize_t write_raw_marker_to_buffer(struct trace_array *tr,
size_t size;
/* cnt includes both the entry->id and the data behind it. */
- size = struct_size(entry, buf, cnt - sizeof(entry->id));
+ size = struct_offset(entry, id) + cnt;
buffer = tr->array_buffer.buffer;
@@ -7473,30 +7673,29 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf,
{
struct trace_array *tr = filp->private_data;
ssize_t written = -ENODEV;
- size_t size;
char *buf;
if (tracing_disabled)
return -EINVAL;
- if (!(tr->trace_flags & TRACE_ITER_MARKERS))
+ if (!(tr->trace_flags & TRACE_ITER(MARKERS)))
return -EINVAL;
/* The marker must at least have a tag id */
if (cnt < sizeof(unsigned int))
return -EINVAL;
+ /* raw write is all or nothing */
+ if (cnt > TRACE_MARKER_MAX_SIZE)
+ return -EINVAL;
+
/* Must have preemption disabled while having access to the buffer */
guard(preempt_notrace)();
- buf = trace_user_fault_read(trace_user_buffer, ubuf, cnt, &size);
+ buf = trace_user_fault_read(trace_user_buffer, ubuf, cnt, NULL, NULL);
if (!buf)
return -EFAULT;
- /* raw write is all or nothing */
- if (cnt > size)
- return -EINVAL;
-
/* The global trace_marker_raw can go to multiple instances */
if (tr == &global_trace) {
guard(rcu)();
@@ -7516,20 +7715,26 @@ static int tracing_mark_open(struct inode *inode, struct file *filp)
{
int ret;
- ret = trace_user_fault_buffer_enable();
- if (ret < 0)
- return ret;
+ scoped_guard(mutex, &trace_user_buffer_mutex) {
+ if (!trace_user_buffer) {
+ ret = user_buffer_init(&trace_user_buffer, TRACE_MARKER_MAX_SIZE);
+ if (ret < 0)
+ return ret;
+ } else {
+ trace_user_buffer->ref++;
+ }
+ }
stream_open(inode, filp);
ret = tracing_open_generic_tr(inode, filp);
if (ret < 0)
- trace_user_fault_buffer_disable();
+ user_buffer_put(&trace_user_buffer);
return ret;
}
static int tracing_mark_release(struct inode *inode, struct file *file)
{
- trace_user_fault_buffer_disable();
+ user_buffer_put(&trace_user_buffer);
return tracing_release_generic_tr(inode, file);
}
@@ -7917,6 +8122,14 @@ static const struct file_operations tracing_entries_fops = {
.release = tracing_release_generic_tr,
};
+static const struct file_operations tracing_syscall_buf_fops = {
+ .open = tracing_open_generic_tr,
+ .read = tracing_syscall_buf_read,
+ .write = tracing_syscall_buf_write,
+ .llseek = generic_file_llseek,
+ .release = tracing_release_generic_tr,
+};
+
static const struct file_operations tracing_buffer_meta_fops = {
.open = tracing_buffer_meta_open,
.read = seq_read,
@@ -8801,8 +9014,8 @@ static int tracing_buffers_mmap(struct file *filp, struct vm_area_struct *vma)
struct trace_iterator *iter = &info->iter;
int ret = 0;
- /* A memmap'ed buffer is not supported for user space mmap */
- if (iter->tr->flags & TRACE_ARRAY_FL_MEMMAP)
+ /* Memmap'ed and backup buffers are not supported for user space mmap */
+ if (iter->tr->flags & (TRACE_ARRAY_FL_MEMMAP | TRACE_ARRAY_FL_VMALLOC))
return -ENODEV;
ret = get_snapshot_map(iter->tr);
@@ -9315,7 +9528,7 @@ trace_options_core_read(struct file *filp, char __user *ubuf, size_t cnt,
get_tr_index(tr_index, &tr, &index);
- if (tr->trace_flags & (1 << index))
+ if (tr->trace_flags & (1ULL << index))
buf = "1\n";
else
buf = "0\n";
@@ -9344,7 +9557,7 @@ trace_options_core_write(struct file *filp, const char __user *ubuf, size_t cnt,
mutex_lock(&event_mutex);
mutex_lock(&trace_types_lock);
- ret = set_tracer_flag(tr, 1 << index, val);
+ ret = set_tracer_flag(tr, 1ULL << index, val);
mutex_unlock(&trace_types_lock);
mutex_unlock(&event_mutex);
@@ -9417,39 +9630,19 @@ create_trace_option_file(struct trace_array *tr,
topt->entry = trace_create_file(opt->name, TRACE_MODE_WRITE,
t_options, topt, &trace_options_fops);
-
}
-static void
-create_trace_option_files(struct trace_array *tr, struct tracer *tracer)
+static int
+create_trace_option_files(struct trace_array *tr, struct tracer *tracer,
+ struct tracer_flags *flags)
{
struct trace_option_dentry *topts;
struct trace_options *tr_topts;
- struct tracer_flags *flags;
struct tracer_opt *opts;
int cnt;
- int i;
-
- if (!tracer)
- return;
-
- flags = tracer->flags;
if (!flags || !flags->opts)
- return;
-
- /*
- * If this is an instance, only create flags for tracers
- * the instance may have.
- */
- if (!trace_ok_for_array(tracer, tr))
- return;
-
- for (i = 0; i < tr->nr_topts; i++) {
- /* Make sure there's no duplicate flags. */
- if (WARN_ON_ONCE(tr->topts[i].tracer->flags == tracer->flags))
- return;
- }
+ return 0;
opts = flags->opts;
@@ -9458,13 +9651,13 @@ create_trace_option_files(struct trace_array *tr, struct tracer *tracer)
topts = kcalloc(cnt + 1, sizeof(*topts), GFP_KERNEL);
if (!topts)
- return;
+ return 0;
tr_topts = krealloc(tr->topts, sizeof(*tr->topts) * (tr->nr_topts + 1),
GFP_KERNEL);
if (!tr_topts) {
kfree(topts);
- return;
+ return -ENOMEM;
}
tr->topts = tr_topts;
@@ -9479,6 +9672,97 @@ create_trace_option_files(struct trace_array *tr, struct tracer *tracer)
"Failed to create trace option: %s",
opts[cnt].name);
}
+ return 0;
+}
+
+static int get_global_flags_val(struct tracer *tracer)
+{
+ struct tracers *t;
+
+ list_for_each_entry(t, &global_trace.tracers, list) {
+ if (t->tracer != tracer)
+ continue;
+ if (!t->flags)
+ return -1;
+ return t->flags->val;
+ }
+ return -1;
+}
+
+static int add_tracer_options(struct trace_array *tr, struct tracers *t)
+{
+ struct tracer *tracer = t->tracer;
+ struct tracer_flags *flags = t->flags ?: tracer->flags;
+
+ if (!flags)
+ return 0;
+
+ /* Only add tracer options after update_tracer_options() finishes */
+ if (!tracer_options_updated)
+ return 0;
+
+ return create_trace_option_files(tr, tracer, flags);
+}
+
+static int add_tracer(struct trace_array *tr, struct tracer *tracer)
+{
+ struct tracer_flags *flags;
+ struct tracers *t;
+ int ret;
+
+ /* Only enable if the directory has been created already. */
+ if (!tr->dir && !(tr->flags & TRACE_ARRAY_FL_GLOBAL))
+ return 0;
+
+ /*
+ * If this is an instance, only create flags for tracers
+ * the instance may have.
+ */
+ if (!trace_ok_for_array(tracer, tr))
+ return 0;
+
+ t = kmalloc(sizeof(*t), GFP_KERNEL);
+ if (!t)
+ return -ENOMEM;
+
+ t->tracer = tracer;
+ t->flags = NULL;
+ list_add(&t->list, &tr->tracers);
+
+ flags = tracer->flags;
+ if (!flags) {
+ if (!tracer->default_flags)
+ return 0;
+
+ /*
+ * If the tracer defines default flags, it means the flags are
+ * per trace instance.
+ */
+ flags = kmalloc(sizeof(*flags), GFP_KERNEL);
+ if (!flags)
+ return -ENOMEM;
+
+ *flags = *tracer->default_flags;
+ flags->trace = tracer;
+
+ t->flags = flags;
+
+ /* If this is an instance, inherit the global_trace flags */
+ if (!(tr->flags & TRACE_ARRAY_FL_GLOBAL)) {
+ int val = get_global_flags_val(tracer);
+ if (!WARN_ON_ONCE(val < 0))
+ flags->val = val;
+ }
+ }
+
+ ret = add_tracer_options(tr, t);
+ if (ret < 0) {
+ list_del(&t->list);
+ kfree(t->flags);
+ kfree(t);
+ }
+
+ return ret;
}
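
A hedged sketch of how a tracer opts in to this per-instance behaviour: the tracer name, option name and bit below are invented, but the structures and the TRACER_OPT() helper are the existing ones, and .default_flags is the new field that makes add_tracer() above copy the template into every instance.

static struct tracer_opt example_opts[] = {
	/* "verbose" and bit 0x1 are made up for this sketch */
	{ TRACER_OPT(verbose, 0x1) },
	{ } /* terminator */
};

static struct tracer_flags example_default_flags = {
	.val	= 0,			/* all options start off */
	.opts	= example_opts,
};

static struct tracer example_tracer __read_mostly = {
	.name		= "example",
	/* .default_flags (not .flags) marks the options as per-instance */
	.default_flags	= &example_default_flags,
	.allow_instances = true,
};
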
static struct dentry *
@@ -9508,8 +9792,9 @@ static void create_trace_options_dir(struct trace_array *tr)
for (i = 0; trace_options[i]; i++) {
if (top_level ||
- !((1 << i) & TOP_LEVEL_TRACE_FLAGS))
+ !((1ULL << i) & TOP_LEVEL_TRACE_FLAGS)) {
create_trace_option_core_file(tr, trace_options[i], i);
+ }
}
}
@@ -9830,7 +10115,7 @@ allocate_trace_buffer(struct trace_array *tr, struct array_buffer *buf, int size
struct trace_scratch *tscratch;
unsigned int scratch_size = 0;
- rb_flags = tr->trace_flags & TRACE_ITER_OVERWRITE ? RB_FL_OVERWRITE : 0;
+ rb_flags = tr->trace_flags & TRACE_ITER(OVERWRITE) ? RB_FL_OVERWRITE : 0;
buf->tr = tr;
@@ -9928,19 +10213,39 @@ static void init_trace_flags_index(struct trace_array *tr)
tr->trace_flags_index[i] = i;
}
-static void __update_tracer_options(struct trace_array *tr)
+static int __update_tracer(struct trace_array *tr)
{
struct tracer *t;
+ int ret = 0;
+
+ for (t = trace_types; t && !ret; t = t->next)
+ ret = add_tracer(tr, t);
- for (t = trace_types; t; t = t->next)
- add_tracer_options(tr, t);
+ return ret;
+}
+
+static __init int __update_tracer_options(struct trace_array *tr)
+{
+ struct tracers *t;
+ int ret = 0;
+
+ list_for_each_entry(t, &tr->tracers, list) {
+ ret = add_tracer_options(tr, t);
+ if (ret < 0)
+ break;
+ }
+
+ return ret;
}
-static void update_tracer_options(struct trace_array *tr)
+static __init void update_tracer_options(void)
{
+ struct trace_array *tr;
+
guard(mutex)(&trace_types_lock);
tracer_options_updated = true;
- __update_tracer_options(tr);
+ list_for_each_entry(tr, &ftrace_trace_arrays, list)
+ __update_tracer_options(tr);
}
/* Must have trace_types_lock held */
@@ -9985,9 +10290,13 @@ static int trace_array_create_dir(struct trace_array *tr)
}
init_tracer_tracefs(tr, tr->dir);
- __update_tracer_options(tr);
-
- return ret;
+ ret = __update_tracer(tr);
+ if (ret) {
+ event_trace_del_tracer(tr);
+ tracefs_remove(tr->dir);
+ return ret;
+ }
+ return 0;
}
static struct trace_array *
@@ -10029,16 +10338,20 @@ trace_array_create_systems(const char *name, const char *systems,
raw_spin_lock_init(&tr->start_lock);
+ tr->syscall_buf_sz = global_trace.syscall_buf_sz;
+
tr->max_lock = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
#ifdef CONFIG_TRACER_MAX_TRACE
spin_lock_init(&tr->snapshot_trigger_lock);
#endif
tr->current_trace = &nop_trace;
+ tr->current_trace_flags = nop_trace.flags;
INIT_LIST_HEAD(&tr->systems);
INIT_LIST_HEAD(&tr->events);
INIT_LIST_HEAD(&tr->hist_vars);
INIT_LIST_HEAD(&tr->err_log);
+ INIT_LIST_HEAD(&tr->tracers);
INIT_LIST_HEAD(&tr->marker_list);
#ifdef CONFIG_MODULES
@@ -10193,7 +10506,7 @@ static int __remove_instance(struct trace_array *tr)
/* Disable all the flags that were enabled coming in */
for (i = 0; i < TRACE_FLAGS_MAX_SIZE; i++) {
if ((1 << i) & ZEROED_TRACE_FLAGS)
- set_tracer_flag(tr, 1 << i, 0);
+ set_tracer_flag(tr, 1ULL << i, 0);
}
if (printk_trace == tr)
@@ -10211,11 +10524,14 @@ static int __remove_instance(struct trace_array *tr)
free_percpu(tr->last_func_repeats);
free_trace_buffers(tr);
clear_tracing_err_log(tr);
+ free_tracers(tr);
if (tr->range_name) {
reserve_mem_release_by_name(tr->range_name);
kfree(tr->range_name);
}
+ if (tr->flags & TRACE_ARRAY_FL_VMALLOC)
+ vfree((void *)tr->range_addr_start);
for (i = 0; i < tr->nr_topts; i++) {
kfree(tr->topts[i].topts);
@@ -10345,6 +10661,9 @@ init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer)
trace_create_file("buffer_subbuf_size_kb", TRACE_MODE_WRITE, d_tracer,
tr, &buffer_subbuf_size_fops);
+ trace_create_file("syscall_user_buf_size", TRACE_MODE_WRITE, d_tracer,
+ tr, &tracing_syscall_buf_fops);
+
create_trace_options_dir(tr);
#ifdef CONFIG_TRACER_MAX_TRACE
@@ -10630,7 +10949,7 @@ static __init void tracer_init_tracefs_work_func(struct work_struct *work)
create_trace_instances(NULL);
- update_tracer_options(&global_trace);
+ update_tracer_options();
}
static __init int tracer_init_tracefs(void)
@@ -10783,10 +11102,10 @@ static void ftrace_dump_one(struct trace_array *tr, enum ftrace_dump_mode dump_m
/* While dumping, do not allow the buffer to be enable */
tracer_tracing_disable(tr);
- old_userobj = tr->trace_flags & TRACE_ITER_SYM_USEROBJ;
+ old_userobj = tr->trace_flags & TRACE_ITER(SYM_USEROBJ);
/* don't look at user memory in panic mode */
- tr->trace_flags &= ~TRACE_ITER_SYM_USEROBJ;
+ tr->trace_flags &= ~TRACE_ITER(SYM_USEROBJ);
if (dump_mode == DUMP_ORIG)
iter.cpu_file = raw_smp_processor_id();
@@ -11018,6 +11337,42 @@ __init static void do_allocate_snapshot(const char *name)
static inline void do_allocate_snapshot(const char *name) { }
#endif
+__init static int backup_instance_area(const char *backup,
+ unsigned long *addr, phys_addr_t *size)
+{
+ struct trace_array *backup_tr;
+ void *allocated_vaddr = NULL;
+
+ backup_tr = trace_array_get_by_name(backup, NULL);
+ if (!backup_tr) {
+ pr_warn("Tracing: Instance %s is not found.\n", backup);
+ return -ENOENT;
+ }
+
+ if (!(backup_tr->flags & TRACE_ARRAY_FL_BOOT)) {
+ pr_warn("Tracing: Instance %s is not boot mapped.\n", backup);
+ trace_array_put(backup_tr);
+ return -EINVAL;
+ }
+
+ *size = backup_tr->range_addr_size;
+
+ allocated_vaddr = vzalloc(*size);
+ if (!allocated_vaddr) {
+ pr_warn("Tracing: Failed to allocate memory for copying instance %s (size 0x%lx)\n",
+ backup, (unsigned long)*size);
+ trace_array_put(backup_tr);
+ return -ENOMEM;
+ }
+
+ memcpy(allocated_vaddr,
+ (void *)backup_tr->range_addr_start, (size_t)*size);
+ *addr = (unsigned long)allocated_vaddr;
+
+ trace_array_put(backup_tr);
+ return 0;
+}
+
__init static void enable_instances(void)
{
struct trace_array *tr;
@@ -11040,11 +11395,15 @@ __init static void enable_instances(void)
char *flag_delim;
char *addr_delim;
char *rname __free(kfree) = NULL;
+ char *backup;
tok = strsep(&curr_str, ",");
- flag_delim = strchr(tok, '^');
- addr_delim = strchr(tok, '@');
+ name = strsep(&tok, "=");
+ backup = tok;
+
+ flag_delim = strchr(name, '^');
+ addr_delim = strchr(name, '@');
if (addr_delim)
*addr_delim++ = '\0';
@@ -11052,7 +11411,10 @@ __init static void enable_instances(void)
if (flag_delim)
*flag_delim++ = '\0';
- name = tok;
+ if (backup) {
+ if (backup_instance_area(backup, &addr, &size) < 0)
+ continue;
+ }
if (flag_delim) {
char *flag;
@@ -11148,7 +11510,13 @@ __init static void enable_instances(void)
tr->ref++;
}
- if (start) {
+ /*
+ * Backup buffers can be freed but need vfree().
+ */
+ if (backup)
+ tr->flags |= TRACE_ARRAY_FL_VMALLOC;
+
+ if (start || backup) {
tr->flags |= TRACE_ARRAY_FL_BOOT | TRACE_ARRAY_FL_LAST_BOOT;
tr->range_name = no_free_ptr(rname);
}
@@ -11242,6 +11610,7 @@ __init static int tracer_alloc_buffers(void)
* just a bootstrap of current_trace anyway.
*/
global_trace.current_trace = &nop_trace;
+ global_trace.current_trace_flags = nop_trace.flags;
global_trace.max_lock = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
#ifdef CONFIG_TRACER_MAX_TRACE
@@ -11255,10 +11624,7 @@ __init static int tracer_alloc_buffers(void)
init_trace_flags_index(&global_trace);
- register_tracer(&nop_trace);
-
- /* Function tracing may start here (via kernel command line) */
- init_function_trace();
+ INIT_LIST_HEAD(&global_trace.tracers);
/* All seems OK, enable tracing */
tracing_disabled = 0;
@@ -11270,6 +11636,8 @@ __init static int tracer_alloc_buffers(void)
global_trace.flags = TRACE_ARRAY_FL_GLOBAL;
+ global_trace.syscall_buf_sz = syscall_buf_size;
+
INIT_LIST_HEAD(&global_trace.systems);
INIT_LIST_HEAD(&global_trace.events);
INIT_LIST_HEAD(&global_trace.hist_vars);
@@ -11277,6 +11645,11 @@ __init static int tracer_alloc_buffers(void)
list_add(&global_trace.marker_list, &marker_copies);
list_add(&global_trace.list, &ftrace_trace_arrays);
+ register_tracer(&nop_trace);
+
+ /* Function tracing may start here (via kernel command line) */
+ init_function_trace();
+
apply_trace_boot_options();
register_snapshot_cmd();
@@ -11300,7 +11673,7 @@ out_free_buffer_mask:
#ifdef CONFIG_FUNCTION_TRACER
/* Used to set module cached ftrace filtering at boot up */
-__init struct trace_array *trace_get_global_array(void)
+struct trace_array *trace_get_global_array(void)
{
return &global_trace;
}
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 85eabb454bee..c2b61bcd912f 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -22,6 +22,7 @@
#include <linux/ctype.h>
#include <linux/once_lite.h>
#include <linux/ftrace_regs.h>
+#include <linux/llist.h>
#include "pid_list.h"
@@ -131,6 +132,8 @@ enum trace_type {
#define HIST_STACKTRACE_SIZE (HIST_STACKTRACE_DEPTH * sizeof(unsigned long))
#define HIST_STACKTRACE_SKIP 5
+#define SYSCALL_FAULT_USER_MAX 165
+
/*
* syscalls are special, and need special handling, this is why
* they are not included in trace_entries.h
@@ -216,7 +219,7 @@ struct array_buffer {
int cpu;
};
-#define TRACE_FLAGS_MAX_SIZE 32
+#define TRACE_FLAGS_MAX_SIZE 64
struct trace_options {
struct tracer *tracer;
@@ -390,7 +393,8 @@ struct trace_array {
int buffer_percent;
unsigned int n_err_log_entries;
struct tracer *current_trace;
- unsigned int trace_flags;
+ struct tracer_flags *current_trace_flags;
+ u64 trace_flags;
unsigned char trace_flags_index[TRACE_FLAGS_MAX_SIZE];
unsigned int flags;
raw_spinlock_t start_lock;
@@ -404,6 +408,7 @@ struct trace_array {
struct list_head systems;
struct list_head events;
struct list_head marker_list;
+ struct list_head tracers;
struct trace_event_file *trace_marker_file;
cpumask_var_t tracing_cpumask; /* only trace on set CPUs */
/* one per_cpu trace_pipe can be opened by only one user */
@@ -430,6 +435,7 @@ struct trace_array {
int function_enabled;
#endif
int no_filter_buffering_ref;
+ unsigned int syscall_buf_sz;
struct list_head hist_vars;
#ifdef CONFIG_TRACER_SNAPSHOT
struct cond_snapshot *cond_snapshot;
@@ -448,6 +454,7 @@ enum {
TRACE_ARRAY_FL_LAST_BOOT = BIT(2),
TRACE_ARRAY_FL_MOD_INIT = BIT(3),
TRACE_ARRAY_FL_MEMMAP = BIT(4),
+ TRACE_ARRAY_FL_VMALLOC = BIT(5),
};
#ifdef CONFIG_MODULES
@@ -631,9 +638,10 @@ struct tracer {
u32 old_flags, u32 bit, int set);
/* Return 0 if OK with change, else return non-zero */
int (*flag_changed)(struct trace_array *tr,
- u32 mask, int set);
+ u64 mask, int set);
struct tracer *next;
struct tracer_flags *flags;
+ struct tracer_flags *default_flags;
int enabled;
bool print_max;
bool allow_instances;
@@ -937,8 +945,6 @@ static __always_inline bool ftrace_hash_empty(struct ftrace_hash *hash)
#define TRACE_GRAPH_PRINT_FILL_SHIFT 28
#define TRACE_GRAPH_PRINT_FILL_MASK (0x3 << TRACE_GRAPH_PRINT_FILL_SHIFT)
-extern void ftrace_graph_sleep_time_control(bool enable);
-
#ifdef CONFIG_FUNCTION_PROFILER
extern void ftrace_graph_graph_time_control(bool enable);
#else
@@ -958,7 +964,8 @@ extern int __trace_graph_entry(struct trace_array *tr,
extern int __trace_graph_retaddr_entry(struct trace_array *tr,
struct ftrace_graph_ent *trace,
unsigned int trace_ctx,
- unsigned long retaddr);
+ unsigned long retaddr,
+ struct ftrace_regs *fregs);
extern void __trace_graph_return(struct trace_array *tr,
struct ftrace_graph_ret *trace,
unsigned int trace_ctx,
@@ -1109,7 +1116,8 @@ static inline void ftrace_graph_addr_finish(struct fgraph_ops *gops, struct ftra
#endif /* CONFIG_DYNAMIC_FTRACE */
extern unsigned int fgraph_max_depth;
-extern bool fgraph_sleep_time;
+extern int fgraph_no_sleep_time;
+extern bool fprofile_no_sleep_time;
static inline bool
ftrace_graph_ignore_func(struct fgraph_ops *gops, struct ftrace_graph_ent *trace)
@@ -1345,11 +1353,11 @@ extern int trace_get_user(struct trace_parser *parser, const char __user *ubuf,
# define FUNCTION_FLAGS \
C(FUNCTION, "function-trace"), \
C(FUNC_FORK, "function-fork"),
-# define FUNCTION_DEFAULT_FLAGS TRACE_ITER_FUNCTION
+# define FUNCTION_DEFAULT_FLAGS TRACE_ITER(FUNCTION)
#else
# define FUNCTION_FLAGS
# define FUNCTION_DEFAULT_FLAGS 0UL
-# define TRACE_ITER_FUNC_FORK 0UL
+# define TRACE_ITER_FUNC_FORK_BIT -1
#endif
#ifdef CONFIG_STACKTRACE
@@ -1359,6 +1367,24 @@ extern int trace_get_user(struct trace_parser *parser, const char __user *ubuf,
# define STACK_FLAGS
#endif
+#ifdef CONFIG_FUNCTION_PROFILER
+# define PROFILER_FLAGS \
+ C(PROF_TEXT_OFFSET, "prof-text-offset"),
+# ifdef CONFIG_FUNCTION_GRAPH_TRACER
+# define FPROFILE_FLAGS \
+ C(GRAPH_TIME, "graph-time"),
+# define FPROFILE_DEFAULT_FLAGS TRACE_ITER(GRAPH_TIME)
+# else
+# define FPROFILE_FLAGS
+# define FPROFILE_DEFAULT_FLAGS 0UL
+# endif
+#else
+# define PROFILER_FLAGS
+# define FPROFILE_FLAGS
+# define FPROFILE_DEFAULT_FLAGS 0UL
+# define TRACE_ITER_PROF_TEXT_OFFSET_BIT -1
+#endif
+
/*
* trace_iterator_flags is an enumeration that defines bit
* positions into trace_flags that controls the output.
@@ -1391,13 +1417,15 @@ extern int trace_get_user(struct trace_parser *parser, const char __user *ubuf,
C(MARKERS, "markers"), \
C(EVENT_FORK, "event-fork"), \
C(TRACE_PRINTK, "trace_printk_dest"), \
- C(COPY_MARKER, "copy_trace_marker"),\
+ C(COPY_MARKER, "copy_trace_marker"), \
C(PAUSE_ON_TRACE, "pause-on-trace"), \
C(HASH_PTR, "hash-ptr"), /* Print hashed pointer */ \
FUNCTION_FLAGS \
FGRAPH_FLAGS \
STACK_FLAGS \
- BRANCH_FLAGS
+ BRANCH_FLAGS \
+ PROFILER_FLAGS \
+ FPROFILE_FLAGS
/*
* By defining C, we can make TRACE_FLAGS a list of bit names
@@ -1413,20 +1441,17 @@ enum trace_iterator_bits {
};
/*
- * By redefining C, we can make TRACE_FLAGS a list of masks that
- * use the bits as defined above.
+ * And use TRACE_ITER(flag) to define the bit masks.
*/
-#undef C
-#define C(a, b) TRACE_ITER_##a = (1 << TRACE_ITER_##a##_BIT)
-
-enum trace_iterator_flags { TRACE_FLAGS };
+#define TRACE_ITER(flag) \
+ (TRACE_ITER_##flag##_BIT < 0 ? 0 : 1ULL << (TRACE_ITER_##flag##_BIT))
/*
* TRACE_ITER_SYM_MASK masks the options in trace_flags that
* control the output of kernel symbols.
*/
#define TRACE_ITER_SYM_MASK \
- (TRACE_ITER_PRINT_PARENT|TRACE_ITER_SYM_OFFSET|TRACE_ITER_SYM_ADDR)
+ (TRACE_ITER(PRINT_PARENT)|TRACE_ITER(SYM_OFFSET)|TRACE_ITER(SYM_ADDR))
extern struct tracer nop_trace;
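
A stand-alone illustration of why the macro checks for a negative bit before shifting (the bit numbers here are invented; the real ones come from the TRACE_FLAGS list above): when a flag's config option is disabled, its *_BIT is defined as -1 and the conditional folds the mask to zero, so tests against it compile away instead of shifting by a negative count.

#include <stdio.h>

#define TRACE_ITER_MARKERS_BIT		20	/* made-up bit number */
#define TRACE_ITER_FUNC_FORK_BIT	-1	/* flag compiled out */

#define TRACE_ITER(flag) \
	(TRACE_ITER_##flag##_BIT < 0 ? 0 : 1ULL << (TRACE_ITER_##flag##_BIT))

int main(void)
{
	unsigned long long trace_flags = TRACE_ITER(MARKERS);

	/* Prints 0x100000 0x0: the compiled-out flag is never set. */
	printf("%#llx %#llx\n", TRACE_ITER(MARKERS), TRACE_ITER(FUNC_FORK));
	printf("markers set: %d\n", !!(trace_flags & TRACE_ITER(MARKERS)));
	return 0;
}
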
@@ -1435,7 +1460,7 @@ extern int enable_branch_tracing(struct trace_array *tr);
extern void disable_branch_tracing(void);
static inline int trace_branch_enable(struct trace_array *tr)
{
- if (tr->trace_flags & TRACE_ITER_BRANCH)
+ if (tr->trace_flags & TRACE_ITER(BRANCH))
return enable_branch_tracing(tr);
return 0;
}
@@ -1531,6 +1556,23 @@ void trace_buffered_event_enable(void);
void early_enable_events(struct trace_array *tr, char *buf, bool disable_first);
+struct trace_user_buf;
+struct trace_user_buf_info {
+ struct trace_user_buf __percpu *tbuf;
+ size_t size;
+ int ref;
+};
+
+typedef int (*trace_user_buf_copy)(char *dst, const char __user *src,
+ size_t size, void *data);
+int trace_user_fault_init(struct trace_user_buf_info *tinfo, size_t size);
+int trace_user_fault_get(struct trace_user_buf_info *tinfo);
+int trace_user_fault_put(struct trace_user_buf_info *tinfo);
+void trace_user_fault_destroy(struct trace_user_buf_info *tinfo);
+char *trace_user_fault_read(struct trace_user_buf_info *tinfo,
+ const char __user *ptr, size_t size,
+ trace_user_buf_copy copy_func, void *data);
+
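
A minimal sketch of how a caller is expected to use this API, modelled on the trace_marker conversion above; the open/write/release wrappers and the buffer size are hypothetical, and the init/destroy pairing is assumed.

/* Hypothetical caller; only the trace_user_fault_*() calls are real. */
static struct trace_user_buf_info example_uinfo;

static int example_open(void)
{
	/* allocate the per-CPU read buffers (size is a made-up choice) */
	return trace_user_fault_init(&example_uinfo, 256);
}

static ssize_t example_write(const char __user *ubuf, size_t cnt)
{
	char *buf;

	/* the returned buffer is per CPU, so preemption must stay off */
	guard(preempt_notrace)();

	buf = trace_user_fault_read(&example_uinfo, ubuf, cnt, NULL, NULL);
	if (!buf)
		return -EFAULT;

	/* ... consume at most cnt bytes from buf ... */
	return cnt;
}

static void example_release(void)
{
	trace_user_fault_destroy(&example_uinfo);
}
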
static inline void
__trace_event_discard_commit(struct trace_buffer *buffer,
struct ring_buffer_event *event)
@@ -1752,13 +1794,13 @@ extern void clear_event_triggers(struct trace_array *tr);
enum {
EVENT_TRIGGER_FL_PROBE = BIT(0),
+ EVENT_TRIGGER_FL_COUNT = BIT(1),
};
struct event_trigger_data {
unsigned long count;
int ref;
int flags;
- const struct event_trigger_ops *ops;
struct event_command *cmd_ops;
struct event_filter __rcu *filter;
char *filter_str;
@@ -1769,6 +1811,7 @@ struct event_trigger_data {
char *name;
struct list_head named_list;
struct event_trigger_data *named_data;
+ struct llist_node llist;
};
/* Avoid typos */
@@ -1783,6 +1826,10 @@ struct enable_trigger_data {
bool hist;
};
+bool event_trigger_count(struct event_trigger_data *data,
+ struct trace_buffer *buffer, void *rec,
+ struct ring_buffer_event *event);
+
extern int event_enable_trigger_print(struct seq_file *m,
struct event_trigger_data *data);
extern void event_enable_trigger_free(struct event_trigger_data *data);
@@ -1846,64 +1893,6 @@ extern void event_file_get(struct trace_event_file *file);
extern void event_file_put(struct trace_event_file *file);
/**
- * struct event_trigger_ops - callbacks for trace event triggers
- *
- * The methods in this structure provide per-event trigger hooks for
- * various trigger operations.
- *
- * The @init and @free methods are used during trigger setup and
- * teardown, typically called from an event_command's @parse()
- * function implementation.
- *
- * The @print method is used to print the trigger spec.
- *
- * The @trigger method is the function that actually implements the
- * trigger and is called in the context of the triggering event
- * whenever that event occurs.
- *
- * All the methods below, except for @init() and @free(), must be
- * implemented.
- *
- * @trigger: The trigger 'probe' function called when the triggering
- * event occurs. The data passed into this callback is the data
- * that was supplied to the event_command @reg() function that
- * registered the trigger (see struct event_command) along with
- * the trace record, rec.
- *
- * @init: An optional initialization function called for the trigger
- * when the trigger is registered (via the event_command reg()
- * function). This can be used to perform per-trigger
- * initialization such as incrementing a per-trigger reference
- * count, for instance. This is usually implemented by the
- * generic utility function @event_trigger_init() (see
- * trace_event_triggers.c).
- *
- * @free: An optional de-initialization function called for the
- * trigger when the trigger is unregistered (via the
- * event_command @reg() function). This can be used to perform
- * per-trigger de-initialization such as decrementing a
- * per-trigger reference count and freeing corresponding trigger
- * data, for instance. This is usually implemented by the
- * generic utility function @event_trigger_free() (see
- * trace_event_triggers.c).
- *
- * @print: The callback function invoked to have the trigger print
- * itself. This is usually implemented by a wrapper function
- * that calls the generic utility function @event_trigger_print()
- * (see trace_event_triggers.c).
- */
-struct event_trigger_ops {
- void (*trigger)(struct event_trigger_data *data,
- struct trace_buffer *buffer,
- void *rec,
- struct ring_buffer_event *rbe);
- int (*init)(struct event_trigger_data *data);
- void (*free)(struct event_trigger_data *data);
- int (*print)(struct seq_file *m,
- struct event_trigger_data *data);
-};
-
-/**
* struct event_command - callbacks and data members for event commands
*
* Event commands are invoked by users by writing the command name
@@ -1952,7 +1941,7 @@ struct event_trigger_ops {
*
* @reg: Adds the trigger to the list of triggers associated with the
* event, and enables the event trigger itself, after
- * initializing it (via the event_trigger_ops @init() function).
+ * initializing it (via the event_command @init() function).
* This is also where commands can use the @trigger_type value to
* make the decision as to whether or not multiple instances of
* the trigger should be allowed. This is usually implemented by
@@ -1961,7 +1950,7 @@ struct event_trigger_ops {
*
* @unreg: Removes the trigger from the list of triggers associated
* with the event, and disables the event trigger itself, after
- * initializing it (via the event_trigger_ops @free() function).
+ * de-initializing it (via the event_command @free() function).
* This is usually implemented by the generic utility function
* @unregister_trigger() (see trace_event_triggers.c).
*
@@ -1975,12 +1964,41 @@ struct event_trigger_ops {
* ignored. This is usually implemented by the generic utility
* function @set_trigger_filter() (see trace_event_triggers.c).
*
- * @get_trigger_ops: The callback function invoked to retrieve the
- * event_trigger_ops implementation associated with the command.
- * This callback function allows a single event_command to
- * support multiple trigger implementations via different sets of
- * event_trigger_ops, depending on the value of the @param
- * string.
+ * All the methods below, except for @count_func(), @init() and @free(),
+ * must be implemented.
+ *
+ * @trigger: The trigger 'probe' function called when the triggering
+ * event occurs. The data passed into this callback is the data
+ * that was supplied to the event_command @reg() function that
+ * registered the trigger (see struct event_command) along with
+ * the trace record, rec.
+ *
+ * @count_func: If defined and a numeric parameter is passed to the
+ * trigger, then this function will be called before @trigger
+ * is called. If this function returns false, then @trigger is not
+ * executed.
+ *
+ * @init: An optional initialization function called for the trigger
+ * when the trigger is registered (via the event_command reg()
+ * function). This can be used to perform per-trigger
+ * initialization such as incrementing a per-trigger reference
+ * count, for instance. This is usually implemented by the
+ * generic utility function @event_trigger_init() (see
+ * trace_event_triggers.c).
+ *
+ * @free: An optional de-initialization function called for the
+ * trigger when the trigger is unregistered (via the
+ * event_command @reg() function). This can be used to perform
+ * per-trigger de-initialization such as decrementing a
+ * per-trigger reference count and freeing corresponding trigger
+ * data, for instance. This is usually implemented by the
+ * generic utility function @event_trigger_free() (see
+ * trace_event_triggers.c).
+ *
+ * @print: The callback function invoked to have the trigger print
+ * itself. This is usually implemented by a wrapper function
+ * that calls the generic utility function @event_trigger_print()
+ * (see trace_event_triggers.c).
*/
struct event_command {
struct list_head list;
@@ -2001,7 +2019,18 @@ struct event_command {
int (*set_filter)(char *filter_str,
struct event_trigger_data *data,
struct trace_event_file *file);
- const struct event_trigger_ops *(*get_trigger_ops)(char *cmd, char *param);
+ void (*trigger)(struct event_trigger_data *data,
+ struct trace_buffer *buffer,
+ void *rec,
+ struct ring_buffer_event *rbe);
+ bool (*count_func)(struct event_trigger_data *data,
+ struct trace_buffer *buffer,
+ void *rec,
+ struct ring_buffer_event *rbe);
+ int (*init)(struct event_trigger_data *data);
+ void (*free)(struct event_trigger_data *data);
+ int (*print)(struct seq_file *m,
+ struct event_trigger_data *data);
};
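
For orientation, a hedged sketch of what a converted command now looks like; the name, trigger type and the two example_* callbacks are placeholders, while the generic helpers are the real ones used by the conversions further down in this patch.

static struct event_command trigger_example_cmd = {
	.name		= "example",
	.trigger_type	= ETT_TRACE_ONOFF,	/* placeholder type */
	.parse		= event_trigger_parse,
	.reg		= register_trigger,
	.unreg		= unregister_trigger,
	.set_filter	= set_trigger_filter,
	.trigger	= example_trigger,	/* was event_trigger_ops.trigger */
	.count_func	= event_trigger_count,	/* generic "fire N times" gate */
	.print		= example_trigger_print,
	.init		= event_trigger_init,
	.free		= event_trigger_free,
};
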
/**
@@ -2022,7 +2051,7 @@ struct event_command {
* either committed or discarded. At that point, if any commands
* have deferred their triggers, those commands are finally
* invoked following the close of the current event. In other
- * words, if the event_trigger_ops @func() probe implementation
+ * words, if the event_command @trigger() probe implementation
* itself logs to the trace buffer, this flag should be set,
* otherwise it can be left unspecified.
*
@@ -2064,8 +2093,8 @@ extern const char *__stop___tracepoint_str[];
void trace_printk_control(bool enabled);
void trace_printk_start_comm(void);
-int trace_keep_overwrite(struct tracer *tracer, u32 mask, int set);
-int set_tracer_flag(struct trace_array *tr, unsigned int mask, int enabled);
+int trace_keep_overwrite(struct tracer *tracer, u64 mask, int set);
+int set_tracer_flag(struct trace_array *tr, u64 mask, int enabled);
/* Used from boot time tracer */
extern int trace_set_options(struct trace_array *tr, char *option);
@@ -2248,4 +2277,25 @@ static inline int rv_init_interface(void)
*/
#define FTRACE_TRAMPOLINE_MARKER ((unsigned long) INT_MAX)
+/*
+ * This is used to get the address of the args array based on
+ * the type of the entry.
+ */
+#define FGRAPH_ENTRY_ARGS(e) \
+ ({ \
+ unsigned long *_args; \
+ struct ftrace_graph_ent_entry *_e = e; \
+ \
+ if (IS_ENABLED(CONFIG_FUNCTION_GRAPH_RETADDR) && \
+ e->ent.type == TRACE_GRAPH_RETADDR_ENT) { \
+ struct fgraph_retaddr_ent_entry *_re; \
+ \
+ _re = (typeof(_re))_e; \
+ _args = _re->args; \
+ } else { \
+ _args = _e->args; \
+ } \
+ _args; \
+ })
+
#endif /* _LINUX_KERNEL_TRACE_H */
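
A hedged sketch of the expected call pattern for the macro above (the printing helper itself is invented); it matters because TRACE_GRAPH_RETADDR_ENT entries now carry an args array too, as the trace_entries.h change below adds, so output code cannot assume a single entry layout.

static void print_graph_args(struct ftrace_graph_ent_entry *field,
			     struct trace_seq *s, int nr_args)
{
	/* works for both TRACE_GRAPH_ENT and TRACE_GRAPH_RETADDR_ENT */
	unsigned long *args = FGRAPH_ENTRY_ARGS(field);
	int i;

	for (i = 0; i < nr_args; i++)
		trace_seq_printf(s, "%s0x%lx", i ? ", " : " ", args[i]);
}
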
diff --git a/kernel/trace/trace_dynevent.c b/kernel/trace/trace_dynevent.c
index d06854bd32b3..c4dfbc293bae 100644
--- a/kernel/trace/trace_dynevent.c
+++ b/kernel/trace/trace_dynevent.c
@@ -144,9 +144,16 @@ static int create_dyn_event(const char *raw_command)
if (!ret || ret != -ECANCELED)
break;
}
- mutex_unlock(&dyn_event_ops_mutex);
- if (ret == -ECANCELED)
+ if (ret == -ECANCELED) {
+ static const char *err_msg[] = {"No matching dynamic event type"};
+
+ /* Wrong dynamic event. Leave an error message. */
+ tracing_log_err(NULL, "dynevent", raw_command, err_msg,
+ 0, 0);
ret = -EINVAL;
+ }
+
+ mutex_unlock(&dyn_event_ops_mutex);
return ret;
}
diff --git a/kernel/trace/trace_entries.h b/kernel/trace/trace_entries.h
index de294ae2c5c5..f6a8d29c0d76 100644
--- a/kernel/trace/trace_entries.h
+++ b/kernel/trace/trace_entries.h
@@ -80,11 +80,11 @@ FTRACE_ENTRY(funcgraph_entry, ftrace_graph_ent_entry,
F_STRUCT(
__field_struct( struct ftrace_graph_ent, graph_ent )
__field_packed( unsigned long, graph_ent, func )
- __field_packed( unsigned int, graph_ent, depth )
+ __field_packed( unsigned long, graph_ent, depth )
__dynamic_array(unsigned long, args )
),
- F_printk("--> %ps (%u)", (void *)__entry->func, __entry->depth)
+ F_printk("--> %ps (%lu)", (void *)__entry->func, __entry->depth)
);
#ifdef CONFIG_FUNCTION_GRAPH_RETADDR
@@ -95,13 +95,14 @@ FTRACE_ENTRY_PACKED(fgraph_retaddr_entry, fgraph_retaddr_ent_entry,
TRACE_GRAPH_RETADDR_ENT,
F_STRUCT(
- __field_struct( struct fgraph_retaddr_ent, graph_ent )
- __field_packed( unsigned long, graph_ent, func )
- __field_packed( unsigned int, graph_ent, depth )
- __field_packed( unsigned long, graph_ent, retaddr )
+ __field_struct( struct fgraph_retaddr_ent, graph_rent )
+ __field_packed( unsigned long, graph_rent.ent, func )
+ __field_packed( unsigned long, graph_rent.ent, depth )
+ __field_packed( unsigned long, graph_rent, retaddr )
+ __dynamic_array(unsigned long, args )
),
- F_printk("--> %ps (%u) <- %ps", (void *)__entry->func, __entry->depth,
+ F_printk("--> %ps (%lu) <- %ps", (void *)__entry->func, __entry->depth,
(void *)__entry->retaddr)
);
diff --git a/kernel/trace/trace_eprobe.c b/kernel/trace/trace_eprobe.c
index a1d402124836..f3e0442c3b96 100644
--- a/kernel/trace/trace_eprobe.c
+++ b/kernel/trace/trace_eprobe.c
@@ -484,13 +484,6 @@ static void eprobe_trigger_func(struct event_trigger_data *data,
__eprobe_trace_func(edata, rec);
}
-static const struct event_trigger_ops eprobe_trigger_ops = {
- .trigger = eprobe_trigger_func,
- .print = eprobe_trigger_print,
- .init = eprobe_trigger_init,
- .free = eprobe_trigger_free,
-};
-
static int eprobe_trigger_cmd_parse(struct event_command *cmd_ops,
struct trace_event_file *file,
char *glob, char *cmd,
@@ -513,12 +506,6 @@ static void eprobe_trigger_unreg_func(char *glob,
}
-static const struct event_trigger_ops *eprobe_trigger_get_ops(char *cmd,
- char *param)
-{
- return &eprobe_trigger_ops;
-}
-
static struct event_command event_trigger_cmd = {
.name = "eprobe",
.trigger_type = ETT_EVENT_EPROBE,
@@ -527,8 +514,11 @@ static struct event_command event_trigger_cmd = {
.reg = eprobe_trigger_reg_func,
.unreg = eprobe_trigger_unreg_func,
.unreg_all = NULL,
- .get_trigger_ops = eprobe_trigger_get_ops,
.set_filter = NULL,
+ .trigger = eprobe_trigger_func,
+ .print = eprobe_trigger_print,
+ .init = eprobe_trigger_init,
+ .free = eprobe_trigger_free,
};
static struct event_trigger_data *
@@ -548,7 +538,6 @@ new_eprobe_trigger(struct trace_eprobe *ep, struct trace_event_file *file)
trigger->flags = EVENT_TRIGGER_FL_PROBE;
trigger->count = -1;
- trigger->ops = &eprobe_trigger_ops;
/*
* EVENT PROBE triggers are not registered as commands with
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index e00da4182deb..9b07ad9eb284 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -845,13 +845,13 @@ static int __ftrace_event_enable_disable(struct trace_event_file *file,
if (soft_disable)
set_bit(EVENT_FILE_FL_SOFT_DISABLED_BIT, &file->flags);
- if (tr->trace_flags & TRACE_ITER_RECORD_CMD) {
+ if (tr->trace_flags & TRACE_ITER(RECORD_CMD)) {
cmd = true;
tracing_start_cmdline_record();
set_bit(EVENT_FILE_FL_RECORDED_CMD_BIT, &file->flags);
}
- if (tr->trace_flags & TRACE_ITER_RECORD_TGID) {
+ if (tr->trace_flags & TRACE_ITER(RECORD_TGID)) {
tgid = true;
tracing_start_tgid_record();
set_bit(EVENT_FILE_FL_RECORDED_TGID_BIT, &file->flags);
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 6bfaf1210dd2..289bdea98776 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -5696,7 +5696,7 @@ static void hist_trigger_show(struct seq_file *m,
seq_puts(m, "\n\n");
seq_puts(m, "# event histogram\n#\n# trigger info: ");
- data->ops->print(m, data);
+ data->cmd_ops->print(m, data);
seq_puts(m, "#\n\n");
hist_data = data->private_data;
@@ -6018,7 +6018,7 @@ static void hist_trigger_debug_show(struct seq_file *m,
seq_puts(m, "\n\n");
seq_puts(m, "# event histogram\n#\n# trigger info: ");
- data->ops->print(m, data);
+ data->cmd_ops->print(m, data);
seq_puts(m, "#\n\n");
hist_data = data->private_data;
@@ -6328,20 +6328,21 @@ static void event_hist_trigger_free(struct event_trigger_data *data)
free_hist_pad();
}
-static const struct event_trigger_ops event_hist_trigger_ops = {
- .trigger = event_hist_trigger,
- .print = event_hist_trigger_print,
- .init = event_hist_trigger_init,
- .free = event_hist_trigger_free,
-};
-
static int event_hist_trigger_named_init(struct event_trigger_data *data)
{
+ int ret;
+
data->ref++;
save_named_trigger(data->named_data->name, data);
- return event_hist_trigger_init(data->named_data);
+ ret = event_hist_trigger_init(data->named_data);
+ if (ret < 0) {
+ kfree(data->cmd_ops);
+ data->cmd_ops = &trigger_hist_cmd;
+ }
+
+ return ret;
}
static void event_hist_trigger_named_free(struct event_trigger_data *data)
@@ -6353,24 +6354,14 @@ static void event_hist_trigger_named_free(struct event_trigger_data *data)
data->ref--;
if (!data->ref) {
+ struct event_command *cmd_ops = data->cmd_ops;
+
del_named_trigger(data);
trigger_data_free(data);
+ kfree(cmd_ops);
}
}
-static const struct event_trigger_ops event_hist_trigger_named_ops = {
- .trigger = event_hist_trigger,
- .print = event_hist_trigger_print,
- .init = event_hist_trigger_named_init,
- .free = event_hist_trigger_named_free,
-};
-
-static const struct event_trigger_ops *event_hist_get_trigger_ops(char *cmd,
- char *param)
-{
- return &event_hist_trigger_ops;
-}
-
static void hist_clear(struct event_trigger_data *data)
{
struct hist_trigger_data *hist_data = data->private_data;
@@ -6564,13 +6555,24 @@ static int hist_register_trigger(char *glob,
data->paused = true;
if (named_data) {
+ struct event_command *cmd_ops;
+
data->private_data = named_data->private_data;
set_named_trigger_data(data, named_data);
- data->ops = &event_hist_trigger_named_ops;
+ /* Copy the command ops and update some of the functions */
+ cmd_ops = kmalloc(sizeof(*cmd_ops), GFP_KERNEL);
+ if (!cmd_ops) {
+ ret = -ENOMEM;
+ goto out;
+ }
+ *cmd_ops = *data->cmd_ops;
+ cmd_ops->init = event_hist_trigger_named_init;
+ cmd_ops->free = event_hist_trigger_named_free;
+ data->cmd_ops = cmd_ops;
}
- if (data->ops->init) {
- ret = data->ops->init(data);
+ if (data->cmd_ops->init) {
+ ret = data->cmd_ops->init(data);
if (ret < 0)
goto out;
}
@@ -6684,8 +6686,8 @@ static void hist_unregister_trigger(char *glob,
}
}
- if (test && test->ops->free)
- test->ops->free(test);
+ if (test && test->cmd_ops->free)
+ test->cmd_ops->free(test);
if (hist_data->enable_timestamps) {
if (!hist_data->remove || test)
@@ -6737,8 +6739,8 @@ static void hist_unreg_all(struct trace_event_file *file)
update_cond_flag(file);
if (hist_data->enable_timestamps)
tracing_set_filter_buffering(file->tr, false);
- if (test->ops->free)
- test->ops->free(test);
+ if (test->cmd_ops->free)
+ test->cmd_ops->free(test);
}
}
}
@@ -6914,8 +6916,11 @@ static struct event_command trigger_hist_cmd = {
.reg = hist_register_trigger,
.unreg = hist_unregister_trigger,
.unreg_all = hist_unreg_all,
- .get_trigger_ops = event_hist_get_trigger_ops,
.set_filter = set_trigger_filter,
+ .trigger = event_hist_trigger,
+ .print = event_hist_trigger_print,
+ .init = event_hist_trigger_init,
+ .free = event_hist_trigger_free,
};
__init int register_trigger_hist_cmd(void)
@@ -6947,66 +6952,6 @@ hist_enable_trigger(struct event_trigger_data *data,
}
}
-static void
-hist_enable_count_trigger(struct event_trigger_data *data,
- struct trace_buffer *buffer, void *rec,
- struct ring_buffer_event *event)
-{
- if (!data->count)
- return;
-
- if (data->count != -1)
- (data->count)--;
-
- hist_enable_trigger(data, buffer, rec, event);
-}
-
-static const struct event_trigger_ops hist_enable_trigger_ops = {
- .trigger = hist_enable_trigger,
- .print = event_enable_trigger_print,
- .init = event_trigger_init,
- .free = event_enable_trigger_free,
-};
-
-static const struct event_trigger_ops hist_enable_count_trigger_ops = {
- .trigger = hist_enable_count_trigger,
- .print = event_enable_trigger_print,
- .init = event_trigger_init,
- .free = event_enable_trigger_free,
-};
-
-static const struct event_trigger_ops hist_disable_trigger_ops = {
- .trigger = hist_enable_trigger,
- .print = event_enable_trigger_print,
- .init = event_trigger_init,
- .free = event_enable_trigger_free,
-};
-
-static const struct event_trigger_ops hist_disable_count_trigger_ops = {
- .trigger = hist_enable_count_trigger,
- .print = event_enable_trigger_print,
- .init = event_trigger_init,
- .free = event_enable_trigger_free,
-};
-
-static const struct event_trigger_ops *
-hist_enable_get_trigger_ops(char *cmd, char *param)
-{
- const struct event_trigger_ops *ops;
- bool enable;
-
- enable = (strcmp(cmd, ENABLE_HIST_STR) == 0);
-
- if (enable)
- ops = param ? &hist_enable_count_trigger_ops :
- &hist_enable_trigger_ops;
- else
- ops = param ? &hist_disable_count_trigger_ops :
- &hist_disable_trigger_ops;
-
- return ops;
-}
-
static void hist_enable_unreg_all(struct trace_event_file *file)
{
struct event_trigger_data *test, *n;
@@ -7016,8 +6961,8 @@ static void hist_enable_unreg_all(struct trace_event_file *file)
list_del_rcu(&test->list);
update_cond_flag(file);
trace_event_trigger_enable_disable(file, 0);
- if (test->ops->free)
- test->ops->free(test);
+ if (test->cmd_ops->free)
+ test->cmd_ops->free(test);
}
}
}
@@ -7029,8 +6974,12 @@ static struct event_command trigger_hist_enable_cmd = {
.reg = event_enable_register_trigger,
.unreg = event_enable_unregister_trigger,
.unreg_all = hist_enable_unreg_all,
- .get_trigger_ops = hist_enable_get_trigger_ops,
.set_filter = set_trigger_filter,
+ .trigger = hist_enable_trigger,
+ .count_func = event_trigger_count,
+ .print = event_enable_trigger_print,
+ .init = event_trigger_init,
+ .free = event_enable_trigger_free,
};
static struct event_command trigger_hist_disable_cmd = {
@@ -7040,8 +6989,12 @@ static struct event_command trigger_hist_disable_cmd = {
.reg = event_enable_register_trigger,
.unreg = event_enable_unregister_trigger,
.unreg_all = hist_enable_unreg_all,
- .get_trigger_ops = hist_enable_get_trigger_ops,
.set_filter = set_trigger_filter,
+ .trigger = hist_enable_trigger,
+ .count_func = event_trigger_count,
+ .print = event_enable_trigger_print,
+ .init = event_trigger_init,
+ .free = event_enable_trigger_free,
};
static __init void unregister_trigger_hist_enable_disable_cmds(void)
diff --git a/kernel/trace/trace_events_synth.c b/kernel/trace/trace_events_synth.c
index f24ee61f8884..2f19bbe73d27 100644
--- a/kernel/trace/trace_events_synth.c
+++ b/kernel/trace/trace_events_synth.c
@@ -359,7 +359,7 @@ static enum print_line_t print_synth_event(struct trace_iterator *iter,
fmt = synth_field_fmt(se->fields[i]->type);
/* parameter types */
- if (tr && tr->trace_flags & TRACE_ITER_VERBOSE)
+ if (tr && tr->trace_flags & TRACE_ITER(VERBOSE))
trace_seq_printf(s, "%s ", fmt);
snprintf(print_fmt, sizeof(print_fmt), "%%s=%s%%s", fmt);
diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c
index cbfc306c0159..96aad82b1628 100644
--- a/kernel/trace/trace_events_trigger.c
+++ b/kernel/trace/trace_events_trigger.c
@@ -6,6 +6,7 @@
*/
#include <linux/security.h>
+#include <linux/kthread.h>
#include <linux/module.h>
#include <linux/ctype.h>
#include <linux/mutex.h>
@@ -17,15 +18,77 @@
static LIST_HEAD(trigger_commands);
static DEFINE_MUTEX(trigger_cmd_mutex);
+static struct task_struct *trigger_kthread;
+static struct llist_head trigger_data_free_list;
+static DEFINE_MUTEX(trigger_data_kthread_mutex);
+
+/* Bulk garbage collection of event_trigger_data elements */
+static int trigger_kthread_fn(void *ignore)
+{
+ struct event_trigger_data *data, *tmp;
+ struct llist_node *llnodes;
+
+ /* Once this task starts, it lives forever */
+ for (;;) {
+ set_current_state(TASK_INTERRUPTIBLE);
+ if (llist_empty(&trigger_data_free_list))
+ schedule();
+
+ __set_current_state(TASK_RUNNING);
+
+ llnodes = llist_del_all(&trigger_data_free_list);
+
+ /* make sure current triggers exit before free */
+ tracepoint_synchronize_unregister();
+
+ llist_for_each_entry_safe(data, tmp, llnodes, llist)
+ kfree(data);
+ }
+
+ return 0;
+}
+
void trigger_data_free(struct event_trigger_data *data)
{
if (data->cmd_ops->set_filter)
data->cmd_ops->set_filter(NULL, data, NULL);
- /* make sure current triggers exit before free */
- tracepoint_synchronize_unregister();
+ if (unlikely(!trigger_kthread)) {
+ guard(mutex)(&trigger_data_kthread_mutex);
+ /* Check again after taking mutex */
+ if (!trigger_kthread) {
+ struct task_struct *kthread;
+
+ kthread = kthread_create(trigger_kthread_fn, NULL,
+ "trigger_data_free");
+ if (!IS_ERR(kthread))
+ WRITE_ONCE(trigger_kthread, kthread);
+ }
+ }
- kfree(data);
+ if (!trigger_kthread) {
+ /* Do it the slow way */
+ tracepoint_synchronize_unregister();
+ kfree(data);
+ return;
+ }
+
+ llist_add(&data->llist, &trigger_data_free_list);
+ wake_up_process(trigger_kthread);
+}
+
+static inline void data_ops_trigger(struct event_trigger_data *data,
+ struct trace_buffer *buffer, void *rec,
+ struct ring_buffer_event *event)
+{
+ const struct event_command *cmd_ops = data->cmd_ops;
+
+ if (data->flags & EVENT_TRIGGER_FL_COUNT) {
+ if (!cmd_ops->count_func(data, buffer, rec, event))
+ return;
+ }
+
+ cmd_ops->trigger(data, buffer, rec, event);
}
/**
@@ -70,7 +133,7 @@ event_triggers_call(struct trace_event_file *file,
if (data->paused)
continue;
if (!rec) {
- data->ops->trigger(data, buffer, rec, event);
+ data_ops_trigger(data, buffer, rec, event);
continue;
}
filter = rcu_dereference_sched(data->filter);
@@ -80,7 +143,7 @@ event_triggers_call(struct trace_event_file *file,
tt |= data->cmd_ops->trigger_type;
continue;
}
- data->ops->trigger(data, buffer, rec, event);
+ data_ops_trigger(data, buffer, rec, event);
}
return tt;
}
@@ -122,7 +185,7 @@ event_triggers_post_call(struct trace_event_file *file,
if (data->paused)
continue;
if (data->cmd_ops->trigger_type & tt)
- data->ops->trigger(data, NULL, NULL, NULL);
+ data_ops_trigger(data, NULL, NULL, NULL);
}
}
EXPORT_SYMBOL_GPL(event_triggers_post_call);
@@ -191,7 +254,7 @@ static int trigger_show(struct seq_file *m, void *v)
}
data = list_entry(v, struct event_trigger_data, list);
- data->ops->print(m, data);
+ data->cmd_ops->print(m, data);
return 0;
}
@@ -245,7 +308,8 @@ int trigger_process_regex(struct trace_event_file *file, char *buff)
char *command, *next;
struct event_command *p;
- next = buff = skip_spaces(buff);
+ next = buff = strim(buff);
+
command = strsep(&next, ": \t");
if (next) {
next = skip_spaces(next);
@@ -282,8 +346,6 @@ static ssize_t event_trigger_regex_write(struct file *file,
if (IS_ERR(buf))
return PTR_ERR(buf);
- strim(buf);
-
guard(mutex)(&event_mutex);
event_file = event_file_file(file);
@@ -300,13 +362,9 @@ static ssize_t event_trigger_regex_write(struct file *file,
static int event_trigger_regex_release(struct inode *inode, struct file *file)
{
- mutex_lock(&event_mutex);
-
if (file->f_mode & FMODE_READ)
seq_release(inode, file);
- mutex_unlock(&event_mutex);
-
return 0;
}
@@ -378,7 +436,37 @@ __init int unregister_event_command(struct event_command *cmd)
}
/**
- * event_trigger_print - Generic event_trigger_ops @print implementation
+ * event_trigger_count - Optional count function for event triggers
+ * @data: Trigger-specific data
+ * @buffer: The ring buffer that the event is being written to
+ * @rec: The trace entry for the event, NULL for unconditional invocation
+ * @event: The event meta data in the ring buffer
+ *
+ * Triggers that take a count parameter requiring no special handling
+ * can assign this function to their .count_func field.
+ *
+ * This simply does a count down of the @data->count field.
+ *
+ * If @data->count is not -1 (unlimited), it is decremented.
+ *
+ * Returns false if @data->count is zero, otherwise true.
+ */
+bool event_trigger_count(struct event_trigger_data *data,
+ struct trace_buffer *buffer, void *rec,
+ struct ring_buffer_event *event)
+{
+ if (!data->count)
+ return false;
+
+ if (data->count != -1)
+ (data->count)--;
+
+ return true;
+}
+
+/**
+ * event_trigger_print - Generic event_command @print implementation
* @name: The name of the event trigger
* @m: The seq_file being printed to
* @data: Trigger-specific data
@@ -413,7 +501,7 @@ event_trigger_print(const char *name, struct seq_file *m,
}
/**
- * event_trigger_init - Generic event_trigger_ops @init implementation
+ * event_trigger_init - Generic event_command @init implementation
* @data: Trigger-specific data
*
* Common implementation of event trigger initialization.
@@ -430,7 +518,7 @@ int event_trigger_init(struct event_trigger_data *data)
}
/**
- * event_trigger_free - Generic event_trigger_ops @free implementation
+ * event_trigger_free - Generic event_command @free implementation
* @data: Trigger-specific data
*
* Common implementation of event trigger de-initialization.
@@ -492,8 +580,8 @@ clear_event_triggers(struct trace_array *tr)
list_for_each_entry_safe(data, n, &file->triggers, list) {
trace_event_trigger_enable_disable(file, 0);
list_del_rcu(&data->list);
- if (data->ops->free)
- data->ops->free(data);
+ if (data->cmd_ops->free)
+ data->cmd_ops->free(data);
}
}
}
@@ -556,8 +644,8 @@ static int register_trigger(char *glob,
return -EEXIST;
}
- if (data->ops->init) {
- ret = data->ops->init(data);
+ if (data->cmd_ops->init) {
+ ret = data->cmd_ops->init(data);
if (ret < 0)
return ret;
}
@@ -595,8 +683,8 @@ static bool try_unregister_trigger(char *glob,
}
if (data) {
- if (data->ops->free)
- data->ops->free(data);
+ if (data->cmd_ops->free)
+ data->cmd_ops->free(data);
return true;
}
@@ -807,9 +895,13 @@ int event_trigger_separate_filter(char *param_and_filter, char **param,
* @private_data: User data to associate with the event trigger
*
* Allocate an event_trigger_data instance and initialize it. The
- * @cmd_ops are used along with the @cmd and @param to get the
- * trigger_ops to assign to the event_trigger_data. @private_data can
- * also be passed in and associated with the event_trigger_data.
+ * @cmd_ops defines how the trigger will operate. If @param is set,
+ * and @cmd_ops->count_func is non-NULL, then data->count is set to
+ * @param and @cmd_ops->count_func() is called before the trigger is
+ * executed. If that function returns false, the @cmd_ops->trigger()
+ * function will not be called.
+ * @private_data can also be passed in and associated with the
+ * event_trigger_data.
*
* Use trigger_data_free() to free an event_trigger_data object.
*
@@ -821,18 +913,16 @@ struct event_trigger_data *trigger_data_alloc(struct event_command *cmd_ops,
void *private_data)
{
struct event_trigger_data *trigger_data;
- const struct event_trigger_ops *trigger_ops;
-
- trigger_ops = cmd_ops->get_trigger_ops(cmd, param);
trigger_data = kzalloc(sizeof(*trigger_data), GFP_KERNEL);
if (!trigger_data)
return NULL;
trigger_data->count = -1;
- trigger_data->ops = trigger_ops;
trigger_data->cmd_ops = cmd_ops;
trigger_data->private_data = private_data;
+ if (param && cmd_ops->count_func)
+ trigger_data->flags |= EVENT_TRIGGER_FL_COUNT;
INIT_LIST_HEAD(&trigger_data->list);
INIT_LIST_HEAD(&trigger_data->named_list);
@@ -1271,31 +1361,28 @@ traceon_trigger(struct event_trigger_data *data,
tracing_on();
}
-static void
-traceon_count_trigger(struct event_trigger_data *data,
- struct trace_buffer *buffer, void *rec,
- struct ring_buffer_event *event)
+static bool
+traceon_count_func(struct event_trigger_data *data,
+ struct trace_buffer *buffer, void *rec,
+ struct ring_buffer_event *event)
{
struct trace_event_file *file = data->private_data;
if (file) {
if (tracer_tracing_is_on(file->tr))
- return;
+ return false;
} else {
if (tracing_is_on())
- return;
+ return false;
}
if (!data->count)
- return;
+ return false;
if (data->count != -1)
(data->count)--;
- if (file)
- tracer_tracing_on(file->tr);
- else
- tracing_on();
+ return true;
}
static void
@@ -1319,31 +1406,28 @@ traceoff_trigger(struct event_trigger_data *data,
tracing_off();
}
-static void
-traceoff_count_trigger(struct event_trigger_data *data,
- struct trace_buffer *buffer, void *rec,
- struct ring_buffer_event *event)
+static bool
+traceoff_count_func(struct event_trigger_data *data,
+ struct trace_buffer *buffer, void *rec,
+ struct ring_buffer_event *event)
{
struct trace_event_file *file = data->private_data;
if (file) {
if (!tracer_tracing_is_on(file->tr))
- return;
+ return false;
} else {
if (!tracing_is_on())
- return;
+ return false;
}
if (!data->count)
- return;
+ return false;
if (data->count != -1)
(data->count)--;
- if (file)
- tracer_tracing_off(file->tr);
- else
- tracing_off();
+ return true;
}
static int
@@ -1360,58 +1444,18 @@ traceoff_trigger_print(struct seq_file *m, struct event_trigger_data *data)
data->filter_str);
}
-static const struct event_trigger_ops traceon_trigger_ops = {
- .trigger = traceon_trigger,
- .print = traceon_trigger_print,
- .init = event_trigger_init,
- .free = event_trigger_free,
-};
-
-static const struct event_trigger_ops traceon_count_trigger_ops = {
- .trigger = traceon_count_trigger,
- .print = traceon_trigger_print,
- .init = event_trigger_init,
- .free = event_trigger_free,
-};
-
-static const struct event_trigger_ops traceoff_trigger_ops = {
- .trigger = traceoff_trigger,
- .print = traceoff_trigger_print,
- .init = event_trigger_init,
- .free = event_trigger_free,
-};
-
-static const struct event_trigger_ops traceoff_count_trigger_ops = {
- .trigger = traceoff_count_trigger,
- .print = traceoff_trigger_print,
- .init = event_trigger_init,
- .free = event_trigger_free,
-};
-
-static const struct event_trigger_ops *
-onoff_get_trigger_ops(char *cmd, char *param)
-{
- const struct event_trigger_ops *ops;
-
- /* we register both traceon and traceoff to this callback */
- if (strcmp(cmd, "traceon") == 0)
- ops = param ? &traceon_count_trigger_ops :
- &traceon_trigger_ops;
- else
- ops = param ? &traceoff_count_trigger_ops :
- &traceoff_trigger_ops;
-
- return ops;
-}
-
static struct event_command trigger_traceon_cmd = {
.name = "traceon",
.trigger_type = ETT_TRACE_ONOFF,
.parse = event_trigger_parse,
.reg = register_trigger,
.unreg = unregister_trigger,
- .get_trigger_ops = onoff_get_trigger_ops,
.set_filter = set_trigger_filter,
+ .trigger = traceon_trigger,
+ .count_func = traceon_count_func,
+ .print = traceon_trigger_print,
+ .init = event_trigger_init,
+ .free = event_trigger_free,
};
static struct event_command trigger_traceoff_cmd = {
@@ -1421,8 +1465,12 @@ static struct event_command trigger_traceoff_cmd = {
.parse = event_trigger_parse,
.reg = register_trigger,
.unreg = unregister_trigger,
- .get_trigger_ops = onoff_get_trigger_ops,
.set_filter = set_trigger_filter,
+ .trigger = traceoff_trigger,
+ .count_func = traceoff_count_func,
+ .print = traceoff_trigger_print,
+ .init = event_trigger_init,
+ .free = event_trigger_free,
};
#ifdef CONFIG_TRACER_SNAPSHOT
@@ -1439,20 +1487,6 @@ snapshot_trigger(struct event_trigger_data *data,
tracing_snapshot();
}
-static void
-snapshot_count_trigger(struct event_trigger_data *data,
- struct trace_buffer *buffer, void *rec,
- struct ring_buffer_event *event)
-{
- if (!data->count)
- return;
-
- if (data->count != -1)
- (data->count)--;
-
- snapshot_trigger(data, buffer, rec, event);
-}
-
static int
register_snapshot_trigger(char *glob,
struct event_trigger_data *data,
@@ -1484,34 +1518,18 @@ snapshot_trigger_print(struct seq_file *m, struct event_trigger_data *data)
data->filter_str);
}
-static const struct event_trigger_ops snapshot_trigger_ops = {
- .trigger = snapshot_trigger,
- .print = snapshot_trigger_print,
- .init = event_trigger_init,
- .free = event_trigger_free,
-};
-
-static const struct event_trigger_ops snapshot_count_trigger_ops = {
- .trigger = snapshot_count_trigger,
- .print = snapshot_trigger_print,
- .init = event_trigger_init,
- .free = event_trigger_free,
-};
-
-static const struct event_trigger_ops *
-snapshot_get_trigger_ops(char *cmd, char *param)
-{
- return param ? &snapshot_count_trigger_ops : &snapshot_trigger_ops;
-}
-
static struct event_command trigger_snapshot_cmd = {
.name = "snapshot",
.trigger_type = ETT_SNAPSHOT,
.parse = event_trigger_parse,
.reg = register_snapshot_trigger,
.unreg = unregister_snapshot_trigger,
- .get_trigger_ops = snapshot_get_trigger_ops,
.set_filter = set_trigger_filter,
+ .trigger = snapshot_trigger,
+ .count_func = event_trigger_count,
+ .print = snapshot_trigger_print,
+ .init = event_trigger_init,
+ .free = event_trigger_free,
};
static __init int register_trigger_snapshot_cmd(void)
@@ -1558,20 +1576,6 @@ stacktrace_trigger(struct event_trigger_data *data,
trace_dump_stack(STACK_SKIP);
}
-static void
-stacktrace_count_trigger(struct event_trigger_data *data,
- struct trace_buffer *buffer, void *rec,
- struct ring_buffer_event *event)
-{
- if (!data->count)
- return;
-
- if (data->count != -1)
- (data->count)--;
-
- stacktrace_trigger(data, buffer, rec, event);
-}
-
static int
stacktrace_trigger_print(struct seq_file *m, struct event_trigger_data *data)
{
@@ -1579,26 +1583,6 @@ stacktrace_trigger_print(struct seq_file *m, struct event_trigger_data *data)
data->filter_str);
}
-static const struct event_trigger_ops stacktrace_trigger_ops = {
- .trigger = stacktrace_trigger,
- .print = stacktrace_trigger_print,
- .init = event_trigger_init,
- .free = event_trigger_free,
-};
-
-static const struct event_trigger_ops stacktrace_count_trigger_ops = {
- .trigger = stacktrace_count_trigger,
- .print = stacktrace_trigger_print,
- .init = event_trigger_init,
- .free = event_trigger_free,
-};
-
-static const struct event_trigger_ops *
-stacktrace_get_trigger_ops(char *cmd, char *param)
-{
- return param ? &stacktrace_count_trigger_ops : &stacktrace_trigger_ops;
-}
-
static struct event_command trigger_stacktrace_cmd = {
.name = "stacktrace",
.trigger_type = ETT_STACKTRACE,
@@ -1606,8 +1590,12 @@ static struct event_command trigger_stacktrace_cmd = {
.parse = event_trigger_parse,
.reg = register_trigger,
.unreg = unregister_trigger,
- .get_trigger_ops = stacktrace_get_trigger_ops,
.set_filter = set_trigger_filter,
+ .trigger = stacktrace_trigger,
+ .count_func = event_trigger_count,
+ .print = stacktrace_trigger_print,
+ .init = event_trigger_init,
+ .free = event_trigger_free,
};
static __init int register_trigger_stacktrace_cmd(void)
@@ -1642,24 +1630,24 @@ event_enable_trigger(struct event_trigger_data *data,
set_bit(EVENT_FILE_FL_SOFT_DISABLED_BIT, &enable_data->file->flags);
}
-static void
-event_enable_count_trigger(struct event_trigger_data *data,
- struct trace_buffer *buffer, void *rec,
- struct ring_buffer_event *event)
+static bool
+event_enable_count_func(struct event_trigger_data *data,
+ struct trace_buffer *buffer, void *rec,
+ struct ring_buffer_event *event)
{
struct enable_trigger_data *enable_data = data->private_data;
if (!data->count)
- return;
+ return false;
/* Skip if the event is in a state we want to switch to */
if (enable_data->enable == !(enable_data->file->flags & EVENT_FILE_FL_SOFT_DISABLED))
- return;
+ return false;
if (data->count != -1)
(data->count)--;
- event_enable_trigger(data, buffer, rec, event);
+ return true;
}
int event_enable_trigger_print(struct seq_file *m,
@@ -1704,34 +1692,6 @@ void event_enable_trigger_free(struct event_trigger_data *data)
}
}
-static const struct event_trigger_ops event_enable_trigger_ops = {
- .trigger = event_enable_trigger,
- .print = event_enable_trigger_print,
- .init = event_trigger_init,
- .free = event_enable_trigger_free,
-};
-
-static const struct event_trigger_ops event_enable_count_trigger_ops = {
- .trigger = event_enable_count_trigger,
- .print = event_enable_trigger_print,
- .init = event_trigger_init,
- .free = event_enable_trigger_free,
-};
-
-static const struct event_trigger_ops event_disable_trigger_ops = {
- .trigger = event_enable_trigger,
- .print = event_enable_trigger_print,
- .init = event_trigger_init,
- .free = event_enable_trigger_free,
-};
-
-static const struct event_trigger_ops event_disable_count_trigger_ops = {
- .trigger = event_enable_count_trigger,
- .print = event_enable_trigger_print,
- .init = event_trigger_init,
- .free = event_enable_trigger_free,
-};
-
int event_enable_trigger_parse(struct event_command *cmd_ops,
struct trace_event_file *file,
char *glob, char *cmd, char *param_and_filter)
@@ -1861,8 +1821,8 @@ int event_enable_register_trigger(char *glob,
}
}
- if (data->ops->init) {
- ret = data->ops->init(data);
+ if (data->cmd_ops->init) {
+ ret = data->cmd_ops->init(data);
if (ret < 0)
return ret;
}
@@ -1902,30 +1862,8 @@ void event_enable_unregister_trigger(char *glob,
}
}
- if (data && data->ops->free)
- data->ops->free(data);
-}
-
-static const struct event_trigger_ops *
-event_enable_get_trigger_ops(char *cmd, char *param)
-{
- const struct event_trigger_ops *ops;
- bool enable;
-
-#ifdef CONFIG_HIST_TRIGGERS
- enable = ((strcmp(cmd, ENABLE_EVENT_STR) == 0) ||
- (strcmp(cmd, ENABLE_HIST_STR) == 0));
-#else
- enable = strcmp(cmd, ENABLE_EVENT_STR) == 0;
-#endif
- if (enable)
- ops = param ? &event_enable_count_trigger_ops :
- &event_enable_trigger_ops;
- else
- ops = param ? &event_disable_count_trigger_ops :
- &event_disable_trigger_ops;
-
- return ops;
+ if (data && data->cmd_ops->free)
+ data->cmd_ops->free(data);
}
static struct event_command trigger_enable_cmd = {
@@ -1934,8 +1872,12 @@ static struct event_command trigger_enable_cmd = {
.parse = event_enable_trigger_parse,
.reg = event_enable_register_trigger,
.unreg = event_enable_unregister_trigger,
- .get_trigger_ops = event_enable_get_trigger_ops,
.set_filter = set_trigger_filter,
+ .trigger = event_enable_trigger,
+ .count_func = event_enable_count_func,
+ .print = event_enable_trigger_print,
+ .init = event_trigger_init,
+ .free = event_enable_trigger_free,
};
static struct event_command trigger_disable_cmd = {
@@ -1944,8 +1886,12 @@ static struct event_command trigger_disable_cmd = {
.parse = event_enable_trigger_parse,
.reg = event_enable_register_trigger,
.unreg = event_enable_unregister_trigger,
- .get_trigger_ops = event_enable_get_trigger_ops,
.set_filter = set_trigger_filter,
+ .trigger = event_enable_trigger,
+ .count_func = event_enable_count_func,
+ .print = event_enable_trigger_print,
+ .init = event_trigger_init,
+ .free = event_enable_trigger_free,
};
static __init void unregister_trigger_enable_disable_cmds(void)
diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c
index 8001dbf16891..262c0556e4af 100644
--- a/kernel/trace/trace_fprobe.c
+++ b/kernel/trace/trace_fprobe.c
@@ -632,7 +632,7 @@ print_fentry_event(struct trace_iterator *iter, int flags,
trace_seq_printf(s, "%s: (", trace_probe_name(tp));
- if (!seq_print_ip_sym(s, field->ip, flags | TRACE_ITER_SYM_OFFSET))
+ if (!seq_print_ip_sym_offset(s, field->ip, flags))
goto out;
trace_seq_putc(s, ')');
@@ -662,12 +662,12 @@ print_fexit_event(struct trace_iterator *iter, int flags,
trace_seq_printf(s, "%s: (", trace_probe_name(tp));
- if (!seq_print_ip_sym(s, field->ret_ip, flags | TRACE_ITER_SYM_OFFSET))
+ if (!seq_print_ip_sym_offset(s, field->ret_ip, flags))
goto out;
trace_seq_puts(s, " <- ");
- if (!seq_print_ip_sym(s, field->func, flags & ~TRACE_ITER_SYM_OFFSET))
+ if (!seq_print_ip_sym_no_offset(s, field->func, flags))
goto out;
trace_seq_putc(s, ')');
diff --git a/kernel/trace/trace_functions.c b/kernel/trace/trace_functions.c
index d17c18934445..c12795c2fb39 100644
--- a/kernel/trace/trace_functions.c
+++ b/kernel/trace/trace_functions.c
@@ -154,11 +154,11 @@ static int function_trace_init(struct trace_array *tr)
if (!tr->ops)
return -ENOMEM;
- func = select_trace_function(func_flags.val);
+ func = select_trace_function(tr->current_trace_flags->val);
if (!func)
return -EINVAL;
- if (!handle_func_repeats(tr, func_flags.val))
+ if (!handle_func_repeats(tr, tr->current_trace_flags->val))
return -ENOMEM;
ftrace_init_array_ops(tr, func);
@@ -459,14 +459,14 @@ func_set_flag(struct trace_array *tr, u32 old_flags, u32 bit, int set)
u32 new_flags;
/* Do nothing if already set. */
- if (!!set == !!(func_flags.val & bit))
+ if (!!set == !!(tr->current_trace_flags->val & bit))
return 0;
/* We can change this flag only when not running. */
if (tr->current_trace != &function_trace)
return 0;
- new_flags = (func_flags.val & ~bit) | (set ? bit : 0);
+ new_flags = (tr->current_trace_flags->val & ~bit) | (set ? bit : 0);
func = select_trace_function(new_flags);
if (!func)
return -EINVAL;
@@ -491,7 +491,7 @@ static struct tracer function_trace __tracer_data =
.init = function_trace_init,
.reset = function_trace_reset,
.start = function_trace_start,
- .flags = &func_flags,
+ .default_flags = &func_flags,
.set_flag = func_set_flag,
.allow_instances = true,
#ifdef CONFIG_FTRACE_SELFTEST
diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index a7f4b9a47a71..17c75cf2348e 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -16,9 +16,12 @@
#include "trace.h"
#include "trace_output.h"
-/* When set, irq functions will be ignored */
+/* When set, irq functions might be ignored */
static int ftrace_graph_skip_irqs;
+/* Do not record function time when task is sleeping */
+int fgraph_no_sleep_time;
+
struct fgraph_cpu_data {
pid_t last_pid;
int depth;
@@ -33,14 +36,19 @@ struct fgraph_ent_args {
unsigned long args[FTRACE_REGS_MAX_ARGS];
};
+struct fgraph_retaddr_ent_args {
+ struct fgraph_retaddr_ent_entry ent;
+ /* Force the sizeof of args[] to have FTRACE_REGS_MAX_ARGS entries */
+ unsigned long args[FTRACE_REGS_MAX_ARGS];
+};
+
struct fgraph_data {
struct fgraph_cpu_data __percpu *cpu_data;
/* Place to preserve last processed entry. */
union {
struct fgraph_ent_args ent;
- /* TODO allow retaddr to have args */
- struct fgraph_retaddr_ent_entry rent;
+ struct fgraph_retaddr_ent_args rent;
};
struct ftrace_graph_ret_entry ret;
int failed;
@@ -85,11 +93,6 @@ static struct tracer_opt trace_opts[] = {
/* Include sleep time (scheduled out) between entry and return */
{ TRACER_OPT(sleep-time, TRACE_GRAPH_SLEEP_TIME) },
-#ifdef CONFIG_FUNCTION_PROFILER
- /* Include time within nested functions */
- { TRACER_OPT(graph-time, TRACE_GRAPH_GRAPH_TIME) },
-#endif
-
{ } /* Empty entry */
};
@@ -97,13 +100,13 @@ static struct tracer_flags tracer_flags = {
/* Don't display overruns, proc, or tail by default */
.val = TRACE_GRAPH_PRINT_CPU | TRACE_GRAPH_PRINT_OVERHEAD |
TRACE_GRAPH_PRINT_DURATION | TRACE_GRAPH_PRINT_IRQS |
- TRACE_GRAPH_SLEEP_TIME | TRACE_GRAPH_GRAPH_TIME,
+ TRACE_GRAPH_SLEEP_TIME,
.opts = trace_opts
};
-static bool tracer_flags_is_set(u32 flags)
+static bool tracer_flags_is_set(struct trace_array *tr, u32 flags)
{
- return (tracer_flags.val & flags) == flags;
+ return (tr->current_trace_flags->val & flags) == flags;
}
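tracer_flags_is_set() now reads the per-instance copy instead of the tracer's static tracer_flags, matching the .flags -> .default_flags rename further down. How an instance gets its copy is not part of this hunk; a hypothetical sketch of that step (everything here except default_flags and current_trace_flags is invented for illustration):

	/* Hypothetical: duplicate a tracer's default flags for one instance */
	static int copy_tracer_flags(struct trace_array *tr, struct tracer *t)
	{
		tr->current_trace_flags = kmemdup(t->default_flags,
						  sizeof(*t->default_flags),
						  GFP_KERNEL);
		return tr->current_trace_flags ? 0 : -ENOMEM;
	}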
/*
@@ -162,20 +165,32 @@ int __trace_graph_entry(struct trace_array *tr,
int __trace_graph_retaddr_entry(struct trace_array *tr,
struct ftrace_graph_ent *trace,
unsigned int trace_ctx,
- unsigned long retaddr)
+ unsigned long retaddr,
+ struct ftrace_regs *fregs)
{
struct ring_buffer_event *event;
struct trace_buffer *buffer = tr->array_buffer.buffer;
struct fgraph_retaddr_ent_entry *entry;
+ int size;
+
+ /* If fregs is defined, add FTRACE_REGS_MAX_ARGS long size words */
+ size = sizeof(*entry) + (FTRACE_REGS_MAX_ARGS * !!fregs * sizeof(long));
event = trace_buffer_lock_reserve(buffer, TRACE_GRAPH_RETADDR_ENT,
- sizeof(*entry), trace_ctx);
+ size, trace_ctx);
if (!event)
return 0;
entry = ring_buffer_event_data(event);
- entry->graph_ent.func = trace->func;
- entry->graph_ent.depth = trace->depth;
- entry->graph_ent.retaddr = retaddr;
+ entry->graph_rent.ent = *trace;
+ entry->graph_rent.retaddr = retaddr;
+
+#ifdef CONFIG_HAVE_FUNCTION_ARG_ACCESS_API
+ if (fregs) {
+ for (int i = 0; i < FTRACE_REGS_MAX_ARGS; i++)
+ entry->args[i] = ftrace_regs_get_argument(fregs, i);
+ }
+#endif
+
trace_buffer_unlock_commit_nostack(buffer, event);
return 1;
@@ -184,17 +199,21 @@ int __trace_graph_retaddr_entry(struct trace_array *tr,
int __trace_graph_retaddr_entry(struct trace_array *tr,
struct ftrace_graph_ent *trace,
unsigned int trace_ctx,
- unsigned long retaddr)
+ unsigned long retaddr,
+ struct ftrace_regs *fregs)
{
return 1;
}
#endif
-static inline int ftrace_graph_ignore_irqs(void)
+static inline int ftrace_graph_ignore_irqs(struct trace_array *tr)
{
if (!ftrace_graph_skip_irqs || trace_recursion_test(TRACE_IRQ_BIT))
return 0;
+ if (tracer_flags_is_set(tr, TRACE_GRAPH_PRINT_IRQS))
+ return 0;
+
return in_hardirq();
}
@@ -238,16 +257,17 @@ static int graph_entry(struct ftrace_graph_ent *trace,
if (ftrace_graph_ignore_func(gops, trace))
return 0;
- if (ftrace_graph_ignore_irqs())
+ if (ftrace_graph_ignore_irqs(tr))
return 0;
- if (fgraph_sleep_time) {
- /* Only need to record the calltime */
- ftimes = fgraph_reserve_data(gops->idx, sizeof(ftimes->calltime));
- } else {
+ if (fgraph_no_sleep_time &&
+ !tracer_flags_is_set(tr, TRACE_GRAPH_SLEEP_TIME)) {
ftimes = fgraph_reserve_data(gops->idx, sizeof(*ftimes));
if (ftimes)
ftimes->sleeptime = current->ftrace_sleeptime;
+ } else {
+ /* Only need to record the calltime */
+ ftimes = fgraph_reserve_data(gops->idx, sizeof(ftimes->calltime));
}
if (!ftimes)
return 0;
@@ -263,9 +283,10 @@ static int graph_entry(struct ftrace_graph_ent *trace,
trace_ctx = tracing_gen_ctx();
if (IS_ENABLED(CONFIG_FUNCTION_GRAPH_RETADDR) &&
- tracer_flags_is_set(TRACE_GRAPH_PRINT_RETADDR)) {
+ tracer_flags_is_set(tr, TRACE_GRAPH_PRINT_RETADDR)) {
unsigned long retaddr = ftrace_graph_top_ret_addr(current);
- ret = __trace_graph_retaddr_entry(tr, trace, trace_ctx, retaddr);
+ ret = __trace_graph_retaddr_entry(tr, trace, trace_ctx,
+ retaddr, fregs);
} else {
ret = __graph_entry(tr, trace, trace_ctx, fregs);
}
@@ -333,11 +354,15 @@ void __trace_graph_return(struct trace_array *tr,
trace_buffer_unlock_commit_nostack(buffer, event);
}
-static void handle_nosleeptime(struct ftrace_graph_ret *trace,
+static void handle_nosleeptime(struct trace_array *tr,
+ struct ftrace_graph_ret *trace,
struct fgraph_times *ftimes,
int size)
{
- if (fgraph_sleep_time || size < sizeof(*ftimes))
+ if (size < sizeof(*ftimes))
+ return;
+
+ if (!fgraph_no_sleep_time || tracer_flags_is_set(tr, TRACE_GRAPH_SLEEP_TIME))
return;
ftimes->calltime += current->ftrace_sleeptime - ftimes->sleeptime;
@@ -366,7 +391,7 @@ void trace_graph_return(struct ftrace_graph_ret *trace,
if (!ftimes)
return;
- handle_nosleeptime(trace, ftimes, size);
+ handle_nosleeptime(tr, trace, ftimes, size);
calltime = ftimes->calltime;
@@ -379,6 +404,7 @@ static void trace_graph_thresh_return(struct ftrace_graph_ret *trace,
struct ftrace_regs *fregs)
{
struct fgraph_times *ftimes;
+ struct trace_array *tr;
int size;
ftrace_graph_addr_finish(gops, trace);
@@ -392,7 +418,8 @@ static void trace_graph_thresh_return(struct ftrace_graph_ret *trace,
if (!ftimes)
return;
- handle_nosleeptime(trace, ftimes, size);
+ tr = gops->private;
+ handle_nosleeptime(tr, trace, ftimes, size);
if (tracing_thresh &&
(trace_clock_local() - ftimes->calltime < tracing_thresh))
@@ -441,7 +468,7 @@ static int graph_trace_init(struct trace_array *tr)
{
int ret;
- if (tracer_flags_is_set(TRACE_GRAPH_ARGS))
+ if (tracer_flags_is_set(tr, TRACE_GRAPH_ARGS))
tr->gops->entryfunc = trace_graph_entry_args;
else
tr->gops->entryfunc = trace_graph_entry;
@@ -451,6 +478,12 @@ static int graph_trace_init(struct trace_array *tr)
else
tr->gops->retfunc = trace_graph_return;
+ if (!tracer_flags_is_set(tr, TRACE_GRAPH_PRINT_IRQS))
+ ftrace_graph_skip_irqs++;
+
+ if (!tracer_flags_is_set(tr, TRACE_GRAPH_SLEEP_TIME))
+ fgraph_no_sleep_time++;
+
/* Make gops functions visible before we start tracing */
smp_mb();
@@ -468,10 +501,6 @@ static int ftrace_graph_trace_args(struct trace_array *tr, int set)
{
trace_func_graph_ent_t entry;
- /* Do nothing if the current tracer is not this tracer */
- if (tr->current_trace != &graph_trace)
- return 0;
-
if (set)
entry = trace_graph_entry_args;
else
@@ -492,6 +521,16 @@ static int ftrace_graph_trace_args(struct trace_array *tr, int set)
static void graph_trace_reset(struct trace_array *tr)
{
+ if (!tracer_flags_is_set(tr, TRACE_GRAPH_PRINT_IRQS))
+ ftrace_graph_skip_irqs--;
+ if (WARN_ON_ONCE(ftrace_graph_skip_irqs < 0))
+ ftrace_graph_skip_irqs = 0;
+
+ if (!tracer_flags_is_set(tr, TRACE_GRAPH_SLEEP_TIME))
+ fgraph_no_sleep_time--;
+ if (WARN_ON_ONCE(fgraph_no_sleep_time < 0))
+ fgraph_no_sleep_time = 0;
+
tracing_stop_cmdline_record();
unregister_ftrace_graph(tr->gops);
}
@@ -634,13 +673,9 @@ get_return_for_leaf(struct trace_iterator *iter,
* Save current and next entries for later reference
* if the output fails.
*/
- if (unlikely(curr->ent.type == TRACE_GRAPH_RETADDR_ENT)) {
- data->rent = *(struct fgraph_retaddr_ent_entry *)curr;
- } else {
- int size = min((int)sizeof(data->ent), (int)iter->ent_size);
+ int size = min_t(int, sizeof(data->rent), iter->ent_size);
- memcpy(&data->ent, curr, size);
- }
+ memcpy(&data->rent, curr, size);
/*
* If the next event is not a return type, then
* we only care about what type it is. Otherwise we can
@@ -703,7 +738,7 @@ print_graph_irq(struct trace_iterator *iter, unsigned long addr,
addr >= (unsigned long)__irqentry_text_end)
return;
- if (tr->trace_flags & TRACE_ITER_CONTEXT_INFO) {
+ if (tr->trace_flags & TRACE_ITER(CONTEXT_INFO)) {
/* Absolute time */
if (flags & TRACE_GRAPH_PRINT_ABS_TIME)
print_graph_abs_time(iter->ts, s);
@@ -723,7 +758,7 @@ print_graph_irq(struct trace_iterator *iter, unsigned long addr,
}
/* Latency format */
- if (tr->trace_flags & TRACE_ITER_LATENCY_FMT)
+ if (tr->trace_flags & TRACE_ITER(LATENCY_FMT))
print_graph_lat_fmt(s, ent);
}
@@ -777,7 +812,7 @@ print_graph_duration(struct trace_array *tr, unsigned long long duration,
struct trace_seq *s, u32 flags)
{
if (!(flags & TRACE_GRAPH_PRINT_DURATION) ||
- !(tr->trace_flags & TRACE_ITER_CONTEXT_INFO))
+ !(tr->trace_flags & TRACE_ITER(CONTEXT_INFO)))
return;
/* No real data, just filling the column with spaces */
@@ -818,7 +853,7 @@ static void print_graph_retaddr(struct trace_seq *s, struct fgraph_retaddr_ent_e
trace_seq_puts(s, " /*");
trace_seq_puts(s, " <-");
- seq_print_ip_sym(s, entry->graph_ent.retaddr, trace_flags | TRACE_ITER_SYM_OFFSET);
+ seq_print_ip_sym_offset(s, entry->graph_rent.retaddr, trace_flags);
if (comment)
trace_seq_puts(s, " */");
@@ -964,7 +999,7 @@ print_graph_entry_leaf(struct trace_iterator *iter,
trace_seq_printf(s, "%ps", (void *)ret_func);
if (args_size >= FTRACE_REGS_MAX_ARGS * sizeof(long)) {
- print_function_args(s, entry->args, ret_func);
+ print_function_args(s, FGRAPH_ENTRY_ARGS(entry), ret_func);
trace_seq_putc(s, ';');
} else
trace_seq_puts(s, "();");
@@ -1016,7 +1051,7 @@ print_graph_entry_nested(struct trace_iterator *iter,
args_size = iter->ent_size - offsetof(struct ftrace_graph_ent_entry, args);
if (args_size >= FTRACE_REGS_MAX_ARGS * sizeof(long))
- print_function_args(s, entry->args, func);
+ print_function_args(s, FGRAPH_ENTRY_ARGS(entry), func);
else
trace_seq_puts(s, "()");
@@ -1054,7 +1089,7 @@ print_graph_prologue(struct trace_iterator *iter, struct trace_seq *s,
/* Interrupt */
print_graph_irq(iter, addr, type, cpu, ent->pid, flags);
- if (!(tr->trace_flags & TRACE_ITER_CONTEXT_INFO))
+ if (!(tr->trace_flags & TRACE_ITER(CONTEXT_INFO)))
return;
/* Absolute time */
@@ -1076,7 +1111,7 @@ print_graph_prologue(struct trace_iterator *iter, struct trace_seq *s,
}
/* Latency format */
- if (tr->trace_flags & TRACE_ITER_LATENCY_FMT)
+ if (tr->trace_flags & TRACE_ITER(LATENCY_FMT))
print_graph_lat_fmt(s, ent);
return;
@@ -1198,11 +1233,14 @@ print_graph_entry(struct ftrace_graph_ent_entry *field, struct trace_seq *s,
/*
* print_graph_entry() may consume the current event,
* thus @field may become invalid, so we need to save it.
- * sizeof(struct ftrace_graph_ent_entry) is very small,
- * it can be safely saved at the stack.
+ * This function is shared by ftrace_graph_ent_entry and
+ * fgraph_retaddr_ent_entry. The latter is slightly larger, but
+ * still small enough to be safely saved on the stack.
*/
struct ftrace_graph_ent_entry *entry;
- u8 save_buf[sizeof(*entry) + FTRACE_REGS_MAX_ARGS * sizeof(long)];
+ struct fgraph_retaddr_ent_entry *rentry;
+ u8 save_buf[sizeof(*rentry) + FTRACE_REGS_MAX_ARGS * sizeof(long)];
/* The ent_size is expected to be as big as the entry */
if (iter->ent_size > sizeof(save_buf))
@@ -1431,12 +1469,17 @@ print_graph_function_flags(struct trace_iterator *iter, u32 flags)
}
#ifdef CONFIG_FUNCTION_GRAPH_RETADDR
case TRACE_GRAPH_RETADDR_ENT: {
- struct fgraph_retaddr_ent_entry saved;
+ /*
+ * ftrace_graph_ent_entry and fgraph_retaddr_ent_entry have
+ * similar functions and memory layouts. The only difference
+ * is that the latter one has an extra retaddr member, so
+ * they can share most of the logic.
+ */
struct fgraph_retaddr_ent_entry *rfield;
trace_assign_type(rfield, entry);
- saved = *rfield;
- return print_graph_entry((struct ftrace_graph_ent_entry *)&saved, s, iter, flags);
+ return print_graph_entry((struct ftrace_graph_ent_entry *)rfield,
+ s, iter, flags);
}
#endif
case TRACE_GRAPH_RET: {
@@ -1459,7 +1502,8 @@ print_graph_function_flags(struct trace_iterator *iter, u32 flags)
static enum print_line_t
print_graph_function(struct trace_iterator *iter)
{
- return print_graph_function_flags(iter, tracer_flags.val);
+ struct trace_array *tr = iter->tr;
+ return print_graph_function_flags(iter, tr->current_trace_flags->val);
}
static enum print_line_t
@@ -1495,7 +1539,7 @@ static void print_lat_header(struct seq_file *s, u32 flags)
static void __print_graph_headers_flags(struct trace_array *tr,
struct seq_file *s, u32 flags)
{
- int lat = tr->trace_flags & TRACE_ITER_LATENCY_FMT;
+ int lat = tr->trace_flags & TRACE_ITER(LATENCY_FMT);
if (lat)
print_lat_header(s, flags);
@@ -1535,7 +1579,10 @@ static void __print_graph_headers_flags(struct trace_array *tr,
static void print_graph_headers(struct seq_file *s)
{
- print_graph_headers_flags(s, tracer_flags.val);
+ struct trace_iterator *iter = s->private;
+ struct trace_array *tr = iter->tr;
+
+ print_graph_headers_flags(s, tr->current_trace_flags->val);
}
void print_graph_headers_flags(struct seq_file *s, u32 flags)
@@ -1543,10 +1590,10 @@ void print_graph_headers_flags(struct seq_file *s, u32 flags)
struct trace_iterator *iter = s->private;
struct trace_array *tr = iter->tr;
- if (!(tr->trace_flags & TRACE_ITER_CONTEXT_INFO))
+ if (!(tr->trace_flags & TRACE_ITER(CONTEXT_INFO)))
return;
- if (tr->trace_flags & TRACE_ITER_LATENCY_FMT) {
+ if (tr->trace_flags & TRACE_ITER(LATENCY_FMT)) {
/* print nothing if the buffers are empty */
if (trace_empty(iter))
return;
@@ -1613,17 +1660,56 @@ void graph_trace_close(struct trace_iterator *iter)
static int
func_graph_set_flag(struct trace_array *tr, u32 old_flags, u32 bit, int set)
{
- if (bit == TRACE_GRAPH_PRINT_IRQS)
- ftrace_graph_skip_irqs = !set;
+/*
+ * The function profiler gets updated even if function graph
+ * isn't the current tracer. Handle it separately.
+ */
+#ifdef CONFIG_FUNCTION_PROFILER
+ if (bit == TRACE_GRAPH_SLEEP_TIME && (tr->flags & TRACE_ARRAY_FL_GLOBAL) &&
+ !!set == fprofile_no_sleep_time) {
+ if (set) {
+ fgraph_no_sleep_time--;
+ if (WARN_ON_ONCE(fgraph_no_sleep_time < 0))
+ fgraph_no_sleep_time = 0;
+ fprofile_no_sleep_time = false;
+ } else {
+ fgraph_no_sleep_time++;
+ fprofile_no_sleep_time = true;
+ }
+ }
+#endif
+
+ /* Do nothing if the current tracer is not this tracer */
+ if (tr->current_trace != &graph_trace)
+ return 0;
- if (bit == TRACE_GRAPH_SLEEP_TIME)
- ftrace_graph_sleep_time_control(set);
+ /* Do nothing if already set. */
+ if (!!set == !!(tr->current_trace_flags->val & bit))
+ return 0;
+
+ switch (bit) {
+ case TRACE_GRAPH_SLEEP_TIME:
+ if (set) {
+ fgraph_no_sleep_time--;
+ if (WARN_ON_ONCE(fgraph_no_sleep_time < 0))
+ fgraph_no_sleep_time = 0;
+ } else {
+ fgraph_no_sleep_time++;
+ }
+ break;
- if (bit == TRACE_GRAPH_GRAPH_TIME)
- ftrace_graph_graph_time_control(set);
+ case TRACE_GRAPH_PRINT_IRQS:
+ if (set)
+ ftrace_graph_skip_irqs--;
+ else
+ ftrace_graph_skip_irqs++;
+ if (WARN_ON_ONCE(ftrace_graph_skip_irqs < 0))
+ ftrace_graph_skip_irqs = 0;
+ break;
- if (bit == TRACE_GRAPH_ARGS)
+ case TRACE_GRAPH_ARGS:
return ftrace_graph_trace_args(tr, set);
+ }
return 0;
}
@@ -1660,7 +1746,7 @@ static struct tracer graph_trace __tracer_data = {
.reset = graph_trace_reset,
.print_line = print_graph_function,
.print_header = print_graph_headers,
- .flags = &tracer_flags,
+ .default_flags = &tracer_flags,
.set_flag = func_graph_set_flag,
.allow_instances = true,
#ifdef CONFIG_FTRACE_SELFTEST
diff --git a/kernel/trace/trace_irqsoff.c b/kernel/trace/trace_irqsoff.c
index 4c45c49b06c8..17673905907c 100644
--- a/kernel/trace/trace_irqsoff.c
+++ b/kernel/trace/trace_irqsoff.c
@@ -63,7 +63,7 @@ irq_trace(void)
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
static int irqsoff_display_graph(struct trace_array *tr, int set);
-# define is_graph(tr) ((tr)->trace_flags & TRACE_ITER_DISPLAY_GRAPH)
+# define is_graph(tr) ((tr)->trace_flags & TRACE_ITER(DISPLAY_GRAPH))
#else
static inline int irqsoff_display_graph(struct trace_array *tr, int set)
{
@@ -485,8 +485,8 @@ static int register_irqsoff_function(struct trace_array *tr, int graph, int set)
{
int ret;
- /* 'set' is set if TRACE_ITER_FUNCTION is about to be set */
- if (function_enabled || (!set && !(tr->trace_flags & TRACE_ITER_FUNCTION)))
+ /* 'set' is set if TRACE_ITER(FUNCTION) is about to be set */
+ if (function_enabled || (!set && !(tr->trace_flags & TRACE_ITER(FUNCTION))))
return 0;
if (graph)
@@ -515,7 +515,7 @@ static void unregister_irqsoff_function(struct trace_array *tr, int graph)
static int irqsoff_function_set(struct trace_array *tr, u32 mask, int set)
{
- if (!(mask & TRACE_ITER_FUNCTION))
+ if (!(mask & TRACE_ITER(FUNCTION)))
return 0;
if (set)
@@ -536,7 +536,7 @@ static inline int irqsoff_function_set(struct trace_array *tr, u32 mask, int set
}
#endif /* CONFIG_FUNCTION_TRACER */
-static int irqsoff_flag_changed(struct trace_array *tr, u32 mask, int set)
+static int irqsoff_flag_changed(struct trace_array *tr, u64 mask, int set)
{
struct tracer *tracer = tr->current_trace;
@@ -544,7 +544,7 @@ static int irqsoff_flag_changed(struct trace_array *tr, u32 mask, int set)
return 0;
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
- if (mask & TRACE_ITER_DISPLAY_GRAPH)
+ if (mask & TRACE_ITER(DISPLAY_GRAPH))
return irqsoff_display_graph(tr, set);
#endif
@@ -582,10 +582,10 @@ static int __irqsoff_tracer_init(struct trace_array *tr)
save_flags = tr->trace_flags;
/* non overwrite screws up the latency tracers */
- set_tracer_flag(tr, TRACE_ITER_OVERWRITE, 1);
- set_tracer_flag(tr, TRACE_ITER_LATENCY_FMT, 1);
+ set_tracer_flag(tr, TRACE_ITER(OVERWRITE), 1);
+ set_tracer_flag(tr, TRACE_ITER(LATENCY_FMT), 1);
/* without pause, we will produce garbage if another latency occurs */
- set_tracer_flag(tr, TRACE_ITER_PAUSE_ON_TRACE, 1);
+ set_tracer_flag(tr, TRACE_ITER(PAUSE_ON_TRACE), 1);
tr->max_latency = 0;
irqsoff_trace = tr;
@@ -605,15 +605,15 @@ static int __irqsoff_tracer_init(struct trace_array *tr)
static void __irqsoff_tracer_reset(struct trace_array *tr)
{
- int lat_flag = save_flags & TRACE_ITER_LATENCY_FMT;
- int overwrite_flag = save_flags & TRACE_ITER_OVERWRITE;
- int pause_flag = save_flags & TRACE_ITER_PAUSE_ON_TRACE;
+ int lat_flag = save_flags & TRACE_ITER(LATENCY_FMT);
+ int overwrite_flag = save_flags & TRACE_ITER(OVERWRITE);
+ int pause_flag = save_flags & TRACE_ITER(PAUSE_ON_TRACE);
stop_irqsoff_tracer(tr, is_graph(tr));
- set_tracer_flag(tr, TRACE_ITER_LATENCY_FMT, lat_flag);
- set_tracer_flag(tr, TRACE_ITER_OVERWRITE, overwrite_flag);
- set_tracer_flag(tr, TRACE_ITER_PAUSE_ON_TRACE, pause_flag);
+ set_tracer_flag(tr, TRACE_ITER(LATENCY_FMT), lat_flag);
+ set_tracer_flag(tr, TRACE_ITER(OVERWRITE), overwrite_flag);
+ set_tracer_flag(tr, TRACE_ITER(PAUSE_ON_TRACE), pause_flag);
ftrace_reset_array_ops(tr);
irqsoff_busy = false;
diff --git a/kernel/trace/trace_kdb.c b/kernel/trace/trace_kdb.c
index 896ff78b8349..b30795f34079 100644
--- a/kernel/trace/trace_kdb.c
+++ b/kernel/trace/trace_kdb.c
@@ -31,7 +31,7 @@ static void ftrace_dump_buf(int skip_entries, long cpu_file)
old_userobj = tr->trace_flags;
/* don't look at user memory in panic mode */
- tr->trace_flags &= ~TRACE_ITER_SYM_USEROBJ;
+ tr->trace_flags &= ~TRACE_ITER(SYM_USEROBJ);
kdb_printf("Dumping ftrace buffer:\n");
if (skip_entries)
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index ee8171b19bee..9953506370a5 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1584,7 +1584,7 @@ print_kprobe_event(struct trace_iterator *iter, int flags,
trace_seq_printf(s, "%s: (", trace_probe_name(tp));
- if (!seq_print_ip_sym(s, field->ip, flags | TRACE_ITER_SYM_OFFSET))
+ if (!seq_print_ip_sym_offset(s, field->ip, flags))
goto out;
trace_seq_putc(s, ')');
@@ -1614,12 +1614,12 @@ print_kretprobe_event(struct trace_iterator *iter, int flags,
trace_seq_printf(s, "%s: (", trace_probe_name(tp));
- if (!seq_print_ip_sym(s, field->ret_ip, flags | TRACE_ITER_SYM_OFFSET))
+ if (!seq_print_ip_sym_offset(s, field->ret_ip, flags))
goto out;
trace_seq_puts(s, " <- ");
- if (!seq_print_ip_sym(s, field->func, flags & ~TRACE_ITER_SYM_OFFSET))
+ if (!seq_print_ip_sym_no_offset(s, field->func, flags))
goto out;
trace_seq_putc(s, ')');
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index 14f86f0a8bc7..cc2d3306bb60 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -420,7 +420,7 @@ static int seq_print_user_ip(struct trace_seq *s, struct mm_struct *mm,
}
mmap_read_unlock(mm);
}
- if (ret && ((sym_flags & TRACE_ITER_SYM_ADDR) || !file))
+ if (ret && ((sym_flags & TRACE_ITER(SYM_ADDR)) || !file))
trace_seq_printf(s, " <" IP_FMT ">", ip);
return !trace_seq_has_overflowed(s);
}
@@ -433,9 +433,9 @@ seq_print_ip_sym(struct trace_seq *s, unsigned long ip, unsigned long sym_flags)
goto out;
}
- trace_seq_print_sym(s, ip, sym_flags & TRACE_ITER_SYM_OFFSET);
+ trace_seq_print_sym(s, ip, sym_flags & TRACE_ITER(SYM_OFFSET));
- if (sym_flags & TRACE_ITER_SYM_ADDR)
+ if (sym_flags & TRACE_ITER(SYM_ADDR))
trace_seq_printf(s, " <" IP_FMT ">", ip);
out:
@@ -569,7 +569,7 @@ static int
lat_print_timestamp(struct trace_iterator *iter, u64 next_ts)
{
struct trace_array *tr = iter->tr;
- unsigned long verbose = tr->trace_flags & TRACE_ITER_VERBOSE;
+ unsigned long verbose = tr->trace_flags & TRACE_ITER(VERBOSE);
unsigned long in_ns = iter->iter_flags & TRACE_FILE_TIME_IN_NS;
unsigned long long abs_ts = iter->ts - iter->array_buffer->time_start;
unsigned long long rel_ts = next_ts - iter->ts;
@@ -636,7 +636,7 @@ int trace_print_context(struct trace_iterator *iter)
trace_seq_printf(s, "%16s-%-7d ", comm, entry->pid);
- if (tr->trace_flags & TRACE_ITER_RECORD_TGID) {
+ if (tr->trace_flags & TRACE_ITER(RECORD_TGID)) {
unsigned int tgid = trace_find_tgid(entry->pid);
if (!tgid)
@@ -647,7 +647,7 @@ int trace_print_context(struct trace_iterator *iter)
trace_seq_printf(s, "[%03d] ", iter->cpu);
- if (tr->trace_flags & TRACE_ITER_IRQ_INFO)
+ if (tr->trace_flags & TRACE_ITER(IRQ_INFO))
trace_print_lat_fmt(s, entry);
trace_print_time(s, iter, iter->ts);
@@ -661,7 +661,7 @@ int trace_print_lat_context(struct trace_iterator *iter)
struct trace_entry *entry, *next_entry;
struct trace_array *tr = iter->tr;
struct trace_seq *s = &iter->seq;
- unsigned long verbose = (tr->trace_flags & TRACE_ITER_VERBOSE);
+ unsigned long verbose = (tr->trace_flags & TRACE_ITER(VERBOSE));
u64 next_ts;
next_entry = trace_find_next_entry(iter, NULL, &next_ts);
@@ -950,7 +950,9 @@ static void print_fields(struct trace_iterator *iter, struct trace_event_call *c
int offset;
int len;
int ret;
+ int i;
void *pos;
+ char *str;
list_for_each_entry_reverse(field, head, link) {
trace_seq_printf(&iter->seq, " %s=", field->name);
@@ -977,8 +979,29 @@ static void print_fields(struct trace_iterator *iter, struct trace_event_call *c
trace_seq_puts(&iter->seq, "<OVERFLOW>");
break;
}
- pos = (void *)iter->ent + offset;
- trace_seq_printf(&iter->seq, "%.*s", len, (char *)pos);
+ str = (char *)iter->ent + offset;
+ /* Check if the string contains any non-printable characters */
+ for (i = 0; i < len; i++) {
+ if (str[i] && !(isascii(str[i]) && isprint(str[i])))
+ break;
+ }
+ if (i < len) {
+ for (i = 0; i < len; i++) {
+ if (isascii(str[i]) && isprint(str[i]))
+ trace_seq_putc(&iter->seq, str[i]);
+ else
+ trace_seq_putc(&iter->seq, '.');
+ }
+ trace_seq_puts(&iter->seq, " (");
+ for (i = 0; i < len; i++) {
+ if (i)
+ trace_seq_putc(&iter->seq, ':');
+ trace_seq_printf(&iter->seq, "%02x", str[i]);
+ }
+ trace_seq_putc(&iter->seq, ')');
+ } else {
+ trace_seq_printf(&iter->seq, "%.*s", len, str);
+ }
break;
case FILTER_PTR_STRING:
if (!iter->fmt_size)
@@ -1127,7 +1150,7 @@ static void print_fn_trace(struct trace_seq *s, unsigned long ip,
if (args)
print_function_args(s, args, ip);
- if ((flags & TRACE_ITER_PRINT_PARENT) && parent_ip) {
+ if ((flags & TRACE_ITER(PRINT_PARENT)) && parent_ip) {
trace_seq_puts(s, " <-");
seq_print_ip_sym(s, parent_ip, flags);
}
@@ -1417,7 +1440,7 @@ static enum print_line_t trace_user_stack_print(struct trace_iterator *iter,
trace_seq_puts(s, "<user stack trace>\n");
- if (tr->trace_flags & TRACE_ITER_SYM_USEROBJ) {
+ if (tr->trace_flags & TRACE_ITER(SYM_USEROBJ)) {
struct task_struct *task;
/*
* we do the lookup on the thread group leader,
diff --git a/kernel/trace/trace_output.h b/kernel/trace/trace_output.h
index 2e305364f2a9..99b676733d46 100644
--- a/kernel/trace/trace_output.h
+++ b/kernel/trace/trace_output.h
@@ -16,6 +16,17 @@ extern int
seq_print_ip_sym(struct trace_seq *s, unsigned long ip,
unsigned long sym_flags);
+static inline int seq_print_ip_sym_offset(struct trace_seq *s, unsigned long ip,
+ unsigned long sym_flags)
+{
+ return seq_print_ip_sym(s, ip, sym_flags | TRACE_ITER(SYM_OFFSET));
+}
+static inline int seq_print_ip_sym_no_offset(struct trace_seq *s, unsigned long ip,
+ unsigned long sym_flags)
+{
+ return seq_print_ip_sym(s, ip, sym_flags & ~TRACE_ITER(SYM_OFFSET));
+}
+
extern void trace_seq_print_sym(struct trace_seq *s, unsigned long address, bool offset);
extern int trace_print_context(struct trace_iterator *iter);
extern int trace_print_lat_context(struct trace_iterator *iter);
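The seq_print_ip_sym_offset()/seq_print_ip_sym_no_offset() wrappers above, and the many TRACE_ITER_FOO -> TRACE_ITER(FOO) conversions throughout this series, follow from widening the option mask to 64 bits; TRACE_ITER() is presumably defined in trace.h roughly as:

	/* Assumed shape of the option-mask macro, not copied from trace.h */
	#define TRACE_ITER(opt)	(1ULL << TRACE_ITER_##opt##_BIT)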
diff --git a/kernel/trace/trace_sched_wakeup.c b/kernel/trace/trace_sched_wakeup.c
index e3f2e4f56faa..8faa73d3bba1 100644
--- a/kernel/trace/trace_sched_wakeup.c
+++ b/kernel/trace/trace_sched_wakeup.c
@@ -41,7 +41,7 @@ static void stop_func_tracer(struct trace_array *tr, int graph);
static int save_flags;
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
-# define is_graph(tr) ((tr)->trace_flags & TRACE_ITER_DISPLAY_GRAPH)
+# define is_graph(tr) ((tr)->trace_flags & TRACE_ITER(DISPLAY_GRAPH))
#else
# define is_graph(tr) false
#endif
@@ -247,8 +247,8 @@ static int register_wakeup_function(struct trace_array *tr, int graph, int set)
{
int ret;
- /* 'set' is set if TRACE_ITER_FUNCTION is about to be set */
- if (function_enabled || (!set && !(tr->trace_flags & TRACE_ITER_FUNCTION)))
+ /* 'set' is set if TRACE_ITER(FUNCTION) is about to be set */
+ if (function_enabled || (!set && !(tr->trace_flags & TRACE_ITER(FUNCTION))))
return 0;
if (graph)
@@ -277,7 +277,7 @@ static void unregister_wakeup_function(struct trace_array *tr, int graph)
static int wakeup_function_set(struct trace_array *tr, u32 mask, int set)
{
- if (!(mask & TRACE_ITER_FUNCTION))
+ if (!(mask & TRACE_ITER(FUNCTION)))
return 0;
if (set)
@@ -324,7 +324,7 @@ __trace_function(struct trace_array *tr,
trace_function(tr, ip, parent_ip, trace_ctx, NULL);
}
-static int wakeup_flag_changed(struct trace_array *tr, u32 mask, int set)
+static int wakeup_flag_changed(struct trace_array *tr, u64 mask, int set)
{
struct tracer *tracer = tr->current_trace;
@@ -332,7 +332,7 @@ static int wakeup_flag_changed(struct trace_array *tr, u32 mask, int set)
return 0;
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
- if (mask & TRACE_ITER_DISPLAY_GRAPH)
+ if (mask & TRACE_ITER(DISPLAY_GRAPH))
return wakeup_display_graph(tr, set);
#endif
@@ -681,8 +681,8 @@ static int __wakeup_tracer_init(struct trace_array *tr)
save_flags = tr->trace_flags;
/* non overwrite screws up the latency tracers */
- set_tracer_flag(tr, TRACE_ITER_OVERWRITE, 1);
- set_tracer_flag(tr, TRACE_ITER_LATENCY_FMT, 1);
+ set_tracer_flag(tr, TRACE_ITER(OVERWRITE), 1);
+ set_tracer_flag(tr, TRACE_ITER(LATENCY_FMT), 1);
tr->max_latency = 0;
wakeup_trace = tr;
@@ -725,15 +725,15 @@ static int wakeup_dl_tracer_init(struct trace_array *tr)
static void wakeup_tracer_reset(struct trace_array *tr)
{
- int lat_flag = save_flags & TRACE_ITER_LATENCY_FMT;
- int overwrite_flag = save_flags & TRACE_ITER_OVERWRITE;
+ int lat_flag = save_flags & TRACE_ITER(LATENCY_FMT);
+ int overwrite_flag = save_flags & TRACE_ITER(OVERWRITE);
stop_wakeup_tracer(tr);
/* make sure we put back any tasks we are tracing */
wakeup_reset(tr);
- set_tracer_flag(tr, TRACE_ITER_LATENCY_FMT, lat_flag);
- set_tracer_flag(tr, TRACE_ITER_OVERWRITE, overwrite_flag);
+ set_tracer_flag(tr, TRACE_ITER(LATENCY_FMT), lat_flag);
+ set_tracer_flag(tr, TRACE_ITER(OVERWRITE), overwrite_flag);
ftrace_reset_array_ops(tr);
wakeup_busy = false;
}
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 0f932b22f9ec..e96d0063cbcf 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -1,6 +1,7 @@
// SPDX-License-Identifier: GPL-2.0
#include <trace/syscall.h>
#include <trace/events/syscalls.h>
+#include <linux/kernel_stat.h>
#include <linux/syscalls.h>
#include <linux/slab.h>
#include <linux/kernel.h>
@@ -123,6 +124,119 @@ const char *get_syscall_name(int syscall)
return entry->name;
}
+/* Added to user strings or arrays when max limit is reached */
+#define EXTRA "..."
+
+static void get_dynamic_len_ptr(struct syscall_trace_enter *trace,
+ struct syscall_metadata *entry,
+ int *offset_p, int *len_p, unsigned char **ptr_p)
+{
+ unsigned char *ptr;
+ int offset = *offset_p;
+ int val;
+
+ /* This arg points to a user space string */
+ ptr = (void *)trace->args + sizeof(long) * entry->nb_args + offset;
+ val = *(int *)ptr;
+
+ /* The value is a dynamic string (len << 16 | offset) */
+ ptr = (void *)trace + (val & 0xffff);
+ *len_p = val >> 16;
+ offset += 4;
+
+ *ptr_p = ptr;
+ *offset_p = offset;
+}
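get_dynamic_len_ptr() is the reader-side decoder for the meta words that syscall_put_data() (added further down) stores after the static arguments: each word packs the payload location as offset | (len << 16), relative to the start of the event. Decoding the first one by hand looks like this (illustrative):

	int val    = *(int *)((void *)trace->args + sizeof(long) * entry->nb_args);
	void *data = (void *)trace + (val & 0xffff);	/* payload offset */
	int len    = val >> 16;				/* payload length */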
+
+static enum print_line_t
+sys_enter_openat_print(struct syscall_trace_enter *trace, struct syscall_metadata *entry,
+ struct trace_seq *s, struct trace_event *event)
+{
+ unsigned char *ptr;
+ int offset = 0;
+ int bits, len;
+ bool done = false;
+ static const struct trace_print_flags __flags[] =
+ {
+ { O_TMPFILE, "O_TMPFILE" },
+ { O_WRONLY, "O_WRONLY" },
+ { O_RDWR, "O_RDWR" },
+ { O_CREAT, "O_CREAT" },
+ { O_EXCL, "O_EXCL" },
+ { O_NOCTTY, "O_NOCTTY" },
+ { O_TRUNC, "O_TRUNC" },
+ { O_APPEND, "O_APPEND" },
+ { O_NONBLOCK, "O_NONBLOCK" },
+ { O_DSYNC, "O_DSYNC" },
+ { O_DIRECT, "O_DIRECT" },
+ { O_LARGEFILE, "O_LARGEFILE" },
+ { O_DIRECTORY, "O_DIRECTORY" },
+ { O_NOFOLLOW, "O_NOFOLLOW" },
+ { O_NOATIME, "O_NOATIME" },
+ { O_CLOEXEC, "O_CLOEXEC" },
+ { -1, NULL }
+ };
+
+ trace_seq_printf(s, "%s(", entry->name);
+
+ for (int i = 0; !done && i < entry->nb_args; i++) {
+
+ if (trace_seq_has_overflowed(s))
+ goto end;
+
+ if (i)
+ trace_seq_puts(s, ", ");
+
+ switch (i) {
+ case 2:
+ bits = trace->args[2];
+
+ trace_seq_puts(s, "flags: ");
+
+ /* No need to show mode when not creating the file */
+ if (!(bits & (O_CREAT|O_TMPFILE)))
+ done = true;
+
+ if (!(bits & O_ACCMODE)) {
+ if (!bits) {
+ trace_seq_puts(s, "O_RDONLY");
+ continue;
+ }
+ trace_seq_puts(s, "O_RDONLY|");
+ }
+
+ trace_print_flags_seq(s, "|", bits, __flags);
+ /*
+ * trace_print_flags_seq() adds a '\0' to the
+ * buffer, but this needs to append more to the seq.
+ */
+ if (!trace_seq_has_overflowed(s))
+ trace_seq_pop(s);
+
+ continue;
+ case 3:
+ trace_seq_printf(s, "%s: 0%03o", entry->args[i],
+ (unsigned int)trace->args[i]);
+ continue;
+ }
+
+ trace_seq_printf(s, "%s: %lu", entry->args[i],
+ trace->args[i]);
+
+ if (!(BIT(i) & entry->user_mask))
+ continue;
+
+ get_dynamic_len_ptr(trace, entry, &offset, &len, &ptr);
+ trace_seq_printf(s, " \"%.*s\"", len, ptr);
+ }
+
+ trace_seq_putc(s, ')');
+end:
+ trace_seq_putc(s, '\n');
+
+ return trace_handle_return(s);
+}
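For reference, the openat-specific printer above yields lines roughly of this form (illustrative values; the mode column only appears when O_CREAT or O_TMPFILE is set):

	sys_openat(dfd: 3, filename: 140187714747184 "/etc/ld.so.cache", flags: O_RDONLY|O_CLOEXEC)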
+
static enum print_line_t
print_syscall_enter(struct trace_iterator *iter, int flags,
struct trace_event *event)
@@ -132,7 +246,9 @@ print_syscall_enter(struct trace_iterator *iter, int flags,
struct trace_entry *ent = iter->ent;
struct syscall_trace_enter *trace;
struct syscall_metadata *entry;
- int i, syscall;
+ int i, syscall, val, len;
+ unsigned char *ptr;
+ int offset = 0;
trace = (typeof(trace))ent;
syscall = trace->nr;
@@ -146,9 +262,20 @@ print_syscall_enter(struct trace_iterator *iter, int flags,
goto end;
}
+ switch (entry->syscall_nr) {
+ case __NR_openat:
+ if (!tr || !(tr->trace_flags & TRACE_ITER(VERBOSE)))
+ return sys_enter_openat_print(trace, entry, s, event);
+ break;
+ default:
+ break;
+ }
+
trace_seq_printf(s, "%s(", entry->name);
for (i = 0; i < entry->nb_args; i++) {
+ bool printable = false;
+ char *str;
if (trace_seq_has_overflowed(s))
goto end;
@@ -157,7 +284,7 @@ print_syscall_enter(struct trace_iterator *iter, int flags,
trace_seq_puts(s, ", ");
/* parameter types */
- if (tr && tr->trace_flags & TRACE_ITER_VERBOSE)
+ if (tr && tr->trace_flags & TRACE_ITER(VERBOSE))
trace_seq_printf(s, "%s ", entry->types[i]);
/* parameter values */
@@ -167,6 +294,48 @@ print_syscall_enter(struct trace_iterator *iter, int flags,
else
trace_seq_printf(s, "%s: 0x%lx", entry->args[i],
trace->args[i]);
+
+ if (!(BIT(i) & entry->user_mask))
+ continue;
+
+ get_dynamic_len_ptr(trace, entry, &offset, &len, &ptr);
+
+ if (entry->user_arg_size < 0 || entry->user_arg_is_str) {
+ trace_seq_printf(s, " \"%.*s\"", len, ptr);
+ continue;
+ }
+
+ val = trace->args[entry->user_arg_size];
+
+ str = ptr;
+ trace_seq_puts(s, " (");
+ for (int x = 0; x < len; x++, ptr++) {
+ if (isascii(*ptr) && isprint(*ptr))
+ printable = true;
+ if (x)
+ trace_seq_putc(s, ':');
+ trace_seq_printf(s, "%02x", *ptr);
+ }
+ if (len < val)
+ trace_seq_printf(s, ", %s", EXTRA);
+
+ trace_seq_putc(s, ')');
+
+ /* If nothing is printable, don't bother printing anything */
+ if (!printable)
+ continue;
+
+ trace_seq_puts(s, " \"");
+ for (int x = 0; x < len; x++) {
+ if (isascii(str[x]) && isprint(str[x]))
+ trace_seq_putc(s, str[x]);
+ else
+ trace_seq_putc(s, '.');
+ }
+ if (len < val)
+ trace_seq_printf(s, "\"%s", EXTRA);
+ else
+ trace_seq_putc(s, '"');
}
trace_seq_putc(s, ')');
@@ -212,26 +381,107 @@ print_syscall_exit(struct trace_iterator *iter, int flags,
.size = sizeof(_type), .align = __alignof__(_type), \
.is_signed = is_signed_type(_type), .filter_type = FILTER_OTHER }
+/* When len=0, we just calculate the needed length */
+#define LEN_OR_ZERO (len ? len - pos : 0)
+
+static int __init
+sys_enter_openat_print_fmt(struct syscall_metadata *entry, char *buf, int len)
+{
+ int pos = 0;
+
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "\"dfd: 0x%%08lx, filename: 0x%%08lx \\\"%%s\\\", flags: %%s%%s, mode: 0%%03o\",");
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ " ((unsigned long)(REC->dfd)),");
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ " ((unsigned long)(REC->filename)),");
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ " __get_str(__filename_val),");
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ " (REC->flags & ~3) && !(REC->flags & 3) ? \"O_RDONLY|\" : \"\", ");
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ " REC->flags ? __print_flags(REC->flags, \"|\", ");
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{ 0x%x, \"O_WRONLY\" }, ", O_WRONLY);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{ 0x%x, \"O_RDWR\" }, ", O_RDWR);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{ 0x%x, \"O_CREAT\" }, ", O_CREAT);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{ 0x%x, \"O_EXCL\" }, ", O_EXCL);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{ 0x%x, \"O_NOCTTY\" }, ", O_NOCTTY);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{ 0x%x, \"O_TRUNC\" }, ", O_TRUNC);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{ 0x%x, \"O_APPEND\" }, ", O_APPEND);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{ 0x%x, \"O_NONBLOCK\" }, ", O_NONBLOCK);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{ 0x%x, \"O_DSYNC\" }, ", O_DSYNC);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{ 0x%x, \"O_DIRECT\" }, ", O_DIRECT);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{ 0x%x, \"O_LARGEFILE\" }, ", O_LARGEFILE);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{ 0x%x, \"O_DIRECTORY\" }, ", O_DIRECTORY);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{ 0x%x, \"O_NOFOLLOW\" }, ", O_NOFOLLOW);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{ 0x%x, \"O_NOATIME\" }, ", O_NOATIME);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{ 0x%x, \"O_CLOEXEC\" }) : \"O_RDONLY\", ", O_CLOEXEC);
+
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ " ((unsigned long)(REC->mode))");
+ return pos;
+}
+
static int __init
__set_enter_print_fmt(struct syscall_metadata *entry, char *buf, int len)
{
+ bool is_string = entry->user_arg_is_str;
int i;
int pos = 0;
- /* When len=0, we just calculate the needed length */
-#define LEN_OR_ZERO (len ? len - pos : 0)
+ switch (entry->syscall_nr) {
+ case __NR_openat:
+ return sys_enter_openat_print_fmt(entry, buf, len);
+ default:
+ break;
+ }
pos += snprintf(buf + pos, LEN_OR_ZERO, "\"");
for (i = 0; i < entry->nb_args; i++) {
- pos += snprintf(buf + pos, LEN_OR_ZERO, "%s: 0x%%0%zulx%s",
- entry->args[i], sizeof(unsigned long),
- i == entry->nb_args - 1 ? "" : ", ");
+ if (i)
+ pos += snprintf(buf + pos, LEN_OR_ZERO, ", ");
+ pos += snprintf(buf + pos, LEN_OR_ZERO, "%s: 0x%%0%zulx",
+ entry->args[i], sizeof(unsigned long));
+
+ if (!(BIT(i) & entry->user_mask))
+ continue;
+
+ /* Add the format for the user space string or array */
+ if (entry->user_arg_size < 0 || is_string)
+ pos += snprintf(buf + pos, LEN_OR_ZERO, " \\\"%%s\\\"");
+ else
+ pos += snprintf(buf + pos, LEN_OR_ZERO, " (%%s)");
}
pos += snprintf(buf + pos, LEN_OR_ZERO, "\"");
for (i = 0; i < entry->nb_args; i++) {
pos += snprintf(buf + pos, LEN_OR_ZERO,
", ((unsigned long)(REC->%s))", entry->args[i]);
+ if (!(BIT(i) & entry->user_mask))
+ continue;
+ /* The user space data for arg has name __<arg>_val */
+ if (entry->user_arg_size < 0 || is_string) {
+ pos += snprintf(buf + pos, LEN_OR_ZERO, ", __get_str(__%s_val)",
+ entry->args[i]);
+ } else {
+ pos += snprintf(buf + pos, LEN_OR_ZERO, ", __print_dynamic_array(__%s_val, 1)",
+ entry->args[i]);
+ }
}
#undef LEN_OR_ZERO
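As an example of what __set_enter_print_fmt() now generates for a syscall with one user pointer plus a size argument (write: user arg 1, size arg 2, per check_faultable_syscall() further down), the resulting print_fmt would be roughly:

	"fd: 0x%08lx, buf: 0x%08lx (%s), count: 0x%08lx",
	((unsigned long)(REC->fd)), ((unsigned long)(REC->buf)),
	__print_dynamic_array(__buf_val, 1), ((unsigned long)(REC->count))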
@@ -277,8 +527,11 @@ static int __init syscall_enter_define_fields(struct trace_event_call *call)
{
struct syscall_trace_enter trace;
struct syscall_metadata *meta = call->data;
+ unsigned long mask;
+ char *arg;
int offset = offsetof(typeof(trace), args);
int ret = 0;
+ int len;
int i;
for (i = 0; i < meta->nb_args; i++) {
@@ -291,9 +544,320 @@ static int __init syscall_enter_define_fields(struct trace_event_call *call)
offset += sizeof(unsigned long);
}
+ if (ret || !meta->user_mask)
+ return ret;
+
+ mask = meta->user_mask;
+
+ while (mask) {
+ int idx = ffs(mask) - 1;
+ mask &= ~BIT(idx);
+
+ /*
+ * User space data is faulted into a temporary buffer and then
+ * added as a dynamic string or array to the end of the event.
+ * The user space data name for the arg pointer is
+ * "__<arg>_val".
+ */
+ len = strlen(meta->args[idx]) + sizeof("___val");
+ arg = kmalloc(len, GFP_KERNEL);
+ if (WARN_ON_ONCE(!arg)) {
+ meta->user_mask = 0;
+ return -ENOMEM;
+ }
+
+ snprintf(arg, len, "__%s_val", meta->args[idx]);
+
+ ret = trace_define_field(call, "__data_loc char[]",
+ arg, offset, sizeof(int), 0,
+ FILTER_OTHER);
+ if (ret) {
+ kfree(arg);
+ break;
+ }
+ offset += 4;
+ }
return ret;
}
+/*
+ * Create a per CPU temporary buffer to copy user space pointers into.
+ *
+ * SYSCALL_FAULT_USER_MAX is the amount to copy from user space.
+ * (defined in kernel/trace/trace.h)
+ *
+ * SYSCALL_FAULT_ARG_SZ is the amount to copy from user space plus the
+ * nul terminating byte and possibly appended EXTRA (4 bytes).
+ *
+ * SYSCALL_FAULT_BUF_SZ is the size of the per CPU buffer used to copy
+ * memory from user space addresses; it holds up to 3 args, as only
+ * 3 args are allowed to be copied for any one system call.
+ */
+#define SYSCALL_FAULT_ARG_SZ (SYSCALL_FAULT_USER_MAX + 1 + 4)
+#define SYSCALL_FAULT_MAX_CNT 3
+#define SYSCALL_FAULT_BUF_SZ (SYSCALL_FAULT_ARG_SZ * SYSCALL_FAULT_MAX_CNT)
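To make the arithmetic concrete: if SYSCALL_FAULT_USER_MAX were, say, 165 (its actual value comes from kernel/trace/trace.h), then:

	SYSCALL_FAULT_ARG_SZ = 165 + 1 + 4 = 170	/* data + nul + room for "..." */
	SYSCALL_FAULT_BUF_SZ = 170 * 3     = 510	/* three user args per event */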
+
+/* Use the tracing per CPU buffer infrastructure to copy from user space */
+struct syscall_user_buffer {
+ struct trace_user_buf_info buf;
+ struct rcu_head rcu;
+};
+
+static struct syscall_user_buffer *syscall_buffer;
+
+static int syscall_fault_buffer_enable(void)
+{
+ struct syscall_user_buffer *sbuf;
+ int ret;
+
+ lockdep_assert_held(&syscall_trace_lock);
+
+ if (syscall_buffer) {
+ trace_user_fault_get(&syscall_buffer->buf);
+ return 0;
+ }
+
+ sbuf = kmalloc(sizeof(*sbuf), GFP_KERNEL);
+ if (!sbuf)
+ return -ENOMEM;
+
+ ret = trace_user_fault_init(&sbuf->buf, SYSCALL_FAULT_BUF_SZ);
+ if (ret < 0) {
+ kfree(sbuf);
+ return ret;
+ }
+
+ WRITE_ONCE(syscall_buffer, sbuf);
+
+ return 0;
+}
+
+static void rcu_free_syscall_buffer(struct rcu_head *rcu)
+{
+ struct syscall_user_buffer *sbuf =
+ container_of(rcu, struct syscall_user_buffer, rcu);
+
+ trace_user_fault_destroy(&sbuf->buf);
+ kfree(sbuf);
+}
+
+
+static void syscall_fault_buffer_disable(void)
+{
+ struct syscall_user_buffer *sbuf = syscall_buffer;
+
+ lockdep_assert_held(&syscall_trace_lock);
+
+ if (trace_user_fault_put(&sbuf->buf))
+ return;
+
+ WRITE_ONCE(syscall_buffer, NULL);
+ call_rcu_tasks_trace(&sbuf->rcu, rcu_free_syscall_buffer);
+}
+
+struct syscall_args {
+ char *ptr_array[SYSCALL_FAULT_MAX_CNT];
+ int read[SYSCALL_FAULT_MAX_CNT];
+ int uargs;
+};
+
+static int syscall_copy_user(char *buf, const char __user *ptr,
+ size_t size, void *data)
+{
+ struct syscall_args *args = data;
+ int ret;
+
+ for (int i = 0; i < args->uargs; i++, buf += SYSCALL_FAULT_ARG_SZ) {
+ ptr = (char __user *)args->ptr_array[i];
+ ret = strncpy_from_user(buf, ptr, size);
+ args->read[i] = ret;
+ }
+ return 0;
+}
+
+static int syscall_copy_user_array(char *buf, const char __user *ptr,
+ size_t size, void *data)
+{
+ struct syscall_args *args = data;
+ int ret;
+
+ for (int i = 0; i < args->uargs; i++, buf += SYSCALL_FAULT_ARG_SZ) {
+ ptr = (char __user *)args->ptr_array[i];
+ ret = __copy_from_user(buf, ptr, size);
+ args->read[i] = ret ? -1 : size;
+ }
+ return 0;
+}
+
+static char *sys_fault_user(unsigned int buf_size,
+ struct syscall_metadata *sys_data,
+ struct syscall_user_buffer *sbuf,
+ unsigned long *args,
+ unsigned int data_size[SYSCALL_FAULT_MAX_CNT])
+{
+ trace_user_buf_copy syscall_copy = syscall_copy_user;
+ unsigned long mask = sys_data->user_mask;
+ unsigned long size = SYSCALL_FAULT_ARG_SZ - 1;
+ struct syscall_args sargs;
+ bool array = false;
+ char *buffer;
+ char *buf;
+ int ret;
+ int i = 0;
+
+ /* The extra is appended to the user data in the buffer */
+ BUILD_BUG_ON(SYSCALL_FAULT_USER_MAX + sizeof(EXTRA) >=
+ SYSCALL_FAULT_ARG_SZ);
+
+ /*
+ * If this system call event has a size argument, use
+ * it to define how much of user space memory to read,
+ * and read it as an array and not a string.
+ */
+ if (sys_data->user_arg_size >= 0) {
+ array = true;
+ size = args[sys_data->user_arg_size];
+ if (size > SYSCALL_FAULT_ARG_SZ - 1)
+ size = SYSCALL_FAULT_ARG_SZ - 1;
+ syscall_copy = syscall_copy_user_array;
+ }
+
+ while (mask) {
+ int idx = ffs(mask) - 1;
+ mask &= ~BIT(idx);
+
+ if (WARN_ON_ONCE(i == SYSCALL_FAULT_MAX_CNT))
+ break;
+
+ /* Get the pointer to user space memory to read */
+ sargs.ptr_array[i++] = (char *)args[idx];
+ }
+
+ sargs.uargs = i;
+
+ /* Clear the values that are not used */
+ for (; i < SYSCALL_FAULT_MAX_CNT; i++) {
+ data_size[i] = -1; /* Denotes no pointer */
+ }
+
+ /* A zero size means do not even try */
+ if (!buf_size)
+ return NULL;
+
+ buffer = trace_user_fault_read(&sbuf->buf, NULL, size,
+ syscall_copy, &sargs);
+ if (!buffer)
+ return NULL;
+
+ buf = buffer;
+ for (i = 0; i < sargs.uargs; i++, buf += SYSCALL_FAULT_ARG_SZ) {
+
+ ret = sargs.read[i];
+ if (ret < 0)
+ continue;
+ buf[ret] = '\0';
+
+ /* For strings, replace any non-printable characters with '.' */
+ if (!array) {
+ for (int x = 0; x < ret; x++) {
+ if (!isprint(buf[x]))
+ buf[x] = '.';
+ }
+
+ size = min(buf_size, SYSCALL_FAULT_USER_MAX);
+
+ /*
+ * If the text was truncated due to our max limit,
+ * add "..." to the string.
+ */
+ if (ret > size) {
+ strscpy(buf + size, EXTRA, sizeof(EXTRA));
+ ret = size + sizeof(EXTRA);
+ } else {
+ buf[ret++] = '\0';
+ }
+ } else {
+ ret = min((unsigned int)ret, buf_size);
+ }
+ data_size[i] = ret;
+ }
+
+ return buffer;
+}
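The net effect is that a string longer than the instance's buf_size comes back truncated with the EXTRA marker appended, for example (illustrative, buf_size of 8):

	"/usr/lib/x86_64-linux-gnu/libc.so.6"  ->  "/usr/lib..."

Arrays are simply clamped to buf_size bytes with no marker in the copied data; the printer instead compares the stored length against the size argument to decide whether to append EXTRA.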
+
+static int
+syscall_get_data(struct syscall_metadata *sys_data, unsigned long *args,
+ char **buffer, int *size, int *user_sizes, int *uargs,
+ int buf_size)
+{
+ struct syscall_user_buffer *sbuf;
+ int i;
+
+ /* If the syscall_buffer is NULL, tracing is being shutdown */
+ sbuf = READ_ONCE(syscall_buffer);
+ if (!sbuf)
+ return -1;
+
+ *buffer = sys_fault_user(buf_size, sys_data, sbuf, args, user_sizes);
+ /*
+ * Each user_sizes[i] is the amount of data to append.
+ * Add 4 more for the meta field that points to the user
+ * memory at the end of the event and also stores its size.
+ */
+ for (i = 0; i < SYSCALL_FAULT_MAX_CNT; i++) {
+ if (user_sizes[i] < 0)
+ break;
+ *size += user_sizes[i] + 4;
+ }
+ /* Save the number of user read arguments of this syscall */
+ *uargs = i;
+ return 0;
+}
+
+static void syscall_put_data(struct syscall_metadata *sys_data,
+ struct syscall_trace_enter *entry,
+ char *buffer, int size, int *user_sizes, int uargs)
+{
+ char *buf = buffer;
+ void *ptr;
+ int val;
+
+ /*
+ * Set the pointer to point to the meta data of the event
+ * that has information about the stored user space memory.
+ */
+ ptr = (void *)entry->args + sizeof(unsigned long) * sys_data->nb_args;
+
+ /*
+ * The meta data will store the offset of the user data from
+ * the beginning of the event. That is after the static arguments
+ * and the meta data fields.
+ */
+ val = (ptr - (void *)entry) + 4 * uargs;
+
+ for (int i = 0; i < uargs; i++) {
+
+ if (i)
+ val += user_sizes[i - 1];
+
+ /* Store the offset and the size into the meta data */
+ *(int *)ptr = val | (user_sizes[i] << 16);
+
+ /* Skip the meta data */
+ ptr += 4;
+ }
+
+ for (int i = 0; i < uargs; i++, buf += SYSCALL_FAULT_ARG_SZ) {
+ /* Nothing to do if the user space was empty or faulted */
+ if (!user_sizes[i])
+ continue;
+
+ memcpy(ptr, buf, user_sizes[i]);
+ ptr += user_sizes[i];
+ }
+}
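Taken together with syscall_get_data(), the enter event ends up laid out like this (sketch, two user args shown):

	+-------------------------------+
	| nr, args[0..nb_args-1]        |  static syscall arguments
	+-------------------------------+
	| meta[0] = off0 | (len0 << 16) |  one 4-byte word per user arg
	| meta[1] = off1 | (len1 << 16) |
	+-------------------------------+
	| user data 0  (len0 bytes)     |  at off0 from the start of the event
	| user data 1  (len1 bytes)     |  at off1 from the start of the event
	+-------------------------------+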
+
static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
{
struct trace_array *tr = data;
@@ -302,15 +866,18 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
struct syscall_metadata *sys_data;
struct trace_event_buffer fbuffer;
unsigned long args[6];
+ char *user_ptr;
+ int user_sizes[SYSCALL_FAULT_MAX_CNT] = {};
int syscall_nr;
- int size;
+ int size = 0;
+ int uargs = 0;
+ bool mayfault;
/*
* Syscall probe called with preemption enabled, but the ring
* buffer and per-cpu data require preemption to be disabled.
*/
might_fault();
- guard(preempt_notrace)();
syscall_nr = trace_get_syscall_nr(current, regs);
if (syscall_nr < 0 || syscall_nr >= NR_syscalls)
@@ -327,7 +894,20 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
if (!sys_data)
return;
- size = sizeof(*entry) + sizeof(unsigned long) * sys_data->nb_args;
+ /* Check if this syscall event faults in user space memory */
+ mayfault = sys_data->user_mask != 0;
+
+ guard(preempt_notrace)();
+
+ syscall_get_arguments(current, regs, args);
+
+ if (mayfault) {
+ if (syscall_get_data(sys_data, args, &user_ptr,
+ &size, user_sizes, &uargs, tr->syscall_buf_sz) < 0)
+ return;
+ }
+
+ size += sizeof(*entry) + sizeof(unsigned long) * sys_data->nb_args;
entry = trace_event_buffer_reserve(&fbuffer, trace_file, size);
if (!entry)
@@ -335,9 +915,12 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
entry = ring_buffer_event_data(fbuffer.event);
entry->nr = syscall_nr;
- syscall_get_arguments(current, regs, args);
+
memcpy(entry->args, args, sizeof(unsigned long) * sys_data->nb_args);
+ if (mayfault)
+ syscall_put_data(sys_data, entry, user_ptr, size, user_sizes, uargs);
+
trace_event_buffer_commit(&fbuffer);
}
@@ -386,39 +969,50 @@ static void ftrace_syscall_exit(void *data, struct pt_regs *regs, long ret)
static int reg_event_syscall_enter(struct trace_event_file *file,
struct trace_event_call *call)
{
+ struct syscall_metadata *sys_data = call->data;
struct trace_array *tr = file->tr;
int ret = 0;
int num;
- num = ((struct syscall_metadata *)call->data)->syscall_nr;
+ num = sys_data->syscall_nr;
if (WARN_ON_ONCE(num < 0 || num >= NR_syscalls))
return -ENOSYS;
- mutex_lock(&syscall_trace_lock);
- if (!tr->sys_refcount_enter)
+ guard(mutex)(&syscall_trace_lock);
+ if (sys_data->user_mask) {
+ ret = syscall_fault_buffer_enable();
+ if (ret < 0)
+ return ret;
+ }
+ if (!tr->sys_refcount_enter) {
ret = register_trace_sys_enter(ftrace_syscall_enter, tr);
- if (!ret) {
- WRITE_ONCE(tr->enter_syscall_files[num], file);
- tr->sys_refcount_enter++;
+ if (ret < 0) {
+ if (sys_data->user_mask)
+ syscall_fault_buffer_disable();
+ return ret;
+ }
}
- mutex_unlock(&syscall_trace_lock);
- return ret;
+ WRITE_ONCE(tr->enter_syscall_files[num], file);
+ tr->sys_refcount_enter++;
+ return 0;
}
static void unreg_event_syscall_enter(struct trace_event_file *file,
struct trace_event_call *call)
{
+ struct syscall_metadata *sys_data = call->data;
struct trace_array *tr = file->tr;
int num;
- num = ((struct syscall_metadata *)call->data)->syscall_nr;
+ num = sys_data->syscall_nr;
if (WARN_ON_ONCE(num < 0 || num >= NR_syscalls))
return;
- mutex_lock(&syscall_trace_lock);
+ guard(mutex)(&syscall_trace_lock);
tr->sys_refcount_enter--;
WRITE_ONCE(tr->enter_syscall_files[num], NULL);
if (!tr->sys_refcount_enter)
unregister_trace_sys_enter(ftrace_syscall_enter, tr);
- mutex_unlock(&syscall_trace_lock);
+ if (sys_data->user_mask)
+ syscall_fault_buffer_disable();
}
static int reg_event_syscall_exit(struct trace_event_file *file,
@@ -459,6 +1053,215 @@ static void unreg_event_syscall_exit(struct trace_event_file *file,
mutex_unlock(&syscall_trace_lock);
}
+/*
+ * For system calls that reference user space memory that can
+ * be recorded into the event, set the bits in the system call meta data's
+ * user_mask that correspond to the "args" indexes pointing to the user
+ * space memory to retrieve.
+ */
+static void check_faultable_syscall(struct trace_event_call *call, int nr)
+{
+ struct syscall_metadata *sys_data = call->data;
+ unsigned long mask;
+
+ /* Only work on entry */
+ if (sys_data->enter_event != call)
+ return;
+
+ sys_data->user_arg_size = -1;
+
+ switch (nr) {
+ /* user arg 1 with size arg at 2 */
+ case __NR_write:
+#ifdef __NR_mq_timedsend
+ case __NR_mq_timedsend:
+#endif
+ case __NR_pwrite64:
+ sys_data->user_mask = BIT(1);
+ sys_data->user_arg_size = 2;
+ break;
+ /* user arg 0 with size arg at 1 as string */
+ case __NR_setdomainname:
+ case __NR_sethostname:
+ sys_data->user_mask = BIT(0);
+ sys_data->user_arg_size = 1;
+ sys_data->user_arg_is_str = 1;
+ break;
+#ifdef __NR_kexec_file_load
+ /* user arg 4 with size arg at 3 as string */
+ case __NR_kexec_file_load:
+ sys_data->user_mask = BIT(4);
+ sys_data->user_arg_size = 3;
+ sys_data->user_arg_is_str = 1;
+ break;
+#endif
+ /* user arg at position 0 */
+#ifdef __NR_access
+ case __NR_access:
+#endif
+ case __NR_acct:
+ case __NR_chdir:
+#ifdef __NR_chown
+ case __NR_chown:
+#endif
+#ifdef __NR_chmod
+ case __NR_chmod:
+#endif
+ case __NR_chroot:
+#ifdef __NR_creat
+ case __NR_creat:
+#endif
+ case __NR_delete_module:
+ case __NR_execve:
+ case __NR_fsopen:
+#ifdef __NR_lchown
+ case __NR_lchown:
+#endif
+#ifdef __NR_open
+ case __NR_open:
+#endif
+ case __NR_memfd_create:
+#ifdef __NR_mkdir
+ case __NR_mkdir:
+#endif
+#ifdef __NR_mknod
+ case __NR_mknod:
+#endif
+ case __NR_mq_open:
+ case __NR_mq_unlink:
+#ifdef __NR_readlink
+ case __NR_readlink:
+#endif
+#ifdef __NR_rmdir
+ case __NR_rmdir:
+#endif
+ case __NR_shmdt:
+#ifdef __NR_statfs
+ case __NR_statfs:
+#endif
+ case __NR_swapon:
+ case __NR_swapoff:
+#ifdef __NR_truncate
+ case __NR_truncate:
+#endif
+#ifdef __NR_unlink
+ case __NR_unlink:
+#endif
+ case __NR_umount2:
+#ifdef __NR_utime
+ case __NR_utime:
+#endif
+#ifdef __NR_utimes
+ case __NR_utimes:
+#endif
+ sys_data->user_mask = BIT(0);
+ break;
+ /* user arg at position 1 */
+ case __NR_execveat:
+ case __NR_faccessat:
+ case __NR_faccessat2:
+ case __NR_finit_module:
+ case __NR_fchmodat:
+ case __NR_fchmodat2:
+ case __NR_fchownat:
+ case __NR_fgetxattr:
+ case __NR_flistxattr:
+ case __NR_fsetxattr:
+ case __NR_fspick:
+ case __NR_fremovexattr:
+#ifdef __NR_futimesat
+ case __NR_futimesat:
+#endif
+ case __NR_inotify_add_watch:
+ case __NR_mkdirat:
+ case __NR_mknodat:
+ case __NR_mount_setattr:
+ case __NR_name_to_handle_at:
+#ifdef __NR_newfstatat
+ case __NR_newfstatat:
+#endif
+ case __NR_openat:
+ case __NR_openat2:
+ case __NR_open_tree:
+ case __NR_open_tree_attr:
+ case __NR_readlinkat:
+ case __NR_quotactl:
+ case __NR_syslog:
+ case __NR_statx:
+ case __NR_unlinkat:
+#ifdef __NR_utimensat
+ case __NR_utimensat:
+#endif
+ sys_data->user_mask = BIT(1);
+ break;
+ /* user arg at position 2 */
+ case __NR_init_module:
+ case __NR_fsconfig:
+ sys_data->user_mask = BIT(2);
+ break;
+ /* user arg at position 4 */
+ case __NR_fanotify_mark:
+ sys_data->user_mask = BIT(4);
+ break;
+ /* 2 user args, 0 and 1 */
+ case __NR_add_key:
+ case __NR_getxattr:
+ case __NR_lgetxattr:
+ case __NR_lremovexattr:
+#ifdef __NR_link
+ case __NR_link:
+#endif
+ case __NR_listxattr:
+ case __NR_llistxattr:
+ case __NR_lsetxattr:
+ case __NR_pivot_root:
+ case __NR_removexattr:
+#ifdef __NR_rename
+ case __NR_rename:
+#endif
+ case __NR_request_key:
+ case __NR_setxattr:
+#ifdef __NR_symlink
+ case __NR_symlink:
+#endif
+ sys_data->user_mask = BIT(0) | BIT(1);
+ break;
+ /* 2 user args, 0 and 2 */
+ case __NR_symlinkat:
+ sys_data->user_mask = BIT(0) | BIT(2);
+ break;
+ /* 2 user args, 1 and 3 */
+ case __NR_getxattrat:
+ case __NR_linkat:
+ case __NR_listxattrat:
+ case __NR_move_mount:
+#ifdef __NR_renameat
+ case __NR_renameat:
+#endif
+ case __NR_renameat2:
+ case __NR_removexattrat:
+ case __NR_setxattrat:
+ sys_data->user_mask = BIT(1) | BIT(3);
+ break;
+ case __NR_mount: /* Just dev_name and dir_name, TODO add type */
+ sys_data->user_mask = BIT(0) | BIT(1) | BIT(2);
+ break;
+ default:
+ sys_data->user_mask = 0;
+ return;
+ }
+
+ if (sys_data->user_arg_size < 0)
+ return;
+
+ /*
+ * The user_arg_size can only be used when the system call
+ * is reading only a single address from user space.
+ */
+ mask = sys_data->user_mask;
+ if (WARN_ON(mask & (mask - 1)))
+ sys_data->user_arg_size = -1;
+}
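
[Editorial note, not part of the patch] In check_faultable_syscall() above, user_mask holds one bit per syscall argument slot that points into user space, user_arg_size (when not -1) names the slot that carries the buffer length, and user_arg_is_str marks the buffer as a string. The closing WARN_ON uses the classic mask & (mask - 1) test, which is non-zero exactly when more than one bit is set, so a length slot is only honoured when a single user pointer is captured. A rough, self-contained sketch of walking such a mask (the helper below is invented for illustration):

    #include <stdio.h>

    /* One bit per syscall argument slot that holds a user-space pointer */
    static void walk_user_args(unsigned long user_mask, const unsigned long *args)
    {
            for (int i = 0; i < 6 && user_mask; i++, user_mask >>= 1) {
                    if (user_mask & 1)
                            printf("arg %d is a user pointer: 0x%lx\n", i, args[i]);
            }
    }

    int main(void)
    {
            /* e.g. two user pointers in slots 0 and 1, as for add_key above */
            unsigned long mask = (1UL << 0) | (1UL << 1);
            unsigned long args[6] = { 0x7ffd1000, 0x7ffd2000, 0, 0, 0, 0 };

            /* mask & (mask - 1) != 0 here, so a single length slot could not be trusted */
            walk_user_args(mask, args);
            return 0;
    }
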
+
static int __init init_syscall_trace(struct trace_event_call *call)
{
int id;
@@ -471,6 +1274,8 @@ static int __init init_syscall_trace(struct trace_event_call *call)
return -ENOSYS;
}
+ check_faultable_syscall(call, num);
+
if (set_syscall_print_fmt(call) < 0)
return -ENOMEM;
@@ -598,9 +1403,14 @@ static void perf_syscall_enter(void *ignore, struct pt_regs *regs, long id)
struct hlist_head *head;
unsigned long args[6];
bool valid_prog_array;
+ bool mayfault;
+ char *user_ptr;
+ int user_sizes[SYSCALL_FAULT_MAX_CNT] = {};
+ int buf_size = CONFIG_TRACE_SYSCALL_BUF_SIZE_DEFAULT;
int syscall_nr;
int rctx;
- int size;
+ int size = 0;
+ int uargs = 0;
/*
* Syscall probe called with preemption enabled, but the ring
@@ -619,13 +1429,24 @@ static void perf_syscall_enter(void *ignore, struct pt_regs *regs, long id)
if (!sys_data)
return;
+ syscall_get_arguments(current, regs, args);
+
+ /* Check if this syscall event faults in user space memory */
+ mayfault = sys_data->user_mask != 0;
+
+ if (mayfault) {
+ if (syscall_get_data(sys_data, args, &user_ptr,
+ &size, user_sizes, &uargs, buf_size) < 0)
+ return;
+ }
+
head = this_cpu_ptr(sys_data->enter_event->perf_events);
valid_prog_array = bpf_prog_array_valid(sys_data->enter_event);
if (!valid_prog_array && hlist_empty(head))
return;
/* get the size after alignment with the u32 buffer size field */
- size = sizeof(unsigned long) * sys_data->nb_args + sizeof(*rec);
+ size += sizeof(unsigned long) * sys_data->nb_args + sizeof(*rec);
size = ALIGN(size + sizeof(u32), sizeof(u64));
size -= sizeof(u32);
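
[Editorial note, not part of the patch] The size computation above keeps the rule hinted at by the existing comment: a u32 size field sits in front of the perf raw payload and the sum must stay u64-aligned, hence ALIGN(size + sizeof(u32), sizeof(u64)) followed by subtracting the u32 again; the change is that size now starts from the number of user-space bytes captured rather than from zero. A worked example with assumed numbers (8-byte longs, a 16-byte stand-in for sizeof(*rec), 40 bytes of user data):

    #include <stdio.h>
    #include <stdint.h>

    /* Same rounding as the kernel's ALIGN() for power-of-two alignments */
    #define ALIGN(x, a)     (((x) + (a) - 1) & ~((a) - 1))

    int main(void)
    {
            size_t user_bytes = 40;    /* assumed user-space data captured    */
            size_t rec_header = 16;    /* assumed stand-in for sizeof(*rec)   */
            size_t nb_args    = 6;

            size_t size = user_bytes;
            size += sizeof(unsigned long) * nb_args + rec_header;      /* 40 + 48 + 16 = 104  */
            size  = ALIGN(size + sizeof(uint32_t), sizeof(uint64_t));  /* ALIGN(108, 8) = 112 */
            size -= sizeof(uint32_t);                                  /* back to 108         */

            /* 108 payload bytes plus the 4-byte size field = 112, which is u64-aligned */
            printf("payload size to reserve: %zu\n", size);
            return 0;
    }
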
@@ -634,9 +1455,11 @@ static void perf_syscall_enter(void *ignore, struct pt_regs *regs, long id)
return;
rec->nr = syscall_nr;
- syscall_get_arguments(current, regs, args);
memcpy(&rec->args, args, sizeof(unsigned long) * sys_data->nb_args);
+ if (mayfault)
+ syscall_put_data(sys_data, rec, user_ptr, size, user_sizes, uargs);
+
if ((valid_prog_array &&
!perf_call_bpf_enter(sys_data->enter_event, fake_regs, sys_data, rec)) ||
hlist_empty(head)) {
@@ -651,36 +1474,46 @@ static void perf_syscall_enter(void *ignore, struct pt_regs *regs, long id)
static int perf_sysenter_enable(struct trace_event_call *call)
{
- int ret = 0;
+ struct syscall_metadata *sys_data = call->data;
int num;
+ int ret;
- num = ((struct syscall_metadata *)call->data)->syscall_nr;
+ num = sys_data->syscall_nr;
- mutex_lock(&syscall_trace_lock);
- if (!sys_perf_refcount_enter)
+ guard(mutex)(&syscall_trace_lock);
+ if (sys_data->user_mask) {
+ ret = syscall_fault_buffer_enable();
+ if (ret < 0)
+ return ret;
+ }
+ if (!sys_perf_refcount_enter) {
ret = register_trace_sys_enter(perf_syscall_enter, NULL);
- if (ret) {
- pr_info("event trace: Could not activate syscall entry trace point");
- } else {
- set_bit(num, enabled_perf_enter_syscalls);
- sys_perf_refcount_enter++;
+ if (ret) {
+ pr_info("event trace: Could not activate syscall entry trace point");
+ if (sys_data->user_mask)
+ syscall_fault_buffer_disable();
+ return ret;
+ }
}
- mutex_unlock(&syscall_trace_lock);
- return ret;
+ set_bit(num, enabled_perf_enter_syscalls);
+ sys_perf_refcount_enter++;
+ return 0;
}
static void perf_sysenter_disable(struct trace_event_call *call)
{
+ struct syscall_metadata *sys_data = call->data;
int num;
- num = ((struct syscall_metadata *)call->data)->syscall_nr;
+ num = sys_data->syscall_nr;
- mutex_lock(&syscall_trace_lock);
+ guard(mutex)(&syscall_trace_lock);
sys_perf_refcount_enter--;
clear_bit(num, enabled_perf_enter_syscalls);
if (!sys_perf_refcount_enter)
unregister_trace_sys_enter(perf_syscall_enter, NULL);
- mutex_unlock(&syscall_trace_lock);
+ if (sys_data->user_mask)
+ syscall_fault_buffer_disable();
}
static int perf_call_bpf_exit(struct trace_event_call *call, struct pt_regs *regs,
@@ -757,22 +1590,21 @@ static void perf_syscall_exit(void *ignore, struct pt_regs *regs, long ret)
static int perf_sysexit_enable(struct trace_event_call *call)
{
- int ret = 0;
int num;
num = ((struct syscall_metadata *)call->data)->syscall_nr;
- mutex_lock(&syscall_trace_lock);
- if (!sys_perf_refcount_exit)
- ret = register_trace_sys_exit(perf_syscall_exit, NULL);
- if (ret) {
- pr_info("event trace: Could not activate syscall exit trace point");
- } else {
- set_bit(num, enabled_perf_exit_syscalls);
- sys_perf_refcount_exit++;
+ guard(mutex)(&syscall_trace_lock);
+ if (!sys_perf_refcount_exit) {
+ int ret = register_trace_sys_exit(perf_syscall_exit, NULL);
+ if (ret) {
+ pr_info("event trace: Could not activate syscall exit trace point");
+ return ret;
+ }
}
- mutex_unlock(&syscall_trace_lock);
- return ret;
+ set_bit(num, enabled_perf_exit_syscalls);
+ sys_perf_refcount_exit++;
+ return 0;
}
static void perf_sysexit_disable(struct trace_event_call *call)
@@ -781,12 +1613,11 @@ static void perf_sysexit_disable(struct trace_event_call *call)
num = ((struct syscall_metadata *)call->data)->syscall_nr;
- mutex_lock(&syscall_trace_lock);
+ guard(mutex)(&syscall_trace_lock);
sys_perf_refcount_exit--;
clear_bit(num, enabled_perf_exit_syscalls);
if (!sys_perf_refcount_exit)
unregister_trace_sys_exit(perf_syscall_exit, NULL);
- mutex_unlock(&syscall_trace_lock);
}
#endif /* CONFIG_PERF_EVENTS */