Commit e82894f8 authored by Tom Zanussi's avatar Tom Zanussi Committed by Linus Torvalds

[PATCH] relayfs

Here's the latest version of relayfs, against linux-2.6.11-mm2.  I'm hoping
you'll consider putting this version back into your tree - the previous
rounds of comment seem to have shaken out all the API issues and the number
of comments on the code itself have also steadily dwindled.

This patch is essentially the same as the relayfs redux part 5 patch, with
some minor changes based on reviewer comments.  Thanks again to Pekka
Enberg for those.  The patch size without documentation is now a little
smaller at just over 40k.  Here's a detailed list of the changes:

- removed the attribute_flags in relay open and changed it to a
  boolean specifying either overwrite or no-overwrite mode, and removed
  everything referencing the attribute flags.
- added a check for NULL names in relayfs_create_entry()
- got rid of the unnecessary multiple labels in relay_create_buf()
- some minor simplification of relay_alloc_buf() which got rid of a
  couple params
- updated the Documentation

In addition, this version (through code contained in the relay-apps tarball
linked to below, not as part of the relayfs patch) tries to make it as easy
as possible to create the cooperating kernel/user pieces of a typical and
common type of logging application, one where kernel logging is kicked off
when a user space data collection app starts and stops when the collection
app exits, with the data being automatically logged to disk in between.  To
create this type of application, you basically just include a header file
(relay-app.h, included in the relay-apps tarball) in your kernel module,
define a couple of callbacks and call an initialization function, and on
the user side call a single function that sets up and continuously monitors
the buffers, and writes data to files as it becomes available.  Channels
are created when the collection app is started and destroyed when it exits,
not when the kernel module is inserted, so different channel buffer sizes
can be specified for each separate run via command-line options.  See the
README in the relay-apps tarball for details.

Also included in the relay-apps tarball are a couple examples
demonstrating how you can use this to create quick and dirty kernel
logging/debugging applications.  They are:

- tprintk, short for 'tee printk', which temporarily puts a kprobe on
  printk() and writes a duplicate stream of printk output to a relayfs
  channel.  This could be used anywhere there's printk() debugging code
  in the kernel which you'd like to exercise, but would rather not have
  your system logs cluttered with debugging junk.  You'd probably want
  to kill klogd while you do this, otherwise there wouldn't be much
  point (since putting a kprobe on printk() doesn't change the output
  of printk()).  I've used this method to temporarily divert the packet
  logging output of the iptables LOG target from the system logs to
  relayfs files instead, for instance.

- klog, which just provides a printk-like formatted logging function
  on top of relayfs.  Again, you can use this to keep stuff out of your
  system logs if used in place of printk.

The example applications can be found here:

http://prdownloads.sourceforge.net/dprobes/relay-apps.tar.gz?download

From: Christoph Hellwig <hch@lst.de>

  avoid lookup_hash usage in relayfs
Signed-off-by: default avatarTom Zanussi <zanussi@us.ibm.com>
Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
parent 8446f1d3
This diff is collapsed.
......@@ -816,6 +816,18 @@ config RAMFS
To compile this as a module, choose M here: the module will be called
ramfs.
config RELAYFS_FS
tristate "Relayfs file system support"
---help---
Relayfs is a high-speed data relay filesystem designed to provide
an efficient mechanism for tools and facilities to relay large
amounts of data from kernel space to user space.
To compile this code as a module, choose M here: the module will be
called relayfs.
If unsure, say N.
endmenu
menu "Miscellaneous filesystems"
......
......@@ -90,6 +90,7 @@ obj-$(CONFIG_AUTOFS_FS) += autofs/
obj-$(CONFIG_AUTOFS4_FS) += autofs4/
obj-$(CONFIG_ADFS_FS) += adfs/
obj-$(CONFIG_UDF_FS) += udf/
obj-$(CONFIG_RELAYFS_FS) += relayfs/
obj-$(CONFIG_SUN_OPENPROMFS) += openpromfs/
obj-$(CONFIG_JFS_FS) += jfs/
obj-$(CONFIG_XFS_FS) += xfs/
......
obj-$(CONFIG_RELAYFS_FS) += relayfs.o
relayfs-y := relay.o inode.o buffers.o
/*
* RelayFS buffer management code.
*
* Copyright (C) 2002-2005 - Tom Zanussi (zanussi@us.ibm.com), IBM Corp
* Copyright (C) 1999-2005 - Karim Yaghmour (karim@opersys.com)
*
* This file is released under the GPL.
*/
#include <linux/module.h>
#include <linux/vmalloc.h>
#include <linux/mm.h>
#include <linux/relayfs_fs.h>
#include "relay.h"
#include "buffers.h"
/*
* close() vm_op implementation for relayfs file mapping.
*/
static void relay_file_mmap_close(struct vm_area_struct *vma)
{
struct rchan_buf *buf = vma->vm_private_data;
buf->chan->cb->buf_unmapped(buf, vma->vm_file);
}
/*
* nopage() vm_op implementation for relayfs file mapping.
*/
static struct page *relay_buf_nopage(struct vm_area_struct *vma,
unsigned long address,
int *type)
{
struct page *page;
struct rchan_buf *buf = vma->vm_private_data;
unsigned long offset = address - vma->vm_start;
if (address > vma->vm_end)
return NOPAGE_SIGBUS; /* Disallow mremap */
if (!buf)
return NOPAGE_OOM;
page = vmalloc_to_page(buf->start + offset);
if (!page)
return NOPAGE_OOM;
get_page(page);
if (type)
*type = VM_FAULT_MINOR;
return page;
}
/*
* vm_ops for relay file mappings.
*/
static struct vm_operations_struct relay_file_mmap_ops = {
.nopage = relay_buf_nopage,
.close = relay_file_mmap_close,
};
/**
* relay_mmap_buf: - mmap channel buffer to process address space
* @buf: relay channel buffer
* @vma: vm_area_struct describing memory to be mapped
*
* Returns 0 if ok, negative on error
*
* Caller should already have grabbed mmap_sem.
*/
int relay_mmap_buf(struct rchan_buf *buf, struct vm_area_struct *vma)
{
unsigned long length = vma->vm_end - vma->vm_start;
struct file *filp = vma->vm_file;
if (!buf)
return -EBADF;
if (length != (unsigned long)buf->chan->alloc_size)
return -EINVAL;
vma->vm_ops = &relay_file_mmap_ops;
vma->vm_private_data = buf;
buf->chan->cb->buf_mapped(buf, filp);
return 0;
}
/**
* relay_alloc_buf - allocate a channel buffer
* @buf: the buffer struct
* @size: total size of the buffer
*
* Returns a pointer to the resulting buffer, NULL if unsuccessful
*/
static void *relay_alloc_buf(struct rchan_buf *buf, unsigned long size)
{
void *mem;
unsigned int i, j, n_pages;
size = PAGE_ALIGN(size);
n_pages = size >> PAGE_SHIFT;
buf->page_array = kcalloc(n_pages, sizeof(struct page *), GFP_KERNEL);
if (!buf->page_array)
return NULL;
for (i = 0; i < n_pages; i++) {
buf->page_array[i] = alloc_page(GFP_KERNEL);
if (unlikely(!buf->page_array[i]))
goto depopulate;
}
mem = vmap(buf->page_array, n_pages, GFP_KERNEL, PAGE_KERNEL);
if (!mem)
goto depopulate;
memset(mem, 0, size);
buf->page_count = n_pages;
return mem;
depopulate:
for (j = 0; j < i; j++)
__free_page(buf->page_array[j]);
kfree(buf->page_array);
return NULL;
}
/**
* relay_create_buf - allocate and initialize a channel buffer
* @alloc_size: size of the buffer to allocate
* @n_subbufs: number of sub-buffers in the channel
*
* Returns channel buffer if successful, NULL otherwise
*/
struct rchan_buf *relay_create_buf(struct rchan *chan)
{
struct rchan_buf *buf = kcalloc(1, sizeof(struct rchan_buf), GFP_KERNEL);
if (!buf)
return NULL;
buf->padding = kmalloc(chan->n_subbufs * sizeof(size_t *), GFP_KERNEL);
if (!buf->padding)
goto free_buf;
buf->start = relay_alloc_buf(buf, chan->alloc_size);
if (!buf->start)
goto free_buf;
buf->chan = chan;
kref_get(&buf->chan->kref);
return buf;
free_buf:
kfree(buf->padding);
kfree(buf);
return NULL;
}
/**
* relay_destroy_buf - destroy an rchan_buf struct and associated buffer
* @buf: the buffer struct
*/
void relay_destroy_buf(struct rchan_buf *buf)
{
struct rchan *chan = buf->chan;
unsigned int i;
if (likely(buf->start)) {
vunmap(buf->start);
for (i = 0; i < buf->page_count; i++)
__free_page(buf->page_array[i]);
kfree(buf->page_array);
}
kfree(buf->padding);
kfree(buf);
kref_put(&chan->kref, relay_destroy_channel);
}
/**
* relay_remove_buf - remove a channel buffer
*
* Removes the file from the relayfs fileystem, which also frees the
* rchan_buf_struct and the channel buffer. Should only be called from
* kref_put().
*/
void relay_remove_buf(struct kref *kref)
{
struct rchan_buf *buf = container_of(kref, struct rchan_buf, kref);
relayfs_remove(buf->dentry);
}
#ifndef _BUFFERS_H
#define _BUFFERS_H
/* This inspired by rtai/shmem */
#define FIX_SIZE(x) (((x) - 1) & PAGE_MASK) + PAGE_SIZE
extern int relay_mmap_buf(struct rchan_buf *buf, struct vm_area_struct *vma);
extern struct rchan_buf *relay_create_buf(struct rchan *chan);
extern void relay_destroy_buf(struct rchan_buf *buf);
extern void relay_remove_buf(struct kref *kref);
#endif/* _BUFFERS_H */
This diff is collapsed.
This diff is collapsed.
#ifndef _RELAY_H
#define _RELAY_H
struct dentry *relayfs_create_file(const char *name,
struct dentry *parent,
int mode,
struct rchan *chan);
extern int relayfs_remove(struct dentry *dentry);
extern int relay_buf_empty(struct rchan_buf *buf);
extern void relay_destroy_channel(struct kref *kref);
#endif /* _RELAY_H */
/*
* linux/include/linux/relayfs_fs.h
*
* Copyright (C) 2002, 2003 - Tom Zanussi (zanussi@us.ibm.com), IBM Corp
* Copyright (C) 1999, 2000, 2001, 2002 - Karim Yaghmour (karim@opersys.com)
*
* RelayFS definitions and declarations
*/
#ifndef _LINUX_RELAYFS_FS_H
#define _LINUX_RELAYFS_FS_H
#include <linux/config.h>
#include <linux/types.h>
#include <linux/sched.h>
#include <linux/wait.h>
#include <linux/list.h>
#include <linux/fs.h>
#include <linux/poll.h>
#include <linux/kref.h>
/*
* Tracks changes to rchan_buf struct
*/
#define RELAYFS_CHANNEL_VERSION 5
/*
* Per-cpu relay channel buffer
*/
struct rchan_buf
{
void *start; /* start of channel buffer */
void *data; /* start of current sub-buffer */
size_t offset; /* current offset into sub-buffer */
size_t subbufs_produced; /* count of sub-buffers produced */
size_t subbufs_consumed; /* count of sub-buffers consumed */
struct rchan *chan; /* associated channel */
wait_queue_head_t read_wait; /* reader wait queue */
struct work_struct wake_readers; /* reader wake-up work struct */
struct dentry *dentry; /* channel file dentry */
struct kref kref; /* channel buffer refcount */
struct page **page_array; /* array of current buffer pages */
unsigned int page_count; /* number of current buffer pages */
unsigned int finalized; /* buffer has been finalized */
size_t *padding; /* padding counts per sub-buffer */
size_t prev_padding; /* temporary variable */
size_t bytes_consumed; /* bytes consumed in cur read subbuf */
unsigned int cpu; /* this buf's cpu */
} ____cacheline_aligned;
/*
* Relay channel data structure
*/
struct rchan
{
u32 version; /* the version of this struct */
size_t subbuf_size; /* sub-buffer size */
size_t n_subbufs; /* number of sub-buffers per buffer */
size_t alloc_size; /* total buffer size allocated */
struct rchan_callbacks *cb; /* client callbacks */
struct kref kref; /* channel refcount */
void *private_data; /* for user-defined data */
struct rchan_buf *buf[NR_CPUS]; /* per-cpu channel buffers */
};
/*
* Relayfs inode
*/
struct relayfs_inode_info
{
struct inode vfs_inode;
struct rchan_buf *buf;
};
static inline struct relayfs_inode_info *RELAYFS_I(struct inode *inode)
{
return container_of(inode, struct relayfs_inode_info, vfs_inode);
}
/*
* Relay channel client callbacks
*/
struct rchan_callbacks
{
/*
* subbuf_start - called on buffer-switch to a new sub-buffer
* @buf: the channel buffer containing the new sub-buffer
* @subbuf: the start of the new sub-buffer
* @prev_subbuf: the start of the previous sub-buffer
* @prev_padding: unused space at the end of previous sub-buffer
*
* The client should return 1 to continue logging, 0 to stop
* logging.
*
* NOTE: subbuf_start will also be invoked when the buffer is
* created, so that the first sub-buffer can be initialized
* if necessary. In this case, prev_subbuf will be NULL.
*
* NOTE: the client can reserve bytes at the beginning of the new
* sub-buffer by calling subbuf_start_reserve() in this callback.
*/
int (*subbuf_start) (struct rchan_buf *buf,
void *subbuf,
void *prev_subbuf,
size_t prev_padding);
/*
* buf_mapped - relayfs buffer mmap notification
* @buf: the channel buffer
* @filp: relayfs file pointer
*
* Called when a relayfs file is successfully mmapped
*/
void (*buf_mapped)(struct rchan_buf *buf,
struct file *filp);
/*
* buf_unmapped - relayfs buffer unmap notification
* @buf: the channel buffer
* @filp: relayfs file pointer
*
* Called when a relayfs file is successfully unmapped
*/
void (*buf_unmapped)(struct rchan_buf *buf,
struct file *filp);
};
/*
* relayfs kernel API, fs/relayfs/relay.c
*/
struct rchan *relay_open(const char *base_filename,
struct dentry *parent,
size_t subbuf_size,
size_t n_subbufs,
struct rchan_callbacks *cb);
extern void relay_close(struct rchan *chan);
extern void relay_flush(struct rchan *chan);
extern void relay_subbufs_consumed(struct rchan *chan,
unsigned int cpu,
size_t consumed);
extern void relay_reset(struct rchan *chan);
extern int relay_buf_full(struct rchan_buf *buf);
extern size_t relay_switch_subbuf(struct rchan_buf *buf,
size_t length);
extern struct dentry *relayfs_create_dir(const char *name,
struct dentry *parent);
extern int relayfs_remove_dir(struct dentry *dentry);
/**
* relay_write - write data into the channel
* @chan: relay channel
* @data: data to be written
* @length: number of bytes to write
*
* Writes data into the current cpu's channel buffer.
*
* Protects the buffer by disabling interrupts. Use this
* if you might be logging from interrupt context. Try
* __relay_write() if you know you won't be logging from
* interrupt context.
*/
static inline void relay_write(struct rchan *chan,
const void *data,
size_t length)
{
unsigned long flags;
struct rchan_buf *buf;
local_irq_save(flags);
buf = chan->buf[smp_processor_id()];
if (unlikely(buf->offset + length > chan->subbuf_size))
length = relay_switch_subbuf(buf, length);
memcpy(buf->data + buf->offset, data, length);
buf->offset += length;
local_irq_restore(flags);
}
/**
* __relay_write - write data into the channel
* @chan: relay channel
* @data: data to be written
* @length: number of bytes to write
*
* Writes data into the current cpu's channel buffer.
*
* Protects the buffer by disabling preemption. Use
* relay_write() if you might be logging from interrupt
* context.
*/
static inline void __relay_write(struct rchan *chan,
const void *data,
size_t length)
{
struct rchan_buf *buf;
buf = chan->buf[get_cpu()];
if (unlikely(buf->offset + length > buf->chan->subbuf_size))
length = relay_switch_subbuf(buf, length);
memcpy(buf->data + buf->offset, data, length);
buf->offset += length;
put_cpu();
}
/**
* relay_reserve - reserve slot in channel buffer
* @chan: relay channel
* @length: number of bytes to reserve
*
* Returns pointer to reserved slot, NULL if full.
*
* Reserves a slot in the current cpu's channel buffer.
* Does not protect the buffer at all - caller must provide
* appropriate synchronization.
*/
static inline void *relay_reserve(struct rchan *chan, size_t length)
{
void *reserved;
struct rchan_buf *buf = chan->buf[smp_processor_id()];
if (unlikely(buf->offset + length > buf->chan->subbuf_size)) {
length = relay_switch_subbuf(buf, length);
if (!length)
return NULL;
}
reserved = buf->data + buf->offset;
buf->offset += length;
return reserved;
}
/**
* subbuf_start_reserve - reserve bytes at the start of a sub-buffer
* @buf: relay channel buffer
* @length: number of bytes to reserve
*
* Helper function used to reserve bytes at the beginning of
* a sub-buffer in the subbuf_start() callback.
*/
static inline void subbuf_start_reserve(struct rchan_buf *buf,
size_t length)
{
BUG_ON(length >= buf->chan->subbuf_size - 1);
buf->offset = length;
}
/*
* exported relayfs file operations, fs/relayfs/inode.c
*/
extern struct file_operations relayfs_file_operations;
#endif /* _LINUX_RELAYFS_FS_H */
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment