12 Dec 2013
Last time I started to rewrite the VFS layer of my hobby kernel - again. This time I'll take a look at the system call couplings.
For a while now, I've had a cross compiler and newlib for my kernel, which means I have some basic syscall interfaces to start from.
Newlib requires the following syscalls:
    int close(int file);
    int fstat(int file, struct stat *st);
    int isatty(int file);
    int link(char *old, char *new);
    int lseek(int file, int ptr, int dir);
    int open(const char *name, int flags, int mode);
    int read(int file, char *ptr, int len);
    int stat(const char *file, struct stat *st);
    int unlink(char *name);
    int write(int file, char *ptr, int len);
Everything starts with open, so let's look at that first.
To keep track of the files that a process has opened, we need a new data structure: the file descriptor.
    typedef struct {
        INODE ino;
        uint32_t offset;
        uint32_t flags;
        uint32_t users;
    } file_desc_t;
The file descriptor keeps track of our position in the file as well as
the mode it was opened in. File descriptors can also be shared between
processes (after a fork(), for example), so each one has a use
counter. Two macros are used to manipulate the use counter:
    #define fd_get(fd) do { (fd)->users++; } while(0)
    #define fd_put(fd) do { (fd)->users--; if(!(fd)->users) free(fd); } while(0)
Each process descriptor has an array of pointers to file descriptors:

    file_desc_t *fd[NUM_FILEDES];
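To make the sharing concrete, here is a rough sketch of what duplicating the descriptor table on fork() could look like. It is not the kernel's actual code: `fd_table_share` is a hypothetical helper, `NUM_FILEDES` is given an arbitrary value, `INODE` is reduced to an opaque pointer, and the two macros are written as functions so the block compiles on its own.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NUM_FILEDES 8            /* assumed table size for this sketch */

typedef struct inode *INODE;     /* opaque stand-in for the kernel's INODE */

typedef struct {
    INODE ino;
    uint32_t offset;
    uint32_t flags;
    uint32_t users;
} file_desc_t;

/* Function forms of fd_get/fd_put, usable in any statement context. */
static void fd_get(file_desc_t *fd)
{
    fd->users++;
}

static void fd_put(file_desc_t *fd)
{
    assert(fd->users > 0);       /* a put without a matching get is a bug */
    if (--fd->users == 0)
        free(fd);
}

/* Hypothetical fork() helper: the child shares every open descriptor. */
static void fd_table_share(file_desc_t *child[], file_desc_t *parent[])
{
    for (int i = 0; i < NUM_FILEDES; i++) {
        child[i] = parent[i];
        if (child[i])
            fd_get(child[i]);    /* one more user of the same descriptor */
    }
}
```

Since parent and child point at the same file_desc_t, they also share the file offset, which is the behaviour POSIX specifies for descriptors inherited across fork().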
open starts by finding a free file descriptor. It then finds the file,
opens it and returns the index of the file descriptor it used:
    int open(const char *name, int flags, int mode)
    {
        int fd;

        // Find unused file descriptor
        process_t *p = current->proc;
        int i;
        for(i=0; i < NUM_FILEDES; i++)
        {
            if(p->fd[i])
                continue;
            fd = i;
            p->fd[fd] = calloc(1, sizeof(file_desc_t));
            fd_get(p->fd[fd]);
            break;
        }

        // Find file
        INODE ino = vfs_namei(name);

        // Open file
        vfs_open(name, flags);

        // Setup file descriptor
        p->fd[fd]->ino = ino;
        p->fd[fd]->offset = 0;
        p->fd[fd]->flags = flags;

        return fd;
    }
I stripped away all of the sanity checking and error handling code here. With that code, the function is more than twice as long.
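To give an idea of what those checks might look like, here is a minimal sketch with the error paths put back in. It is not the real code: `open_checked` is a hypothetical name, the process is passed in explicitly instead of coming from `current->proc`, the VFS functions are stubs that know about a single file called "/demo", and returning a negative errno value is an assumed convention. Cleanup of a looked-up inode on the failure paths is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NUM_FILEDES 4                     /* small on purpose, for the demo */

typedef struct inode { const char *name; } *INODE;   /* stand-in inode */
typedef struct { INODE ino; uint32_t offset, flags, users; } file_desc_t;
typedef struct { file_desc_t *fd[NUM_FILEDES]; } process_t;

/* Stub VFS: exactly one file, "/demo", exists. */
static struct inode demo_inode = { "/demo" };

static INODE vfs_namei(const char *name)
{
    return strcmp(name, demo_inode.name) == 0 ? &demo_inode : NULL;
}

static int vfs_open(const char *name, int flags)
{
    (void)name; (void)flags;
    return 0;                             /* the stub always succeeds */
}

/* Hypothetical checked open: returns a descriptor index or a negative errno. */
static int open_checked(process_t *p, const char *name, int flags)
{
    assert(p != NULL);
    if (!name)
        return -EINVAL;

    /* Find an unused slot before allocating anything. */
    int fd = -1;
    for (int i = 0; i < NUM_FILEDES; i++) {
        if (!p->fd[i]) { fd = i; break; }
    }
    if (fd < 0)
        return -EMFILE;                   /* descriptor table is full */

    INODE ino = vfs_namei(name);
    if (!ino)
        return -ENOENT;                   /* no such file */

    int err = vfs_open(name, flags);
    if (err < 0)
        return err;

    file_desc_t *desc = calloc(1, sizeof(*desc));
    if (!desc)
        return -ENOMEM;

    /* Only publish the descriptor once every step has succeeded. */
    desc->ino = ino;
    desc->flags = (uint32_t)flags;
    desc->users = 1;
    p->fd[fd] = desc;
    return fd;
}
```

Allocating the descriptor last means a failed lookup never leaves a half-initialised entry in the table.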
close is even easier:
    int close(int file)
    {
        process_t *p = current->proc;

        int retval = vfs_close(p->fd[file]->ino);
        if(!p->fd[file]->ino->parent)
            free(p->fd[file]->ino);
        fd_put(p->fd[file]);
        p->fd[file] = 0;

        return retval;
    }
I always check if an inode has a parent before freeing it. If it has a parent, it's part of the vfs mount tree, and should be kept around.
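One case the stripped-down version does not cover is a descriptor that is still shared with another process. A possible way to handle it, sketched under the assumption that the inode should only be closed and released by the last user, is shown here. `close_shared` is a hypothetical name, `vfs_close` is a counting stub, the inode is reduced to its `parent` pointer, and the negative errno return is an assumed convention.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define NUM_FILEDES 4

typedef struct inode { struct inode *parent; } *INODE;   /* stand-in inode */
typedef struct { INODE ino; uint32_t offset, flags, users; } file_desc_t;
typedef struct { file_desc_t *fd[NUM_FILEDES]; } process_t;

static int vfs_close_calls;               /* lets a test observe the stub */

static int vfs_close(INODE ino)
{
    (void)ino;
    vfs_close_calls++;
    return 0;
}

/* Hypothetical close that defers inode teardown to the last user. */
static int close_shared(process_t *p, int file)
{
    if (file < 0 || file >= NUM_FILEDES || !p->fd[file])
        return -EBADF;

    file_desc_t *desc = p->fd[file];
    p->fd[file] = NULL;                   /* this process is done with it */

    assert(desc->users > 0);
    if (--desc->users > 0)
        return 0;                         /* another process still uses it */

    int retval = vfs_close(desc->ino);
    if (!desc->ino->parent)               /* not part of the mount tree */
        free(desc->ino);
    free(desc);
    return retval;
}
```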
Next, let's look at read. It's actually really simple (excluding sanity checking and error handling):
    int read(int file, char *ptr, int len)
    {
        process_t *p = current->proc;

        INODE node = p->fd[file]->ino;
        int ret = vfs_read(node, ptr, len, p->fd[file]->offset);
        p->fd[file]->offset += ret;

        return ret;
    }
Write is pretty much the same:
    int write(int file, char *ptr, int len)
    {
        process_t *p = current->proc;

        INODE node = p->fd[file]->ino;
        int ret = vfs_write(node, ptr, len, p->fd[file]->offset);
        p->fd[file]->offset += ret;

        return ret;
    }
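To show how the offset moves between calls, here is a small self-contained sketch of the read path with some of the sanity checks added back. Everything not in the listings above is an assumption: `read_checked` is a hypothetical name, the process is passed in explicitly, `vfs_read` is a stub that copies from an in-memory buffer, and errors are returned as negative errno values.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define NUM_FILEDES 4

typedef struct inode { const char *data; uint32_t length; } *INODE;
typedef struct { INODE ino; uint32_t offset, flags, users; } file_desc_t;
typedef struct { file_desc_t *fd[NUM_FILEDES]; } process_t;

/* Stub vfs_read: copy from an in-memory buffer, stopping at end of file. */
static int vfs_read(INODE ino, char *ptr, int len, uint32_t offset)
{
    assert(len >= 0);
    if (offset >= ino->length)
        return 0;                         /* nothing left to read */
    uint32_t left = ino->length - offset;
    uint32_t n = (uint32_t)len < left ? (uint32_t)len : left;
    memcpy(ptr, ino->data + offset, n);
    return (int)n;
}

/* Hypothetical checked read: validates the descriptor before using it. */
static int read_checked(process_t *p, int file, char *ptr, int len)
{
    if (file < 0 || file >= NUM_FILEDES || !p->fd[file])
        return -EBADF;
    if (!ptr || len < 0)
        return -EINVAL;

    file_desc_t *desc = p->fd[file];
    int ret = vfs_read(desc->ino, ptr, len, desc->offset);
    if (ret > 0)
        desc->offset += (uint32_t)ret;    /* never add an error code */
    return ret;
}
```

With an 11-byte file, a 5-byte read leaves the offset at 5, the next read returns the remaining 6 bytes, and any read after that returns 0. A checked write would follow the same pattern.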
fstat and isatty just pass the request on to the corresponding
vfs functions:
    int fstat(int file, struct stat *st)
    {
        process_t *p = current->proc;
        INODE node = p->fd[file]->ino;
        return vfs_fstat(node, st);
    }

    int isatty(int file)
    {
        process_t *p = current->proc;
        INODE node = p->fd[file]->ino;
        return vfs_isatty(node);
    }
stat performs a namei lookup to get the node instead of taking it
from the process' file descriptor table.
    int stat(const char *file, struct stat *st)
    {
        INODE node = vfs_namei(file);
        int retval = vfs_fstat(node, st);
        if(!node->parent)
            free(node);
        return retval;
    }
The final function I'll look at now is lseek, which sets the current
position in the file:
    int lseek(int file, int ptr, int dir)
    {
        process_t *p = current->proc;

        if(dir == SEEK_SET)
        {
            p->fd[file]->offset = ptr;
        }
        if(dir == SEEK_CUR)
        {
            p->fd[file]->offset += ptr;
        }
        if(dir == SEEK_END)
        {
            p->fd[file]->offset = p->fd[file]->ino->length + ptr;
        }

        return p->fd[file]->offset;
    }
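Because the offset is a uint32_t, a seek that ends up before the start of the file would wrap around to a huge value. Here is a rough sketch of how the three modes could be checked, assuming the new position is computed in a wider signed type first. `lseek_checked` is a hypothetical name, it works on a single descriptor rather than on the process table, the inode is reduced to its `length` field, and the negative errno return is an assumed convention.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>      /* SEEK_SET, SEEK_CUR, SEEK_END */

typedef struct inode { uint32_t length; } *INODE;         /* stand-in inode */
typedef struct { INODE ino; uint32_t offset, flags, users; } file_desc_t;

/* Hypothetical checked seek: returns the new offset or a negative errno. */
static int lseek_checked(file_desc_t *desc, int ptr, int dir)
{
    int64_t base;

    assert(desc != NULL);
    switch (dir) {
    case SEEK_SET: base = 0;                 break;   /* from the start   */
    case SEEK_CUR: base = desc->offset;      break;   /* from where we are */
    case SEEK_END: base = desc->ino->length; break;   /* from the end     */
    default:       return -EINVAL;                    /* unknown mode     */
    }

    int64_t pos = base + ptr;
    if (pos < 0 || pos > INT32_MAX)
        return -EINVAL;         /* would wrap or not fit the return type */

    desc->offset = (uint32_t)pos;
    return (int)pos;
}
```

For a 100-byte file, seeking to SEEK_END with an offset of -10 gives position 90, while an offset of -200 is rejected and leaves the position unchanged. Seeking past the end is allowed, as in the original.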
I'll leave link and unlink for now, and come back to them when
I need them (i.e. when I want to implement a user-writable filesystem).
In the next post, we'll mount an actual filesystem.