Building an operating system kernel offers a rare look beneath the surface of modern computing systems. As part of the 15-410 Operating Systems course, we developed a Unix-like kernel entirely from scratch in C and x86 assembly. This post explores the foundational concepts of OS kernels and shares insights from our implementation, including key design decisions, development strategies, and lessons learned from debugging low-level code.
Part 1: Fundamental Building Blocks of an OS Kernel
Virtual Memory and Paging
Virtual memory allows each process to run in its own isolated address space, providing abstraction and protection. On the x86-32 architecture, paging implements virtual memory by translating virtual addresses to physical ones.
A virtual address in this architecture is divided into three components: the top 10 bits index into the page directory, the next 10 bits index into a page table, and the final 12 bits serve as an offset within the 4KB physical page.
The translation process begins with the CR3 register, which holds the physical base address of the page directory. The CPU uses the directory index to select a page directory entry, which points to a page table, then uses the table index to select the page table entry that identifies the physical frame. The physical address is formed by combining the frame address with the page offset.
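To make the two-level walk concrete, here is a minimal software version of the translation the MMU performs in hardware. The translate helper and the assumption that page tables can be dereferenced directly (e.g., through an identity mapping of physical memory) are illustrative, not how the hardware itself is driven:

```c
#include <stdint.h>

#define PD_INDEX(va)    (((uint32_t)(va) >> 22) & 0x3FF)  /* top 10 bits   */
#define PT_INDEX(va)    (((uint32_t)(va) >> 12) & 0x3FF)  /* next 10 bits  */
#define PAGE_OFFSET(va) ((uint32_t)(va) & 0xFFF)          /* low 12 bits   */
#define FRAME_MASK      0xFFFFF000u                       /* upper 20 bits */
#define PTE_PRESENT     0x1u

/*
 * Software walk of the two-level page table, mirroring what the MMU
 * does on each memory access. Returns 0 if the mapping is not present.
 */
uint32_t translate(uint32_t *page_directory, uint32_t vaddr)
{
    uint32_t pde = page_directory[PD_INDEX(vaddr)];
    if (!(pde & PTE_PRESENT))
        return 0;

    uint32_t *page_table = (uint32_t *)(pde & FRAME_MASK);
    uint32_t pte = page_table[PT_INDEX(vaddr)];
    if (!(pte & PTE_PRESENT))
        return 0;

    return (pte & FRAME_MASK) | PAGE_OFFSET(vaddr);
}
```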
Each page table entry occupies 4 bytes and includes status flags such as present, read/write, and user/supervisor, which control memory access permissions.
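The flag positions below are the standard x86-32 bit assignments; the make_user_rw_pte helper is just a hypothetical convenience for building an entry:

```c
#include <stdint.h>

/* Low-order flag bits of a page table (or page directory) entry on x86-32. */
#define PTE_PRESENT  0x001u  /* mapping is valid                      */
#define PTE_WRITE    0x002u  /* writable if set, read-only otherwise  */
#define PTE_USER     0x004u  /* user-mode accessible if set           */

/* Build an entry mapping a 4KB frame as user-readable/writable. */
static inline uint32_t make_user_rw_pte(uint32_t frame_paddr)
{
    return (frame_paddr & 0xFFFFF000u) | PTE_PRESENT | PTE_WRITE | PTE_USER;
}
```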
Context Switching and Scheduling
Context switching enables multitasking by saving the state of the currently running thread and restoring the state of another. On x86 systems, this includes saving general-purpose registers (EAX, EBX, etc.), segment registers, the instruction pointer (EIP), the stack pointer (ESP), and the EFLAGS register.
The scheduler keeps track of runnable threads. When a context switch is triggered—either voluntarily or by a timer interrupt—the kernel saves the current thread’s state in its Thread Control Block (TCB) and restores the next thread’s state before transferring control.
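A rough sketch of what this looks like, with hypothetical names (tcb_t, context_switch, schedule); the actual switch routine is a short piece of assembly that saves the outgoing thread's registers and swaps stack pointers:

```c
#include <stdint.h>

/* Hypothetical per-thread bookkeeping; a real TCB carries more state
 * (kernel stack base, owning process, run status, ...). */
typedef struct tcb {
    uint32_t *kernel_esp;   /* saved kernel stack pointer */
    struct tcb *next;       /* run-queue link             */
} tcb_t;

/* Implemented in assembly: pushes the outgoing thread's callee-saved
 * registers and EFLAGS, stores its stack pointer, loads the incoming
 * thread's stack pointer, then pops that thread's saved state. */
extern void context_switch(uint32_t **save_esp, uint32_t *restore_esp);

void schedule(tcb_t *current, tcb_t *next)
{
    if (current != next)
        context_switch(&current->kernel_esp, next->kernel_esp);
}
```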
Interrupt Handling and Mode Switching
Interrupts allow the CPU to respond to asynchronous events, such as hardware signals or exceptions. When an interrupt occurs, the CPU consults the Interrupt Descriptor Table (IDT) to determine the appropriate handler.
If the interrupt requires a privilege level change (e.g., switching from user mode to kernel mode), the CPU loads the kernel stack pointer from the Task State Segment (TSS) and pushes the user-mode stack segment selector (SS), stack pointer (ESP), EFLAGS, code segment selector (CS), and instruction pointer (EIP) onto the kernel stack. For some exceptions, an error code is also pushed.
Control is then transferred to the handler function defined in the IDT. To return from the interrupt, the IRET instruction restores the processor state and resumes execution.
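For illustration, here is roughly how a kernel might populate one IDT gate descriptor. The struct layout matches the 8-byte x86-32 gate format; idt_set_gate, the statically allocated table, and the KERNEL_CS value are assumptions for the sketch:

```c
#include <stdint.h>

/* One 8-byte gate descriptor in the IDT (x86-32 layout). */
typedef struct {
    uint16_t offset_low;    /* handler address, bits 0..15  */
    uint16_t selector;      /* kernel code segment selector */
    uint8_t  zero;          /* always 0                     */
    uint8_t  type_attr;     /* present, DPL, gate type      */
    uint16_t offset_high;   /* handler address, bits 16..31 */
} __attribute__((packed)) idt_gate_t;

#define KERNEL_CS        0x08  /* assumed kernel code selector        */
#define INT_GATE_KERNEL  0x8E  /* present, DPL=0, 32-bit interrupt gate */

static idt_gate_t idt[256];

/* Point one IDT slot at an assembly wrapper for the given handler. */
void idt_set_gate(int vector, void (*handler)(void))
{
    uint32_t addr = (uint32_t)handler;
    idt[vector].offset_low  = addr & 0xFFFF;
    idt[vector].selector    = KERNEL_CS;
    idt[vector].zero        = 0;
    idt[vector].type_attr   = INT_GATE_KERNEL;
    idt[vector].offset_high = (addr >> 16) & 0xFFFF;
}
```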
Timer Interrupts vs. Voluntary Context Switches
Timer interrupts occur at regular intervals and allow the kernel to preempt the currently running thread. These are handled in the interrupt context and typically result in a context switch if another thread is ready to run.
In contrast, voluntary context switches occur when a thread explicitly yields control, such as by calling a yield() system call. These switches are initiated by the thread itself during system call execution and do not rely on hardware interrupts.
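Both paths converge on the same scheduling decision; a sketch with hypothetical helpers (acknowledge_timer_interrupt, runnable_thread_waiting, switch_to_next_thread) might look like this:

```c
/* Hypothetical helpers assumed to exist elsewhere in the kernel. */
void acknowledge_timer_interrupt(void);  /* e.g., send EOI to the PIC */
int  runnable_thread_waiting(void);
void switch_to_next_thread(void);

static volatile unsigned long ticks;

/* Called from the timer interrupt wrapper after registers are saved.
 * Preemptive: the running thread did not ask to give up the CPU. */
void timer_tick_handler(void)
{
    ticks++;
    acknowledge_timer_interrupt();
    if (runnable_thread_waiting())
        switch_to_next_thread();
}

/* Kernel-side body of the yield() system call. Voluntary: the thread
 * itself requested the switch, so no hardware interrupt is involved. */
int sys_yield(void)
{
    if (runnable_thread_waiting())
        switch_to_next_thread();
    return 0;
}
```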
Device Interfaces and Controller Blocks
Device drivers manage communication between the kernel and hardware components such as disks and keyboards. They use I/O ports or memory-mapped registers to send commands and receive responses from devices.
The operating system maintains controller data structures to track the status of each device and manage pending operations. For example, a disk driver might issue a command to the controller, wait for an interrupt signaling completion, and update an I/O request block to reflect the operation’s status.
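As a sketch, a driver might combine simple port I/O helpers with a request block like the one below. The inb/outb wrappers use the actual x86 IN/OUT instructions via GCC inline assembly, while the io_request_t layout is a hypothetical example of such a controller structure:

```c
#include <stdint.h>

/* Port I/O helpers wrapping the x86 IN/OUT instructions. */
static inline void outb(uint16_t port, uint8_t val)
{
    __asm__ volatile ("outb %0, %1" : : "a"(val), "Nd"(port));
}

static inline uint8_t inb(uint16_t port)
{
    uint8_t val;
    __asm__ volatile ("inb %1, %0" : "=a"(val) : "Nd"(port));
    return val;
}

/* Hypothetical I/O request block tracking one pending disk operation.
 * The driver marks it IO_DONE or IO_ERROR from its interrupt handler. */
typedef struct io_request {
    uint32_t sector;            /* target sector on disk     */
    void    *buffer;            /* where the data goes       */
    enum { IO_PENDING, IO_DONE, IO_ERROR } status;
    struct io_request *next;    /* queue of pending requests */
} io_request_t;
```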