Blog · Jun 7, 2026 · 38 min read

50+ OS Interview Questions That Actually Get Asked (2026)

CodersNote Team

CodersNote Contributor

Battle-tested OS questions on processes, memory, deadlocks, and modern topics like containers — with code examples and comparison tables.

OS questions show up in almost every tech interview. Doesn't matter if you're interviewing at Google or a 50-person startup — someone is going to ask about processes, threads, and probably deadlocks.

Here's the thing about most OS interview guides: they give you textbook definitions. You memorize "a process is an instance of a running program," walk into the interview, get hit with "so what actually happens in memory when you fork a process?" — and freeze.

This guide covers 50+ questions organized by topic, with the kind of answers that survive follow-up questions. You'll find comparison tables, code snippets, and scenario-based questions that interviewers actually care about in 2026.

OS Fundamentals

1. What is an operating system?

An OS sits between your code and the actual hardware. It handles the messy parts — scheduling which program gets the CPU, managing memory so processes don't stomp on each other, and giving you a consistent interface so you don't have to talk to disk controllers directly.

Think of it as a resource manager. Your machine has a finite CPU, finite RAM, and finite disk. Multiple programs want all of it. The OS decides who gets what, when, and for how long.

Common examples: Linux, Windows, macOS, Android (which runs a modified Linux kernel underneath).

2. What are the main types of operating systems?

Type	How It Works	Example
Batch OS	Groups similar jobs together, runs them without user interaction	Early mainframes, payroll systems
Time-Sharing OS	CPU time sliced across users/processes — everyone gets a turn	Unix, Multics
Real-Time OS (RTOS)	Guarantees response within strict deadlines	VxWorks, FreeRTOS (used in medical devices, aircraft)
Distributed OS	Multiple machines appear as one unified system	Google's internal systems, Amoeba
Embedded OS	Minimal, runs on constrained hardware	Firmware in your microwave or car ECU

Interview tip: Don't just list the types. Mention a real-world example for each — it signals you actually understand the differences.

3. What is the difference between a kernel and a shell?

The kernel is the core of the OS. It runs in privileged mode and directly manages hardware — CPU scheduling, memory allocation, device drivers, all of it. It is the first thing loaded after the bootloader and it stays in memory the entire time the system is on.

The shell is just a user interface that talks to the kernel. Bash, Zsh, and PowerShell are all shells. When you type ls in a terminal, the shell parses the command and uses system calls (fork() and exec() on Unix-like systems) to start the ls program. ls then makes its own system calls, and the kernel does the actual work of reading the directory from disk.

Quick way to remember: the kernel does the work, the shell takes the orders.

4. What are system calls?

System calls are how your programs ask the kernel to do things they can't do themselves. Your user-space code can't directly access hardware, so it makes a request through a well-defined interface — that's a system call.

Common categories:

Process control: fork(), exec(), wait(), exit()
File operations: open(), read(), write(), close()
Memory: mmap(), brk()
Communication: pipe(), socket(), shmget()

When a system call happens, the CPU switches from user mode to kernel mode, executes the requested operation, then switches back. This mode switch has a cost — it is not free — which is why high-performance code tries to minimize system calls (e.g., buffered I/O instead of writing one byte at a time).
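
To make the system call boundary concrete, here is a minimal sketch in Python. The functions in the os module used here are thin wrappers over the open(), write(), read(), and close() system calls; the helper name write_then_read is made up for this example. Python's regular open() adds a user-space buffer on top of these calls, which is the buffered I/O idea mentioned above.

```python
import os
import tempfile


def write_then_read(message: bytes) -> bytes:
    """Round-trip some bytes through a temp file using raw file descriptors."""
    fd, path = tempfile.mkstemp()            # creates the file and returns an open descriptor
    try:
        os.write(fd, message)                # write() system call
        os.close(fd)                         # close() system call
        fd = os.open(path, os.O_RDONLY)      # open() system call
        data = os.read(fd, len(message))     # read() system call
        os.close(fd)
        return data
    finally:
        os.unlink(path)                      # remove the temp file


if __name__ == "__main__":
    print(write_then_read(b"hello kernel"))
```

Each os.write() here crosses into kernel mode once, so writing one byte per call would pay the mode-switch cost for every byte.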

5. What is the difference between user mode and kernel mode?

The CPU physically has (at least) two privilege levels. In user mode, your code runs with restricted access — it can't touch hardware, can't access other processes' memory, can't execute privileged CPU instructions. If it tries, the CPU raises a trap.

In kernel mode, the OS has full access to everything: all memory, all hardware, all CPU instructions. The kernel handles interrupts, manages page tables, and talks to devices here.

The separation exists for protection. Without it, a buggy app could overwrite kernel memory and crash the entire system. The boundary between user and kernel mode is the most important security boundary on your machine.

6. What is an interrupt?

An interrupt is a signal to the CPU that something needs attention right now. The CPU pauses what it is doing, saves its current state, jumps to an interrupt handler, deals with the event, then resumes what it was doing.

Two types:

Hardware interrupts: Triggered by external devices. Your keyboard just sent a keypress. Your disk finished a read operation. Your network card received a packet. The device fires an interrupt so the CPU knows.
Software interrupts (traps and exceptions): Triggered by the program itself. Division by zero. Invalid memory access. Or intentionally: system calls historically used a software interrupt (e.g., int 0x80 on older Linux x86), while x86-64 Linux uses the dedicated syscall instruction.

Interrupts are how the OS stays responsive without constantly polling every device. Instead of asking "any new keypress? any new packet?" a thousand times per second, the hardware just tells the CPU when something happens.

7. Monolithic kernel vs microkernel — what is the difference?

Feature	Monolithic Kernel	Microkernel
Architecture	All OS services run in kernel space	Only essential services in kernel; rest runs in user space
Performance	Faster — no IPC overhead between services	Slower — services communicate via message passing
Reliability	One buggy driver can crash the whole kernel	A faulty service crashes only itself, not the kernel
Examples	Linux, older Unix	Minix, QNX, L4
Size	Larger — everything in one address space	Smaller kernel, modular user-space services

In practice, most production OSes are hybrid. macOS uses XNU (Mach microkernel + BSD monolithic components). Windows NT has a hybrid architecture too. Pure microkernels are mostly in embedded/research systems.

Worth knowing: Linux is monolithic but uses loadable kernel modules (LKMs) so you can add/remove drivers without rebooting. This gives it some of the flexibility of a microkernel without the IPC overhead.

8. What happens during the booting process?

When you press the power button, here is what actually happens:

BIOS/UEFI: Firmware runs a Power-On Self Test (POST), checks hardware, then looks for a bootable device.
Bootloader: GRUB (on Linux) or Windows Boot Manager loads. It knows where the kernel lives on disk and loads it into memory.
Kernel initialization: The kernel takes over. It sets up memory management, initializes device drivers, mounts the root filesystem.
Init process: The kernel starts the first user-space process (PID 1). On modern Linux, that is systemd. On older systems, it was init.
User space: Init spawns system services (networking, logging, display manager), and eventually you get a login prompt or desktop.

The whole thing takes seconds on modern hardware, but there is a lot happening underneath.

Process Management

9. What is a process?

A process is a program in execution. Your Python script sitting on disk is just a file. When you run it, the OS creates a process — it allocates memory, sets up a stack and heap, assigns a process ID (PID), and starts executing instructions.

Each process gets its own address space. Process A cannot read process B's memory (unless they explicitly set up shared memory). This isolation is fundamental to system stability.

A process contains: the program code (text section), a program counter, CPU registers, a stack (function calls, local variables), a heap (dynamic memory), and open file descriptors.
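
The isolation claim is easy to check on a Unix-like system. The sketch below (the function name fork_isolation_demo is invented for illustration) forks a child that changes a variable, then shows that the parent's copy is untouched. It assumes os.fork() is available, so it will not run on Windows.

```python
import os


def fork_isolation_demo() -> tuple[int, int]:
    """Return (value seen by the parent, exit code reported by the child)."""
    value = 10
    pid = os.fork()                  # Unix-only: duplicates the calling process
    if pid == 0:
        # Child: this change lands in the child's own copy of memory.
        value += 5
        os._exit(value)              # exit immediately, using the value as the exit code
    _, status = os.waitpid(pid, 0)   # parent waits for the child and collects its status
    return value, os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(fork_isolation_demo())
```

The child computed 15 and reported it through its exit code, but the parent still sees 10 because the two processes no longer share that memory.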

10. Process vs thread — what is the actual difference?

Feature	Process	Thread
Memory	Own address space	Shares address space with other threads in the same process
Creation cost	Heavy — OS must duplicate memory mappings	Light — just needs a new stack and register set
Communication	IPC required (pipes, sockets, shared memory)	Can directly read/write shared memory
Crash isolation	One process crashes, others survive	One thread crashes, the whole process dies
Context switch	Expensive — involves TLB flush, page table swap	Cheaper — same address space, no TLB flush needed
Use case	Chrome tabs (process isolation for security)	A web server handling concurrent requests

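Interview tip: When asked this, give a concrete example. "Chrome runs each tab as a separate process so a crash in one tab doesn't take down the browser. A thread-per-request web server handles concurrent requests with threads inside one process because spawning a process per request would cost too much. Nginx goes a step further: a small number of worker processes each handle thousands of connections with event-driven async I/O."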
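
For contrast with process isolation, this small sketch shows two threads writing into the same list with no IPC at all. The function and variable names are illustrative only.

```python
import threading


def threads_share_memory() -> list[str]:
    """Start two threads that append to one shared list and return its contents."""
    shared = []                       # lives in the single address space of this process
    lock = threading.Lock()

    def worker(name: str) -> None:
        with lock:                    # coordinate access to the shared list
            shared.append(name)

    threads = [threading.Thread(target=worker, args=(n,)) for n in ("t1", "t2")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(shared)


if __name__ == "__main__":
    print(threads_share_memory())
```

Both appends are visible to the caller because the threads and the caller all share one heap.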

11. What are the different process states?

A process cycles through these states during its lifetime:

New: Being created. Memory is being allocated, PCB is being set up.
Ready: Loaded in memory, waiting for the CPU. It could run right now if the scheduler picks it.
Running: Currently executing on a CPU core.
Waiting (Blocked): Can't proceed until something happens — waiting for disk I/O, network response, user input, or a lock.
Terminated: Done executing. Resources being cleaned up.

The transitions that matter most: Ready → Running (scheduler picks this process), Running → Waiting (process needs I/O), Waiting → Ready (I/O completed, back in the queue).

On a single-core CPU, only one process is in the Running state at any moment. The rest are either Ready or Waiting.
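
One way to keep the transitions straight is to write them down as a tiny state machine. This is a study aid, not how any kernel stores process state; the names State, ALLOWED, and transition are assumptions of the sketch.

```python
from enum import Enum


class State(Enum):
    NEW = "new"
    READY = "ready"
    RUNNING = "running"
    WAITING = "waiting"
    TERMINATED = "terminated"


# Legal moves in the five-state model described above.
ALLOWED = {
    State.NEW: {State.READY},
    State.READY: {State.RUNNING},
    State.RUNNING: {State.READY, State.WAITING, State.TERMINATED},
    State.WAITING: {State.READY},
    State.TERMINATED: set(),
}


def transition(current: State, target: State) -> State:
    """Return the new state, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.name} -> {target.name}")
    return target


if __name__ == "__main__":
    print(transition(State.READY, State.RUNNING))
```

Note that Waiting never goes straight to Running: a process whose I/O completes rejoins the ready queue and waits for the scheduler like everyone else.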

12. What is a Process Control Block (PCB)?

The PCB is the kernel's data structure for tracking a process. Every process has one, and it stores everything the OS needs to manage that process:

Process ID (PID) and parent process ID
Process state (ready, running, waiting)
CPU registers and program counter (saved during context switches)
Memory management info (page tables, segment tables)
I/O status (open files, pending I/O requests)
Scheduling info (priority, time quantum remaining)

When the OS switches from process A to process B, it saves A's register values into A's PCB and loads B's register values from B's PCB. That is the core of a context switch.

13. What is context switching?

Context switching is the OS saving the state of the currently running process and loading the state of the next one. It happens constantly — every time the scheduler decides a different process should run.

The steps:

Save the current process's registers, program counter, and stack pointer into its PCB.
Update the process state (Running → Ready or Waiting).
Select the next process from the ready queue (this is where the scheduling algorithm matters).
Load the new process's state from its PCB into the CPU registers.
Switch the memory mapping (page tables) to the new process's address space.
Jump to the new process's program counter and resume execution.

Context switches are pure overhead. No useful work happens during a switch. A typical switch costs 1-10 microseconds on modern hardware, which sounds tiny — but if you're doing thousands per second, it adds up.
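
The save and load steps can be sketched with two small data classes. This is a toy model: the PCB and CPU classes and their fields are simplified stand-ins invented for this example, and the page table switch is left out.

```python
from dataclasses import dataclass, field


@dataclass
class PCB:
    pid: int
    state: str = "ready"
    program_counter: int = 0
    registers: dict = field(default_factory=dict)


@dataclass
class CPU:
    program_counter: int = 0
    registers: dict = field(default_factory=dict)


def context_switch(cpu: CPU, current: PCB, nxt: PCB) -> None:
    """Save the running process into its PCB, then load the next one onto the CPU."""
    # Save the outgoing process's CPU state and mark it ready again.
    current.program_counter = cpu.program_counter
    current.registers = dict(cpu.registers)
    current.state = "ready"
    # Load the incoming process's saved state and mark it running.
    cpu.registers = dict(nxt.registers)
    cpu.program_counter = nxt.program_counter
    nxt.state = "running"


if __name__ == "__main__":
    cpu = CPU(program_counter=100, registers={"r0": 1})
    a, b = PCB(pid=1, state="running"), PCB(pid=2, program_counter=500, registers={"r0": 9})
    context_switch(cpu, a, b)
    print(a, b, cpu, sep="\n")
```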

14. What are zombie and orphan processes?

Zombie process: A process that has finished execution but still has an entry in the process table. This happens when the parent process hasn't called wait() to read the child's exit status. The child is dead, but its PID and exit code stick around. Too many zombies can exhaust the PID table.

Orphan process: A child process whose parent has terminated. When this happens, the OS reassigns the orphan to the init process (PID 1), which becomes its new parent and will eventually call wait() on it. Orphans aren't inherently problematic — the OS handles them automatically.

Common follow-up: "How do you prevent zombie processes?" Answer: The parent should call wait() or waitpid(), or set up a SIGCHLD handler. In most production code, you'd use a process supervisor like systemd that handles this properly.

15. What is Inter-Process Communication (IPC)?

Since processes have isolated address spaces, they need explicit mechanisms to share data. The main IPC methods:

Pipes: One-way data stream between processes. The output of process A becomes the input of process B. This is what the | operator does in your shell: cat file.txt | grep "error".
Named pipes (FIFOs): Like pipes, but with a filesystem entry so unrelated processes can use them.
Shared memory: The fastest IPC method. Two processes map the same physical memory region into their address spaces. No copying, no kernel involvement after setup. The tradeoff: you need to handle synchronization yourself.
Message queues: Processes send discrete messages to a queue. The kernel manages the queue and ensures ordering.
Sockets: Work across machines (TCP/UDP) or locally (Unix domain sockets). This is how your web browser talks to a web server.
Signals: Lightweight notifications. SIGKILL, SIGTERM, SIGINT (Ctrl+C). Not for data transfer — just for notifying a process that something happened.
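
As an illustration of the first mechanism in the list, here is a minimal pipe between a parent and a forked child. It assumes a Unix-like system (os.fork and os.pipe), and the function name pipe_demo is hypothetical.

```python
import os


def pipe_demo(message: bytes) -> bytes:
    """Fork a child that writes message into a pipe; the parent reads it back."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: only writes, so it closes the read end first.
        os.close(read_fd)
        os.write(write_fd, message)
        os.close(write_fd)
        os._exit(0)
    # Parent: close the write end so read() returns EOF once the child is done.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:                 # empty bytes means end of stream
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)                # reap the child so it does not linger as a zombie
    return b"".join(chunks)


if __name__ == "__main__":
    print(pipe_demo(b"hello from child"))
```

Forgetting to close the parent's write end is a classic bug: the read loop never sees EOF and the parent hangs.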

CPU Scheduling

16. Why is CPU scheduling needed?

Your machine usually has far more processes than CPU cores. Right now, your computer probably has 200+ processes running on 4-16 cores. The scheduler decides which process runs on which core and for how long.

Good scheduling balances competing goals: throughput (finish as many jobs as possible), latency (respond to interactive users quickly), fairness (don't starve any process), and efficiency (keep the CPU busy, not idle).

These goals conflict. Optimizing for throughput means running long batch jobs without interruption. Optimizing for latency means interrupting them frequently to serve interactive requests. Every scheduling algorithm is a tradeoff.

17. Explain FCFS, SJF, and Round Robin scheduling.

Algorithm	How It Works	Pros	Cons	Best For
FCFS	First process in the queue runs until it finishes	Simple, fair in arrival order	Convoy effect — short jobs stuck behind long ones	Batch processing
SJF	Pick the process with the shortest burst time	Optimal average waiting time	Requires knowing burst times in advance (impractical). Can starve long jobs	Theoretical benchmarking
Round Robin	Each process gets a fixed time slice (quantum), then goes to the back of the queue	Fair, good response time for interactive users	High context switch overhead if quantum is too small	Time-sharing systems

Follow-up you'll get: "What is a good time quantum for Round Robin?" Too small (1ms) means constant context switching, so overhead dominates. Too large (1 second) and it degenerates into FCFS. Typical values: 10-100ms. Linux's CFS does not use a fixed quantum: it divides a target latency period (about 6ms by default, scaled up with core count) among the runnable tasks.
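
A quick way to feel the difference between the three algorithms is to compute waiting times for the same workload. The sketch below assumes all jobs arrive at time 0 and ignores context switch cost; the function names and the burst values 24, 3, 3 are chosen for illustration.

```python
from collections import deque


def fcfs_waits(bursts: list[int]) -> list[int]:
    """Waiting time per job when jobs run in arrival order."""
    waits, clock = [], 0
    for burst in bursts:
        waits.append(clock)
        clock += burst
    return waits


def sjf_waits(bursts: list[int]) -> list[int]:
    """Waiting time per job when the shortest burst always runs first."""
    waits, clock = [0] * len(bursts), 0
    for i in sorted(range(len(bursts)), key=lambda i: bursts[i]):
        waits[i] = clock
        clock += bursts[i]
    return waits


def round_robin_waits(bursts: list[int], quantum: int) -> list[int]:
    """Waiting time per job under Round Robin (finish time minus burst length)."""
    remaining = list(bursts)
    queue = deque(range(len(bursts)))
    finish = [0] * len(bursts)
    clock = 0
    while queue:
        i = queue.popleft()
        run = min(quantum, remaining[i])
        clock += run
        remaining[i] -= run
        if remaining[i] > 0:
            queue.append(i)           # not done: back of the queue
        else:
            finish[i] = clock
    return [finish[i] - bursts[i] for i in range(len(bursts))]


if __name__ == "__main__":
    bursts = [24, 3, 3]
    for name, waits in [("FCFS", fcfs_waits(bursts)),
                        ("SJF", sjf_waits(bursts)),
                        ("RR q=4", round_robin_waits(bursts, 4))]:
        print(name, waits, sum(waits) / len(waits))
```

With these bursts, FCFS makes the two short jobs wait behind the long one (the convoy effect), SJF gives the lowest average wait, and Round Robin lands in between.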

18. What is Priority Scheduling and how do you prevent starvation?

Priority scheduling assigns each process a priority number. The highest-priority ready process gets the CPU. Simple enough, but there is a nasty problem: starvation. If high-priority processes keep arriving, low-priority ones never run.

The fix is aging: gradually increase the priority of waiting processes over time. A process that has been waiting for 5 minutes gets a priority boost. This guarantees every process eventually runs, no matter its initial priority.

Priority can be preemptive (a higher-priority process arriving interrupts the current one) or non-preemptive (current process finishes its burst first). Most modern systems use preemptive priority scheduling.
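
Aging can be sketched in a few lines. The formula below is one possible choice, not a standard: it assumes lower numbers mean higher priority and boosts a process by one level for every boost_every time units it has waited. All names and constants are illustrative.

```python
def effective_priority(base: int, waited: int, boost_every: int = 10) -> int:
    """Lower number = higher priority. Waiting gradually improves the priority."""
    return max(0, base - waited // boost_every)


def pick_next(ready: list[dict], now: int) -> dict:
    """Pick the ready process with the best effective priority at time now."""
    return min(ready, key=lambda p: effective_priority(p["base"], now - p["enqueued"]))


if __name__ == "__main__":
    low = {"name": "low", "base": 9, "enqueued": 0}
    early = [low, {"name": "high", "base": 2, "enqueued": 5}]
    late = [low, {"name": "high", "base": 2, "enqueued": 100}]
    print(pick_next(early, now=10)["name"])    # low has barely aged yet
    print(pick_next(late, now=100)["name"])    # low has waited long enough to win
```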

19. What is the Multilevel Feedback Queue?

Variants of this are used in many real operating systems, including Windows, Solaris, and the BSD family. It solves the problem that SJF needs to know burst times in advance: the MLFQ figures it out by observing behavior.

The idea: multiple queues with different priority levels. New processes start in the highest-priority queue. If a process uses its entire time quantum (CPU-bound), it gets demoted to a lower queue. If it gives up the CPU early (I/O-bound), it stays at the same level or gets promoted.

The result: interactive processes (short CPU bursts, lots of I/O) naturally stay in high-priority queues and get fast response times. CPU-bound batch jobs sink to lower queues and get longer time slices but lower priority.

No advance knowledge needed. The scheduler adapts to actual process behavior.
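
Here is a deliberately small simulation of the demotion rule. It makes several simplifying assumptions that real schedulers do not: I/O is instantaneous, there is no promotion or periodic priority boost, and each job is just a list of CPU burst lengths. The function name mlfq_levels and the quanta are invented for the sketch.

```python
from collections import deque


def mlfq_levels(jobs: dict[str, list[int]], quanta=(2, 4, 8)) -> dict[str, int]:
    """Run every job to completion and return the queue level each one ended at."""
    queues = [deque() for _ in quanta]
    bursts = {name: deque(b) for name, b in jobs.items()}
    level = {name: 0 for name in jobs}
    for name in jobs:
        queues[0].append(name)                       # new jobs start in the top queue
    while any(queues):
        lvl = next(i for i, q in enumerate(queues) if q)   # highest non-empty queue
        name = queues[lvl].popleft()
        burst = bursts[name][0]
        used = min(burst, quanta[lvl])
        if used == quanta[lvl]:
            # Consumed the whole quantum: looks CPU-bound, so demote one level.
            level[name] = min(lvl + 1, len(quanta) - 1)
        if burst > used:
            bursts[name][0] = burst - used           # burst not finished yet
        else:
            bursts[name].popleft()                   # burst done (job "does I/O")
        if bursts[name]:
            queues[level[name]].append(name)
    return level


if __name__ == "__main__":
    print(mlfq_levels({"editor": [1, 1, 1], "compile": [20]}))
```

The job with short bursts never uses a full quantum and stays in the top queue, while the long CPU-bound job sinks to the bottom one.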

20. What is the Completely Fair Scheduler (CFS)?

CFS was Linux's default scheduler from kernel 2.6.23 (2007) until kernel 6.6 (2023), when EEVDF replaced it. It still comes up in interviews because the core idea carried over. Instead of fixed time slices and priority queues, it tracks how much CPU time each process has received and always picks the process with the least accumulated CPU time.

Under the hood, CFS uses a red-black tree sorted by "virtual runtime": the amount of CPU time a process has consumed, weighted by its priority. The leftmost node (lowest virtual runtime) is always the next to run. Insertion is O(log n), and because the kernel caches the leftmost node, picking the next task is O(1).

Why this matters in interviews: CFS represents a different philosophy from traditional schedulers. Instead of assigning time slices, it models an "ideal" fair CPU where every process runs simultaneously, then approximates that on real hardware.

Real-world detail: When you run nice -n 19 make -j8 to compile code at low priority, you're adjusting the weight that CFS uses to compute virtual runtime. The compilation still runs — but CFS gives it less CPU time relative to other processes.
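
The virtual runtime idea fits in a short simulation. This sketch uses Python's heapq as a stand-in for the kernel's red-black tree and a fixed 1 ms slice, both simplifications. The weight 1024 matches the kernel's weight for nice 0; the function name cfs_simulate and the second weight are assumptions of the example.

```python
import heapq


def cfs_simulate(weights: dict[str, int], ticks: int, slice_ms: int = 1) -> dict[str, int]:
    """Return how many milliseconds of CPU each task received after ticks slices."""
    # Heap entries are (vruntime, name); the smallest vruntime is always on top.
    heap = [(0.0, name) for name in sorted(weights)]
    heapq.heapify(heap)
    cpu_time = {name: 0 for name in weights}
    for _ in range(ticks):
        vruntime, name = heapq.heappop(heap)          # task that has had the least weighted CPU
        cpu_time[name] += slice_ms
        vruntime += slice_ms * 1024 / weights[name]   # heavier tasks accumulate vruntime more slowly
        heapq.heappush(heap, (vruntime, name))
    return cpu_time


if __name__ == "__main__":
    print(cfs_simulate({"a": 1024, "b": 512}, ticks=300))
```

A task with twice the weight ends up with twice the CPU time, without the scheduler ever assigning an explicit time slice per priority level.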

Process Synchronization

21. What is a critical section?

A critical section is a chunk of code that accesses shared resources — shared memory, a file, a database connection — that must not be accessed by more than one thread at a time. If two threads modify a shared variable simultaneously without coordination, you get corrupted data.

A correct solution to the critical section problem must satisfy three conditions:

Mutual exclusion: Only one process in the critical section at a time.
Progress: If no one is in the critical section and some processes want to enter, the decision can't be postponed forever.
Bounded waiting: A process can't be forced to wait indefinitely while others repeatedly enter.

22. What is a race condition?

A race condition happens when the outcome of a program depends on the timing of thread execution — which is non-deterministic. The classic example:


// Shared variable: counter = 0
// Thread A:              Thread B:
   read counter (0)         read counter (0)
   add 1 (= 1)              add 1 (= 1)
   write counter (1)        write counter (1)

// Expected result: counter = 2
// Actual result: counter = 1 (lost update)

Both threads read the old value before either writes the new one. This is a lost update — a textbook race condition. The fix: protect the shared variable with a lock so only one thread can read-modify-write at a time.

Race conditions are brutal to debug because they're timing-dependent. Your code might work fine in testing and fail in production under load.
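
Because real races are timing-dependent, the sketch below replays the bad interleaving by hand in a single thread so the result is reproducible. The function names are made up for illustration.

```python
def lost_update() -> int:
    """Replay the interleaving where both threads read before either writes."""
    counter = 0
    a_local = counter          # Thread A reads 0
    b_local = counter          # Thread B reads 0 before A has written
    counter = a_local + 1      # Thread A writes 1
    counter = b_local + 1      # Thread B overwrites it with 1
    return counter


def serialized_update() -> int:
    """Replay the interleaving a lock would force: A finishes before B starts."""
    counter = 0
    a_local = counter          # Thread A reads 0
    counter = a_local + 1      # Thread A writes 1
    b_local = counter          # Thread B reads 1
    counter = b_local + 1      # Thread B writes 2
    return counter


if __name__ == "__main__":
    print(lost_update(), serialized_update())
```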

23. What is a mutex?

A mutex (mutual exclusion) is a lock. A thread acquires the mutex before entering a critical section and releases it when done. If another thread tries to acquire a mutex that is already held, it blocks (goes to sleep) until the mutex is released.

Key property: ownership. Only the thread that locked a mutex can unlock it. This prevents accidental unlocking by other threads.


// Pseudocode
mutex_lock(m);
// critical section — safe to modify shared data
counter++;
mutex_unlock(m);

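Mutexes are the most common synchronization primitive. Java's synchronized and Python's threading.Lock() both provide this kind of mutual exclusion, with some differences: synchronized is reentrant, and threading.Lock() does not enforce ownership (any thread may release it).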
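
A runnable version of the pattern, as a minimal sketch using Python's threading.Lock (the function name locked_counter and the thread counts are arbitrary):

```python
import threading


def locked_counter(num_threads: int = 4, increments: int = 10_000) -> int:
    """Increment a shared counter from several threads, guarded by one lock."""
    counter = 0
    lock = threading.Lock()

    def worker() -> None:
        nonlocal counter
        for _ in range(increments):
            with lock:             # acquire on entry, release on exit, even on exceptions
                counter += 1       # the critical section

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    print(locked_counter())
```

The with statement is the idiomatic way to pair lock and unlock, since it releases the lock even if the critical section raises.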

24. What is a semaphore?

A semaphore is an integer counter with two atomic operations: wait() (decrement, block if zero) and signal() (increment, wake a blocked thread).

Two types:

Binary semaphore: Value is 0 or 1. Behaves like a mutex, but without ownership — any thread can signal it.
Counting semaphore: Value can be any non-negative integer. Used to control access to a pool of resources. If you have 5 database connections, initialize the semaphore to 5. Each thread decrements it when taking a connection and increments it when returning one. When it hits 0, the next thread blocks until one frees up.
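
The connection pool scenario can be sketched with threading.Semaphore. Nothing here talks to a real database: the sleep stands in for using a connection, and the function name run_with_pool is hypothetical.

```python
import threading
import time


def run_with_pool(pool_size: int = 2, workers: int = 6) -> int:
    """Return the highest number of workers that held a slot at the same time."""
    slots = threading.Semaphore(pool_size)
    state_lock = threading.Lock()
    in_use = 0
    peak = 0

    def worker() -> None:
        nonlocal in_use, peak
        with slots:                      # wait(): blocks while all slots are taken
            with state_lock:
                in_use += 1
                peak = max(peak, in_use)
            time.sleep(0.01)             # pretend to use the connection
            with state_lock:
                in_use -= 1
        # Leaving the with-block performs signal() and wakes a blocked worker.

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak


if __name__ == "__main__":
    print(run_with_pool())
```

However many workers start, the peak never exceeds the pool size.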

25. Mutex vs semaphore — what is the real difference?

Feature	Mutex	Semaphore
Purpose	Mutual exclusion — one thread at a time	Signaling or limiting concurrent access
Ownership	Yes — only the locking thread can unlock	No — any thread can call signal()
Count	Binary only (locked/unlocked)	Can be any non-negative integer
Use case	Protecting a shared variable	Controlling access to a pool (e.g., 5 DB connections) or producer-consumer signaling
Priority inversion	Supports priority inheritance to fix it	Does not — no ownership tracking

How to answer in an interview: "Use a mutex when you need exclusive access to a resource. Use a semaphore when you need to coordinate between producers and consumers, or limit access to a counted pool of resources."

26. Explain the Producer-Consumer problem.

You have two types of threads sharing a fixed-size buffer. Producers add items to the buffer. Consumers remove items. The constraints:

Producers must wait if the buffer is full.
Consumers must wait if the buffer is empty.
Only one thread should modify the buffer at a time.

Classic solution uses two counting semaphores and a mutex:


// Shared: buffer[N], mutex, empty = N, full = 0

Producer:                     Consumer:
  wait(empty)                   wait(full)
  wait(mutex)                   wait(mutex)
  buffer[in] = item             item = buffer[out]
  in = (in + 1) % N             out = (out + 1) % N
  signal(mutex)                 signal(mutex)
  signal(full)                  signal(empty)

empty tracks available slots. full tracks available items. The mutex protects the buffer itself. The order of wait() calls matters — reversing them can cause deadlock.
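
The pseudocode above translates almost line for line into Python's threading primitives. This sketch uses one producer and one consumer to keep the output order predictable; the function name and buffer size are illustrative.

```python
import threading


def producer_consumer(items: list, capacity: int = 3) -> list:
    """Pass items through a bounded ring buffer and return them in consumed order."""
    buffer = [None] * capacity
    empty = threading.Semaphore(capacity)   # counts free slots
    full = threading.Semaphore(0)           # counts filled slots
    mutex = threading.Lock()                # protects buffer and indices
    in_idx = 0
    out_idx = 0
    consumed = []

    def producer() -> None:
        nonlocal in_idx
        for item in items:
            empty.acquire()                 # wait(empty): block if the buffer is full
            with mutex:
                buffer[in_idx] = item
                in_idx = (in_idx + 1) % capacity
            full.release()                  # signal(full)

    def consumer() -> None:
        nonlocal out_idx
        for _ in range(len(items)):
            full.acquire()                  # wait(full): block if the buffer is empty
            with mutex:
                item = buffer[out_idx]
                out_idx = (out_idx + 1) % capacity
            empty.release()                 # signal(empty)
            consumed.append(item)

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return consumed


if __name__ == "__main__":
    print(producer_consumer(list(range(10))))
```

Notice that each thread takes its counting semaphore before the mutex. Taking the mutex first would let a producer hold it while blocked on a full buffer, and the consumer could then never get in to free a slot.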

Deadlocks

27. What is a deadlock?

A deadlock is when two or more processes are each waiting for a resource held by another, creating a cycle where nobody can proceed. Everyone is stuck forever.

The classic example: Thread A holds Lock 1 and wants Lock 2. Thread B holds Lock 2 and wants Lock 1. Neither can proceed because they are waiting for each other.


// Thread A:               Thread B:
   lock(mutex_1)              lock(mutex_2)
   // ... does work            // ... does work
   lock(mutex_2)  // BLOCKS    lock(mutex_1)  // BLOCKS
   // waiting for B            // waiting for A
   // DEADLOCK                 // DEADLOCK

In production, deadlocks usually involve more than two threads and more complex resource chains, which makes them harder to spot.

28. What are the four necessary conditions for a deadlock?

All four must be true simultaneously for a deadlock to occur (Coffman's conditions):

Mutual exclusion: At least one resource can only be held by one process at a time.
Hold and wait: A process holds at least one resource while waiting for others.
No preemption: Resources can't be forcibly taken away — a process must voluntarily release them.
Circular wait: A circular chain of processes, each waiting for a resource held by the next one in the chain.

The practical takeaway: If you can break any one of these four conditions, deadlock becomes impossible. Most real systems target the circular wait condition by imposing a global ordering on lock acquisition.

29. How do you prevent deadlocks?

Each prevention strategy targets one of the four conditions:

Break mutual exclusion: Make resources shareable. Not always possible — some resources (printers, locks) are inherently exclusive.
Break hold and wait: Require processes to request all resources at once before starting. Problem: resource utilization drops because you're holding things you don't need yet.
Allow preemption: If a process can't get a resource, forcibly take its held resources. Works for CPU and memory, not for printers or database locks.
Break circular wait: Impose a total ordering on resources. Every process must acquire locks in the same order (e.g., always lock A before lock B). This is the most commonly used approach in practice.

In real codebases, the lock ordering approach is the standard. Many organizations enforce a "lock hierarchy" — every lockable resource gets a number, and you must acquire locks in ascending order.
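
A lock hierarchy can be as simple as sorting the locks before acquiring them. In this sketch the rank is the object's id(), which is an assumption made for brevity; real codebases usually assign explicit ranks. The function names are invented for the example.

```python
import threading


def acquire_in_order(*locks):
    """Return the locks sorted by a global rank so every thread agrees on the order."""
    return sorted(locks, key=id)


def opposite_order_workers(rounds: int = 1000) -> bool:
    """Two threads request the same locks in opposite order; True if both finish."""
    lock_a, lock_b = threading.Lock(), threading.Lock()

    def worker(first, second) -> None:
        for _ in range(rounds):
            lo, hi = acquire_in_order(first, second)
            with lo:                  # always the lower-ranked lock first
                with hi:
                    pass              # work that needs both locks goes here

    t1 = threading.Thread(target=worker, args=(lock_a, lock_b), daemon=True)
    t2 = threading.Thread(target=worker, args=(lock_b, lock_a), daemon=True)
    t1.start()
    t2.start()
    t1.join(timeout=5)
    t2.join(timeout=5)
    return not (t1.is_alive() or t2.is_alive())


if __name__ == "__main__":
    print(opposite_order_workers())
```

Without the sort, the two workers would reproduce the Thread A / Thread B deadlock from question 27 whenever their timing lined up.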

30. What is the Banker's Algorithm?

The Banker's Algorithm is a deadlock avoidance strategy (not prevention — there's a difference). Before granting a resource request, the OS simulates the allocation and checks whether the system would remain in a "safe state."

A safe state means there exists at least one sequence in which all processes can finish without deadlocking. If granting a request would put the system in an unsafe state, the request is denied and the process waits.

Here is a simplified example:

Process	Allocated	Max Need	Remaining Need
P1	2	5	3
P2	3	6	3
P3	1	3	2

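Available resources: 3. The algorithm finds a safe sequence: P3 runs first (needs 2 more, finishes, and returns all 3 of its units, leaving 4 available), then P1 (needs 3, finishes, returns 5, leaving 6 available), then P2 (needs 3, finishes). Safe state confirmed. Other safe orders exist too, for example starting with P1.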

In practice, the Banker's Algorithm is rarely used in general-purpose OSes because it requires knowing maximum resource needs in advance. But it shows up in embedded systems and database lock managers.
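
The safety check for the table above can be sketched for a single resource type. The function name find_safe_sequence is hypothetical, and the real algorithm works on vectors of several resource types rather than single integers.

```python
def find_safe_sequence(available: int, allocated: dict[str, int], max_need: dict[str, int]):
    """Return one safe completion order, or None if the state is unsafe."""
    need = {p: max_need[p] - allocated[p] for p in allocated}
    work = available
    pending = list(allocated)
    sequence = []
    progress = True
    while pending and progress:
        progress = False
        for p in pending:
            if need[p] <= work:
                # p can get everything it still needs, run, and return its allocation.
                work += allocated[p]
                sequence.append(p)
                pending.remove(p)
                progress = True
                break                 # restart the scan with the larger pool
    return sequence if not pending else None


if __name__ == "__main__":
    allocated = {"P1": 2, "P2": 3, "P3": 1}
    max_need = {"P1": 5, "P2": 6, "P3": 3}
    print(find_safe_sequence(3, allocated, max_need))
    print(find_safe_sequence(1, allocated, max_need))
```

Because this version scans processes in dictionary order, it reports a different safe sequence from the one in the text. That is fine: a state is safe if at least one such sequence exists. With only 1 unit available, no process can finish, so the state is unsafe.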

31. What is priority inversion?

Priority inversion happens when a high-priority task is indirectly blocked by a low-priority task. Here is how:

Low-priority task L acquires a lock.
High-priority task H tries to acquire the same lock — it blocks.
Medium-priority task M runs because H is blocked and M has higher priority than L.
M effectively delays H even though H has higher priority than M.

This famously caused a system reset on the Mars Pathfinder in 1997. The fix: priority inheritance — temporarily boost L's priority to H's level while L holds the lock, so M can't preempt it. L finishes faster, releases the lock, and H runs.

Memory Management

32. What is the difference between logical and physical addresses?

A logical (virtual) address is what your program sees. When your C code accesses a pointer at address 0x7fff5fbff8ac, that is a virtual address. Every process thinks it has the full address space to itself.

A physical address is the actual location in RAM hardware. The Memory Management Unit (MMU) translates virtual addresses to physical addresses at runtime using page tables.

This indirection is powerful: two processes can both use virtual address 0x400000 and the MMU maps them to completely different physical locations. It also enables features like virtual memory, copy-on-write, and memory-mapped files.

33. What is paging?

Paging divides physical memory into fixed-size blocks called frames (typically 4 KB) and the logical address space into same-sized blocks called pages. A page table maps each page to a frame.

When the CPU generates a virtual address, the MMU splits it into a page number and an offset. The page number looks up the frame number in the page table, and the offset points to the exact byte within that frame.
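
The split into page number and offset is just bit arithmetic. This sketch assumes 4 KB pages and a single-level page table stored as a Python dict, which is far simpler than a real multi-level table; the names are illustrative.

```python
PAGE_SIZE = 4096          # 4 KB pages
OFFSET_BITS = 12          # log2(4096)


def translate(virtual_addr: int, page_table: dict[int, int]) -> int:
    """Translate a virtual address to a physical one using a page -> frame mapping."""
    page = virtual_addr >> OFFSET_BITS           # high bits select the page
    offset = virtual_addr & (PAGE_SIZE - 1)      # low 12 bits are the offset within it
    if page not in page_table:
        raise LookupError(f"page fault: page {page} is not mapped")
    return (page_table[page] << OFFSET_BITS) | offset


if __name__ == "__main__":
    # Page 1 is mapped to frame 5, so 0x1234 becomes 0x5234.
    print(hex(translate(0x1234, {1: 5})))
```

The offset passes through unchanged; only the page number is replaced by the frame number.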

The key benefit: you don't need contiguous physical memory. A process's pages can be scattered across RAM in any order. The page table handles the translation transparently.

The TLB (Translation Lookaside Buffer) is a hardware cache for recent page table lookups. Without it, every memory access would require one or more extra memory reads to walk the page table (up to four on x86-64 with 4-level paging). A TLB miss is expensive, which makes the TLB hit rate a major performance consideration.

34. What is segmentation?

Segmentation divides memory into variable-size segments based on logical divisions in the program: code segment, data segment, stack segment, heap segment. Each segment has a base address and a length.

Unlike paging (which is invisible to the programmer), segmentation reflects how programmers actually think about memory. The downside: variable-size segments cause external fragmentation — you end up with small, unusable gaps between allocated segments.

35. Paging vs segmentation — comparison.

Feature	Paging	Segmentation
Division size	Fixed (4 KB typical)	Variable (based on logical units)
Fragmentation	Internal (wasted space within a page)	External (gaps between segments)
Programmer visibility	Transparent — programmer doesn't see it	Visible — reflects program structure
Address translation	Page number + offset	Segment number + offset
Used in practice	Dominant in modern OSes (Linux, Windows)	Rarely used alone; x86-64 effectively disabled segmentation

Most modern systems use paging exclusively. In 64-bit mode, x86-64 CPUs treat the base of most segments as 0 and skip limit checks, so segmentation is effectively gone (FS and GS keep a base and are used for things like thread-local storage). Paging won because fixed-size blocks are much easier to manage and don't cause external fragmentation.

36. What is virtual memory?

Virtual memory gives each process the illusion of having a large, contiguous address space — even if physical RAM is smaller. It works by storing some pages in RAM and others on disk (in a swap file or swap partition).

On x86-64 with 4-level paging, virtual addresses are 48 bits wide, which gives 256 TiB of address space. Linux splits it so that each process gets the lower 128 TiB and the kernel takes the upper half. Your 16 GB of physical RAM is obviously smaller, but virtual memory makes it work by only keeping actively used pages in RAM and swapping the rest to disk as needed.

Benefits: process isolation (each process has its own address space), efficient memory use (only load what's needed), programs can be larger than physical RAM.

37. What is a page fault and how does the OS handle it?

A page fault occurs when a process accesses a page that isn't currently in physical RAM. The MMU triggers a trap, and the OS takes over:

Check if the access is valid (is this address part of the process's allocated space?). If not, segfault — kill the process.
If valid, find a free frame in RAM. If no free frames exist, use a page replacement algorithm to pick a victim.
If the victim frame is dirty (modified since loaded), write it to disk first.
Read the requested page from disk into the free frame.
Update the page table to map the virtual page to the new frame.
Restart the instruction that caused the fault.

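Page faults that need disk I/O (major faults) are expensive because disk is orders of magnitude slower than RAM. A process with frequent major faults runs extremely slowly. Minor faults, where the page is already in memory and only the mapping is missing, are much cheaper.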

38. What are the main page replacement algorithms?

Algorithm	Strategy	Pros	Cons
FIFO	Evict the page that's been in memory the longest	Simple to implement	Can evict frequently used pages. Suffers from Belady's Anomaly (more frames can cause MORE page faults)
LRU (Least Recently Used)	Evict the page that hasn't been used for the longest time	Good approximation of optimal. No Belady's Anomaly	Expensive to implement exactly (need timestamps or stack tracking for every memory access)
Optimal	Evict the page that won't be used for the longest time in the future	Provably optimal — fewest possible page faults	Impossible to implement — requires knowing the future. Used only as a benchmark
Clock (Second Chance)	FIFO with a reference bit — give each page a "second chance" if recently used	Good balance of performance and implementation cost	Not as good as true LRU

What to say in interviews: "In practice, most OSes use approximations of LRU, like the Clock algorithm, because true LRU is too expensive to maintain in hardware. Linux uses a two-list approach with active and inactive page lists."
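
Belady's Anomaly is easier to believe once you count the faults yourself. The sketch below implements FIFO and exact LRU over a reference string; the function names are invented, and the reference string is the classic textbook one for this anomaly.

```python
from collections import OrderedDict, deque


def fifo_faults(refs: list[int], frames: int) -> int:
    """Count page faults under FIFO replacement."""
    memory, order, faults = set(), deque(), 0
    for page in refs:
        if page in memory:
            continue
        faults += 1
        if len(memory) == frames:
            memory.discard(order.popleft())     # evict the oldest arrival
        memory.add(page)
        order.append(page)
    return faults


def lru_faults(refs: list[int], frames: int) -> int:
    """Count page faults under exact LRU replacement."""
    memory = OrderedDict()                      # ordered from least to most recently used
    faults = 0
    for page in refs:
        if page in memory:
            memory.move_to_end(page)            # mark as most recently used
            continue
        faults += 1
        if len(memory) == frames:
            memory.popitem(last=False)          # evict the least recently used
        memory[page] = True
    return faults


if __name__ == "__main__":
    refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
    for frames in (3, 4):
        print(frames, "frames:", "FIFO", fifo_faults(refs, frames), "LRU", lru_faults(refs, frames))
```

On this reference string FIFO takes 9 faults with 3 frames but 10 with 4, while LRU improves from 10 to 8 as expected.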

39. What is thrashing?

Thrashing happens when the system spends more time swapping pages in and out of memory than actually executing processes. The CPU utilization drops to near zero even though the system is technically "busy."

Here's how it spirals: too many processes are competing for too little RAM. Each process keeps page-faulting because its pages keep getting evicted by other processes. The disk becomes the bottleneck — constant read/write to swap. The system crawls.

How to fix it:

Add more RAM (the obvious fix).
Reduce the number of active processes (the OS can suspend some).
Use the working set model — only keep a process in memory if all of its actively used pages (its "working set") fit. If they don't fit, suspend the process entirely rather than letting it thrash.

You can detect thrashing by monitoring page fault rates. If the rate spikes while CPU utilization drops, you are thrashing.

File Systems and I/O

40. What are the main file allocation methods?

Contiguous allocation: Each file occupies consecutive blocks on disk. Fast sequential reads, but causes external fragmentation and makes file growth difficult.
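Linked allocation: Each block contains a pointer to the next block. No external fragmentation, but random access is slow (you must traverse the chain). The FAT file system uses a variant of this approach that keeps the links in a separate table (the File Allocation Table) instead of inside the data blocks.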
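Indexed allocation: A separate index block stores pointers to all of a file's data blocks. Supports random access without external fragmentation. Unix-style file systems such as ext2 and ext3 use this through inodes with direct, indirect, and double-indirect block pointers. ext4 still supports that layout but defaults to extents, which describe contiguous runs of blocks with a single entry.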
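To make the direct/indirect pointer scheme concrete, here is a rough sketch of how a logical block number maps to a pointer level. The constants are assumptions picked for illustration (12 direct pointers, 4 KiB blocks, 4-byte block pointers, which matches the classic ext2-style layout), the `locate` helper is hypothetical, and triple-indirect blocks are left out to keep it short.

```python
BLOCK_SIZE = 4096                          # assumed block size in bytes
POINTER_SIZE = 4                           # assumed size of one block pointer
DIRECT = 12                                # assumed number of direct pointers
PTRS_PER_BLOCK = BLOCK_SIZE // POINTER_SIZE   # 1024 pointers per index block

def locate(n):
    """Return which pointer path reaches logical block n of a file."""
    if n < 0:
        raise ValueError("block number must be non-negative")
    if n < DIRECT:
        return ("direct", n)
    n -= DIRECT
    if n < PTRS_PER_BLOCK:
        return ("single", n)               # one extra block read
    n -= PTRS_PER_BLOCK
    if n < PTRS_PER_BLOCK ** 2:
        # two extra block reads: outer index, then inner index
        return ("double", n // PTRS_PER_BLOCK, n % PTRS_PER_BLOCK)
    raise ValueError("beyond the double-indirect range of this sketch")

def max_file_bytes():
    """Largest file addressable without triple-indirect blocks."""
    return (DIRECT + PTRS_PER_BLOCK + PTRS_PER_BLOCK ** 2) * BLOCK_SIZE

print(locate(5), locate(12), locate(1036), max_file_bytes())
```

Small files are reached with zero extra reads, which is why the first few pointers are direct. Under these assumptions the double-indirect level tops out at about 4 GiB.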

41. What are disk scheduling algorithms?

When multiple I/O requests are pending, the disk scheduler decides the order. The goal: minimize head movement (seek time), which is the slowest part of mechanical disk I/O.

Algorithm	How It Works	Analogy
FCFS	Serve requests in order of arrival	A taxi going to destinations in the order passengers called
SSTF	Serve the closest request next	A taxi always going to the nearest passenger — efficient but can starve far-away passengers
SCAN (Elevator)	Move head in one direction, serving requests along the way, then reverse	An elevator going up, stopping at each floor, then going back down
C-SCAN	Like SCAN but only serves in one direction, then jumps back to the start	An elevator that only goes up, then drops to ground floor and starts over
LOOK / C-LOOK	Like SCAN/C-SCAN but reverses at the last request instead of going to the end of the disk	An elevator that reverses at the highest requested floor, not the top floor

Modern context: These algorithms matter less with SSDs since there is no physical head to move. With SSDs, the OS scheduler focuses on queue depth and parallelism rather than seek time optimization.
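Interviewers often ask you to compute total head movement by hand, so it helps to have checked the numbers once in code. This is a minimal sketch: the function names are made up for this example, the request queue is the one commonly used in textbooks, and `look` assumes the head starts out moving toward higher cylinders.

```python
def total_movement(start, order):
    """Sum of cylinder distances when serving requests in the given order."""
    moved, pos = 0, start
    for cyl in order:
        moved += abs(cyl - pos)
        pos = cyl
    return moved

def fcfs(start, requests):
    """Serve requests in arrival order."""
    return list(requests)

def sstf(start, requests):
    """Always serve the pending request closest to the current head position."""
    pending, order, pos = list(requests), [], start
    while pending:
        nearest = min(pending, key=lambda c: abs(c - pos))
        pending.remove(nearest)
        order.append(nearest)
        pos = nearest
    return order

def look(start, requests):
    """Sweep upward to the highest request, then reverse and sweep downward."""
    up = sorted(c for c in requests if c >= start)
    down = sorted((c for c in requests if c < start), reverse=True)
    return up + down

QUEUE = [98, 183, 37, 122, 14, 124, 65, 67]
HEAD = 53

for name, policy in (("FCFS", fcfs), ("SSTF", sstf), ("LOOK", look)):
    print(name, total_movement(HEAD, policy(HEAD, QUEUE)))
```

For this queue the totals are 640 cylinders for FCFS, 236 for SSTF, and 299 for LOOK. SSTF wins here, but it gets that result by repeatedly postponing the far-away requests, which is the starvation risk from the table.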

42. What is RAID?

RAID (Redundant Array of Independent Disks) combines multiple disks for performance, redundancy, or both.

Level	Technique	Min Disks	Benefit	Tradeoff
RAID 0	Striping (data split across disks)	2	Best performance	Zero redundancy — one disk fails, everything is lost
RAID 1	Mirroring (identical copies)	2	Full redundancy	50% storage overhead
RAID 5	Striping with distributed parity	3	Good balance of speed, capacity, and redundancy	Survives only one disk failure; parity updates slow small writes, and rebuilds are slow
RAID 6	Striping with double parity	4	Survives two disk failures	Higher write overhead than RAID 5
RAID 10	Mirroring + Striping	4	Best performance + redundancy	50% storage overhead, expensive
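A common follow-up is "how does parity actually recover a lost disk?" The answer is XOR. The sketch below is a simplified illustration, assuming one stripe of three equal-sized data blocks plus one parity block; `xor_blocks` is a hypothetical helper, and real RAID 5 rotates the parity block across disks from stripe to stripe.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe: three data blocks, each on its own disk.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)        # stored on a fourth disk

# Disk 1 dies. XOR the survivors with the parity block to rebuild it.
recovered = xor_blocks([data[0], data[2], parity])
print(recovered)
```

The rebuild has to read every surviving disk in full, which is one reason RAID 5 rebuilds are slow and why a second failure during a rebuild loses the array.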

43. What is spooling?

Spooling (Simultaneous Peripheral Operations Online) uses a buffer — usually on disk — to hold data between a fast producer and a slow consumer. The most common example: print spooling. When you print a document, the data goes to a spool file on disk, and the printer reads from it at its own pace. Your application doesn't have to wait for the printer to finish.

Without spooling, your word processor would freeze until the printer finishes every page. With spooling, you're back to editing immediately.
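The pattern is a producer/consumer queue. Here is a minimal sketch using an in-memory queue as a stand-in for the on-disk spool file. The `run_spooler` function and the `None` shutdown marker are conventions chosen for this example, not part of any real print system.

```python
import queue
import threading

def run_spooler(jobs):
    """Submit jobs to a spool and let a background 'printer' drain it."""
    spool = queue.Queue()        # stand-in for the spool file on disk
    printed = []

    def printer():
        while True:
            job = spool.get()    # blocks until a job is available
            if job is None:      # shutdown marker used by this sketch
                break
            printed.append(job)  # a real printer would be slow here

    worker = threading.Thread(target=printer)
    worker.start()

    for job in jobs:
        spool.put(job)           # returns immediately; the app keeps going

    spool.put(None)
    worker.join()
    return printed

print(run_spooler(["page 1", "page 2", "page 3"]))
```

The application only ever touches `spool.put()`, which returns right away. The slow device works through the queue at its own pace and in submission order.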

Modern OS Concepts

These questions are increasingly common in 2026 interviews, especially for backend, DevOps, and infrastructure roles. Most traditional OS interview guides don't cover them at all.

44. Containers vs Virtual Machines — what is the difference?

Feature	Virtual Machine	Container
Isolation level	Full hardware virtualization — separate kernel per VM	OS-level virtualization — shares the host kernel
Startup time	Minutes (boots an entire OS)	Seconds (starts a process)
Resource overhead	High (each VM runs its own kernel, drivers)	Low (just the application + its dependencies)
Size	Gigabytes	Megabytes
Security	Stronger — separate kernel means deeper isolation	Weaker — a kernel exploit can escape the container
Use case	Running different OSes, multi-tenant cloud (AWS EC2)	Microservices, CI/CD, consistent dev environments (Docker)

The key insight: containers are just regular OS processes with extra isolation applied through kernel features. A Docker container is not a VM — it is a process running with restricted namespaces and resource limits. This is why containers start in seconds while VMs take minutes.

45. What are namespaces and cgroups?

These are the two Linux kernel features that make containers possible.

Namespaces provide isolation. Each namespace type gives the container its own view of a specific resource:

PID namespace: The container sees its own process tree (PID 1 is the container's init, not the host's).
Network namespace: Own network stack, IP addresses, ports.
Mount namespace: Own filesystem view.
UTS namespace: Own hostname.
User namespace: Own UID/GID mapping (root in the container can be non-root on the host).

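Cgroups (Control Groups) provide resource limits and accounting. They cap how much CPU, memory, and disk I/O a container can use, and how many processes it can create. Network bandwidth is not capped by cgroups directly; it is usually shaped with traffic control (tc), which can classify packets by cgroup. Without cgroups, a runaway container could starve the host and other containers.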

Docker is essentially a user-friendly wrapper around namespaces, cgroups, and a layered filesystem (OverlayFS).
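You can poke at both features from an ordinary process. The sketch below assumes a Linux system with /proc mounted; `namespace_ids` and `parse_memory_max` are hypothetical helpers written for this example. The first reads the namespace links the kernel exposes under /proc/<pid>/ns, and the second parses the contents of a cgroup v2 memory.max file, which holds either the word max or a byte count.

```python
import os

def namespace_ids(pid="self"):
    """Map namespace type to its identifier, e.g. 'pid' -> 'pid:[4026531836]'.

    Returns an empty dict on systems without /proc/<pid>/ns.
    """
    ns_dir = f"/proc/{pid}/ns"
    ids = {}
    try:
        names = os.listdir(ns_dir)
    except OSError:
        return ids
    for name in names:
        try:
            ids[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue             # not permitted or vanished; skip it
    return ids

def parse_memory_max(text):
    """Parse cgroup v2 memory.max contents: None means unlimited."""
    text = text.strip()
    return None if text == "max" else int(text)

print(namespace_ids())
print(parse_memory_max("536870912\n"))
```

Two processes are in the same namespace when these identifiers match. Run it on the host and then inside a container, and the pid, net, and mnt entries will typically differ.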

46. What is Copy-on-Write (CoW)?

Copy-on-Write is an optimization where a copy of data isn't actually made until someone modifies it. Instead of duplicating data immediately, both the original and the "copy" point to the same physical memory. Only when one of them writes does the OS create an actual copy of the affected page.

Where it matters:

fork(): When a process forks, the child doesn't get a full copy of the parent's memory. Both share the same pages, marked read-only. Only when either process writes to a page does the OS copy that specific page. This makes fork() fast even for processes using gigabytes of memory.
Container images: Docker image layers use CoW. When you run a container from an image, you don't copy the entire image. Writes go to a thin layer on top. Multiple containers from the same image share the underlying layers.
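The mechanism is simple enough to model in a few lines. This is a toy simulation, not kernel code: `Frame` and `AddressSpace` are invented for this example, a "page" is just a small bytearray, and a reference count stands in for the kernel's bookkeeping of how many address spaces map a physical frame.

```python
class Frame:
    """A physical page frame shared by one or more address spaces."""
    def __init__(self, data):
        self.data = bytearray(data)
        self.refs = 1

class AddressSpace:
    """A toy page table: index i holds the frame backing virtual page i."""
    def __init__(self, frames):
        self.frames = frames

    @classmethod
    def from_bytes(cls, chunks):
        return cls([Frame(c) for c in chunks])

    def fork(self):
        # Share every frame with the child; nothing is copied yet.
        for frame in self.frames:
            frame.refs += 1
        return AddressSpace(list(self.frames))

    def read(self, page):
        return bytes(self.frames[page].data)

    def write(self, page, data):
        frame = self.frames[page]
        if frame.refs > 1:
            # Shared frame: copy it privately before modifying (the "CoW fault").
            frame.refs -= 1
            frame = Frame(frame.data)
            self.frames[page] = frame
        frame.data[:len(data)] = data

parent = AddressSpace.from_bytes([b"aaaa", b"bbbb"])
child = parent.fork()
child.write(0, b"zz")
print(parent.read(0), child.read(0), child.frames[1] is parent.frames[1])
```

After the write, page 0 has been duplicated for the child while page 1 is still the same shared frame. Only the pages that are actually written ever get copied.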

47. What is a Real-Time Operating System (RTOS)?

An RTOS guarantees that tasks complete within strict time deadlines. The correctness of the system depends not just on producing the right output, but producing it on time.

Two types:

Hard real-time: Missing a deadline is a system failure. Examples: aircraft flight control, anti-lock braking systems, pacemakers. If the braking system responds 50ms late, people die.
Soft real-time: Missing a deadline degrades quality but isn't catastrophic. Examples: video streaming (a dropped frame is noticeable but not fatal), audio processing.

RTOSes achieve this through deterministic scheduling (usually priority-based preemptive), minimal interrupt latency, and avoiding non-deterministic operations like garbage collection. FreeRTOS, QNX, and VxWorks are common in the industry.

48. OS-level security — what are the key mechanisms?

Operating systems enforce security at multiple levels:

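File permissions and Access Control Lists (ACLs): Define which users/groups can read, write, or execute each file. The rwxr-xr-- you see in ls -l output is the classic owner/group/other permission set. POSIX ACLs (managed with getfacl and setfacl) extend it with entries for additional named users and groups.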
Capabilities: Instead of giving a process full root access, give it only the specific capabilities it needs. A web server needs to bind port 80 but doesn't need to load kernel modules. Linux capabilities let you grant CAP_NET_BIND_SERVICE without full root.
Sandboxing: Restrict what a process can do. Browsers sandbox each tab (using seccomp-bpf on Linux, App Sandbox on macOS) so a compromised web page can't access your files.
ASLR (Address Space Layout Randomization): Randomize where code and data are loaded in memory. Makes buffer overflow exploits harder because attackers can't predict where things are.
Mandatory Access Control: SELinux and AppArmor enforce security policies that even root can't override. Used in production servers to limit blast radius of a compromise.
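The rwxr-xr-- string from ls -l is just a rendering of the mode bits stored in the inode. The sketch below decodes them with Python's standard `stat` module. The `may_read` helper is a simplified assumption written for this example: it ignores root, capabilities, ACL entries, and MAC policies, all of which can change the real answer.

```python
import stat

def describe(mode):
    """Render mode bits the way ls -l does, e.g. '-rwxr-xr--'."""
    return stat.filemode(mode)

def may_read(mode, file_uid, file_gid, uid, gids):
    """Classic Unix read check: owner bits, else group bits, else other bits."""
    if uid == file_uid:
        return bool(mode & stat.S_IRUSR)
    if file_gid in gids:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)

mode = stat.S_IFREG | 0o754      # regular file, rwxr-xr--
print(describe(mode))
print(may_read(mode, file_uid=1000, file_gid=100, uid=2000, gids={200}))
```

Note that the check stops at the first class that matches. An owner with no read bit is denied even if "other" has one.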

Scenario-Based Questions

These questions test whether you can connect OS concepts to real-world problems. Interviewers use them to separate candidates who understand theory from candidates who can actually apply it.

49. What happens when you execute a program?

Walk through the full lifecycle when you type ./myprogram in a terminal:

Shell parses the command: The shell reads your input, identifies ./myprogram as an executable path.
fork(): The shell calls fork() to create a child process. The child is a copy of the shell process (with CoW optimization).
exec(): The child calls an exec-family function (execve() under the hood) with ./myprogram as the path. This replaces the child's code, data, and stack with the contents of myprogram. The process ID stays the same.
Loader maps the binary: The OS reads the ELF header (on Linux), maps the code segment, data segment, and BSS into virtual memory. If there are shared libraries, the dynamic linker loads those too.
Execution begins: The CPU jumps to the program's entry point (usually _start, which calls main()). The process is now Running.
System calls during execution: Every time the program does I/O, allocates memory, or creates threads, it makes system calls to the kernel.
Termination: The program calls exit(). The kernel cleans up resources (closes files, frees memory, removes page table entries). The parent shell calls wait() to collect the exit status.

Why interviewers ask this: This single question touches fork, exec, virtual memory, paging, system calls, and process lifecycle. Answering it well shows you understand how everything connects.
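The fork, exec, and wait steps can be reproduced in a few lines using Python's thin wrappers over the same system calls. This is a bare-bones sketch for POSIX systems, not how a real shell is written: the `run` function is a name chosen for this example, and it skips signals, job control, and redirection.

```python
import os

def run(argv):
    """Launch a program the way a shell does and return its exit status."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the requested program.
        try:
            os.execvp(argv[0], argv)      # searches $PATH; never returns on success
        except OSError:
            os._exit(127)                 # conventional "command not found" status
    # Parent: block until the child terminates, then decode its status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)           # killed by a signal

if __name__ == "__main__":
    print("exit status:", run(["ls", "-l"]))
```

Notice that `execvp` sits in the child branch only. The parent never runs the new program; it just waits and collects the status, which is exactly what your shell does before printing the next prompt.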

50. How would you investigate high CPU usage on a Linux server?

This is a common DevOps/SRE interview question, and the answer reveals whether you've worked with real systems.

Identify the culprit: Run top or htop. Look at the %CPU column. Sort by CPU usage. Note the PID of the offending process.
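Determine what it's doing: Use strace -p <PID> to see which system calls it's making. A purely CPU-bound process makes few or no system calls, so a quiet trace is itself a clue. Page faults are not system calls and won't show up in strace; to check for thrashing, look at the fault counters with pidstat -r -p <PID> or at swap activity (the si/so columns) in vmstat 1.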
Check if it's user or kernel time: Look at %us vs %sy in top. High user time means the application is doing heavy computation. High system time means lots of system calls or kernel work.
Analyze the process: Use perf top -p <PID> to see which functions are hot. This points you to the exact code responsible.
Check for scheduling issues: Is the process at an abnormally high priority? Is there a runaway loop? Are there too many threads competing for CPU?
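The same user-versus-system split that top shows is available to a process about itself through getrusage. The sketch below is a small illustration for Unix-like systems: `snapshot`, `delta`, `busy_loop`, and `dominant_mode` are names invented for this example, and the measured numbers will vary from machine to machine.

```python
import resource

def snapshot():
    """Capture CPU time and page fault counters for the current process."""
    u = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "user": u.ru_utime,          # seconds spent in user mode
        "system": u.ru_stime,        # seconds spent in kernel mode
        "minor_faults": u.ru_minflt, # faults served without disk I/O
        "major_faults": u.ru_majflt, # faults that required disk I/O
    }

def delta(before, after):
    """Difference between two snapshots, counter by counter."""
    return {key: after[key] - before[key] for key in before}

def busy_loop(n):
    """Pure computation: should show up as user time, not system time."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def dominant_mode(usage):
    """Label where the CPU time went."""
    return "user" if usage["user"] >= usage["system"] else "system"

before = snapshot()
busy_loop(2_000_000)
usage = delta(before, snapshot())
print(usage, dominant_mode(usage))
```

A computation-heavy workload like this one should land almost entirely in user time. A workload dominated by small reads and writes would shift toward system time, and a growing major fault count points at memory pressure rather than CPU.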

51. What happens when you type "ls" in a terminal?

This is a condensed version of question 49, but interviewers use it to see how deep you can go:

The shell reads "ls" from stdin, searches $PATH for the binary (typically /usr/bin/ls).
fork() creates a child process.
exec("/usr/bin/ls") loads the ls binary into the child's address space.
ls calls opendir() and readdir(). These are C library functions; on Linux they wrap the openat() and getdents64() system calls, which ask the kernel to read directory entries from the filesystem.
The kernel looks up the inode for the current directory, reads the directory entries from disk (or from the page cache if recently accessed).
ls formats the output and calls write() to send it to stdout (the terminal).
ls calls exit(0). The kernel cleans up. The parent shell calls wait() and displays the next prompt.
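The directory-reading and output steps can be sketched as a stripped-down ls. This is an illustration only: `list_names` and `ls` are names chosen for this example, it handles none of the real tool's flags, and it relies on `os.scandir`, which on POSIX systems is built on the same opendir/readdir family that ls uses.

```python
import os

def list_names(path="."):
    """Read directory entries, skipping dotfiles like plain ls does."""
    with os.scandir(path) as entries:
        return sorted(e.name for e in entries if not e.name.startswith("."))

def ls(path=".", fd=1):
    """Write one name per line to the given file descriptor (1 is stdout)."""
    names = list_names(path)
    if names:
        os.write(fd, ("\n".join(names) + "\n").encode())

if __name__ == "__main__":
    ls(".")
```

Using `os.write` on file descriptor 1 keeps the example close to what actually happens: the program does not know or care whether stdout is a terminal, a pipe, or a file.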

Quick Reference Tables

Use these for last-minute revision. They condense the key comparisons that come up repeatedly in interviews.

Scheduling Algorithms at a Glance

Algorithm	Preemptive?	Starvation Risk?	Key Strength	Key Weakness
FCFS	No	No	Simplicity	Convoy effect
SJF	Can be both	Yes (long jobs)	Optimal avg. wait time	Needs burst time prediction
Round Robin	Yes	No	Fairness, good response time	Context switch overhead
Priority	Can be both	Yes (low-priority)	Urgency handling	Starvation without aging
MLFQ	Yes	Mitigated	Adaptive, no prediction needed	Complex to tune
CFS	Yes	No	Fair by design	Not ideal for hard real-time

Synchronization Primitives Compared

Primitive	Ownership?	Counter?	Blocks?	Use Case
Mutex	Yes	No (binary)	Yes	Exclusive access to shared data
Binary Semaphore	No	No (0 or 1)	Yes	Signaling between threads
Counting Semaphore	No	Yes	Yes	Limiting pool access (e.g., DB connections)
Spinlock	Yes	No	Busy-wait	Very short critical sections in kernel code
Monitor	Implicit	No	Yes	Higher-level abstraction (Java synchronized)

Memory Concepts Cheat Sheet

Concept	What It Is	Key Detail
Paging	Fixed-size memory blocks	Eliminates external fragmentation, causes internal fragmentation
Segmentation	Variable-size logical units	Matches program structure, causes external fragmentation
Virtual Memory	Abstraction over physical RAM + disk	Enables isolation and overcommit
TLB	Cache for page table entries	TLB miss is expensive — triggers page table walk
Page Fault	Accessing a page that is not currently mapped in RAM	Major faults involve disk I/O and are very slow compared to a memory access; minor faults only update page tables
Thrashing	Excessive page faulting	CPU idle while disk is 100% busy
Copy-on-Write	Defer copying until modification	Makes fork() fast, used in container layers

Frequently Asked Questions

How should I prepare for OS interview questions?

Focus on understanding, not memorization. Interviewers ask follow-up questions, and memorized definitions fall apart fast. For each topic, make sure you can explain why it works that way, not just what it is.

Start with processes and threads (they come up in every interview), then memory management (paging, virtual memory), then deadlocks. If you're applying for backend or DevOps roles, add containers and Linux internals to your prep.

Practice explaining concepts out loud. If you can explain virtual memory to a friend who has never heard of it, you can explain it to an interviewer.

How many OS questions should I expect in an interview?

Typically 3-5 questions in a dedicated OS round, or 1-2 questions mixed into a general CS fundamentals round. The depth varies by role — a systems engineer will get harder questions than a frontend developer.

For product companies (especially FAANG), expect questions that go deeper into one topic rather than surface-level coverage of many topics. They might spend 15 minutes just on virtual memory or deadlocks.

Which OS topics are asked most frequently?

Based on frequency across real interviews, here is the ranking:

Process vs Thread — shows up in almost every interview
Deadlocks — conditions, prevention, real examples
Virtual Memory & Paging — especially page faults and TLB
Synchronization — mutex, semaphore, race conditions
CPU Scheduling — Round Robin, CFS, MLFQ
Containers — increasingly asked for backend/DevOps roles

Is there a difference in OS questions for freshers vs experienced candidates?

Yes. Freshers get more "what is X?" questions — define a process, list scheduling algorithms, explain paging. The expected depth is moderate.

Experienced candidates get scenario-based and design questions — "How would you debug a memory leak?", "Why did you choose threads over processes in your system?", "Walk me through what happens when your service OOM-kills." The interviewer expects you to connect OS concepts to production experience.

Do I need to know Linux internals specifically?

For most roles, knowing general OS concepts is enough. But if you can mention Linux-specific implementations (CFS scheduler, cgroups, namespaces, OverlayFS), it makes a strong impression.

For systems engineering, SRE, or kernel development roles, Linux internals are expected. You should know the basics of how the Linux kernel handles scheduling, memory, and process management.
