Concurrent IO – Main Summary
These sections explain how a server handles many clients at the same time.
There are three main approaches:
| Method | Idea | Scalability |
|---|---|---|
| Thread per connection | One thread handles one client | Low |
| Process per connection | One process handles one client | Low |
| Event loop | One thread handles many sockets | High |
Modern systems like Redis, Nginx, and Node.js use event loops.
5.1 Thread-based Concurrency
Idea
Each client connection runs in a separate thread.
Flow:

```
Server
  |
accept connection
  |
create new thread
  |
thread handles requests
```
Pseudo-code:

```
accept()
create thread
thread:
    read request
    process
    write response
```
Problems
1. High memory usage

Each thread needs its own stack, often megabytes by default.

Example: 10,000 clients → 10,000 threads → huge memory usage.
2. CPU overhead

Creating and destroying threads costs:

- CPU time
- latency

This is especially wasteful when connections are short-lived.
3. Processes are even heavier

Older servers created one process per client with fork(). A process carries more state than a thread, so this is even slower.
5.2 Event-based Concurrency
Instead of many threads, use one thread plus an event loop.
Important concept:
Sockets have kernel buffers.
Incoming packets flow:

```
network
   ↓
kernel TCP stack
   ↓
socket read buffer
```
When the program calls read(), it copies data out of this buffer.
Event Loop Idea
Instead of waiting for one socket:
wait for ANY socket to be ready
Pseudo structure:

```
while running:
    wait until some sockets are ready
    read from ready sockets
    write to ready sockets
```
This is called an event loop.
3 OS mechanisms required

1️⃣ Readiness notification

Wait until a socket is ready. Examples:

- poll()
- select()
- epoll (via epoll_wait())
2️⃣ Non-blocking read

Read data without waiting: read() returns immediately, even if no data has arrived.
3️⃣ Non-blocking write

Write data without waiting: write() returns immediately, even if the send buffer is full.
5.3 Non-blocking IO

Normal IO is blocking: if no data is available, read(fd) makes the calling thread wait.

With non-blocking IO, read(fd) returns immediately. If no data is available, it returns -1 with errno = EAGAIN, meaning "try again later".
Write behavior

If the socket's send buffer is full, write() returns -1 with errno = EAGAIN, or performs a partial write.
Non-blocking accept

accept() removes completed connections from a kernel queue. If the queue is empty, a non-blocking accept() returns -1 with errno = EAGAIN.
Enabling non-blocking mode

Sockets are blocking by default. Enable non-blocking mode with fcntl(): read the current flags with F_GETFL, add O_NONBLOCK, and write them back with F_SETFL.
5.4 Readiness APIs
Servers need to know which sockets are ready.
General idea:
wait_for_readiness()
Linux provides several APIs.
1. poll()
Simple API.
Structure:

```c
struct pollfd {
    int   fd;       // file descriptor to watch
    short events;   // what we want
    short revents;  // what happened
};
```
Meaning:
| Field | Meaning |
|---|---|
| events | what we want |
| revents | what happened |
Example flags:

- POLLIN → ready to read
- POLLOUT → ready to write
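A minimal readiness check with poll(), watching a single fd (`wait_readable` is a hypothetical helper name):

```c
#include <poll.h>

// Wait until fd is readable, with a timeout in milliseconds.
// Returns the revents mask (e.g. POLLIN), 0 on timeout, -1 on error.
static int wait_readable(int fd, int timeout_ms) {
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return -1;        // poll() itself failed
    if (rc == 0)
        return 0;         // timeout: nothing ready
    return pfd.revents;   // what actually happened
}
```

A real event loop passes an array of pollfd entries, one per connection, and scans revents after each call.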
2. select()

Old API. Problem: it is limited to 1024 file descriptors (FD_SETSIZE), so it should not be used in modern servers.
3. epoll
Linux high-performance API.
Difference:

- poll → the full fd list is passed to the kernel on every call
- epoll → the fd list is stored in the kernel

So epoll scales better.
Used by:

- Redis
- Nginx
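A Linux-specific sketch of the registration step (`epoll_watch` is a hypothetical name): the fd is added to the kernel's interest list once, and epoll_wait() can then be called repeatedly without re-passing the list.

```c
#include <sys/epoll.h>
#include <unistd.h>

// epoll in three steps: create an instance, register the fd once
// (the interest list lives in the kernel), then wait repeatedly.
static int epoll_watch(int fd) {
    int ep = epoll_create1(0);
    if (ep < 0)
        return -1;
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) < 0) {
        close(ep);
        return -1;
    }
    return ep;   // caller loops on epoll_wait(ep, ...)
}
```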
4. kqueue

Used on:

- BSD
- macOS

Similar to epoll.
Readiness APIs Cannot Be Used With Files

They work only for:

- sockets
- pipes
- special kernel objects

Not for regular disk files.

Reason: a socket has a kernel buffer, so the kernel knows whether data is available. File data must be read from disk, so readiness cannot be reported meaningfully: poll() treats a regular file as always ready, even though the read may still block on disk.
Solution for file IO
Servers use:
thread pool
Example:

```
event loop thread
   |
send file read task
   |
worker thread reads file
```
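A minimal sketch of the hand-off, assuming a single worker thread and a condition variable for completion (`struct file_task` and `file_worker` are hypothetical names). A real server keeps a pool of workers and notifies the event loop through a pipe or eventfd instead of blocking on the condition variable.

```c
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>

// One blocking file-read task handed off to a worker thread,
// so the event-loop thread never blocks on disk.
struct file_task {
    const char *path;        // file to read
    char buf[4096];          // result buffer
    ssize_t nread;           // bytes read, or -1 on error
    int done;                // completion flag, guarded by mu
    pthread_mutex_t mu;
    pthread_cond_t cv;
};

// Worker thread: perform the blocking open/read, then signal.
static void *file_worker(void *arg) {
    struct file_task *t = arg;
    int fd = open(t->path, O_RDONLY);
    t->nread = (fd < 0) ? -1 : read(fd, t->buf, sizeof(t->buf));
    if (fd >= 0)
        close(fd);
    pthread_mutex_lock(&t->mu);
    t->done = 1;
    pthread_cond_signal(&t->cv);
    pthread_mutex_unlock(&t->mu);
    return NULL;
}
```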
New Linux solution
Linux introduced:
io_uring
Features:

- async file IO
- async socket IO
- high performance

But it is more complex to use.
5.5 Final Comparison
| Type | Method | API | Scalability |
|---|---|---|---|
| Socket | Thread per connection | pthread | Low |
| Socket | Process per connection | fork() | Low |
| Socket | Event loop | poll / epoll | High |
| File IO | Thread pool | pthread | Medium |
| Any IO | Event loop | io_uring | High |