Friday, 13 March 2026

Concurrent IO – Main Summary

These sections explain how a server handles many clients at the same time.

There are three main approaches:

Method                 | Idea                            | Scalability
Thread per connection  | One thread handles one client   | Low
Process per connection | One process handles one client  | Low
Event loop             | One thread handles many sockets | High

Modern systems like Redis, Nginx, and Node.js use event loops.


5.1 Thread-based Concurrency

Idea

Each client connection runs in a separate thread.

Flow:

Server
|
accept connection
|
create new thread
|
thread handles requests

Pseudo idea:

accept()
create thread
thread:
    read request
    process
    write response
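The flow above can be sketched with Python's standard library. This is a minimal hypothetical echo server: the main thread accepts, and each connection gets its own thread.

```python
import socket
import threading

def handle_client(conn):
    # one thread serves one client: read request, process, write response
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:            # client closed the connection
                break
            conn.sendall(data)      # "process" here is just an echo

server = socket.socket()
server.bind(("127.0.0.1", 0))       # ephemeral port, for the demo only
server.listen()

client = socket.socket()
client.connect(server.getsockname())

conn, _ = server.accept()           # accept connection
worker = threading.Thread(target=handle_client, args=(conn,))
worker.start()                      # create new thread

client.sendall(b"hello")
reply = client.recv(4096)
print(reply)                        # b'hello'
client.close()
worker.join()
server.close()
```

With 10,000 clients this design would start 10,000 threads, which is exactly the problem discussed next.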

Problems

1. High memory usage

Each thread has a stack.

Example:

10,000 clients
→ 10,000 threads
→ at the default 8 MB stack per thread on Linux, up to ~80 GB of reserved virtual memory

2. CPU overhead

Creating/destroying threads costs:

  • CPU

  • latency

Especially when connections are short-lived.


3. Processes are even heavier

Old servers used:

fork()

One process per client → even slower.


5.2 Event-based Concurrency

Instead of many threads, use one thread + event loop.

Important concept:

Sockets have kernel buffers.

Incoming packets:

network

kernel TCP stack

socket read buffer

When program calls:

read()

It copies data from this buffer.


Event Loop Idea

Instead of waiting for one socket:

wait for ANY socket to be ready

Pseudo structure:

while running:
    wait until some sockets are ready
    read from ready sockets
    write to ready sockets

This is called an event loop.


3 OS mechanisms required

1️⃣ Readiness notification

Wait until socket is ready.

Examples:

select()
poll()
epoll (epoll_create() / epoll_wait())

2️⃣ Non-blocking read

Read data without waiting.

read()

returns immediately.


3️⃣ Non-blocking write

Write data without waiting.

write()

returns immediately.


5.3 Non-blocking IO

Normal IO = blocking.

Example:

read(fd)

If no data:

thread waits

Non-blocking IO:

read(fd)

If no data:

returns -1 with errno = EAGAIN

Meaning:

try again later
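A minimal demonstration of this behavior using a Python socket pair (`setblocking(False)` sets O_NONBLOCK on the underlying fd, and Python surfaces EAGAIN as `BlockingIOError`):

```python
import errno
import socket

a, b = socket.socketpair()
a.setblocking(False)        # sets O_NONBLOCK on the underlying fd

try:
    a.recv(4096)            # no data has been sent yet
except BlockingIOError as e:
    # "try again later": the read returned immediately instead of waiting
    got_eagain = (e.errno == errno.EAGAIN)

print(got_eagain)           # True
```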

Write behavior

If buffer full:

write()

returns -1 with errno = EAGAIN,

or performs a partial write (accepts fewer bytes than requested).
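The buffer-full case can be observed by filling a socket's send buffer with non-blocking writes. A Python sketch (the exact byte count depends on the kernel's buffer sizes):

```python
import socket

a, b = socket.socketpair()
a.setblocking(False)

total = 0
try:
    while True:
        # send() may accept only part of the data: a partial write
        total += a.send(b"x" * 65536)
except BlockingIOError:
    pass                    # send buffer is full: EAGAIN

print("buffer full after", total, "bytes")
```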


Non-blocking accept

accept() removes connections from a kernel queue.

If queue empty:

accept()
→ EAGAIN
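The same can be demonstrated with a non-blocking listening socket in Python: with no pending connection in the queue, accept() fails immediately instead of waiting.

```python
import socket

srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen()
srv.setblocking(False)      # non-blocking listening socket

try:
    srv.accept()            # connection queue is empty
    accepted = True
except BlockingIOError:
    accepted = False        # EAGAIN: no pending connection

print(accepted)             # False
srv.close()
```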

Enabling non-blocking mode

Sockets are blocking by default.

Enable using:

flags = fcntl(fd, F_GETFL)
fcntl(fd, F_SETFL, flags | O_NONBLOCK)
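Python exposes the same fcntl calls directly (on Unix), so the flag change can be verified. A sketch:

```python
import fcntl
import os
import socket

s = socket.socket()

flags = fcntl.fcntl(s.fileno(), fcntl.F_GETFL)                 # read current flags
fcntl.fcntl(s.fileno(), fcntl.F_SETFL, flags | os.O_NONBLOCK)  # add O_NONBLOCK

nonblocking = bool(fcntl.fcntl(s.fileno(), fcntl.F_GETFL) & os.O_NONBLOCK)
print(nonblocking)          # True
s.close()
```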

5.4 Readiness APIs

Servers need to know which socket is ready.

General idea:

wait_for_readiness()

Linux provides several APIs.


1. poll()

Simple API.

Structure:

struct pollfd {
    int   fd;        /* file descriptor to watch */
    short events;    /* requested events (input) */
    short revents;   /* returned events (output) */
};

Meaning:

Field   | Meaning
events  | what we want
revents | what happened

Example flags:

POLLIN → ready to read
POLLOUT → ready to write
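A minimal poll()-based loop via Python's `select.poll`, using two local socket pairs to stand in for client connections; only the one with pending data is reported ready:

```python
import select
import socket

a1, b1 = socket.socketpair()    # two "connections"
a2, b2 = socket.socketpair()

p = select.poll()
p.register(a1.fileno(), select.POLLIN)   # events: we want readability
p.register(a2.fileno(), select.POLLIN)

b2.send(b"ping")                # only the second connection has data

ready = []
for fd, revents in p.poll(1000):         # revents: what actually happened
    if revents & select.POLLIN:
        sock = a1 if fd == a1.fileno() else a2
        ready.append((fd, sock.recv(4096)))

print(ready)                    # one entry: a2's fd with b'ping'
```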

2. select()

Old API.

Problem:

limited to FD_SETSIZE (1024) file descriptors

So it should not be used in modern servers.


3. epoll

Linux high-performance API.

Difference:

poll → pass fd list every time
epoll → fd list stored in kernel

So it scales better.

Used by:

  • Redis

  • Nginx
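Python's `select.epoll` (Linux only) shows the difference: the fd is registered once, and each wait call passes no fd list because the interest list lives in the kernel.

```python
import select
import socket

a, b = socket.socketpair()

ep = select.epoll()
ep.register(a.fileno(), select.EPOLLIN)  # interest list stored in the kernel

b.send(b"hi")

events = ep.poll(1)             # no fd list passed on each call
msg = b""
for fd, ev in events:
    if fd == a.fileno():
        msg = a.recv(4096)

print(msg)                      # b'hi'
ep.close()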


4. kqueue

Used on:

  • BSD

  • macOS

Similar to epoll.


Readiness APIs Cannot Be Used With Files

They work only for:

sockets
pipes
special kernel objects

Not for disk files.

Reason:

Socket:

kernel buffer exists

So kernel knows if data is available.

File:

data must be read from disk on demand

poll()/epoll treat regular files as always "ready", so the readiness check gives no useful signal and the read itself still blocks on disk IO.


Solution for file IO

Servers use:

thread pool

Example:

event loop thread
|
send file read task
|
worker thread reads file
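A sketch of this pattern with Python's `ThreadPoolExecutor`: the blocking file read runs on a worker thread, so an event-loop thread submitting the task would stay free.

```python
import concurrent.futures
import os
import tempfile

# create a temporary file to read
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "w") as f:
    f.write("file data")

def read_file(p):
    with open(p) as f:
        return f.read()         # blocking disk read, on a worker thread

pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)
future = pool.submit(read_file, path)    # event-loop thread continues here

content = future.result()       # a real loop would be notified on completion
print(content)                  # file data
pool.shutdown()
os.remove(path)
```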

New Linux solution

Linux introduced:

io_uring

Features:

async file IO
async socket IO
high performance

But it is more complex.


5.5 Final Comparison

Type    | Method                 | API          | Scalability
Socket  | Thread per connection  | pthread      | Low
Socket  | Process per connection | fork()       | Low
Socket  | Event loop             | poll / epoll | High
File IO | Thread pool            | pthread      | Medium
Any IO  | Event loop             | io_uring     | High
