Kafka Broker OS Tuning
Virtual Memory Concepts
Ideally, the Linux virtual memory system adjusts itself to the workload. We can still tune how swap space is handled to suit Kafka's needs.
- Swapping = bad
  - When a machine runs out of RAM, the OS can "swap" memory pages out to disk.
  - Disk is far slower than RAM.
  - So if Kafka's memory gets swapped, everything slows down badly.
- Kafka depends on page cache
  - Kafka doesn't keep all messages in the JVM heap.
  - Instead, it relies on the Linux page cache (OS memory used to cache disk files).
  - This makes reading and writing logs very fast (like a "RAM-speed disk").
- If swapping happens
  - It means RAM is too small for the workload.
  - The OS now uses disk as memory, which is very slow.
  - And since RAM is under pressure, there's less room left for page cache.
  - Result: Kafka loses its main performance advantage.
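To see how much swap is in use next to the page cache on a broker host, one option is to read `/proc/meminfo`. The following is a minimal sketch, assuming a Linux host; the helper names (`parse_meminfo`, `swap_summary`, `read_meminfo`) are made up for illustration and are not part of Kafka or any library.

```python
from pathlib import Path


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: integer value}.

    Most fields are reported in kB; the unit suffix is ignored here.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def swap_summary(meminfo):
    """Return swap in use and page cache size, both in kB."""
    swap_total = meminfo.get("SwapTotal", 0)
    swap_free = meminfo.get("SwapFree", 0)
    return {
        "swap_used_kb": swap_total - swap_free,
        "page_cache_kb": meminfo.get("Cached", 0),
    }


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live file (Linux only)."""
    return parse_meminfo(Path(path).read_text())
```

On a Linux broker, `swap_summary(read_meminfo())` gives a quick snapshot; a non-zero `swap_used_kb` that keeps growing is a hint that RAM is undersized for the workload.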
🔹 Simple analogy
Think of RAM as a kitchen counter:
- Kafka keeps its working tools and most-used ingredients on the counter (page cache).
- If the counter is too small, the chef (OS) starts moving things to the basement (disk swap).
- Every time Kafka needs something, the chef has to run to the basement and back, which is a huge slowdown.
- Plus, with stuff in the basement, there's even less room left on the counter, so the workflow collapses.
✅ In short: Swapping in Kafka is terrible because:
- It makes memory operations slow (disk instead of RAM).
- It steals memory from the OS page cache, which Kafka relies on for fast log access.
RAM vs Disk vs Page Cache
🔹 1. RAM (Physical Memory)
- This is the actual physical memory chips installed in your machine.
- Super fast (nanoseconds).
- Used for active data: what your CPU is working on right now.
🔹 2. OS Memory
- When people say "OS memory," they usually mean the portion of RAM managed by the operating system.
- The OS decides:
  - Which processes get how much RAM.
  - What part of RAM to use for page cache (caching disk files).
  - Whether to swap out inactive memory pages to disk if RAM runs low.
- So OS memory is not separate from RAM; it's RAM under the OS's control.
🔹 3. Disk (Persistent Storage)
- Completely different from RAM.
- Much slower: milliseconds for spinning disks, tens to hundreds of microseconds for SSD/NVMe.
- Stores data permanently (files, logs, databases).
- Examples: HDD, SSD, NVMe.
🔹 How they relate
- RAM = fast but limited, wiped when the machine restarts.
- Disk = big, slow, permanent.
- OS memory management = decides how to best use RAM and when to move (swap) data to disk if RAM runs out.
🔹 Analogy
- RAM = desk space where you keep the stuff you're working on right now.
- Disk = filing cabinet in the basement where you store everything long term.
- OS memory management = office manager who decides what stays on your desk (RAM), what gets cached nearby (page cache), and what gets moved to the basement (swap).
✅ In short:
- OS memory is just RAM managed by the operating system.
- RAM and disk are very different: RAM = fast, temporary; Disk = slow, permanent.
- Swapping happens when the OS moves data from RAM to disk because RAM is full, and that's what hurts Kafka.
Having pages swapped to disk can cause many performance problems. If the VM system is swapping to disk, not enough memory is being left for the page cache.
🔹 1. What is swap space?
- Swap space = a portion of your disk reserved to act like extra RAM.
- If RAM is full, the OS can "swap out" some memory pages (inactive ones) to this disk space.
- This frees up RAM for active work.
✅ Good: prevents crashes when memory is tight.
❌ Bad: disk is far slower than RAM, so performance tanks if swapping happens.
🔹 2. Why swap is not required
- A system can run without swap configured at all.
- If RAM runs out and no swap exists, the kernel's OOM killer has to kill processes to reclaim memory.
- Some operators prefer this for performance-critical apps like Kafka, because a fast failure avoids the slowdown from swapping.
🔹 3. Why some swap is still useful
- Swap acts as a safety net.
- If something unexpected happens (like a memory leak), instead of instantly killing Kafka, the OS can temporarily push some memory to disk.
- This may keep the system alive long enough for you to fix the issue.
🔹 4. What is vm.swappiness
- vm.swappiness = Linux setting that controls how aggressively the OS uses swap.
- Range: 0 to 100 (0 to 200 on kernel 5.8 and later).
  - 0 → avoid swap as much as possible.
  - 100 → swap aggressively once memory comes under pressure.
- For Kafka and other high-throughput apps, best practice is:
  - Keep swap configured (safety net).
  - But set vm.swappiness=1 → the OS will only swap as a last resort.
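The recommendation above can be checked from a script. This is a minimal sketch, assuming a Linux host where the current value is exposed at `/proc/sys/vm/swappiness`; the function names and the advice strings are illustrative only.

```python
from pathlib import Path

RECOMMENDED_SWAPPINESS = 1  # value recommended in the text above


def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Return the current vm.swappiness value as an int (Linux only)."""
    return int(Path(path).read_text().strip())


def swappiness_advice(current, recommended=RECOMMENDED_SWAPPINESS):
    """Return a short human-readable verdict for a swappiness value."""
    if current <= recommended:
        return f"vm.swappiness={current}: OK for a Kafka broker"
    return (
        f"vm.swappiness={current}: consider lowering it, e.g. "
        f"sysctl -w vm.swappiness={recommended}"
    )
```

`sysctl -w` only changes the running kernel; to keep the value across reboots it is usually also written to `/etc/sysctl.conf` or a file under `/etc/sysctl.d/`.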
🔹 5. Analogy
- RAM = your desk (fast access).
- Swap space = basement storage (slow to reach).
- Swappiness = how eager the office manager is to move stuff to the basement.
  - High swappiness → manager keeps clearing the desk too early (slow).
  - Low swappiness → manager only uses the basement if the desk is completely full.
✅ In short: Swap space is disk space used as backup RAM. You don't have to configure it, but it's a good safety net. In Kafka, you don't want the OS to swap unless it's absolutely necessary; that's why the recommendation is to set vm.swappiness=1.
Swap vs Page Cache Drop Trade-Off?
🔹 1. Page cache refresher
- Kafka writes logs to disk files.
- Linux keeps recently used file data in RAM (this is the page cache).
- Page cache makes reads/writes much faster, because you don't always go to disk.
- So: more RAM for page cache = better Kafka performance.
🔹 2. What vm.swappiness controls
- When RAM is running low, Linux has two choices:
  - Drop some page cache (free up memory by forgetting cached file data).
  - Swap out memory pages from applications to disk (push part of their memory into swap).
- vm.swappiness decides which strategy Linux prefers.
  - High value (e.g. 60, the default) → Linux is more likely to use swap.
  - Low value (e.g. 1) → Linux is more likely to drop page cache instead of swapping.
🔹 3. Why dropping page cache is better than swapping
- Dropping page cache:
  - You lose some cached file data.
  - But next time you need it, you just fetch from disk again (slower than cache, but predictable).
- Swapping memory to disk:
  - Takes memory pages that belong to running processes (Kafka or others) and moves them to disk.
  - If Kafka needs those pages back → huge stall (disk is thousands of times slower than RAM).
  - Causes unpredictable latency spikes → very bad for Kafka.
👉 So the recommendation: better to reduce page cache than to start using swap.
🔹 4. Simplified analogy
- Imagine RAM as your desk space.
- Page cache = reference books you keep on your desk for quick access.
- Kafka's working memory = active notes you're writing on.
When desk space runs low:
- Option A (drop cache): Put away a few reference books (page cache). If you need them again, you fetch them from the library (disk).
- Option B (swap): Force yourself to put away half-written notes (swap). When you need them again, you must slowly re-read and re-write them from storage.
👉 Option A (drop cache) slows you down a little.
👉 Option B (swap) can freeze you mid-sentence.
✅ In short:
- vm.swappiness controls whether Linux prefers to swap memory to disk or drop page cache when RAM is low.
- For Kafka, it's better to drop page cache than to use swap, because swapping makes performance unpredictable.
Swap is controlled by Linux
🔹 1. What swap actually does
- The Linux kernel decides which memory pages to swap out to disk when RAM is tight.
- It doesn't only swap "unused" memory; it can also swap out memory from processes that are still running.
- If the process suddenly needs that page again, it takes a page fault and the kernel has to reload the page from disk.
- That reload can take milliseconds (vs nanoseconds from RAM), which is a huge delay.
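Whether the kernel is actually swapping right now can be estimated from the `pswpin` and `pswpout` counters in `/proc/vmstat`, which count pages swapped in and out since boot. The sketch below compares two snapshots; it assumes a Linux host, and the helper names are hypothetical.

```python
from pathlib import Path


def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_activity(before, after):
    """Pages swapped in and out between two parsed snapshots."""
    return {
        "pages_swapped_in": after.get("pswpin", 0) - before.get("pswpin", 0),
        "pages_swapped_out": after.get("pswpout", 0) - before.get("pswpout", 0),
    }


def read_vmstat(path="/proc/vmstat"):
    """Read and parse the live counters (Linux only)."""
    return parse_vmstat(Path(path).read_text())
```

Taking `read_vmstat()` twice, a few seconds apart, and passing both results to `swap_activity` shows whether pages moved during that interval; steady non-zero values on a broker are worth investigating.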
🔹 2. Why this is unpredictable
- The kernel's swapping decision depends on heuristics (like least-recently-used pages), but it isn't perfect.
- A page Kafka really needs (e.g., part of a producer buffer or replica fetcher state) might get swapped out.
- Kafka doesn't control which memory is swapped; the OS does.
- So you can suddenly get a latency spike even though Kafka is "healthy" otherwise.
🔹 3. Impact on Kafka
- Producer → sends data → broker stalls (waiting for swapped memory). Producer sees high latency.
- Consumer → fetch request delayed because Kafka's fetch buffer got swapped.
- GC (garbage collector) → if heap pages or GC metadata get swapped, the collector must fault them back in, so GC pauses get even longer.
👉 This is why in Kafka best practices:
- Swap is treated as a last resort only (swappiness = 1).
- Or completely disabled on dedicated Kafka brokers.
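To check whether the broker process itself has memory sitting in swap, Linux reports a per-process `VmSwap` field in `/proc/<pid>/status`. A minimal sketch, assuming you already know the broker's PID (how you find it is up to your setup); the function names are illustrative.

```python
from pathlib import Path


def vm_swap_kb(status_text):
    """Return the VmSwap value in kB from /proc/<pid>/status text.

    Returns None if the field is missing (e.g. kernel threads).
    """
    for line in status_text.splitlines():
        if line.startswith("VmSwap:"):
            return int(line.split()[1])
    return None


def process_swap_kb(pid):
    """Read VmSwap for a live process (Linux only)."""
    return vm_swap_kb(Path(f"/proc/{pid}/status").read_text())
```

A non-zero result for the Kafka JVM means part of its memory has already been pushed to swap, which matches the stall scenarios listed above.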
🔹 Analogy
It's like your notes are on your desk (RAM). The office manager (OS) decides, "I think you don't need this notebook right now," and puts it in the basement (swap). When you actually do need it, you have to run to the basement, fetch it back, and only then continue: an unpredictable stall.
So even memory that is still in use can be swapped to disk, and when Kafka needs it back, performance stalls unpredictably.
TLDR!!!
🔹 1. Page cache drop
- The OS discards cached file data from RAM.
- That data is still on disk already (Kafka log segments).
- If Kafka needs it again → just read from disk normally.
- Cost: one normal disk read.
- Predictable: performance hit is known (disk I/O latency).
🔹 2. Swap
- The OS actively writes process memory pages (e.g., Kafka's JVM heap objects, control structures) to swap space on disk.
- Those pages do not exist on disk already, so the kernel must write them out before freeing RAM.
- If Kafka needs them back → it has to pause until the kernel reloads them from swap.
- Cost: one disk write and one disk read.
- Unpredictable: Kafka may stall at random, because it doesn't control which memory pages get swapped.
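The two "Cost" bullets above can be turned into a small worked comparison. The latency numbers below are assumed, order-of-magnitude placeholders chosen only to make the arithmetic concrete; they are not measurements, and real values depend heavily on the storage device.

```python
RAM_ACCESS_NS = 100    # assumed RAM access time, order of magnitude only
DISK_READ_MS = 5.0     # assumed latency of one disk read
DISK_WRITE_MS = 5.0    # assumed latency of one disk write


def page_cache_miss_cost_ms(read_ms=DISK_READ_MS):
    """Dropped page cache: the data is already on disk, so one read."""
    return read_ms


def swap_round_trip_cost_ms(read_ms=DISK_READ_MS, write_ms=DISK_WRITE_MS):
    """Swapped page: one write to swap it out plus one read to bring it back."""
    return write_ms + read_ms


def slowdown_vs_ram(cost_ms, ram_ns=RAM_ACCESS_NS):
    """How many times slower a cost in ms is than one RAM access."""
    return cost_ms * 1_000_000 / ram_ns
```

With these assumed figures, a swap round trip costs twice as much disk time as a page cache miss, and either one is tens of thousands of times slower than touching RAM. The bigger practical difference is the one the text describes: the swap cost lands on whichever in-use page the kernel happened to pick.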
🔹 3. Why Kafka cares
- Kafka log data (page cache) → dropping it is okay, since the log is durable on disk.
- Kafka's heap memory (swap) → swapping it causes random stalls, because the broker can't access in-use objects until they're paged back.
🔹 Analogy
- Page cache drop = You borrowed a reference book and left it on your desk (RAM). The office manager takes it away. If you need it again, you just check it out from the library (disk). No harm done.
- Swap = You're actively writing in your notebook (heap). The office manager snatches it, boxes it, and sends it to the basement (swap). If you need it mid-thought, you're frozen until it's retrieved.
✅ So the difference:
- Dropping page cache = safe, predictable slowdown (just a disk read).
- Swapping = unsafe, unpredictable stalls (extra writes, random process freezes).