Cassandra memory consists of heap memory plus off-heap memory.
Off-heap memory (native memory or direct memory, which is managed by the OS)
Heap memory (which is managed by Java)
The following are part of off-heap memory:
Partition key cache
Row cache
Chunk cache
Memtable space
Note: Depending on memtable_allocation_type, memtable space can be in off-heap or in heap memory.
Java takes care of heap memory, and we have some control over heap space. Off-heap (native) memory is controlled by the operating system (OS).
Here are a few parameters with which we can control the heap space and off-heap space.
We can set the maximum heap size in the jvm.options file on a single node. For example:
-Xms48G
-Xmx48G
Set the min (-Xms) and max (-Xmx) heap sizes to the same value to avoid GC pauses during resize, and to lock the heap in memory on startup.
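A quick way to confirm that the two heap flags agree is to compare them directly. The following is a minimal sketch; check_heap_flags is a hypothetical helper name, and it assumes each flag starts at the beginning of its own line, as in the example above.

```shell
# Hypothetical helper: compares the last -Xms and -Xmx entries in a jvm.options file.
# Assumes each flag sits at the start of its own line, e.g. "-Xms48G".
check_heap_flags() {
  local file=$1 xms xmx
  xms=$(grep -E '^-Xms' "$file" | tail -n 1 | sed 's/^-Xms//')
  xmx=$(grep -E '^-Xmx' "$file" | tail -n 1 | sed 's/^-Xmx//')
  if [ -n "$xms" ] && [ "$xms" = "$xmx" ]; then
    echo "match: $xms"
  else
    echo "mismatch: Xms=${xms:-unset} Xmx=${xmx:-unset}"
  fi
}
```

Call it with the path to the node's jvm.options file; the location depends on the installation.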
We can set the memtable size and the flush threshold for memtable data in cassandra.yaml on a single node. For example:
memtable_cleanup_threshold: 0.50   # once usage reaches 50% of the memtable space, the memtable is flushed from memory
memtable_space_in_mb: 4096         # the default is 1/4 of the heap
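As a rough worked example of how the two settings above interact, the sketch below multiplies the memtable space by the cleanup threshold to estimate the usage level at which a flush starts. It assumes the threshold is applied as a plain ratio of memtable_space_in_mb; flush_trigger_mb is a hypothetical helper name.

```shell
# Hypothetical helper: estimates the memtable usage (in MB) that triggers a flush.
# Assumption: the trigger point is memtable_space_in_mb * memtable_cleanup_threshold.
flush_trigger_mb() {
  # awk handles the fractional threshold, which shell integer arithmetic cannot
  awk -v space="$1" -v ratio="$2" 'BEGIN { printf "%d\n", space * ratio }'
}

# Values from the example settings: 4096 MB of memtable space, threshold 0.50
echo "flush starts near $(flush_trigger_mb 4096 0.50) MB"
```

With the example values this comes to 2048 MB.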
We can manage off-heap space (native memory, or max direct memory) in jvm.options or cassandra-env.sh. For example, in jvm.options we can set it as below:
-XX:MaxDirectMemorySize=1M
Note: When MAX_DIRECT_MEMORY (-XX:MaxDirectMemorySize) is not set, its default value is (MAX_SYSTEM_MEMORY - MAX_HEAP_SIZE) / 2. When the file_cache_size_in_mb property is not set, the chunk cache size is set to 1/2 of the maximum direct memory. If no maximum direct memory is set, the cache size is set to 1/3 of the system memory.
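The defaults described in the note can be worked through with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the 128 GB of system memory is an assumed figure chosen to go with the 48 GB heap from the earlier example.

```shell
# Hypothetical helpers that restate the default formulas from the note above.

# Default max direct memory in MB: (system memory - max heap) / 2
default_max_direct_mb() {
  local system_mb=$1 heap_mb=$2
  echo $(( (system_mb - heap_mb) / 2 ))
}

# Default chunk cache in MB when file_cache_size_in_mb is unset:
# half of the maximum direct memory
default_chunk_cache_mb() {
  local max_direct_mb=$1
  echo $(( max_direct_mb / 2 ))
}

# Assumed node: 128 GB of RAM (illustrative) with a 48 GB heap
system_mb=$(( 128 * 1024 ))
heap_mb=$(( 48 * 1024 ))
direct_mb=$(default_max_direct_mb "$system_mb" "$heap_mb")
echo "max direct memory: ${direct_mb} MB"
echo "chunk cache: $(default_chunk_cache_mb "$direct_mb") MB"
```

For this assumed node, the defaults work out to 40960 MB of direct memory and a 20480 MB chunk cache.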
Scripts for troubleshooting OOM issues
The following helps check whether the bloom filter is taking up a large amount of space:
for i in {1..10}; do nodetool sjk mx -mg -b 'org.apache.cassandra.metrics:type=Table,name=BloomFilterOffHeapMemoryUsed' -f Value; date ; sleep 10; done
The following helps check how native memory is being used:
for i in {1..10}; do nodetool sjk mxdump -q "org.apache.cassandra.metrics:type=NativeMemory,name=*"; date ; sleep 10; done > $(hostname -i)_NativeMemory
The following helps with leak detection:
for i in {1..20}; do nodetool leaksdetection; sleep 30; done > $(hostname -i)_leaksdetection.txt
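The three loops above share the same shape: run a command a fixed number of times with a pause in between. A minimal sketch of a reusable wrapper follows; sample is a hypothetical helper name, and unlike the loops above it prints a timestamp after every run and skips the pause after the last one.

```shell
# Hypothetical helper: run a command COUNT times, INTERVAL seconds apart,
# printing a timestamp after each run.
# Usage: sample COUNT INTERVAL command [args...]
sample() {
  local count=$1 interval=$2 i=1
  shift 2
  while [ "$i" -le "$count" ]; do
    "$@"
    date
    # Skip the sleep after the final sample
    if [ "$i" -lt "$count" ]; then
      sleep "$interval"
    fi
    i=$(( i + 1 ))
  done
}

# Example (not executed here), equivalent to the leak detection loop above:
# sample 20 30 nodetool leaksdetection > "$(hostname -i)_leaksdetection.txt"
```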
Try to reduce heap and native memory to control OOM issues.