Optimising RAM Usage With Python

Discover how we can make better use of RAM by applying various Python optimisation techniques.

In today’s world, with the abundance of RAM available, we rarely think about optimising our code. But sooner or later, often before we realise it, we hit the limits of the hardware.
Let’s begin with how to measure the amount of RAM used by a Python program. There is more than one way to do so, and each has its merits.

Using sys.getsizeof

This is a built-in function from the sys module that returns the shallow size of a single Python object in bytes.

What it measures

It measures only the memory directly attributed to the object itself (e.g., the list structure), excluding the size of any referenced objects (e.g., the integers or strings inside a list).
The pros are:
  • Simple, fast and lightweight.
  • Useful for quick comparisons of basic object overhead (e.g., empty list vs small list).
Cons/limitations:
  • Underestimates containers such as lists, dicts, or custom objects, because the objects they reference are not counted.
  • As it is not recursive, it doesn’t help with complex structures or the total memory footprint.
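Because sys.getsizeof is shallow, a small recursive helper is often handy for totalling a container and everything it references. Below is a minimal sketch (the name deep_getsizeof and the set of handled types are my own choices, not a standard API):

```python
import sys


def deep_getsizeof(obj, seen=None):
    """Sum the shallow sizes of obj and everything it references.

    A rough sketch: handles only the common container types and
    deduplicates shared objects via the `seen` id set.
    """
    if seen is None:
        seen = set()
    if id(obj) in seen:          # already counted (shared/cyclic reference)
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_getsizeof(k, seen) + deep_getsizeof(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_getsizeof(item, seen) for item in obj)
    return size


words = ["alpha", "beta", "gamma"]
print(deep_getsizeof(words) > sys.getsizeof(words))  # deep size also counts the strings
```

For production use, a battle-tested version of this idea is available as Pympler’s asizeof.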

When to use

Use it for basic, one-off checks on simple objects or understanding Python’s object overhead. It’s not suitable for profiling real memory usage in applications.
import sys


def get_object_size_mb(obj):
    """
    Returns the size of a Python object in megabytes (MB).
    """
    size_bytes = sys.getsizeof(obj)
    size_mb = size_bytes / (1024 * 1024)  # Convert bytes to MB
    return size_mb


# Create a huge list
a = [i for i in range(5_000_000)]
size_of_list_mb = get_object_size_mb(a)
print(f"Size of python object is: {size_of_list_mb:.2f} MB")

##OUTPUT
% /usr/bin/python3 getsizeof_sample.py
Size of python object is: 38.35 MB

Using tracemalloc

This built-in standard library module (since Python 3.4) traces Python memory allocations.

What it measures

It tracks memory blocks allocated by Python, providing detailed statistics (size, count, traceback) grouped by line, file, or traceback. You can take snapshots and compare them to find leaks or differences.
The pros are:
  • No external dependencies.
  • Precise for Python-allocated memory (lower overhead than some tools).
  • Excellent for debugging leaks; shows exactly where allocations happen.
  • Supports peak/current traced memory and object-specific tracebacks.
Cons:
  • Only tracks Python allocations (misses C extensions like NumPy unless they use Python’s allocator).
  • Requires starting tracing early (ideally at program start).
  • More low-level; needs code to take/compute snapshots.

When to use

Use it when investigating memory leaks, finding allocation hotspots, or comparing memory before/after code sections. It’s great for detailed, programmatic analysis in production-like debugging.
import tracemalloc

tracemalloc.start()

# Create a huge list
a = [i for i in range(5_000_000)]

current, peak = tracemalloc.get_traced_memory()
print(f"Current memory: {current / 10**6:.2f} MB; Peak: {peak / 10**6:.2f} MB")

tracemalloc.stop()


##OUTPUT
% /usr/bin/python3 tracemalloc_sample.py
Current memory: 180.21 MB; Peak: 180.21 MB
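The snapshot API lets you attribute memory growth to specific source lines. A minimal sketch (the file names and line numbers printed will depend on where you run it):

```python
import tracemalloc

tracemalloc.start()

snap_before = tracemalloc.take_snapshot()
a = [i for i in range(1_000_000)]   # the allocation we want to attribute
snap_after = tracemalloc.take_snapshot()

# Compare the two snapshots, grouped by source line
stats = snap_after.compare_to(snap_before, "lineno")
for stat in stats[:3]:
    print(stat)

tracemalloc.stop()
```

The list comprehension line should dominate the diff; comparing snapshots taken over time in a long-running process is a simple way to spot leaks.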

Using memory_profiler

This is a third-party package (pip install memory_profiler) for line-by-line memory profiling.

What it measures

Measures memory usage (typically Resident Set Size — the portion of a process that is living in RAM right now) after each line in decorated functions, showing increments. Uses psutil as the default backend (optional switch to tracemalloc for more precise Python-only allocations).
The pros are:
  • Easy line-by-line view (similar to line_profiler for time).
  • Works cross-platform (Windows/Linux/Mac).
  • Plots memory over time with mprof.
  • Backend flexible (can switch between psutil and tracemalloc).
Cons:
  • External dependency (and often needs psutil).
  • Higher overhead due to frequent sampling of process memory.
  • RSS is coarse and may overestimate memory (includes shared libs, OS allocations).
  • Less precise than tracemalloc for pinpointing pure Python object allocations.

When to use

Good for quick, high-level overview of memory growth per line in functions/scripts. Ideal for exploratory profiling for scripts or functions where you only need to know which lines are causing memory jumps, not deep allocation traces.
from memory_profiler import memory_usage


def measure_ram(func):
    def wrapper(*args, **kwargs):
        mem_before = memory_usage()[0]
        result = func(*args, **kwargs)
        mem_after = memory_usage()[0]
        print(f"RAM used: {mem_after - mem_before:.2f} MB")
        return result
    return wrapper


@measure_ram
def sample_func():
    a = [i for i in range(5_000_000)]


sample_func()


##OUTPUT
% /usr/bin/python3 memory_profiler_sample.py
RAM used: 3.60 MB

Techniques for optimising memory usage

Effective data structure usage

Choosing the right data structure is one of the most important ways to reduce RAM usage and improve performance. Different structures store data differently and have different memory footprints.

Array vs list

Using a regular list to store many numbers is much less RAM-efficient than using an array object. A list performs more memory allocations, each of which takes time; calculations on larger objects are less cache-friendly; and more RAM is used overall, leaving less available to other programs.
from memory_profiler_sample import measure_ram  # Note: imported function is from above sample
from array import array

# Create many integers
N = 5_000_000


@measure_ram
def sample_list():
    print("List Creation")
    data_list = list(range(N))


sample_list()


@measure_ram
def sample_arr():
    print("Array Creation")
    data_array = array('i', range(N))


sample_arr()


##OUTPUT
% /usr/bin/python3 list_VS_arr_sample.py
List Creation
RAM used: 41.45 MB
Array Creation
RAM used: 19.16 MB

Tuple vs list

Tuples use less memory because they cannot be resized, so no extra capacity is reserved; they are also faster and more cache-friendly. Similarly, a set uses less memory than a dict with the same keys, since no values need to be stored.
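A quick shallow-size comparison shows the gap (exact byte counts vary across Python versions, but the tuple always comes out smaller):

```python
import sys

n = 1_000
as_list = list(range(n))
as_tuple = tuple(range(n))

# A tuple needs no spare capacity, so its header is smaller
print(f"list : {sys.getsizeof(as_list)} bytes")
print(f"tuple: {sys.getsizeof(as_tuple)} bytes")
```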

Choose the smallest numeric type

When working with large datasets in Python, NumPy arrays are far more memory-efficient than pure Python lists or Pandas Series (which often default to higher-precision types). The key is to select the smallest dtype that can safely represent your data without overflow or loss of precision. Here are a few options: int64 uses 8 bytes, int32 uses 4 bytes, int16 uses 2 bytes.
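For example, one million small integers shrink 8x when stored as int8 instead of int64 (the values below are random 0-100 readings, chosen so they safely fit in a single byte):

```python
import numpy as np

# One million readings in the range 0-100: int64 wastes 7 of every 8 bytes
readings = np.random.randint(0, 101, size=1_000_000, dtype=np.int64)
small = readings.astype(np.int8)   # every value fits comfortably in 1 byte

print(f"int64: {readings.nbytes / 1e6:.1f} MB")  # int64: 8.0 MB
print(f"int8 : {small.nbytes / 1e6:.1f} MB")     # int8 : 1.0 MB
```

Always check the actual value range before downcasting: NumPy silently wraps around on overflow rather than raising an error.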

Use sparse data structures

Use libraries like scipy.sparse or pandas.SparseArray when the data holds most values as zero — storing all zeros does not make effective use of RAM. Hence sparse matrices store only non-zero entries.
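The principle can be illustrated with the standard library alone: keep only the non-zero entries in a dict keyed by (row, col). This toy class is my own sketch; in practice you would reach for scipy.sparse, whose formats are far more compact and support fast linear algebra:

```python
class SparseMatrix:
    """Toy sparse matrix: only non-zero entries are stored."""

    def __init__(self, rows, cols):
        self.shape = (rows, cols)
        self._data = {}                    # (row, col) -> value

    def __setitem__(self, key, value):
        if value == 0:
            self._data.pop(key, None)      # never store zeros
        else:
            self._data[key] = value

    def __getitem__(self, key):
        return self._data.get(key, 0)      # absent entries read as zero

    def nnz(self):
        return len(self._data)             # number of stored (non-zero) values


m = SparseMatrix(10_000, 10_000)   # 100 million logical cells...
m[3, 7] = 1.5
m[42, 0] = -2.0
print(m.nnz())                     # ...but only 2 values actually stored
```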

Bytes vs Unicode

Python distinguishes between text (human-readable characters) and binary data (raw bytes — images, files, network packets, compressed data, encrypted data). Understanding the difference is essential for file handling, networking, APIs, and encoding issues. Unicode strings use more memory because CPython stores extra per-string metadata (length, cached hash, encoding kind) and uses 1, 2 or 4 bytes per character depending on the widest character present. Bytes are compact as they use exactly 1 byte per element.
import sys


def get_object_size_mb(obj):
    """
    Returns the size of a Python object in megabytes (MB).
    """
    size_bytes = sys.getsizeof(obj)
    size_mb = size_bytes / (1024 * 1024)  # Convert bytes to MB
    return size_mb


# Create N repetitions of the string
N = 500_000_000

s = "hello "


# Unicode string (each character may take multiple bytes)
def sample_text():
    print("Text Creation")
    data_text = s * N
    print(f"Size is: {get_object_size_mb(data_text):.2f} MB")


sample_text()


# Bytes version (UTF-8 encoded)
def sample_bytes():
    print("Bytes Creation")
    data_text = s * N
    data_bytes = data_text.encode("utf-8")
    print(f"Size is: {get_object_size_mb(data_bytes):.2f} MB")


sample_bytes()


##OUTPUT
% /usr/bin/python3 text_VS_unicode_sample.py
Text Creation
Size is: 13351.44 MB
Bytes Creation
Size is: 4768.37 MB

Handling large text data stores

Handling massive text data (logs, documents, transcripts, crawled data, chat history, etc) requires storage formats and structures that are memory-efficient, compressed, and scalable.

Compression (gzip, bz2, zstd)

Compression typically reduces the size of text by 10-30X — storing raw text wastes huge amounts of space.
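A quick demonstration with the standard-library gzip module, using repetitive log-style text (the most favourable case for compression):

```python
import gzip

# Repetitive log-style text compresses extremely well
raw = b"2024-01-01 12:00:00 INFO request handled in 3ms\n" * 100_000

compressed = gzip.compress(raw)
ratio = len(raw) / len(compressed)
print(f"raw: {len(raw) / 1e6:.1f} MB, "
      f"compressed: {len(compressed) / 1e3:.1f} KB, "
      f"ratio: {ratio:.0f}x")
```

The bz2 and (third-party) zstandard modules expose the same compress/decompress style of API, trading compression ratio against speed.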

mmap

Map a file (or part of it) directly into the process’s address space and treat gigabytes of data as a giant string or byte array. This gives a bytes-like object that can be terabytes in size yet uses almost no RAM until you touch the pages, because the OS loads them on demand; you get fast access to large files with no RAM explosion.
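A small sketch with the standard-library mmap module (the demo file is created on the fly; in practice you would map an existing large file):

```python
import mmap
import tempfile

# Create a demo file; in practice this would be a huge existing file
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"alpha\nbeta\ngamma\n" * 1000)
    path = f.name

with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    print(mm[:5])              # slice like a bytes object: b'alpha'
    print(mm.find(b"gamma"))   # search without loading the whole file: 11
    mm.close()
```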

DAWG (Directed Acyclic Word Graph)

A DAWG is a highly compressed data structure used to store large sets of strings efficiently. It is especially powerful for dictionaries, NLP vocabularies, autocomplete engines, and prefix-based searches.
import dawg

words = ["cat", "car", "cart", "dog", "doing"]

d = dawg.CompletionDAWG(words)

print(d.keys("ca"))  # all keys that start with 'ca'
print("dog" in d)    # True

Tries

A trie (also known as a prefix tree or digital tree) is a tree-based data structure used to store and retrieve a dynamic collection of strings efficiently by sharing common prefixes. Prefix-based string structures play a crucial role in text indexing, NLP, search engines, auto-complete systems and large dictionary storage. A few commonly used variants are given below.

Marisa-Trie (Matching Algorithm with Recursively Implemented StorAge): this static compressed trie uses the LOUDS (Level-Order Unary Degree Sequence) representation. It has an extremely small memory footprint (often 5-10% of the original text) and supports predictive (common-prefix) search and exact match very efficiently. Once built, the structure is read-only, so it is not suitable for frequent insertions/deletions. The Python module is marisa-trie.

Double-Array Trie (DAT): a compact trie representation that uses two arrays:
  a. The BASE array determines the offset for child transitions: base[i] is the starting index for the children of node i.
  b. The CHECK array stores the parent index to validate transitions: check[i] is the parent of node i (for collision detection).
It has very fast transitions (just array indexing) and supports dynamic insertion/deletion (though slower than lookup). The Python module is datrie.

HAT-Trie: a hybrid that combines a trie for prefix organisation with hash buckets for fast leaf-node lookups. It starts as a hash table, then ‘bursts’ into small trie containers when buckets fill; it is extremely cache-efficient because burst containers are small and contiguous. It supports insertion, deletion and prefix search, and can store values/frequencies. The Python module is hat-trie.
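To see what prefix sharing means in practice, here is a minimal pure-Python trie (my own sketch, far less compact than the specialised libraries above but illustrating the structure):

```python
class TrieNode:
    __slots__ = ("children", "is_word")   # __slots__ keeps per-node overhead down

    def __init__(self):
        self.children = {}
        self.is_word = False


class Trie:
    """Minimal prefix tree: common prefixes are stored once and shared."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def keys_with_prefix(self, prefix):
        node = self.root
        for ch in prefix:                  # walk down to the prefix node
            if ch not in node.children:
                return []
            node = node.children[ch]
        results = []

        def collect(n, path):
            if n.is_word:
                results.append(prefix + path)
            for ch, child in sorted(n.children.items()):
                collect(child, path + ch)

        collect(node, "")
        return results


t = Trie()
for w in ["cat", "car", "cart", "dog", "doing"]:
    t.insert(w)
print(t.keys_with_prefix("ca"))   # ['car', 'cart', 'cat']
```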

General tips

Avoid copies

Avoid making unnecessary copies of objects, which fill RAM with redundant data. Instead, use references, NumPy views, or iterators. Remember, Python often reuses (interns) identical immutable strings, which avoids storing the same string many times.
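One standard-library way to avoid a copy is memoryview: slicing a bytes object duplicates the data, while slicing a memoryview only creates a lightweight window onto it:

```python
import sys

data = bytes(10_000_000)                 # 10 MB of raw bytes

copy = data[:5_000_000]                  # slicing bytes copies: another ~5 MB
view = memoryview(data)[:5_000_000]      # a view: no copy of the payload

print(sys.getsizeof(copy))               # millions of bytes
print(sys.getsizeof(view))               # a few hundred bytes at most
print(view[:4].tobytes() == copy[:4])    # same contents either way: True
```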

Profiling

Use the profiling methods to check memory usage while programming. These will help to detect early issues like excessive allocations, memory leaks, and inefficient structures.
Generators

Use generators instead of storing everything in memory: a generator yields one item at a time, so the entire dataset never needs to sit in RAM.
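A minimal comparison of the two approaches:

```python
import sys

# A list materialises all 5 million squares at once...
squares_list = [i * i for i in range(5_000_000)]
# ...while a generator produces them one at a time
squares_gen = (i * i for i in range(5_000_000))

print(sys.getsizeof(squares_list))   # tens of megabytes
print(sys.getsizeof(squares_gen))    # a couple of hundred bytes

# Both can be consumed the same way
print(sum(i * i for i in range(10)))  # 285
```

The trade-off: a generator can only be iterated once and offers no random access, so it suits streaming-style pipelines rather than repeated lookups.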

NumPy

While working with numeric data, use NumPy arrays: they store elements in compact, contiguous buffers and offer many fast vectorised algorithms.

Bitarray

If your code requires lots of bit strings, both the numpy and bitarray packages provide efficient representations of bits packed into bytes.
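The underlying trick — eight flags per byte — can be sketched with just a bytearray (the helper names below are my own):

```python
def make_bitset(n):
    """Allocate storage for n boolean flags, 8 per byte."""
    return bytearray((n + 7) // 8)


def set_bit(bits, i):
    bits[i // 8] |= 1 << (i % 8)


def get_bit(bits, i):
    return bool(bits[i // 8] & (1 << (i % 8)))


flags = make_bitset(1_000_000)     # one million flags in ~125 KB
set_bit(flags, 123_456)
print(get_bit(flags, 123_456))     # True
print(get_bit(flags, 123_457))     # False
print(len(flags))                  # 125000 bytes
```

A list of one million Python bools would need roughly 8 MB just for the pointers; the packed form above uses about 1/64th of that.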

MicroPython

MicroPython is a good fit for embedded systems: its tiny memory footprint allows developers to write high-level, readable Python code while still running on hardware traditionally restricted to low-level C/C++. It brings much of the core Python 3 functionality to devices with extremely limited resources—often with as little as 16-256KB of RAM and flash storage.

Unit test suite

Even as you focus on optimisation, do not forget the fundamentals — the purpose of the code. Make sure to have a unit test suite in place before you make algorithmic changes.
Just to recap, profiling helps you see the problem; effective structures and techniques help you fix it. Together, they form a solid foundation for writing memory-efficient, high-performance Python programs. The general tips and practices help improve performance, reduce memory pressure, and make applications more scalable—especially when working with large datasets or in resource-constrained environments.
