Python Global Interpreter Lock: Is It Good or Bad?

Chandra Shekhar Sahoo
Jun 20, 2021

In this post we will look at how Python's GIL works, what its advantages and disadvantages are, and how to work around its shortcomings.

So the first question is: what exactly does GIL (Global Interpreter Lock) mean in Python?

The GIL is a mutex in CPython, the reference implementation of Python, that protects access to Python objects by preventing multiple threads from executing Python bytecode at once. It guards the interpreter's internal state against race conditions and keeps its memory management thread safe.

In simple words, the GIL makes sure that at any moment only one thread is executing Python code. Let's visualize this:

Figure: How the GIL works

In the figure above there are three threads. Initially thread 1 is running and holds the GIL. When thread 1 performs an I/O operation it releases the GIL, which is then acquired by thread 2. The cycle continues, with the GIL passing from thread to thread until the program finishes.
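Besides I/O, CPython also asks a running thread to give up the GIL after a short time slice, so that a busy thread cannot hold it forever. As a small sketch (the variable name `interval` is just for illustration), the current time slice can be read with `sys.getswitchinterval()`:

```python
import sys

# How long a thread may keep running before CPython asks it to
# release the GIL so that another thread can be scheduled.
interval = sys.getswitchinterval()
print(f"switch interval: {interval} seconds")
```

The value can be changed with `sys.setswitchinterval()`, although the default is rarely worth tuning.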

Now that it is clear what the GIL is, why is it actually required?

The answer: to avoid race conditions.

Suppose a variable has the value:

c = 5

We have two tasks, and each task is assigned to a different thread:

Task 1: c = c * c => Thread 1

Task 2: c = c + c => Thread 2

If task 1 runs first, followed by task 2:

c = 5 * 5 = 25 => Task 1

c = 25 + 25 = 50 => Task 2 (final value of c is 50)

If task 2 runs first, followed by task 1:

c = 5 + 5 = 10 => Task 2

c = 10 * 10 = 100 => Task 1 (final value of c is 100)

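So we can clearly see how the order in which the threads run changes the final value of c. This situation is a race condition. Note that the GIL does not decide the order of whole statements like these in your own code. It protects the interpreter's internals, and application-level races still need explicit synchronization such as a lock.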
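The two orderings can be reproduced with a minimal sketch. The names `state`, `square`, `double`, and `run_in_order` are hypothetical helpers, not part of any library. Joining each thread before starting the next one forces the order, which is why both outcomes are repeatable here:

```python
import threading

# Shared value from the example above, kept in a dict so both tasks can update it.
state = {"c": 5}

def square():
    """Task 1: c = c * c"""
    state["c"] = state["c"] * state["c"]

def double():
    """Task 2: c = c + c"""
    state["c"] = state["c"] + state["c"]

def run_in_order(tasks, start=5):
    """Run each task in its own thread, one after another, and return the final c."""
    state["c"] = start
    for task in tasks:
        worker = threading.Thread(target=task)
        worker.start()
        worker.join()  # wait for this thread before starting the next one
    return state["c"]

print(run_in_order([square, double]))  # Task 1 then Task 2 -> 50
print(run_in_order([double, square]))  # Task 2 then Task 1 -> 100
```

Without the `join()` calls (or a lock), nothing in the program would decide which task runs first.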

Another reason for the GIL is that Python uses reference counting for memory management. Each object has a reference count that tracks the number of references pointing to it. When this count reaches zero, the memory occupied by the object is released. Let's see this in code:

import sys
x = [3, 4]
y = x
# Counts x, y, and the temporary reference held by getrefcount's own argument
print(sys.getrefcount(x))
Output
3

We need to protect the reference count from race conditions in which two threads increase or decrease its value simultaneously. Otherwise the result can be leaked memory that is never released or, worse, memory that is released while a reference to the object still exists.
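To make the counter itself visible, here is a small sketch that measures how it moves when a reference is added and removed. The names `obj` and `holders` are illustrative only, and the absolute numbers can differ between CPython versions, so only the differences are printed:

```python
import sys

obj = [3, 4]
holders = []  # a container we use to create and drop an extra reference

base = sys.getrefcount(obj)

holders.append(obj)            # one more reference to obj
after_add = sys.getrefcount(obj)

holders.pop()                  # the extra reference is dropped again
after_remove = sys.getrefcount(obj)

print(after_add - base)        # 1
print(after_remove - base)     # 0
```

Each of these increments and decrements is the kind of update that must not be interleaved between two threads.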

Impact of GIL on multi-threaded Python programs

A program's work is generally of two kinds: CPU bound and I/O bound. A CPU-bound task spends most of its time computing and pushes the CPU to its limit, whereas an I/O-bound task spends most of its time waiting for input/output. CPU-bound threads cannot benefit from multithreading, because the GIL allows only one thread to execute Python code at any point in time. I/O-bound threads can still perform well, because a thread releases the GIL while it waits for I/O.

So CPU-intensive threads are the ones that suffer: because of the GIL, Python threads cannot run Python code in parallel on multiple CPU cores.
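A rough way to observe this is to time the same CPU-bound work run sequentially and in two threads. This is only a sketch: `count_down`, `timed`, and the value of `N` are made up for illustration, and the timings depend on the machine and the Python build (a free-threaded build without the GIL would behave differently).

```python
import time
from concurrent.futures import ThreadPoolExecutor

def count_down(n):
    """Pure-Python CPU-bound loop; returns how many steps it ran."""
    steps = 0
    while n > 0:
        n -= 1
        steps += 1
    return steps

def timed(fn):
    """Call fn() and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start

N = 2_000_000  # arbitrary amount of work per task

def sequential():
    return [count_down(N), count_down(N)]

def threaded():
    with ThreadPoolExecutor(max_workers=2) as pool:
        return list(pool.map(count_down, [N, N]))

seq_result, seq_time = timed(sequential)
thr_result, thr_time = timed(threaded)
print(f"sequential: {seq_time:.2f}s, two threads: {thr_time:.2f}s")
```

On a standard CPython build the two timings usually come out close to each other, since the threads take turns holding the GIL instead of running side by side.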

How to deal with GIL shortcomings?

1. For I/O Bound tasks:

The GIL has little impact on I/O-bound threads, because most of their time goes to I/O operations. While one thread waits for I/O, the GIL is free for other threads to run. The following code illustrates this:

import requests
from concurrent.futures import ThreadPoolExecutor

list_html_download = [
    {'name': 'google', 'url': 'http://google.com'},
    {'name': 'youtube', 'url': 'http://youtube.com'}
]

def page_download(page):
    '''Downloads and saves the webpage'''
    r = requests.get(page['url'])
    with open(page['name'] + '.html', 'w', encoding='utf-8') as save_file:
        save_file.write(r.text)


if __name__ == '__main__':
    # The with block waits for all submitted downloads before exiting
    with ThreadPoolExecutor(max_workers=5) as pool:
        for page in list_html_download:
            pool.submit(page_download, page)

Here the pool allows up to 5 worker threads, and each download is I/O bound: a thread sends a request to the URL and waits for the response before it resumes its work. ThreadPoolExecutor lets us implement multithreading for this scenario.
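The benefit can be seen without touching the network by letting `time.sleep` stand in for the wait on a response. This is an assumption made for the sketch, and `fake_download` and `names` are hypothetical. `time.sleep` releases the GIL in the same way a blocking I/O call does, so the four waits overlap:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_download(name, delay=0.2):
    """Pretend to download a page; the sleep stands in for waiting on I/O."""
    time.sleep(delay)  # the GIL is released while this thread sleeps
    return f"{name} done"

names = ["page-a", "page-b", "page-c", "page-d"]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fake_download, names))
elapsed = time.perf_counter() - start

print(results)
print(f"elapsed: {elapsed:.2f}s")
```

Run one after another, the four calls would need about 0.8 seconds in total. With four threads the elapsed time should stay close to a single delay.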

2. For CPU Bound tasks:

Because the GIL limits what multithreading can do for CPU-bound tasks, in this situation we can turn to Python's multiprocessing.

How is Python multithreading different from multiprocessing?

In the multiprocessing approach we use multiple processes instead of threads. Each Python process gets its own Python interpreter and memory space, and each interpreter has its own GIL, so the GIL of one process does not block another. Hence multiprocessing gives us the parallelism that multithreading cannot provide for CPU-bound work. The following code shows the multiprocessing API, reusing the download example.

'''Download webpages in separate processes, using `multiprocessing`.'''
import requests
import time
import multiprocessing

download_list = [
    {'name': 'google', 'url': 'http://google.com'},
    {'name': 'youtube', 'url': 'http://youtube.com'}
]

def status_update():
    '''Print 'Still downloading' at regular intervals.'''
    while True:
        print('Still downloading')
        time.sleep(0.1)


def download_page(page_info):
    '''Download and save webpage.'''
    r = requests.get(page_info['url'])
    with open(page_info['name'] + '.html', 'w', encoding='utf-8') as save_file:
        save_file.write(r.text)


if __name__ == '__main__':
    # Daemon process: it is terminated automatically when the main process exits
    status = multiprocessing.Process(target=status_update)
    status.daemon = True
    status.start()

    downloaders = []
    for download in download_list:
        downloader = multiprocessing.Process(target=download_page,
                                             args=(download,))
        downloader.start()
        downloaders.append(downloader)

    # Wait for every download to finish before the main process exits
    for downloader in downloaders:
        downloader.join()

With concurrent.futures, replacing ThreadPoolExecutor with ProcessPoolExecutor launches processes instead of threads.
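For a task that is actually CPU bound, the swap might look like the sketch below. The functions `count_primes_below` and `count_in_processes` are hypothetical examples, and trial division is used only because it keeps the CPU busy, not because it is an efficient way to count primes:

```python
from concurrent.futures import ProcessPoolExecutor

def count_primes_below(limit):
    """CPU-bound work: count the primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

def count_in_processes(limits, workers=2):
    """Spread the limits over worker processes, each with its own interpreter and GIL."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_primes_below, limits))

if __name__ == "__main__":
    # The guard is needed because worker processes may re-import this module.
    print(count_in_processes([10, 100, 1000]))  # [4, 25, 168]
```

Starting processes and passing data between them has a cost, so this pays off only when each task does enough computation to outweigh that overhead.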

Conclusion

In simple words: in Python, if tasks are CPU intensive go for multiprocessing, and if tasks are I/O intensive go for multithreading. In this post we learned what the GIL is, why Python uses it, how it impacts multithreading, and ways to work around its shortcomings.

I hope this post was helpful. Stay tuned for more topics in future posts. Till then:

Code Everyday and Learn Everyday!
