Faster Execution using Threads

Chandra Shekhar Sahoo
Oct 4, 2020

In this blog, we will learn multithreading concepts using Python's ThreadPoolExecutor. Multithreading helps a program make better use of the system hardware managed by the operating system. In programming, the efficiency of a system or architecture is often judged by its execution time, and multithreading can reduce that time for suitable workloads.

Let's dive in.

Multi-threading

Definition: In computer architecture, multithreading is the ability of a central processing unit (CPU) (or a single core in a multi-core processor) to provide multiple threads of execution concurrently, supported by the operating system.

Here, multiple threads are created within a process so that it can make progress on several tasks at once. The threads run concurrently, but in standard CPython the global interpreter lock (GIL) lets only one thread execute Python bytecode at a time. Multithreading is therefore most effective when the program does a lot of I/O, such as reading or writing multiple files, because a thread that is waiting on I/O lets the other threads run. At the hardware level, multithreading aims to increase utilization of a single core by using thread-level parallelism as well as instruction-level parallelism.
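To make the idea concrete, here is a minimal sketch using the lower-level threading module from the standard library. The names wait_for_io and run_in_threads are hypothetical helpers invented for this illustration, and time.sleep stands in for real I/O such as reading a file.

```python
import threading
import time


def wait_for_io(task_id, delay, finished):
    """Stand-in for an I/O-bound task: sleep, then record that it finished."""
    time.sleep(delay)  # sleeping releases the GIL, much like blocking I/O does
    finished.append(task_id)  # list.append is thread-safe in CPython


def run_in_threads(n_tasks=3, delay=0.2):
    """Start one thread per task, wait for all of them, return (elapsed, finished)."""
    finished = []
    threads = [
        threading.Thread(target=wait_for_io, args=(i, delay, finished))
        for i in range(n_tasks)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()  # begin running the task in its own thread
    for t in threads:
        t.join()  # block until that thread has finished
    elapsed = time.perf_counter() - start
    return elapsed, finished


if __name__ == "__main__":
    elapsed, finished = run_in_threads()
    print(f"{len(finished)} tasks finished in {elapsed:.2f}s")
```

Because the three tasks wait at the same time, the total should be close to one delay rather than three. The order in which tasks finish is not guaranteed.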

Figure: Multithreading and Multiprocessing

Multi-threading implementation in Python

We use ThreadPoolExecutor from concurrent.futures, a module in the Python standard library.

To demonstrate the efficiency of multithreading, we compare two cases:

1) 5 tasks with 1 thread

from concurrent.futures import ThreadPoolExecutor
import datetime
import time


def some_work(x):
    # Simulate an I/O-bound task by sleeping
    time.sleep(0.6)
    print(f"Thread {x} executed")


if __name__ == '__main__':
    # Case 1: 5 tasks with 1 thread
    n_tasks = 5
    n_threads = 1
    start_time = datetime.datetime.now()
    with ThreadPoolExecutor(max_workers=n_threads) as s:
        for x in range(n_tasks):
            s.submit(some_work, x)
    # Leaving the with block waits for all submitted tasks to finish
    print(f"Time taken to execute Program :{datetime.datetime.now() - start_time}")

Output:

Thread 0 executed
Thread 1 executed
Thread 2 executed
Thread 3 executed
Thread 4 executed
Time taken to execute Program :0:00:03.006654

2) 5 tasks with 5 threads

from concurrent.futures import ThreadPoolExecutor
import datetime
import time


def some_work(x):
    # Simulate an I/O-bound task by sleeping
    time.sleep(0.6)
    print(f"Thread {x} executed")


if __name__ == '__main__':
    # Case 2: 5 tasks with 5 threads
    n_tasks = 5
    n_threads = 5
    start_time = datetime.datetime.now()
    with ThreadPoolExecutor(max_workers=n_threads) as s:
        for x in range(n_tasks):
            s.submit(some_work, x)
    # Leaving the with block waits for all submitted tasks to finish
    print(f"Time taken to execute Program :{datetime.datetime.now() - start_time}")

Output:

Thread 2 executed
Thread 1 executed
Thread 0 executed
Thread 4 executed
Thread 3 executed
Time taken to execute Program :0:00:00.604016

Let’s take a look at how this code works:

  • ThreadPoolExecutor is imported from concurrent.futures.
  • A with statement creates a ThreadPoolExecutor instance s. When the block exits, it waits for all submitted tasks to finish and then cleans up the threads.
  • Five jobs are submitted to s, one for each value of x in range(5), that is, 0 to 4.
  • Each call to submit returns a Future instance. The programs above do not keep these, because some_work() only prints a message and returns nothing.
  • To retrieve return values, the Future instances can be stored in a list and passed to as_completed, which yields each Future as its some_work() call completes.
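The two timing programs submit tasks without keeping the returned Future objects, so here is a minimal sketch of how results could be collected with as_completed. It assumes a hypothetical variant of some_work, named some_work_with_result, that returns a value instead of printing. The helper collect_results is also invented for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time


def some_work_with_result(x):
    """Hypothetical variant of some_work that returns a value."""
    time.sleep(0.1)
    return x * x


def collect_results(n_tasks=5, n_threads=5):
    """Submit n_tasks jobs and return a dict mapping each input to its result."""
    results = {}
    with ThreadPoolExecutor(max_workers=n_threads) as executor:
        # Map each Future back to the input it was submitted with.
        future_to_input = {
            executor.submit(some_work_with_result, x): x for x in range(n_tasks)
        }
        # as_completed yields futures in completion order, not submission order.
        for future in as_completed(future_to_input):
            x = future_to_input[future]
            # result() returns the task's return value, or re-raises its exception.
            results[x] = future.result()
    return results


if __name__ == "__main__":
    for x, value in sorted(collect_results().items()):
        print(f"Task {x} returned {value}")
```

Keeping a dictionary from Future to input is one common way to know which task a result belongs to, since completion order can differ from run to run.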

Observation:

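On the machine used in this tutorial, the program took ~3.0067 seconds with a single worker thread and ~0.6040 seconds with five worker threads. This matches what we would expect: with one thread, the five 0.6-second sleeps run one after another (5 × 0.6 = 3.0 seconds), while with five threads they overlap and finish in roughly the time of one sleep. Our program ran about five times faster with more threads.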
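The same reasoning can be checked for other pool sizes. The sketch below is a rough model, not a guarantee: it assumes tasks of equal length that only sleep, so they run in "waves" of at most n_threads at a time. The helpers timed_run and expected_time are hypothetical names introduced for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import math
import time


def timed_run(n_tasks, n_threads, delay):
    """Run n_tasks sleeping tasks on n_threads workers; return elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_threads) as executor:
        for _ in range(n_tasks):
            executor.submit(time.sleep, delay)
    # Leaving the with block waits for every submitted task to finish.
    return time.perf_counter() - start


def expected_time(n_tasks, n_threads, delay):
    """Rough estimate: equal tasks run in waves of at most n_threads at a time."""
    return math.ceil(n_tasks / n_threads) * delay


if __name__ == "__main__":
    for n_threads in (1, 2, 5):
        measured = timed_run(5, n_threads, 0.6)
        estimate = expected_time(5, n_threads, 0.6)
        print(f"{n_threads} thread(s): measured {measured:.2f}s, estimated {estimate:.2f}s")
```

Measured times will be slightly above the estimate because of thread startup and scheduling overhead. For CPU-bound work the model does not apply, since the threads would compete for the GIL.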

Conclusion:

In this blog, we used Python's ThreadPoolExecutor to run I/O-bound code efficiently. We created a function that is invoked within threads, saw that each submitted task is represented by a Future from which its output can be retrieved, and observed the performance boost gained by using threads.

For more information on other concurrency functions, see the documentation for concurrent.futures.

I will try to bring some more exciting topics in future blogs. Till then:

Code Every Day and Learn Every Day!
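As one example of those other functions, ThreadPoolExecutor also provides a map method. The sketch below assumes a hypothetical I/O-bound function, fetch_length, invented for this illustration. It shows that map returns results in the order of the inputs.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def fetch_length(word):
    """Hypothetical stand-in for an I/O-bound call; returns the word's length."""
    time.sleep(0.1)
    return len(word)


def lengths_in_order(words, n_threads=3):
    """Apply fetch_length to every word using a thread pool."""
    with ThreadPoolExecutor(max_workers=n_threads) as executor:
        # map yields results in input order, even if tasks finish out of order.
        return list(executor.map(fetch_length, words))


if __name__ == "__main__":
    print(lengths_in_order(["thread", "pool", "executor"]))
```

map is convenient when every task calls the same function and the results are needed in input order. submit with as_completed is the better fit when results should be handled as soon as each one is ready.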

Chandra Shekhar Sahoo

Freelancer, Python developer by profession, and ML enthusiast