Working with Threads

Sun 14 December 2014

Filed under Python

Working with Threads

I've been encouraged to learn sockets and threading. It's been fun. With some frustrating moments when the book I am using and the docs are just not clear enough - for me. So here I present it in a way that would have worked for me. So I share - in case it makes learning about threads easier for someone.

First, a scenario where threading can be useful: You're using one of those websites that check several airlines for best fares. Finally the trip to Barcelona! It would take a long time to search each airline separately. It woud be so much faster if the computer would search several airlines concurrently. Thankfully, that is what they do - saving us much time.

That's exactly what threading does. The program sets off to do several things at the same time - each function on its own thread. This is especially useful when the functions don't rely on or get data from each other. Thread1 searches American Airlines, Thread2 searches Air France and Thread3 searches Iberia. And so on. And they each come back with their results when they are done. And then you can choose the best flight to Barcelona.

So that's what threading does. Sure, that's oversimplified. And, I'm not sure how those flight aggregator programs really work, but that is a scenario where we'd see the benefits of threads.

What's happening under the hood of threads?

First some basic things to know: - Threads execute parellel within the same process - and as such, they share all the same context (name space, address, etc.) - Threads have a beginning, the execution and an end. - locks. Gotto add more here.

And here's how it all works - the pseudo code. 1. create the thread instances 2. execute the code 3. exit the thread and do whatever

So in the case of the search for best tickets to Barcelona. Nah, that would be too much code; here is a much simpler scenario. Trying to find the local time in several places in the world. (Of course, there is no real need to thread for these queries, since they are super quick. But this will give an idea about how threading works.)

  1. Get the search query - what places we are looking for.

    • these are hardcoded in a dictionary called places.
    • we'll use www.timeapi.org to find the local time for those places.
  2. Create a thread instance for each place

    • in a for loop, we will create a thread for each query.
  3. Execute the concurrent query searches.

  4. When all query results come in, display the places and times on the screen.

Here's the code:

import threading
import requests
from time import ctime
from atexit import register

url_lookup = 'http://www.timeapi.org/{}/now'

places = {'Paris': 'cet',
            'Moscow': 'msk',
            'Kuwait':  'ast',
            'New York': 'est',
            'Copenhagen': 'cet'}  #place: time_zone code

def _display_time(place):
    time_zone = places[place]
    current_time = requests.get(url_lookup.format(time_zone)).content    
    print '{:>10} | {:<15}'.format(place, current_time[11:19])

def main():
    print 'starting at:', ctime()
    print '-'*30
    for place in places:
        threading.Thread(target=_display_time, args=(place,)).start()

@register
def _atexit():
    print '-'*30
    print 'all done at:', ctime()

if __name__ == '__main__':
    main()

First the imports.

url_lookup = 'http://www.timeapi.org/{}/now': Then, the url that we'll use to find the time. Notice the curly braces {}. We'll use that later to insert the time_zone., with .format.

places: This is a dictionary of places and time zones codes (as this API likes it).

def _display_time(place): This function

  1. gets the time zone (the value of the key place).

  2. gets the current time - using the url with the time_zone inserted

    • uses the requests method. Sends a 'GET' request to the url; the response will be the .content of the requests.get. See more about requests in my previous post and in the docs.
    • uses the .format method to insert the string data. Read more about .format
  3. prints/ displays the place and current_time.

    • {:>10} is the formatting - right aligned - 10 width.
    • {:<15} - left aligned - 15 width
    • Read more about formatting with .format
    • current_time[11:19] - the time comes back as a long string 2014-12-14T01:47:46+00:00. I just sliced it to get the data I need. More about slicing

def main(): This is where the threading happens!

  1. print the current time. Just to see how long the process takes. To compare to the time we print at the very end of this whole script.

  2. a for loop for each place in the places dictionary.

  3. threading.Thread(target=_display_time, args=(place,)).start(): OK, a lot going on here. This is where all the threading happens.

    • threading.Thread: A Thread is an object that represents a single thread for execution. We're in a loop here, so we'll be creating several instances of Thread.
    • The Thread class has these arguments threading.Thread(group=None, target=None, name=None, args=(), kwargs={}) We don't need all those in our case. see more about Thread
    • target=_display_time: the target is the function that will be called (run) when the Thread is created. In this case it is the _display_time function.
    • args=(place,): these are the args for the target. In this case, the place parameter for the _display_time function.
    • .start(): gets the thread started. The start is what actually gets the target/ _display_time function started.

@register is a decorator for the _atexit() function. Together, they are a function that registers an exit function - to do something just before the script quits. In this case, print the current time to compare it to the start time. More about atexit()

So there, a very simple threading process.


Of course, there is much more to threading. A very clear, easy to understand resource. http://pymotw.com/2/threading/


Comments


DeeKras.com © Dee Kras Powered by Pelican and Twitter Bootstrap. Icons by Font Awesome and Font Awesome More