Python/Urllib2/Threading: Single download thread faster than multiple download threads. Why?

Nine women can't combine to make a baby in one month. If your connection is the bottleneck, 10 threads each get only about 10% of the bandwidth a single thread would have, and you pay additional overhead for context switching, etc.
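To put rough numbers on this, here is a minimal sketch of a toy model in Python 3. It assumes the link is the only bottleneck and is shared evenly between threads. The function name estimated_seconds and the overhead_per_thread parameter are made up for illustration and are not measured values.

```python
def estimated_seconds(total_bytes, link_bytes_per_sec, threads,
                      overhead_per_thread=0.0):
    """Toy estimate of wall-clock time when threads share one saturated link."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    # Each thread gets an equal slice of the link...
    per_thread_rate = link_bytes_per_sec / threads
    # ...and an equal slice of the work, so the two factors cancel out.
    per_thread_bytes = total_bytes / threads
    transfer_time = per_thread_bytes / per_thread_rate
    # Assumed fixed cost per thread (startup, context switching, etc.).
    return transfer_time + overhead_per_thread * threads


if __name__ == "__main__":
    for n in (1, 10):
        print(n, estimated_seconds(100_000_000, 10_000_000, n, 0.05))
```

Under these assumptions the transfer time is the same for 1 or 10 threads, so any per-thread overhead only makes the multi-threaded run slower. The model says nothing about cases where the server or latency, rather than your link, is the limit.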

Thanks a lot. I get it now. – Nedy Nov 19 '10 at 4:03

Python threading uses something called the GIL (Global Interpreter Lock), which sometimes degrades a program's execution time. Without going into a lot of detail here, I invite you to read this and this; maybe they can help you understand your problem. You can also watch the two conference talks here and here. Hope this helps :).

The GIL is released while waiting for I/O, so this is not a case of GIL weirdness. – andreypopp Nov 18 '10 at 22:42

@andreypopp: have you looked at the links in my answer? I/O-bound processes are also "affected" (intentionally) by the GIL when we use a multi-core processor (which is often the case these days). I didn't want to explain everything I know about the GIL and I/O-bound and CPU-bound processes, because I thought the conference videos would do better than my poor knowledge and English. So take a look at the links; they talk about the OP's case. – mouad Nov 18 '10 at 23:27

Thanks for the video. Very helpful. – Nedy Nov 19 '10 at 1:38

Sorry, I don't have time right now to watch the video you mention, but I guess you're talking about having both CPU-bound and I/O-bound threads in the same process. That can be an issue, but purely I/O-bound threads work just fine. Of course you cannot spawn thousands of threads like you can in Erlang, but having 3-5 threads for concurrent downloads is a common situation. – andreypopp Nov 19 '10 at 6:41
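A small pool of download threads, as described in the comment above, might look like the following sketch. It is written for Python 3, where urllib.request replaces urllib2. The names fetch_url and download_all, the worker count of 4, and the 10-second timeout are illustrative assumptions, not something taken from the question.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch_url(url, timeout=10):
    """Blocking download of one URL; the thread waits on the socket here."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def download_all(urls, fetch=fetch_url, max_workers=4):
    """Download every URL with a small thread pool; returns {url: body}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps results in the same order as the input URLs.
        bodies = list(pool.map(fetch, urls))
    return dict(zip(urls, bodies))
```

The fetch parameter is there so the blocking call can be swapped out, for example with a stub when testing. Whether 4 workers beats 1 depends on where the bottleneck is: if your own link is already saturated by one download, extra threads will not help.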

Twisted uses non-blocking I/O. That means that if data is not available on a socket right now, the call doesn't block the entire thread, so you can handle many socket connections waiting for I/O in one thread simultaneously. But if you do something other than I/O (parsing large amounts of data, for example), you still block the thread.

The stdlib's socket module does blocking I/O by default. That means that when you call socket.recv and data is not available at the moment, it blocks the entire thread, so you need one thread per connection to handle concurrent downloads.

These are two approaches to concurrency: start a new thread for each connection (threading + socket from stdlib), or multiplex I/O and handle many connections in one thread (Twisted).
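The multiplexing approach can be sketched without Twisted by using the standard library's selectors module. This is only a simplified stand-in for what an event loop does, not Twisted's API. The function names read_all_multiplexed and demo are hypothetical, and socket.socketpair() is used in place of real network connections so the example runs on its own.

```python
import selectors
import socket


def read_all_multiplexed(socks):
    """Read every socket to EOF from a single thread; returns {index: bytes}."""
    sel = selectors.DefaultSelector()
    buffers = {}
    for i, sock in enumerate(socks):
        sock.setblocking(False)  # recv() will no longer block the thread
        buffers[i] = bytearray()
        sel.register(sock, selectors.EVENT_READ, data=i)

    remaining = len(socks)
    while remaining:
        # Sleep until at least one socket has data (or has been closed).
        for key, _events in sel.select():
            chunk = key.fileobj.recv(4096)
            if chunk:
                buffers[key.data].extend(chunk)
            else:
                # Empty read means the peer closed the connection.
                sel.unregister(key.fileobj)
                remaining -= 1
    sel.close()
    return {i: bytes(buf) for i, buf in buffers.items()}


def demo():
    """Feed three in-process socket pairs and read them all in one thread."""
    pairs = [socket.socketpair() for _ in range(3)]
    for n, (writer, _reader) in enumerate(pairs):
        writer.sendall(b"payload-%d" % n)
        writer.close()
    result = read_all_multiplexed([reader for _writer, reader in pairs])
    for _writer, reader in pairs:
        reader.close()
    return result
```

The thread-per-connection approach would instead call a blocking recv in a loop inside each thread. Here one thread waits on all sockets at once and only touches the ones that are ready, which is the core idea behind Twisted's reactor, although Twisted adds protocols, timers, and much more on top.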

