Thread: Python urllib2 utf-8 and threading

  1. #1
    Join Date
    May 2008
    Posts
    191

    Python urllib2 utf-8 and threading

    Hello,

    I am having trouble understanding the threading module, and the few code snippets I have found do not explain much more.

    Here is my simple starting code (it retrieves HTML pages and sends a message if everything is OK):

    #!/usr/bin/env python2.6
    # -*- coding: utf-8 -*-
    import thread
    import urllib2

    class robot:
        def __init__(self, url):
            # keep the list of URLs to visit
            self.__url = url

        def queue(self):
            # fetch the pages one after the other
            for page in self.__url:
                html = self.execute(page)
                if html:
                    self.sendmail("the content is ok")
                else:
                    print "error on page {0}.".format(page)

        def execute(self, url):
            """
            retrieve the html content ...
            """
            return urllib2.urlopen(url).read()

        def sendmail(self, message):
            """
            sending a mail ...
            """
            pass

    if __name__ == '__main__':
        url = ["http://google.com", "http://www.techarena.in"]
        recursive = robot(url)
        recursive.queue()

    It works, but it will be slow (depending on the number of pages, or on pages that block until a timeout ...). In short, I would like to run "execute" in parallel using the threading module.

    The problem is that I need to get the results of this function back in order to run the tests and send the messages, and I do not see how ...

    Thank you to anyone who can enlighten me.

  2. #2
    Join Date
    May 2008
    Posts
    297

    Re: Python urllib2 utf-8 error

    Use Queue.Queue: one queue that feeds the URLs in as input, and a second queue for the output.
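
    A minimal sketch of the two-queue idea, under some assumptions: the names worker, fetch_all and fake_fetch are hypothetical, and fake_fetch only simulates a download so that the example runs without a network. The import is written so that it works under both the Python 2 module name (Queue) and the Python 3 one (queue).

```python
import threading

try:
    import queue            # Python 3 module name
except ImportError:
    import Queue as queue   # Python 2 module name, as used in this thread


def worker(fetch, url_queue, result_queue):
    """Take URLs from url_queue until it is empty; put (url, html) on result_queue."""
    while True:
        try:
            url = url_queue.get_nowait()
        except queue.Empty:
            return  # nothing left to do, the thread ends
        try:
            html = fetch(url)
        except Exception:
            html = None  # a failed download is reported as None
        result_queue.put((url, html))


def fetch_all(urls, fetch, num_threads=4):
    """Run fetch(url) for every URL on a few threads and return {url: html}."""
    url_queue = queue.Queue()
    result_queue = queue.Queue()
    for url in urls:
        url_queue.put(url)  # all input is queued before any thread starts

    threads = [
        threading.Thread(target=worker, args=(fetch, url_queue, result_queue))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Every thread has finished, so the output queue is complete.
    results = {}
    while not result_queue.empty():
        url, html = result_queue.get()
        results[url] = html
    return results


def fake_fetch(url):
    """Hypothetical stand-in for urllib2.urlopen(url).read()."""
    if "bad" in url:
        raise IOError("simulated timeout")
    return "<html>" + url + "</html>"
```

    In the robot class, fetch would presumably be self.execute, and the loop in queue() would then run its test on each (url, html) pair from the returned dictionary. For blocked pages, urllib2.urlopen accepts a timeout argument from Python 2.6 on, for example urllib2.urlopen(url, timeout=10).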

  3. #3
    Join Date
    May 2008
    Posts
    191

    Re: Python urllib2 utf-8 error

    I used the multiprocessing module instead, which is more efficient for what I have to do.

    I modified the queue() function:

    # needed at the top of the module
    import datetime
    from multiprocessing import Process

    def queue(self):
        time_start = datetime.datetime.now()
        processes = []
        for page in self.__url:
            # args must be a tuple, hence the trailing comma
            p = Process(target=self.execute, args=(page,))
            p.start()
            processes.append(p)
        # wait for every process to finish
        for p in processes:
            p.join()
        # for page in html:
        #     if page:
        #         self.sendmail("the content is ok")
        #     else:
        #         print "error on page {0}"
        time_end = datetime.datetime.now()
        print "execute in {0} sec.".format(time_end - time_start)

    This works, but I have a few questions:
    - A process does not return anything. This means that although my execute function returns the HTML code, I cannot recover it. Should I move the test on the HTML code directly into execute()?
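
    One possible way to get results back is a pool's map(), which collects each worker's return value. The sketch below is an assumption-laden illustration, not the poster's code: it swaps in multiprocessing.pool.ThreadPool (threads, with the same map interface as multiprocessing.Pool) rather than Process, and check_page, check_all and fake_fetch are hypothetical names. The test on the HTML is done inside the worker, as the question suggests, and only a small (url, ok) pair is returned.

```python
from multiprocessing.pool import ThreadPool


def fake_fetch(url):
    """Hypothetical stand-in for urllib2.urlopen(url).read()."""
    if "bad" in url:
        raise IOError("simulated timeout")
    return "<html>" + url + "</html>"


def check_page(url):
    """Fetch one page and test it inside the worker; return (url, ok)."""
    try:
        html = fake_fetch(url)
    except Exception:
        return (url, False)  # download failed
    return (url, bool(html))  # ok only if some content came back


def check_all(urls, workers=4):
    """Check every URL in parallel; map() returns results in input order."""
    pool = ThreadPool(workers)
    try:
        return pool.map(check_page, urls)
    finally:
        pool.close()
        pool.join()
```

    The caller can then loop over the returned pairs and call sendmail or print the error for each page. Switching to multiprocessing.Pool should be possible with the same map() call, but the worker then has to be picklable (a module-level function); in Python 2 a bound method such as self.execute generally is not.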
