[Twisted-Python] client crashed

Andrew Bennetts andrew-twisted at puzzling.org
Sat Jan 13 07:46:52 EST 2007


Rick Graves wrote:
[...]
> 
> def _parse(url, defaultPort=None):
>     parsed = urlparse.urlparse(url)
>     scheme = parsed[0]
>     path = urlparse.urlunparse(('','')+parsed[2:])
>     if defaultPort is None:
>         if scheme == 'https':
>             defaultPort = 443
>         else:
>             defaultPort = 80
>     host, port = parsed[1], defaultPort
>     if ':' in host:
>         host, port = host.split(':')
>         port = int(port)
>     return scheme, host, port, path
> 
> 
> 
> Here is the end of the traceback:
> 
>  File "/usr/lib/python2.4/site-packages/twisted/web/client.py", line 376, in _parse
>     port = int(port)
> exceptions.ValueError: invalid literal for int():
> 
> 
> I was running a script on a big list of URLs that I scraped off a Google query.
> 
> I looked at each URL including a colon, but nothing jumped out at me.

Here's an example that will trigger that exception: a URL with a colon after the
host but no port number, so int() is called on an empty string:

    >>> _parse('http://foo:/')
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "<stdin>", line 13, in _parse
    ValueError: invalid literal for int(): 
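
To see where the empty string comes from, here is a minimal sketch of just the
host/port step from the quoted function. It is written for Python 3 (the code
above is Python 2), and split_host_port is a hypothetical name used only for
illustration, not something in Twisted:

```python
def split_host_port(netloc, default_port=80):
    """Mimic the host/port split in the quoted _parse (hypothetical helper)."""
    host, port = netloc, default_port
    if ':' in netloc:
        # 'foo:'.split(':') gives ['foo', ''], so port is the empty string
        host, port = netloc.split(':')
        port = int(port)  # int('') raises ValueError
    return host, port


print(split_host_port('foo:8080'))  # ('foo', 8080)
try:
    split_host_port('foo:')
except ValueError as e:
    print('ValueError:', e)
```

Anything after the colon that int() cannot convert fails the same way; the
empty string is just the hardest case to spot by eye in a long list.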

If you're having trouble diagnosing the cause of your problem, I suggest putting
a try-except in your code to log the URL triggering it:

        try:
            port = int(port)
        except ValueError:
            # Log the offending URL, then re-raise so the traceback is kept.
            print "Bad port, url is:", url
            raise

You could also put the try/except around the entire body of the function.
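
Another option is to check the list before fetching anything. The following is
only a sketch, assuming Python 3 and its urllib.parse module (the original code
uses Python 2's urlparse); find_bad_ports is a hypothetical helper, and it
repeats the same split-and-int() logic as the quoted _parse rather than calling
into Twisted:

```python
from urllib.parse import urlparse


def find_bad_ports(urls):
    """Return the URLs for which the quoted _parse logic would raise ValueError."""
    bad = []
    for url in urls:
        try:
            netloc = urlparse(url)[1]
            if ':' in netloc:
                # Same steps as _parse: split on the colon, convert the port
                host, port = netloc.split(':')
                int(port)
        except ValueError:
            bad.append(url)
    return bad


urls = [
    'http://foo:/',
    'http://example.com:8080/',
    'http://example.com/a:b',
]
for url in find_bad_ports(urls):
    print("Bad port, url is:", url)
```

Only the network location is checked, so a colon in the path (as in the last
URL above) is not reported.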

Anyway, there's nothing Twisted-specific here that I can see: the ValueError
comes from int() being handed a port string that isn't a number.

-Andrew.




