Wednesday, November 10, 2010

Validating SSL server certificate with Python 2.x

SSL stands for Secure Sockets Layer and is designed to create a secure connection between client and server. Secure means that the connection is encrypted and therefore protected from eavesdropping. It also allows the client to validate the site's identity when connecting over HTTPS.


However, there is a bug in the ssl module from the standard library of Python 2.x that allows a successful MITM attack using a valid certificate from another site. When connecting, the module checks that the server certificate is valid and correctly signed by a root certificate, but it does not check that the certificate actually belongs to the site, i.e. that the site name matches the name specified in the certificate.


It is still possible to validate server identity manually in Python 2.6. Let's start with an illustration of the vulnerability. The following snippet should fail: it replaces the HOST name "www.google.com" with its IP address before connecting. If you try to open this IP in Chrome, like https://74.125.232.50, it will show an error, but the ssl library will not throw an exception.

import socket
import ssl

HOST = "www.google.com"
PORT = 443

# replace HOST name with IP, this should fail connection attempt,
# but it doesn't in Python 2.x
HOST = socket.getaddrinfo(HOST, PORT)[0][4][0]
print(HOST)

# create socket and connect to server
# server address is specified later in connect() method
sock = socket.socket()
sock.connect((HOST, PORT))

# wrap socket to add SSL support
sock = ssl.wrap_socket(sock,
    # flag that certificate from the other side of connection is
    # required and should be validated when wrapping
    cert_reqs=ssl.CERT_REQUIRED,
    # file with root certificates
    ca_certs="cacert.pem"
)



You will need the cacert.pem file with root certificates. Just grab the latest version from http://curl.haxx.se/ca/cacert.pem. The code above won't give you any error. Replace the HOST value with www.debian-administration.org to check that certificate validation actually works. This site's certificate is not signed by any root certificate from cacert.pem, so you get an error.

To validate that a certificate matches the requested site, you need to check the commonName field in the subject of the certificate. This information can be accessed with the getpeercert() method of the wrapped socket.


import socket
import ssl

HOST = "www.google.com"
PORT = 443

# replace HOST name with IP, this should fail connection attempt
HOST = socket.getaddrinfo(HOST, PORT)[0][4][0]
print(HOST)

# create socket and connect to server
# server address is specified later in connect() method
sock = socket.socket()
sock.connect((HOST, PORT))

# wrap socket to add SSL support
sock = ssl.wrap_socket(sock,
    # flag that certificate from the other side of connection is
    # required and should be validated when wrapping
    cert_reqs=ssl.CERT_REQUIRED,
    # file with root certificates
    ca_certs="cacert.pem"
)

# manual check of hostname
cert = sock.getpeercert()
for field in cert['subject']:
    if field[0][0] == 'commonName':
        certhost = field[0][1]
        if certhost != HOST:
            raise ssl.SSLError(
                "Host name '%s' doesn't match certificate host '%s'"
                % (HOST, certhost))
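
The manual check above can also be packaged as a small helper that works on the dictionary returned by getpeercert(). The sketch below is only one possible way to do it. The function names common_names and check_hostname and the sample certificate dictionary are made up for illustration and are not part of the ssl module. It compares names exactly (ignoring case) and does not handle wildcard certificates or subjectAltName entries, which a complete check would also need.

```python
def common_names(cert):
    """Return all commonName values from a getpeercert()-style dict."""
    names = []
    # 'subject' is a tuple of RDNs; each RDN is a tuple of (key, value) pairs
    for rdn in cert.get('subject', ()):
        for key, value in rdn:
            if key == 'commonName':
                names.append(value)
    return names


def check_hostname(cert, hostname):
    """Raise ValueError unless hostname equals one of the commonName values."""
    names = common_names(cert)
    if hostname.lower() not in [name.lower() for name in names]:
        raise ValueError(
            "Host name %r doesn't match certificate host(s) %r"
            % (hostname, names))


# hypothetical certificate data in the shape getpeercert() returns
sample_cert = {
    'subject': ((('countryName', 'US'),),
                (('commonName', 'www.google.com'),)),
}
check_hostname(sample_cert, "www.google.com")  # matches, so nothing is raised
```

Calling check_hostname(sample_cert, "74.125.232.50") raises ValueError, which is the behaviour the IP address example is meant to trigger.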


That's it. I put my findings on http://wiki.python.org/moin/SSL - you may want to check it for updates.

Wednesday, September 22, 2010

hgsubversion: Installing on Windows

I use Windows primarily, and it's not easy to install Mercurial extensions there if they are not bundled with the Mercurial installer itself. I am using the plain .msi installer, which doesn't include hgsubversion, so I'll show you how to get this extension in place manually.

Suppose you have Mercurial installed into C:\Mercurial. Go there and notice the library.zip archive. This is where you should add hgsubversion. library.zip has the following structure:

/
+-- email
+-- encodings
+-- hgext
+-- logging
...
__future__.pyc
_abcoll.pyc
_elementtree.pyd
...
zipextimporter.pyc
zipfile.pyc

The contents of this archive are the whole Mercurial code. The code is written in Python, compiled into .pyc and .pyd files for speed and packed into library.zip for convenience. For interpreted languages like Python it is not necessary to compile source files for execution, but they did it.

Get the latest hgsubversion sources from http://bitbucket.org/durin42/hgsubversion/src by pressing the "get source" link in the upper right corner, or download a released version from http://pypi.python.org/pypi/hgsubversion/1.1.2. Unpack the archive. You should have a directory structure like:

/
+-- hgsubversion
+-- notes
+-- tests
+-- tools
.hgignore
...
setup.py

Now put the hgsubversion/ dir from this structure into the library.zip archive, inside the hgext/ directory, where Mercurial expects to find its extensions. Enable the extension by uncommenting the line ";hgsubversion =" in Mercurial.ini, which is usually located in your profile directory, i.e. C:\Users\yourname. If it is not there, copy one from your C:\Mercurial\hgrc.d directory.

Try to check out some Subversion repository. You will get an error like this:

C:\p>hg clone http://google-twitter.googlecode.com/svn/trunk
(falling back to static-http)
(falling back to Subversion support)
destination directory: trunk
abort: subvertpy 0.7.3 or later required, but not found!

You need Subversion bindings for Python. They advertise subvertpy, but it is also possible to use the official bindings.

These bindings are just a set of libraries that allow Python scripts to use Subversion binary code directly. Unfortunately, after the migration to apache.org, Subversion bindings are not compiled anymore, so there is no official place where you can download them. But thanks to http://alagazam.net/ it is possible to get unofficial ones. Download svn-win32-1.6.12_py.zip or a more up-to-date distribution and extract it. You will get a directory structure like:

/
+-- python
+-- libsvn
+-- svn
README.txt

In libsvn/ rename _client.dll to _client.pyd. Do the same with _core.dll, _delta.dll and _ra.dll. Maybe you'll have to rename other .dll files, but these were enough for me. Now put svn/ and libsvn/ into library.zip directly in the root. I also had to copy intl3_svn.dll from an old 1.6.6 installation of SVN into the library.zip root (you may find this file in the Windows binaries archive from http://alagazam.net/) and place libsvn_swig_py-1.dll from the libsvn/ dir in the library.zip root as well. It should work now.
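
If you prefer not to use an archiver by hand, the step of putting a directory into library.zip can probably be scripted with the standard zipfile module. The following is a minimal sketch, not a tested installer. The function name add_tree_to_zip is made up for this example, and it assumes that library.zip can be opened as an ordinary zip archive.

```python
import os
import zipfile


def add_tree_to_zip(zip_path, src_dir, arc_prefix=""):
    """Append every file under src_dir to zip_path, below arc_prefix/<dirname>/."""
    src_dir = os.path.abspath(src_dir)
    base = os.path.basename(src_dir)
    added = []
    zf = zipfile.ZipFile(zip_path, "a", zipfile.ZIP_DEFLATED)
    try:
        for root, dirs, files in os.walk(src_dir):
            for name in files:
                full = os.path.join(root, name)
                rel = os.path.relpath(full, src_dir)
                # zip member names always use forward slashes
                parts = [p for p in (arc_prefix, base, rel.replace(os.sep, "/")) if p]
                arcname = "/".join(parts)
                zf.write(full, arcname)
                added.append(arcname)
    finally:
        zf.close()
    return sorted(added)
```

With the layout described above, a call such as add_tree_to_zip(r"C:\Mercurial\library.zip", "hgsubversion", "hgext") would place the extension under hgext/, and calling it with an empty prefix for svn/ and libsvn/ would put them in the archive root. Make a backup copy of library.zip first.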

Sunday, June 06, 2010

Users read only first sentence

During the work on Python documentation somebody joked that users look only at the first sentence. I would say that in the Age of Twitter it is a rule. So I should swap the first two sentences (and switch to microblogging).

Thursday, May 06, 2010

Sphinx PDF with rst2pdf

I deliberately omit the word LaT*X in my post to avoid missing people who add '-LaT*X' to their search queries. Yes, it is possible to generate PDF with Sphinx without LaT*X in a cross-platform way. Yes, on Windows too. You will need only rst2pdf. Actually, integration with Sphinx is well described in the rst2pdf manual (text and PDF), but people find it hard to locate this information, so I'll quote the checklist here:
  1. install rst2pdf
  2. register rst2pdf in your conf.py Sphinx config
    extensions = ['sphinx.ext.autodoc','rst2pdf.pdfbuilder']
  3. run
    sphinx-build -bpdf sourcedir outdir
I hope it was helpful. Actually, check the manual - it has some useful options for conf.py and it's more up-to-date.
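
As an illustration of step 2, a conf.py fragment might look like the sketch below. The pdf_documents setting is one of the options the rst2pdf manual describes. The project name, title and author here are placeholders, so treat the values as assumptions and check the manual for the current list of options.

```python
# conf.py fragment (sketch): enable the rst2pdf builder next to autodoc
extensions = ['sphinx.ext.autodoc', 'rst2pdf.pdfbuilder']

# (source start file, output file name without .pdf, title, author)
# every value below is a placeholder
pdf_documents = [
    ('index', 'MyProject', 'My Project Documentation', 'Author Name'),
]
```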

    Friday, April 23, 2010

    Why Far Manager and Vim are awesome

    Far Manager and Vim are both console tools. When non-computer folks see the blue panels of Far on my screen they are usually surprised, like "DOS? But why?". These folks still remember DOS windows (DOS Windows, huh), but if some fourteen-year-old tells me something about DOS it will be my turn to be surprised.

    Far Manager is essentially a Windows console Swiss Army Knife, much like Vim is for Unix. While GUI tools are awesome, they are not as responsive as console ones and not deterministic. I know it sounds horrible, but it just means that at any given moment you can't be sure what a given combination of keys will do. It depends on the focus, and this focus is not always clearly visible. Interaction using the mouse is slow (if you're not a hardcore gamer, of course). But I actually wrote this post to say that:

    You don't need to press "ENTER" for most used commands and that is awesome!

    In Far you may set up shortcuts as easily as pressing F2 and then Ins. Every action is reachable as a series of key presses. You don't need to take your hand off the keyboard to reach the mouse. Forget about the mouse - you will need it only for copy/pasting. You may easily automate repeated key presses by using Ctrl-. then typing the keys you like and then hitting Ctrl-. again. You'll be prompted for a key combination that will invoke the keys you've just typed. Repeated edits, replacements, file renames, copies - a lot of things can be automated using this feature, called Macro.

    Another of the many Far features you may find awesome is output redirection. By using the edit:< prefix on the command line you may redirect the output to the embedded editor and navigate around it as you like. With the Colorer plugin you will even get syntax highlighting for the output.

    Enjoy!

    Wednesday, April 14, 2010

    Porting Python applications from Unix to Windows

    os.open

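    Note the difference between open and os.open, and ensure that all os.open calls have the os.O_BINARY flag on Windows. Without it the file descriptor is opened in text mode, where newline characters are translated.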
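
Since os.O_BINARY exists only on Windows, portable code cannot reference it unconditionally. One common way around this, shown here as a sketch, is to fall back to 0 on other platforms. The helper names write_bytes and read_bytes and the demo file are made up for this example.

```python
import os
import tempfile

# os.O_BINARY exists only on Windows; fall back to 0 elsewhere
O_BINARY = getattr(os, "O_BINARY", 0)


def write_bytes(path, data):
    """Write data to path through os.open without newline translation."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | O_BINARY)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)


def read_bytes(path):
    """Read the whole file back through os.open in binary mode."""
    fd = os.open(path, os.O_RDONLY | O_BINARY)
    try:
        chunks = []
        while True:
            chunk = os.read(fd, 4096)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)


# round trip of data that text mode on Windows would alter
demo_path = os.path.join(tempfile.mkdtemp(), "demo.bin")
payload = b"line one\nline two\n\x1a tail"
write_bytes(demo_path, payload)
```

Reading the file back with read_bytes(demo_path) should return exactly the bytes that were written, on Unix and Windows alike.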

    See also Stani and Nadia's talk about cross-platform application development and distribution.

    Wednesday, February 03, 2010

    URL parameters in action attribute of HTML form are lost for GET request

    It is surprising how experienced web developers may miss some basic nuances of form processing that have existed for many years. One of them is the fact that URL parameters added to the action attribute of a <form> element are lost for GET requests.

    Example:
    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
    <html>
    <head>
    <title>Query string from action attribute is killed for GET request</title>
    </head>
    <body>

      <form name="getSearchForm" action="?missed=one" method="get">
        <input type="text" name="q" value="s" />
        <button class="searchButton" type="submit">GET</button>
      </form>

      <form name="postSearchForm" action="?missed=one" method="post">
        <input type="text" name="q" value="s" />
        <button class="searchButton" type="submit">POST</button>
      </form>

    </body>
    </html> 

    The parameter "missed=one" is missing from the URL after clicking the button on the form that uses the GET method. It is present for the POST form, though.
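
The behaviour can be modelled in a few lines of Python. This is only a sketch of what the browser does, not browser code: on a GET submission the encoded form fields replace the query component of the action URL, while a POST submission leaves the URL alone and sends the fields in the request body. The function names and the example.com address are made up for illustration.

```python
try:
    from urllib.parse import urlsplit, urlunsplit, urlencode  # Python 3
except ImportError:
    from urlparse import urlsplit, urlunsplit                 # Python 2
    from urllib import urlencode


def get_submit_url(action_url, fields):
    """Mimic a GET form submission: the form data replaces the query string."""
    scheme, netloc, path, _old_query, fragment = urlsplit(action_url)
    return urlunsplit((scheme, netloc, path, urlencode(fields), fragment))


def post_submit_url(action_url):
    """A POST submission keeps the action URL; the fields travel in the body."""
    return action_url


# hypothetical absolute form of the action="?missed=one" used in the example
action = "http://example.com/search?missed=one"
get_url = get_submit_url(action, [("q", "s")])
```

A common workaround is to pass such parameters as hidden input fields of the form, so that they become part of the submitted form data.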

    Friday, January 29, 2010

    Working with complex issues

    Some issues should be cut into chewable chunks and linked together into a dependency tree. Each chunk should be accompanied by a "digestion recipe" that includes tools, skills and necessary ingredients. If the issue is very complicated, the chunks may be split into pieces small enough that getting a piece, analyzing it and solving it takes no longer than one day. The work on the issue can then be spread over a looong time.

    This requires tools. Tools that minimize the time wasted on getting all the necessary stuff to start work, save work, send it in one day and wait for it to be approved. Learning these tools should also be quick, so that the task gets done.

    Life is short, so time savers are critical. A cheatsheet template with already filled-in addresses of contacts, repositories etc. can greatly reduce the time to get the thing done. Once the template is filled in, it becomes a cheatsheet that can be annotated with new information for this specific task. These comments can then be incorporated back into the original template or into a new one for more advanced usage. If making a reusable template takes just one hour and solving the task takes one day, doing the template one day and the task the next is better.

    Monday, January 25, 2010

    Repacking library.zip from py2exe



    intro
    The py2exe tool converts Python scripts into standalone .exe distributions. It isolates all required Python modules together with the Python interpreter itself and wraps them into a single library.zip file. This way the application is not affected by Python modules that may already be installed on the user's system. Unfortunately, this also means that you can't add new modules to your standalone Python program.

    For example, suppose you need to add the Hg-Git extension to Mercurial installed as a standalone program. You can specify the path to the extension in Mercurial.ini, but Hg-Git depends on the Dulwich module, which is not present in library.zip. An attempt to use this extension will fail. The same problem occurs when converting Bazaar repositories with the convert extension, which is present in library.zip but additionally requires the bzr module to be installed.

    solution
    The script below helps to add Python modules to a library.zip file made with py2exe. The latest version should be available at this bitbucket repository. The following command unpacks library.zip in the current directory and makes library_unpacked.zip, which you can edit with your favorite archiver:

    python relibzip.py unpack
    After you've finished, issue:

    python relibzip.py pack
    and the script will create library_packed.zip from the extracted resources. Copy this file over library.zip and you are all set.

    Source code (ugly, but works):
    
    """
    pack/unpack library.zip created by py2exe to standard .zip archive
    
    library.zip created with py2exe can not always be processed by standard
    archivers. this script removes extra chunks added by py2exe and puts
    them back when requested
    
    MIT license, by techtonik // gmail.com
    """
    
    import sys
    import os
    from optparse import OptionParser
    import struct
    
    
    PYTHONDLL = "<pythondll>"
    PDNAME = "pythondll"
    ZLIBPYD = "<zlib.pyd>"
    ZDNAME = "zlibpyd"
    
    UNPACKED = "library_unpacked.zip"
    PACKED = "library_packed.zip"
    
    
    def unpack(filename):
       f = open(filename, "rb")
       # looking for PYTHONDLL name
       pdname = f.read(len(PYTHONDLL))
       if pdname != PYTHONDLL:
           if pdname[:2] == "PK":
               sys.exit("Seems to be normal .zip archive, not unpacking")
           else:
               sys.exit("Unknown archive format")
    
       def save_section(secname, fname):
           print "Extracting %s section to %s" % (secname, fname)
           fsize = struct.unpack("i", f.read(4))[0]
           fpd = open(fname, "wb")
           fpd.write(f.read(fsize))
           fpd.close()
       save_section(PYTHONDLL, PDNAME)
    
       buf = ""
       zdname = f.read(len(ZLIBPYD))
       if zdname != ZLIBPYD:
           if zdname[:2] == "PK":
               print "No zlib.pyd section"
               buf = zdname
           else:
               sys.exit("Unknown archive format")
       else:
           save_section(ZLIBPYD, ZDNAME)
      
       flib = open(UNPACKED, "wb")
       flib.write(buf)
       flib.write(f.read())
       flib.close()
       f.close()
       print "Done. Unpacked .zip contents is available at %s" % UNPACKED
       sys.exit(0)
    
    
    def pack(tofname):
    
       if not os.path.exists(PDNAME):
           sys.exit("%s section file %s is not found. Exiting" % (PYTHONDLL, PDNAME))
       if not os.path.exists(UNPACKED):
           sys.exit("Unpacked version %s is not found. Exiting" % UNPACKED)
    
       f = open(tofname, "wb")
    
       def write_section(secname, fname):
           print "Writing %s section from %s" % (secname,fname)
           # use the section passed in, so the zlib.pyd section is written correctly too
           f.write(secname)
           fsize = os.stat(fname).st_size
           f.write(struct.pack("i", fsize))
    
           fpd = open(fname, "rb")
           f.write(fpd.read())
           fpd.close()
           print "Removing section file %s" % fname
           os.remove(fname)
       write_section(PYTHONDLL, PDNAME)
    
       # check for optional ZLIBPYD section
       if not os.path.exists(ZDNAME):
           print "No %s section file %s. Skipping" % (ZLIBPYD, ZDNAME)
       else:
           write_section(ZLIBPYD, ZDNAME)
    
       fzip = open(UNPACKED, "rb")
       f.write(fzip.read())
       fzip.close()
    
       f.close()
       print "Done. Packed .zip contents is available at %s" % tofname
       sys.exit(0)
    
    
    parser = OptionParser(usage="usage: %prog unpack|pack",
       description="update library.zip created by py2exe utility")
    opt,arg = parser.parse_args()
      
    if arg and arg[0] == 'unpack':
       unpack("library.zip")
    elif arg and arg[0] == 'pack':
       pack(PACKED)
    else:
       sys.exit(parser.format_help())
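
The file layout the script relies on can be read off its unpack() function: an optional "<pythondll>" marker followed by a 4-byte size and that many bytes of payload, then an optional "<zlib.pyd>" section of the same shape, then the ordinary zip data starting with "PK". The sketch below parses that layout from a byte string in memory. It is an illustration inferred from the script above rather than from py2exe documentation, the function name split_sections is made up, and unlike the script it accepts a plain zip with no sections at all.

```python
import struct

PYTHONDLL = b"<pythondll>"
ZLIBPYD = b"<zlib.pyd>"


def split_sections(data):
    """Split raw library.zip bytes into (sections dict, plain zip bytes)."""
    sections = {}
    pos = 0
    for marker in (PYTHONDLL, ZLIBPYD):
        if data[pos:pos + len(marker)] != marker:
            continue  # this section is absent
        pos += len(marker)
        # 4-byte native-endian size, the same "i" format the script uses
        (size,) = struct.unpack("i", data[pos:pos + 4])
        pos += 4
        sections[marker] = data[pos:pos + size]
        pos += size
    rest = data[pos:]
    if rest[:2] != b"PK":
        raise ValueError("Unknown archive format")
    return sections, rest


# fabricated sample: one section followed by a fake zip body
sample = PYTHONDLL + struct.pack("i", 3) + b"DLL" + b"PK\x03\x04rest"
sections, zip_bytes = split_sections(sample)
```

Here sections maps the "<pythondll>" marker to the three payload bytes, and zip_bytes holds everything from "PK" onwards, which is the part a standard archiver can open.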