David Mertz (mertz@gnosis.cx), Programmer, Gnosis Software, Inc.
Summary: In this final installment of his series on Twisted, David looks at specialized protocols and servers contained in the Twisted package, with a focus on secure connections.
One thing the servers and clients in Parts 1, 2, and 3 had in common is that they operated completely in the clear, cryptographically speaking. Sometimes, however, you want to keep your connection free from prying eyes (or from tampering/spoofing).
While protocols for determining permissions on server resources are interesting, for this installment I want to look at protocols involving actual wire-level encryption. But for general background, you might want to investigate Web-oriented mechanisms such as Basic Authentication, which is described in RFC-2617 and implemented in Apache and other Web servers. The Twisted package twisted.cred is a general but complex framework for providing authentication services in general-purpose Twisted servers (not limited to Web servers).
There are two widespread APIs for wire-level encryption over the Internet: SSL and SSH. The former, SSL (Secure Sockets Layer), is widely implemented in Web browsers and Web servers; in principle, however, nothing ties SSL specifically to the HTTP protocol. SSL combines a public-key infrastructure, complete with a "web-of-trust" based on Certificate Authorities, with creation of a session key for standard symmetrical encryption during the life of a particular connection.
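As a rough illustration of the Certificate Authority side of SSL, and not of Twisted's own SSL support, the sketch below uses the ssl module from Python's standard library (an assumption: it requires a much later Python than the releases this series targets). The function name describe_default_context is made up for this example. It only builds a client-side context and reports its verification settings; no connection is made.

```python
import ssl

def describe_default_context():
    """Report how a default client-side context treats server certificates."""
    context = ssl.create_default_context()
    return {
        # CERT_REQUIRED: the server must present a certificate that chains
        # to a Certificate Authority trusted on this machine
        'requires_certificate': context.verify_mode == ssl.CERT_REQUIRED,
        # the certificate must also match the host name being contacted
        'checks_hostname': context.check_hostname,
    }

if __name__ == '__main__':
    print(describe_default_context())
```

The session key mentioned above is negotiated later, during the handshake of an actual connection, which this sketch deliberately does not attempt.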
Twisted does come with an SSL framework; however, as with most things in Twisted, exactly how it might work is poorly documented. I tried downloading two likely support packages to try to get the Twisted v.1.0.6 script test_ssl.py to work (see Resources). I am sure that with some version of the right third-party libraries (and some Twisted version) -- and perhaps after corrections to erroneous examples -- it is possible to use SSL with Twisted, but I have not done so for this article.
The other widely used API for wire-level encryption is SSH (Secure Shell), well known from the tool of the same name (in lowercase: ssh). Many of the underlying cryptographic algorithms are shared between SSL and SSH, but SSH is focused on creating encrypted shell connections (rather than using snooper-friendly programs/protocols such as telnet and rsh). Twisted lets you write your own custom SSH clients and servers, which is quite nice. While you certainly can write a basic interactive remote shell, like that provided by the client and server ssh and sshd, you can also create more specialized tools to use these secure connections for higher-level purposes.
An SSH Weblog client
Continuing with the running example of this series, I created a tool to examine hits to my Web server log file, but to do so over an encrypted SSH channel. This purpose is realistic, actually -- perhaps I do not want to reveal the hits I get to someone monitoring my packet stream.
The Twisted package itself was missing a support module, and what it was, exactly, was evidently not documented. Before I could get far in my efforts, I needed to figure out what the line import Crypto in the twisted.conch package was trying to find. The name is obviously a hint, but I was also somewhat familiar with the Python cryptography library maintained by Andrew Kuchling (please see Resources for a link). A bit of Googling, a download, and an install later, Twisted's test_conch.py would run without complaint. So on to the project of creating a custom SSH client.
I based my client on the example provided in the Twisted file doc/examples/sshsimpleclient.py. I have simplified somewhat (as well as done some customizing); you might want to look at what else is in the distributed example. As with most Twisted components, twisted.conch consists of several layers, each of which can be customized. I guess the name "conch" is a play on the word "shell" in Secure Shell.
The transport layer is a customization of SSHClientTransport. We may define several methods but need at least to define .verifyHostKey() and .connectionSecure(). In our implementation, we trust every host key and simply give control back to the asynchronous reactor core by returning a defer.succeed object. Of course, if you wanted to verify a host against a known key, you could do that in .verifyHostKey().
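A minimal sketch of what such a check could look like, kept independent of Twisted so it can be tried on its own. It assumes the fingerprint is the MD5 digest of the raw public key written as colon-separated hex pairs, which is the format the session output later in this article suggests. The names fingerprint, host_key_matches, and KNOWN_HOSTS are hypothetical, and the stored value is simply the fingerprint printed in that sample session.

```python
import hashlib

def fingerprint(key_blob):
    """Format the MD5 digest of a key blob as colon-separated hex pairs."""
    digest = hashlib.md5(key_blob).hexdigest()
    return ':'.join(digest[i:i + 2] for i in range(0, len(digest), 2))

# Hypothetical table of hosts we already trust (host name -> fingerprint)
KNOWN_HOSTS = {
    'gnosis.cx': '56:54:76:b6:92:68:85:bb:61:d0:f0:0e:3d:91:ce:34',
}

def host_key_matches(host, key_blob, known=KNOWN_HOSTS):
    """True only if the host is known and its key has the recorded fingerprint."""
    expected = known.get(host)
    return expected is not None and fingerprint(key_blob) == expected
```

Inside .verifyHostKey(), a check along these lines would presumably decide between returning a successful Deferred, as the implementation here always does, and a failed one.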
Creating the channel is where the other layers come in. A child of SSHUserAuthClient performs the actual login authentication; if successful, it establishes a connection (for which I define a child of SSHConnection). This connection, in turn, creates a channel -- a child of SSHChannel. It is the channel, which I named simply Channel, that does the actual custom work. Specifically, the channel does things like send and receive data and commands. Let's look at my specific client:
Listing 1. ssh-weblog.py
#!/usr/bin/env python
"""Monitor a remote weblog over SSH
USAGE: ssh-weblog.py user@host logfile
"""
from twisted.conch.ssh import transport, userauth, connection, channel
from twisted.conch.ssh.common import NS
from twisted.internet import defer, protocol, reactor
from twisted.python import log
from getpass import getpass
import struct, sys, os
import webloglib as wll

USER,HOST,CMD = None,None,None

class Transport(transport.SSHClientTransport):
    def verifyHostKey(self, hostKey, fingerprint):
        print 'host key fingerprint: %s' % fingerprint
        return defer.succeed(1)      # trust every host key
    def connectionSecure(self):
        self.requestService(UserAuth(USER, Connection()))

class UserAuth(userauth.SSHUserAuthClient):
    def getPassword(self):
        return defer.succeed(getpass("password: "))
    def getPublicKey(self):
        return  # Empty implementation: always use password auth

class Connection(connection.SSHConnection):
    def serviceStarted(self):
        self.openChannel(Channel(2**16, 2**15, self))

class Channel(channel.SSHChannel):
    name = 'session'    # must use this exact string
    def openFailed(self, reason):
        print '"%s" failed: %s' % (CMD,reason)
    def channelOpen(self, data):
        self.welcome = data   # Might display/process welcome screen
        d = self.conn.sendRequest(self,'exec',NS(CMD),wantReply=1)
    def dataReceived(self, data):
        recs = data.strip().split('\n')
        for rec in recs:
            hit = [field.strip('"') for field in wll.log_fields(rec)]
            resource = hit[wll.request].split()[1]
            referrer = hit[wll.referrer]
            if resource=='/kill-weblog-monitor':
                print "Bye bye..."
                self.closed()
                return
            elif hit[wll.status]=='200' and hit[wll.referrer]!='-':
                print referrer, ' -->', resource
    def closed(self):
        self.loseConnection()
        reactor.stop()

if __name__=='__main__':
    if len(sys.argv) < 3:
        sys.stderr.write(__doc__)    # print the usage text itself
        sys.exit()
    USER, HOST = sys.argv[1].split('@')
    CMD = 'tail -f -n 1 '+sys.argv[2]
    protocol.ClientCreator(reactor, Transport).connectTCP(HOST, 22)
    reactor.run()
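The NS() helper used in channelOpen() above packages the command as an SSH "string": the bytes of the command preceded by their length as a 4-byte big-endian integer. The sketch below shows that encoding with the standard library alone. The names ns and parse_ns are stand-ins invented for this example, not Twisted's own functions, and the description of NS() reflects my understanding of the helper rather than anything stated in the article.

```python
import struct

def ns(payload):
    """Prefix a byte string with its length as a 4-byte big-endian integer."""
    return struct.pack('!L', len(payload)) + payload

def parse_ns(data):
    """Split one length-prefixed string off the front of data.

    Returns a (string, remainder) pair.
    """
    (length,) = struct.unpack('!L', data[:4])
    return data[4:4 + length], data[4 + length:]

if __name__ == '__main__':
    encoded = ns(b'tail -f -n 1 access-log')
    print(repr(encoded))
    print(parse_ns(encoded))
```

The length prefix is what lets the receiving side know where the command ends without relying on a terminator character.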
The overall structure of the client is like most of the Twisted applications we have seen. It creates a protocol, and monitors events in an asynchronous loop (in other words, reactor.run()).
The interesting part comes in the methods of Channel(). As soon as the channel is opened, we execute a custom command -- in this case, a tail -f on the Weblog file whose name is specified on the command line. Naturally, the host, which is still a completely generic sshd server rather than anything Twisted specific, starts sending some data back. The method dataReceived() parses the data as it comes in (incrementally as tail produces more). For this specific client, we decide when to terminate based on the actual content of the Weblog being parsed -- which amounts to having a Web-based way to kill the monitoring application. While that specific configuration is probably unusual, the example demonstrates the general concept of severing the connection when some condition is met (it could be any condition). A session looks like:
Listing 2. Sample session of Weblog monitor
$ ./ssh-weblog.py gnosis@gnosis.cx access-log
host key fingerprint: 56:54:76:b6:92:68:85:bb:61:d0:f0:0e:3d:91:ce:34
password:
http://gnosis.cx/dW/ --> /publish/whatsnew.html
http://gnosis.cx/dW/whatsnew.html --> /home/hugo.gif
Bye bye...
This is pretty much the same as all the other Weblog monitors this series created. I ended the above session by pointing a browser at <http://gnosis.cx/kill-weblog-monitor> from another window (otherwise, it would watch indefinitely).
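The decision made for each record in dataReceived() can be sketched without Twisted or webloglib. The version below is an illustration under assumptions: LOG_RE is a hypothetical regular expression for the common "combined" log layout (it is not the parser webloglib uses), and classify is a made-up name. It returns a verdict instead of printing, so the stop condition is easy to exercise by itself.

```python
import re

# Hypothetical stand-in for webloglib: fields of a "combined" format record
LOG_RE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"')

KILL_RESOURCE = '/kill-weblog-monitor'

def classify(record):
    """Return ('stop', None), ('show', text), or ('skip', None) for a record."""
    match = LOG_RE.match(record)
    if match is None:
        return ('skip', None)            # not a record we understand
    parts = match.group('request').split()
    if len(parts) < 2:
        return ('skip', None)            # malformed request line
    resource = parts[1]
    if resource == KILL_RESOURCE:
        return ('stop', None)            # the Web-based "off switch"
    if match.group('status') == '200' and match.group('referrer') != '-':
        return ('show', '%s --> %s' % (match.group('referrer'), resource))
    return ('skip', None)
```

Any other termination condition would slot into the same place as the KILL_RESOURCE comparison.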
Modifying the SSH client
It is a simple matter to create other SSH clients that achieve other purposes. For example, I copied ssh-weblog.py to the name scp.py, and made just a few changes to the code. The __main__ body parses options slightly differently, and the docstring was adjusted; beyond that, I simply modified the .dataReceived() method to read:
Listing 3. scp.py (modified Channel method)
    def dataReceived(self, data):
        open(DST,'wb').write(data)
        self.closed()
(The variable CMD was set to "cat "+sys.argv[2].)
Voilà! I have implemented the tool scp that accompanies many SSH clients.
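One caveat: a channel delivers data in packets, so a file larger than one packet arrives through several calls to .dataReceived(), and a method that writes the file and closes on the first call keeps only the first chunk. A minimal sketch of one way around that is to collect chunks and write them when the channel closes. FileCollector and its method names are hypothetical; they stand in for the channel's .dataReceived() and .closed(), on the assumption that the channel is closed once the remote cat finishes.

```python
class FileCollector(object):
    """Accumulate received chunks and write them out in one piece."""

    def __init__(self, path):
        self.path = path
        self.chunks = []

    def data_received(self, data):
        # May be called many times; just remember each chunk in order
        self.chunks.append(data)

    def closed(self):
        # Called once at the end: write everything that arrived
        with open(self.path, 'wb') as out:
            out.write(b''.join(self.chunks))
```

For large files, opening the destination once and writing each chunk as it arrives would avoid holding the whole file in memory; the in-memory version is shown only because it is shorter.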
These examples are both "run and collect" tools. That is, they are not interactive during the session. But you could easily create another tool that made additional calls to self.conn.sendRequest() within Channel methods. In fact, if the client was some kind of GUI client, you might add those data collection forms as callbacks within the reactor. That is, perhaps when certain forms are completed, new remote commands could be issued, and the results again collected for processing or presentation.
An SSH Weblog server
An SSH server uses much of the same structure as the client. As before, I simplify and customize doc/examples/sshsimpleserver.py for my example. One twist is that a server is best created using an SSHFactory child that has been configured with appropriate keys and classes.
In our SSH Weblog server, we configure a password and username for an authorized user. In the example, they are hardcoded, but you could obviously store them otherwise; perhaps configure a list of authorized Weblog monitors. Let's look at the example:
Listing 4. ssh-weblog-server.py
#!/usr/bin/env python2.3
from twisted.cred import authorizer
from twisted.conch import identity, error
from twisted.conch.ssh import userauth, connection, channel, keys
from twisted.conch.ssh.factory import SSHFactory
from twisted.internet import reactor, protocol, defer
import os, time

class Identity(identity.ConchIdentity):
    def validatePublicKey(self, data):
        return defer.succeed('')
    def verifyPlainPassword(self, password):
        if password=='password' and self.name == 'user':
            return defer.succeed('')
        return defer.fail(error.ConchError('bad password'))

class Authorizer(authorizer.Authorizer):
    def getIdentityRequest(self, name):
        return defer.succeed(Identity(name, self))

class Connection(connection.SSHConnection):
    def gotGlobalRequest(self, *args):
        return 0
    def getChannel(self, channelType, windowSize, maxPacket, data):
        if channelType == 'session':
            return Channel(remoteWindow=windowSize,
                           remoteMaxPacket=maxPacket, conn=self)
        return 0

class Channel(channel.SSHChannel):
    def channelOpen(self, data):
        weblog = open('../access.log')
        weblog.readlines()        # skip the records already in the log
        while 1:
            time.sleep(5)         # note: sleeping here blocks the reactor
            for rec in weblog.readlines():
                self.write(rec)
    def request_pty_req(self, data):
        return 1  # ignore, but this gets sent for shell requests
    def request_shell(self, data):
        self.client = protocol.Protocol()
        self.client.makeConnection(self)
        self.dataReceived = self.client.dataReceived
        return 1
    def loseConnection(self):
        self.client.connectionLost()
        channel.SSHChannel.loseConnection(self)

class Factory(SSHFactory):
    # open() does not expand "~", so expand the key paths explicitly
    publicKeys = {'ssh-rsa':keys.getPublicKeyString(
        data=open(os.path.expanduser('~/.ssh/id_rsa.pub')).read())}
    privateKeys ={'ssh-rsa':keys.getPrivateKeyObject(
        data=open(os.path.expanduser('~/.ssh/id_rsa')).read())}
    services = {'ssh-userauth': userauth.SSHUserAuthServer,
                'ssh-connection': Connection}
    authorizer = Authorizer()

reactor.listenTCP(8022, Factory())
reactor.run()
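The hardcoded comparison in verifyPlainPassword() could be replaced by a lookup in a table of authorized monitors. The sketch below is one possible shape for that, using only the standard library and storing salted hashes rather than plain passwords. The names make_entry, check_password, and AUTHORIZED are hypothetical, the iteration count is an arbitrary choice for the example, and hashlib.pbkdf2_hmac requires a Python far newer than the one this listing was written for.

```python
import hashlib
import hmac
import os

ITERATIONS = 100000   # arbitrary work factor chosen for this sketch

def make_entry(password, salt=None):
    """Return a (salt, digest) pair for storing instead of the password."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'),
                                 salt, ITERATIONS)
    return salt, digest

# Hypothetical table of authorized Weblog monitors (name -> salt, digest)
AUTHORIZED = {'user': make_entry('password')}

def check_password(name, password, table=AUTHORIZED):
    """True if name is authorized and the password matches its stored digest."""
    entry = table.get(name)
    if entry is None:
        return False
    salt, expected = entry
    _, candidate = make_entry(password, salt)
    # constant-time comparison avoids leaking how many bytes matched
    return hmac.compare_digest(candidate, expected)
```

The result of check_password() would then choose between the defer.succeed() and defer.fail() branches that the listing already has.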
For brevity, the parsing and formatting of the Weblog records is omitted, but the idea of using an open channel to write new records as they become available is almost the same as with the client approach. Of course, in this case, any generic SSH client can connect to the specialized server:
Listing 5. Sample session of Weblog monitor
$ ssh gnosis.python-hosting.com -p 8022 -l user
user@gnosis.python-hosting.com's password:
141.154.146.89 - - [26/Aug/2003:02:47:40 -0500]
"GET /voting-project/August.2003/0010.html HTTP/1.1" 200 8986
"http://gnosis.python-hosting.com/voting-project/August.2003/0009.html"
"Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) AppleWebKit/85
(KHTML, like Gecko) Safari/85"
[...]
As with the client approach, an enhanced version might become more interactive; the .dataReceived() method of the channel could be customized to do something useful with data sent from the (generic) client.
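The loop in the server's channelOpen() sleeps inside an event handler, and nothing else can be serviced by the reactor while it does. One alternative, sketched here under assumptions, is to keep the file position between polls in an object whose poll() method returns immediately, and to call that method on a schedule (for example with reactor.callLater()). LogFollower is a hypothetical name, and the class uses only the standard library so it can be tried without a server.

```python
class LogFollower(object):
    """Return only the complete records appended since the previous poll."""

    def __init__(self, path):
        self.path = path
        with open(path, 'rb') as f:
            f.seek(0, 2)              # start at the current end of the file
            self.offset = f.tell()

    def poll(self):
        with open(self.path, 'rb') as f:
            f.seek(self.offset)
            data = f.read()
        # Keep a trailing partial line for the next poll
        end = data.rfind(b'\n') + 1
        self.offset += end
        return data[:end].splitlines()
```

Each call returns a list of new records, possibly empty, which a channel could pass to self.write() one at a time.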
Social dynamics
The biggest reservation I have about recommending the Twisted framework is, unfortunately, the "wild west" feel among its developer group. The software itself is quite powerful. But even more than in most open source projects, there is insufficient API consistency between releases, the documentation remains rough, and a thick skin is the main prerequisite for seeking help on its mailing list; you can get helpful responses, but only after wading through the acerbic ones.
As this installment demonstrated -- especially in my attempts to fill in pieces missing from the examples and documentation -- Twisted could really stand to have a helpful community behind it. Hopefully, with time, both the documentation and mailing list will improve in quality; the facilities hiding in the various corners of the Twisted framework are quite impressive.
Resources
- Twisted Matrix comes with quite a bit of documentation, and many examples. Browse around the Twisted Matrix homepage to glean a greater sense of how Twisted Matrix works and what has been implemented with it.
- Read the previous installments in the "Network programming with the Twisted framework" series. Part 1 covered asynchronous server programming; Part 2 introduced higher-level techniques for writing Web services; and Part 3 used Woven templating to implement dynamic Web serving.
- The Python Cryptography Toolkit, maintained by Andrew Kuchling, includes numerous well-investigated public-key, private-key, and cryptographic hash functions, as well as some miscellaneous other protocols.
- The SourceForge project Python OpenSSL Wrappers (POW) looks like a useful tool for SSL programming in Python. However, it does not appear (from my trial-and-error) to be what Twisted is looking for in its SSL subsystem.
- Most likely, for Twisted, the SSL wrapper you want is pyOpenSSL. At least after I installed that, I got past an import exception in Twisted's test_ssl.py (but only so far as what appears to be an error in the test script).
- Some background on HTTP authentication techniques can be found in RFC-2617.
- An introduction to the SSL protocol can be found at the Netscape developer site.
- A simple version of a Weblog server is presented in the developerWorks article, "Use Simple API for XML as a long-running event processor" (developerWorks, May 2003).
- Find more articles for Python developers in the developerWorks Linux zone.