I am learning network programming using twisted 10 in python. In below code is there any way to detect HTTP Request when data recieved? also retrieve Domain name, Sub Domain, Port values from this? Discard it if its not http data?
from twisted.internet import stdio, reactor, protocol
from twisted.protocols import basic
import re
class DataForwardingProtocol(protocol.Protocol):
def _ _init_ _(self):
self.output = None
self.normalizeNewlines = False
def dataReceived(self, data):
if self.normalizeNewlines:
data = re.sub(r"(\r\n|\n)", "\r\n", data)
if self.output:
self.output.write(data)
class StdioProxyProtocol(DataForwardingProtocol):
def connectionMade(self):
inputForwarder = DataForwardingProtocol( )
inputForwarder.output = self.transport
inputForwarder.normalizeNewlines = True
stdioWrapper = stdio.StandardIO(inputForwarder)
self.output = stdioWrapper
print "Connected to server. Press ctrl-C to close connection."
class StdioProxyFactory(protocol.ClientFactory):
protocol = StdioProxyProtocol
def clientConnectionLost(self, transport, reason):
reactor.stop( )
def clientConnectionFailed(self, transport, reason):
print reason.getErrorMessage( )
reactor.stop( )
if __name__ == '_ _main_ _':
import sys
if not len(sys.argv) == 3:
print "Usage: %s host port" % _ _file_ _
sys.exit(1)
reactor.connectTCP(sys.argv[1], int(sys.argv[2]), StdioProxyFactory( ))
reactor.run( )
protocol.dataReceived, which you’re overriding, is too low-level to serve for the purpose without smart buffering that you’re not doing — per the docs I just quoted,
You appear to be completely ignoring this crucial part of the docs.
You could instead use LineReceiver.lineReceived (inheriting from
protocols.basic.LineReceiver, of course) to take advantage of the fact that HTTP requests come in “lines” — you’ll still need to join up headers that are being sent as multiple lines, since as this tutorial says:Once you have a nicely formatted/parsed response (consider studying twisted.web’s sources so see one way it could be done),
now the
Hostheader (cfr the RFC section 14.23) is the one containing this info.