Parsing MIME-encoded data in an HTTP request

Discussion in 'Python' started by Ron Garret, Jul 3, 2008.

  1. Ron Garret

    Ron Garret Guest

    I'm writing a little HTTP server and need to parse request content that
    is mime-encoded. All the MIME routines in the Python standard library
    seem to have been subsumed into the email package, which makes this
    operation a little awkward. It seems I have to do the following:

    1. Extract the content-length header from the HTTP request and use that
    to read the payload.

    2. Stick some artificial-looking headers onto the beginning of this
    payload to make it look like an email message (including the
    content-type and content-transfer-encoding headers)

    3. Parse the resulting string into an email message
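For illustration, the three steps above might look roughly like this with the modern Python 3 `email` package (the thread itself predates it). This is a minimal sketch, not the poster's actual code: `parse_mime_body` is a hypothetical helper, the sample body and its `XYZ` boundary are made up, and step 1 (reading Content-Length bytes from the socket) is assumed to have happened already.

```python
from email import message_from_bytes
from email.message import Message


def parse_mime_body(content_type: str, payload: bytes) -> Message:
    """Parse an HTTP request body by dressing it up as an email message."""
    # Step 2: prepend the headers the email parser needs. MIME-Version is
    # added for completeness; the parser does not require it.
    prefix = (
        "MIME-Version: 1.0\r\n"
        f"Content-Type: {content_type}\r\n"
        "\r\n"
    ).encode("ascii")
    # Step 3: parse the fake headers plus the payload as one message.
    return message_from_bytes(prefix + payload)


# Hand-written stand-in for a request body that was already read off the wire.
body = (
    b"--XYZ\r\n"
    b'Content-Disposition: form-data; name="greeting"\r\n'
    b"\r\n"
    b"hello\r\n"
    b"--XYZ--\r\n"
)
msg = parse_mime_body("multipart/form-data; boundary=XYZ", body)

# Each form field arrives as one subpart of the multipart message.
part = msg.get_payload()[0]
field_name = part.get_param("name", header="content-disposition")
field_value = part.get_payload(decode=True)
```

If the client also sent a Content-Transfer-Encoding header, it could be copied into the prefix the same way as Content-Type.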

    That works, but it feels way too hackish for my tastes. Surely there
    must be a better/more standard way of doing this?

    Thanks,
    rg
     
    Ron Garret, Jul 3, 2008
    #1

  2. Ron Garret

    Guest

    On Jul 3, 3:59 pm, Ron Garret wrote:
    > I'm writing a little HTTP server and need to parse request content that
    > is mime-encoded. All the MIME routines in the Python standard library
    > seem to have been subsumed into the email package, which makes this
    > operation a little awkward.


    To deal with messages of that kind, I've seen modules such as
    'rfc822' and 'mimetools' (which apparently builds on 'rfc822', so it
    might be more complete). There's also 'mimetypes', in case you need
    to deal with file extensions and their corresponding
    MIME media type.
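As a small illustration of what 'mimetypes' does, here is a sketch using `mimetypes.guess_type` from the standard library. The file names are made up, and the lookup is based on the name alone, not on the file's contents.

```python
import mimetypes

# guess_type returns a (type, encoding) pair derived from the file name.
media_type, encoding = mimetypes.guess_type("index.html")

# For a compressed archive, the second element names the compression layer
# and the first names the type of the data underneath it.
tar_type, tar_encoding = mimetypes.guess_type("backup.tar.gz")
```

Note that this only maps extensions to media types; it does not parse MIME-encoded bodies.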

    > It seems I have to do the following:
    >
    > 1. Extract the content-length header from the HTTP request and use that
    > to read the payload.
    >
    > 2. Stick some artificial-looking headers onto the beginning of this
    > payload to make it look like an email message (including the
    > content-type and content-transfer-encoding headers)
    >
    > 3. Parse the resulting string into an email message
    >


    Email? Why does an HTTP server need to build an email message?

    I remember doing things like that some time ago when building an HTTP
    server myself (http://code.google.com/p/sws-d/). Incidentally, I
    resisted the urge to use many of Python's library facilities (most
    things are done manually; am I a knucklehead or what!? :). You might
    want to take a look to get some ideas.

    Sebastian
     
    , Jul 4, 2008
    #2

  3. Ron Garret

    Ron Garret Guest

    Sebastian wrote:

    > On Jul 3, 3:59 pm, Ron Garret wrote:
    > > I'm writing a little HTTP server and need to parse request content that
    > > is mime-encoded. All the MIME routines in the Python standard library
    > > seem to have been subsumed into the email package, which makes this
    > > operation a little awkward.

    >
    > To deal with messages of that kind, I've seen modules such as
    > 'rfc822', and 'mimetools' (which apparently builds itself from
    > 'rfc822', so it might be more complete). There's also 'mimetypes', in
    > case you need to deal with file extensions and their corresponding
    > MIME media type.


    From the mimetools docs:

    "Deprecated since release 2.3. The email package should be used in
    preference to the module. This module is present only to maintain
    backward compatibility."

    >
    > > It seems I have to do the following:
    > >
    > > 1. Extract the content-length header from the HTTP request and use that
    > > to read the payload.
    > >
    > > 2. Stick some artificial-looking headers onto the beginning of this
    > > payload to make it look like an email message (including the
    > > content-type and content-transfer-encoding headers)
    > >
    > > 3. Parse the resulting string into an email message
    > >

    >
    > Email? Why does an HTTP server need to build an email message?


    It shouldn't. That's my whole point. But see the docs excerpt above.

    > I remember doing things like that some time ago when building an HTTP
    > server myself (http://code.google.com/p/sws-d/). Incidentally, I
    > resisted the urge to use much of the Python's library facilities (most
    > things are done manually; am I a knucklehead or what!? :). You might
    > wanna take a look to get some ideas.


    I'd much prefer not to reinvent this particular wheel.

    rg
     
    Ron Garret, Jul 4, 2008
    #3
  4. Ron Garret wrote:
    > I'm writing a little HTTP server and need to parse request content that
    > is mime-encoded. All the MIME routines in the Python standard library
    > seem to have been subsumed into the email package, which makes this
    > operation a little awkward.


    How about using cgi.parse_multipart()?

    Ciao, Michael.
     
    Michael Ströder, Jul 4, 2008
    #4
  5. Ron Garret

    Ron Garret Guest

    Michael Ströder wrote:

    > Ron Garret wrote:
    > > I'm writing a little HTTP server and need to parse request content that
    > > is mime-encoded. All the MIME routines in the Python standard library
    > > seem to have been subsumed into the email package, which makes this
    > > operation a little awkward.

    >
    > How about using cgi.parse_multipart()?
    >
    > Ciao, Michael.


    Unfortunately cgi.parse_multipart doesn't handle nested multiparts,
    which the requests I'm getting have. You have to use a FieldStorage
    object to do that, and that only works if you're actually in a cgi
    environment, which I am not. The server responds to these requests
    directly.
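To make "nested multiparts" concrete, here is a minimal sketch of such a body and one way to flatten it with the Python 3 `email` package. The body, the boundaries `outer` and `inner`, and the file names are all made up for illustration; a Content-Type header is prepended by hand so the parser knows the outer boundary.

```python
from email import message_from_bytes

# Hand-written stand-in for a request body: the outer multipart/form-data
# holds one plain field and one nested multipart/mixed with two files.
raw = (
    b"Content-Type: multipart/form-data; boundary=outer\r\n"
    b"\r\n"
    b"--outer\r\n"
    b'Content-Disposition: form-data; name="title"\r\n'
    b"\r\n"
    b"report\r\n"
    b"--outer\r\n"
    b'Content-Disposition: form-data; name="files"\r\n'
    b"Content-Type: multipart/mixed; boundary=inner\r\n"
    b"\r\n"
    b"--inner\r\n"
    b'Content-Disposition: attachment; filename="a.txt"\r\n'
    b"\r\n"
    b"first\r\n"
    b"--inner\r\n"
    b'Content-Disposition: attachment; filename="b.txt"\r\n'
    b"\r\n"
    b"second\r\n"
    b"--inner--\r\n"
    b"--outer--\r\n"
)

msg = message_from_bytes(raw)

# walk() visits the message and every subpart depth-first, so the nested
# multipart needs no special handling; keep only the leaf parts.
leaves = [
    (part.get_filename(), part.get_payload(decode=True))
    for part in msg.walk()
    if not part.is_multipart()
]
```

Each entry in `leaves` pairs a part's file name (or `None` for a plain field) with its raw bytes.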

    Anyway, thanks for the idea.

    rg
     
    Ron Garret, Jul 4, 2008
    #5
  6. Ron Garret

    Ron Garret Guest

    Ron Garret wrote:

    > Michael Ströder wrote:
    >
    > > Ron Garret wrote:
    > > > I'm writing a little HTTP server and need to parse request content that
    > > > is mime-encoded. All the MIME routines in the Python standard library
    > > > seem to have been subsumed into the email package, which makes this
    > > > operation a little awkward.

    > >
    > > How about using cgi.parse_multipart()?
    > >
    > > Ciao, Michael.

    >
    > Unfortunately cgi.parse_multipart doesn't handle nested multiparts,
    > which the requests I'm getting have. You have to use a FieldStorage
    > object to do that, and that only works if you're actually in a cgi
    > environment, which I am not. The server responds to these requests
    > directly.
    >
    > Anyway, thanks for the idea.
    >
    > rg


    Hm, it actually seems to work if I manually pass in the outerboundary
    parameter and environ={'REQUEST_METHOD':'POST'}. That seems like the
    Right Answer.
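Passing the outer boundary by hand means first pulling the `boundary` parameter out of the request's Content-Type header. A minimal sketch of that step, assuming Python 3: `boundary_from_content_type` is a hypothetical helper and the header values are made up.

```python
from email.message import Message
from typing import Optional


def boundary_from_content_type(content_type: str) -> Optional[str]:
    """Return the multipart boundary named in a Content-Type value, if any."""
    holder = Message()
    holder["Content-Type"] = content_type
    # get_boundary() copes with quoting and parameter order.
    return holder.get_boundary()


outer = boundary_from_content_type('multipart/form-data; boundary="----abc123"')
none_found = boundary_from_content_type("application/x-www-form-urlencoded")
```

A non-multipart request simply yields `None`, which a handler could use to fall back to ordinary form parsing.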

    Woohoo!

    Thanks Michael!

    rg
     
    Ron Garret, Jul 4, 2008
    #6
  7. Ron Garret wrote:
    > Ron Garret wrote:
    >
    >> Michael Ströder wrote:
    >>
    >>> Ron Garret wrote:
    >>>> I'm writing a little HTTP server and need to parse request content that
    >>>> is mime-encoded. All the MIME routines in the Python standard library
    >>>> seem to have been subsumed into the email package, which makes this
    >>>> operation a little awkward.
    >>> How about using cgi.parse_multipart()?
    >>>

    >> Unfortunately cgi.parse_multipart doesn't handle nested multiparts,
    >> which the requests I'm getting have. You have to use a FieldStorage
    >> object to do that, and that only works if you're actually in a cgi
    >> environment, which I am not. The server responds to these requests
    >> directly.
    >>
    >> Anyway, thanks for the idea.

    >
    > Hm, it actually seems to work if I manually pass in the outerboundary
    > parameter and environ={'REQUEST_METHOD':'POST'} That seems like the
    > Right Answer.


    I'm also using it to parse form parameters in a message body received by
    POST.

    Ciao, Michael.
     
    Michael Ströder, Jul 5, 2008
    #7
  8. Ron Garret

    Ron Garret Guest

    Michael Ströder wrote:

    > Ron Garret wrote:
    > > Ron Garret wrote:
    > >
    > >> Michael Ströder wrote:
    > >>
    > >>> Ron Garret wrote:
    > >>>> I'm writing a little HTTP server and need to parse request content that
    > >>>> is mime-encoded. All the MIME routines in the Python standard library
    > >>>> seem to have been subsumed into the email package, which makes this
    > >>>> operation a little awkward.
    > >>> How about using cgi.parse_multipart()?
    > >>>
    > >> Unfortunately cgi.parse_multipart doesn't handle nested multiparts,
    > >> which the requests I'm getting have. You have to use a FieldStorage
    > >> object to do that, and that only works if you're actually in a cgi
    > >> environment, which I am not. The server responds to these requests
    > >> directly.
    > >>
    > >> Anyway, thanks for the idea.

    > >
    > > Hm, it actually seems to work if I manually pass in the outerboundary
    > > parameter and environ={'REQUEST_METHOD':'POST'} That seems like the
    > > Right Answer.

    >
    > I'm also using it to parse form parameters in a message body received by
    > POST.
    >
    > Ciao, Michael.


    Just for the record, here's the incantation I ended up with:


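    import cgi
    from BaseHTTPServer import BaseHTTPRequestHandler  # Python 2 module name

    class post_handler(BaseHTTPRequestHandler):
        def do_POST(self):
            # FieldStorage reads the body from rfile; the headers supply
            # Content-Type (with the boundary) and Content-Length.
            form = cgi.FieldStorage(fp=self.rfile, headers=self.headers,
                                    environ={'REQUEST_METHOD':'POST'})
            ...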


    works like a charm.

    rg
     
    Ron Garret, Jul 6, 2008
    #8
