string search and modification

Discussion in 'Python' started by Jim Britain, Sep 6, 2006.

  1. Jim Britain

    Jim Britain Guest

    I know absolutely nothing about Python. My background is shell
    scripts, assembly language, and C programming. Currently I work in
    network support.

    This is a portion of a Python script written by aaronsinclair. The
    full script can be found at:
    http://forums.ev1servers.net/printthread.php?t=50435&page=3&pp=25

    It monitors sendmail logfiles for dictionary attacks, and blocks with
    additions to iptables.



    The part I am having a problem with is the regular expression in the
    re.search function.

    Basically, it is insufficiently qualified.

    Troublesome example logfile line:
    Sep 6 00:46:32 tabor sendmail[26642]: k867kMH5026642:
    dsl-kk-dynamic-013.38.22.125.touchtelindia.net [125.22.38.13] (may be
    forged)
    Possible SMTP RCPT flood, throttling.

    (all one line in the logfile)

    What is happening is that two sections of this logfile line qualify,
    and it matches on the wrong one.
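
The wrong match can be reproduced outside the script. This is a minimal sketch, assuming the wrapped sample above is rejoined into a single string; the names `line` and `pattern` are only for illustration, and the pattern is the script's expression written as a raw string.

```python
import re

# The sample logfile entry from above, rejoined into a single line.
line = (
    "Sep 6 00:46:32 tabor sendmail[26642]: k867kMH5026642: "
    "dsl-kk-dynamic-013.38.22.125.touchtelindia.net [125.22.38.13] "
    "(may be forged) Possible SMTP RCPT flood, throttling."
)

# The script's original, unanchored dotted-quad pattern.
pattern = r"[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}"

# re.search returns the leftmost match; re.findall lists every match.
first = re.search(pattern, line).group()
every = re.findall(pattern, line)

print(first)  # 013.38.22.125 (taken from the hostname, not the brackets)
print(every)  # ['013.38.22.125', '125.22.38.13']
```

Because `re.search` stops at the leftmost match, the reversed address embedded in the hostname wins over the bracketed one.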

    What I would like to happen, is to return the value from within the
    brackets, in every successful match.

    I have tried putting \[ at the beginning of the string, but I have
    been unsuccessful at editing the qualifying character back out again
    and returning the real IP string (if indeed I even got a match).

    This script runs in the background, so to debug it I would have to
    build a complete test environment and rewrite the whole darn thing to
    run visibly and use different files.

    I thought I would ask, like a beginner, for the trivial solution
    (besides, I have been up all night and all day).

    Thanks in advance for any help. Quick searches online for tutorial
    documentation, and through the books I have, turned up nothing
    useful.

    I would like to match [123.123.123.123] (including the qualifying
    brackets), but be able to simply return the contents, without the
    brackets.

    (Perl would be easy, but it's not Python)

    def identifyHost(self):

        for line in self.fileContents:
            if re.search("throttling", line.lower()):
                ip = re.search(r"[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}",
                               line)

                if ip.group() in self.ignoreList:
                    continue
                if not ip.group() in self.banList:
                    self.banList.append(ip.group())
    Jim Britain, Sep 6, 2006
    #1

  2. Paul Rubin

    Paul Rubin Guest

    Jim Britain writes:
    > I would like to match [123.123.123.123] (including the qualifying
    > brackets), but be able to simply return the contents, without the
    > brackets.



    >>> import re
    >>> p=r'\[((\d{1,3}\.){3}\d{1,3})\]'
    >>> m = 'dsl-kk-dynamic-013.38.22.125.touchtelindia.net [125.22.38.13] (may be'
    >>> g=re.search(p,m)
    >>> g.group(1)
    '125.22.38.13'

    g.group(1) matches the stuff in the first set of parens, which excludes
    the square brackets.
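
For illustration, here is the same pattern run against the full sample line from the first post, showing what each group number returns. The variable names are made up for this sketch, and the non-capturing `(?:...)` variant at the end is an optional alternative rather than part of the reply above.

```python
import re

# The sample logfile entry from the first post, rejoined into one line.
line = (
    "Sep 6 00:46:32 tabor sendmail[26642]: k867kMH5026642: "
    "dsl-kk-dynamic-013.38.22.125.touchtelindia.net [125.22.38.13] "
    "(may be forged) Possible SMTP RCPT flood, throttling."
)

pattern = r"\[((\d{1,3}\.){3}\d{1,3})\]"
m = re.search(pattern, line)

print(m.group(0))  # [125.22.38.13]  whole match, brackets included
print(m.group(1))  # 125.22.38.13    outer parentheses, brackets excluded
print(m.group(2))  # 38.             last repetition of the inner group

# Optional variant: (?:...) groups without capturing, so the address
# is the only captured group.
tidy = re.search(r"\[((?:\d{1,3}\.){3}\d{1,3})\]", line)
print(tidy.groups())  # ('125.22.38.13',)
```

Note that `sendmail[26642]` also contains a bracket, but it is skipped because no dotted quad follows it.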
    Paul Rubin, Sep 6, 2006
    #2

  3. Jim Britain

    Jim Britain Guest

    On 06 Sep 2006 13:23:43 -0700, Paul Rubin wrote:

    >Jim Britain writes:
    >> I would like to match [123.123.123.123] (including the qualifying
    >> brackets), but be able to simply return the contents, without the
    >> brackets.

    >
    >
    > >>> p=r'\[((\d{1,3}\.){3}\d{1,3})\]'
    > >>> m = 'dsl-kk-dynamic-013.38.22.125.touchtelindia.net [125.22.38.13] (may be'
    > >>> g=re.search(p,m)
    > >>> g.group(1)

    >'125.22.38.13'
    >
    >g.group(1) matches the stuff in the first set of parens, which excludes
    >the square brackets.


    Final integration:

    def identifyHost(self):
        for line in self.fileContents:
            if re.search("throttling", line.lower()):
                p=r'\[((\d{1,3}\.){3}\d{1,3})\]'
                ip=re.search(p,line)
                if ip.group(1) in self.ignoreList:
                    continue
                if not ip.group(1) in self.banList:
                    self.banList.append(ip.group(1))


    Thanks for the help.
    Jim
    --
    Jim Britain, Sep 7, 2006
    #3
  4. John Machin

    John Machin Guest

    Jim Britain wrote:

    >
    > Final integration:
    >
    > def identifyHost(self):
    >     for line in self.fileContents:
    >         if re.search("throttling", line.lower()):
    >             p=r'\[((\d{1,3}\.){3}\d{1,3})\]'
    >             ip=re.search(p,line)


    A prudent pessimist might test for the complete absence of an IP
    address:
                  if not ip:
                      print("Huh?")  # or whatever

    >             if ip.group(1) in self.ignoreList:
    >                 continue
    >             if not ip.group(1) in self.banList:
    >                 self.banList.append(ip.group(1))
    >
    >
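
Putting the suggestions in this thread together, a standalone version might look like the sketch below. It is not the original script: `collect_hosts`, `BRACKETED_IP`, and the argument names are hypothetical, and the second and third sample lines are invented for the example. It assumes the right response to a throttling line with no bracketed address is simply to skip it.

```python
import re

# Bracketed dotted quad; only the address itself is captured.
BRACKETED_IP = re.compile(r"\[((?:\d{1,3}\.){3}\d{1,3})\]")


def collect_hosts(lines, ignore_list, ban_list):
    """Append newly seen throttled addresses to ban_list and return it."""
    for line in lines:
        if "throttling" not in line.lower():
            continue
        match = BRACKETED_IP.search(line)
        if match is None:
            # No bracketed address on this line: nothing to ban.
            continue
        ip = match.group(1)
        if ip in ignore_list or ip in ban_list:
            continue
        ban_list.append(ip)
    return ban_list


sample = [
    # The logfile entry from the first post.
    "Sep 6 00:46:32 tabor sendmail[26642]: k867kMH5026642: "
    "dsl-kk-dynamic-013.38.22.125.touchtelindia.net [125.22.38.13] "
    "(may be forged) Possible SMTP RCPT flood, throttling.",
    # Invented: a throttling line with no address at all.
    "Sep 6 00:47:01 tabor sendmail[26650]: Possible SMTP RCPT flood, throttling.",
    # Invented: an address that is on the ignore list.
    "host.example [10.0.0.1] Possible SMTP RCPT flood, throttling.",
]
sample.append(sample[0])  # a repeat of the first line, to show de-duplication

banned = collect_hosts(sample, ignore_list=["10.0.0.1"], ban_list=[])
print(banned)  # ['125.22.38.13']
```

Testing `match is None` before calling `group(1)` avoids the `AttributeError` that the unguarded version would raise on the second sample line.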
    John Machin, Sep 7, 2006
    #4
