Backreferences in Python?

Discussion in 'Python' started by Pankaj, Jan 23, 2006.

  1. Pankaj

    Pankaj Guest

    I have something like the code below in Perl and I am searching for an
    equivalent in Python:

    ::: Perl :::
    ***********
    while( <FILEHANDLE> )
    {

    line = $_;

    pattern = "printf\( \"$lineNo \" \),";

    line =~ s/"for(.*)\((*.)\;(.*)/for$1\($pattern$2\;$3/g;
    }

    This is used to

    search for : for ( i = 0; i < 10; i++)
    Replace with: for( printf( "10" ), i =0; i < 10; i++)
    Where 10 is the line no.

    ****************************************
    What I tried in Python was:
    ****************************************

    f = open( "./1.c", "r")
    fNew = open( "./1_new.c", "w")
    for l in f:
        print l
        lineno = lineno + 1
        strToFind = "for\((.*)\;(.*)"

        ## For Converting int to string, i.e. line no. to string
        lineNoClone = lineno

        pattern = "printf(\"" + str( lineNoClone) + "\"),"

        print pattern

        strToReplace = "for\(" + pattern + "\1\;"

        fNew.write( l.replace( strToFind, strToReplace) )

        print l

    fNew.close()
    Pankaj, Jan 23, 2006
    #1

  2. Pankaj wrote:
    > [...]
    > ****************************************
    > What I tried in Python was:
    > ****************************************
    >
    > f = open( "./1.c", "r")
    > fNew = open( "./1_new.c", "w")
    > for l in f:
    >     print l
    >     lineno = lineno + 1
    >     strToFind = "for\((.*)\;(.*)"
    > [...]


    Regular expressions are not handled automatically in Python the way you
    apparently think they are.

    In Python, you will need to use the "re" module:

    http://docs.python.org/lib/module-re.html
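
To illustrate the point: str.replace treats its first argument as literal text, so a regex-looking search string never matches, whereas re.sub compiles the same string as a pattern. The following is only a minimal sketch in Python 3 syntax; the sample line and the names `line` and `regex` are made up for illustration.

```python
import re

line = "for( i = 0; i < 10; i++)"
regex = r"for\((.*?);(.*)"

# str.replace looks for the literal characters of `regex`, which do not
# occur in the line, so the line comes back unchanged.
literal_result = line.replace(regex, "X")

# re.sub compiles the same string as a pattern; \1 and \2 in the raw
# replacement string refer to the two captured groups.
regex_result = re.sub(regex, r'for(printf("10"),\1;\2', line)

print(literal_result)
print(regex_result)
```

The replacement is written as a raw string so that the backslashes reach re.sub intact.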

    -- Gerhard
    Gerhard Häring, Jan 23, 2006
    #2

  3. Pankaj

    Pankaj Guest

    My attempts with re have not yielded results:


    {
        strToFind = 'for*;*'

        ## Converting int to string, i.e. line no. to string
        lineNoClone = lineno

        pattern = "printf(\"" + str( lineNoClone) + "\"),"

        regObj = re.compile( strToFind)

        m = regObj.search( l)

        if ( m != None ) :
            subStrPattern1_hasInitialization = "\1"
            #m.group(1)

            subStrPattern2_hasRestTillEnd = "\2"
            #m.group(2)

            strToReplace = "for(" + pattern + subStrPattern1_hasInitialization + ";" + subStrPattern2_hasRestTillEnd
            fNew.write( regObj.sub( strToFind, strToReplace ) )
        else:
            fNew.write( l)
    }


    The problem here is that I am not getting backreferences using \1 and \2.

    The string: for( i =0; i < 10; i++)

    is being replaced by: for *;* {

    I don't understand this. The documentation says that \1 and \2 hold
    backreferences, so where are they?
    Pankaj, Jan 23, 2006
    #3
  4. Pankaj

    Pankaj Guest

    I got my answer:

    if ( m != None ) :
        subStrPattern1_hasInitialization = m.group(1)

        subStrPattern2_hasRestTillEnd = m.group(2)

        # named "captured" rather than "str" so the built-in str() stays usable
        captured = subStrPattern1_hasInitialization + subStrPattern2_hasRestTillEnd
        strToReplace = "for(" + pattern + captured


    This gave me my solution.

    But I still don't understand why I have to concatenate the groups first
    and only then place them in the replacement. When I tried concatenating
    them straight into the replacement string, it did not work.

    Does anybody know the reason?
    Pankaj, Jan 23, 2006
    #4
  5. Duncan Booth

    Duncan Booth Guest

    Pankaj wrote:

    > The problem here is that I am not getting backreferences using \1 and \2.
    >


    You wrote:
    > subStrPattern1_hasInitialization = "\1"


    "\1" is the way to create a string containing a control-A character. What
    you actually wanted was a string containing a backslash and a "1", so you
    need either:

    "\\1"

    or

    r"\1"

    Try using the print statement to see what all those strings you are
    creating actually contain.
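
The difference can be checked directly. The sketch below (Python 3 syntax; the sample strings and variable names are invented for illustration) compares the literals and then passes both forms to re.sub as a replacement.

```python
import re

plain = "\1"      # one character: control-A (code point 1)
raw = r"\1"       # two characters: a backslash followed by "1"
escaped = "\\1"   # the same two characters, written with an escaped backslash

print(len(plain), repr(plain))
print(len(raw), repr(raw))

# Only the two-character form is seen by re.sub as a reference to group 1.
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "hello world")
# Without the raw prefix the replacement is just two control characters.
broken = re.sub(r"(\w+) (\w+)", "\2 \1", "hello world")

print(swapped)
print(repr(broken))
```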
    Duncan Booth, Jan 23, 2006
    #5
  6. Paul McGuire

    Paul McGuire Guest

    "Pankaj" <> wrote in message
    news:...
    >
    > I have something like the code below in Perl and I am searching for an
    > equivalent in Python:
    >
    > ::: Perl :::
    > ***********
    > while( <FILEHANDLE> )
    > {
    >
    > line = $_;
    >
    > pattern = "printf\( \"$lineNo \" \),";
    >
    > line =~ s/"for(.*)\((*.)\;(.*)/for$1\($pattern$2\;$3/g;
    > }
    >
    > This is used to
    >
    > search for : for ( i = 0; i < 10; i++)
    > Replace with: for( printf( "10" ), i =0; i < 10; i++)
    > Where 10 is the line no.
    >


    Here is a solution using pyparsing instead of re's. You're already used to
    re's from using Perl, so you may be more comfortable using that tool in
    Python as well. But pyparsing has some builtin features for pattern
    matching, calling out to callback routines during parsing, and a lineno
    function to report the current line number, all wrapped up in a simple
    transformString method call.

    Download pyparsing at http://pyparsing.sourceforge.net.

    -- Paul


    from pyparsing import Keyword,SkipTo,lineno,cStyleComment

    # define grammar for a for statement
    for_ = Keyword("for")
    forInitializer = SkipTo(';').setResultsName("initializer")
    forStmt = for_ + "(" + forInitializer + ';'

    # ignore silly comments
    forStmt.ignore(cStyleComment)

    # setup a parse action that will insert line numbers
    # parse actions are all called with 3 args:
    # - the original string being parsed
    # - the current parse location where the match occurred
    # - the matching tokens
    # if a value is returned from this function, transformString will
    # insert it in place of the original content
    def insertPrintStatement(st,loc,toks):
        lineNumber = lineno(loc,st)
        if toks[0]:
            return r'print("%d\n"), %s' % (lineNumber,toks[0])
        else:
            return r'print("%d\n")' % lineNumber
    forInitializer.setParseAction(insertPrintStatement)

    # transform some code
    # this is how you would read in a whole file as a single string
    #testdata = file(inputfilename).read()
    # to read the entire file into a list of strings, do:
    #testdata = file(inputfilename).readlines()
    # for now, just fake some source code
    testData = """
    for(i = 0; i <= 100; ++i)
    {
    /* some stuff */
    }

    for (;;;)
    {
    /* do this forever */
    }

    /* this for has been commented out
    for(a = -1; a < 0; a++)
    */

    """

    # use the grammar and the associated parse action to
    # transform the source code
    print forStmt.transformString(testData)

    --------------------------
    Gives:
    for(print("2\n"), i = 0; i <= 100; ++i)
    {
    /* some stuff */
    }

    for(print("7\n");;;)
    {
    /* do this forever */
    }

    /* this for has been commented out
    for(a = -1; a < 0; a++)
    */
    Paul McGuire, Jan 23, 2006
    #6
  7. Pankaj wrote:

    > ::: Perl :::

    > ***********
    > while( <FILEHANDLE> )
    > {
    >
    > line = $_;
    >
    > pattern = "printf\( \"$lineNo \" \),";
    >
    > line =~ s/"for(.*)\((*.)\;(.*)/for$1\($pattern$2\;$3/g;
    > }
    >
    > This is used to
    >
    > search for : for ( i = 0; i < 10; i++)
    > Replace with: for( printf( "10" ), i =0; i < 10; i++)
    > Where 10 is the line no.



    import re
    import fileinput

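    for L in fileinput.input(inplace=True):
        # filelineno() lives in the fileinput module
        pattern = 'printf("%d"),' % fileinput.filelineno()
        # (.*) instead of the invalid (*.); no backslash before "(" in the replacement
        L = re.sub(r"for(.*)\((.*);(.*)", r"for\1(%s\2;\3" % pattern, L)
        print L,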

    or something
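
A variant of the same idea that can be tried without touching any files, written as a sketch in Python 3 syntax. The function name `instrument_for_loops`, the pattern `FOR_HEAD`, and the sample lines are hypothetical. Instead of a \1-style template it uses a replacement function, so the captured text is read with match.group(); the pattern is simplified to allow optional whitespace before the parenthesis and assumes the whole for header sits on one line.

```python
import re

# "for", optional whitespace, "(", then the initializer up to the first ";".
FOR_HEAD = re.compile(r"\bfor(\s*)\(([^;]*);")

def instrument_for_loops(lines):
    """Return a copy of lines with printf("<line number>") added to each for header."""
    result = []
    for number, text in enumerate(lines, start=1):
        def add_printf(match, number=number):
            spacing, init = match.group(1), match.group(2)
            call = 'printf("%d")' % number
            # An empty initializer, as in for(;;), needs no comma.
            if init.strip():
                return "for%s(%s,%s;" % (spacing, call, init)
            return "for%s(%s;" % (spacing, call)
        result.append(FOR_HEAD.sub(add_printf, text))
    return result

sample = [
    "int i;",
    "for ( i = 0; i < 10; i++)",
    "for(;;)",
]
for text in instrument_for_loops(sample):
    print(text)
```

Because the replacement is built by a function, nothing in the inserted printf text can be misread as a backreference.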
    --
    Giovanni Bajo
    Giovanni Bajo, Jan 23, 2006
    #7
  8. Pankaj <> wrote:
    >search for : for ( i = 0; i < 10; i++)
    >Replace with: for( printf( "10" ), i =0; i < 10; i++)
    >Where 10 is the line no.


    >f = open( "./1.c", "r")
    >fNew = open( "./1_new.c", "w")
    >for l in f:
    > print l
    > lineno = lineno + 1
    > strToFind = "for\((.*)\;(.*)"

    [etc.]


    Ah, the dangers of thinking of all string manipulation as requiring
    regexps, thanks to their ubiquity in Perl. Observe:
    >search for : for ( i = 0; i < 10; i++)
    >Replace with: for( printf( "10" ), i = 0; i < 10; i++)


    All you need is:
    strToFind = "for ("
    strToReplace = 'for (printf( "'+str(lineno)+'" ),'
    # Note the use of '' to avoid the need to escape the "s
    fNew.write(l.replace(strToFind, strToReplace))

    (OK, maybe you do need the regexp if you've got any "for (;" loops,
    or inconsistencies as to whether it's "for(" or "for (". But if a
    simple string replace will do the job, use it.)
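
That approach can be sketched as follows (Python 3 syntax; the function name `add_line_printf` is hypothetical). The second call illustrates the caveat about spelling: a loop written as "for(" does not contain the search string and is left untouched.

```python
def add_line_printf(line, lineno):
    """Insert printf( "<lineno>" ), after a literal "for (" using plain str.replace."""
    to_find = "for ("
    to_replace = 'for (printf( "' + str(lineno) + '" ),'
    return line.replace(to_find, to_replace)

print(add_line_printf("for ( i = 0; i < 10; i++)", 10))
# A loop spelled without the space does not contain "for (" and is unchanged.
print(add_line_printf("for( i = 0; i < 10; i++)", 11))
```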

    --
    \S -- -- http://www.chaos.org.uk/~sion/
    ___ | "Frankly I have no feelings towards penguins one way or the other"
    \X/ | -- Arthur C. Clarke
    her nu becomeþ se bera eadward ofdun hlæddre heafdes bæce bump bump bump
    Sion Arrowsmith, Jan 24, 2006
    #8