manipulating a string

Discussion in 'C++' started by ma740988, Dec 28, 2008.

  1. ma740988

    ma740988 Guest

    I'm interested in displaying the variable names for methods with
    single arguments. The code below does just that and produces the right
    output, i.e.
    arg_a
    arg_b
    arg_c

    Something tells me the parse function could be a lot simpler.
    Critiques welcome. Thanks.


    #include <sstream>
    #include <string>
    #include <iostream>
    #include <iomanip>
    #include <algorithm>

    void parse ( std::string& to_parse ) {
        std::string::size_type const posb = to_parse.find_first_of( '(' ) ;
        std::string::size_type const pose = to_parse.find_last_of ( ')' ) ;
        if ( posb == std::string::npos || pose == std::string::npos ) {
            return ;
        }
        // substr takes (position, count), so pass the length of "( ... )"
        to_parse = to_parse.substr( posb, pose - posb + 1 );
        std::replace( to_parse.begin(), to_parse.end(), ' ', '+' );
        std::string::iterator it ;
        int end = 0; int beg = 0 ;
        int sz = to_parse.size();
        // walk backwards; sz is always one past the index 'it' points to
        for ( it = to_parse.end() - 1; it != to_parse.begin() ; --it, --sz ) {
            if ( *it == ')' ) {
                continue ;
            } else if ( *it == '+' ) {
                if ( end ) { break ; }      // separator before the name: done
                else { continue; }          // padding after the name
            } else {
                if ( !end ) {
                    end = sz ;              // one past the last character of the name
                } else {
                    beg = sz - 1;           // first character of the name seen so far
                }
            }
        }
        to_parse = to_parse.substr( beg, end - beg );
    }


    int main() {

    const std::string strr( "void Set_Whatever( int arg_a )\n"
    "void Set_This( unsigned int arg_b )\n"
    "void Set_That( double arg_c )\n" );
    std::istringstream isss( strr );
    std::string mline ;
    while ( std::getline ( isss, mline ) ) {
    parse ( mline ) ;
    std::cout << mline << std::endl;
    }
    std::cin.get() ;
    }
     
    ma740988, Dec 28, 2008
    #1

  2. On 29 Dec 2008, 10:24, Paavo Helde wrote:
    > ma740988 wrote:
    >
    > > I'm interested in displaying the variable names for methods with
    > > single arguments. The code below does just that and produces the right
    > > output, i.e.
    > > arg_a
    > > arg_b
    > > arg_c
    >
    > > Something tells me the parse function could be a lot simpler.
    > > Critiques welcome. Thanks.
    >
    > Parsing C++ code is not a trivial task. Note that your code attempts to
    > parse the line backwards, whereas C++ is only parsable forwards, I
    > believe. Thus it will fail in the presence of default argument values in
    > declarations, and probably in other cases.
    >
    > If the source code follows strict formatting rules, then it might be
    > doable, however.
    >
    > The work you do in the example code could be accomplished with a
    > single regex. I would write a Perl one-liner for that. If you insist
    > on learning C++ string manipulation, then you should stick either to
    > loops or to the find* functions; mixing them only creates a mess. E.g.
    > (untested!):
    >
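    > #include <cctype>   // isalnum
    > #include <string>
    >
    > typedef std::string::size_type pos_t;
    >
    > // Returns the last word before the closing ')', or "" on failure.
    > std::string parse_last_word(const std::string& to_parse) {
    >
    >     pos_t endparen = to_parse.rfind(')');
    >     if (endparen == to_parse.npos || endparen == 0) return "";
    >
    >     // Skip blanks between the name and ')'.
    >     pos_t word_end = to_parse.find_last_not_of(" \t", endparen - 1);
    >     if (word_end == to_parse.npos ||
    >         !isalnum(static_cast<unsigned char>(to_parse[word_end]))) {
    >         return "";
    >     }
    >     // The name starts right after the nearest separator to its left.
    >     pos_t word_start = to_parse.find_last_of(" \t*&/)", word_end);
    >     if (word_start == to_parse.npos) return "";
    >
    >     return to_parse.substr(word_start + 1, word_end - word_start);
    > }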
    >
    > This will fail for default argument values as well, and probably in other
    > cases.
    >
    > hth
    > Paavo

    It works well in this case (one argument with no default value and
    some spaces), but you would definitely benefit from using regular
    expressions.


    For instance, the following code:

    > std::string::size_type const posb = to_parse.find_first_of( '(' ) ;
    > std::string::size_type const pose = to_parse.find_last_of ( ')' ) ;
    > if ( posb == std::string::npos || pose == std::string::npos ) {
    >     return ;
    > }
    > to_parse = to_parse.substr( posb, pose - posb + 1 );


    could be written with a Perl regular expression as:

    if ( /.+\((.*)\).*/ )
    {
        $to_parse = trim($1);   # trim() is assumed to be a user-defined helper
    }
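
For comparison, here is a rough C++ counterpart of the Perl snippet above. It is only a sketch, assuming a C++11 compiler that provides `<regex>`. The function name `between_parens` and the exact pattern are my own choices for illustration: the pattern mirrors the find_first_of / find_last_of logic (first '(' to last ')') and trims the captured text inside the expression itself.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the text between the first '(' and the
// last ')' of the line, without surrounding whitespace.
// Returns "" when the line has no such pair of parentheses.
std::string between_parens(const std::string& line) {
    // [^(]*   everything before the first '('
    // (.*?)   the parameter text, captured lazily so \s* can eat the padding
    // [^)]*   anything after the last ')'
    static const std::regex re(R"re([^(]*\(\s*(.*?)\s*\)[^)]*)re");
    std::smatch m;
    if (std::regex_match(line, m, re)) {
        return m[1].str();
    }
    return "";
}
```

Calling `between_parens("void Set_This( unsigned int arg_b )")` should give back the parameter text `unsigned int arg_b`, which still has to be reduced to the name itself.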

    What are you working on, and what is the purpose of this?

    I ask because the algorithm would need refactoring to handle more
    than one parameter and to allow default values. I don't see other
    obvious limitations apart from the occasional comment between variables.
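
Handling several parameters and default values could start with something like the sketch below. It is not a real parser: the names `trim` and `split_params` are hypothetical, and it assumes there are no commas or '=' characters nested inside templates, parentheses or string literals.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: strip leading and trailing spaces and tabs.
std::string trim(const std::string& s) {
    const std::string::size_type first = s.find_first_not_of(" \t");
    if (first == std::string::npos) return "";
    const std::string::size_type last = s.find_last_not_of(" \t");
    return s.substr(first, last - first + 1);
}

// Hypothetical helper: split the text between the parentheses on commas
// and drop any default value, e.g. "int a = 0, double b" becomes
// {"int a", "double b"}.
std::vector<std::string> split_params(const std::string& param_list) {
    std::vector<std::string> params;
    std::string::size_type start = 0;
    while (start <= param_list.size()) {
        std::string::size_type comma = param_list.find(',', start);
        if (comma == std::string::npos) comma = param_list.size();
        std::string param = param_list.substr(start, comma - start);
        // Everything from '=' onwards is a default value: cut it off.
        const std::string::size_type eq = param.find('=');
        if (eq != std::string::npos) param.erase(eq);
        param = trim(param);
        if (!param.empty()) params.push_back(param);
        start = comma + 1;
    }
    return params;
}
```

Each returned entry is a single parameter, so the name extraction can then run on one parameter at a time.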

    However, sometimes there is no space character to rely on.
    Consider the following valid declarations:


    void f(int*x);
    const int * g (int&y);

    They break your algorithm, and you can't really forbid that style.
    My main idea is to treat the &, * and , characters separately.
    But running the code through a source code beautifier like AStyle
    first could do the trick. ;-)
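
One possible way to cope with missing spaces, sketched below under my own assumptions, is to stop at anything that cannot be part of an identifier instead of listing the separators one by one. The names `is_ident_char` and `param_name` are hypothetical, and the function expects a single parameter without a default value.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: true for characters allowed inside a C++ identifier.
bool is_ident_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Hypothetical helper: returns the trailing identifier of one parameter,
// so '*', '&' and blanks all act as separators.
// Returns "" if the parameter does not end with an identifier.
// Caveat: an unnamed parameter such as "int" yields "int".
std::string param_name(const std::string& param) {
    std::string::size_type end = param.size();
    // Skip trailing whitespace.
    while (end > 0 &&
           std::isspace(static_cast<unsigned char>(param[end - 1]))) {
        --end;
    }
    // Walk left while the characters still belong to an identifier.
    std::string::size_type begin = end;
    while (begin > 0 && is_ident_char(param[begin - 1])) {
        --begin;
    }
    return param.substr(begin, end - begin);
}
```

With this, `int*x` and `int&y` give `x` and `y` even though no space precedes the name.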

    Good luck, and tell us if you have improvements.
     
    Cédric Baudry, Jan 2, 2009
    #2
