excluding search string in regular expressions

  • Thread starter Franz Steinhaeusler

Franz Steinhaeusler

Hello,

I have the following problem:

Find only occurrences where the line contains '::' characters and
the previous line is not equal to '**/'.

So 2) and 3) should be found, and 1) should not.

1)
"""
**/
void C::B
"""

2)
"""

void C::B
"""

3)
"""
*/
void C::B
"""

I tried something like
"\*\*/\n.*::"

but this matches the opposite case.

So my question is: how can I exclude a pattern?

Single characters can be excluded with [^ab], but I need not(ab), something like:

not_this_brace_pattern(\*\*/\n).*::

Thank you in advance,
 

Franz Steinhaeusler

> single characters with [^ab] but I need not(ab)
>
> not_this_brace_pattern(\*\*/\n).*::

Sorry,
is this the solution (simply concatenating negated character classes)?

[^*][^*][^/]\n.*::


The background:
I want to scan C++ files to check whether they already have a doxygen comment.
The search should find all positions where the comment is missing.

This case is OK:

doxygen comment
**/
void CBs::InitButtonPanel (int progn1, int progn2)

The problem is finding the method or function definition, and for
that I need a regex.
It should ignore calls such as blabla::InitButtonPanel(a, b);

One indicator is that if there is a semicolon at the end,
the line is not a function or method definition.

So I would need
[^*][^*][^/]\n.*[)]*[^;]
but this is not working.

Thank you again in advance!
 

Mitja

Franz said:
> single characters with [^ab] but I need not(ab)
>
> not_this_brace_pattern(\*\*/\n).*::
>
> Sorry,
> is this the solution (simple concatenating
> [^*][^*][^/]\n.*:: ?

That should do, though it's admittedly far from elegant; I, too, would like to see a nicer solution.

> The background:
> I want to scan cpp files to check whether they have a doxygen
> comment already. It should find all positions where
> this is missing:
>
> ok
>
> doxygen comment
> **/
> void CBs::InitButtonPanel (int progn1, int progn2)

In this case, I'd replace \n with \s*, meaning any amount of whitespace.
 

Franz Steinhaeusler

Hello, thank you.

Oh, that is not quite right for finding a C function/method definition:

[^*][^*][^/]\s*.*[)]*[^;]

if func()
{

would also be found.

A more general solution for detecting functions/methods would be fine, something like:

[^*][^*][^/]\s*--c-method/function/definition
 

Oliver Fromme

Mitja said:
> > > [...]
> > > single characters with [^ab] but I need not(ab)
> > >
> > > not_this_brace_pattern(\*\*/\n).*::
> >
> > Sorry,
> > is this the solution (simple concatenating
> > [^*][^*][^/]\n.*:: ?
>
> That should do, though it's admittedly far from elegant; I, too,
> would like to see a nicer solution.

It won't work correctly. Franz needs a sub-expression that
matches anything which is not "**/". However, [^*][^*][^/]
is a character-wise negation, not word-wise. It doesn't
match "**/", but neither does it match "xx/", nor any other
string which has only one or two of the characters at the
right position.
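
A small check of that point, as a sketch only: the sample strings below are made up for illustration and do not come from the original posts. Each value in `samples` is what the character-class pattern is expected to report, and the comments say whether that is the wanted result.

```python
import re

# Franz's character-class attempt: three "not this character" slots.
char_class_re = re.compile(r"[^*][^*][^/]\n.*::")

# Made-up sample texts mapped to what the pattern is expected to report.
samples = {
    "**/\nvoid C::B": False,  # doxygen close: rejected, as wanted
    "abc\nvoid C::B": True,   # unrelated previous line: found, as wanted
    "xx/\nvoid C::B": False,  # should be found, but '/' sits in the third slot
    "a*b\nvoid C::B": False,  # should be found, but '*' sits in the second slot
}

results = {text: bool(char_class_re.search(text)) for text in samples}
```

The last two entries are the false negatives described above: the previous line is not "**/", yet the match is still refused.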

What you need is a "negative look-behind assertion". The
following Python-RE will do: (?<!\*\*/)\n.*::
Remember to use raw string notation, or you need to double
the backslashes:

import re
my_re_str = r"(?<!\*\*/)\n.*::"
my_re_obj = re.compile(my_re_str)

Note that you might want to use \s* instead of \n, so any
amount of whitespace (including newlines) is matched, not
just one single newline.
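
For anyone who wants to try the look-behind pattern, here is a minimal sketch that runs it against the three cases from the original question. The variable names are placeholders chosen for this example.

```python
import re

lookbehind_re = re.compile(r"(?<!\*\*/)\n.*::")

# The three cases from the question, written as plain strings.
case_1 = '"""\n**/\nvoid C::B\n"""'  # previous line is '**/': should be skipped
case_2 = '"""\n\nvoid C::B\n"""'     # previous line is empty: should be found
case_3 = '"""\n*/\nvoid C::B\n"""'   # previous line is '*/': should be found

found = [bool(lookbehind_re.search(case)) for case in (case_1, case_2, case_3)]
```

One caveat worth testing before swapping \n for \s* in this exact pattern: \s* may match nothing, so the search can then start at the "v" of "void", where the three preceding characters are "*/" plus a newline rather than "**/". With that change case 1 is reported as well, which is why the sketch keeps \n.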

For more information about regular expressions supported by
Python, refer to the Library Reference manual:

http://docs.python.org/lib/re-syntax.html

Best regards
Oliver
 

Diez B. Roggisch

> A more general solution for detecting functions/methods would be fine.

Maybe you should go for a real parser here, together with a
C-syntax grammar. Trying to cram this stuff into regexps is bound to
miss special cases. And it's generally difficult to have a regexp _not_
match a certain word.

Another approach would be to look for closing comments and function
definitions with several regexes, and use Python logic:

if doxy_close_rex.match(line):
    line = next(lines)
    if fun_def_rex.match(line):
        ....
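
A runnable sketch of this two-regex idea might look like the following. Both patterns and the helper name `undocumented_definitions` are assumptions made for illustration; the thread does not give them, and the definition pattern is deliberately rough (it only looks for `Class::method(` on a line without a semicolon).

```python
import re

# Both patterns are illustrative assumptions, not taken from the thread.
doxy_close_rex = re.compile(r"\s*\*\*/\s*$")         # line that closes a doxygen comment
fun_def_rex = re.compile(r".*\w::\w+\s*\([^;]*$")    # 'Class::method(' with no ';' after it


def undocumented_definitions(lines):
    """Yield (line_number, text) for definitions not preceded by a '**/' line."""
    previous = ""
    for number, line in enumerate(lines, start=1):
        if fun_def_rex.match(line) and not doxy_close_rex.match(previous):
            yield number, line.rstrip()
        previous = line
```

Keeping the "previous line" test in Python rather than in the regex avoids the negation problem entirely, at the cost of only looking back exactly one line.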
 

Bengt Richter

To look back a line, I think I'd just use a generator and test the current
and last lines to get what I wanted. E.g., perhaps you can adapt this
(I am just going literally by
"""
Find only occurrences where the line contains '::' characters and
the previous line is not equal to '**/'.
"""
which doesn't need a regex):

>>> def find_lines(lineseq):  # placeholder name; the def line is missing from the post
...     getline = iter(lineseq).next
...     curr = getline().rstrip()
...     while True:
...         last, curr = curr, getline().rstrip()
...         if '::' in curr and last != '**/': yield curr
...

I made a file, modifying your data a little:
----
1)
"""
**/
void C::B -- no (1)
"""

2)
"""

void C::B -- yes (2)
"""

3)
"""
*/
void C::B -- yes (3)
"""
----

Here's what the generator returns:
...
'void C::B -- yes (2)'
'void C::B -- yes (3)'
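
The generator above relies on Python 2 behaviour: `iter(...).next` no longer exists in Python 3, and a StopIteration escaping from a generator body is turned into a RuntimeError there (PEP 479). A sketch of the same idea for Python 3 could look like this; the function name is hypothetical.

```python
def lines_without_doxygen(lineseq):
    """Yield stripped lines containing '::' whose previous line is not '**/'."""
    last = None  # no previous line yet, so the first line is never yielded
    for raw in lineseq:
        curr = raw.rstrip()
        if last is not None and "::" in curr and last != "**/":
            yield curr
        last = curr
```

As in the original, the first line of the input only ever serves as a "previous" line.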


Regards,
Bengt Richter
 

Franz Steinhaeusler

> Maybe you should go for a real parser here, together with a
> C-syntax grammar. Trying to cram this stuff into regexps is bound to
> miss special cases. And it's generally difficult to have a regexp _not_
> match a certain word.

Hello Diez,

thanks, yes, it is difficult to "not" find a search string with a regex ;)

I only want a regex for an editor (which is written in Python),
so that it has a general function for finding a function/method
definition (of course it cannot be as accurate as a parser).
 

Franz Steinhaeusler

Hello Oliver,

> It won't work correctly. Franz needs a sub-expression that
> matches anything which is not "**/". However, [^*][^*][^/]
> is a character-wise negation, not word-wise. It doesn't
> match "**/", but neither does it match "xx/", nor any other
> string which has only one or two of the characters at the
> right position.

Yes, you are right, the approach above is wrong.

> What you need is a "negative look-behind assertion".

??, sounds interesting ;)

> The following Python-RE will do: (?<!\*\*/)\n.*::
> Remember to use raw string notation, or you need to double
> the backslashes:
>
> my_re_str = r"(?<!\*\*/)\n.*::"
> my_re_obj = re.compile(my_re_str)
>
> Note that you might want to use \s* instead of \n, so any
> amount of whitespace (including newlines) is matched, not
> just one single newline.
>
> For more information about regular expressions supported by
> Python, refer to the Library Reference manual:
>
> http://docs.python.org/lib/re-syntax.html

(?<!...)
Matches if the current position in the string is not preceded by a
match for ...

That is it.

Many thanks for your helpful reply,
 

Franz Steinhaeusler

Hello Bengt,

thank you for suggesting this interesting approach,

regards
 
