Matching single character words

sandeep.de

Hi,

I am very new to regular expressions. I have a requirement to match
abbreviations in words, e.g. the input would be "A B C Corp." and I would
have to identify the A, B, C as being abbreviations using regular
expressions.

In other words, I need to match only single-character words from a
series of words.

How do I do that?

Thanks for your help.

Sandeep.
 
Charles DeRykus

A possibility:
@abbrev = $input =~ / (?: ^ | (?<=\s)) ( [[:alpha:]] )\s /xg;

But are you sure you want other abbreviations like A.B., A&W, 3M, etc.
to be excluded? This pattern will skip them.

hth,
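
A minimal sketch of the same idea as a complete script, assuming the sample input from the question. The sub name single_letter_words is hypothetical, and the trailing \s of the posted pattern is swapped for a lookahead so that a single letter at the very end of the string is also found. That is a variation, not the pattern as posted.

```perl
use strict;
use warnings;

# Hypothetical helper: return every single-letter word in a string.
# A "word" here is one letter delimited by whitespace or the string edges.
sub single_letter_words {
    my ($input) = @_;
    return $input =~ /
        (?<!\S)          # preceded by whitespace or start of string
        ( [[:alpha:]] )  # capture exactly one letter
        (?!\S)           # followed by whitespace or end of string
    /xg;
}

my @abbrev = single_letter_words('A B C Corp.');
print "@abbrev\n";    # prints: A B C
```

As with the posted pattern, forms such as A.B., A&W and 3M produce no matches here, because their letters are not delimited by whitespace on both sides.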
 
Dr.Ruud

(e-mail address removed) wrote:
I need to match only single-character words from a
series of words.

@abs = / (\b (?: [[:alpha:]] (?: [^[:alpha:]] | \b ) )+ ) /gx;

See `perldoc perlre`.

Test:

$ echo 'xxx A B C XXXX A.B.C., D.E.F. xxx D E F' |
    perl -nle '$"="\n";
      @_=/(\b(?:[[:alpha:]](?:[^[:alpha:]]|\b))+)/g;
      print "@_"'

A B C
A.B.C.
D.E.F.
D E F
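
The one-liner can also be written as a small script, sketched here with the same pattern and test string. The sub name abbreviation_runs is hypothetical. One detail the printed output hides: a spaced run such as "A B C" is matched together with the space that follows it, so this sketch trims trailing whitespace from each match.

```perl
use strict;
use warnings;

# Hypothetical helper: collect runs of single letters, each followed by
# a non-letter or a word boundary, then trim trailing whitespace.
sub abbreviation_runs {
    my ($text) = @_;
    my @runs = $text =~ /(\b(?:[[:alpha:]](?:[^[:alpha:]]|\b))+)/g;
    s/\s+\z// for @runs;    # "A B C " becomes "A B C"
    return @runs;
}

my @found = abbreviation_runs('xxx A B C XXXX A.B.C., D.E.F. xxx D E F');
print "$_\n" for @found;
```

Note that this pattern also picks up ordinary one-letter words such as "a" or "I", so the results are candidates rather than confirmed abbreviations.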
 
Tad McClellan

I am very new to regular expressions. I have a requirement to match
abbreviations


That is not pattern matching (regular expressions), that is
Natural Language Processing, a much, much harder thing to do.

Finding things that sorta kinda look like maybe they might
be abbreviations is easy, deciding if a particular matched
item actually is an abbreviation is not easy.

in words, e.g. the input would be "A B C Corp." and I would
have to identify the A, B, C as being abbreviations using regular
expressions.


Consider: "A problem that I must solve is to find abbreviations."

How can you tell that the "A" and "I" there are not abbreviations?

In other words, I need to match only single-character words from a
series of words.

How do I do that?


while ( $str =~ /\b([A-Z])\b/g ) {  # untested
    print "'$1' looks like it MIGHT be an abbreviation.\n";
}
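
To make the false-positive point concrete, here is a small sketch that wraps the loop above in a sub and runs it over the example sentence from this post. The sub name candidate_abbreviations and the variable $sentence are hypothetical; the pattern is unchanged.

```perl
use strict;
use warnings;

# Hypothetical helper: list uppercase letters that stand alone between
# word boundaries. These are only candidates, not confirmed abbreviations.
sub candidate_abbreviations {
    my ($str) = @_;
    my @candidates;
    while ( $str =~ /\b([A-Z])\b/g ) {
        push @candidates, $1;
    }
    return @candidates;
}

my $sentence = 'A problem that I must solve is to find abbreviations.';
print "'$_' looks like it MIGHT be an abbreviation.\n"
    for candidate_abbreviations($sentence);
```

Run on that sentence, the loop reports both 'A' and 'I', neither of which is an abbreviation. The pattern finds candidates; deciding which ones are real is the hard part.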
 
