Expression problem

K.J. 44 · Nov 27, 2006

I have the two following regular expressions. I am not very good at
writing these yet. I am parsing some logs looking for some key words,
then taking the text after them.

if ($details[$i] =~ /\bworkstation\b\bname:\b\s\b[0-9A-Za-z_\-]+\b/i) {
    ($nothing, $hostName[$i]) = split(/:/, $&);
}
if ($details[$i] =~ /\buser\b\bname:\b\w+/i) {
    print("The username is: $&");
    ($nothing, $username[$i]) = split(/:/, $&);
}

The $details array is read in from a text file and this works fine.
What I want to do is search the $details text for certain key words,
then take the text right after. The first if statement should:

Find the word Workstation followed by a space followed by name:
followed by a space followed by a string of characters including word
characters and hyphens. If the match is found, take only the text
after the : as the workstation name.

The second part is along the same lines for username.

Find the word user followed by a space followed by the word name:
followed by a space followed by a string of word characters. Split at
the : and take the text after it as the username.

These do not seem to be finding matches when I can see them in the log
file. Where am I messing up?

Thanks.

Ric · Nov 27, 2006

You need to post at least one of the lines you read from your text file.
You should name the values you would like to have extracted.

example line: yesterday 12:30 hans:went:home

var1 should contain: 12:30
var2 should contain: went

This is much faster and more precise than explaining something in a huge
block of text.

xhoster · Nov 27, 2006

K.J. 44 said:
I have the two following regular expressions. I am not very good at
writing these yet. I am parsing some logs looking for some key words,
then taking the text after them.

if ($details[$i] =~ /\bworkstation\b\bname:\b\s\b[0-9A-Za-z_\-]+\b/i) {

About the only places that \b should be used are the beginning or end
of the regex, or before or after something like ".*". Also, I don't see how
you can ever productively have more than one in a row; doing so is probably
exactly the same as having just one.

\b is a zero-width condition. So "tion\b\bname" can never match, because
you are demanding that there is an n before the \b zero-width placeholder
and an n after it, and if that is the case then the conditions which define
\b are not met. Similarly, the \b in ":\b\s" can never match, as it has a
non-word character on each side, which is exactly where \b does not hold.
The \b in "\s\b[0-9A-Za-z_\-]" is almost redundant: it just forbids the "-"
in the character class from matching as the first character, which I doubt
is what you want.

....

Find the word Workstation followed by a space followed by name:
followed by a space followed by a string of characters including word
characters and hyphens. if the match is found, take only the text
after the : as the workstation name.

.... /\bworkstation name: ([-\w]+)/i ...

If by "space" you meant "white space", then turn my spaces into "\s"
(not "\b").

Xho
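
To make the point about \b concrete, here is a minimal sketch that runs the original pattern and the simplified one against a sample line. The log line is made up for illustration (no real log line was posted in the thread), and the "user name" pattern is just the same idea applied to the second regex, so treat this as an assumption-laden sketch rather than a fix tested against the actual logs.

```perl
use strict;
use warnings;

# Hypothetical log line; the real log format was not posted.
my $line = 'Workstation Name: HOST-01 User Name: kj44';

# Original pattern: it asks for "name" directly after "workstation" with
# a word boundary in between, which cannot happen between two word
# characters, so it never matches.
my $old = ($line =~ /\bworkstation\b\bname:\b\s\b[0-9A-Za-z_\-]+\b/i) ? 1 : 0;

# Simplified patterns: literal spaces, and parentheses capture the value,
# so there is no need for $& or split.
my ($host) = $line =~ /\bworkstation name: ([-\w]+)/i;
my ($user) = $line =~ /\buser name: (\w+)/i;

print "old pattern matched: $old\n";   # prints 0
print "host: $host\n";                 # prints HOST-01
print "user: $user\n";                 # prints kj44
```

Capturing with parentheses in list context returns just the wanted text, which also avoids the leading space that split(/:/, $&) would leave on the value.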

K.J. 44 · Nov 27, 2006

Thank you very much for your help and suggestions. I will try these
out.

Thanks!

Tad McClellan · Nov 28, 2006

Christian Winter said:
So the first piece of code could be cut down to
if ( $details[$i] =~ /\bworkstation\sname:\s([0-9A-Za-z_\-]+)/i ) {
    $hostName[$i] = $1;
}

There's still room for improvement, like

adding an //x modifier, particularly if you are "not very good"
at grokking regexes:

if ( $details[$i] =~ /\bworkstation
                      \s
                      name:
                      \s
                      (
                        [0-9A-Za-z_-]+
                      )
                     /ix
   )

e.g. removing A-Z from
the character class, as you are giving the /i modifier anyway,
so both upper and lowercase characters will be matched.

and removing the backslash. Hyphen is not meta in a character
class if it is first or last in the class.

Mr P · Nov 28, 2006

K.J. 44 said:
Thank you very much for your help and suggestions. I will try these
out.

Thanks!

Dr.Ruud · Nov 28, 2006

Mr P wrote:

PS: [0-9A-Za-z_\-] looks a LOT like \w

But \w matches 91801 characters (codepoints).
http://www.xs4all.nl/~rvtol/perl/unicount.pl
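
The difference between \w and the explicit class can be checked with a single non-ASCII letter. This is a small sketch, not taken from the script linked above; the exact count of characters that \w matches depends on the Perl and Unicode versions, and the /a modifier used in the last test assumes Perl 5.14 or later.

```perl
use strict;
use warnings;

# U+03B1 GREEK SMALL LETTER ALPHA, written as an escape so the example
# does not depend on the encoding of the source file.
my $alpha = "\x{03B1}";

my $w_matches     = ($alpha =~ /^\w$/)           ? 1 : 0;  # Unicode-aware \w
my $class_matches = ($alpha =~ /^[0-9A-Za-z_]$/) ? 1 : 0;  # explicit ASCII class
my $ascii_w       = ($alpha =~ /^\w$/a)          ? 1 : 0;  # \w limited to ASCII

print "\\w: $w_matches, class: $class_matches, \\w with /a: $ascii_w\n";
# prints: \w: 1, class: 0, \w with /a: 0
```

So if host names must be restricted to ASCII letters, digits, underscore and hyphen, the explicit class (or \w under /a) is the safer choice.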
