regular expressions

Tom Allison

I'm sorry if this has been done before, but I can't seem to find any good examples:

I'm trying to search mail messages for headers based on the following:

config = YAML.load_file('config.yaml')

$stdin.each do |line|
  puts line if line =~ /^Recevied:/ .. line =~ /by #{config['hostname']}/o
end

and guess what? It doesn't work.
If I hard code the hostname it works fine.
But this and /by config['hostname']/ both fail to match.

So, how do you set a variable in a regular expression?

Also, how would I match multiple lines?

I would much rather match something like:

/^Received:.+?\s{4,}by #{config['hostname']}/sm
because that will pick up the one Received header I want.
I can write in Perl 5 regex but I'm not as familiar with Ruby's methods.
 
MonkeeSage

Tom said:
So, how do you set a variable in a regular expression?

You have it right.

Also, how would I match multiple lines?

You're right again with the //m. But you can't read line-by-line
(IO#each) and match across multiple lines without building some kind of
parser. It's usually better to just read the data into a single string
for multiline matches.

Input (multiline):
Received: from mail.papa.smurf (mail.papa.smurf [127.0.0.1])
by (e-mail address removed)

config = {'hostname' => '(e-mail address removed)'}
line = $stdin.read
puts line if line =~ /^Received:.*by #{config['hostname']}/m

I would much rather match something like:

/^Received:.+?\s{4,}by #{config['hostname']}/sm
because that will pick up the one Received header I want.
I can write in Perl 5 regex but I'm not as familiar with Ruby's methods.

Input (single line):
Received: from mail.papa.smurf (mail.papa.smurf [127.0.0.1]) by
(e-mail address removed)

$stdin.each { |line|
  puts line if line =~ /^Received:.+?\s{4,}by #{config['hostname']}/
}

Regards,
Jordan
 
Tom Allison

MonkeeSage said:
Tom said:
So, how do you set a variable in a regular expression?

You have it right.

Also, how would I match multiple lines?

You're right again with the //m. But you can't read line-by-line
(IO#each) and match across multiple lines without building some kind of
parser. It's usually better to just read the data into a single string
for multiline matches.

Input (multiline):
Received: from mail.papa.smurf (mail.papa.smurf [127.0.0.1])
by (e-mail address removed)

config = {'hostname' => '(e-mail address removed)'}
line = $stdin.read
puts line if line =~ /^Received:.*by #{config['hostname']}/m

This prints out the entire message.

I'm only trying to get the one section that matched the Received header.
I should be able to do this with the statement
puts line if line =~ /Received/ .. line =~ /#{config['hostname']}/
At least that's what I believe I'm being led towards.

But the match on /Received/ turns on the printing and it never matches on the
second regexp.
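
One way to print only the matching header, rather than the whole message, is to collect each Received header together with its indented continuation lines and then pick the one that names the host. This is a minimal sketch, not a full mail parser: the sample message and the hostname mx.example.org are made up for illustration, and it assumes continuation lines start with a space or tab.

```ruby
# Hypothetical sample message; hostnames and dates are invented.
message = <<~MAIL
  Received: from mail.papa.smurf (mail.papa.smurf [127.0.0.1])
      by mx.example.org with ESMTP; Mon, 2 Oct 2006 10:00:00 -0400
  Received: from elsewhere.example (elsewhere.example [192.0.2.1])
      by relay.example.net with SMTP; Mon, 2 Oct 2006 09:59:00 -0400
  Subject: test

  body text
MAIL

config = { 'hostname' => 'mx.example.org' }

# Escape the hostname so its dots are matched literally.
host = Regexp.escape(config['hostname'])

# Each match is one "Received:" line plus any following lines that
# begin with whitespace (folded header continuation lines).
received = message.scan(/^Received:.*(?:\n[ \t]+.*)*/)

# Keep the first header whose text contains "by <hostname>".
wanted = received.find { |header| header =~ /\sby #{host}\b/ }

puts wanted
```

Because the whole message is read into one string first, no /m flag is needed: the pattern steps over the newline explicitly with \n instead of letting . match it.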
 
MonkeeSage

Tom said:
I'm only trying to get the one section that matched the Received header.
I should be able to do this with the statement
puts line if line =~ /Received/ .. line =~ /#{config['hostname']}/
At least that's what I believe I'm being led towards.

In Ruby, .. is a range operator (and, inside a conditional, a Perl-style
flip-flop). If you want a grouped match, use parens and a capture
variable such as $1, just like Perl.

$stdin.each { |line|
  puts $1 if line =~ /^Received:(.+?)\s{4,}by #{config['hostname']}/
  # prints " from mail.papa.smurf (mail.papa.smurf [127.0.0.1])"
}

Regards,
Jordan
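
For comparison, when .. sits between two non-literal conditions inside an if, Ruby treats it as a flip-flop, much as Perl does, so the line-by-line form from the original question can work once the first pattern is spelled correctly. The sketch below is illustrative only: a StringIO stands in for $stdin, and the hostname mx.example.org is invented.

```ruby
require 'stringio'

# Hypothetical input standing in for $stdin.
input = StringIO.new(<<~MAIL)
  Received: from mail.papa.smurf (mail.papa.smurf [127.0.0.1])
      by mx.example.org with ESMTP
  Subject: test
MAIL

host = Regexp.escape('mx.example.org')
printed = []

input.each_line do |line|
  # Flip-flop: switches on at a line matching the first pattern and
  # stays on through the line that matches the second pattern.
  printed << line if (line =~ /^Received:/)..(line =~ /by #{host}/)
end

puts printed
```

If the second pattern never matches (for example because the interpolated value is not what you expect), the flip-flop stays on and every remaining line is printed.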
 
Eero Saynatkari

Tom said:
I'm only trying to get the one section that matched the Received header.
I should be able to do this with the statement
puts line if line =~ /Received/ .. line =~ /#{config['hostname']}/
At least that's what I believe I'm being led towards.

In ruby .. is a range operator. If you want a grouped match, use parens
and a backreference just like perl.

$stdin.each { |line|
  puts $1 if line =~ /^Received:(.+?)\s{4,}by #{config['hostname']}/
  # prints " from mail.papa.smurf (mail.papa.smurf [127.0.0.1])"
}

And you almost certainly want to Regexp.escape that interpolation.
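
A short illustration of why the escape matters, using a made-up hostname: without Regexp.escape, each dot in the interpolated string is a regex wildcard, so the pattern also matches text that is not the hostname.

```ruby
hostname = 'mx.example.org' # hypothetical value

unescaped = /by #{hostname}/
escaped   = /by #{Regexp.escape(hostname)}/

# A line that is NOT the hostname but fits the unescaped pattern,
# because each "." matches any single character.
lookalike = 'by mxXexampleYorg with ESMTP'
genuine   = 'by mx.example.org with ESMTP'

p lookalike =~ unescaped # 0   (false positive)
p lookalike =~ escaped   # nil (correctly rejected)
p genuine   =~ escaped   # 0   (real hostname still matches)
```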

 
Tom Allison

Paul said:
Tom said:
I'm sorry if this has been done before, but I can't seem to find any good
examples:

I'm trying to search mail messages for headers based on the following:

config = YAML.load_file('config.yaml')

$stdin.each do |line|
  puts line if line =~ /^Recevied:/ .. line =~ /by #{config['hostname']}/o
end

and guess what? It doesn't work.

Yes, it doesn't, and I think I know why. I have been hearing for years how
it just doesn't matter whether young people learn how to spell common
words, and I have steadfastly taken the position that it does matter.

You make a pretty good example of another lost art...
 
Leslie Viljoen

I'm sorry if this has been done before, but I can't seem to find any good examples:

I'm trying to search mail messages for headers based on the following:

config = YAML.load_file('config.yaml')

$stdin.each do |line|
  puts line if line =~ /^Recevied:/ .. line =~ /by #{config['hostname']}/o
end

and guess what? It doesn't work.
If I hard code the hostname it works fine.
But this and /by config['hostname']/ both fail to match.

To eliminate a possibility, do

p config['hostname']

just before the $stdin line there. I can't tell you how many times I
have forgotten about invisible newlines tacked onto my variables that
cause all sorts of matching to fail.


Les
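
To make the invisible-newline case concrete, here is a small sketch. The config hash is hypothetical and is built by hand with a trailing newline; whether a real config.yaml produces such a value depends on how the file is written.

```ruby
# Hypothetical config value carrying a stray trailing newline.
config = { 'hostname' => "mx.example.org\n" }

# p shows the string with escapes, so the newline is visible.
p config['hostname']

line = "    by mx.example.org with ESMTP\n"

# The interpolated newline becomes part of the pattern, so the regex
# demands a line break right after the hostname and fails here.
raw = /by #{config['hostname']}/

# Stripping whitespace (and escaping) gives the intended pattern.
clean = /by #{Regexp.escape(config['hostname'].strip)}/

p line =~ raw   # nil
p line =~ clean # 4
```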
 
