Rubish Way of extracting elements

Daniel Völkerts · Aug 16, 2004

I started written a little script to analyse my syslogs. The development
went on very fast, but today I'm searching the rubish way to dissect a
string into some parts. For example in my syslog there is a line (valid
as described in RFC 3164)

<165> Aug 16 17:01:35 localhost Just a test

I was trying to reach this form

var = content

pri = 165
timestamp = Aug 16 17:01:35
device = localhost
msg = Just a test

But how do I accomplish this? I read the pickaxe book, but the example I
found was about repeating values, e.g. | as separator. Is a suitable
regexp the way, or should I use another technique, e.g. String#index etc.?

Thanks for your time helping me, I'll pay it back if I become a little
more rubisher

Daniel Völkerts · Aug 16, 2004

Daniel said:
I started written a little script to analyse my syslogs.

I feel sorry, 'I started writting..' is the correct way.

David A. Black · Aug 16, 2004

Hi --

You could match it to a regular expression, and grab the results in
()-expressions:

str = "<165> Aug 16 17:01:35 localhost Just a test"

pri, timestamp, device, msg =
/<(\d+)>\s+(\w+\s+\d+\s+[\d:]+)\s+(\S+)\s+(.*)/.match(str).captures

Another way would be to use scanf. This has the advantage that you
get your 165 as an integer (if that's important):

require 'scanf'
pri, timestamp, device, msg = str.scanf("<%d> %15c %s%*c %[\\S\\s]")

(You might have to adjust either the regex or the format string
depending on how consistent and predictable the lines are.)
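A minimal sketch of the regexp route with named groups, for readers who want the fields by name and the priority as an integer without scanf. The constant `SYSLOG_LINE` and the method `parse_syslog_line` are made-up names for illustration, and the pattern assumes lines shaped exactly like the sample above.

```ruby
# Hypothetical constant: the one-line regexp from above, with named groups
# and anchored at both ends of the line.
SYSLOG_LINE = /\A<(?<pri>\d+)>\s+(?<timestamp>\w+\s+\d+\s+[\d:]+)\s+(?<device>\S+)\s+(?<msg>.*)\z/

# Hypothetical helper: returns a hash of fields, or nil if the line
# does not look like the sample.
def parse_syslog_line(line)
  md = SYSLOG_LINE.match(line.chomp)
  return nil unless md

  {
    pri: md[:pri].to_i,        # regexp captures are strings, so convert explicitly
    timestamp: md[:timestamp],
    device: md[:device],
    msg: md[:msg]
  }
end

fields = parse_syslog_line("<165> Aug 16 17:01:35 localhost Just a test")
p fields
```

Returning nil for a non-matching line keeps the caller in charge of what to do with lines that do not fit.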

David

Charles Mills · Aug 16, 2004

Probably use regular expressions. You could have one big regexp or one
for each field like so:

var =~ /<([0-9]+)>/
pri = $1
$' =~ /some regexp/ # I'm lazy
timestamp = $1
# etc

You could also use \A along with the post match ($') to make sure the
fields come in the order you expect.
-Charlie
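The field-by-field idea can be sketched roughly like this, using `MatchData#post_match` (the object form of `$'`) and `\A` so that each field has to start exactly where the previous one ended. The helper `take_field` and the individual patterns are assumptions for illustration, not something from the posts above.

```ruby
# Hypothetical helper: match `pattern` at the very start of `text` and
# return the first capture together with the text after the match.
def take_field(text, pattern, name)
  md = /\A#{pattern}/.match(text)
  raise ArgumentError, "expected #{name} at: #{text.inspect}" unless md

  [md[1], md.post_match]
end

line = "<165> Aug 16 17:01:35 localhost Just a test"

# Each call consumes one field plus the whitespace that follows it.
pri,       rest = take_field(line, /<(\d+)>\s*/, "priority")
timestamp, rest = take_field(rest, /(\w+\s+\d+\s+\d+:\d+:\d+)\s+/, "timestamp")
device,    rest = take_field(rest, /(\S+)\s+/, "device")
msg = rest # whatever is left over is the message

puts "pri = #{pri}", "timestamp = #{timestamp}", "device = #{device}", "msg = #{msg}"
```

Compared with one big regexp, this reports which field failed to match, at the cost of a few more lines.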

Florian Gross · Aug 16, 2004

Daniel said:
<165> Aug 16 17:01:35 localhost Just a test
I was trying to reach this form

var = content

pri = 165
timestamp = Aug 16 17:01:35
device = localhost
msg = Just a test

This ought to work, but there might be other ways to do this:

if md = /^<(\d+)> (\S+ \d+ \d+:\d+:\d+) (\S+) (.*?)$/.match(text)
  pri, timestamp, device, msg = *md.captures
  # Do something with the captures
end

Regards,
Florian Gross

Daniel Völkerts · Aug 16, 2004

Daniel said:
I feel sorry, 'I started writting..' is the correct way.

What the hell, writting is also wrong, tzzz. Too much caffeine in my head!

After I posted the above thread I have written this line

pri,timestamp,device,msg = aMsg.scan(/<\d{1,5}>|\w{3,} \d\d \d\d:\d\d:\d\d|\w+/)

Is this the right way? Please feel free to post comments. I'm looking
forward to them to improve my Ruby skills.
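A quick way to check the scan line is to look at what it returns for the sample message. The sketch below assumes the sample line from the first post; `a_msg`, `parts` and `fixed` are illustrative names. Without groups, scan returns every matched piece, so the angle brackets stay on the priority and the message is split into single words. One possible adjustment that still uses scan is shown second.

```ruby
a_msg = "<165> Aug 16 17:01:35 localhost Just a test"

# Without capture groups, scan collects each alternative that matches.
parts = a_msg.scan(/<\d{1,5}>|\w{3,} \d\d \d\d:\d\d:\d\d|\w+/)
# parts == ["<165>", "Aug 16 17:01:35", "localhost", "Just", "a", "test"]

pri, timestamp, device, msg = parts
# pri == "<165>" (brackets included), msg == "Just" (the rest is dropped)

# With capture groups, scan returns one array of captures per match;
# .first is nil if the line does not match at all.
fixed = a_msg.scan(/<(\d{1,5})> (\w{3,} \d\d \d\d:\d\d:\d\d) (\S+) (.*)/).first
# fixed == ["165", "Aug 16 17:01:35", "localhost", "Just a test"]

p parts
p fixed
```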

Daniel Völkerts · Aug 16, 2004

David said:
You could match it to a regular expression, and grab the results in
()-expressions:

str = "<165> Aug 16 17:01:35 localhost Just a test"

pri, timestamp, device, msg =
/<(\d+)>\s+(\w+\s+\d+\s+[\d:]+)\s+(\S+)\s+(.*)/.match(str).captures

Another way would be to use scanf. This has the advantage that you
get your 165 as an integer (if that's important):

require 'scanf'
pri, timestamp, device, msg = str.scanf("<%d> %15c %s%*c %[\\S\\s]")

(You might have to adjust either the regex or the format string
depending on how consistent and predictable the lines are.)

Thanks a lot. That's the way I would expect it. Simple and nice to
understand. I'll try it.

Many regards.

Robert Klemme · Aug 16, 2004

Florian Gross said:
This ought to work, but there might be other ways to do this:

if md = /^<(\d+)> (\S+ \d+ \d+:\d+:\d+) (\S+) (.*?)$/.match(text)
  pri, timestamp, device, msg = *md.captures
  # Do something with the captures
end

Some more admittedly ugly constructions:

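val = "<165> Aug 16 17:01:35 localhost Just a test"

# (1) Convert the match to an array and test it for emptiness.
# to_a puts the whole match first, hence the extra local "line".
unless ( ( line, pri, timestamp, device, msg = * /^<(\d+)> \s+ (\S+ \s+ \d+ \s+
    \d+:\d+:\d+) \s+ (\S+) \s+ (.*)$/x.match(val).to_a ).empty? )
  puts "matched"
end

# (2) Same assignment, but test a single local: pri is nil without a match.
line, pri, timestamp, device, msg = * /^<(\d+)> \s+ (\S+ \s+ \d+ \s+ \d+:\d+:\d+)
    \s+ (\S+) \s+ (.*)$/x.match(val).to_a
if pri
  puts "matched"
end

# (3) As (1), with the regexp kept in a constant.
LOG_RX = /^<(\d+)> \s+ (\S+ \s+ \d+ \s+ \d+:\d+:\d+) \s+ (\S+) \s+ (.*)$/x

unless ( ( line, pri, timestamp, device, msg = * LOG_RX.match(val).to_a ).empty? )
  puts "matched"
end

# (4) As (2), with the test ("&& line") included in the condition.
if ( line, pri, timestamp, device, msg = * /^<(\d+)> \s+ (\S+ \s+ \d+ \s+
    \d+:\d+:\d+) \s+ (\S+) \s+ (.*)$/x.match(val) ) && line
  puts "matched"
end

# (5) As (4), with the regexp in the constant from (3).
if ( line, pri, timestamp, device, msg = * LOG_RX.match(val) ) && line
  puts "matched"
end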

robert

Daniel Völkerts · Aug 16, 2004

*boom* That blew my mind away! No no, thanks a lot for that piece of code.

But I prefer the scanf and one-line-regexp.

I'll test which kind performs better for my needs. As I said, I'm a Ruby
newbie and my personal programming rule is: keep it simple! I have to
understand the things I wrote.

If the point is reached where my little script becomes interesting for
others than me, I'll post an [Ann] thread.

Bye,

Robert Klemme · Aug 17, 2004

(1) This one converts the RX MatchData into an array and tests for emptiness
to determine whether it matched. And along the way values are assigned to
local vars.

(2) Similar, but now just one local var is used as match check: if "pri" is
not nil, the RX matched.

(3) Same approach as (1) but the regexp is defined as a constant to make
stuff more readable.

(4) Similar approach to (2) but the test is included ("&& line"). No
conversion to array is done here: the MatchData is splatted directly, and
the additional local "line" receives the complete match.

(5) Same as (4) but with regexp in constant as in (3).
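A small check of what the match yields in these variants may help. It relies on documented Ruby behaviour: `MatchData#to_a` returns the complete match at index 0 followed by the groups, `MatchData#captures` returns only the groups, and `nil.to_a` is an empty array. The non-matching input string is made up for the example.

```ruby
LOG_RX = /^<(\d+)> \s+ (\S+ \s+ \d+ \s+ \d+:\d+:\d+) \s+ (\S+) \s+ (.*)$/x
val = "<165> Aug 16 17:01:35 localhost Just a test"

md = LOG_RX.match(val)
with_line = md.to_a      # complete match first, then the four groups
only_groups = md.captures # just the four groups

# On a non-matching line, match returns nil and nil.to_a is [],
# so every local ends up nil instead of raising NoMethodError.
line, pri, timestamp, device, msg = LOG_RX.match("no syslog here").to_a

puts with_line.length    # 5
puts only_groups.length  # 4
puts pri.inspect         # nil
```

This is why a variant that assigns from to_a needs a leading local for the complete match, while one that assigns from captures does not.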

Daniel Völkerts said:
*boom* That blew my mind away! No no, thanks a lot for that piece of code.

I *should've* put some comments in... Ok, inserting them above now.

Daniel Völkerts said:
But I prefer the scanf and one-line-regexp.

Basically I used extended regular expressions (switched by the "/x" flag).
Whitespace is ignored, that's why you see more "\s+" in there. And that's
why the regexp is longer.
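To illustrate the /x flag, here is a sketch of the same pattern spread over several lines with comments, next to a compact version without /x. The names `LOG_RX_VERBOSE` and `compact` are made up for the comparison; both are expected to extract the same four fields from the sample line.

```ruby
# With /x, literal spaces, newlines and # comments inside the regexp are
# ignored, so whitespace in the input has to be matched explicitly with \s+.
LOG_RX_VERBOSE = /
  ^<(\d+)>                       \s+  # priority
  (\S+ \s+ \d+ \s+ \d+:\d+:\d+)  \s+  # timestamp
  (\S+)                          \s+  # device
  (.*)$                               # message
/x

# The same pattern without /x, where every character counts.
compact = /^<(\d+)>\s+(\S+\s+\d+\s+\d+:\d+:\d+)\s+(\S+)\s+(.*)$/

val = "<165> Aug 16 17:01:35 localhost Just a test"
p LOG_RX_VERBOSE.match(val).captures
p compact.match(val).captures
```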

Daniel Völkerts said:
I'll test which kind performs better for my needs. As I said, I'm a Ruby
newbie and my personal programming rule is: keep it simple! I have to
understand the things I wrote.

That's an excellent road to walk down! Handcrafted, simple code is better
than a mindless copy of something found somewhere.

Kind regards

robert

Daniel Völkerts · Aug 17, 2004

Hi Robert,

thank you very very much for your short lesson. It's very interesting and
I'll see how I can profit from this information.

Ruby becomes more and more usable for me (normally my language of choice
is Java, but for such little scripts Ruby is great fun!).

Bye,

Robert Klemme · Aug 18, 2004

Daniel Völkerts said:
Hi Robert,

thank you very very much for your short lesson. It's very interesting and
I'll see how I can profit from this information.

I'm glad I could be of any help.

Daniel Völkerts said:
Ruby becomes more and more usable for me (normally my language of choice
is Java, but for such little scripts Ruby is great fun!).

Same here. I even use Ruby sometimes to manipulate Java code or search
through piles of Java code...

Daniel Völkerts ::
"I have a dragon, and I WILL use it!" - Donkey in Shrek

Ohooo... *shake in fear*

Kind regards

robert

Dany Cayouette · Aug 18, 2004

Thanks for taking the time to put in explanations. I'm also a Ruby newbie who hasn't written anything useful yet, but I've started to follow this list a bit. I always look forward to your postings since I'm sure you'll put in some line of code I won't understand.. ;-) Part of my learning is 'trying' to understand them. Thanks for the extra hand on this one!

Dany
