# Question about regex (nagios plugin)

Discussion in 'Perl Misc' started by Ashish Kumar, Sep 30, 2008.

1. ### Ashish Kumar Guest

Hello,

I'm developing a plugin for nagios to get CPU usage on Red Hat Linux
machines. Below is a snippet:

-------------------- 8< --------------------
$get_cpu_util=`vmstat 1 2 | tail -n 1`;

# procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
#  r  b   swpd    free   buff   cache   si   so    bi    bo   in    cs us sy  id wa
#  2  0      0 4424156 181780 2505320    0    0     0     7    1     1  0  0 100  0
#  0  0      0 4426332 181780 2505320    0    0     0    48 1121  3762  0  0 100  0

if($get_cpu_util =~ /^.*\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$/){
    $us=$1;
    $sy=$2;
    $id=$3;
    $wa=$4;
}
-------------------- 8< --------------------

The code runs fine on RHEL4 hosts and shows the correct values. But in
RHEL5 they have added one more value, namely "st", in the cpu section:

-------------------- 8< --------------------
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd    free   buff   cache   si   so    bi    bo   in    cs us sy  id wa st
 0  0  33184   10028  66208   68396    0    0   737   276   15    12 34  9  51  7  0
 0  0  33184   10028  66208   68396    0    0     0    32 1104   140  0  0 100  0  0
-------------------- 8< --------------------

So, I was just wondering if there is any way to run the same code on
both hosts with some regex wizardry?

Thanks.
Ashish Kumar, Sep 30, 2008

2. ### Tim Greer Guest

Ashish Kumar wrote:

> if($get_cpu_util =~ /^.*\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$/){
> $us=$1;
> $sy=$2;
> $id=$3;
> $wa=$4;
> }
>
> So, I was just wondering if there is any way to run the same code on
> both hosts with some regex wizardry?

You can add a check in the script to determine the OS version, or add an
optional last field and value (which if blank, you can assume is the
older OS version).

Just declare the variables before the regex and:

if ($get_cpu_util =~ /^.*\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*(\d+)?\s*$/)
{
    $us = $1;
    $sy = $2;
    $id = $3;
    $wa = $4;
    $st = $5;
}

You'd want to do more checking than that, but that's a simple way to
show an example working off your existing code.
--
Tim Greer, CEO/Founder/CTO, BurlyHost.com, Inc.
Shared Hosting, Reseller Hosting, Dedicated & Semi-Dedicated servers
and Custom Hosting. 24/7 support, 30 day guarantee, secure servers.
Industry's most experienced staff! -- Web Hosting With Muscle!
Tim Greer, Sep 30, 2008

3. ### Tim Greer Guest

Tim Greer wrote:

> Just declare the variables before the regex and:
>
> if ($get_cpu_util =~ /^.*\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*(\d+)?\s*$/)
> {
> $us = $1;
> $sy = $2;
> $id = $3;
> $wa = $4;
> $st = $5;
> }
>
> You'd want to do more checking than that, but that's a simple way to
> show an example working off your existing code.

Pardon, I should have fixed your example (rather than add to it), as it
was broken. That would actually be:

if ($get_cpu_util =~ m/^.*\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*(\d+)\s*$/) {

Tim Greer, Sep 30, 2008

4. ### Tim Greer Guest

Tim Greer wrote:

> Pardon, I should have fixed your example (rather than add to it), as
> it was broken. That would actually be:
>
> if ($get_cpu_util =~ m/^.*\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*(\d+)\s*$/) {

In fact, the example above isn't really right anyway, working off the
code. It will match either OS version's vmstat output, because it only
captures the last so-many numbers on the line instead of using a fixed
regex that counts and ignores the first 12 fields and captures only the
4 or 5 that follow. There are a few ways to go about that. You can use
split instead of the regex, or use a regex in a few different ways. I'd
personally use split or a fixed-field regex or something similar, or
just check the OS version to determine the regex/split used.
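
A minimal sketch of the split approach, assuming the CPU columns always start at the 13th field (index 12), as they do in both sample outputs in this thread. The helper name parse_cpu_fields is made up for illustration, and the input is the first RHEL5 sample line from the original post:

```perl
use strict;
use warnings;

# Hypothetical helper: split one vmstat data line on whitespace and
# return the CPU columns. Assumes 12 non-CPU fields come first.
sub parse_cpu_fields {
    my ($line) = @_;
    my @f = split ' ', $line;             # ' ' also skips leading whitespace
    return unless @f == 16 || @f == 17;   # 16 fields on RHEL4, 17 on RHEL5
    my ($us, $sy, $id, $wa, $st) = @f[12 .. $#f];
    return ($us, $sy, $id, $wa, $st);     # $st is undef when there is no "st" column
}

my ($us, $sy, $id, $wa, $st) =
    parse_cpu_fields('0 0 33184 10028 66208 68396 0 0 737 276 15 12 34 9 51 7 0');
```

Counting fields instead of matching digits means a line with an unexpected number of columns is rejected rather than silently misread.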
Tim Greer, Sep 30, 2008
5. ### Peter Makholm Guest

Ashish Kumar writes:

> if($get_cpu_util =~ /^.*\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$/){
> $us=$1;
> $sy=$2;
> $id=$3;
> $wa=$4;
> }

I would do something quite different:

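open my $vmstat, '-|', 'vmstat 1 5' or die "Cannot run vmstat: $!";

# discard first line
<$vmstat>;

# next line is field names; split ' ' skips leading whitespace, so the
# names line up with the values even when a row has no leading space
my @fields = split ' ', <$vmstat>;
my @result;
while (<$vmstat>) {
    my %stats;
    @stats{ @fields } = split ' ';
    push @result, \%stats;
}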
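
As a rough illustration of how the per-row hashes might be used, here is the same idea applied to a canned copy of the RHEL5 output from the original post, so it runs without calling vmstat. The helper name parse_vmstat and the fallback of 0 for a missing "st" column are assumptions for this sketch, not part of the code above:

```perl
use strict;
use warnings;

# Hypothetical helper: takes the full text of a vmstat run and returns
# one hash per data row, keyed by the column names from the header line.
sub parse_vmstat {
    my ($text) = @_;
    my @lines = grep { /\S/ } split /\n/, $text;
    shift @lines;                            # discard the "procs ---memory---" banner
    my @fields = split ' ', shift @lines;    # column names: r b swpd ... wa [st]
    my @result;
    for my $line (@lines) {
        my %stats;
        @stats{@fields} = split ' ', $line;
        push @result, \%stats;
    }
    return @result;
}

my $sample = <<'END';
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa st
 0  0  33184  10028  66208  68396    0    0   737   276   15    12 34  9 51  7  0
 0  0  33184  10028  66208  68396    0    0     0    32 1104   140  0  0 100  0  0
END

my @rows = parse_vmstat($sample);
my $last = $rows[-1];
# 'st' only exists where vmstat reports it, so fall back to 0 otherwise.
my $st = exists $last->{st} ? $last->{st} : 0;
printf "us=%s sy=%s id=%s wa=%s st=%s\n", @{$last}{qw(us sy id wa)}, $st;
```

Looking values up by column name means the plugin does not need to know whether the host prints 16 or 17 columns.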

//Makholm
Peter Makholm, Sep 30, 2008
6. ### Martien Verbruggen Guest

On Mon, 29 Sep 2008 23:29:41 -0700 (PDT),
Ashish Kumar wrote:
> Hello,
>
> I'm developing a plugin for nagios to get CPU usage on Red Hat Linux
> machines. Below is a snippet:

\begin{offtopic}

Why don't you run collectd and use collectd-nagios? That avoids problems
with changes in output of system tools.

\end{offtopic}

Martien
--
|
Martien Verbruggen | For heaven's sake, don't TRY to be cynical.
| It's perfectly easy to be cynical.
|
Martien Verbruggen, Sep 30, 2008
7. ### Ashish Kumar Guest

> You can add a check in the script to determine the OS version, or add an
> optional last field and value (which if blank, you can assume is the
> older OS version).

True but I was just wondering if there was any possibility of getting
this to work.

> if ($get_cpu_util =~ /^.*\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*(\d+)?\s*$/)

It worked as expected. Since I am fairly new to regex, I am still
trying to figure out how it worked though.

Thanks.
Ashish Kumar, Oct 1, 2008
8. ### Tim Greer Guest

Ashish Kumar wrote:

>> You can add a check in the script to determine the OS version, or add
>> an optional last field and value (which if blank, you can assume is
>> the older OS version).
>
> True but I was just wondering if there was any possibility of getting
> this to work.
>
>> if ($get_cpu_util =~ /^.*\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*(\d+)?\s*$/)
>
> It worked as expected. Since I am fairly new to regex, I am still
> trying to figure out how it worked though.
>
> Thanks.

The above is actually a bad example and can produce unexpected results,
but it can easily be modified to work consistently. For now, the regex
just captures (within ( ) ) each run of one or more digits \d+ that
follows one or more whitespace characters \s+. What I added was zero or
more whitespace characters \s* and an "optional" final number, written
as (\d+)?. Putting ? after (content) makes it optional, so the pattern
can match with or without it. But, again, that presents some issues, so
you'll want to modify it to your specific needs.
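
To make the "unexpected results" concrete, here is a small sketch that runs the loose pattern against the first RHEL5 sample line from the original post. Because the leading .* is greedy, the four required groups slide one column to the right and the optional group is left empty. The variable names are only for illustration:

```perl
use strict;
use warnings;

# First RHEL5 data line from the sample output; the CPU columns are
# us=34 sy=9 id=51 wa=7 st=0.
my $rhel5 = '0 0 33184 10028 66208 68396 0 0 737 276 15 12 34 9 51 7 0';

# In list context the match returns all five groups, with undef for a
# group that did not take part in the match.
my @got = $rhel5 =~ /^.*\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*(\d+)?\s*$/;

# The greedy .* consumes everything through "34", so the groups hold
# sy id wa st, and the optional fifth group never participates.
print join(' ', map { defined $_ ? $_ : 'undef' } @got), "\n";
# prints "9 51 7 0 undef"
```

So on RHEL5 the script would report sy as us, id as sy, and so on, without any error.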

Instead, you should do something like (watch for word-wrap):

if ($get_cpu_util =~ m/^\s*(?:\d+\s+){12}(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*(\d+)?\s*$/) {
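
A quick way to check a pattern of this shape (with the {12} quantifier) is to run it against one data line from each release. The sketch below does that with the sample lines posted earlier in the thread; the %sample and %cpu hashes are made-up names for illustration:

```perl
use strict;
use warnings;

# Skip exactly 12 leading fields, then capture 4 CPU fields plus an
# optional fifth ("st").
my $re = qr/^\s*(?:\d+\s+){12}(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*(\d+)?\s*$/;

my %sample = (
    rhel4 => '0 0 0 4426332 181780 2505320 0 0 0 48 1121 3762 0 0 100 0',
    rhel5 => '0 0 33184 10028 66208 68396 0 0 737 276 15 12 34 9 51 7 0',
);

my %cpu;
for my $os (sort keys %sample) {
    if (my @m = $sample{$os} =~ $re) {
        my %row;
        @row{qw(us sy id wa st)} = @m;    # st stays undef on RHEL4
        $cpu{$os} = \%row;
        printf "%s: us=%s sy=%s id=%s wa=%s st=%s\n", $os,
            @row{qw(us sy id wa)}, defined $row{st} ? $row{st} : 'n/a';
    }
}
```

With the first 12 fields pinned down, the same four variables get the same columns on both releases, and st is simply undefined where vmstat does not print it.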

Since there are always 12 instances of digits \d+ followed by whitespace
\s+ before the 4 or 5 numbers you want to capture (depending on the OS),
I've added (?:\d+\s+){12} to the example. It requires (and ignores,
without capturing) the first 12 instances, so you are more accurately
capturing the last 4 (or 5, if there are 5) numbers. There are probably
better ways to do this, such as writing the last portion as
\s+(\d+)(?:\s+)?(\d+)?\s*$/) { or by using split. But the modified
version above is more accurate.
Tim Greer, Oct 1, 2008

9. ### Ashish Kumar Guest

> Instead, you should do something like (watch for word-wrap):
>
> if ($get_cpu_util =~ m/^\s*(?:\d+\s+){12}(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*(\d+)?\s*$/) {
>

I have got it now. I appreciate your time and efforts.

Thanks.
Ashish Kumar, Oct 3, 2008