Complex regex question

Tuxedo · Sep 26, 2009

Hi,

I use a simple grep command to search a domain zone file and report the
existing domains that contain a particular keyword. The report also includes
matches in the nameserver field, although it is not meant to. For example,
these are the first few lines returned by 'grep KOMODO zonefile.txt':

KOMODODRAGON NS DNS2.GORGE.NET.
KOMODODRAGON NS SERV.GORGE.NET.
HELIOCENTRIC NS NS1.KOMODOTEK
HELIOCENTRIC NS NS2.KOMODOTEK
DIVEKOMODO NS NS1.PUREHOST
DIVEKOMODO NS NS2.PUREHOST
KOMODO-TECH NS NS1.CISCO
KOMODO-TECH NS NS2.CISCO
KOMODOSYSTEM NS DNS.NETFORCE.IT.
KOMODOSYSTEM NS NS2.IPOINT.IT.
KOMODOISLAND-TOURS NS NS1.BALINTER.NET.
KOMODOISLAND-TOURS NS NS2.BALINTER.NET.

Each matching domain, which is the first string on a line, may have two or
more name servers associated with it, so the result contains one line per
domain and name server pair (usually two lines per domain, sometimes more).

However, I would like to output a list with only one line per matching
domain, regardless of the number of nameservers.

I would also like to exclude any match that occurs in the nameserver field,
i.e. anything after the first whitespace (e.g. the third and fourth lines in
the above example should not appear at all). In other words, only return
matches that have no whitespace between the start of the line and the
keyword (does this make sense?).

So the result strips any matches in the nameserver field altogether, as
well as any duplicate domains. When the above list is processed, the result
would be simply one domain per line and one line per domain:

KOMODODRAGON
DIVEKOMODO
KOMODO-TECH
KOMODOSYSTEM
KOMODOISLAND-TOURS

The purpose is simply to return a list of domains against a particular
keyword, stripping the irrelevant parts. I'm not quite sure how to do this,
although I guess Perl is the best tool, being the de-facto regex master!

Any suggestions or snippet code would be greatly appreciated!

Many thanks,
Tuxedo

Tuxedo · Sep 26, 2009

Tad J McClellan wrote:

[...]

perldoc -q duplicate
[...]

---------------
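#!/usr/bin/perl
use warnings;
use strict;

my $term = 'KOMODO';

my %seen;    # domains already printed
# <DATA> reads the lines that follow a __DATA__ marker in the script.
while ( <DATA> ) {
    # Match only the first field; \Q...\E treats $term as a literal string.
    if ( /^(\S*\Q$term\E\S*)/ ) {
        print "$1\n" unless $seen{$1}++;
    }
}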

[...]

Thanks for the perldoc tip and for the working regex magic!

Tuxedo
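
For anyone who prefers to stay on the command line, the same idea (match the
keyword in the first field only, print each domain once) can be sketched with
awk. This is only a rough equivalent of the Perl approach above, not something
posted in the thread; the function name domains_matching is made up for
illustration, and the keyword is treated as a literal substring rather than a
regex.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print every domain whose first
# whitespace-separated field contains the keyword, once, in first-seen order.
#   $1 = keyword, matched as a literal substring
#   $2 = path to the zone file
domains_matching() {
    # index() returns 0 when the keyword is absent from the first field, so
    # nameserver-only matches are skipped; seen[] suppresses repeated domains.
    awk -v term="$1" 'index($1, term) && !seen[$1]++ { print $1 }' "$2"
}

# Example call, assuming the file name used earlier in the thread:
#   domains_matching KOMODO zonefile.txt
```

Run against the twelve sample lines from the original post, this should print
the five domains listed there and skip HELIOCENTRIC, because KOMODO only
appears in its nameserver field.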
