Regular expression to match particular lines between markers

MENTAT

Hi,

I am trying to write a regular expression that extracts all the
comment lines between two specified markers in a text. An example
snippet is given below. (comments start with ";")

[MRCP_SERVER] ;MRCP server format:
<Server_IP_Address>:<Server_Port>
; <Server_Port> is 554 for Nuance, 4900 for
Speechworks
; Each server should be specified on a separate
line
;123.123.123.1:554 ; server1
tadlir01:4900 ; comment 1
tadlir02:4900 ; comment 2

[DisableVAD] ; Disables VAD in the TX streamer

I want the output to look like:

[MRCP_SERVER] ;MRCP server format:
<Server_IP_Address>:<Server_Port>
; <Server_Port> is 554 for Nuance, 4900 for
Speechworks
; Each server should be specified on a separate
line
;123.123.123.1:554 ; server1

[DisableVAD] ; Disables VAD in the TX streamer

Basically, everything between [MRCP_SERVER] and [DisableVAD] that
starts on a line beginning with ";"

I tried using s/(\[MRCP_SERVER\].*)^[^;].*$(\[DisableVAD\])/marker $1
marker $2/ms. But this doesn't seem to work. What am I doing wrong?
Isn't ^[^;].*$ the syntax for everything on a line not beginning with
";"?

Note also that the input could be of the format

[MRCP_SERVER] ;MRCP server format:
<Server_IP_Address>:<Server_Port>
; <Server_Port> is 554 for Nuance, 4900 for
Speechworks
; Each server should be specified on a separate
line
;123.123.123.1:554 ; server1
tadlir01:4900 ; clk_osr_dev
;some other comment can interrupt, but the next line should be caught
tadlir02:4900 ; clk_osr_test

[DisableVAD] ; Disables VAD in the TX streamer

Any help, in the form of heavily commented regular expressions, would
be much appreciated.
 
MENTAT

Sorry, formatting error... I guess I should have previewed before
posting. To avoid any confusion, what I meant was:

[MRCP_SERVER] ;MRCP server format:<Server_IP_Address>:<Server_Port>
; <Server_Port> is 554 for Nuance, 4900 for Speechworks
; Each server should be specified on a separate line
;123.123.123.1:554 ; server1
tadlir01:4900 ; comment 1
tadlir02:4900 ; comment 2

[DisableVAD] ; Disables VAD in the TX streamer

I want the output to look like:

[MRCP_SERVER] ;MRCP server format:<Server_IP_Address>:<Server_Port>
; <Server_Port> is 554 for Nuance, 4900 for Speechworks
; Each server should be specified on a separate line
;123.123.123.1:554 ; server1

[DisableVAD] ; Disables VAD in the TX streamer
 
Martin Kissner

MENTAT wrote :
Hi,

I am trying to write a regular expression that extracts all the
comment lines between two specified markers in a text. An example
snippet is given below. (comments start with ";")

This is what I would do.
I'd appreciate any comments and suggestions for improvement, since I'm
rather new to Perl.

--- code start ---

#!/usr/bin/perl

# description=

use warnings;
use strict;

while (my $line = <DATA>) {
    print $line and match() if $line =~ /\[MRCP_SERVER\]/;
}


sub match {
    while (my $line = <DATA>) {
        print $line if $line =~ /^;/;
        print $line and last if $line =~ /\[DisableVAD\]/;
    }
}

__DATA__
some other line
and still another one
[MRCP_SERVER] ;MRCP server format:
<Server_IP_Address>:<Server_Port>
; <Server_Port> is 554 for Nuance, 4900 for Speechworks
; Each server should be specified on a separate line
;123.123.123.1:554 ; server1
tadlir01:4900 ; comment 1
tadlir02:4900 ; comment 2

[DisableVAD] ; Disables VAD in the TX streamer
some other line
and still another one

--- code end ---

--- output (long lines wrapped by newsreader) ---
[MRCP_SERVER] ;MRCP server format:
; <Server_Port> is 554 for Nuance, 4900 for
Speechworks
; Each server should be specified on a separate line
;123.123.123.1:554 ; server1
[DisableVAD] ; Disables VAD in the TX streamer
--- end output ---

HTH
Martin
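
The loop above prints the markers and the comment lines but drops the blank lines inside the section, whereas the expected output in the question appears to keep them. A minimal sketch of a variant that also keeps blank lines follows. The helper name filter_section and the sample input are assumptions made for illustration; they are not part of the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# filter_section is a hypothetical helper name. It returns the two marker
# lines plus every comment line or blank line found between them.
sub filter_section {
    my @lines = @_;
    my @kept;
    my $inside = 0;
    for my $line (@lines) {
        if ( !$inside ) {
            # Ignore everything before the opening marker.
            next unless $line =~ /^\[MRCP_SERVER\]/;
            $inside = 1;
            push @kept, $line;      # opening marker
        }
        elsif ( $line =~ /^\[DisableVAD\]/ ) {
            push @kept, $line;      # closing marker
            last;
        }
        elsif ( $line =~ /^;/ || $line =~ /^\s*$/ ) {
            push @kept, $line;      # comment line or blank line
        }
    }
    return @kept;
}

# Shortened sample input, modelled on the snippet in the question.
my @input = (
    "some other line\n",
    "[MRCP_SERVER] ;MRCP server format:\n",
    ";123.123.123.1:554 ; server1\n",
    "tadlir01:4900 ; comment 1\n",
    "\n",
    "[DisableVAD] ; Disables VAD in the TX streamer\n",
    "and still another one\n",
);
my @result = filter_section(@input);
print @result;
```

Collecting the lines in a list instead of printing them directly keeps the filtering separate from the I/O, so the same helper could be fed from a file handle or from `<DATA>`.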
 
MENTAT

OK, I can't format it to look just right, but what I mean is:

[MRCP_SERVER] ;MRCP server ...
; <Server_Port> is 554 for Nuance, ...
; Each server should be specified ...

;123.123.123.1:554 ; server1
tadlir01:4900 ; comment 1
tadlir02:4900 ; comment 2

[DisableVAD] ; Disables VAD in the TX streamer

I want the output to look like:

[MRCP_SERVER] ;MRCP server ...
; <Server_Port> is 554 for Nuance, ...
; Each server should be specified ...

;123.123.123.1:554 ; server1

[DisableVAD] ; Disables VAD in the TX streamer
 
John W. Krahn

MENTAT said:
I am trying to write a regular expression that extracts all the
comment lines between two specified markers in a text. An example
snippet is given below. (comments start with ";")

[MRCP_SERVER] ;MRCP server format:
<Server_IP_Address>:<Server_Port>
; <Server_Port> is 554 for Nuance, 4900 for
Speechworks
; Each server should be specified on a separate
line
;123.123.123.1:554 ; server1
tadlir01:4900 ; comment 1
tadlir02:4900 ; comment 2

[DisableVAD] ; Disables VAD in the TX streamer

I want the output to look like:

[MRCP_SERVER] ;MRCP server format:
<Server_IP_Address>:<Server_Port>
; <Server_Port> is 554 for Nuance, 4900 for
Speechworks
; Each server should be specified on a separate
line
;123.123.123.1:554 ; server1

[DisableVAD] ; Disables VAD in the TX streamer

basically, everything between [MRCP_SERVER] and [DisableVAD] that
starts on a line beginning with ";"

According to your examples you probably want something like this:

# UNTESTED


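while ( <FILEHANDLE> ) {
    if ( /^\[MRCP_SERVER]/ .. /^\[DisableVAD]/ ) {
        # Blank out non-comment lines inside the section. The leading
        # [^;\[] spares the two marker lines, which also contain a ';'.
        $_ = '' if /^[^;\[][^;]*;/;
    }
    print;
}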


# UNTESTED


John
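
On the original question of why the single s///ms attempt does not work: as far as I can tell, that pattern can never match. Under /m, `$` matches just before a newline (or at the end of the string), so the `\[` that follows it immediately would have to match that newline. In addition, under /s the `.` also matches newlines, so `^[^;].*$` is not limited to a single line. The following is only a sketch of one way to do it with a commented regular expression, assuming the whole text is held in one string; the sample text is made up for illustration. The outer substitution isolates the section, and an inner one (without /s) deletes the non-comment lines.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input, modelled on the snippet in the question.
my $text = <<'END';
some other line
[MRCP_SERVER] ;MRCP server format:
; <Server_Port> is 554 for Nuance, 4900 for Speechworks
;123.123.123.1:554 ; server1
tadlir01:4900 ; clk_osr_dev
;some other comment can interrupt
tadlir02:4900 ; clk_osr_test

[DisableVAD] ; Disables VAD in the TX streamer
tadlir03:4900 ; outside the section, must survive
END

$text =~ s{
    ( ^ \[MRCP_SERVER\] [^\n]* \n )   # group 1: the opening marker line
    ( .*? )                           # group 2: section body, as short as possible
    (?= ^ \[DisableVAD\] )            # stop just before the closing marker line
}{
    # Copy the captures first, because the inner s/// resets them.
    my ($head, $body) = ($1, $2);
    # Inside the body only: delete every line whose first character is
    # neither ';' nor a newline. No /s here, so '.' stays on one line.
    $body =~ s/ ^ [^;\n] .* \n //xmg;
    $head . $body;
}xmse;

print $text;
```

With /e the replacement part is evaluated as Perl code, which is what allows the second substitution to run on the captured body alone. Blank lines are kept because `[^;\n]` cannot match an empty line.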
 
