writing get_script()

Franken Sense

In Dread Ink, the Grave Hand of Tad J McClellan Did Inscribe:
No, it removes the record separator.

When the record separator is "one or more blank lines" (ie. para mode),
then chomp() removes "one or more blank lines".

It would seem that both
local $/ = "";
and
local $/ = '';
induce paragraph mode.
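
A minimal sketch of the point above, assuming a made-up two-paragraph string read through an in-memory filehandle (neither is from the original scripts). With $/ set to the empty string, each read returns one paragraph, and chomp() strips all of the trailing newlines rather than just one:

```perl
use strict;
use warnings;

# Hypothetical sample: two paragraphs separated by several blank lines.
my $text = "first line\nsecond line\n\n\n\nnext paragraph\n";

# Read from an in-memory filehandle so the sketch is self-contained.
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";

my @paragraphs;
{
    local $/ = "";              # paragraph mode; '' is the same empty string
    while ( my $para = <$fh> ) {
        chomp $para;            # in paragraph mode: strips all trailing newlines
        push @paragraphs, $para;
    }
}
close $fh;

print scalar(@paragraphs), " paragraphs\n";
print "[$_]\n" for @paragraphs;
```

The newline inside the first paragraph survives; only the separator at the end of each record is removed.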

P. 492 of the camel book speaks to this somewhat with the -0 switch, which
specifies the record separator as an octal number. I would have thought
that a goocher of
C:\MinGW\source>perl m5.pl -0 00

would be equivalent to

local $/="";

, but the former gives me the following output:

C:\MinGW\source>perl m5.pl -0 00
s[0] = 44:005:017
s[1] = Then
s[2] = the
s[3] = high
s[4] = priest
s[5] = rose
s[6] = up,
s[7] = and
s[8] = all
s[9] = they
s[10] = that
s[11] = were
s[12] = with
s[13] = him,
44:005:017 Then the high priest rose up, and all they that were with him,
s[0] =
s[1] = (which
s[2] = is
s[3] = the
s[4] = sect
s[5] = of
s[6] = the
s[7] = Sadducees,)
s[8] = and
s[9] = were
s[10] = filled
s[11] = with
(which is the sect of the Sadducees,) and were filled with
s[0] =
s[1] = indignation,
indignation,

s[0] = 44:005:018
s[1] = And
s[2] = laid
s[3] = their
s[4] = hands
s[5] = on
s[6] = the
s[7] = apostles,
s[8] = and
s[9] = put
s[10] = them
s[11] = in
s[12] = the
44:005:018 And laid their hands on the apostles, and put them in the
s[0] =
s[1] = common
s[2] = prison.
common prison.
....
--
Frank

And by the way, a few months ago, I trademarked the word 'funny.' So when
Fox calls me 'unfunny,' they're violating my trademark. I am seriously
considering a countersuit.
~~ Al Franken, in response to Fox's copyright infringement lawsuit
 
Franken Sense

In Dread Ink, the Grave Hand of Uri Guttman Did Inscribe:
FS> In Dread Ink, the Grave Hand of Uri Guttman Did Inscribe:
FS> for my $i (0..$#s) {


FS> I think I can answer your question if you can tell me why this is
FS> giving me numbers instead of words:

FS> my $outline = join(' ', (1..$#s));

i don't see any word data in that code. what do you think 1 .. $#s will
do? do you know what $#s does? (these are for you to answer). how would
you think that data has anything to do with the word data you have in
@s?

uri

What I want it to do is join the first through the ultimate words in s.

If I knew the answer, I wouldn't ask; I'd just continue.
--
Frank

In many ways I'm still a Hubert Humphrey Democrat -- someone who believes
in afflicting the comfortable and comforting the afflicted. A society is
judged by how it treats the elderly, the sick, the impoverished. To me it's
a matter of ethics and compassion.
~~ Al Franken, Playboy interview
 
Franken Sense

In Dread Ink, the Grave Hand of Jürgen Exner Did Inscribe:

[snipped and re-ordered]
A quick search for 'Binary Tree' on CPAN returns several hundred
results, the very first one being "Tree::Binary" with many more
interesting modules on the same and the next page.

my $tree = Tree::Binary->new( 'root' );

my $left = Tree::Binary->new( 'left' );
$tree->left( $left );

my $right = Tree::Binary->new( 'left' );
$tree->right( $right );

my $right_child = $tree->right;

$tree->right( undef ); # Unset the right child.

my @nodes = $tree->traverse( $tree->POST_ORDER );

my $traversal = $tree->traverse( $tree->IN_ORDER );
while ( my $node = $traversal->() ) {
    # Do something with $node here
}

Well, this is the cpan for Tree::Binary. I'll need a little time just to
look at it.

I had larger ambitions for my activestate install earlier in the moon. I
saw the gorgeous, full moon rise on an Isotopes game where we slugged the
Zephyrs into a 16-8 loss. We can provide a player to cover Manny and still
pull out a big win on lutheran night.

Anyways, I wanted to download *all* of the modules containing 'Tree'.
Screenshot here: http://lomas-assault.net/usenet/z27.jpg . Fishing for
hints.
And even more: "[...]and then use C to
insert the data into a binary tree."

The task is motivated by §11-12 in _C Unleashed_, wherein Heathfield uses a
binary tree to remove duplicate lines in a text.

Oh, well, aehmmmm, unless you want to do that as a learning exercise
there is a _MUCH_ better method in Perl, see 'perldoc -q duplicate':
"How can I remove duplicate elements from a list or array?"
in particular the very last sentence.

Would it delete the first or the non-first entry?

Found in C:\Perl\lib\pod\perlfaq4.pod
How can I remove duplicate elements from a list or array?
(contributed by brian d foy)

Use a hash. When you think the words "unique" or "duplicated", think
"hash keys".

If you don't care about the order of the elements, you could just
create
the hash then extract the keys. It's not important how you create that
hash: just that you use "keys" to get the unique elements.

my %hash = map { $_, 1 } @array;
# or a hash slice: @hash{ @array } = ();
# or a foreach: $hash{$_} = 1 foreach ( @array );

my @unique = keys %hash;

You can also go through each element and skip the ones you've seen
before. Use a hash to keep track. The first time the loop sees an
element, that element has no key in %Seen. The "next" statement creates
the key and immediately uses its value, which is "undef", so the loop
continues to the "push" and increments the value for that key. The next
time the loop sees that same element, its key exists in the hash *and*
the value for that key is true (since it's not 0 or undef), so the next
skips that iteration and the loop goes to the next element.

my @unique = ();
my %seen = ();

foreach my $elem ( @array )
{
    next if $seen{ $elem }++;
    push @unique, $elem;
}

You can write this more briefly using a grep, which does the same
thing.

my %seen = ();
my @unique = grep { ! $seen{ $_ }++ } @array;

I'll need to do a little digesting here, too.
In Perl you just put each line into a hash (as keys) and the semantics
of a hash will automatically eliminate all duplicates.

I'm getting closer to getting my head around this. How would you
disambiguate "hash" in perl as opposed to "hash table" in C? (It sounds
like a shady deal on the Reeperbahn, nicht?)
--
Frank

[Roger Ailes, Fox News Founder, Chairman and CEO, and former
Nixon-Reagan-Bush strategist, is] a cynical Republican ideologue with no
regard for fairness and balance.
~~ Al Franken,
 
Jürgen Exner

Franken Sense said:
I think I can answer your question if you can tell me why this is giving me
numbers instead of words:
my @s = split /\s+/, $_;

You are splitting on white space ...
# print fields
print $s[0];

.... and printing the first 'word', which in your data is e.g.
'44:005:017' ...
my $outline = join(' ', (1..$#s));

....and then you are merging the natural numbers from 1 to some value n
....
print "$outline\n";

.... and printing them.

Nowhere in this code do you ever print any of the other text. Where do
you think it should be printed?

jue
 
Jürgen Exner

Franken Sense said:
In Dread Ink, the Grave Hand of Uri Guttman Did Inscribe:

What I want it to do is join the first through the ultimate words in s.

If you don't know what $#s means, then maybe you could ask? If you don't
know what (1..$#s) means, then maybe you could ask? Using code fragments
that you picked up somewhere without knowing their meaning and throwing
them together rarely produces useful code.

$@s is the highest index in the array @s, i.e. a number.
(1..$#s) is the list of numbers from 1 to the highest index of @s.
Nowhere does it relate to the content of @s.

What you want is maybe
@s[1..$#s]
which is a slice of the array @s, containing the elements from index 1
to the highest index. However I would probably use shift() instead to
remove the first element from an array.
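
A short sketch of the difference, using the first few words of the verse as sample data (the variable names here are illustrative, not from the original script). It contrasts the highest index, the range of integers, the array slice, and the shift() alternative:

```perl
use strict;
use warnings;

my @s = split /\s+/, '44:005:017 Then the high priest rose up,';

my $last_index = $#s;                       # highest index of @s, a number
my $numbers    = join ' ', 1 .. $#s;        # integers only, unrelated to the words
my $words      = join ' ', @s[ 1 .. $#s ];  # slice: the elements at those indices

# Alternative: shift() removes and returns the first element.
my @rest  = @s;                             # copy, so @s itself stays intact
my $verse = shift @rest;
my $text  = join ' ', @rest;

print "last index: $last_index\n";
print "numbers:    $numbers\n";
print "words:      $words\n";
print "verse:      $verse\n";
print "text:       $text\n";
```

The slice and the shift() version produce the same text; shift() modifies the array it is given, which is why the sketch works on a copy.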

Another note:
You are splitting the whole string into individual words only to join
them again later. Because the only value you really want separated out
as an individual value is the '44:005:017' it makes much more sense
to only split off that one single value by limiting split() to only 2
fields.
my ($numbers, $text) = split /\s+/, $_, 2;

jue
 
Jürgen Exner

Franken Sense said:
In Dread Ink, the Grave Hand of Jürgen Exner Did Inscribe:

Sorry, please ignore this, my version of the perlfaq is outdated, I
really need to upgrade. This particular answer has been rewritten, such
that now ...
Found in C:\Perl\lib\pod\perlfaq4.pod
How can I remove duplicate elements from a list or array?
(contributed by brian d foy)

Use a hash. When you think the words "unique" or "duplicated", think
"hash keys".

.... the key sentence has been moved to the top.
Would it delete the first or the non-first entry?

Why would that matter? The items are identical, therefore you cannot
distinguish them anyway.

If you really have a list where the sequence is important, then use one
of the other methods that are listed in that FAQ for preserving the
sequence. Whether you keep the first or the last element typically depends
on whether you loop through the list from the top or from the bottom.
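
A small sketch of that last point, assuming a made-up list of short strings. The same grep idiom from the FAQ keeps the first occurrence when run front to back, and the last occurrence when the list is reversed before and after:

```perl
use strict;
use warnings;

my @lines = qw(b a b c a);   # hypothetical input with duplicates

# Looping from the top keeps each element where it first appears.
my %seen;
my @keep_first = grep { !$seen{$_}++ } @lines;

# Looping from the bottom keeps each element where it last appears.
%seen = ();
my @keep_last = reverse grep { !$seen{$_}++ } reverse @lines;

print "first occurrences: @keep_first\n";
print "last occurrences:  @keep_last\n";
```

Both results contain the same set of strings; only their order differs.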
I'm getting closer to getting my head around this. How would you
disambiguate "hash" in perl as opposed to "hash table" in C? (It sounds
like a shady deal on the Reeperbahn, nicht?)

Not absolutely sure what you mean by 'disambiguate' in this context,
therefore my answer may be off the mark.

A hash in Perl is a built-in standard data structure which maps strings
to scalars and allows access to every element in O(1) time. Internally
it is implemented using hash tables, but Joe Programmer normally doesn't
have to concern himself with that detail. For him a hash is an array
where the indices are arbitrary strings instead of (natural) numbers.

C as a programming language does not have hash tables. It is the
programmer's task to implement the abstract data structure 'hash table'
himself using pointers and arrays and chunks of memory.
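
To illustrate the "array indexed by strings" view, here is a minimal sketch with a hypothetical word list; nothing beyond the built-in hash is needed:

```perl
use strict;
use warnings;

my %count;                                   # keys are strings, values are scalars
$count{$_}++ for qw(the priest the prison the);

my $the_count = $count{the};                 # look up by string key
my $has_angel = exists( $count{angel} ) ? 'yes' : 'no';
my @keys      = sort keys %count;            # each key appears once

print "the: $the_count\n";
print "angel seen? $has_angel\n";
print join( ', ', @keys ), "\n";
```

Storing a key a second time updates the existing entry instead of adding another one, which is why the keys of a hash are free of duplicates.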

jue
 
Jürgen Exner

Jürgen Exner said:
$@s is the highest index in the array @s, i.e. a number.

Arrrg, make that
$#s is the highest index in the array @s, i.e. a number.
of course.

jue
 
Franken Sense

In Dread Ink, the Grave Hand of Jürgen Exner Did Inscribe:
What I want it to do is join the first through the ultimate words in s.

If you don't know what $#s means, then maybe you could ask? If you don't
know what (1..$#s) means, then maybe you could ask? Using code fragments
that you picked up somewhere without knowing their meaning and throwing
them together rarely produces useful code.

$@s is the highest index in the array @s, i.e. a number.
(1..$#s) is the list of numbers from 1 to the highest index of @s.
Nowhere does it relate to the content of @s.

What you want is maybe
@s[1..$#s]
which is a slice of the array @s, containing the elements from index 1
to the highest index. However I would probably use shift() instead to
remove the first element from an array.


C:\MinGW\source>perl m9.pl
44:005:017 Then the high priest rose up, and all they that were with him,
(which
is the sect of the Sadducees,) and were filled with indignation,
44:005:018 And laid their hands on the apostles, and put them in the common
pris
on.
44:005:019 But the angel of the Lord by night opened the prison doors, and
broug
ht them forth, and said,
44:005:020 Go, stand and speak in the temple to the people all the words of
this
life.

C:\MinGW\source>type m9.pl
#!/usr/bin/perl
# perl m9.pl
use warnings;
use strict;

local $/="";
while ( <DATA> ) {
    my @s = split /\s+/, $_;
    my $verse = $s[0];
    my $script = join(' ', @s[1..$#s]);
    print "$verse $script\n";
}

__DATA__
44:005:017 Then the high priest rose up, and all they that were with him,
(which is the sect of the Sadducees,) and were filled with
indignation,

44:005:018 And laid their hands on the apostles, and put them in the
common prison.

44:005:019 But the angel of the Lord by night opened the prison doors,
and brought them forth, and said,

44:005:020 Go, stand and speak in the temple to the people all the words
of this life.


C:\MinGW\source>

Thanks, jü, this is looking better. The last thing I think this needs
before insertion into a tree is having the newlines of $verse adiosed
unless it's the ultimate one.

Is it the case that with perl, in verse 18, s[?] has 4 characters for the
word 'the', where the ultimate is newline, or six characters for 'words' in
verse 20?
 
Jürgen Exner

Franken Sense said:
my @s = split /\s+/, $_;
my $verse = $s[0];
my $script = join(' ', @s[1..$#s]);

As I wrote before you can replace those three lines above with a single

my ($verse, $script) = split /\s+/, $_, 2;
[...]
The last thing I think this needs
before insertion into a tree is having the newlines of $verse adiosed
unless it's the ultimate one.

I have no idea what you mean by this sentence.
Is it the case that with perl, in verse 18, s[?] has 4 characters for the
word 'the', where the ultimate is newline, or six characters for 'words' in
verse 20?

And neither with this sentence.

jue
 
Franken Sense

In Dread Ink, the Grave Hand of Jürgen Exner Did Inscribe:
Franken Sense said:
my @s = split /\s+/, $_;
my $verse = $s[0];
my $script = join(' ', @s[1..$#s]);

As I wrote before you can replace those three lines above with a single

my ($verse, $script) = split /\s+/, $_, 2;

When I print $verse and $script at the end of this loop, I get a faithful
copy of the original, and I'm glad to have that as an option.


split /PATTERN/,EXPR,LIMIT

Since EXPR would default to $_, I have to wonder how general this syntax
could be. What would happen if limit weren't 2? (Perldoc perlfunc doesn't
have a lot on this.)
[...]
The last thing I think this needs
before insertion into a tree is having the newlines of $verse adiosed
unless it's the ultimate one.

I have no idea what you mean by this sentence.
Is it the case that with perl, in verse 18, s[?] has 4 characters for the
word 'the', where the ultimate is newline, or six characters for 'words' in
verse 20?

And neither with this sentence.

jue

It doesn't help that I also don't really know what I'm getting at here.
Here's what these data look like in a hex editor:

http://lomas-assault.net/usenet/z30.jpg

About halfway down on the screen dump the F0 line begins with
74 68 65
, which is the 'the'
, followed by
0D 0A
, cr-lf, the newline on my implementation,
then
twelve '20''s, which is whitespace.

I want to get rid of the cr-lf's in the middle of a verse. I think with
the more verbose version, I get rid of newlines that don't immediately
precede the next verse.

#!/usr/bin/perl
# perl m21.pl
use strict;
use warnings;


local $/="";
my $len = 5;

while( my $line = <DATA> ) {
    chomp($line);
    my @s = split /\s+/, $line;
    for my $i (0..$#s) {
        $len = length($s[$i]);
        print "s[$i] = $s[$i] $len \n";
    }

}
__DATA__
44:005:017 Then the high priest rose up, and all they that were with him,
(which is the sect of the Sadducees,) and were filled with
indignation,

44:005:018 And laid their hands on the apostles, and put them in the
common prison.

44:005:019 But the angel of the Lord by night opened the prison doors,
and brought them forth, and said,

44:005:020 Go, stand and speak in the temple to the people all the words
of this life.

# end script begin abridged output

s[12] = the 3
s[13] = common 6
s[14] = prison. 7

It looks like the 'the' has no newlines attached. Tja.
--
Frank

The irony upon irony of this lawsuit was great. First, Fox having the
trademark 'fair and balanced' -- a network which is anything but fair and
balanced. Then there's the irony of a news organization trying to suppress
free speech.
~~ Al Franken, CNN interview
 
Peter J. Holzer

Franken Sense said:
my @s = split /\s+/, $_;
my $verse = $s[0];
my $script = join(' ', @s[1..$#s]);

As I wrote before you can replace those three lines above with a single

my ($verse, $script) = split /\s+/, $_, 2;

That's not the same. Franken's version splits the script into
white-space separated words and then joins them with a single space. In
other words, it replaces all sequences of whitespace with a single
space. Your version doesn't. This should be equivalent:

my ($verse, $script) = split /\s+/, $_, 2;
$script =~ s/\s+/ /g;

(except that this will also preserve whitespace at the end).
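
A sketch comparing the two approaches on one record, assuming the record arrives with an interior newline and indentation as in the hex dump described earlier (the sample string itself is made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical record: line break plus indentation inside the verse text.
my $record = "44:005:018 And laid their hands on the apostles, and put them in the\n"
           . "            common prison.\n";

# Version 1: split into every word, then rejoin with single spaces.
my @s      = split /\s+/, $record;
my $joined = join ' ', @s[ 1 .. $#s ];

# Version 2: split off only the verse number, then collapse whitespace runs.
my ( $verse, $script ) = split /\s+/, $record, 2;
$script =~ s/\s+/ /g;

print "[$joined]\n";
print "[$script]\n";
```

The two strings match except for the end: split() drops the empty trailing field in version 1, while version 2 turns the final newline into a single trailing space.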
[...]
The last thing I think this needs before insertion into a tree is
having the newlines of $verse adiosed unless it's the ultimate one.

I have no idea what you mean by this sentence.

I think he means that he wants to remove (adios = tschüß) all but the
last newline from each verse. But he already does that, so maybe I'm
wrong.

hp
 
Franken Sense

In Dread Ink, the Grave Hand of Jürgen Exner Did Inscribe:
A quick search for 'Binary Tree' on CPAN returns several hundred
results, the very first one being "Tree::Binary" with many more
interesting modules on the same and the next page.

I've got a new problem here. I've installed Tree::Binary using
activestate's ppm. To test, I've run a script with the line
use Tree::Binary;
. I find no Binary.pm in the Tree folder where I would expect it:

http://lomas-assault.net/usenet/z31.jpg

I wouldn't know how to proceed.
 
Jürgen Exner

Peter J. Holzer said:
Franken Sense said:
my @s = split /\s+/, $_;
my $verse = $s[0];
my $script = join(' ', @s[1..$#s]);

As I wrote before you can replace those three lines above with a single

my ($verse, $script) = split /\s+/, $_, 2;

That's not the same. Franken's version splits the script into
white-space separated words and then joins them with a single space. In
other words, it replaces all sequences of whitespace with a single
space. Your version doesn't.

You are right, I missed that desired side effect, sorry.


jue
 
Jürgen Exner

Franken Sense said:
split /PATTERN/,EXPR,LIMIT

Since EXPR would default to $_, I have to wonder how general this syntax
could be. What would happen if limit weren't 2? (Perldoc perlfunc doesn't
have a lot on this.)

What potential problem do you see? Limit can be whatever value you
choose. The docs say
If LIMIT is specified and positive, splits into no more than
that many fields (though it may split into fewer).

The clarification in parentheses is obviously addressing the case that
there are fewer fields in the data than LIMIT, so that is covered, too.
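
A quick sketch of how different positive LIMIT values behave, using a hypothetical four-word string:

```perl
use strict;
use warnings;

my $line = 'a b c d';

my @two   = split /\s+/, $line, 2;    # at most 2 fields; the rest stays in the last one
my @three = split /\s+/, $line, 3;    # at most 3 fields
my @ten   = split /\s+/, $line, 10;   # only 4 fields exist, so only 4 come back

print scalar(@two),   ": ", join( '|', @two ),   "\n";
print scalar(@three), ": ", join( '|', @three ), "\n";
print scalar(@ten),   ": ", join( '|', @ten ),   "\n";
```

The last field always holds whatever remains of the string once the limit is reached, unsplit.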

jue
 
Jürgen Exner

Franken Sense said:
I want to get rid of the cr-lf's in the middle of a verse. I think with
the more verbose version, I get rid of newlines that don't immediately
precede the next verse.

The most prudent course of action might be to radically clean up your
data by removing _ALL_ cr-lf's anywhere in your text by a simple
s/\n//g;
Then at least you know exactly what your data looks like and you can
easily add a new "\n" at the very end if you feel like it.
Personally I consider "\n" not to be data but formatting and in general
will add those only while actually print()ing data.

jue
 
John W. Krahn

Jürgen Exner said:
The most prudent course of action might be to radically clean up your
data by removing _ALL_ cr-lf's anywhere in your text by a simple
s/\n//g;

cr is represented by \r so that doesn't remove any cr just lf.
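
A small sketch of the difference, assuming the string really does contain carriage returns (for example because the file was read in binary mode; a text-mode read on Windows would normally have translated CR LF to "\n" already). The sample string is made up:

```perl
use strict;
use warnings;

# Hypothetical raw data with CR LF line endings.
my $raw = "put them in the\r\ncommon prison.\r\n";

( my $lf_only = $raw ) =~ s/\n//g;       # removes line feeds, carriage returns remain
( my $clean   = $raw ) =~ s/\r?\n//g;    # removes CR LF pairs as well as bare LFs

my $cr_left = ( $lf_only =~ tr/\r// );   # count the carriage returns still present

print "carriage returns left after s/\\n//g: $cr_left\n";
print "cleaned: [$clean]\n";
```

Removing the line ending without inserting a space glues the neighbouring words together, so replacing with a single space may be closer to what is wanted here.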



John
 
John W. Krahn

Franken said:
In Dread Ink, the Grave Hand of Jürgen Exner Did Inscribe:
What I want it to do is join the first through the ultimate words in s.
If you don't know what $#s means, then maybe you could ask? If you don't
know what (1..$#s) means, then maybe you could ask? Using code fragments
that you picked up somewhere without knowing their meaning and throwing
them together rarely produces useful code.

$@s is the highest index in the array @s, i.e. a number.
(1..$#s) is the list of numbers from 1 to the highest index of @s.
Nowhere does it relate to the content of @s.

What you want is maybe
@s[1..$#s]
which is a slice of the array @s, containing the elements from index 1
to the highest index. However I would probably use shift() instead to
remove the first element from an array.


C:\MinGW\source>perl m9.pl
44:005:017 Then the high priest rose up, and all they that were with him,
(which
is the sect of the Sadducees,) and were filled with indignation,
44:005:018 And laid their hands on the apostles, and put them in the common
pris
on.
44:005:019 But the angel of the Lord by night opened the prison doors, and
broug
ht them forth, and said,
44:005:020 Go, stand and speak in the temple to the people all the words of
this
life.

C:\MinGW\source>type m9.pl
#!/usr/bin/perl
# perl m9.pl
use warnings;
use strict;

local $/="";
while ( <DATA> ) {
    my @s = split /\s+/, $_;
    my $verse = $s[0];
    my $script = join(' ', @s[1..$#s]);
    print "$verse $script\n";
}

local $/ = '';
while ( <DATA> ) {
    my ( $verse, @s ) = split;
    my $script = join ' ', @s;
    print "$verse $script\n";
}
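
A sketch of what the argument-less split in the second loop does, using a hypothetical record with leading whitespace. Called with no arguments, split works on $_ and behaves like split ' ', which skips leading whitespace; an explicit /\s+/ pattern instead yields an empty first field in that case:

```perl
use strict;
use warnings;

# Hypothetical record with leading whitespace and an interior line break.
my $record = "  44:005:019 But the angel\n  of the Lord\n";

my ( $verse, @words, @plain );
for ($record) {
    ( $verse, @words ) = split;     # no arguments: same as split ' ', $_
    @plain = split /\s+/, $_;       # explicit pattern keeps an empty leading field
}

print "verse: [$verse]\n";
print "words: [@words]\n";
print "first field with /\\s+/: [$plain[0]]\n";
```

For the verse data in this thread, where each record starts with the verse number, the two forms give the same fields.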




John
 
Franken Sense

In Dread Ink, the Grave Hand of Franken Sense Did Inscribe:
In Dread Ink, the Grave Hand of Jürgen Exner Did Inscribe:


I've got a new problem here. I've installed Tree::Binary using
activestate's ppm. To test, I've run a script with the line
use Tree::Binary;
. I find no Binary.pm in the Tree folder where I would expect it:

http://lomas-assault.net/usenet/z31.jpg

I wouldn't know how to proceed.

I went to cpan and downloaded the tar.gz package for this, extracted it on
my desktop, and somehow I have .pm files where I had .html files (in the
site folder not on my desktop). I would have thought that I remembered
falsely what I was looking at before, but I saved it in a screenshot
yesterday: http://lomas-assault.net/usenet/z31.jpg

So it would appear that making the ppm install was half the solution and
extracting the tar.gz file was the second half.

The bad news is that this is very much beyond my ability with perl, and
finding examples has been fruitless. (For what would one google?)

It pretty much all looks like this:

sub setRight {
    my ($self, $tree) = @_;
    (blessed($tree) && $tree->isa("Tree::Binary"))
        || die "Insufficient Arguments : right argument must be a Tree::Binary object";
    $tree->{_parent} = $self;
    $self->{_right} = $tree;
    unless ($tree->isLeaf()) {
        $tree->fixDepth();
    }
    else {
        $tree->{_depth} = $self->getDepth() + 1;
    }
    $self;
}

There's nothing in the camel book on it, so I would rather take the
discretion part of valor as opposed to getting myself in too deep.

I think, instead, I'll try to shoehorn these data into a hash and see what
I can pull off.
--
Frank

No Child Left Behind is the most ironically named act, piece of legislation
since the 1942 Japanese Family Leave Act.
~~ Al Franken, in response to the 2004 SOTU address
 
