Problem modifying captured regexp results

Paul van Delst

Hello,

I'm using ruby to automatically generate Fortran95 code and I'm using a regular expression
to parse the following type of definition line:

REAL(fp), DIMENSION(Dim1,Dim2) :: Arr2 ! Description of Arr2

The regexp I'm using works fine and I build an array of hashes, one for each definition, i.e.

if line =~ componentRegexp
  # We have matched an array component definition
  arrayList << {"type"=>$1,
                "param"=>$2,
                "dimlist"=>$3,
                "name"=>$4,
                "description"=>$5}
  puts(arrayList.last.inspect)
else
  # No match, so $~ is nil here; report the offending line instead
  raise StandardError, "Invalid array definition: #{line}"
end

This works fine. The inspect output gives me:

{"name"=>"Arr2", "type"=>"REAL", "description"=>"Description of Arr2", "param"=>"fp",
"dimlist"=>"Dim1,Dim2"}

However, what I want to do is modify the dimlist in the hash so it is an array of strings,
"dimlist"=>["Dim1","Dim2"]
rather than a single string,
"dimlist"=>"Dim1,Dim2"

Because the number of dimensions in the dimlist can vary from 1 to 7, rather than do the
splitting in the regexp, I tried doing it in the arrayList concatenation using the split
method like so,

arrayList << {"type"=>$1,
              "param"=>$2,
              "dimlist"=>$3.split(/\s*,\s*/), # <--- split dimlist on ","
              "name"=>$4,
              "description"=>$5}

but I've found that the above operation on the $3 captured result appears to "wipe" the
subsequent entries $4 (name) and $5 (description). For example, the output of
puts(arrayList.last.inspect)
on the above gives me,

{"name"=>nil, "type"=>"REAL", "description"=>nil, "param"=>"fp", "dimlist"=>["Dim1", "Dim2"]}

Note that the "dimlist" is how I want it, but "name" and "description" entries are now nil.

So can someone elaborate on why the above split operation on captured regexp results seems
to bugger up the other captured results? Does this issue extend to *any* operation on
captured regexp results?

I've looked through the pickaxe and cookbook, but no information on this was immediately
apparent.

Thanks for any info.

cheers,

paulv
 
Pit Capitain

Paul said:
(...)

if line =~ componentRegexp
  # We have matched an array component definition
  arrayList << {"type"=>$1,
                "param"=>$2,
                "dimlist"=>$3.split(/\s*,\s*/), # <--- split dimlist
                "name"=>$4,
                "description"=>$5}

but I've found that the above operation on the $3 captured result
appears to "wipe" the subsequent entries $4 (name) and $5 (description).

Paul, the problem is that #split with a Regexp internally executes some
Regexp matches which change $1, $2 etc. You have to capture the results
of the first match before executing the split.
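One way to do that, as a minimal sketch: keep the MatchData object in a local variable and read the groups from it rather than from $1..$5, so later matches cannot disturb them. The pattern COMPONENT_REGEXP and the method name parse_component are made up for illustration (the actual componentRegexp was not posted), so treat the regexp as an assumption that only fits the example line.

```ruby
# Hypothetical pattern: the real componentRegexp is not shown in the thread.
COMPONENT_REGEXP = /\A\s*(\w+)\s*\(\s*(\w+)\s*\)\s*,\s*DIMENSION\s*\(\s*(.+?)\s*\)\s*::\s*(\w+)\s*!\s*(.*?)\s*\z/i

# Hypothetical helper: returns one hash per definition line.
def parse_component(line)
  # Regexp#match returns a MatchData object (or nil); unlike $1..$5,
  # the local variable m is not touched by any later regexp match.
  m = COMPONENT_REGEXP.match(line)
  raise ArgumentError, "Invalid array definition: #{line.inspect}" unless m

  { "type"        => m[1],
    "param"       => m[2],
    "dimlist"     => m[3].split(/\s*,\s*/), # position no longer matters
    "name"        => m[4],
    "description" => m[5] }
end

p parse_component("REAL(fp), DIMENSION(Dim1,Dim2) :: Arr2 ! Description of Arr2")
```

Because every group is read from m, the split can sit anywhere in the hash literal.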

Regards,
Pit
 
Paul van Delst

Pit said:
Paul, the problem is that #split with a Regexp internally executes some
Regexp matches which change $1, $2 etc. You have to capture the results
of the first match before executing the split.

Aha! That is the answer to the question (see my other post).

Bewdy. Thanks Pit and Gavin.

cheers,

paulv
 
Jan Svitok

Pit said:
Paul, the problem is that #split with a Regexp internally executes some
Regexp matches which change $1, $2 etc. You have to capture the results
of the first match before executing the split.

In this case, it's really easy: just reorder the lines so that the one
containing split comes last (in Ruby 1.8 the hash does not keep insertion order anyway):

arrayList << {"type"=>$1,
              "param"=>$2,
              "name"=>$4,
              "description"=>$5,
              "dimlist"=>$3.split(/\s*,\s*/)}

This works because by the time split clobbers $1, $2, etc., you no
longer need them. Obviously it would not work if there were more than
one split.
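For the case with more than one split, a sketch along these lines should do: copy all the captures into local variables right after the match, then split as often as needed. COMPONENT_RE, component_fields and the extra "words" entry are hypothetical names invented for this example, and the regexp is only a guess that fits the sample line.

```ruby
# Hypothetical pattern: stands in for the componentRegexp that was not posted.
COMPONENT_RE = /\A\s*(\w+)\s*\(\s*(\w+)\s*\)\s*,\s*DIMENSION\s*\(\s*(.+?)\s*\)\s*::\s*(\w+)\s*!\s*(.*?)\s*\z/i

# Hypothetical helper: returns a hash for a matching line, nil otherwise.
def component_fields(line)
  return nil unless line =~ COMPONENT_RE

  # $~ is the MatchData of the match above; copy all groups into locals
  # before calling anything else that might run a regexp match.
  type, param, dimlist, name, description = $~.captures

  { "type"        => type,
    "param"       => param,
    "dimlist"     => dimlist.split(/\s*,\s*/),
    "name"        => name,
    "description" => description,
    "words"       => description.split(/\s+/) } # second split, made up for the example
end

p component_fields("REAL(fp), DIMENSION(Dim1,Dim2) :: Arr2 ! Description of Arr2")
```

The order of the hash entries no longer matters, since nothing reads $1..$5 after the first line of the method.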
 
