False positives in editing data

R

RichardOnRails

Hi All,

Below is a 33-line program that analyzes a set of names.
The names proper are prefixed with an optional set of period-separated
numbers.
The analysis checks for the following, and flags violations:
1. A number may not start with zeros.
2. No more that two numbers may be prefixed
3. Names proper must begin with a non-digit

The program works pretty well, except it produces "false negatives,"
which I flag on the output displayed following the program. The
problem is around line 29 when I try to determine if a leading number
has any initial zero by using an RE and testing for Nil.

I'd welcome any ideas on how to correct my mistake(s), especially with
suggestion for improved style.

Thanks in advance,
Richard

Program
-------------------------------------------------------------------------------
MLN = 2 # MaxLeadingNumbers
input = <<DATA
05Topic 5
1Topic 1
2.002.1Topic 2.2.1
2.1Topic 2.1
2.2.02Topic 2.2.2
DATA
input.each { |sName|
# Debug
puts "\n" + "="*10 + "DBG", sName, "="*10+ "DBG"

#Get leading numbers separated by periods into $1, $2, $13
sName =~ /^([\d]+)?\.?([\d]+)?\.?([\d]+)?\.?/


# Save match variables in n[1] ... n[MLN+1]
n = Array.new
(1..MLN+1).each { |i| eval %{n = $#{i}} }

# Debug
print "DBG0>> "; (1..MLN+1).each { |i| printf("\t#{n}") }; puts
# puts "n1=#{n[1]}, n2=#{n[2]}, n3=#{n[3]}"

# Get and check leading, period-separated numbers in dir. names
(1..MLN+1).each { |i|
n =~ /^0+/
i_thNumberExists = n ? true : false
iThNumbrerHasLeadingZero = i_thNumberExists && (eval("$#{i}.class ==
NilClass") ?
true : false)
puts "ERROR: Leading zeros in #{n}" if iThNumbrerHasLeadingZero
&& (i <= MLN)
puts "ERROR: Depth of hierarchy exceeds #{MLN}" if i_thNumberExists
&& i > MLN
}
}


Output
---------------------------------------------

==========DBG
05Topic 5
==========DBG
DBG0>> 05
ERROR: Leading zeros in 05

==========DBG
1Topic 1
==========DBG
DBG0>> 1
ERROR: Leading zeros in 1 [False positive]

==========DBG
2.002.1Topic 2.2.1
==========DBG
DBG0>> 2 002 1
ERROR: Leading zeros in 2 [False positive]
ERROR: Leading zeros in 002
ERROR: Depth of hierarchy exceeds 2

==========DBG
2.1Topic 2.1
==========DBG
DBG0>> 2 1
ERROR: Leading zeros in 2 [False positive]
ERROR: Leading zeros in 1 [False positive]

==========DBG
2.2.02Topic 2.2.2
==========DBG
DBG0>> 2 2 02
ERROR: Leading zeros in 2 [False positive]
ERROR: Leading zeros in 2 [False positive]
ERROR: Depth of hierarchy exceeds 2
 
R

RichardOnRails

Hi All,

Below is a 33-line program that analyzes a set of names.
The names proper are prefixed with an optional set of period-separated
numbers.
The analysis checks for the following, and flags violations:
1. A number may not start with zeros.
2. No more that two numbers may be prefixed
3. Names proper must begin with a non-digit

The program works pretty well, except it produces "false negatives,"
which I flag on the output displayed following the program. The
problem is around line 29 when I try to determine if a leading number
has any initial zero by using an RE and testing for Nil.

I'd welcome any ideas on how to correct my mistake(s), especially with
suggestion for improved style.

Thanks in advance,
Richard

Program
-------------------------------------------------------------------------------
MLN = 2 # MaxLeadingNumbers
input = <<DATA
05Topic 5
1Topic 1
2.002.1Topic 2.2.1
2.1Topic 2.1
2.2.02Topic 2.2.2
DATA
input.each { |sName|
# Debug
puts "\n" + "="*10 + "DBG", sName, "="*10+ "DBG"

#Get leading numbers separated by periods into $1, $2, $13
sName =~ /^([\d]+)?\.?([\d]+)?\.?([\d]+)?\.?/

# Save match variables in n[1] ... n[MLN+1]
n = Array.new
(1..MLN+1).each { |i| eval %{n = $#{i}} }

# Debug
print "DBG0>> "; (1..MLN+1).each { |i| printf("\t#{n}") }; puts
# puts "n1=#{n[1]}, n2=#{n[2]}, n3=#{n[3]}"

# Get and check leading, period-separated numbers in dir. names
(1..MLN+1).each { |i|
n =~ /^0+/
i_thNumberExists = n ? true : false
iThNumbrerHasLeadingZero = i_thNumberExists && (eval("$#{i}.class ==
NilClass") ?
true : false)
puts "ERROR: Leading zeros in #{n}" if iThNumbrerHasLeadingZero
&& (i <= MLN)
puts "ERROR: Depth of hierarchy exceeds #{MLN}" if i_thNumberExists
&& i > MLN
}

}

Output
---------------------------------------------

==========DBG
05Topic 5
==========DBG
DBG0>> 05
ERROR: Leading zeros in 05

==========DBG
1Topic 1
==========DBG
DBG0>> 1
ERROR: Leading zeros in 1 [False positive]

==========DBG
2.002.1Topic 2.2.1
==========DBG
DBG0>> 2 002 1
ERROR: Leading zeros in 2 [False positive]
ERROR: Leading zeros in 002
ERROR: Depth of hierarchy exceeds 2

==========DBG
2.1Topic 2.1
==========DBG
DBG0>> 2 1
ERROR: Leading zeros in 2 [False positive]
ERROR: Leading zeros in 1 [False positive]

==========DBG
2.2.02Topic 2.2.2
==========DBG
DBG0>> 2 2 02
ERROR: Leading zeros in 2 [False positive]
ERROR: Leading zeros in 2 [False positive]
ERROR: Depth of hierarchy exceeds 2
Exit code: 0


Hi All,

Problem solved!

I went through a convoluted process to determine whether any of the
initial numbers began with a zero. I simplified it to:

i_thNumberHasLeadingZero = (n =~ /^0+/) ? true : false

I apologize for raising the question. My excuse is that I was
flailing about unable to figure out what was wrong. After I posted,
I took a harder look and realized my big mistake: I needed to use the
KISS method :)

Best wishes,
Richard
 
R

Ryan Davis

I realize you solved your problem, but there is a much more insidious
underlying issues here:

MLN = 2 # MaxLeadingNumbers
MLN
sName =~ /^([\d]+)?\.?([\d]+)?\.?([\d]+)?\.?/
sName
n = Array.new
n
print "DBG0>> "; (1..MLN+1).each { |i|
printf("\t#{n}") };


nasty compounded lines and obtuse debugging output
i_thNumberExists = n ? true : false
iThNumbrerHasLeadingZero = i_thNumberExists &&


i_thNumberExists
iThNumbrerHasLeadingZero


ARGH! My eyes! STOP!!!

Ruby is NOT C.

For the love of all that is ruby, please write readable code!

Use English variable names (or whatever your native language is--just
use words). Don't use hungarian notation, it doesn't help--hell, your
variable names are so ugly you can't even read the misspellings in
them. Don't mix camelcase and underscores. Don't use 'n ? true :
false' when 'n' would do. Don't debug with logging. Write tests. /
[\d]+/ is the same as /\d+/. Those evals are TERRIFYING. Don't use them.

And, most importantly, 2 spaces per indent.

If you follow these suggestions, your code will be much more
understandable and less error prone. Really.
 
H

Harry Kakueki

Problem solved!

I went through a convoluted process to determine whether any of the
initial numbers began with a zero. I simplified it to:

i_thNumberHasLeadingZero = (n =~ /^0+/) ? true : false

I apologize for raising the question. My excuse is that I was
flailing about unable to figure out what was wrong. After I posted,
I took a harder look and realized my big mistake: I needed to use the
KISS method :)

Best wishes,
Richard

If you solved your problem then you won't need this.
But, just in case you are interested in looking at it from a little
different view....
It is not a complete solution, just an idea.

inp = <<DATA
05Topic 5
1Topic 1
2.002.1Topic 2.2.1
2.1Topic 2.1
2.2.02Topic 2.2.2
DATA

err1 = inp.select{|a| a =~ /^0|\.0/}
puts "A number may not start with zero" unless err1.empty?
print err1
puts
err2 = inp.select{|b| b =~ /^\d+\.\d+\.\d/}
puts "No more that two numbers may be prefixed" unless err2.empty?
print err2

Harry
 
R

Raul Parolari

RichardOnRails said:
Below is a 33-line program that analyzes a set of names.
The names proper are prefixed with an optional set of period-separated
numbers.
The analysis checks for the following, and flags violations:
1. A number may not start with zeros.
2. No more that two numbers may be prefixed
3. Names proper must begin with a non-digit
..
I'd welcome any ideas on how to correct my mistake(s), especially with
suggestion for improved style.

Richard,

even after your correction, there are things that we can improve in
your program; the problems I see are:

1) the array of numbers for the prefix is configured with the maximum
(MLN) nr of components, rather than with the entries that are present
within the line. This leads to complicated code (evaluating $n variables
beyond the valid ones, etc).
2) the regular expression sName for the numbers would need to be updated
if MLN changes (in fact, it currently contains more values than needed).
3) the script does not catch if the prefix contains characters other
than digits and dots. Try eg, prefix '2.!1' ; the prefix will be
recognized as 2, and consider the entry valid.

So the problem is that the code is very complex, and fragile; the good
news is that we can have a simpler and more robust solution; let us
divide the problem in small steps:

1) we first collect everything until the first letter (not included); we
will consider this the Prefix.
2) we verify that the Prefix has the format that we want: numbers
separated by dots.
3) we then split the Prefix into an array of Numbers.
4) we verify that the Array size does not exceed the MLN max size.
5) we then check if any of the entries begins with 0.
*) oh, and we will use the /x modifier to make the regexps legible

Each of the previous steps will be one line of code, so we should have a
very small script; I will comment the lines with the same number used
above, so you can map every statement with the 'battle plan' above:

class FormatError < Exception
# used right now only for unrecoverable errors
end

input.each do |line|

# debug info
puts "\n" + "="*10 + "DBG", line, "="*10+ "DBG\n"

# 1) collect everything (not greedy) until first letter
if line =~ /^ (.*?) [a-zA-Z] /x

prefix = $1
puts "prefix= '#{prefix}'"

# 2) validate format: numbers separated by digits
if prefix =~ /^ (?:\d+ [.]?)+ $ /x

# 3) collect the numbers in an array
arr = prefix.split('.')

# 4) verify max array size
puts " (Error: depth of hierarchy > #{MLN}!)" if
arr.length > MLN

# 5) any number begins with 0?
puts " (Error: a nr begins with 0!) " if
arr.any? { |v| v =~ /^0/ }

print " Numbers: ", arr.join(', '), "\n"

else
raise FormatError, "Prefix format NOT nr.nr.nr.."
end

else
print "the line has no prefix (option or error?)"
end

end # input

This program generated, tested with the same data you used, this output:

==========DBG
05Topic 5
==========DBG
prefix= '05'
(Error: a nr begins with 0!)
Numbers: 05

==========DBG
1Topic 1
==========DBG
prefix= '1'
Numbers: 1

==========DBG
2.002.1Topic 2.2.1
==========DBG
prefix= '2.002.1'
(Error: depth of hierarchy > 2!)
(Error: a nr begins with 0!)
Numbers: 2, 002, 1

==========DBG
2.1Topic 2.1
==========DBG
prefix= '2.1'
Numbers: 2, 1

==========DBG
2.2.02Topic 2.2.2
==========DBG
prefix= '2.2.02'
(Error: depth of hierarchy > 2!)
(Error: a nr begins with 0!)
Numbers: 2, 2, 02

As you can see, it seems to work well (it also catches double errors,
which is nice). Notice that if the max array size had to be changed, we
just need to modify the MLN value.

I hope this made sense, and that it helped.
Good luck!

Raul
 
R

RichardOnRails

Problem solved!
I went through a convoluted process to determine whether any of the
initial numbers began with a zero. I simplified it to:
i_thNumberHasLeadingZero = (n =~ /^0+/) ? true : false

I apologize for raising the question. My excuse is that I was
flailing about unable to figure out what was wrong. After I posted,
I took a harder look and realized my big mistake: I needed to use the
KISS method :)
Best wishes,
Richard

If you solved your problem then you won't need this.
But, just in case you are interested in looking at it from a little
different view....
It is not a complete solution, just an idea.

inp = <<DATA
05Topic 5
1Topic 1
2.002.1Topic 2.2.1
2.1Topic 2.1
2.2.02Topic 2.2.2
DATA

err1 = inp.select{|a| a =~ /^0|\.0/}
puts "A number may not start with zero" unless err1.empty?
print err1
puts
err2 = inp.select{|b| b =~ /^\d+\.\d+\.\d/}
puts "No more that two numbers may be prefixed" unless err2.empty?
print err2

Harry


Hi Ryan,
But, just in case you are interested in looking at it from a little
different view....
It is not a complete solution, just an idea.

I AM interested, and it's a great idea. Real Ruby rather than my C-
flavored Ruby :)

Just for our mutual amusement, I replaced your first criterion:

err1 = inp.select{|a| a =~ /^0|\.0/}

with:

err1 = inp.select{|a| a =~ /^0|^\d*\.0/}

to preclude false hits where the text proper might contain ".0", e.g.,
"7How to get 5.05% interest"

Bottom line: Your code is what I really need. Thanks, again.

Best wishes,
Richard
 
H

Harry Kakueki

Hi Ryan,


I AM interested, and it's a great idea. Real Ruby rather than my C-
flavored Ruby :)

Just for our mutual amusement, I replaced your first criterion:

err1 = inp.select{|a| a =~ /^0|\.0/}

with:

err1 = inp.select{|a| a =~ /^0|^\d*\.0/}

to preclude false hits where the text proper might contain ".0", e.g.,
"7How to get 5.05% interest"

Bottom line: Your code is what I really need. Thanks, again.

Best wishes,
Richard

Yeah, you need to modify the regular expressions and whatever to fit
your requirements.
Like I said, it's not a complete solution, just an idea you can use.

By the way, stop calling me Shirley. :)
And stop calling me Ryan. :)


Good luck,

Harry
 
R

RichardOnRails

Yeah, you need to modify the regular expressions and whatever to fit
your requirements.
Like I said, it's not a complete solution, just an idea you can use.

By the way, stop calling me Shirley. :)
And stop calling me Ryan. :)

Good luck,

Harry

Hi H
 
R

RichardOnRails

Yeah, you need to modify the regular expressions and whatever to fit
your requirements.
Like I said, it's not a complete solution, just an idea you can use.

By the way, stop calling me Shirley. :)
And stop calling me Ryan. :)

Good luck,

Harry

Hi Harry,

I apologize for the "Ryan" thing. Ryan was the first response on this
thread and got mixed up. I don't know about the "Shirley" thing.

I don't know what's going on there, but I apologize again for any way
I might have erred.

And I fear I owe you a third apology: I minute ago, I started to
reply to your most recent post when I stopped getting a timely
response to my typing. I continue to type, assuming my ISP would
catch up sooner or later and hit Enter. Suddenly, I got a message
that my half-baked response had been posted. Computers: uggggh!
Like I said, it's not a complete solution, just an idea you can use.

I know. I just wanted to show you I was giving my full attention to
your message.

Again, thanks for showing how Ruby can be used more effectively for
this kind of validation task.

Best wishes,
Richard
 
R

RichardOnRails

RichardOnRails said:
Below is a 33-line program that analyzes a set of names.
The names proper are prefixed with an optional set of period-separated
numbers.
The analysis checks for the following, and flags violations:
1. A number may not start with zeros.
2. No more that two numbers may be prefixed
3. Names proper must begin with a non-digit
..
I'd welcome any ideas on how to correct my mistake(s), especially with
suggestion for improved style.

Richard,

even after your correction, there are things that we can improve in
your program; the problems I see are:

1) the array of numbers for the prefix is configured with the maximum
(MLN) nr of components, rather than with the entries that are present
within the line. This leads to complicated code (evaluating $n variables
beyond the valid ones, etc).
2) the regular expression sName for the numbers would need to be updated
if MLN changes (in fact, it currently contains more values than needed).
3) the script does not catch if the prefix contains characters other
than digits and dots. Try eg, prefix '2.!1' ; the prefix will be
recognized as 2, and consider the entry valid.

So the problem is that the code is very complex, and fragile; the good
news is that we can have a simpler and more robust solution; let us
divide the problem in small steps:

1) we first collect everything until the first letter (not included); we
will consider this the Prefix.
2) we verify that the Prefix has the format that we want: numbers
separated by dots.
3) we then split the Prefix into an array of Numbers.
4) we verify that the Array size does not exceed the MLN max size.
5) we then check if any of the entries begins with 0.
*) oh, and we will use the /x modifier to make the regexps legible

Each of the previous steps will be one line of code, so we should have a
very small script; I will comment the lines with the same number used
above, so you can map every statement with the 'battle plan' above:

class FormatError < Exception
# used right now only for unrecoverable errors
end

input.each do |line|

# debug info
puts "\n" + "="*10 + "DBG", line, "="*10+ "DBG\n"

# 1) collect everything (not greedy) until first letter
if line =~ /^ (.*?) [a-zA-Z] /x

prefix = $1
puts "prefix= '#{prefix}'"

# 2) validate format: numbers separated by digits
if prefix =~ /^ (?:\d+ [.]?)+ $ /x

# 3) collect the numbers in an array
arr = prefix.split('.')

# 4) verify max array size
puts " (Error: depth of hierarchy > #{MLN}!)" if
arr.length > MLN

# 5) any number begins with 0?
puts " (Error: a nr begins with 0!) " if
arr.any? { |v| v =~ /^0/ }

print " Numbers: ", arr.join(', '), "\n"

else
raise FormatError, "Prefix format NOT nr.nr.nr.."
end

else
print "the line has no prefix (option or error?)"
end

end # input

This program generated, tested with the same data you used, this output:

==========DBG
05Topic 5
==========DBG
prefix= '05'
(Error: a nr begins with 0!)
Numbers: 05

==========DBG
1Topic 1
==========DBG
prefix= '1'
Numbers: 1

==========DBG
2.002.1Topic 2.2.1
==========DBG
prefix= '2.002.1'
(Error: depth of hierarchy > 2!)
(Error: a nr begins with 0!)
Numbers: 2, 002, 1

==========DBG
2.1Topic 2.1
==========DBG
prefix= '2.1'
Numbers: 2, 1

==========DBG
2.2.02Topic 2.2.2
==========DBG
prefix= '2.2.02'
(Error: depth of hierarchy > 2!)
(Error: a nr begins with 0!)
Numbers: 2, 2, 02

As you can see, it seems to work well (it also catches double errors,
which is nice). Notice that if the max array size had to be changed, we
just need to modify the MLN value.

I hope this made sense, and that it helped.
Good luck!

Raul

Hi Raul,

Thanks for your response.

I like your "battle plan". It works well for your exposition here. I
always have to prepare a high-level one when I propose a software
solution to a user.

Harry posted a succinct illustration of applying Ruby to this kind of
validation, which I appreciated. I also appreciate your slightly
longer approach because I need this structure
so that I can make use of data for my purpose beyond validation.

I especially appreciate your showing me how a regex can be written to
handle an arbitrary number of dot-separated numbers (rather than hard-
code distinct sub-expressions).
if line =~ /^ (.*?) [a-zA-Z] /x

I have one question about this regex. I have a book that I bought in
2002 but never read until today: "Mastering Regular Expressions", 2nd
Ed., from O'Reilly. I haven't been able to find any reference in
there to a question-mark following ".*".

I thought I could simply remove the question-mark. That caused the
match to fail and yield the programmed error msg. I tried omitting
the question-mark and add a closing "$" in the regex. That made the
parsing fail. So, your question mark is clearly working, but HOW?

I'm fully on board with the rest of your code. Many thanks for it
all!

Regards,
Richard
 
R

RichardOnRails

I realize you solved your problem, but there is a much more insidious
underlying issues here:

MLN = 2 # MaxLeadingNumbers
MLN
sName =~ /^([\d]+)?\.?([\d]+)?\.?([\d]+)?\.?/
sName
n = Array.new
n
print "DBG0>> "; (1..MLN+1).each { |i|
printf("\t#{n}") };


nasty compounded lines and obtuse debugging output
i_thNumberExists = n ? true : false
iThNumbrerHasLeadingZero = i_thNumberExists &&


i_thNumberExists
iThNumbrerHasLeadingZero

ARGH! My eyes! STOP!!!

Ruby is NOT C.

For the love of all that is ruby, please write readable code!

Use English variable names (or whatever your native language is--just
use words). Don't use hungarian notation, it doesn't help--hell, your
variable names are so ugly you can't even read the misspellings in
them. Don't mix camelcase and underscores. Don't use 'n ? true :
false' when 'n' would do. Don't debug with logging. Write tests. /
[\d]+/ is the same as /\d+/. Those evals are TERRIFYING. Don't use them.

And, most importantly, 2 spaces per indent.

If you follow these suggestions, your code will be much more
understandable and less error prone. Really.


Hi Ryan,

Thanks for taking the trouble to critique my code. Since you were
kind enough to take a thorough look at the code and offer detailed
comments, I feel obligated to address each issue you raised.

I agree with some of your responses, don't understand some, and
suspect some are merely a matter of taste. I'm going to tackle them
one-by-one, but I especially want to thank you for suggesting Unit
Test in Ruby, which I haven't used yet (and don't immediately see how
I'd apply them).

Are you suggesting that rather than using an abbreviation for
MaxLeadingNumbers, I should use the longer string? If so, don't you
think that would create unnecessary clutter? Is this more than a
matter of personal taste?
sName =~ /^([\d]+)?\.?([\d]+)?\.?([\d]+)?\.?/

sName

Did you draw attention to this because of the Hungarian notation? If
so, do you think I'm unwise to adopt the style once advocated by
Charles Simoni, super-programmer and co-founder of a giant software
company?
n = Array.new
n
print "DBG0>> "; (1..MLN+1).each { |i|
printf("\t#{n}") };


nasty compounded lines and obtuse debugging output


There was one more piece of that line that got cut off somehow:
"puts". If we get aside from the tracing issue for a moment, don't
you think it's better to have one-liner debugging statements that can
be easily deleted later, rather than multi-line statements where
something might be overlooked when deleting them later?
i_thNumberExists = n ? true : false
iThNumbrerHasLeadingZero = i_thNumberExists &&


i_thNumberExists
iThNumbrerHasLeadingZero


To me, "i_thNumberExists" reads naturally in English as "i'th Number
Exists". I stated out with iTh ... and thought it less readable. I
eventually replaced all instances of iTh, but posted my code before I
completed that correction. My apologies; I agree that had I not made
them uniform, they would provide risks of misspelling.
Ruby is NOT C.

I'm a retired software developer with experience with a lot of
languages, but by far the most experience with C and C++ under
Windows. So it takes a little while to adopt the idioms of new
language.
Use English variable names (or whatever your native language is--just
use words). Don't use hungarian notation, it doesn't help
--hell, your
variable names are so ugly you can't even read the misspellings in
them. Don't mix camelcase and underscores.

I addressed those issues above.
Don't use 'n ? true : false'
when 'n' would do.


Agreed. In this case, I wasn't sure that Ruby was honoring that
equivalence, so I wanted to make it explicit.
Don't debug with logging. Write tests.

GOOD POINT!
/[\d]+/ is the same as /\d+/.
Thanks.

Those evals are TERRIFYING. Don't use them.

How else can you loop through $1, $2 ... without repetitive code?
And, most importantly, 2 spaces per indent.

Agreed. I thought I did that when I posted.
If you follow these suggestions, your code will be much more
understandable and less error prone. Really.

I'll keep my fingers crossed :)

Regards,
Richard
 
C

Clifford Heath

RichardOnRails said:
sName =~ /^([\d]+)?\.?([\d]+)?\.?([\d]+)?\.?/
sName

Did you draw attention to this because of the Hungarian notation? If
so, do you think I'm unwise to adopt the style once advocated by
Charles Simoni, super-programmer and co-founder of a giant software
company?

Yes. Adamantly, and definitely yes.

I might have been a bad idea when he had it, even though to his
credit he was trying to make the best of a bad situation, where
MS had bought the worst C compiler on the planet because the good
ones weren't for sale - they could make more money *not* selling
to MS. Because of a spate of bugs and bad code churned out by the
MS software factory, many caused by type mismatches on function
parameters that weren't detected either at compile time or at
runtime, Hungarian notation *might* have been a good idea once.

It's definitely *not* a good idea with modern C, and even less of
a good idea with Ruby.

Clifford Heath.
 
A

Alex Young

Clifford said:
RichardOnRails said:
sName =~ /^([\d]+)?\.?([\d]+)?\.?([\d]+)?\.?/
sName

Did you draw attention to this because of the Hungarian notation? If
so, do you think I'm unwise to adopt the style once advocated by
Charles Simoni, super-programmer and co-founder of a giant software
company?

Yes. Adamantly, and definitely yes.
Interesting article on why (and where it comes from, and why it might
not have been such a bad idea) here:

http://www.joelonsoftware.com/articles/Wrong.html
 
A

Alex Young

RichardOnRails said:
On Nov 19, 2:23 am, Ryan Davis <[email protected]> wrote:

How else can you loop through $1, $2 ... without repetitive code?

I don't think anyone has replied to this, so...

This is what you've got:
sName =~ /^([\d]+)?\.?([\d]+)?\.?([\d]+)?\.?/


# Save match variables in n[1] ... n[MLN+1]
n = Array.new
(1..MLN+1).each { |i| eval %{n = $#{i}} }


You can get an equivalent for n with the String#match method:

irb(main):001:0> name = "2.2.02Topic 2.2.2"
=> "2.2.02Topic 2.2.2"
irb(main):002:0> m = name.match( /^([\d]+)?\.?([\d]+)?\.?([\d]+)?\.?/ )
=> #<MatchData:0x3a3a324>
irb(main):003:0> m[1]
=> "2"
irb(main):004:0> m[2]
=> "2"
irb(main):005:0> m[3]
=> "02"

You should be able to use that to get around your second eval, too.
 
R

RichardOnRails

RichardOnRails said:
On Nov 19, 2:23 am, Ryan Davis <[email protected]> wrote:
How else can you loop through $1, $2 ... without repetitive code?

I don't think anyone has replied to this, so...

This is what you've got:
sName =~ /^([\d]+)?\.?([\d]+)?\.?([\d]+)?\.?/
# Save match variables in n[1] ... n[MLN+1]
n = Array.new
(1..MLN+1).each { |i| eval %{n = $#{i}} }


You can get an equivalent for n with the String#match method:

irb(main):001:0> name = "2.2.02Topic 2.2.2"
=> "2.2.02Topic 2.2.2"
irb(main):002:0> m = name.match( /^([\d]+)?\.?([\d]+)?\.?([\d]+)?\.?/ )
=> #<MatchData:0x3a3a324>
irb(main):003:0> m[1]
=> "2"
irb(main):004:0> m[2]
=> "2"
irb(main):005:0> m[3]
=> "02"

You should be able to use that to get around your second eval, too.


Hi Alex,
m = name.match( /^([\d]+)?\.?([\d]+)?\.?([\d]+)?\.?/ )

Thanks.

Best wishes,
Richard
 
R

RichardOnRails

RichardOnRails said:
sName =~ /^([\d]+)?\.?([\d]+)?\.?([\d]+)?\.?/
sName
Did you draw attention to this because of the Hungarian notation? If
so, do you think I'm unwise to adopt the style once advocated by
Charles Simoni, super-programmer and co-founder of a giant software
company?

Yes. Adamantly, and definitely yes.

I might have been a bad idea when he had it, even though to his
credit he was trying to make the best of a bad situation, where
MS had bought the worst C compiler on the planet because the good
ones weren't for sale - they could make more money *not* selling
to MS. Because of a spate of bugs and bad code churned out by the
MS software factory, many caused by type mismatches on function
parameters that weren't detected either at compile time or at
runtime, Hungarian notation *might* have been a good idea once.

It's definitely *not* a good idea with modern C, and even less of
a good idea with Ruby.

Clifford Heath.

Hi Clifford,
It's definitely *not* a good idea with modern C, and even less of
a good idea with Ruby.

I don't know anything about Microsoft's choice of compilers. But I
used several C compilers in the '80s, and all cases found Hungarian
notation helpful. I don't think Microsoft's initial choice of
compilers is relevant to my and other's successful employment of that
convention.

Why would it be a bad idea with modern C compilers or with Ruby? You
offer no reason. All it does is add one or two letters before names!
That doesn't bother any human or compiler or interpreter. To my
ears, it sounds like you're telling me you like chocolate after
learning that I like vanilla.

I'm reminded of George Wallace's assessment of the difference between
Democrats vs. Republicans: "There's not a dime's worth of difference.

Best wishes,
Richard
 
A

Alex Young

RichardOnRails said:
RichardOnRails said:
sName =~ /^([\d]+)?\.?([\d]+)?\.?([\d]+)?\.?/
sName
Did you draw attention to this because of the Hungarian notation? If
so, do you think I'm unwise to adopt the style once advocated by
Charles Simoni, super-programmer and co-founder of a giant software
company?
Yes. Adamantly, and definitely yes.

I might have been a bad idea when he had it, even though to his
credit he was trying to make the best of a bad situation, where
MS had bought the worst C compiler on the planet because the good
ones weren't for sale - they could make more money *not* selling
to MS. Because of a spate of bugs and bad code churned out by the
MS software factory, many caused by type mismatches on function
parameters that weren't detected either at compile time or at
runtime, Hungarian notation *might* have been a good idea once.

It's definitely *not* a good idea with modern C, and even less of
a good idea with Ruby.

Clifford Heath.

Hi Clifford,
It's definitely *not* a good idea with modern C, and even less of
a good idea with Ruby.

I don't know anything about Microsoft's choice of compilers. But I
used several C compilers in the '80s, and all cases found Hungarian
notation helpful. I don't think Microsoft's initial choice of
compilers is relevant to my and other's successful employment of that
convention.

Why would it be a bad idea with modern C compilers or with Ruby? You
offer no reason. All it does is add one or two letters before names!
That doesn't bother any human or compiler or interpreter.

It depends on the type of Hungarian that you're using, and it's not
clear from your code sample which it is. If you're using an
abbreviation prefix to denote a semantic difference within a type, then
that's (potentially) useful, in both Ruby and C:

us_username = read_unsafe_input()
s_username = sanitise(us_username)

with us_ meaning unsafe and s_ meaning safe, for example. Not something
I'd use myself, but I can see the utility. If it's denoting a class,
then it's not something I can see as useful, in either C or Ruby. In C,
you're duplicating the compiler's type-checking, and in Ruby,
duck-typing means that you shouldn't need to care; it becomes
readability-damaging line noise.
 
R

Rick DeNatale

sName =~ /^([\d]+)?\.?([\d]+)?\.?([\d]+)?\.?/

sName

Did you draw attention to this because of the Hungarian notation? If
so, do you think I'm unwise to adopt the style once advocated by
Charles Simoni, super-programmer and co-founder of a giant software
company?

Actually this form of Hungarian notation, which was called System
Hungarian in Microsoft, is NOT what Simonyi originally sugested (and
what was used in the Application Division).

http://talklikeaduck.denhaven2.com/articles/2007/04/09/hungarian-ducks
 
T

Todd Benson

Hi Harry,

I apologize for the "Ryan" thing. Ryan was the first response on this
thread and got mixed up. I don't know about the "Shirley" thing.

Harry is quoting from the movie Airplane (1980) :)

Todd
 
Y

yermej

if line =~ /^ (.*?) [a-zA-Z] /x

I have one question about this regex. I have a book that I bought in
2002 but never read until today: "Mastering Regular Expressions", 2nd
Ed., from O'Reilly. I haven't been able to find any reference in
there to a question-mark following ".*".

I thought I could simply remove the question-mark. That caused the
match to fail and yield the programmed error msg. I tried omitting
the question-mark and add a closing "$" in the regex. That made the
parsing fail. So, your question mark is clearly working, but HOW?

I can help with this one. Check "Mastering Regular Expressions" for
non-greedy operators. Normally, ".*" will match everything it possibly
can. Adding the "?" causes it to do a minimal match -- it will match
as little as necessary to still fill the requirements. In the above
case, it matches everything until the first letter. Without the "?" it
matches everything until the last letter.

Jeremy
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,744
Messages
2,569,484
Members
44,904
Latest member
HealthyVisionsCBDPrice

Latest Threads

Top