Question regarding design of the String Class

Michael W. Ryder

Was there a reason the String class was implemented with str[i]
returning the code of position i in str? The reason I ask is that
in other languages str[i] returns the string starting at position i.
For example, C uses t = strcpy(str[i]) and Business Basic uses S$=T$(I)
to copy a string from position i.
I can see no way to do this in Ruby other than using something like: t =
str[i,9999]. It seemed strange that copying ranges of strings uses the
same format as C (t = strncpy(str[i],n)) but not when copying the remainder.
 
Roland Crosby

Try str[i,-1], or one of the myriad other ways to access ranges of a
string as defined in String#[]
 
Michael W. Ryder

Roland said:
Try str[i,-1], or one of the myriad other ways to access ranges of a
string as defined in String#[]

If I enter:
a = "This is a test."
b = a[1, -1]
puts b
irb returns nil. Obviously this is not what I want. If instead of -1 I
use 9999 it returns "his is a test.", which is what I was looking for. This
seems like a kludge and an inconsistency. As I pointed out, other
languages just use b = a[1] to get the remainder of the string instead
of 104. The String class already has methods like each_byte for
converting characters in a string to numbers, so why does it need
another shortcut for something that is probably very rarely used?
 
Roland Crosby

Sorry, I meant a[1..-1] rather than a[1,-1]. I don't know why Ruby
returns the character codes like that, but for what it's worth, I
believe Ruby 1.9 is going to switch to returning a single-character
string when you put one integer in String#[].
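
For anyone following along, the variants mentioned so far can be compared side by side. This is only a small sketch, not output taken from the thread; the values in the comments assume Ruby 1.9 or later, where a single integer index yields a one-character string (on 1.8, a[1] would be 104 as described above).

```ruby
a = "This is a test."

rest_by_range  = a[1..-1]        # "his is a test." (-1 refers to the last character)
rest_by_length = a[1, a.length]  # "his is a test." (an over-long length is clipped)
rest_by_kludge = a[1, 9999]      # same result as above
negative_len   = a[1, -1]        # nil: in the [start, length] form a negative length is rejected
single_index   = a[1]            # "h" on 1.9+, 104 on 1.8
char_code      = a.getbyte(1)    # 104, the numeric value asked for explicitly

puts rest_by_range
```

The range form a[1..-1] is the one that reads as "from position 1 to the end" without needing to know the length.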
 
Daniel Martin

Michael W. Ryder said:
Was there a reason the string class was implemented with str[i]
returning the code of position i in str? The reason I ask this is
that in other languages str[i] returns the string starting at position
i. For example C uses t = strcpy(str[i]) and Business Basic uses
S$=T$(I) to copy a string from position i.


I can't comment on what "Business Basic" uses, but your C code is
completely wrong. In C, str[i] returns a char which, since C has
"char" as one of its integral types, is equivalent to returning the
character code.

The usual usage of strcpy to copy only from the second (index 1)
character onward is:

strcpy(dest, src + 1);

(And incidentally, using strcpy instead of strncpy is a practice that
often leads to security vulnerabilities)

In other words, Ruby's behavior with str[i] matches the behavior of C
- it returns the character at that position, where "character" is
viewed simply as a number.
 
Rick DeNatale

Michael W. Ryder said:
Was there a reason the string class was implemented with str[i]
returning the code of position i in str? The reason I ask this is
that in other languages str[i] returns the string starting at position
i. For example C uses t = strcpy(str[i]) and Business Basic uses
S$=T$(I) to copy a string from position i.


I can't comment on what "Business Basic" uses, but your C code is
completely wrong. In C, str[i] returns a char which, since C has
"char" as one of its integral types, is equivalent to returning the
character code.


Daniel, your points are well taken, but if the rusty old neurons in
my brain which contain knowledge of C aren't mistaken, str[i] isn't a
function, and therefore doesn't 'return' anything.

C doesn't really have a string type. A string literal is really an
array of chars, although in almost all cases (i.e. other than when
it's used in a string initializer, or as the argument to sizeof), it's
interpreted as a pointer to the first character, due to the
relationship between arrays and pointers in C.

So if str is declared either as:

char str[];
or
char *str;

the expression str[i] is equivalent to *((str) + (i)), it's really a
pointer to a char, which because of the relationship between arrays
and pointers in C, can be interpreted as an array of chars.

And in Ruby the whole notion of pointers is meaningless.

The point here, of course, is that when learning Ruby, or any other
language, one needs to be aware that things one knows from other
languages often don't carry over without conceptual modification, if
at all.

If all languages did everything exactly the same way, there'd be no
need for so many of them.

To sum it up, let Ruby be Ruby, don't expect it to be Java, C++,
Visual Basic, or anything else.
 
Michael W. Ryder

I guess my point was that str[i] behaves totally differently from all the
other implementations of []. All of the others return a string. This
seems to be an inconsistency. If there is a valid reason for it I have
no problem; it just makes it harder to transfer over 25 years of
experience to a new language.
In Business Basic or C, if I want the numeric value of a character in a
string I specify that. Likewise, if I want to copy a string from an
arbitrary position I don't have to specify an ending character as in
Ruby. I just find s = t[i, -1] to be much harder to understand in a
quick read than s = t[i]. Others may not have this problem.
 
Michael W. Ryder

Roland said:
Sorry, I meant a[1..-1] rather than a[1,-1].


I figured that out right after I posted my reply. I had forgotten about
ranges as I have never programmed in a language that used them before.
The little differences can really get you, especially when you find so
many similarities.


 
Brian Candler

I guess my point was that str[i] behaves totally different from all the
other implementations of []. All of the others return a string.


You clearly know a lot of languages then :)

As pointed out before, in C, str[i] is an expression whose value is an
integer for the character at position i, exactly as in Ruby.

In Perl, it doesn't do what you expect either:

$ perl -e '$a = "abcde"; print $a[2], "\n";'

$

(what this actually does is extract an element from the array @a, which I
have not initialised, and is completely unrelated to the scalar $a)

In Business Basic or C if I want the numeric value of a character in a
string I specify that. Likewise if I want to copy a string from an
arbitrary position I don't have to specify an ending character like
Ruby. I just find s = t[i, -1] to be much harder to understand in a
quick read than s = t[i]. Others may not have this problem.


Personally I would be *very* surprised if str[i] returned all the characters
from 'i' to the end of the string. But then I don't program in Business
Basic.

I do program in C though. If I wanted the string from position i to the end
of the string, I would write str + i, or possibly &str[i]

In Perl you have to be explicit and call substr()

Brian.
 
Michael W. Ryder

Brian said:
I guess my point was that str[i] behaves totally different from all the
other implementations of []. All of the others return a string.


You clearly know a lot of languages then :)


I probably should have phrased that differently. What I meant was that all
of the other implementations of [] in Ruby for the String class return a
string; only str[i] returns a number.

Personally I would be *very* surprised if str[i] returned all the characters
from 'i' to the end of the string. But then I don't program in Business
Basic.

Business Basic has been doing this for the more than 25 years I have been
programming in it. For example, if I enter A$="abcdefg" and then say
Print A$(3), it prints cdefg. Like Ruby, if I enter B$=A$(3,3), B$
contains cde. Other than the beginning number of the string they act
the same.

I do program in C though. If I wanted the string from position i to the end
of the string, I would write str + i, or possibly &str[i]


But you do not have to provide a length or ending position for the copy
which was part of my confusion. I specify a starting position and the
language copies the rest of the string. In Ruby just providing a
starting position gives me a numeric value.
 
Robert Dober

Brian said:
I guess my point was that str[i] behaves totally different from all the
other implementations of []. All of the others return a string.


You clearly know a lot of languages then :)


I probably should have phrased that differently. What I meant was that all
of the other implementations of [] in Ruby for the String class return a
string, only str[i] returns a number.

I copy that. You have made a somewhat valid point, one that has been
discussed before, and I do not like it either that "ab"[0] == ?a (instead
of "a"). But it is not a clear-cut error either.

The overloading (in human terms not computer science terms) of [] to
get elements and substrings of a string might not be the best choice
either. And that there is String#each_byte and not
String#each_character might hurt too.
But there are other tools around that make up for it.

In Business Basic or C if I want the numeric value of a character in a
string I specify that. Likewise if I want to copy a string from an
arbitrary position I don't have to specify an ending character like
Ruby. I just find s = t[i, -1] to be much harder to understand in a
quick read than s = t[i]. Others may not have this problem.


I guess that the influence of *Basic and C* on Ruby is minimal. In
order to convince a Rubyist that other features might be nice because
they are present in language X, I'd rather choose X from Python, Lisp,
Smalltalk, Self, Io or Lua (and I am leaving out some through laziness and
ignorance).

Well, if you want to get the maximum from Ruby I'd advise you, sorry if
this sounds blunt, to take a break from comparing it so much with
other languages.
Paradigm shifts are tough. After that break you might still think that
"ab"[0] == ?a is not a good thing, but I am sure that you will be able
to bring your point across much better.

Sorry if I became lecturing, just thought it might help, after all ;).
I remember very well when I was lectured about duck typing, first I
was angry, and I said lots of stupid things (they were very clever in
my Ada world of course), but when I let go and looked at things as
they were I really shifted into the paradigm of Ruby, and yes I still
get bitten by duck typing and no I do not introduce type checking, I
just write better tests.

Welcome to Ruby.

Cheers
Robert
 
Michael W. Ryder

Robert said:
I guess that the influence of *Basic and C* on Ruby is minimal. In
order to convince a Rubyist that other features might be nice because
they are present in language X, I'd rather choose X from Python, Lisp,
Smalltalk, Self, Io or Lua (and I am leaving out some through laziness and
ignorance).


I am not worried about the influence of other languages or trying to
make Ruby like language X, I am trying to understand the logic behind
some of the choices. It makes it hard when at first glance it looks
familiar, but then you get bitten because of the differences that are
not readily apparent.

I program for a living in Business Basic so I can't take a break from
the language to learn Ruby. Instead, I am trying to convert functions
from Business Basic to Ruby and may consider then trying to convert some
of my programs to Ruby. I have already completed one of the function
conversions and may post the results later. Part of the reason for the
conversion would be the ability to "upgrade" the programs to use things
like browsers and SQL instead of the glass tty and flat files built into
Business Basic.
 
Daniel Martin

One small nit:

Rick DeNatale said:
the expression str[i] is equivalent to *((str) + (i)), it's really a
pointer to a char, which because of the relationship between arrays
and pointers in c, can be interpreted as an array of chars.


No. str[i] is indeed equivalent to *((str) + (i)), but that's not a
pointer to a char.

((str) + (i)) is a pointer to a char.

*((str) + (i)) is a char.

*((str) + (i)) cannot be interpreted as an array of chars starting i
characters into the original string.

((str) + (i)) could be so interpreted. ((str) + (i)) is equivalent to
&(str[i]) which is a very different thing from str[i].
 
Lee

You know, you can always type:

s = 'Hello World'
puts s[1,s.length - 1]

Maybe that's a bit wordy for you? I think Ruby's use of [] really
follows convention that you are accessing an index. That is what most
programmers think of when they see []. If you provide a single
Fixnum, one would expect you would get back the contents of that
index, and since strings are arrays of characters, that is what you
are indeed getting back.
 
Robert Dober

I program for a living in Business Basic so I can't take a break from
the language to learn Ruby. Instead, I am trying to convert functions
from Business Basic to Ruby and may consider then trying to convert some
of my programs to Ruby. I have already completed one of the function
conversions and may post the results later. Part of the reason for the
conversion would be the ability to "upgrade" the programs to use things
like browsers and SQL instead of the glass tty and flat files built into
Business Basic.

I see. That makes it particularly difficult; well, it just might be a
slower process.
My hint would then be to concentrate on things that seem logical to
you and continue discussing things that do not on this list.
The problem with the single item you chose is that it is indeed a
little odd, but useful in some circumstances.
Maybe you should just live with the str[i..i] notation for a while and
focus on other things.
Sorry for not being more helpful :(

Cheers
Robert
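
For reference, the str[i..i] notation (and the equivalent str[i, 1]) gives back a one-character string no matter how a bare integer index behaves. A brief sketch with made-up variable names; the comments assume Ruby 1.9 or later for the last comparison's operands, although the comparison itself should hold on 1.8 too, since there both sides are the number 97.

```ruby
str = "ab"
i = 0

by_range  = str[i..i]          # "a"
by_length = str[i, 1]          # "a"
past_end  = str[5..5]          # nil: the start lies beyond the end of the string
same      = (str[0] == ?a)     # true; on 1.9+ both sides are the string "a"

puts by_range
```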
 
Michael W. Ryder

Lee said:
You know, you can always type:

s = 'Hello World'
puts s[1,s.length - 1]

Maybe that's a bit wordy for you? I think Ruby's use of [] really
follows convention that you are accessing an index. That is what most
programmers think of when they see []. If you provide a single
Fixnum, one would expect you would get back the contents of that
index, and since strings are arrays of characters, that is what you
are indeed getting back.

If I were getting back a character or string I would not have a problem.
The problem is that every other use of [] in the string class returns
a string or nil, this one returns an integer. I was curious if this was
necessary for some other part of the language or just an "accident".
 
Michael W. Ryder

Robert said:
I see. That makes it particularly difficult; well, it just might be a
slower process.
My hint would then be to concentrate on things that seem logical to
you and continue discussing things that do not on this list.
The problem with the single item you chose is that it is indeed a
little odd, but useful in some circumstances.
Maybe you should just live with the str[i..i] notation for a while and
focus on other things.
Sorry for not being more helpful :(

That's what I plan on doing. Of course my next project will really test
me, as I am going to have to see if there is a way to duplicate a
function in Business Basic that returns the position of a string in
another string. I know that if I "hard code" the search string I can
use str =~ /test/ to get what I want for some cases. The problem is
that I am trying to make test a string as well and haven't figured out how
to do this yet. Just using str.include? test returns either true or
false but not the position of test in str. To make this even more
complicated, I want to be able to say something like a = pos(b,c,x) where
b and c are strings and x is an integer. This would mean that the
search only starts at every xth character, i.e. if b = "1234" and c =
"234123411234" a would equal 9 rather than 3.
Thanks for the input.
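
One possible approach, sketched under assumptions: String#index accepts a plain string and returns the zero-based offset of the first match (or nil), which covers the simple case without a hard-coded regular expression. The pos helper below is hypothetical. Its name and argument order come from the description above; returning a 1-based position, returning 0 for "not found", and using x = 4 in the example are all assumptions, since the post does not state them.

```ruby
# Hypothetical helper modelled on the pos(b, c, x) call described above.
# needle   - the string to look for
# haystack - the string to search in
# step     - only offsets 0, step, 2*step, ... are tried as starting points
# Returns a 1-based position, or 0 if there is no match (assumed convention).
def pos(needle, haystack, step = 1)
  raise ArgumentError, "step must be positive" unless step > 0

  last_start = haystack.length - needle.length
  0.step(last_start, step) do |offset|
    return offset + 1 if haystack[offset, needle.length] == needle
  end
  0
end

b = "1234"
c = "234123411234"

puts c.index(b)    # 3 (zero-based offset of the first match)
puts pos(b, c)     # 4 (the same match, counted from 1)
puts pos(b, c, 4)  # 9 (only starting positions 1, 5 and 9 are tried)
```

The stepped result agrees with the 9 in the example; whether the unstepped answer is 3 or 4 depends on counting positions from 0 or from 1.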
 
Peña, Botp

From: Michael W. Ryder
# If I were getting back a character or string I would not have
# a problem. The problem is that every other use of [] in the string
# class returns a string or nil, this one returns an integer. I was
# curious if this was necessary for some other part of the language
# or just an "accident".

evolution perhaps? ruby caters low/old to high-level/newer problem
domains, from specific to general, so...

http://redhanded.hobix.com/inspect/futurismUnicodeInRuby.html
 
Gary Wright

If I were getting back a character or string I would not have a
problem. The problem is that every other use of [] in the string
class returns a string or nil, this one returns an integer. I was
curious if this was necessary for some other part of the language
or just an "accident".

A string serves a dual purpose in Ruby, it is a container for textual
data and it is also a container for binary data (an array of bytes).

When dealing with binary data, the ability to extract a single byte
at an offset is a common operation and so

buffer[offset] # returns fixnum

is a very handy syntax. As you've noted, that isn't as useful when
you consider a string to be text, and I believe the direction in Ruby
1.9 is to make the syntax lean towards the string-as-text
interpretation.

In Ruby 1.9, s[offset] returns the one-character string starting at
offset, where offset is interpreted within the context of the string
encoding. That is to say, offset is not a *byte* offset in this case,
since the goal is to support a variety of encodings.

Gary Wright
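
To illustrate the difference between byte offsets and character offsets, here is a short sketch assuming Ruby 1.9 or later and a UTF-8 string. The sample text and variable names are made up for the example.

```ruby
s = "h\u00e9llo"           # "héllo"; the é occupies two bytes in UTF-8

char_at_1  = s[1]          # "é": the index counts characters
byte_at_1  = s.getbyte(1)  # 195 (0xC3), the first byte of é
char_count = s.length      # 5
byte_count = s.bytesize    # 6

# The same data viewed as raw bytes: indexing now works per byte
binary   = s.dup.force_encoding("BINARY")
raw_at_1 = binary[1]       # a one-byte string holding 0xC3

puts char_at_1
```

getbyte covers the "buffer[offset] returns a number" use case for binary data, while [] follows the string's encoding.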
 
