String#capitalize more complex

  • Thread starter Iñaki Baz Castillo

Iñaki Baz Castillo

Hi, I receive headers (the protocol syntax allows them to be lowercase,
uppercase or a mix). For example, the following header names are
equivalent:

a) Record-Route
b) record-route
c) RECORD-ROUTE
d) Record-ROUTE

I'm trying to write a method to convert all of them to:

Record-Route


Basically what I need is to capitalize all the header name parts after
splitting it using "-" as separator. I can do it as follows:

------------
hname = "Record-ROUTE"
hname.split("-").map {|w| w.capitalize }.join("-")   => "Record-Route"
------------

Benchmark.realtime { hname.split("-").map {|w| w.capitalize }.join("-") }
1.69277191162109e-05


Is there a faster option? I think it would be faster to operate
directly on the original string instead of calling "split" and
"join" (since those methods create more strings and so allocate
memory for them). I would prefer to do the modification "in
place".

BTW, how can I convert the char "a" into "A"? I expected
something like in C:

'a' + 1 => 'b'
'a' + ¿X? => 'A'

but in Ruby's String class I can only find String#next, which just
increments the char by one ("a".next => "b").

Any suggestion? Thanks a lot.



-- 
Iñaki Baz Castillo
<[email protected]>
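
On the side question of turning "a" into "A": a minimal sketch, assuming Ruby 1.9 or later (where String#ord and Integer#chr are available) and plain ASCII letters. The variable names are only illustrative.

```ruby
lower = "a"

# The usual way: let String handle the case conversion.
upper = lower.upcase

# The C-style way: go through the character code.
# 'a' is 97 and 'A' is 65, so the offset between the two cases is 32.
upper_by_offset = (lower.ord - 32).chr

puts upper            # prints "A"
puts upper_by_offset  # prints "A"
```

The offset arithmetic is only valid for ASCII letters; String#upcase is the safer choice for anything else.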
 

Andrew Timberlake


Have a look at Net::HTTPHeader and its code, as all of this is handled in
there:
http://www.ruby-doc.org/core/classes/Net/HTTPHeader.html

Andrew Timberlake
http://ramblingsonrails.com
http://www.linkedin.com/in/andrewtimberlake

"I have never let my schooling interfere with my education" - Mark Twain
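
To illustrate the Net::HTTPHeader suggestion, here is a small sketch using a Net::HTTP request object, which mixes that module in. The header value is made up for the example; the request also carries a few default headers, so only the presence of the normalized name is checked.

```ruby
require "net/http"

# Net::HTTP request objects include Net::HTTPHeader, which stores header
# names downcased and can yield them back with each "-" part capitalized.
req = Net::HTTP::Get.new("/")
req["record-ROUTE"] = "<sip:proxy.example.org>"   # made-up example value

names = []
req.each_capitalized_name { |name| names << name }

puts names.include?("Record-Route")   # the mixed-case key comes back normalized
```

Lookups are case-insensitive as well, so `req["RECORD-ROUTE"]` returns the same value.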
 

Iñaki Baz Castillo

2009/4/2 Iñaki Baz Castillo said:
Is there any option even faster? I think it would be faster if I
operate directly in the original string instead of doing "split" and
"join" (since these last methods create more strings so allocate
memory for them and so). I would prefer to do the modification "in
place".

BTW, could I know how to convert "a" char into "A"? I expected
something as in c:

 'a' + 1 => 'b'
 'a' + ¿X? => 'A'


I've done it in two ways:


-----------------------------
require "benchmark"


hname1 = "record-route"
hname2 = "record-route"

rt1 = 0.0
rt2 = 0.0


### Method 1: split, capitalize each part, join

1.upto(10) do
  rt1 += Benchmark.realtime { hname1 = hname1.split("-").map {|w| w.capitalize }.join("-") }
end
puts "hname1 = #{hname1} (time = #{rt1/10})"


### Method 2: capitalize in place, then upcase the byte after each "-"

1.upto(10) do
  i = 0
  rt2 += Benchmark.realtime do
    # capitalize! returns nil when nothing changes, so it is not chained
    hname2.capitalize!
    hname2.each_byte do |b|
      # 45 is "-"; capitalize! left the following letter lowercase,
      # so subtracting 32 upcases it
      if b == 45 && i < hname2.bytesize - 1
        hname2[i+1] = [hname2[i+1].unpack("c")[0] - 32].pack("c")
      end
      i += 1
    end
  end
end
puts "hname2 = #{hname2} (time = #{rt2/10})"
------------------------------

Results:

hname1 = Record-Route (time = 8.24928283691406e-06)
hname2 = Record-Route (time = 1.05381011962891e-05)


That is: even though method 1 uses split and join (so it generates new
strings and stores them in memory), it's faster than method 2.
Unfortunately method 2 also generates strings. I really wonder if
there is a nicer and faster way to do the same thing.

Thanks.





-- 
Iñaki Baz Castillo
<[email protected]>
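
One more variant that could be measured against the two methods above is an in-place `downcase!` followed by `gsub!`. This is only a sketch: `capitalize_header!` is a hypothetical helper name, it assumes ASCII header names, and the block form of `gsub!` still allocates a short string per match, so it is not guaranteed to be faster. The timing loop uses many more iterations than ten, since single-digit runs in the microsecond range are dominated by noise.

```ruby
require "benchmark"

# Hypothetical helper: normalizes a header name in place and returns it.
def capitalize_header!(hname)
  hname.downcase!
  # Upcase the first letter of the string and the letter after each "-".
  hname.gsub!(/(?:\A|-)[a-z]/) { |m| m.upcase }
  hname   # gsub! returns nil when nothing matched, so return the string itself
end

h = "rECORD-rOUTE".dup
capitalize_header!(h)
puts h   # prints "Record-Route"

n = 100_000
t_split = Benchmark.realtime do
  n.times { "record-ROUTE".split("-").map { |w| w.capitalize }.join("-") }
end
t_gsub = Benchmark.realtime do
  n.times { capitalize_header!("record-ROUTE".dup) }
end
puts "split/join: #{t_split / n}  gsub!: #{t_gsub / n}"
```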
 

trans


I would look at the camelcase implementations of Facets, English or
ActiveSupport. They are pretty close to what you want. You may even be
able to use one of them outright.

T.
 

Iñaki Baz Castillo

On Thursday, 2 April 2009, trans wrote:
I would look at the camelcase implementations of Facets, English or
ActiveSupport. They are pretty close to what you want. You may even be
able to use one of them outright.

Thanks, but in the end I've decided to write a Ruby C extension for that. The
code is very simple and fast:

---------------------

/*
 * Capitalize each word of a header name. Examples:
 * - "record-route" => "Record-Route"
 * - "from" => "From"
 * - "VIA" => "Via"
 *
 */

char *string;
int i;

string = XXXXXX

if ( string[0] >= 'a' && string[0] <= 'z' )  // 'a'=97, 'z'=122
  string[0] -= 32;

for (i=1; i<strlen(string); i++) {
  if ( string[i-1] == '-' ) {
    /* first letter of a word: upcase it if needed, never lowercase it */
    if ( string[i] >= 'a' && string[i] <= 'z' )
      string[i] -= 32;
  }
  else if ( string[i] >= 'A' && string[i] <= 'Z' )
    string[i] += 32;
}


-- 
Iñaki Baz Castillo <[email protected]>
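
For comparison, the same byte-level idea can be tried in plain Ruby with String#getbyte and String#setbyte (Ruby 1.9 or later), which modify the string in place without creating intermediate strings. This is a sketch rather than a port of the extension: `capitalize_header_bytes!` is a hypothetical name, and it assumes ASCII-only header names.

```ruby
# Hypothetical helper: capitalizes each "-" separated part in place.
def capitalize_header_bytes!(str)
  start_of_word = true
  str.bytesize.times do |i|
    b = str.getbyte(i)
    if start_of_word
      # 'a'..'z' is 97..122; subtracting 32 gives the uppercase letter
      str.setbyte(i, b - 32) if b >= 97 && b <= 122
    elsif b >= 65 && b <= 90
      # 'A'..'Z' is 65..90; adding 32 gives the lowercase letter
      str.setbyte(i, b + 32)
    end
    start_of_word = (b == 45)   # 45 is "-"
  end
  str
end

puts capitalize_header_bytes!("RECORD-route".dup)   # prints "Record-Route"
```

Whether this beats split/join depends on the Ruby version, since the loop itself runs in interpreted code; it would need to be benchmarked.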
 
