Capitalization

Jason Vogel

Disclaimer: I'm a Ruby Nuby and I basically don't know RegEx at all. I know
RegEx is the answer, I just don't know where to start.

Current Source:
str.split(' ').each {|w| w.capitalize!}.join(' ')

Text:
ADDITIONAL SPA (ONLY AVAILABLE W/PURCHASE OF POOL OR SPA)
SELLER HEAT/AC/DUCTWORK

Result:
Additional Spa (only Available W/purchase Of Pool Or Spa)
Seller Heat/ac/ductwork

Desired:
Additional Spa (Only Available w/Purchase of Pool or Spa)
Seller Heat/AC/Ductwork

Issues:
- Need to capitalize after a "/"
- Need specific word case handling (e.g. "Ac" => "AC", "Or" => "or",
"W/[a]" => "w/[A]")

Thanks,
Jason
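
For anyone following along, here is a minimal sketch of why the one-liner gives that result. String#capitalize upcases only the first character of the string and downcases everything after it, and split(' ') breaks only on whitespace, so "(ONLY" and "HEAT/AC/DUCTWORK" are each treated as a single word. The variable names are illustrative only.

```ruby
# split(' ') only sees whitespace, so the slashes stay inside one "word".
line = "SELLER HEAT/AC/DUCTWORK"
words = line.split(' ')            # ["SELLER", "HEAT/AC/DUCTWORK"]

# capitalize upcases the first character and downcases the rest.
naive = words.map { |w| w.capitalize }.join(' ')
puts naive                         # Seller Heat/ac/ductwork

# A leading "(" counts as the first character, so the letter after it
# ends up lowercase.
paren = "(ONLY".capitalize
puts paren                         # (only
```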
 
Daniel Finnie

Try this:
str.gsub(/[A-Za-z]+/) {|x| x.capitalize}

If you want the W of W/ uncapitalized:
str.downcase.gsub(/[A-Za-z]+(?!\/)/) {|x| x.capitalize}
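
A quick way to compare the two suggestions is to run both on the sample text from the original post. This is only a sketch, and `sample`, `every_word` and `skip_before_slash` are illustrative names.

```ruby
sample = "ADDITIONAL SPA (ONLY AVAILABLE W/PURCHASE OF POOL OR SPA)\n" \
         "SELLER HEAT/AC/DUCTWORK"

# Every run of letters is capitalized on its own, so text that follows
# "(" or "/" is capitalized too.
every_word = sample.gsub(/[A-Za-z]+/) { |x| x.capitalize }
puts every_word

# The negative lookahead (?!\/) rejects a match that is directly followed
# by "/", which leaves the lone "w" in "w/" lowercase.
skip_before_slash = sample.downcase.gsub(/[A-Za-z]+(?!\/)/) { |x| x.capitalize }
puts skip_before_slash
```

Both versions still produce "Of", "Or" and "Ac" rather than the desired "of", "or" and "AC".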

Paul said:
How many special cases? In the worst case, you would have to use a
dictionary to avoid treating acronyms as a word. You already have two
rather difficult rules, one having to do with acronyms, another having to
do with special treatment of the sequence "w/".

What I am saying is this is likely to be more difficult than it seems,
especially because we only have one example of what might end up being
thousands of examples of free-form text.
 
Daniel Finnie

Oops, forgot to paste this one in:
To keep words like "of" and "is" lowercase (basically anything
under 3 letters):
text.downcase.gsub(/[A-Za-z]{3,}(?!\/)/) {|x| x.capitalize}


Jacob Fugal

I agree with Paul Lutus, there are too many special cases. And
Daniel's regex here is a good example. I can spot at least three (to
me) obvious errors:

1) Anything with a '/' trailing will not get capitalized, so in the
OP's example, neither "heat" nor "ac" would be capitalized at all.

2) There are plenty of words with fewer than three letters that should
be capitalized. The first person pronoun "I", for instance. Or even
"of" or "is", if they're the first word in the sentence.

3) In the absence of 1 and 2, "ac" would still get turned into "Ac"
rather than "AC".

Jacob Fugal
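
Points 2 and 3 are easy to reproduce with Daniel's three-letter expression. The sketch below wraps it in a hypothetical helper, `three_letter_cap`, and the input strings are made up for illustration (they are not from the original post).

```ruby
# Daniel's expression, wrapped in a helper for the demo.
def three_letter_cap(text)
  text.downcase.gsub(/[A-Za-z]{3,}(?!\/)/) { |x| x.capitalize }
end

# Point 2: short words stay lowercase even where they should not.
puts three_letter_cap("OF MICE AND MEN")    # of Mice And Men
puts three_letter_cap("WHEN I GO")          # When i go

# Point 3: a two-letter acronym is never touched, so it stays "ac".
puts three_letter_cap("SELLER AC UNIT")     # Seller ac Unit
```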
 
Daniel Finnie

Jacob said:
1) Anything with a '/' trailing will not get capitalized, so in the
OP's example, neither "heat" nor "ac" would be capitalized at all.

Trailing /'s do work as long as the word before it is at least 3 letters
long.
irb(main):004:0> src.downcase.gsub(/[A-Za-z]{3,}(?!\/)/) {|x| x.capitalize}
=> "Additional Spa (Only Available w/Purchase of Pool or Spa) Seller Heat/ac/Ductwork "

Jacob said:
2) There are plenty of words with fewer than three letters that should
be capitalized. The first person pronoun "I", for instance. Or even
"of" or "is", if they're the first word in the sentence.

3) In the absence of 1 and 2, "ac" would still get turned into "Ac"
rather than "AC".

These are valid points that I feel shouldn't be incorporated into the
original regexp.
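
It may not be obvious why "heat/" still comes out as "Heat" when the lookahead forbids a following slash. When the lookahead fails after "heat", the regex engine backtracks and matches just "hea", which is followed by "t" rather than "/", and capitalizing "hea" gives the same visible result. The sketch below (inputs other than "heat/ac" are made up) suggests that a word of exactly three letters before a slash is left lowercase, because there is no letter to give back.

```ruby
re = /[A-Za-z]{3,}(?!\/)/

# The match stops one letter short of the slash.
first_match = "heat/ac"[re]
p first_match                                     # "hea"

# With only three letters nothing can be given back, so "spa" never matches.
three = "spa/pool".gsub(re) { |x| x.capitalize }
p three                                           # "spa/Pool"

# Four or more letters before the slash work as shown in the irb session.
four = "heat/pool".gsub(re) { |x| x.capitalize }
p four                                            # "Heat/Pool"
```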
 
William James

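# Map each lowercased special word to the spelling it should end up with.
specials = %w( of or w AC ).
  inject({}){|h,s| h.update({s.downcase => s}) }

# Split on runs of non-letters (the parentheses keep them in the result),
# then use the special spelling if there is one, else capitalize.
puts DATA.read.downcase.split( /([^a-z]+)/ ).map{|s|
  specials[s] or s.capitalize }.join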

__END__
ADDITIONAL SPA (ONLY AVAILABLE W/PURCHASE OF POOL OR SPA)
SELLER HEAT/AC/DUCTWORK


--- output -----
Additional Spa (Only Available w/Purchase of Pool or Spa)
Seller Heat/AC/Ductwork
 
Jason Vogel

William,

This is exactly what I'm looking for. I don't understand it, but it's
what I'm looking for.

Would you mind explaining what your code does?

Thanks,
Jason



William James

It helps to inspect the data structures.

Try:

specials = %w( of or w AC ).
  inject({}){|h,s| h.update({s.downcase => s}) }

p specials

text = DATA.read.downcase
p text.split( /([^a-z]+)/ )
puts text.split( /([^a-z]+)/ ).map{|s|
  specials[s] or s.capitalize }.join

__END__
ADDITIONAL SPA (ONLY AVAILABLE W/PURCHASE OF POOL OR SPA)
SELLER HEAT/AC/DUCTWORK
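
For reference, the same idea can be written out step by step. This is a sketch rather than the exact code posted above: the method name `smart_capitalize` and the constant `SPECIALS` are made up here, and the text is passed in as a string instead of being read from DATA.

```ruby
# Lowercased word => the exact spelling it should end up with.
SPECIALS = %w[of or w AC].map { |s| [s.downcase, s] }.to_h
# {"of"=>"of", "or"=>"or", "w"=>"w", "ac"=>"AC"}

def smart_capitalize(text)
  # The parentheses in the pattern make split keep the separators
  # (spaces, "/", "(" and so on) as their own array elements.
  pieces = text.downcase.split(/([^a-z]+)/)
  # Use the special spelling when there is one, otherwise capitalize.
  pieces.map { |s| SPECIALS[s] || s.capitalize }.join
end

pieces_demo = "heat/ac/ductwork".split(/([^a-z]+)/)
p pieces_demo
# ["heat", "/", "ac", "/", "ductwork"]

puts smart_capitalize("ADDITIONAL SPA (ONLY AVAILABLE W/PURCHASE OF POOL OR SPA)")
puts smart_capitalize("SELLER HEAT/AC/DUCTWORK")
```

As raised earlier in the thread, a word list like this still lowercases "of" at the start of a line, and it only knows the acronyms it has been given.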
 
Jason Vogel

Paul and William,

Thank you both for taking the time to respond and explain. I really
appreciate it.

Thanks,
Jason
 
