Quick and dirty word wrapping.


Erik Terpstra

In case anyone needs it,

str = 'This is a test of the emergency broadcasting services'
str.scan(/(.{1,30})(?:\s+|$)/)

=> [["This is a test of the"], ["emergency broadcasting"], ["services"]]
 

James Edward Gray II

In case anyone needs it,

str = 'This is a test of the emergency broadcasting services'
str.scan(/(.{1,30})(?:\s+|$)/)

=> [["This is a test of the"], ["emergency broadcasting"],
["services"]]

Dang that's cool!

I'm still puzzling out how that works...

James Edward Gray II
 

Gavin Kistner

str = 'This is a test of the emergency broadcasting services'
str.scan(/(.{1,30})(?:\s+|$)/)

=> [["This is a test of the"], ["emergency broadcasting"],
["services"]]

Nice. Not to golf, but how about simply:

str = 'This is a test of the emergency broadcasting services'
p str.scan(/.{1,30}\b/)
#=> ["This is a test of the ", "emergency broadcasting ", "services"]

(No flattening required.)
 

Gavin Kistner

Oops, because mine will split punctuation from its characters.
However, both of ours will lose lines that are \S{31,}

So:

str = '123456789012345678901234567890This is a test of the emergency broadcasting system. This is only a test.'

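class String
  # Wrap to at most col_width columns, breaking over-long words first.
  def wrap_to( col_width )
    # The look-ahead leaves the next character unconsumed, so a run longer
    # than twice col_width is still split at every col_width characters.
    str = self.gsub( /(\S{#{col_width}})(?=\S)/, '\1 ' )
    str.scan(/(.{1,#{col_width}})(?:\s+|$)/).flatten.join( "\n" )
  end
end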

puts str.wrap_to( 30 )
123456789012345678901234567890
This is a test of the
emergency broadcasting system.
This is only a test.


puts str.wrap_to( 29 )
12345678901234567890123456789
0This is a test of the
emergency broadcasting
system. This is only a test.
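
For anyone who wants to see the two problems mentioned above for themselves, here is a small sketch. The lambda names wrap_group and wrap_boundary and the sample strings are made up for illustration; they are not from the posts in this thread.

```ruby
# The two one-liners from this thread, parameterised by width.
wrap_group    = ->(s, n) { s.scan(/(.{1,#{n}})(?:\s+|$)/).flatten }
wrap_boundary = ->(s, n) { s.scan(/.{1,#{n}}\b/) }

# 1. The \b version treats punctuation as a boundary, so trailing
#    punctuation can be separated from its word (here the "!" is dropped).
text = 'Hello, world. Goodbye!'
p wrap_group.call(text, 10)     #=> ["Hello,", "world.", "Goodbye!"]
p wrap_boundary.call(text, 10)  #=> ["Hello, ", "world. ", "Goodbye"]

# 2. Both versions silently drop the start of a word longer than the width.
long = ('x' * 31) + ' end'
p wrap_group.call(long, 30).map(&:length)     #=> [30, 3]
p wrap_boundary.call(long, 30).map(&:length)  #=> [30, 4]
```

In the second case only 30 of the 31 x characters survive, which is the loss that the gsub step in wrap_to is meant to prevent.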
 

Gavin Kistner

I'm still puzzling out how that works...

Up to 30 characters, but there has to be whitespace after it (to keep
it from splitting in the middle of the word) or be the very end of
the string. The greedy regexp will grab all 30 if it can find them
with whitespace after, otherwise it will backtrack until it finds the
right spot.

Very nice, Erik. I like how it also strips the whitespace that will
be wrapped.
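
A minimal sketch of that backtracking, looking at the first match only through MatchData. The variable names are illustrative and not from the original post.

```ruby
str     = 'This is a test of the emergency broadcasting services'
pattern = /(.{1,30})(?:\s+|$)/

# A blind 30-character cut would land in the middle of "emergency".
naive_cut = str[0, 30]

# The regexp backs off to the last position that is followed by whitespace.
m         = pattern.match(str)
captured  = m[1]          # what scan returns for this match
consumed  = m[0]          # the capture plus the whitespace eaten by (?:\s+|$)
remainder = m.post_match  # where the next scan match will start

p naive_cut  #=> "This is a test of the emergenc"
p captured   #=> "This is a test of the"
p consumed   #=> "This is a test of the "
p remainder  #=> "emergency broadcasting services"
```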


 

James Edward Gray II

Up to 30 characters, but there has to be whitespace after it (to
keep it from splitting in the middle of the word) or be the very
end of the string. The greedy regexp will grab all 30 if it can
find them with whitespace after, otherwise it will backtrack until
it finds the right spot.

I do understand the Regexp, but isn't that a look-ahead assertion at
the end? That's not supposed to consume characters, right? So why
doesn't the very next match start with the leading whitespace that
ended the last match?

I know I just haven't got me head all the way around it yet. I'm
working on it... ;)

James Edward Gray II
 

James Edward Gray II

I do understand the Regexp, but isn't that a look-ahead assertion
at the end?

Answering my own dumb question, "No James, that's simple clustering
not a look-ahead. Get your Regexp symbology right man!" Clustering
does consume characters of course, so it now all makes sense to me.

I guess it was just too early in the morning for me... ;)

James Edward Gray II
 

email55555

James said:
I do understand the Regexp, but isn't that a look-ahead assertion at
the end? That's not supposed to consume characters, right? So why
doesn't the very next match start with the leading whitespace that
ended the last match?

I know I just haven't got me head all the way around it yet. I'm
working on it... ;)

James Edward Gray II

No, the group at the end is not a look-ahead assertion.
(?: ... ) still consumes characters; it just does not capture them.

When the pattern contains a capturing group, String#scan returns only
the captured groups and ignores everything else in the match. For example:
'abcdef'.scan(/(.)./) # ==> [['a'], ['c'], ['e']]

So in str.scan(/(.{1,30})(?:\s+|$)/), the (?:\s+|$) part consumes the
whitespace characters, but they are not part of the scan result.
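
To make the difference concrete, here is a short sketch comparing the non-capturing group with an actual look-ahead, (?=...), on the same string. The look-ahead variant is only there for contrast; nobody in the thread proposed it.

```ruby
str = 'This is a test of the emergency broadcasting services'

# Non-capturing group: the whitespace is consumed, so the next match
# starts at the next word.
grouped = str.scan(/(.{1,30})(?:\s+|$)/).flatten

# Look-ahead: the whitespace is only checked, not consumed, so each
# following match begins with the space that ended the previous one.
lookahead = str.scan(/(.{1,30})(?=\s+|$)/).flatten

p grouped    #=> ["This is a test of the", "emergency broadcasting", "services"]
p lookahead  #=> ["This is a test of the", " emergency broadcasting", " services"]
```

The leading spaces in the second result are exactly what James expected to see if the trailing group had been a look-ahead.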
 

Ezra Zygmuntowicz

I do understand the Regexp, but isn't that a look-ahead assertion
at the end? That's not supposed to consume characters, right? So
why doesn't the very next match start with the leading whitespace
that ended the last match?

I know I just haven't got me head all the way around it yet. I'm
working on it... ;)

James Edward Gray II

I have been using something extremely similar to add some tags to a
text dump of the classified ads for my newspaper. Like this:

line = "<begad:11560454>Clinician PT front office, 50-60 hrs/mo. Send resumes to: www.omacime.com<endad>"

line.gsub!(/(<begad:[^>]+>)(.{1,50}.*?\b)/, "\\1<ftditm>\\2<\/ftditm>")

#=> "<begad:11560454><ftditm>Clinician PT front office, 50-60 hrs/mo. Send resumes</ftditm> to: www.omacime.com<endad>"

This takes a line and wraps the <ftditm></ftditm> tags around the first
50 chars plus whatever is needed to reach the next word boundary (\b),
which is usually the end of the current word.
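
One thing to watch with \b is that it also matches next to punctuation inside a token, such as the dots in a URL. The sketch below compares the pattern above with a variant that extends to the end of the current run of non-whitespace (\S*) instead. The sample line, the ad id, and the \S* variant are assumptions made for illustration, not part of the original script.

```ruby
# Hypothetical sample: 48 filler characters so that the 50-character
# cut lands just inside the URL.
line = "<begad:1>" + ("a" * 48) + " www.example.com more text<endad>"

# Pattern from the post: extend lazily to the next word boundary.
by_boundary = line.sub(/(<begad:[^>]+>)(.{1,50}.*?\b)/, '\1<ftditm>\2</ftditm>')

# Variant: extend to the end of the current non-whitespace run.
by_whitespace = line.sub(/(<begad:[^>]+>)(.{1,50}\S*)/, '\1<ftditm>\2</ftditm>')

# The \b version closes the tag in the middle of the URL ...
p by_boundary.include?(' www</ftditm>.example.com')                  #=> true
# ... while the \S* version keeps the URL inside the tag.
p by_whitespace.include?(' www.example.com</ftditm> more text')      #=> true
```

Which behaviour is preferable depends on the ad text; for plain prose the two patterns usually give the same result.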


Cheers-

-Ezra Zygmuntowicz
Yakima Herald-Republic
WebMaster
509-577-7732
(e-mail address removed)
 
