a helpful idea if you're looking for something to do


Simon Schuster

compile a whole lot of ruby regex examples, with commentary on what's
going on. the few websites I've found, and books I've looked through
just touch on the basics with minimal examples and explanation, or are
specifically for perl/etc. a nice-looking and lengthy site could be
extremely helpful to a lot of people starting with ruby, I imagine.

- dealing with unicode?
- mingling literal " / \ etc, with their regex counterparts, in ways
that would be daunting for the inexperienced
- just generally "higher-level" regex, leave the "intro to regex" to
all the other places. that's easy enough to find.
 

Konrad Meyer

For the most part, the ruby regex engine is perl-like. And instead of having
to escape /s, we get stuff like %r@regex/bar@i. ZenSpider's Ruby QuickRef is a
great place to go for beginner help.

http://www.zenspider.com/Languages/Ruby/QuickRef.html#11
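
To make the %r point concrete, here is a small sketch of one pattern written with three different delimiters. The constant names and the sample path are made up for illustration; only the %r syntax itself comes from the post above.

```ruby
# The same path pattern written three ways.
SLASHED  = /\/usr\/bin\/(\w+)/    # slash-delimited: every / must be escaped
PERCENT  = %r{/usr/bin/(\w+)}     # %r with braces: / is an ordinary character
AT_SIGNS = %r@/usr/bin/(\w+)@i    # other punctuation works too; trailing i = ignore case

path = "/usr/bin/ruby"
[SLASHED, PERCENT, AT_SIGNS].each do |re|
  # Each pattern captures the program name after /usr/bin/
  puts re.match(path)[1]
end
```

Picking a delimiter that does not occur in the pattern is usually the easiest way to avoid a forest of backslashes.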

--
Konrad Meyer http://konrad.sobertillnoon.com/

 

Dan Zwell

Simon said:
compile a whole lot of ruby regex examples, with commentary on what's
going on. the few websites I've found, and books I've looked through
just touch on the basics with minimal examples and explanation, or are
specifically for perl/etc. a nice-looking and lengthy site could be
extremely helpful to a lot of people starting with ruby, I imagine.

- dealing with unicode?
This one bothered me a lot, but the solution is simple. At the beginning
of the script, set
$KCODE = "u"

This makes regular expressions treat strings as UTF-8 text instead of
raw bytes. I assume the default behavior will be improved with Ruby 2.0,
but I'm not using 1.9 so can't say for sure.
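
For comparison, here is a rough sketch of how the same concern looks on Ruby 1.9 and later, where each string carries its own encoding and $KCODE no longer has any effect. It assumes a UTF-8 string; the constant names and the sample word are only for illustration.

```ruby
# Assumes Ruby 1.9 or later; no $KCODE setting is involved.
WORD = "r\u00e9sum\u00e9"   # "resume" with two accented e's, written with escapes

# . matches one character, not one byte, so the accented letters stay whole.
CHARS = WORD.scan(/./)

# \p{L} matches any Unicode letter, accented or not.
ALL_LETTERS = !(WORD =~ /\A\p{L}+\z/).nil?

puts CHARS.length     # number of characters
puts WORD.bytesize    # number of bytes, larger because of the accents
puts ALL_LETTERS
```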
- mingling literal " / \ etc, with their regex counterparts, in ways
that would be daunting for the inexperienced
The first thing to keep in mind is that it never hurts to accidentally
escape a punctuation character in a double quoted (soft quoted) string
or regex. So if you aren't sure, "\"", "\'", "\\" are all okay, as are
/\"/, /\//, and %r|\/| (the latter being an alternative way to specify a
regex). But you only need to escape characters that have special
meaning. So in a slash-delimited regex, a slash has special meaning, but
in a %r regex with a different delimiter, it does not:
%r{/} is the same as /\//, as the former does not need to be escaped.
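
A quick way to convince yourself is to compare the escaped and unescaped forms directly. This is only a sketch; the constant names and the sample text are invented for the example.

```ruby
# Needless escapes on punctuation are harmless: each comparison below is true.
SAME_QUOTE      = ("\"" == '"')
SAME_APOSTROPHE = ("\'" == "'")

SAMPLE = 'say "hi" and/or wave'

# Escaped and unescaped quote find the same position.
SAME_QUOTE_MATCH = (SAMPLE =~ /\"/) == (SAMPLE =~ /"/)

# /\//, %r|\/| and %r{/} all find the same slash.
SAME_SLASH_MATCH = (SAMPLE =~ /\//) == (SAMPLE =~ %r|\/|) &&
                   (SAMPLE =~ /\//) == (SAMPLE =~ %r{/})

# A backslash before a letter is different: it usually has its own meaning.
NEWLINE_IS_N = ("\n" == "n")   # false, "\n" is a newline character

puts SAME_QUOTE, SAME_APOSTROPHE, SAME_QUOTE_MATCH, SAME_SLASH_MATCH, NEWLINE_IS_N
```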

If you use Regexp.new(" ... "), then the regexp comes from a string, and
needs to follow the escaping rules for strings--you need to escape
double quotes.
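
One thing that tends to bite here: in a double quoted string the backslash is consumed by the string first, so regex escapes like \d need a doubled backslash. A small sketch, with made-up constant names and sample text:

```ruby
# Pattern: a run of digits inside double quotes, e.g. "42".
FROM_LITERAL = /"(\d+)"/
FROM_DOUBLE  = Regexp.new("\"(\\d+)\"")   # string rules apply first: \" and \\d
FROM_SINGLE  = Regexp.new('"(\d+)"')      # single quotes pass \d through untouched

TEXT = 'id="42"'
[FROM_LITERAL, FROM_DOUBLE, FROM_SINGLE].each do |re|
  puts re.match(TEXT)[1]
end

# The pitfall: a lone \d in a double quoted string loses its backslash,
# so the regex would look for the letter d instead of a digit.
LOST_BACKSLASH = ("\d+" == "d+")
puts LOST_BACKSLASH
```

Using a single quoted string (or a regex literal) sidesteps most of the double escaping.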

A single quoted string is sometimes called "hard quoted". This means
nothing is expanded / nothing has special meaning, so nothing needs to
be escaped. Backslash is not an escape character here. The exceptions
are a backslash before a single quote, which escapes the quote, and a
backslash before another backslash, which gives one literal backslash.
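
A short sketch of the difference, using string lengths to show what actually ends up in the string. The constant names are just for the example.

```ruby
SINGLE_NEWLINE = 'a\nb'    # backslash and n stay as two separate characters
DOUBLE_NEWLINE = "a\nb"    # \n becomes a single newline character
QUOTE          = 'it\'s'   # \' gives a literal single quote
BACKSLASH      = '\\'      # \\ gives one backslash

puts SINGLE_NEWLINE.length   # 4
puts DOUBLE_NEWLINE.length   # 3
puts QUOTE                   # it's
puts BACKSLASH.length        # 1
```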

Sorry if these rules are confusing. You will get used to them. The way
to learn regular expressions is to use them. You will get comfortable
with them when you need them.
- just generally "higher-level" regex, leave the "intro to regex" to
all the other places. that's easy enough to find.

Here's one of mine:
/<a[^>]+?href=['"]?(.+?)['"\s>][^>]*>/im
This matches a link. Throughout the regex I use [^>] frequently, which
means "any character that does not end the tag". Think of [^>]* as a
better .*
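
To see why [^>]* behaves better than .* inside a tag, compare the two on a line with more than one tag. This is a sketch with an invented sample string and constant names:

```ruby
HTML = "<a href='x'>one</a> <b>two</b>"

# .* is greedy and happily runs past the end of the tag to the last > it can find.
GREEDY_DOT = HTML[/<a.*>/]

# [^>]* cannot cross a >, so it stops at the end of the opening tag.
NEGATED = HTML[/<a[^>]*>/]

puts GREEDY_DOT   # the whole line
puts NEGATED      # just the opening <a ...> tag
```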
Interesting bits:
-using +? says that the match is non-greedy. It will match as little as
possible. *? does the same thing, but I find less use for it, as it
usually matches an empty string.
-the /i and /m at the end mean "case insensitive" and "multi-line" (in
Ruby, /m lets . match newlines). You can mix and match from /i, /m, /x
(extended--ignores whitespace in the regex).
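
Here is a sketch of the link regex in use, plus the same pattern spelled out with /x so the pieces can be commented. The sample HTML, the constant names, and the /x layout are mine; the pattern itself is the one above.

```ruby
LINK = /<a[^>]+?href=['"]?(.+?)['"\s>][^>]*>/im

# Same pattern with /x: whitespace and # comments in the regex are ignored.
LINK_X = /
  <a [^>]+?      # tag name, then attributes up to href (lazy)
  href= ['"]?    # href= and an optional opening quote
  (.+?)          # capture the URL, as little as possible
  ['"\s>]        # closing quote, whitespace, or end of tag
  [^>]* >        # the rest of the tag
/imx

# Two links: one plain, one upper case and split across lines.
HTML_PAGE = %Q{<p><a href="http://example.com/">one</a> and <A\n  HREF='page.html' class="x">two</A></p>}

# scan returns one array of captures per match; flatten gives a list of URLs.
LINKS   = HTML_PAGE.scan(LINK).flatten
LINKS_X = HTML_PAGE.scan(LINK_X).flatten
puts LINKS

# Greedy versus non-greedy on a tiny input.
GREEDY = "<b>x</b>"[/<.+>/]    # takes as much as possible
LAZY   = "<b>x</b>"[/<.+?>/]   # takes as little as possible

# /m in Ruby lets . match a newline.
DOT_WITHOUT_M = ("a\nb" =~ /a.b/)    # nil
DOT_WITH_M    = ("a\nb" =~ /a.b/m)   # 0
```

For anything beyond quick scraping an HTML parser is the safer tool, but this shows what each piece of the regex is doing.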

I don't know what your level is, so this may be a bit too cryptic, but
you can probably puzzle it out if you are complaining about regex
tutorials being too basic.

Dan
 
