Question about text parsing

Stephen Beard

Hello,

I am new to Ruby and have a question about parsing text.

I am trying to parse out the first IP address from a string. I have a
working solution, but it seems like a rather roundabout way to do it.
This is my current method:

str = " inet addr:192.168.1.118 Bcast:192.168.1.255 Mask:255.255.255.0"
ipAddr = ((str.split(':'))[1].split)[0]

Is there a better way to do this? Thanks in advance.
 
spox

regexp:
str.scan(/addr:([^\s]+)/)[0][0]

string math:
str.slice(str.index(':')+1, str.index(' ', str.index(':'))-str.index(':')-1)
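
For comparison, the same idea can be sketched with String#partition, which avoids the index arithmetic altogether. The helper name first_field_after_colon is made up for illustration, and the sketch assumes the address is the first whitespace-delimited token after the first colon:

```ruby
# Hypothetical helper (not from the thread): returns the first
# whitespace-delimited token after the first colon, or nil if none.
def first_field_after_colon(line)
  # partition returns [before, separator, after]; when there is no
  # colon, "after" is an empty string, so split returns [] and
  # first returns nil.
  line.partition(':').last.split.first
end

str = " inet addr:192.168.1.118 Bcast:192.168.1.255 Mask:255.255.255.0"
puts first_field_after_colon(str)
```

Because it never computes offsets, this variant should not depend on how many spaces separate the fields.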
 
Rob Biedenharn

irb> str = " inet addr:192.168.1.118 Bcast:192.168.1.255 Mask:255.255.255.0"
=> " inet addr:192.168.1.118 Bcast:192.168.1.255 Mask:255.255.255.0"
irb> re = /\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b/
=> /\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b/
irb> str.scan(re)
=> ["192.168.1.118", "192.168.1.255", "255.255.255.0"]

So the first one is just:
str.scan(re).first
or
str.scan(re)[0]

If you want to be tighter about matching valid IP addresses, look at
the alternate regexps about halfway down this page:
http://www.regular-expressions.info/examples.html

or the grand-daddy one with explanation of all the details:
http://www.regular-expressions.info/regexbuddy/ipaccuratecapture.html
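
As a rough alternative to a stricter regexp, one could keep the loose pattern and check the octet ranges in plain Ruby afterwards. This is only a sketch: the constant LOOSE_IPV4, the helper valid_ipv4_candidates, and the sample string are invented for illustration.

```ruby
# Same loose pattern as `re` earlier in this post: four dot-separated
# groups of one to three digits.
LOOSE_IPV4 = /\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b/

# Hypothetical helper: keep only the candidates whose octets all
# fall in the range 0..255.
def valid_ipv4_candidates(text)
  text.scan(LOOSE_IPV4).select do |candidate|
    candidate.split('.').all? { |octet| octet.to_i.between?(0, 255) }
  end
end

# Made-up sample containing one out-of-range candidate (300.1.2.3).
sample = "addr:192.168.1.118 bogus:300.1.2.3 Mask:255.255.255.0"
p valid_ipv4_candidates(sample)
```

The trade-off is a second pass over the matches in exchange for a regexp that stays readable.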

If the regexp contains capturing groups, String#scan returns the
captured groups for each match instead of the whole match:

irb> re1 = /\b(?:\d{1,3}\.){3}\d{1,3}\b/
=> /\b(?:\d{1,3}\.){3}\d{1,3}\b/
irb> re2 = /\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b/x
=> /\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b/x
irb> re3 = /\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b/x
=> /\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b/x
irb> [re, re1, re2, re3].each do |r|
?> puts '-'*30
irb> puts r
irb> p str.scan(r)
irb> end; nil
------------------------------
(?-mix:\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b)
["192.168.1.118", "192.168.1.255", "255.255.255.0"]
------------------------------
(?-mix:\b(?:\d{1,3}\.){3}\d{1,3}\b)
["192.168.1.118", "192.168.1.255", "255.255.255.0"]
------------------------------
(?x-mi:\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b)
[["192", "168", "1", "118"], ["192", "168", "1", "255"], ["255", "255", "255", "0"]]
------------------------------
(?x-mi:\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b)
["192.168.1.118", "192.168.1.255", "255.255.255.0"]
=> nil
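
The difference between the capturing and non-capturing forms can be boiled down to a smaller sketch. The simplified patterns below are assumptions for illustration, not the exact regexps from the session above:

```ruby
str = " inet addr:192.168.1.118 Bcast:192.168.1.255 Mask:255.255.255.0"

# With capturing groups, scan yields one array of captures per match.
with_captures = str.scan(/(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})/)
first_octets  = with_captures.first      # the four octets as strings
first_ip      = first_octets.join('.')   # rebuild the dotted form

# With non-capturing groups (?:...), scan yields the whole match.
whole_matches = str.scan(/(?:\d{1,3}\.){3}\d{1,3}/)

p first_octets
puts first_ip
puts whole_matches.first
```

So if only the full address is wanted, the (?:...) form saves the join step.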

-Rob

Rob Biedenharn http://agileconsultingllc.com
 
Stephen Beard

Thanks for the quick and thorough replies. They all look great.

I am so terrible with regular expressions. Those links look like a good
place to go to improve, thanks Rob.
 
7stud --

Always look to a string method first. split() rules the world, and
you've made good use of it.
 
David A. Black

Hi --

String method isn't the opposite of regular expression, though. It's
important to understand regexes to use split (as well as scan and
(g)sub) effectively.
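
To illustrate the point, here is a minimal sketch of split driven by a regexp, using the string from the original question. The separator pattern is an assumption about what counts as a field boundary in this line:

```ruby
str = " inet addr:192.168.1.118 Bcast:192.168.1.255 Mask:255.255.255.0"

# Split on runs of colons and/or whitespace. strip comes first because
# a regexp separator matching at the very start of the string would
# otherwise produce a leading empty field.
fields = str.strip.split(/[:\s]+/)
p fields

# Labels and values now alternate after the leading "inet" token.
ip = fields[2]
puts ip
```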


David

--
David A. Black / Ruby Power and Light, LLC
Ruby/Rails consulting & training: http://www.rubypal.com
Now available: The Well-Grounded Rubyist (http://manning.com/black2)
Training! Intro to Ruby, with Black & Kastner, September 14-17
(More info: http://rubyurl.com/vmzN)
 
David A. Black

Hi --

Here's a technique involving subscripting a string with a regular
expression. The number 1 at the end causes the whole thing to return
the contents of the first parenthetical capture (which is all
consecutive non-space characters following the first colon).

str[/:(\S+)/,1]
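
A short sketch of how the subscript form behaves, including the no-match case. The named group ip and the string "no address here" are made up for illustration:

```ruby
str = " inet addr:192.168.1.118 Bcast:192.168.1.255 Mask:255.255.255.0"

# Numbered capture, as in the line above.
ip = str[/:(\S+)/, 1]

# Named capture: the second argument can also be the group's name.
ip_named = str[/addr:(?<ip>\S+)/, "ip"]

# When nothing matches, the subscript form returns nil instead of raising.
missing = "no address here"[/:(\S+)/, 1]

puts ip
puts ip_named
p missing
```

By contrast, chaining [0][0] onto scan raises NoMethodError when there is no match, because scan returns an empty array and the first [0] is nil.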


David

 
Wesley Chen

Perfect.

Thanks.
Wesley Chen.


 
