Scanning a string for decimal numbers

J

Jeppe Jakobsen

Hi all, how do I scan a string without getting my decimal numbers
split into two numbers?

Example:

a = "24,4 + 55,2"
a = a.scan(/\d+/)
puts a

my output for a will be:
24
4
55
2

But I want to keep my decimal numbers intact like this:
24,4
55,2


How do I solve this problem without putting the numbers into separate
strings?

 
E

Ernest Ellingson

Kent said:
"24,4 + 55,2".scan /[\d,]+/

Kent

Try:

a.scan(/\d+,\d+/)

Ernie
 
A

ara.t.howard

careful. you'll kill negatives.

-a
 
J

Josef 'Jupp' SCHUGT

Hi!

a.scan /[-+]?[0-9]*\,?[0-9]+/

Shouldn't that rather be the following?

a.scan /[-+]?([1-9]\d*(\,[0-9]+)?)|(0(\,[0-9]+)?)/

Josef 'Jupp' Schugt
 
J

Jeppe Jakobsen

Will that expression include both integers and decimal numbers?

 
W

Wilson Bilkovich

This should handle periods or commas as the separator.

a = "24,4 + 55,2 + 55 - 44,0"
=> "24,4 + 55,2 + 55 - 44,0"
a.scan /(\d+,?.?\d*)(?=\s|$)/
=> [["24,4"], ["55,2"], ["55"], ["44,0"]]
 
J

Jeppe Jakobsen

Nice, that was the thing I was looking for :)

 
A

Alexis Reigel

This should handle periods or commas as the separator.

a = "24,4 + 55,2 + 55 - 44,0"
=> "24,4 + 55,2 + 55 - 44,0"
a.scan /(\d+,?.?\d*)(?=\s|$)/
=> [["24,4"], ["55,2"], ["55"], ["44,0"]]

Some problems here:
- Signs are disregarded ("-24,4" becomes "24,4").
- Invalid numbers are accepted, e.g. "24,.4", "24,.", "24." and "24,".
- The "." should be escaped. As you used it here, it means "any character"
(except newline), so many invalid numbers are accepted (e.g. "24w").
- If anything other than whitespace follows the number, the number is
either missed or matched incorrectly, e.g. "24.4." yields "4." instead of "24.4".
- ...
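
The points in this list can be checked directly in irb or a script. What follows is only a sketch: the constant name LOOSE and the sample strings are made up for illustration, and the pattern is the one quoted at the top of this post.

```ruby
# The pattern under discussion: "." is unescaped and no sign is matched.
LOOSE = /(\d+,?.?\d*)(?=\s|$)/

# The sign is dropped because nothing in the pattern matches "-".
p "-24,4".scan(LOOSE)   # => [["24,4"]]

# Malformed numbers are accepted as they stand.
p "24,.4".scan(LOOSE)   # => [["24,.4"]]
p "24w".scan(LOOSE)     # => [["24w"]]

# A trailing period defeats the lookahead, so only the tail is matched.
p "24.4.".scan(LOOSE)   # => [["4."]]
```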


Alexis.
 
J

Jeppe Jakobsen

Let me see if I got it right then. I'd like to use periods only for my
decimal numbers. I also need plain integers, so it doesn't matter if "24."
is accepted. Will this fix the problems you presented?

/[-+]?(\d+\.?\d*)(?=\s|$)/


I don't know if it takes care of the last problem, because I didn't
understand it.

 
W

Wilson Bilkovich

Well, that's what I get for dashing off a quick e-mail before dinner.
The last problem Alexis mentioned is caused by the overly-specific
lookahead at the end. Here's a version that fixes that:

irb(main):013:0> a = '24.5 + 24 + 24. + 24.4.'
=> "24.5 + 24 + 24. + 24.4."
irb(main):014:0> a.scan /[-+]?(\d+(?:\.\d+)?)(?=[^\d])/
=> [["24.5"], ["24"], ["24"], ["24.4"]]
irb(main):015:0>

Reading the pattern from left to right:
An optional '-' or '+'.
Followed by at least one digit.
Followed by an optional group containing a period and one or more digits.
The match ends when the next character is something other
than a digit.

The (?:) mess is there so that '24.' doesn't end up with the period on the
end.
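
To see what the optional group and the (?:) each contribute, here is a small comparison. The constant names and the sample string are assumptions chosen for illustration; only the shape of the patterns comes from the thread.

```ruby
SAMPLE = "24. + 24.4"

# Period and digits optional separately: a bare trailing period is kept.
SEPARATE = /(\d+\.?\d*)/
# Period and digits optional together: the period only counts if digits follow.
GROUPED = /(\d+(?:\.\d+)?)/
# Same as GROUPED, but the inner group captures.
CAPTURING = /(\d+(\.\d+)?)/

p SAMPLE.scan(SEPARATE)   # => [["24."], ["24.4"]]
p SAMPLE.scan(GROUPED)    # => [["24"], ["24.4"]]
# Each capturing group adds one element per match; an unmatched group gives nil.
p SAMPLE.scan(CAPTURING)  # => [["24", nil], ["24.4", ".4"]]
```

So it is the grouping of the period with its digits that drops the stray period, while the ?: only keeps scan from returning a second element for every match.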

 
W

Wilson Bilkovich

Jeppe said:
Yes that worked, but I intend to convert the digits of my array to floats,
and I get a NoMethodError on to_f now when I do this:

digits[0] = digits[0].to_f

I don't understand that :-/

scan returns an array of arrays when the pattern contains a capturing
group, so digits[0] is an Array containing '24.4', and Array has no to_f.
You could call:
digits.flatten!
just before using digits[0], and get what you expect.
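
A minimal sketch of the conversion, assuming the pattern from earlier in the thread. The helper name floats_in and the sample string are hypothetical, added only for illustration.

```ruby
a = "24.5 + 24 + 24.4."
digits = a.scan(/[-+]?(\d+(?:\.\d+)?)(?=[^\d])/)
p digits            # => [["24.5"], ["24"], ["24.4"]]
p digits[0].class   # => Array, which is why digits[0].to_f raises NoMethodError

# Hypothetical helper: flatten the nested result, then convert each string.
def floats_in(text)
  text.scan(/[-+]?(\d+(?:\.\d+)?)(?=[^\d])/).flatten.map(&:to_f)
end

p floats_in(a)      # => [24.5, 24.0, 24.4]
```

The non-bang flatten is used here because flatten! returns nil when there was nothing to flatten, which makes it awkward to chain.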


 
J

Jeppe Jakobsen

OK, but wouldn't this regex do the same for me?

/[-+]?\d+\.?\d+/

Except that it will return a plain array containing my numbers?

 
W

Wilson Bilkovich

Yes, as long as every number has at least two digits.
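
The two-digit caveat comes from the pattern requiring \d+ on both sides of the optional period. A short sketch of the difference follows; the constant names and the sample string are assumptions for illustration, and the second pattern is just one possible way to make the fraction optional as a whole.

```ruby
# Needs one digit before and one after the optional period.
TWO_DIGITS_MIN = /[-+]?\d+\.?\d+/
# Makes the whole fractional part optional instead.
OPTIONAL_FRACTION = /[-+]?\d+(?:\.\d+)?/

sample = "5 + 24 + 3.5"

# The single-digit "5" is silently skipped.
p sample.scan(TWO_DIGITS_MIN)     # => ["24", "3.5"]
p sample.scan(OPTIONAL_FRACTION)  # => ["5", "24", "3.5"]
```

Neither pattern has a capturing group, so scan returns a flat array of strings in both cases.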

 
J

Josef 'Jupp' SCHUGT

Hi!

Will that expression include both integers and decimal numbers?

[-+]?([1-9]\d*(\,[0-9]+)?)|(0(\,[0-9]+)?)

has two parts:

[-+]?
([1-9]\d*(\,[0-9]+)?)|(0(\,[0-9]+)?)


The first one is an optional sign. The second one is an alternative
between two cases:

[1-9]\d*(\,[0-9]+)?
0(\,[0-9]+)?

Let's first consider the first case

[1-9]\d*(\,[0-9]+)?

It has two parts, namely

[1-9]\d*
(\,[0-9]+)?

The first part by itself covers all integers larger than zero, written
without leading zeros. With the optional second part, the expression
also covers all decimal numbers greater than or equal to 1.

Now the second case

0(\,[0-9]+)?

This one covers zero and all decimal numbers from 0 up to, but not
including, 1.

The regex I provided intentionally supports neither numbers without a
digit before the comma nor numbers with leading zeros, i.e. none of

[+-],\d+
[+-]0+\d+

You may as well use the shorter version

[-+]?(([1-9]\d*(\,\d+)?)|(0(\,\d+)?))

Wait a moment, I am not sure if that is correct. To be on the safe
side I'd rather use one of these where anything that follows the
optional sign has been put into another pair of parentheses:

[-+]?(([1-9]\d*(\,[0-9]+)?)|(0(\,[0-9]+)?))
[-+]?((([1-9]\d*(\,\d+)?)|(0(\,\d+)?)))

I am one of those guys who sometimes run out of placeholders when doing
search and replace in vim (which has nine of them).
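
The reason for the extra pair of parentheses can be shown in a few lines. Alternation binds more loosely than anything else in a regex, so without the outer group the optional sign belongs to the first branch only. This is a sketch; the constant names and test strings are made up for illustration.

```ruby
# Parsed as: ([-+]? followed by the first branch) OR (the zero branch alone).
SIGN_ON_FIRST_BRANCH = /[-+]?([1-9]\d*(\,[0-9]+)?)|(0(\,[0-9]+)?)/
# The outer parentheses make the sign apply to both branches.
SIGN_ON_BOTH = /[-+]?(([1-9]\d*(\,[0-9]+)?)|(0(\,[0-9]+)?))/

p "-24,4"[SIGN_ON_FIRST_BRANCH]  # => "-24,4"
p "-0,5"[SIGN_ON_FIRST_BRANCH]   # => "0,5"  (sign lost)
p "-0,5"[SIGN_ON_BOTH]           # => "-0,5"
```

String#[] with a regex is used here instead of scan because it returns the whole match, whereas scan on these patterns would return the contents of the capturing groups.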

Josef 'Jupp' Schugt
 
J

Jeppe Jakobsen

Thank you for clearing things up for me, but could you explain what the
last part of the expression Wilson provided me with means?

It's (?=[^\d])


 
D

David Vallner

On Tuesday, 14 February 2006 at 19:07, Jeppe Jakobsen wrote:
Thank you for clearing things up for me, but could you explain what the
last part of the expression Wilson provided me with means?

it's (?=[^\d])

That's a positive zero-width lookahead. I think. Gotta love regexspeak.

In English: look for a single character that's not a decimal digit, and don't
include it in the match.
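
A small sketch of what "don't include it in the match" means in practice. The constant names and the sample string are assumptions for illustration; the second pattern is the same thing without the lookahead, for contrast.

```ruby
# The lookahead checks the next character without consuming it.
WITH_LOOKAHEAD = /\d+(?:\.\d+)?(?=[^\d])/
# A plain character class consumes the character it matches.
CONSUMING = /\d+(?:\.\d+)?[^\d]/

text = "24.4."

m = text.match(WITH_LOOKAHEAD)
p m[0]          # => "24.4"
p m.post_match  # => "."  (the period is still there for whatever comes next)

p text.match(CONSUMING)[0]  # => "24.4."
```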

David Vallner
 
R

Robert Klemme

David said:
On Tuesday, 14 February 2006 at 19:07, Jeppe Jakobsen wrote:
Thank you for clearing things up for me, but could you explain what
the last part of the expression Wilson provided me with means?

it's (?=[^\d])

This is equivalent to (?=\D).

A negative lookahead might work, too: (?!\d)
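
The two lookaheads behave differently at the end of the string: (?=\D) needs an actual non-digit character to follow, while (?!\d) is also satisfied when nothing follows at all. A sketch, with constant names and a sample string made up for illustration:

```ruby
# Positive lookahead: some non-digit character must follow the number.
POSITIVE = /\d+(?:\.\d+)?(?=\D)/
# Negative lookahead: the number must simply not be followed by a digit.
NEGATIVE = /\d+(?:\.\d+)?(?!\d)/

sample = "1 + 24.5"   # the last number sits at the very end of the string

# "24.5" cannot be followed by a non-digit, so the engine backs off to "24".
p sample.scan(POSITIVE)  # => ["1", "24"]
p sample.scan(NEGATIVE)  # => ["1", "24.5"]
```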

David said:
That's a positive zero-width lookahead. I think. Gotta love
regexspeak.

In English: look for a single character that's not a decimal digit,
and don't include it in the match.

I'd go with this quite simple regexp

/[-+]?\d+(?:,\d+)?/

If numbers like "1," should be detected, too, then just change the "+" in
the last group to "*".

Preventing matches on numbers with leading zeros is more complicated,
and it does not seem worth the effort in this case.
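
For completeness, a sketch of how the simple regexp could be used for the original question, including the conversion to floats. The names NUMBER and decimals_in are hypothetical; the tr call is needed because to_f only understands a period as the decimal separator.

```ruby
# Optional sign, digits, and an optional comma-plus-digits fraction.
NUMBER = /[-+]?\d+(?:,\d+)?/

# No capturing groups, so scan returns a flat array of strings.
p "24,4 + 55,2 + 55 - 44,0".scan(NUMBER)
# => ["24,4", "55,2", "55", "44,0"]

# Hypothetical helper: swap the comma for a period before calling to_f.
def decimals_in(text)
  text.scan(NUMBER).map { |s| s.tr(",", ".").to_f }
end

p decimals_in("-24,4 + 1")  # => [-24.4, 1.0]
```

In the first example the signs are separated from the digits by a space, so they are treated as operators and not as part of the numbers.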

Kind regards

robert
 
