translate human-readable time shorthand

  • Thread starter Mathias Kőrber

Mathias Kőrber

I am looking for a module which can help
translate human input for durations such as

3w4d20m10s
into seconds (2161210). Spaces inside the
input should be ignored. If it can accept
other formats, the better.

thanks
 

Ulli Horlacher

Mathias Kőrber said:
I am looking for a module which can help
translate human input for durations such as

3w4d20m10s
into seconds (2161210). Spaces inside the
input should be ignored. If it can accept
other formats, the better.

This is easy:

sub seconds {
local $_ = shift;
my $seconds = 0;

s/\s//g;

$seconds += $1*60*60*24*7 if /(\d+)w/;
$seconds += $1*60*60*24 if /(\d+)d/;
$seconds += $1*60*60 if /(\d+)h/;
$seconds += $1*60 if /(\d+)m/;
$seconds += $1 if /(\d+)s/;

return $seconds;
}
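
The unit products used above can be checked against the figure given in the question. This is only a small sketch of the arithmetic; the names %unit and $total are illustrative and not part of the posted sub.

```perl
use strict;
use warnings;

# Fixed unit lengths in seconds, the same products the sub above uses.
my %unit = (
    w => 7 * 24 * 60 * 60,
    d => 24 * 60 * 60,
    h => 60 * 60,
    m => 60,
    s => 1,
);

# 3w4d20m10s, written out term by term.
my $total = 3 * $unit{w} + 4 * $unit{d} + 20 * $unit{m} + 10 * $unit{s};

print "$total\n";    # 1814400 + 345600 + 1200 + 10 = 2161210
```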

--
Ullrich Horlacher Informationssysteme und Serverbetrieb
Rechenzentrum IZUS/TIK E-Mail: (e-mail address removed)-stuttgart.de
Universitaet Stuttgart Tel: ++49-711-68565868
Allmandring 30a Fax: ++49-711-682357
70550 Stuttgart (Germany) WWW: http://www.tik.uni-stuttgart.de/
 

George Mpouras

# There is the correct answer and the fast one. Here is the fast one !



use strict;
use warnings;

print SecondThis( '3w4d20m 10s' ), "\n";   # 2161210
print SecondThis( '2h 3s' ), "\n";         # 7203

# Number of seconds of a string like "3w4d 20m 10s"
#
# Y years
# M months
# d days
# w weeks
# h hours
# m minutes
# s seconds
#
sub SecondThis
{
(local $_ = $_[0]) =~s/[^\dYMdwhms]+//g;
my $years = /(\d+)\s*Y/ ? $^N : 0;
my $months = /(\d+)\s*M/ ? $^N : 0;
my $days = /(\d+)\s*d/ ? $^N : 0;
my $weeks = /(\d+)\s*w/ ? $^N : 0;
my $hours = /(\d+)\s*h/ ? $^N : 0;
my $minutes = /(\d+)\s*m/ ? $^N : 0;
my $seconds = /(\d+)\s*s/ ? $^N : 0;
$seconds+
(60*$minutes)+
(3600*$hours)+
(86400*$days)+
(604800*$weeks)+
(2592000*$months)+
(31536000*$years)
}
 

Peter Makholm

# There is the correct answer and the fast one. Here is the fast one !

It would be nice to document, at least partly, where this answer is
incorrect.

- It assumes 24-hour days. This assumption breaks twice a year in
locations observing summer time.

- It assumes 30-day months, which is clearly an approximation.

- It assumes 365-day years, which doesn't account for leap years.

- It doesn't take leap seconds into account.

Depending on the scenario at least some of these assumptions might be
regarded as either valid or invalid.
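
The leap-year point can be illustrated with the core Time::Local module. This is a sketch only: the helper name year_length is made up for the example, and working in UTC is an assumption chosen to keep summer time out of the picture.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Seconds from 1 January of $year to 1 January of the following year,
# measured in UTC (no summer-time shifts; Unix time does not count
# leap seconds either).
sub year_length {
    my ($year) = @_;
    return timegm(0, 0, 0, 1, 0, $year + 1) - timegm(0, 0, 0, 1, 0, $year);
}

for my $year (2012, 2013) {
    my $len = year_length($year);
    printf "%d: %d days, %d s more than 365 * 86400\n",
        $year, $len / 86400, $len - 365 * 86400;
}
```

For 2012 the fixed constant 31536000 is short by a full day; for 2013 it matches.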

//Makholm
 

George Mpouras

On 7/8/2013 11:46, Peter Makholm wrote:
It would be nice to document, at least partly, where this answer is
incorrect.

- It assumes 24-hour days. This assumption breaks twice a year in
locations observing summer time.

- It assumes 30-day months, which is clearly an approximation.

- It assumes 365-day years, which doesn't account for leap years.

- It doesn't take leap seconds into account.

Depending on the scenario at least some of these assumptions might be
regarded as either valid or invalid.

//Makholm



You have answered it yourself.
Very often customers say, without much thought, something like "please
automate the retirement of data that is older than, e.g., 2 months"!

That is where a very interesting conversation starts.
As I said, there is an absolutely correct solution, but it is out of scope
for the initial request.
 

Rainer Weikusat

Peter Makholm said:
It would be nice to document, at least partly, where this answer is
incorrect.

- It assumes 24-hour days. This assumption breaks twice a year in
locations observing summer time.

- It assumes 30-day months, which is clearly an approximation.

- It assumes 365-day years, which doesn't account for leap years.

- It doesn't take leap seconds into account.

While this is true, it already applies to the input data which uses
undefined units whose 'conventionally assumed meanings' are not really
well-defined: The exact meaning depends not only on a start date but
even on unpredictable, external events such as 'leap second
insertion'.
 

Rainer Weikusat

Mathias Kőrber said:
I am looking for a module which can help
translate human input for durations such as

3w4d20m10s
into seconds (2161210). Spaces inside the
input should be ignored. If it can accept
other formats, the better.

If you want to do this correctly, a start date is needed and the code
needs to calculate the 'target date' based on that. Eg, '2M starting
on March 15th' would mean 'May 15th', a period of 61 days, the same
starting on 'July 15th' would mean 'September 15th', 62 days. After this
has been done, the target date can be converted into a second count
and the difference can be calculated.

It would be simpler to use an approximate unit definition and code
like the two examples posted in this thread.
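
For comparison, the start-date approach can be sketched with the core Time::Local module. The helper name month_span_seconds, the use of UTC, and the year 2013 are assumptions made for the illustration. timegm dies if the target day does not exist (for example the 31st of a 30-day month), which this sketch does not handle.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical helper: seconds between midnight UTC on a start date and
# midnight UTC on the same day of the month $months later.
# $month is 1-based (1 = January).
sub month_span_seconds {
    my ($year, $month, $day, $months) = @_;
    my $start  = timegm(0, 0, 0, $day, $month - 1, $year);
    my $target = $month - 1 + $months;           # 0-based month, may exceed 11
    my $end    = timegm(0, 0, 0, $day, $target % 12, $year + int($target / 12));
    return $end - $start;
}

# Two calendar months from 15 March and from 15 July differ in length.
printf "%d days\n", month_span_seconds(2013, 3, 15, 2) / 86400;   # 61
printf "%d days\n", month_span_seconds(2013, 7, 15, 2) / 86400;   # 62
```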
 

Ivan Shmakov

If you want to do this correctly, a start date is needed and the code
needs to calculate the 'target date' based on that. Eg, '2M starting
on March 15th' would mean 'May 15th', a period of 61 days,

I took it that the intervals the OP's interested in are weeks,
days, hours (?), minutes, and seconds, and it's certainly
possible (leap seconds issue put aside) to define these in the
terms of seconds, irrespective of the starting point.

Also to note is that the conventional "Unix time" system
effectively accounts for leap seconds by "stretching" the plain
seconds over certain interval.

As for the "daylight savings time," it's indeed possible that
"1 hour from 02:34, local time" would come to be either "02:34"
or "04:34." Which is one more reason to use UTC when one's
interested in precise intervals specifically.

[...]
 

Rainer Weikusat

Ivan Shmakov said:
I took it that the intervals the OP's interested in are weeks,
days, hours (?), minutes, and seconds, and it's certainly
possible (leap seconds issue put aside) to define these in the
terms of seconds, irrespective of the starting point.

It is certainly possible to define anything as anything. Eg, the
calculation becomes really simple when all of the involved units are
defined as '0 seconds': The result is then always 0, problem solved.
But this may not be what the people using the notation wanted to have.

Also to note is that the conventional "Unix time" system
effectively accounts for leap seconds by "stretching" the plain
seconds over certain interval.

It doesn't. Provided the wallclock is managed by a 'sensibly working
NTP daemon' observing the NTP 'clock correctness principle' (and there
are loads and loads of people who think that the concept of 'time'
simply doesn't make sense and hence, using a mostly monotonic PRNG
with a sufficiently fine-grained resolution that humans are unlikely
to notice the problem easily ought to be 'good enough'), time
adjustments will be accomplished by changing the frequency of the
clock such that it converges towards 'the real time'. That's
unrelated to leap seconds which are inserted whenever 'the
timekeepers' think they have to and implemented such that the last
minute of the day is extended to have 61 instead of 60 seconds.
 

Peter J. Holzer

This one above seems a bit redundant...


... given that these REs are already going to ignore spaces.

No, they don't. For example, they won't accept "1 w" as input.

And not only spaces, BTW. Consider, e. g.:

my $r = seconds("Hello, wor1d!");

Now $r is 86400.

Yes. So Ulli's solution accepts a lot of input which should probably
not be accepted. However, to satisfy the spec "Spaces inside the input
should be ignored" the statement s/\s//g is necessary.
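
One possible way to reject such input is to validate the whole string before extracting anything. This is a sketch, not something proposed in the thread: the name strict_seconds, the restriction to the units w, d, h, m and s, and the choice to return undef on bad input are all assumptions.

```perl
use strict;
use warnings;

# Hypothetical stricter variant: returns undef (in scalar context) unless
# the input, after whitespace removal, consists only of <digits><unit> groups.
sub strict_seconds {
    my ($input) = @_;
    (my $s = $input) =~ s/\s+//g;                 # spec: spaces are ignored
    return unless $s =~ /\A(?:[0-9]+[wdhms])+\z/; # reject anything else

    my %unit = (w => 604800, d => 86400, h => 3600, m => 60, s => 1);
    my $seconds = 0;
    $seconds += $1 * $unit{$2} while $s =~ /([0-9]+)([wdhms])/g;
    return $seconds;
}

for my $in ('3w4d20m10s', '1 w', 'Hello, wor1d!') {
    my $r = strict_seconds($in);
    print "'$in' => ", (defined $r ? $r : 'rejected'), "\n";
}
```

Note that a repeated unit such as "1d1d" is summed here rather than rejected; whether that is desirable depends on the spec.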

hp
 

Rainer Weikusat

Ben Morrow said:
If you s/m/mn/g; s/(?<!\d)(?=\d)/ /g; then these can be parsed by
Date::Manip::Delta.

In other words: Date::Manip::Delta can't solve the problem.

You could also use DateTime::Format::DateManip; in general I'd
recommend using DateTime, because it gets all the nasty corner cases
right.

Which 'nasty corner cases'? The only real problem are the ill-defined
units. When assuming that George's/ Ulli's 'approximations' (which is a
euphemism for 'garbage in, garbage out' here) are appropriate, the
problem is simple:

----------------
$d = $ARGV[0];
$d =~ s/\s//g;

%units = (
Y => 365 * 86400,
M => 30 * 86400,
w => 7 * 86400,
d => 86400,
h => 3600,
m => 60,
s => 1);

$p = 0;
$p = pos($d), $seconds += $1 * $units{$2} while $d =~ /\G(\d+)([YMdwhms])/g;

die("error at $p") if $p < length($d);

print($seconds, "\n");
----------------

In scalar context, $d =~ /\G(\d+)([YMdwhms])/g matches a sequence of
digits followed by a 'unit abbreviation'. The former is put into $1, the
latter into $2. The expression returns true if a match could be found
and false otherwise. The \G means 'start where the last match stopped',
the /g 'continue with this string' (also an approximation).
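
The interplay of \G, /g and pos() can be seen on a deliberately malformed input. This is a sketch with lexical variables and a made-up input string. [0-9] is used instead of \d here, and Y and M are left out for brevity.

```perl
use strict;
use warnings;

my $d = '3w4d20mX10s';    # the stray 'X' sits at offset 7
my %units = (w => 604800, d => 86400, h => 3600, m => 60, s => 1);
my ($p, $seconds) = (0, 0);

# Each successful match advances pos($d). The first failed match ends the
# loop and resets pos($d) (no /c modifier is used), which is why the last
# good position is saved in $p inside the loop.
while ($d =~ /\G([0-9]+)([wdhms])/g) {
    $seconds += $1 * $units{$2};
    $p = pos($d);
}

print "parsed $seconds seconds, stopped at offset $p of ", length($d), "\n";
```

Because $p ends up smaller than length($d), a check like the die above would report the error at offset 7.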
 

Rainer Weikusat

Rainer Weikusat said:
[...]

----------------
$d = $ARGV[0];
$d =~ s/\s//g;

%units = (
Y => 365 * 86400,
M => 30 * 86400,
w => 7 * 86400,
d => 86400,
h => 3600,
m => 60,
s => 1);

$p = 0;
$p = pos($d), $seconds += $1 * $units{$2} while $d =~ /\G(\d+)([YMdwhms])/g;

[...]

It is possible to do without the explicit whitespace removal:

$p = pos($d), $seconds += $1 * $units{$2} while $d =~ /\G\s*(\d+)\s*([YMdwhms])\s*/g;

NB: Thanks to Unicode, the \d and \s might match 'arbitrary
garbage'. In particular, \d+ might match something Perl doesn't
consider to be a number.
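
A small sketch of that effect, using ARABIC-INDIC DIGIT THREE (U+0663) as an arbitrary example character. The /a modifier needs Perl 5.14 or later; an explicit [0-9] class works on older versions too.

```perl
use strict;
use warnings;

my $arabic_three = "\x{0663}";    # ARABIC-INDIC DIGIT THREE

print "matched by \\d\n"      if $arabic_three =~ /\A\d\z/;
print "rejected by \\d with /a\n" if $arabic_three !~ /\A\d\z/a;
print "rejected by [0-9]\n"   if $arabic_three !~ /\A[0-9]\z/;

{
    no warnings 'numeric';        # the conversion below would otherwise warn
    # Perl's string-to-number conversion only understands ASCII digits.
    print "numeric value: ", $arabic_three + 0, "\n";
}
```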
 

Rainer Weikusat

Ben Morrow said:
Quoth Rainer Weikusat:
[...]
Which 'nasty corner cases'? The only real problem are the ill-defined
units. When assuming that George's/ Ulli's 'approximations' (which is a
euphemism for 'garbage in, garbage out' here) are appropriate, the
problem is simple:

----------------
$d = $ARGV[0];
$d =~ s/\s//g;

%units = (
Y => 365 * 86400,
M => 30 * 86400,
w => 7 * 86400,
d => 86400,
h => 3600,
m => 60,
s => 1);

A year is not always 365 days. A month is only occasionally 30 days. A
day is not

[...]

I figure I have now written 3 or 4 postings pointing out that the
units used in this example are not well-defined, including the one
you're replying to, cf first paragraph. DateTime can't "handle that"
because in absence of a start date the interval is supposed to apply
to, the problem can't be solved. Even then, it can't be solved,
neither by DateTime nor anything because 'leap second insertion' is
not predictable. Consequently, I take this as 'no such corner cases
exist in the parser'.
 

Tim McDaniel

If you s/m/mn/g; s/(?<!\d)(?=\d)/ /g; then these can be parsed by
Date::Manip::Delta.

I am not at all familiar with the fancy-pants newfangled stuff in
regexps like in that second example. To save other people trouble,
- "m" has to be expressed as "mn" (in Date::Manip::Delta,
"m" appears to be "month" and "mn" is "minute")
- Date::Manip::Delta requires space (or comma) before digits.
Find each place where the character before is not a digit and the
character following is a digit, and put a space there.
(Those are zero-width assertions.) I see no reason why it could not
be expressed, albeit probably with less efficiency, as
s/(\d+)/ $1/g
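
A quick way to compare the two forms on the sample input from the original question is sketched below. Only the string rewriting is shown; the variable names are illustrative, and handing the result to Date::Manip::Delta is left out.

```perl
use strict;
use warnings;

my $input = '3w4d20m10s';

# Variant 1: the zero-width assertions from the quoted suggestion.
(my $lookaround = $input) =~ s/m/mn/g;
$lookaround =~ s/(?<!\d)(?=\d)/ /g;    # insert a space where a digit run begins

# Variant 2: capture each whole digit run and put a space in front of it.
(my $capture = $input) =~ s/m/mn/g;
$capture =~ s/(\d+)/ $1/g;

print "'$lookaround'\n";
print "'$capture'\n";
print "same result\n" if $lookaround eq $capture;
```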
 

Jim Gibson

A year is not always 365 days. A month is only occasionally 30 days. A
day is not always 86400 seconds. DateTime will handle expressions like
'one month and three days' correctly; it also handles Summer Time
correctly when working in local time.

There is some ambiguity in the concept of /duration/ when applied to
such large units as years and months. A "unit" of duration should have
a fixed definition that doesn't depend upon the date and time when the
period starts and stops.

For example, if you say you want your egg boiled for "three minutes",
you mean exactly 180 seconds, regardless of when you actually place the
egg into the boiling water and whether or not a leap second is added to
the calendar during the cooking. If you want your concrete driveway to
cure for "three days" before parking your car on it, you want to wait
no less than 259,200 seconds, even if you happen to pour the concrete
on the Friday before daylight savings goes into effect.

So we should try to agree that one minute of duration is 60 seconds,
one hour is 60 minutes, one day is 24 hours, and one week is 7 days,
regardless of when those periods start or stop.

When it comes to months and years, there is more ambiguity and more
possibility for disagreement. A solar year is 365.242 days, or
31,556,908.8 seconds. An average "month" would be one-twelfth of that,
or 2,629,742.4 seconds = 30.4368 days. The potential disagreements of
what constitutes a "year" or a "month" probably preclude those terms
from being used as units of duration unless it is obvious what values
are being used.
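
The figures above can be reproduced with a few lines of arithmetic. This is a sketch; the variable names are illustrative, and the 365.242-day solar year is the value quoted in the paragraph above.

```perl
use strict;
use warnings;

my $day        = 24 * 60 * 60;       # 86400 seconds
my $solar_year = 365.242 * $day;     # solar year in seconds
my $mean_month = $solar_year / 12;   # one-twelfth of a solar year

printf "three days: %d s\n",   3 * $day;
printf "solar year: %.1f s\n", $solar_year;
printf "mean month: %.1f s = %.4f days\n", $mean_month, $mean_month / $day;
```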
 

Tim McDaniel

Quoth (e-mail address removed):

'Newfangled'?!

FANCY-PANTS newfangled. 'Tweren't in Perl 4.038.
*ptooo* [ping!]

Seriously, I've never had cause to use a zero-width assertion, or
possibly any other "?" syntax other than the one that keeps (...) from
touching $digit ("(?: ... )"), and that I had to look up just now.
 

George Mpouras

Seriously, I've never had cause to use a zero-width assertion, or
possibly any other "?" syntax other than the one that keeps (...) from
touching $digit ("(?: ... )"), and that I had to look up just now.

All of this is very wise and clever, except ... that the OP is missing!
 
