Regex help

BernieH

I need a regex which will extract the year only from a variety of date
strings such as -

April 8, 2000
13 May 1999
April 1999
1999
20050908 (yyyymmdd format)

The year will always be either the 1st 4 or the last 4 characters of the
string.


What I have at the moment is ^([0-9]{4}).*$ for the YYYYMMDD format, but I'm
not sure how to accommodate the first 3 examples. .*[0-9]{4}$ doesn't seem to
cut it.

I don't do enough regexes to be very confident with them.

TIA for any help

bernieh
 
Michael Winter

BernieH said:
I need a regex which will extract the year only from a variety of
date strings such as -

Why a variety of formats? To ensure reliable results, it would be
necessary to first determine the format, then parse the string according
to what was detected. Obviously, the more formats you add, the more code is
necessary (and subsequent testing) and the greater the chance of mismatches.
Unless you have a /real/ reason to be so flexible, enforce a small
number (preferably one) of distinct formats.

BernieH said:
April 8, 2000
13 May 1999
April 1999
1999
20050908 (yyyymmdd format)

The year will always be either the 1st 4 or the last 4 characters of
the string.

In properly constructed strings, yes. However, are you parsing user
input or something generated mechanically? If it's the former, your
statement above cannot be regarded with /any/ confidence. If it's the
latter, you should be able to enforce just one format.

BernieH said:
What I have at the moment is ^([0-9]{4}).*$ for the YYYYMMDD format,
but I'm not sure how to accommodate the first 3 examples. .*[0-9]{4}$
doesn't seem to cut it.

You shouldn't be trying to match any character (that applies to both
expressions). Instead, you should be specific. The last format should
only be matched if there are eight consecutive digits.

/^(\d{4})\d{4}$/

The first four should be matched by zero or more non-digit characters,
followed by four digits that end the string.

/\D*(\d{4})$/

These could even be combined, but as I said at the beginning, I wouldn't
even use the second regular expression on its own, let alone combined
with another.
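
For anyone following along, here is a minimal JavaScript sketch of how the two expressions above might be combined. It assumes the eight-digit pattern is tried first: the second expression is not anchored at the start, so on its own it would also match the last four digits of 20050908. The name extractYear and the null return value are made up for illustration.

```javascript
// Hypothetical helper: the name and the null return value are assumptions.
function extractYear(input) {
  const text = String(input).trim();

  // Eight consecutive digits: treat as YYYYMMDD and take the first four.
  const compact = /^(\d{4})\d{4}$/.exec(text);
  if (compact) {
    return Number(compact[1]);
  }

  // Otherwise take four digits that end the string.
  const trailing = /\D*(\d{4})$/.exec(text);
  return trailing ? Number(trailing[1]) : null;
}

console.log(extractYear("April 8, 2000")); // 2000
console.log(extractYear("20050908"));      // 2005
```

A string with no four-digit ending gives null, which lets the caller decide what to do with input it does not recognise.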

[snip]

Mike
 
JDS

BernieH said:
I don't do enough regexes to be very confident with them.

Not to nitpick, but last I checked, HTML did not support regular
expressions. What programming/scripting language is this question about?
 
Toby Inkster

BernieH said:
I need a regex which will extract the year only from a variety of date
strings such as -

April 8, 2000
13 May 1999
April 1999
1999
20050908 (yyyymmdd format)

It's unlikely a regexp would be able to do that. If you're using PHP,
check out the strtotime() function.

<?php
$testdates = array (
"April 8, 2000",
"13 May 1999",
"April 1999",
"1999",
"20050908",
"Next Tuesday",
"Yesterday",
"6 years"
);

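// array_pop() takes entries from the end, so dates print in reverse order.
while ( $somedate = array_pop($testdates) )
{
// strtotime() turns the string into a Unix timestamp, or false on failure.
$timestamp = strtotime($somedate);
$year = ($timestamp === false) ? "unknown" : (int)date("Y", $timestamp);
print "<p>DATE is $somedate<br>YEAR is $year</p>\n";
}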
?>

If you really, really need to use a regexp, the following will match a
year in many cases:

/[12][0-9]{3}/

but it won't be 100% reliable.
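
A quick JavaScript illustration of where that pattern can go wrong. It assumes a day-first string such as 08122005 turns up; that input is hypothetical and not one of the formats listed in this thread. findLooseYear is an illustrative name only.

```javascript
// Illustrative helper: returns the first run of four digits that starts
// with 1 or 2, wherever it occurs in the string.
function findLooseYear(input) {
  const match = /[12][0-9]{3}/.exec(String(input));
  return match ? Number(match[0]) : null;
}

console.log(findLooseYear("13 May 1999")); // 1999
console.log(findLooseYear("20050908"));    // 2005
console.log(findLooseYear("08122005"));    // 1220, not 2005
```

The last call shows the weakness: the pattern has no idea where the year sits, so it takes the first plausible-looking run of digits.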
 
Neredbojias

With neither quill nor qualm, JDS quothed:
Not to nitpick, but last I checked, HTML did not support regular
expressions. What programming/scripting language is this question about?

Yeah, I'm confused, too. Unless I really like something, my regular
expression is "Ah, shit."
 
BernieH

Thank you for your help. I'm using JavaScript.

Michael Winter said:
Why a variety of formats?

Unfortunately, I don't have any control over the input ... the dates are
coming in from a variety of third parties via OpenURL key-pairs. So I have
to work with what I've got. The formats that are coming in at this point are

YYYY
YYYYMMDD
MMM DD, YYYY

However, it may be that other formats will come through at some point.
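
One way to handle the three formats listed here is one strict pattern per format, so that anything unexpected is reported rather than guessed at. This is only a rough sketch: the FORMATS table, the detectYear name, and the exact month-name pattern are all assumptions, not something taken from the OpenURL sources.

```javascript
// Hypothetical table: one fully anchored pattern per known format.
const FORMATS = [
  { name: "YYYY", pattern: /^(\d{4})$/ },
  { name: "YYYYMMDD", pattern: /^(\d{4})\d{2}\d{2}$/ },
  // Assumes an alphabetic month, a one- or two-digit day, then ", YYYY".
  { name: "MMM DD, YYYY", pattern: /^[A-Za-z]+ \d{1,2}, (\d{4})$/ },
];

// Returns the detected format and year, or null for an unknown format.
function detectYear(input) {
  const text = String(input).trim();
  for (const { name, pattern } of FORMATS) {
    const match = pattern.exec(text);
    if (match) {
      return { format: name, year: Number(match[1]) };
    }
  }
  return null;
}

console.log(detectYear("April 8, 2000")); // { format: 'MMM DD, YYYY', year: 2000 }
console.log(detectYear("13 May 1999"));   // null (not in the table yet)
```

When a new format does start coming through, it shows up as a null result and can be added to the table as one more entry.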

Michael Winter said:
You shouldn't be trying to match any character (that applies to both
expressions). Instead, you should be specific. The last format should only
be matched if there are eight consecutive digits.

/^(\d{4})\d{4}$/

The first four should be matched by zero or more non-digit characters,
followed by four digits that end the string.

/\D*(\d{4})$/

Thank you for these suggestions; I'll give them a go.

bernieh
 
Mitja Trampus

BernieH said:
I need a regex which will extract the year only from a variety of date
strings such as -

April 8, 2000
13 May 1999
April 1999
1999
20050908 (yyyymmdd format)

The year will always be either the 1st 4 or the last 4 characters of the
string.

Then why regexes? Just cut off the first 4 and the last 4
characters, then see if either of them is a valid year (i.e.,
convert them to int and check something like 1900 < year <
2010).
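
In JavaScript that could look roughly like the following. The function name yearFromEnds is invented for the example, and the default bounds are just the 1900 and 2010 mentioned above.

```javascript
// Hypothetical helper: checks the first four and last four characters.
function yearFromEnds(input, minYear = 1900, maxYear = 2010) {
  const text = String(input).trim();
  const candidates = [text.slice(0, 4), text.slice(-4)];

  for (const candidate of candidates) {
    // Only accept a candidate made of exactly four digits.
    if (/^\d{4}$/.test(candidate)) {
      const year = Number(candidate);
      if (year > minYear && year < maxYear) {
        return year;
      }
    }
  }
  return null; // neither end looks like a valid year
}

console.log(yearFromEnds("April 8, 2000")); // 2000
console.log(yearFromEnds("20050908"));      // 2005
```

The digit check is still a tiny regex, but it could equally be replaced with a character-by-character test.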
 
Neredbojias

With neither quill nor qualm, JDS quothed:
A geeky joke, to be sure. but i chuckled.

Okay, you have a point. I may not be Rodney Dangerfield but I'm almost
as ugly.
 
dorayme

From: JDS said:
A geeky joke, to be sure. but i chuckled.


Not for sure or at all JDS, it is an anti-geeky joke. But not a bad one, I
agree! Boji seems able to pull a few off in between saying horrid things
about women... But all will be fixed after his dance with Officer Bud
White...

dorayme
 
Neredbojias

With neither quill nor qualm, dorayme quothed:
Not for sure or at all JDS, it is an anti-geeky joke. But not a bad one, I
agree! Boji seems able to pull a few off in between saying horrid things
about women... But all will be fixed after his dance with Officer Bud
White...

I don't say horrid things about women. I just point out the
discrepancies between them and logic.
 
dorayme

From: Neredbojias said:
I don't say horrid things about women. I just point out the
discrepancies between them and logic.

I don't think so Boji! There is deep trouble going on, I don't
much believe in psychotherapy and that is why I have urged
alternative treatment for you. A meeting, a little dancy wancy
with Officer Bud White...

dorayme

(Just btw, if it is a "logical discrepancy" you are talking
about - and could there be any other kind really? - you need to
pause to reflect about what you have said. Logical relations are
(roughly) between statements. They are not between statements
and people. If it was not such a mistake I would say there is a
discrepancy between your boastful male self and logic.)
 
Neredbojias

With neither quill nor qualm, dorayme quothed:
I don't think so Boji! There is deep trouble going on, I don't
much believe in psychotherapy and that is why I have urged
alternative treatment for you. A meeting, a little dancy wancy
with Officer Bud White...

dorayme

(Just btw, if it is a "logical discrepancy" you are talking
about - and could there be any other kind really? - you need to
pause to reflect about what you have said. Logical relations are
(roughly) between statements. They are not between statements
and people. If it was not such a mistake I would say there is a
discrepancy between your boastful male self and logic.)

He he, no, it's not a mistake. But it's not exactly "boastful male"
ego, either. It's called *venting*. I know and freely admit that women
aren't inferior to men, but even women must stand up for their rights if
they want to retain them. Many women, too many, expect a _man_ to
defend their rights for them and then complain when they are treated
unequally. My answer to that is, "Tough titties, sister." Now I
slandered and defamed and denigrated femininity up, down, and sideways
in this forum, and what woman had the gumption to defend *her* rights
herself? Nah, they all sidled away like whipped dogs in a kitty litter
factory. Ergo, what could I expect and what can any man expect when all
we see is mass wuss?
 
