Simple regex in Java to extract the domain name

Berlin Brown

I am trying to convert a regex that I have in Ruby to do the same in
Java, but the dialects are different.

I am trying to parse a URL such that I get the domain name (possibly
with the www).

http://www.yahoo.com/suckit/kjlaflaj/ljl?lsklf

For example, above would return:

www.yahoo.com

My Ruby regex is:

/^(?:[^\/]+:\/\/)?([^\/:]+)/

And I was working on the Java one, but haven't made much progress:

Pattern p = Pattern.compile("^http://([a-z0-9]*\\.)*");

Matcher m = p.matcher("http://www.yahoo.com/suckit/kjlaflaj/ljl?lsklf");
 
Roedy Green

And I was working on the Java one, but haven't made much progress:

Pattern p = Pattern.compile("^http://([a-z0-9]*\\.)*");

The key is to match one character at a time; once you have that
working, extend your pattern by one more character. The problem is
that working with regexes is like working blindfolded: you can't see
why they are failing to give the expected results.
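
One way to take the blindfold off is to print what each partial pattern matches as it grows. The following is only a sketch: the class name IncrementalRegex, the helper matchedPrefix, and the particular steps are made up for illustration and are not from the original posts.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IncrementalRegex {
    // Hypothetical helper: returns the text the regex matches starting at
    // the beginning of the input, or null if it does not match there.
    static String matchedPrefix(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        // lookingAt() anchors at the start but does not require the
        // whole input to match, which suits a pattern still being built.
        return m.lookingAt() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "http://www.yahoo.com/suckit/kjlaflaj/ljl?lsklf";
        // Each step extends the previous pattern by one small piece.
        String[] steps = {
            "http",
            "http://",
            "http://[a-z0-9]+",
            "http://[a-z0-9]+\\.",
            "http://([a-z0-9]+\\.)*",
            "http://([a-z0-9]+\\.)*[a-z0-9]+"
        };
        for (String regex : steps) {
            System.out.println(regex + " -> " + matchedPrefix(regex, input));
        }
    }
}
```

The second-to-last step stops at "http://www.yahoo." because every repetition of the group must end in a dot, so "com" is never consumed. The last step adds a final label without a dot and matches "http://www.yahoo.com".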
 
Dave Mandelin

What is the problem exactly? Does your regexp not match the string, or
does the matched group not extract the part you want?
 
Berlin Brown

OK, without the spoon-feeding, I did what you said. Thanks. It is a
start, and this is what I ended up with (for the benefit of complete
Java regex newbies):

"http://(.*?)\\/(.*)"

My thought process:
1. The 'http://' means: match the literal text 'http://'. To tie it to
the start of the string, the pattern would also need a leading '^'.

2. I wanted the host (I will leave the www for now), so I wanted any
characters between the 'http://' and the first '/'. The dot matches
any character and the '*' means zero or more times. On its own, '*' is
greedy and grabs as much as possible; the trailing '?' makes it lazy,
so it takes as few characters as possible and stops at the first '/'.
So I ended up with (.*?), where the '(' and ')' form a capturing group.

3. Next, I needed to match the '/' as a literal, so I escaped it with
'\\'. (In Java the '/' has no special meaning in a regex, so the escape
is optional.)

4. Capture the rest of the URL string in another group.
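
The four steps above can be checked with a short test program. This is a sketch under assumptions: the class LazyVsGreedy and the helper firstGroup are hypothetical names, and the greedy variant is included only for comparison.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyVsGreedy {
    // The pattern from the post: lazy first group, then a literal '/'.
    static final String LAZY = "http://(.*?)\\/(.*)";
    // Same pattern with a greedy first group, for comparison only.
    static final String GREEDY = "http://(.*)\\/(.*)";

    // Hypothetical helper: returns capturing group 1 of the first match,
    // or null when the regex does not match anywhere in the input.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String url = "http://www.yahoo.com/suckit/kjlaflaj/ljl?lsklf";
        System.out.println("lazy:   " + firstGroup(LAZY, url));
        System.out.println("greedy: " + firstGroup(GREEDY, url));
        // No '/' after the host, so the pattern does not match at all.
        System.out.println("bare:   " + firstGroup(LAZY, "http://www.yahoo.com"));
    }
}
```

The lazy group yields "www.yahoo.com", while the greedy one runs on to the last '/' and yields "www.yahoo.com/suckit/kjlaflaj". A URL with no path, such as "http://www.yahoo.com", does not match because the pattern requires a '/' after the host.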
 
Bart Cremers

Your Ruby regex works in Java if it is used correctly. I removed the
escapes to simplify it a bit, but removing them is not required:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String pattern = "^(?:[^/]+://)?([^/:]+)";
String input = "http://www.yahoo.com/suckit/kjlaflaj/ljl?lsklf";

Matcher matcher = Pattern.compile(pattern).matcher(input);
if (matcher.find()) {
    // Group 1 is the host: everything after the optional scheme,
    // up to the first '/' or ':'.
    int start = matcher.start(1);
    int end = matcher.end(1);

    System.out.println(input.substring(start, end));
}

Regards,

Bart
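
For anyone who wants to see how the same pattern behaves on a few other inputs, here is a small sketch. The class name RubyPatternInJava, the helper host, and the extra sample URLs are assumptions added for illustration; only the pattern itself comes from the post above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RubyPatternInJava {
    // The pattern from the post: optional "scheme://", then the host.
    private static final Pattern HOST =
            Pattern.compile("^(?:[^/]+://)?([^/:]+)");

    // Hypothetical helper: returns capturing group 1, or null if no match.
    static String host(String input) {
        Matcher m = HOST.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String[] samples = {
            "http://www.yahoo.com/suckit/kjlaflaj/ljl?lsklf",
            "www.yahoo.com/suckit",          // no scheme
            "https://www.yahoo.com:8080/x",  // explicit port
            "http://user@www.yahoo.com/"     // user info before the host
        };
        for (String s : samples) {
            System.out.println(s + " -> " + host(s));
        }
    }
}
```

The first three samples give "www.yahoo.com". The last one gives "user@www.yahoo.com", because the pattern has no notion of user info; that is one of the cases where a URL parser is more reliable than a regex.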
 
Greg R. Broderick

I am trying to parse a URL such that I get the domain name (possibly
with the www).

http://www.yahoo.com/suckit/kjlaflaj/ljl?lsklf

For example, above would return:

www.yahoo.com

Why waste your time re-inventing the wheel? Java has the built-in
java.net.URL class that will do this for you, via its getHost() method.

Cheers
GRB
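
A minimal sketch of the java.net.URL approach follows. The class UrlHost and the helper hostOf are hypothetical names. The URL is built through URI.create(...).toURL() on the assumption that a recent JDK is in use, where the URL(String) constructor is deprecated; on older JDKs new URL(spec) would do the same job.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

class UrlHost {
    // Hypothetical helper: parses the text as an absolute URL and
    // returns its host part.
    static String hostOf(String spec) {
        try {
            URL url = URI.create(spec).toURL();
            return url.getHost();
        } catch (MalformedURLException e) {
            // Rethrown unchecked to keep the example short.
            throw new IllegalArgumentException("Not a usable URL: " + spec, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hostOf("http://www.yahoo.com/suckit/kjlaflaj/ljl?lsklf"));
    }
}
```

For the URL in the question this prints www.yahoo.com. A port, if present, is not part of the value returned by getHost().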

 
Nigel Wade

There's no point in re-inventing wheels.
If you are working with URIs, why not use the URI tools available to you?

URI uri = new URI("http://www.yahoo.com/suckit/kjlaflaj/ljl?lsklf");
String domainName = uri.getHost();
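
The two lines above need a little scaffolding to compile, because the URI constructor throws the checked URISyntaxException. A sketch, assuming a hypothetical class UriHost and helper hostOrNull:

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriHost {
    // Hypothetical helper: returns the host of the given URI text, or
    // null when the text is not a valid URI or has no host component.
    static String hostOrNull(String text) {
        try {
            return new URI(text).getHost();
        } catch (URISyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(hostOrNull("http://www.yahoo.com/suckit/kjlaflaj/ljl?lsklf"));
        // Without a scheme the text parses as a relative path, so
        // getHost() has nothing to return.
        System.out.println(hostOrNull("www.yahoo.com/suckit"));
    }
}
```

Note the difference from the regex approach: input without a scheme, such as "www.yahoo.com/suckit", is treated as a relative path and getHost() returns null, so callers should check for that case.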
 
