Getting a valid URL from a command line

Hunt Jon · Sep 28, 2009

Hi - I'm working on the script below, which tries to read user input
and validate that it is formed like a URL. If the input is not valid,
it should ask again.

require 'uri'
puts "Type a URL"
begin
  url = gets.chomp
  URI.parse(url) # should raise if a variable 'url' is malformed.
rescue URI::InvalidURIError
  puts "That is not a valid URL. Try again."
  retry
end

I expect that if I run "URI.parse()" it should raise an error, but
it doesn't happen.

Can anybody help me on this one?

Jon

Rob Biedenharn · Sep 28, 2009

require 'uri'
print "Type a URL: "
begin
  url = gets.chomp
  puts "You said: #{url.inspect}"
  uri = URI.parse(url) # should raise if a variable 'url' is malformed.
  puts uri.inspect
rescue URI::InvalidURIError
  puts "That is not a valid URL. Try again."
  retry
end

Try getting a little more information out (and post the input you
are trying that you expect to be malformed).

Note that some URIs are HTTP and some might be Generic. There are a
lot more types of URI than just those that start with http://. Have
you ever seen a jdbc resource string?
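
To illustrate the point about URI types, here is a small sketch showing which class and scheme URI.parse reports for a few inputs. The helper name describe_uri and the sample strings are made up for illustration; they are not from the original script.

```ruby
require 'uri'

# Hypothetical helper: report the class and scheme URI.parse assigns to a string.
def describe_uri(str)
  uri = URI.parse(str)
  [uri.class, uri.scheme]
end

# None of these raise URI::InvalidURIError; they are all well-formed URIs
# (or relative references), even though only the first is an HTTP URL.
p describe_uri('http://example.com/')            # [URI::HTTP, "http"]
p describe_uri('mailto:someone@example.com')     # [URI::MailTo, "mailto"]
p describe_uri('jdbc:mysql://localhost/mydb')    # [URI::Generic, "jdbc"]
p describe_uri('example.com')                    # [URI::Generic, nil]
```

So a string can parse without error and still not be the kind of URL the script is after.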

-Rob

Rob Biedenharn http://agileconsultingllc.com

Hunt Jon · Sep 28, 2009

require 'uri'
print "Type a URL: "
begin
  url = gets.chomp
  puts "You said: #{url.inspect}"
  uri = URI.parse(url) # should raise if a variable 'url' is malformed

Rob Biedenharn · Sep 28, 2009

I expect a user to input an HTTP or HTTPS URL, e.g., http://abcdef.gov
After some research, URI seems *too* generic, since 'uri' covers many
different protocols, not just http/https.

I'll look into it. Perhaps a Regexp match would be better.

Jon

You can see what the scheme is determined to be:

irb> require 'uri'
=> true
irb> u=URI.parse('http://example.com/')
=> #<URI::HTTP:0x395b34 URL:http://example.com/>
irb> u.scheme
=> "http"
irb> x=URI.parse('example.com')
=> #<URI::Generic:0x392f24 URL:example.com>
irb> x.scheme
=> nil

You probably don't want to jump down the Regexp rabbit-hole if you
know that you want a valid URI. Let the library do the heavy lifting.
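
A minimal sketch of that approach, assuming the goal is an absolute http or https URL with a host. The method names parse_http_url and prompt_for_url are hypothetical, as are the input:/output: keyword arguments (added only so the loop can be exercised without a terminal). Input such as 'example.com' parses as a relative reference with no scheme, which is why the original rescue never fired; checking the parsed class catches it.

```ruby
require 'uri'

# Hypothetical helper: return the parsed URI if `input` is an absolute
# http/https URL with a host, otherwise nil.
def parse_http_url(input)
  uri = URI.parse(input)
  # URI::HTTPS is a subclass of URI::HTTP, so this accepts both schemes.
  return nil unless uri.is_a?(URI::HTTP)
  return nil if uri.host.nil? || uri.host.empty?
  uri
rescue URI::InvalidURIError
  # Raised for strings that are not URIs at all, e.g. ones containing spaces.
  nil
end

# Hypothetical prompt loop: keep asking until the input is acceptable.
# Returns nil if the input stream ends before a valid URL is entered.
def prompt_for_url(input: $stdin, output: $stdout)
  loop do
    output.print 'Type a URL: '
    line = input.gets
    return nil if line.nil?
    uri = parse_http_url(line.chomp)
    return uri if uri
    output.puts 'That is not a valid HTTP(S) URL. Try again.'
  end
end
```

Calling prompt_for_url with no arguments reads from the terminal. The library still does the parsing; the script only decides which kinds of URI it is willing to accept.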

-Rob


David Masover · Sep 28, 2009

I expect a user to input an HTTP or HTTPS URL, e.g., http://abcdef.gov
After some research, URI seems *too* generic, since 'uri' covers many
different protocols, not just http/https.

Well, a URI isn't even required to work. Just a clarification:

A URL is meant to actually refer to a resource. For example,

http://ruby-lang.org/

actually refers to a working website, and is thus a URL -- thus, the protocol
must be something that actually exists, and as a practical matter, you'll want
it to be something you (or your browser) know how to handle.

A URI only needs to be globally unique. For example:

http://www.w3.org/1999/xhtml

It doesn't matter AT ALL whether this points to a working resource. The Web
will continue to work, even if w3.org completely implodes. As a matter of
courtesy, the W3C has actually made this a valid URL, which points to a
description of what that namespace is, and the specifications that use it --
but when your browser sees that URI at the top of a web page:

<html xmlns='http://www.w3.org/1999/xhtml' ...>

It doesn't actually talk to w3.org at all. It just knows internally that this
namespace is where HTML elements go in an XHTML document.

On a completely unrelated note, if you know how XML namespaces work,
technically, the following would probably work, on browsers that understand
XHTML:

<foobar:html xmlns:foobar='http://www.w3.org/1999/xhtml'>
<foobar:head>
...
</foobar:head>
<foobar:body>
...
</foobar:body>
</foobar:html>

I suspect that the spec explicitly disallows this, at least in the
"transitional" mode, because it's not backwards compatible with HTML 4.0. But
the point is, internally, the browser is looking for an html element
associated with that URI -- which is why it's not a valid xhtml document if
you don't include that xmlns in some form.
