Checking whether a string contains only ISO-8859-1 chars

Discussion in 'Java' started by Jonck, Oct 20, 2004.

  1. Jonck

    Jonck Guest

    Hi,
    I need to send strings to someone else's servlet. However, these strings
    may only contain ISO-8859-1 characters, therefore I need to check
    whether the user of my app has not tried to enter any non-ISO-8859-1
    characters before I send his/her input on to the servlet. Does anyone
    know of an easy way to check whether a string contains only ISO-8859-1
    characters?

    The only solution I could think of was to use a regular expression where
    I enter every ISO-8859-1 character in the matching sequence, but this is
    rather clunky and prone to errors.

    Thanks for any help, Jonck
    Jonck, Oct 20, 2004
    #1

  2. Thomas Fritsch

    Jonck wrote:

    > Hi,
    > I need to send strings to someone else's servlet. However, these strings
    > may only contain ISO-8859-1 characters, therefore I need to check
    > whether the user of my app has not tried to enter any non-ISO-8859-1
    > characters before I send his/her input on to the servlet. Does anyone
    > know of an easy way to check whether a string contains only ISO-8859-1
    > characters?
    >
    > The only solution I could think of was to use a regular expression where
    > I enter every ISO-8859-1 character in the matching sequence, but this is
    > rather clunky and prone to errors.
    >
    > Thanks for any help, Jonck


    Another solution would be to convert the String into bytes, convert the
    bytes back into a String, and then compare the two Strings:

        String s = ...;
        // Both conversions declare the checked UnsupportedEncodingException.
        byte[] bytes = s.getBytes("ISO-8859-1");
        String s2 = new String(bytes, "ISO-8859-1");
        if (s2.equals(s)) {
            ... // String s is OK
        }

    See also the Javadoc of String.
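
For reference, the round-trip idea above can be wrapped in a small method. This is only a sketch: the class name Latin1RoundTrip and the method name isLatin1 are made up for illustration, and it assumes Java 7 or later for StandardCharsets, which is newer than this thread. It relies on String.getBytes(Charset) replacing unmappable characters with the charset's replacement byte, so a string containing such characters no longer equals its decoded copy.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library.
class Latin1RoundTrip {

    // Returns true if s survives an encode/decode round trip through ISO-8859-1.
    static boolean isLatin1(String s) {
        // Unmappable characters are replaced during encoding (Java 7+ overload).
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        String decoded = new String(bytes, StandardCharsets.ISO_8859_1);
        return decoded.equals(s);
    }

    public static void main(String[] args) {
        // "cafe" with an e-acute (U+00E9), which ISO-8859-1 can represent.
        System.out.println(isLatin1("caf\u00e9"));  // prints true
        // The euro sign (U+20AC) is outside ISO-8859-1.
        System.out.println(isLatin1("\u20ac 10"));  // prints false
    }
}
```

Using the Charset overloads also avoids having to catch UnsupportedEncodingException.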

    --
    "Thomas:Fritsch$ops:de".replace(':','.').replace('$','@')
    Thomas Fritsch, Oct 20, 2004
    #2

  3. Chris Smith

    Chris Smith Guest

    Jonck wrote:
    > I need to send strings to someone else's servlet. However, these strings
    > may only contain ISO-8859-1 characters, therefore I need to check
    > whether the user of my app has not tried to enter any non-ISO-8859-1
    > characters before I send his/her input on to the servlet. Does anyone
    > know of an easy way to check whether a string contains only ISO-8859-1
    > characters?
    >
    > The only solution I could think of was to use a regular expression where
    > I enter every ISO-8859-1 character in the matching sequence, but this is
    > rather clunky and prone to errors.


    The easiest way to do this is with the java.nio.charset package:

    CharsetEncoder enc = Charset.forName("ISO-8859-1").newEncoder();
    if (enc.canEncode(str)) ...;
    else ...;
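
A complete version of the snippet above might look like the following sketch. The class name Latin1Check and the method name isLatin1 are assumptions made for the example; Charset.forName, newEncoder and canEncode(CharSequence) are the java.nio.charset calls already named in the post.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper class, not part of any library.
class Latin1Check {

    // Returns true if every character of text can be encoded in ISO-8859-1.
    static boolean isLatin1(CharSequence text) {
        // Create a fresh encoder per call; encoders are not safe for
        // use by multiple concurrent threads.
        CharsetEncoder encoder = Charset.forName("ISO-8859-1").newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        // U+00FC (u with diaeresis) is part of ISO-8859-1.
        System.out.println(isLatin1("M\u00fcnchen"));  // prints true
        // U+4E2D is a CJK ideograph, outside ISO-8859-1.
        System.out.println(isLatin1("\u4e2d"));        // prints false
    }
}
```

Creating the encoder inside the method keeps the helper safe to call from several threads, at the cost of a small allocation per call.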

    For this particular encoding, you could also take advantage of the fact
    that ISO-8859-1 contains exactly the set of Unicode characters with
    code points less than 256. So you can write this instead:

    boolean canEncode = true;
    for (int i = 0; i < str.length(); i++)
    {
        if (str.charAt(i) >= 256)
        {
            canEncode = false;
            break;
        }
    }

    The only advantage of the second approach is that this would work with
    any version of the Java API; even obsolete versions like 1.1. For other
    encodings, of course, this doesn't work so well.

    --
    www.designacourse.com
    The Easiest Way to Train Anyone... Anywhere.

    Chris Smith - Lead Software Developer/Technical Trainer
    MindIQ Corporation
    Chris Smith, Oct 20, 2004
    #3
  4. Jonck

    Jonck Guest

    Thomas and Chris, thanks to you both for your suggestions, both of your
    solutions work perfectly.
    Jonck, Oct 25, 2004
    #4
