String processing question - char set related

Discussion in 'Java' started by XXX, Nov 22, 2006.

  1. XXX

    XXX Guest

    If I have a string containing \r\n and I want to convert every \r\n
    to \n, is code like this good enough?


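    String s = "line one\r\nline two\r\n"; // the original string (sample value)

    StringBuffer old = new StringBuffer(s);
    StringBuffer result = new StringBuffer(); // "new" is a reserved word, so it cannot be a variable name

    for (int i = 0; i < old.length(); ++i)
    {
        // Copy every character except '\r'; this also drops a lone '\r'.
        if (old.charAt(i) != '\r')
        {
            result.append(old.charAt(i));
        }
    }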

    Or are there problems with this? I am thinking of problems
    like
    - Is it possible to have strings with just \r not followed by \n.
    When can this happen?

    - Is it possible for some Unicode chars to have the \r\n pattern
    which doesn't represent a new line?
     
    XXX, Nov 22, 2006
    #1

  2. XXX

    Chris Smith Guest

    XXX wrote:
    > Or are there problems with this? I am thinking of problems
    > like
    > - Is it possible to have strings with just \r not followed by \n.
    > When can this happen?


    Yes. There are platforms where \r is the standard representation of
    end-of-line. If you need to handle end of line sequences across a
    number of common platforms, then it would be safer to wrap a
    StringReader with a BufferedReader, and then use readLine to get the
    lines.
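
A minimal sketch of the readLine approach described above. The class and method names (LineNormalizer, normalize) are made up for illustration. Note one assumption baked in: readLine discards the terminator, so this version joins lines with \n and does not keep a trailing line terminator.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper, not from the thread: rewrites \r\n, lone \r and \n as \n.
final class LineNormalizer {
    static String normalize(String s) {
        StringBuilder out = new StringBuilder(s.length());
        try (BufferedReader reader = new BufferedReader(new StringReader(s))) {
            String line;
            boolean first = true;
            // readLine treats \n, \r, and \r\n as line terminators.
            while ((line = reader.readLine()) != null) {
                if (!first) {
                    out.append('\n');
                }
                out.append(line);
                first = false;
            }
        } catch (IOException e) {
            // A StringReader should not fail here; rethrow unchecked just in case.
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String mixed = "a\r\nb\rc\nd";
        System.out.println(normalize(mixed).equals("a\nb\nc\nd"));
    }
}
```

If the trailing terminator matters, it would have to be detected and re-appended separately.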

    Alternatively, you may be reading from some protocol where the end of
    line sequence is specified; for example, it's required to be \r\n for
    many common internet application protocols. Then you could just look
    for that one sequence and replace it with \n if that's what you want.
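
When only the \r\n sequence needs handling, a sketch along these lines may be enough. It uses String.replace; the class and method names are invented for the example. Unlike the loop in the original question, it leaves a lone \r untouched.

```java
// Hypothetical helper, not from the thread: replaces only the two-character sequence \r\n.
final class CrLfToLf {
    static String convert(String s) {
        // String.replace(CharSequence, CharSequence) replaces every literal occurrence.
        return s.replace("\r\n", "\n");
    }

    public static void main(String[] args) {
        String input = "a\r\nb\rc";
        // The lone \r between b and c is kept as-is.
        System.out.println(convert(input).equals("a\nb\rc"));
    }
}
```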

    > - Is it possible for some Unicode chars to have the \r\n pattern
    > which doesn't represent a new line?


    It's safe to assume that \r\n indicates a newline whenever you find it.
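
One way to see why this holds for a Java String: a char is a whole 16-bit UTF-16 code unit, so comparisons never look at individual bytes. The sketch below uses the code point U+0D0A (in the Malayalam block), whose UTF-16BE bytes are 0x0D 0x0A, as an illustration. The class and method names are made up. The byte-level coincidence could only matter when scanning raw encoded bytes, not a decoded String.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo, not from the thread.
final class CrLfInUnicode {
    // True if the string contains a carriage return or a line feed character.
    static boolean containsCrOrLf(String s) {
        return s.indexOf('\r') >= 0 || s.indexOf('\n') >= 0;
    }

    // Encodes the string as big-endian UTF-16 without a byte order mark.
    static byte[] utf16be(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        String s = "\u0D0A"; // a single char with value 0x0D0A
        byte[] bytes = utf16be(s);
        System.out.println(s.length());        // one char
        System.out.println(containsCrOrLf(s)); // no CR or LF char in the String
        System.out.println(bytes[0] == 0x0D && bytes[1] == 0x0A); // bytes do look like CR LF
    }
}
```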

    --
    Chris Smith
     
    Chris Smith, Nov 22, 2006
    #2

  3. XXX

    XXX Guest

    Chris Smith wrote:
    > XXX wrote:
    >> Or are there problems with this? I am thinking of problems
    >> like
    >> - Is it possible to have strings with just \r not followed by \n.
    >> When can this happen?

    >
    > Yes. There are platforms where \r is the standard representation of
    > end-of-line. If you need to handle end of line sequences across a
    > number of common platforms, then it would be safer to wrap a
    > StringReader with a BufferedReader, and then use readLine to get the
    > lines.
    >
    > Alternatively, you may be reading from some protocol where the end of
    > line sequence is specified; for example, it's required to be \r\n for
    > many common internet application protocols. Then you could just look
    > for that one sequence and replace it with \n if that's what you want.


    This is going to be text that I get from an AWT TextArea widget by
    calling getText().

    Are these issues relevant in this case?

    >> - Is it possible for some Unicode chars to have the \r\n pattern
    >> which doesn't represent a new line?

    >
    > It's safe to assume that \r\n indicates a newline whenever you find
    > it.
     
    XXX, Nov 22, 2006
    #3
  4. XXX

    Daniel Pitts Guest

    XXX wrote:
    > Chris Smith wrote:
    > > XXX wrote:
    > >> Or are there problems with this? I am thinking of problems
    > >> like
    > >> - Is it possible to have strings with just \r not followed by \n.
    > >> When can this happen?

    > >
    > > Yes. There are platforms where \r is the standard representation of
    > > end-of-line. If you need to handle end of line sequences across a
    > > number of common platforms, then it would be safer to wrap a
    > > StringReader with a BufferedReader, and then use readLine to get the
    > > lines.
    > >
    > > Alternatively, you may be reading from some protocol where the end of
    > > line sequence is specified; for example, it's required to be \r\n for
    > > many common internet application protocols. Then you could just look
    > > for that one sequence and replace it with \n if that's what you want.

    >
    > This is going to be text that I get from an AWT TextArea widget by
    > calling getText().
    >
    > Are these issues relevant in this case?
    >
    > >> - Is it possible for some Unicode chars to have the \r\n pattern
    > >> which doesn't represent a new line?

    > >
    > > It's safe to assume that \r\n indicates a newline whenever you find
    > > it.


    I would bet that you don't need to worry about it in this case. A few
    simple tests will let you know.
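
One possible shape for such a test, as a sketch: count which line terminators actually appear in a string. The class and method names are invented here. In practice the argument would be whatever textArea.getText() returns on the platforms of interest; the sample string in main is only a stand-in.

```java
// Hypothetical helper, not from the thread.
final class LineEndingCounter {
    // Returns {count of \r\n, count of lone \r, count of lone \n}.
    static int[] count(String s) {
        int crlf = 0, cr = 0, lf = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\r') {
                if (i + 1 < s.length() && s.charAt(i + 1) == '\n') {
                    crlf++;
                    i++; // skip the \n that belongs to this \r\n pair
                } else {
                    cr++;
                }
            } else if (c == '\n') {
                lf++;
            }
        }
        return new int[] { crlf, cr, lf };
    }

    public static void main(String[] args) {
        // Stand-in for text obtained from TextArea.getText().
        int[] counts = count("a\r\nb\rc\nd\r\n");
        System.out.println("CRLF=" + counts[0] + " CR=" + counts[1] + " LF=" + counts[2]);
    }
}
```

A nonzero count for lone \r would mean the strip-every-\r loop from the first post joins lines that were meant to be separate.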
     
    Daniel Pitts, Nov 22, 2006
    #4