Remove punctuation from String?

dfhLASST

What is the best way to remove all non-alphabetic characters (e.g. symbols,
spaces etc.) from a String?

My original plan was to loop over the chars in the String and add each one to
an array if its value is alphabetic (i.e. >=65 and <=122).
I've run into problems with this, and it seems more complex than the problem
should be.

Any suggestions?
 
Michael Borgwardt

dfhLASST said:
What is the best way to remove all non-alphabetic characters (e.g. symbols,
spaces etc.) from a String?

My original plan was to loop over the chars in the String and add each one to
an array if its value is alphabetic (i.e. >=65 and <=122).
I've run into problems with this, and it seems more complex than the problem
should be.

The problem is more complex than you think. Are you absolutely sure that
you're only ever going to process English text? If not, use Character.isLetter()
for the condition.

For the accumulation of the output string, use StringBuffer (I guess that's
where you encountered obvious problems).
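
A minimal sketch of that suggestion, in case it helps. The class and method names (LetterFilter, lettersOnly) are made up for illustration; only Character.isLetter() and StringBuffer come from the advice above.

```java
class LetterFilter {

    // Hypothetical helper: keeps only the chars that Character.isLetter accepts.
    static String lettersOnly(String s) {
        StringBuffer out = new StringBuffer(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isLetter(c)) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(lettersOnly("Hello, World! 123")); // prints HelloWorld
        // Accented letters such as \u00e9 (e with acute) are kept as well.
        System.out.println(lettersOnly("caf\u00e9 au lait?"));
    }
}
```

Note that this works one char at a time, so characters outside the Basic Multilingual Plane (stored as surrogate pairs) would be dropped by this sketch.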
 
Chris Smith

dfhLASST said:
What is the best way to remove all non-alphabetic characters (e.g. symbols,
spaces etc.) from a String?

My original plan was to loop over the chars in the String and add each one to
an array if its value is alphabetic (i.e. >=65 and <=122).
I've run into problems with this, and it seems more complex than the problem
should be.

Any suggestions?

str = str.replaceAll("[^A-Za-z]", "");

or, if you want more than just ASCII characters:

str = str.replaceAll("[^\\p{L}]", "");
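
To show the difference between the two patterns, here is a small side-by-side sketch. The class and method names (RegexStrip, asciiLettersOnly, unicodeLettersOnly) are hypothetical; the two regular expressions are exactly the ones above.

```java
class RegexStrip {

    // Removes everything except the ASCII letters A-Z and a-z.
    static String asciiLettersOnly(String s) {
        return s.replaceAll("[^A-Za-z]", "");
    }

    // Removes everything that is not in the Unicode "letter" category (\p{L}).
    static String unicodeLettersOnly(String s) {
        return s.replaceAll("[^\\p{L}]", "");
    }

    public static void main(String[] args) {
        // \u00ef and \u00e9 are i-diaeresis and e-acute.
        String in = "na\u00efve caf\u00e9, 2 go!";
        System.out.println(asciiLettersOnly(in));   // accented letters are dropped
        System.out.println(unicodeLettersOnly(in)); // accented letters are kept
    }
}
```

The first pattern silently drops accented letters, so it is probably only appropriate when the input is known to be plain English text.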

--
www.designacourse.com
The Easiest Way To Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation
 
dfhLASST

Michael Borgwardt said:
The problem is more complex than you think. Are you absolutely sure that
you're only ever going to process English text? If not, use Character.isLetter()
for the condition.

For the accumulation of the output string, use StringBuffer (I guess that's
where you encountered obvious problems).

Thanks, yeah, I used that.

For future reference, in case it helps anyone else, here is my method:


public String stripPunctuation(String s) {
    StringBuffer sb = new StringBuffer();

    for (int i = 0; i < s.length(); i++) {
        char c = s.charAt(i);
        // Keep only ASCII letters: 'A'-'Z' (65-90) and 'a'-'z' (97-122).
        if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')) {
            sb.append(c);
        }
    }

    return sb.toString();
}
 
Woebegone

Michael Borgwardt said:
8<

The problem is more complex than you think. Are you absolutely sure that
you're only ever going to process English text? If not, use Character.isLetter()
for the condition.

For the accumulation of the output string, use StringBuffer (I guess that's
where you encountered obvious problems).

I've used something like the following in cases where I know the processing
is constrained to a given (relatively small) set of characters, e.g. English
text. It has the advantage of allowing easy extension by adding characters
to ALPHABET without necessarily requiring char codes.

/* */
public class StringCleanser {
    public static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ" +
        "abcdefghijklmnopqrstuvwxyz";

    public static boolean isAlphabetic(char c) {
        return StringCleanser.ALPHABET.indexOf(c) != -1;
    }

    public static String cleanse(String s) {
        StringBuffer buf = new StringBuffer();
        for (int i = 0; i < s.length(); i++) {
            if (StringCleanser.isAlphabetic(s.charAt(i))) {
                buf.append(s.charAt(i));
            }
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        String in = "L e,f.t/o;v'e[r]L1e.2t3t4e ,5r6s7";
        System.out.println(StringCleanser.cleanse(in));
    }
}
/* */
 
