FAQ Topic - How do I trim whitespace - LTRIM/RTRIM/TRIM?

FAQ server

-----------------------------------------------------------------------
FAQ Topic - How do I trim whitespace - LTRIM/RTRIM/TRIM?
-----------------------------------------------------------------------

Using Regular Expressions (JavaScript 1.2/JScript 3+) :

String.prototype.lTrim =
  function()
  {
    return this.replace(/^\s+/,'');
  }
String.prototype.rTrim =
  function()
  {
    return this.replace(/\s+$/,'');
  }
String.prototype.trim =
  function()
  {
    return this.replace(/^\s+|\s+$/g,'');
  }

or for all versions (trims characters ASCII<32 not true
"whitespace"):

function LTrim(str) {
  for (var k=0; k<str.length && str.charAt(k)<=" " ; k++) ;
  return str.substring(k,str.length);
}
function RTrim(str) {
  for (var j=str.length-1; j>=0 && str.charAt(j)<=" " ; j--) ;
  return str.substring(0,j+1);
}
function Trim(str) {
  return LTrim(RTrim(str));
}
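
The two approaches above do not strip exactly the same characters. The following is a minimal sketch of the difference, assuming a reasonably current engine; the helper names trimRegex and trimLoop are made up for this illustration and are not part of the FAQ.

```javascript
// Hypothetical helper names, for illustration only.
function trimRegex(str) {
  return str.replace(/^\s+|\s+$/g, '');
}

function trimLoop(str) {
  var start = 0, end = str.length - 1;
  while (start <= end && str.charAt(start) <= " ") start++;
  while (end >= start && str.charAt(end) <= " ") end--;
  return str.substring(start, end + 1);
}

var nbsp = "\u00A0x\u00A0"; // no-break space: matched by \s, but compares greater than " "
var ctrl = "\u0001x\u0001"; // control character: compares less than " ", but is not matched by \s

var results = [
  trimRegex(nbsp).length, // 1: both no-break spaces removed
  trimLoop(nbsp).length,  // 3: nothing removed
  trimRegex(ctrl).length, // 3: nothing removed
  trimLoop(ctrl).length   // 1: both control characters removed
];
```

Which behaviour is wanted depends on the input being cleaned, so neither version is a drop-in replacement for the other.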

http://docs.sun.com/source/816-6408-10/regexp.htm

http://msdn.microsoft.com/library/d...html/2380d458-3366-402b-996c-9363906a7353.asp

http://en.wikipedia.org/wiki/Regular_expression

http://www.informatics.sussex.ac.uk/courses/it/tutorials/nsJavaScriptRef/contents.htm

http://www.merlyn.demon.co.uk/js-valid.htm


--
Postings such as this are automatically sent once a day. Their
goal is to answer repeated questions, and to offer the content to
the community for continuous evaluation/improvement. The complete
comp.lang.javascript FAQ is at http://jibbering.com/faq/index.html.
The FAQ workers are a group of volunteers. The sendings of these
daily posts are proficiently hosted by http://www.pair.com.
 

Peter Michaux

> -----------------------------------------------------------------------
> FAQ Topic - How do I trim whitespace - LTRIM/RTRIM/TRIM?
> -----------------------------------------------------------------------
>
> Using Regular Expressions (JavaScript 1.2/JScript 3+) :
>
> String.prototype.lTrim =
>   function()
>   {
>     return this.replace(/^\s+/,'');
>   }
> String.prototype.rTrim =
>   function()
>   {
>     return this.replace(/\s+$/,'');
>   }
> String.prototype.trim =
>   function()
>   {
>     return this.replace(/^\s+|\s+$/g,'');
>   }

I don't think the FAQ should be promoting augmenting built in objects.
This example may not be focused on that aspect but it does do it. I
think changing to the following would be beneficial

function lTrim(str) {
  return str.replace(/^\s+/, '');
}
function rTrim(str) {
  return str.replace(/\s+$/, '');
}
function trim(str) {
  return str.replace(/^\s+|\s+$/g, '');
}
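
The post does not say why augmenting built-in objects should be avoided. One commonly cited reason, offered here as an assumption and not as the poster's own argument, is that properties assigned to a built-in prototype become visible to every for-in loop over objects of that type. A small sketch, assuming a current engine:

```javascript
// Augment the built-in prototype, as the original FAQ code does.
String.prototype.lTrim = function () {
  return this.replace(/^\s+/, '');
};

// A plain for-in loop over a String object now also sees the added property.
var seen = [];
for (var key in new String("ab")) {
  seen.push(key);
}
// seen holds the two index keys followed by "lTrim"

// Remove the property again so other code is unaffected.
delete String.prototype.lTrim;
```

Standalone functions such as the ones proposed above avoid this side effect entirely.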

> function LTrim(str) {
>   for (var k=0; k<str.length && str.charAt(k)<=" " ; k++) ;
>   return str.substring(k,str.length);
> }
> function RTrim(str) {
>   for (var j=str.length-1; j>=0 && str.charAt(j)<=" " ; j--) ;
>   return str.substring(0,j+1);
> }
> function Trim(str) {
>   return LTrim(RTrim(str));
> }

It is a well established convention that functions whose names begin
with a capital letter are intended to be used as constructors with the
"new" keyword. I suggest we change these examples to

function lTrim(str) {
  for (var k=0; k<str.length && str.charAt(k)<=" "; k++) ;
  return str.substring(k, str.length);
}
function rTrim(str) {
  for (var j=str.length-1; j>=0 && str.charAt(j)<=" "; j--) ;
  return str.substring(0, j+1);
}
function trim(str) {
  return lTrim(rTrim(str));
}

Note I have also added the extra space after the commas as I too think
that is easier to read but without the capitalization change would not
warrant editing the FAQ.

Peter
 

Dr J R Stockton

In comp.lang.javascript, Thu, 1 Nov 2007 22:50:00, Peter Michaux wrote:

> function lTrim(str) {
>   for (var k=0; k<str.length && str.charAt(k)<=" "; k++) ;
>   return str.substring(k, str.length);
> }
> function rTrim(str) {
>   for (var j=str.length-1; j>=0 && str.charAt(j)<=" "; j--) ;
>   return str.substring(0, j+1);
> }

It is often considered good practice for the terminal condition of a FOR
loop to be constant during that loop, to reduce confusion. Therefore
(undertested) :-

function lTrim(str) {
  var k = 0;
  while (k < str.length && str.charAt(k) <= " ") k++;
  return str.substring(k, str.length);
}

function rTrim(str) {
  var k = str.length - 1;
  while (k >= 0 && str.charAt(k) <= " ") k--;
  return str.substring(0, k + 1);
}

Functions, except maybe when one-liners, should be separated by vertical
whitespace, for legibility. And k is better than j.

The character l should not be used where it might be, even momentarily,
thought to be a 1. Therefore, the names might be changed to trimL and
for symmetry trimR, matching trim in case. To facilitate searching for
the identifier, I suggest trim be renamed to trimX (X for eXtremes).

And "ASCII<32" is wrong, since the characters are Unicode.
 

Thomas 'PointedEars' Lahn

Peter said:
> [LTrim(), RTrim(), Trim()]
>
> It is a well established convention that functions whose names begin
> with a capital letter are intended to be used as constructors with the
> "new" keyword. I suggest we change these examples to
>
> function lTrim(str) {
>   for (var k=0; k<str.length && str.charAt(k)<=" "; k++) ;
>   return str.substring(k, str.length);
> }
> function rTrim(str) {
>   for (var j=str.length-1; j>=0 && str.charAt(j)<=" "; j--) ;
>   return str.substring(0, j+1);
> }

I don't think it is necessary anymore for new scripts to prefer this rather
inefficient and incomplete approach instead of Regular Expression matching.
As I have mentioned before, Regular Expression support was introduced with
JavaScript 1.2 (Netscape 4.0), JScript 3.0 (IE 4.0), and standardized with
ECMAScript Ed. 3; that dates back to 1997-06, 1997-10, and 1999-12 CE,
respectively. Therefore, the FAQ should either recommend the RegExp
approach or prefer that approach and provide for a non-RegExp alternative
through a run-time feature test. See also [1].

As for the identifiers, I second your reasoning. However, `lTrim' could be
displayed ambiguously. Therefore, I use `trim', `trimLeft', and `trimRight'
in my string.js [1], and I suggest we also do so in the FAQ.
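
As an illustration of the naming suggested here, a minimal sketch of standalone RegExp-based functions might look as follows. This is not taken from string.js, whose contents are not shown in this thread; only the three identifiers come from the post, and the String(str) coercion is an added assumption.

```javascript
// Sketch only: standalone functions using the suggested names.
function trimLeft(str) {
  return String(str).replace(/^\s+/, '');
}

function trimRight(str) {
  return String(str).replace(/\s+$/, '');
}

function trim(str) {
  return String(str).replace(/^\s+|\s+$/g, '');
}

var sample = "  \t padded \n ";
var left = trimLeft(sample);   // "padded \n "
var right = trimRight(sample); // "  \t padded"
var both = trim(sample);       // "padded"
```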


PointedEars
___________
[1] http://PointedEars.de/scripts/string.js
 

Peter Michaux

Regular expression literals will cause syntax errors in old browsers
so the feature test, regexp and fallback versions within a single
function won't work. The function won't even be defined. The fallback
and literal versions could be loaded in different script elements so
that the new regexp version clobbers the old version. If the RegExp
constructor function was used, the syntax error problem would be avoided
back to an earlier generation of browsers.
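
The RegExp constructor idea could be sketched roughly as below. This is an illustration only: the typeof test and the merged fallback loop are assumptions made here, and the code has not been tried in the old browsers under discussion.

```javascript
// Sketch: one function that feature-tests for RegExp at run time.
// The pattern is passed as a string to the constructor, so the source
// contains no regexp literal that an old parser could reject.
function trim(str) {
  if (typeof RegExp != "undefined") {
    // Backslashes are doubled because the pattern is a string literal.
    return str.replace(new RegExp("^\\s+|\\s+$", "g"), "");
  }
  // Fallback: loop-based trimming in the style of the FAQ's second version.
  var start = 0, end = str.length - 1;
  while (start <= end && str.charAt(start) <= " ") start++;
  while (end >= start && str.charAt(end) <= " ") end--;
  return str.substring(start, end + 1);
}

var trimmed = trim("  hello world \t\n"); // "hello world"
```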

I think the old non-regexp versions could be removed from the FAQ. It
may serve as a nice introduction to feature testing but its
irrelevance to browsers in use today may be a deterrent to a new
browser script programmer to even bother reading it seriously.

Peter
 

Dr J R Stockton

In comp.lang.javascript, Mon, 5 Nov 2007 22:03:28, Peter Michaux wrote:

> ...
> The fallback
> and literal versions could be loaded in different script elements so
> that the new regexp version clobbers the old version. If the RegExp
> constructor function was used, the syntax error problem would be avoided
> back to an earlier generation of browsers.

Unless the RegExp method will be **significantly** faster, for probable
browser speeds and string trimmings, there is NO point in doing any form
of feature testing on RegExp and falling back to old-style code if
needed. If the old-style code is being provided, then use it and don't
bother with the test. Just add a comment:
// For ECMA>=3, can use RegExp.


Given that ECMA 3 was published in December 1999, and that its contents
were presumably no great surprise to the better browser writers, ISTM
that there should now be a <FAQENTRY> to the effect that the FAQ only
supports browsers that implement everything in ECMA 3, though it will
endeavour to handle any known bugs in that implementation.

For example, code in the FAQ might use default toFixed, but only in
conditions where it is confidently believed that all implementations
work correctly.

AIUI, one can safely feature-test whether new RegExp("a??") is allowed.
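
A rough sketch of such a test follows. The function name supportsNonGreedy is hypothetical, and the sketch assumes that an engine lacking the non-greedy quantifier throws when the pattern is constructed, which has not been verified here.

```javascript
// Sketch: returns true when the engine accepts the non-greedy quantifier "??".
// Note that try/catch is itself ECMAScript 3 syntax, so this source would
// not parse in a pre-ES3 engine; that limitation is accepted for brevity.
function supportsNonGreedy() {
  try {
    new RegExp("a??");
    return true;
  } catch (e) {
    return false;
  }
}

var hasNonGreedy = supportsNonGreedy();
```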
 

Thomas 'PointedEars' Lahn

Peter said:
> Thomas 'PointedEars' Lahn said:
>> [LTrim(), RTrim(), Trim()]
>> As I have mentioned before, Regular Expression support was introduced with
>> JavaScript 1.2 (Netscape 4.0), JScript 3.0 (IE 4.0), and standardized with
>> ECMAScript Ed. 3; that dates back to 1997-06, 1997-10, and 1999-12 CE,
>> respectively. Therefore, the FAQ should either recommend the RegExp
>> approach or prefer that approach and provide for a non-RegExp alternative
>> through a run-time feature test. See also [1].
>
> Regular expression literals will cause syntax errors in old browsers

I did not say Regular Expression literals would have to be used. However,
this literal syntax was introduced with the aforementioned language
versions, and standardized with the aforementioned edition of the
specification.

> so the feature test, regexp and fallback versions within a single
> function won't work.

They could be made to work using eval(), but I would consider that
error-prone overkill.

> The function won't even be defined. The fallback and literal versions
> could be loaded in different script elements so that the new regexp version
> clobbers the old version.

Since that would require MSHTML to support a versioning scheme in the
`language' attribute of the `script' element generally, I doubt this would
be feasible.

> If the RegExp constructor function was used, the syntax error problem would
> be avoided back to an earlier generation of browsers.

I daresay there are no browsers that support the RegExp() constructor but
don't support Regular Expression literals; that constructor was introduced
with JavaScript 1.2 and JScript 3.0 as well. But if you can provide a test
case that proves me wrong I will update the ECMAScript Support Matrix
accordingly.


PointedEars
 

Peter Michaux

I just meant that using the RegExp constructor wouldn't cause a syntax
error like a regexp literal would. So by using the RegExp constructor
a feature test could be used in a function. If a regexp literal is
used then there will be a syntax error and the function will not even
be defined.

Peter
 

Thomas 'PointedEars' Lahn

Peter said:
> Thomas 'PointedEars' Lahn said:
>> Peter said:
>>> Regular expression literals will cause syntax errors in old browsers
>>
>> I did not say Regular Expression literals would have to be used. However,
>> this literal syntax was introduced with the aforementioned language
>> versions, and standardized with the aforementioned edition of the
>> specification.
>>
>>> so the feature test, regexp and fallback versions within a single
>>> function won't work.
>>
>> They could be made to work using eval(), but I would consider that error-prone
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> overkill.
   ^^^^^^^^^

[...]

> I just meant that using the RegExp constructor wouldn't cause a syntax
> error like a regexp literal would. So by using the RegExp constructor
> a feature test could be used in a function. If a regexp literal is
> used then there will be a syntax error and the function will not even
> be defined.

See above.


PointedEars
 
