Zero terminated strings

jacob navia

Zero terminated strings are a continuing security nightmare.

Slashdot reports this today:

"Two researchers, Dan Kaminsky and Moxie Marlinspike, came up with the exact
same way to fake being a popular website with authentication from a
certificate authority.

Wired has the details: 'When an attacker who owns his own domain —
badguy.com — requests a certificate from the CA, the CA, using contact
information from Whois records, sends him an email asking to confirm his
ownership of the site. But an attacker can also request a certificate
for a subdomain of his site, such as Paypal.com\0.badguy.com, using the
null character \0 in the URL.

The CA will issue the certificate for a domain like
PayPal.com\0.badguy.com because the hacker legitimately owns the root
domain badguy.com. Then, due to a flaw found in the way SSL is
implemented in many browsers, Firefox and others theoretically can be
fooled into reading his certificate as if it were one that came from the
authentic PayPal site. Basically when these vulnerable browsers check
the domain name contained in the attacker's certificate, they stop
reading any characters that follow the "\0" in the name.'"

And still we will hear the same old arguments from the same
people again and again...

There is nothing wrong

C is like that

etc etc.

(Note that C++ uses zero terminated strings too)
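
The failure described in the Wired quote can be sketched in a few lines of C. This is only an illustration under assumed names: name_matches_cstring and name_matches_counted are hypothetical helpers, not code taken from any browser or SSL library. The first treats the certificate name as a zero terminated string; the second uses the length that travels with the name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the flawed check. strcmp() stops at the first
   zero byte, so everything after "\0" in the certificate name is ignored. */
bool name_matches_cstring(const char *cert_name, const char *host)
{
    assert(cert_name != NULL && host != NULL);
    return strcmp(cert_name, host) == 0;
}

/* Hypothetical helper: a length-aware check. The caller passes the length
   stored alongside the name, so embedded zero bytes are compared as well. */
bool name_matches_counted(const char *cert_name, size_t cert_len,
                          const char *host)
{
    size_t host_len;

    assert(cert_name != NULL && host != NULL);
    host_len = strlen(host);
    return cert_len == host_len && memcmp(cert_name, host, cert_len) == 0;
}
```

Given the 22-byte name "paypal.com\0.badguy.com", the first helper reports a match against "paypal.com" and the second does not.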
 
jacob navia

I found this in slashdot too

Two strings walk into a bar.

The first string says to the bartender, "Give me a beer." The bartender
turns to the second string and says, "and what about for you?" To which
the second string replies, "I would also like a
beer#@a9101gb230b81;kajf3#$B89*#()*13!$%#@$"" and goes on and on spewing
gibberish.

The bartender, shocked, asks the first string, "What is your buddy's
problem?"

The first string answers, "Oh, you'll have to excuse him, he isn't null
terminated."
 
spinoza1111

jacob navia said:





I haven't read RFC 4210 in detail, but either this is a legitimate
request or it isn't. If it is (which is unlikely), then the CA is
right to honour it. Otherwise, there's your problem right there.


If the protocol allows embedded null characters in the domain name
(and I'm reasonably sure it doesn't), then the browsers are broken.
Solution: fix them.



Of course there's something wrong. Either the CA is issuing
certificates that it shouldn't issue, or the browser is failing to
parse certificate information correctly. Solution: fix it.

Changing C's definition of strings is not going to solve the problem
of programmers being careless.

Richard, the assertion that "every programmer except me is careless"
is a sign of a narcissistic personality disorder. You believe that YOU
will never write buggy code and to shore up this narcissistic belief,
you must constantly destroy others for example by conducting
conversations about their defects in their virtual presence with
others; you know they will see your words but you're not man enough to
say it to them directly. You make it impossible for others to form
their own judgements by saying, in effect, that if they like the
contributions of the other poster this says something about them.

Read C. Wright Mills' White Collar. Each little white collar employee
believes himself special, and blames his workmates for problems rather
than pitching in to fix the problem as do most blue-collar employees.
Rather than take responsibility for his own poor decisions, such as
your absurd positions on C, he hypothesizes that "out there" there are
any number of "incompetent programmers", "welfare queens", "liberals"
or "Jews" who create all the problems...who did not foresee, here, as
you foresee with the benefit of hindsight, the problem Navia so
intelligently describes, and whose livelihoods and reputation must
therefore be sacrificed to your narcissism.

You're a twat and a **** of the first water, and a Mean Girl.
 
jacob navia

Eric said:
jacob said:
Zero terminated strings are a continuing security nightmare.
[...]

Even if we accept the thesis as proven (which I don't, but
let it pass), what remedy or other action would you suggest?
Have you anything to offer other than a complaint? Anything
constructive, for instance?

I have proposed (and implemented) a full replacement of zero
terminated strings in my C compilation system lcc-win.

I distribute all the source code for those strings with
every distribution of lcc-win.

I have presented that solution here and I have advocated its
usage both here and comp.std.c.

That implementation uses operator overloading to use the
[] operator to index the strings. It is trivial then
to port from the old string package to the new strings.

I have rewritten all functions of the standard library using those
strings.

strcmp --> Strcmp

strcmp uses zero terminated strings.
Strcmp uses counted strings.

strcpy --> Strcpy

and so on.
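
For readers without lcc-win at hand, the idea of a counted comparison can be sketched in standard C. The names counted_str, counted_make and counted_cmp are hypothetical and chosen for this illustration only; this is not the lcc-win source, and it uses plain function calls rather than operator overloading.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counted string: an illustration, not the lcc-win type. */
typedef struct {
    size_t      len;   /* number of bytes in data, zero bytes included */
    const char *data;  /* need not be zero terminated */
} counted_str;

/* Build a counted string from a buffer and an explicit length. */
counted_str counted_make(const char *data, size_t len)
{
    counted_str s;

    assert(data != NULL);
    s.len = len;
    s.data = data;
    return s;
}

/* Compare bytewise; returns <0, 0 or >0 in the manner of strcmp. */
int counted_cmp(counted_str a, counted_str b)
{
    size_t n = a.len < b.len ? a.len : b.len;
    int r = memcmp(a.data, b.data, n);

    if (r != 0)
        return r;
    /* The common prefix is equal: the shorter string sorts first. */
    return (a.len > b.len) - (a.len < b.len);
}
```

Because the length is explicit, two buffers that differ only after an embedded zero byte compare as different here, whereas strcmp would call them equal.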

I have repeated the arguments for using a more advanced set of
strings here many times, and each time I have received
the same answers from Heathfield and co.:

o operator overloading is a sacrilege, etc etc

o zero terminated strings are OK if you aren't a stupid programmer

o etc etc
 
Paul Hsieh

jacob said:
Zero terminated strings are a continuing security nightmare.
[...]

     Even if we accept the thesis as proven (which I don't, but
let it pass), what remedy or other action would you suggest?

You are kidding right? 1) Deprecate all use of the C library w.r.t.
string processing. 2) Use *any other* programming language to handle
string processing throughout the chain of tools described (even C++'s
STL would probably have worked.) These are the obvious and implicit
proposals whenever you see this sort of thing.

Or you can wait for all the programs to "call home" and download
automatic updates that might fix your problem. My, that's one mighty
overused band-aid you've got there ...

Have you anything to offer other than a complaint?  Anything
constructive, for instance?

You have never heard of jacob navia making proposals to modify or
change the way we program in the C language? Are you new to
comp.lang.c or something?
 
Lew Pitcher

Zero terminated strings are a continuing security nightmare.
[snip]

What bugs me the most about this thread (and many others) is that it pushes
a single person's (or in this case, a single vendor's) agenda, using
arguments that have nothing to do with the proper use of the C language.

The 'fatal flaw' of DNS that you refer to could be avoided through
proper editing of the incoming string. The RFCs do not dictate which
language should be used to implement DNS services, and such a service could
as easily be built in COBOL or BASH as it could in C. Your argument that "C
zero-terminated strings are bad because DNS suffers an exposure from
improperly validated strings that may contain NULL characters" is
irrelevant and argumentative.

Jacob, you are welcome to design a new language, and transition your
lcc-win32 to it. Please do so, with my blessing. And take your
argumentative stance elsewhere. I'm tired of hearing you bleat "C is
baaaaaaaaaad" in this forum. I'm tired of hearing you bleat "anyone who
disagrees with me is baaaaaaad" in this forum.

If you have a problem with an existing C program, please bring it up here.
If you have a problem with the C standard, bring it up in comp.std.c.
If you have a problem with the C language, write your own.

--
Lew Pitcher

Master Codewright & JOAT-in-training | Registered Linux User #112576
http://pitcher.digitalfreehold.ca/ | GPG public key available by request
---------- Slackware - Because I know what I'm doing. ------
 
Paul Hsieh

jacob navia said:

Null-terminated strings have advantages and disadvantages; counted
strings, too, have advantages and disadvantages.

It is completely lopsided. Counted strings are faster and safer, and
open up functionality (zero copy read-only substrings) not available
to the NUL terminated strings.
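
As a rough sketch of the zero copy substring point: with a pointer plus a length, a substring is just another pointer and length into the same bytes. The names str_view, view_of, view_substr and view_eq are hypothetical and are not the bstring API; a zero terminated substring in the middle of a buffer would instead need a copy or a write into the source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical read-only view of bytes owned by someone else. */
typedef struct {
    const char *data;
    size_t      len;
} str_view;

/* Wrap an existing buffer; nothing is copied. */
str_view view_of(const char *data, size_t len)
{
    str_view v;

    assert(data != NULL);
    v.data = data;
    v.len = len;
    return v;
}

/* Sub-range [pos, pos + n), clamped to the source; nothing is copied. */
str_view view_substr(str_view s, size_t pos, size_t n)
{
    if (pos > s.len)
        pos = s.len;
    if (n > s.len - pos)
        n = s.len - pos;
    return view_of(s.data + pos, n);
}

/* Two views are equal when lengths and bytes agree. */
bool view_eq(str_view a, str_view b)
{
    return a.len == b.len && memcmp(a.data, b.data, a.len) == 0;
}
```

The view stays valid only as long as the underlying buffer does, which is the usual price of not copying.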
One major advantage
of null-terminated strings is that support for them is built into the
standard library.

That was only an advantage in the absence of an alternative that's as
good or better. The standard library for many compilers is closed
source, and even if they aren't you still don't know if they are
correct unless you study or test them independently yourself
(presumably gcc's test suite is open source, and therefore would be an
exception.) Something like my library ( http://bstring.sf.net/ ) is
immediately superior to the standard C library for at least this
reason.

But more importantly use of the C standard library at all itself is a
*SOURCE* of bugs. That's the *WHOLE POINT* of what this episode is
about. Find me a single case, where any other programming language is
similarly exploitable. If you at least use a different string library
in C (mine is not the only one) you are usually better off.

The fact that it's ubiquitous is not an advantage if it carries its
error-prone API around with it, because a vendor cannot even *make*
it better while still conforming to the standard. It's even driven
Microsoft to issue diagnostics just for using it (their solution is a
bit wrong-headed, but at least they are doing something).
 
jacob navia

Lew said:
Zero terminated strings are a continuing security nightmare.
[snip]

What bugs me the most about this thread (and many others) is that it pushes
a single person's (or in this case, a single vendor's) agenda, using
arguments that have nothing to do with the proper use of the C language.

As I said elsewhere, Mr Pitcher, the problem is that proper use of
zero terminated strings is impossible. It is an endless source of
bugs.

And I just added ONE more bug to the endless list of bugs
that a badly designed data structure has brought and will bring to us.

The 'fatal flaw' of DNS that you refer to could be avoided through
proper editing of the incoming string.

All bugs can be avoided, Mr Pitcher. What you fail to understand
is that a badly designed data structure makes bugs impossible to
avoid in the long run!

The RFCs do not dictate which
language should be used to implement DNS services, and such a service could
as easily be built in COBOL or BASH as it could in C. Your argument that "C
zero-terminated strings are bad because DNS suffers an exposure from
improperly validated strings that may contain NULL characters" is
irrelevant and argumentative.

Irrelevant?

Hardly.

Jacob, you are welcome to design a new language, and transition your
lcc-win32 to it. Please do so, with my blessing. And take your
argumentative stance elsewhere.

Why would I take it anywhere else?
Isn't this a forum for discussing the C language?

I'm tired of hearing you bleat "C is
baaaaaaaaaad" in this forum. I'm tired of hearing you bleat "anyone who
disagrees with me is baaaaaaad" in this forum.

I did not say anything like that. You are just showing that you haven't
got any arguments, Mr Pitcher. If you are tired of my messages you
can set up a killfile with me inside and be happy.

This is an open forum and I can say what I want to say. Instead of
polemic you could show us a single argument in favor of zero
terminated strings. Ahh, you have none!

OK. Then I understand the polemic.

If you have a problem with an existing C program, please bring it up here.

I have a problem with all existing C programs that are buggy because of
a badly designed data structure, Mr Pitcher. Is that clear to you?

If you have a problem with the C standard, bring it up in comp.std.c.

I have already done that.

If you have a problem with the C language, write your own.

And yes, I have proposed changing the language so that counted
strings are ALSO a part of the language.
 
Lew Pitcher

Addendum:

==DNS==

The "Internet Standards" document for the basics of the Domain Name System
(STD0013) does not recognize imbedded zero characters in domain names. The
BNF definition of a legal domain name is:
<domain> ::= <subdomain> | " "
<subdomain> ::= <label> | <subdomain> "." <label>
<label> ::= <letter> [ [ <ldh-str> ] <let-dig> ]
<ldh-str> ::= <let-dig-hyp> | <let-dig-hyp> <ldh-str>
<let-dig-hyp> ::= <let-dig> | "-"
<let-dig> ::= <letter> | <digit>
<letter> ::= any one of the 52 alphabetic characters A through Z in
upper case and a through z in lower case
<digit> ::= any one of the ten digits 0 through 9

A) Any application that queries domain names should pre-edit/validate the
name to this pattern. If they had done so, then the "imbedded zero" DNS
flaw would be moot.

B) All DNS registrars should pre-edit/validate names to this pattern. Names
that fail to validate should be rejected.

C) All DNS servers should pre-edit/validate requests to this pattern.
Requests that fail to validate should be rejected.
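
A minimal sketch of such a pre-edit, written against the grammar quoted above, might look like the following. The function names valid_label and valid_domain are hypothetical; the sketch assumes an ASCII execution character set, ignores the " " alternative of <domain>, and enforces no length limits. Note that the name is passed with an explicit length, so an embedded zero byte is seen and rejected instead of silently ending the scan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool is_letter(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

static bool is_let_dig(unsigned char c)
{
    return is_letter(c) || (c >= '0' && c <= '9');
}

/* <label> ::= <letter> [ [ <ldh-str> ] <let-dig> ] */
static bool valid_label(const char *s, size_t len)
{
    size_t i;

    if (len == 0 || !is_letter((unsigned char)s[0]))
        return false;
    if (!is_let_dig((unsigned char)s[len - 1]))
        return false;
    /* Interior characters may be letters, digits or hyphens. */
    for (i = 1; i + 1 < len; i++) {
        unsigned char c = (unsigned char)s[i];
        if (!is_let_dig(c) && c != '-')
            return false;
    }
    return true;
}

/* <subdomain> ::= <label> | <subdomain> "." <label> */
bool valid_domain(const char *s, size_t len)
{
    size_t start = 0, i;

    assert(s != NULL);
    for (i = 0; i <= len; i++) {
        if (i == len || s[i] == '.') {
            if (!valid_label(s + start, i - start))
                return false;
            start = i + 1;
        }
    }
    return true;
}
```

With this check, the 22-byte name "paypal.com\0.badguy.com" fails validation because the label "com\0" does not end in a letter or digit.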


==URL==

The standards document for the basics of the URL naming convention does not
recognize imbedded zero characters in URL strings. The BNF definition of a
legal URL is:

prefixedurl      u r l : url
url              httpaddress | ftpaddress | newsaddress |
                 nntpaddress | prosperoaddress | telnetaddress |
                 gopheraddress | waisaddress |
                 mailtoaddress | midaddress | cidaddress
scheme           ialpha
httpaddress      h t t p : / / hostport [ / path ] [ ? search ]
ftpaddress       f t p : / / login / path [ ftptype ]
afsaddress       a f s : / / cellname / path
newsaddress      n e w s : groupart
nntpaddress      n n t p : group / digits
midaddress       m i d : addr-spec
cidaddress       c i d : content-identifier
mailtoaddress    m a i l t o : : xalphas @ hostname
waisaddress      waisindex | waisdoc
waisindex        w a i s : / / hostport / database [ ? search ]
waisdoc          w a i s : / / hostport / database / wtype / wpath
wpath            digits = path ; [ wpath ]
groupart         * | group | article
group            ialpha [ . group ]
article          xalphas @ host
database         xalphas
wtype            xalphas
prosperoaddress  prosperolink
prosperolink     p r o s p e r o : / / hostport / hsoname
                 [ % 0 0 version [ attributes ] ]
hsoname          path
version          digits
attributes       attribute [ attributes ]
attribute        alphanums
telnetaddress    t e l n e t : / / login
gopheraddress    g o p h e r : / / hostport [ / gtype [ gcommand ] ]
login            [ user [ : password ] @ ] hostport
hostport         host [ : port ]
host             hostname | hostnumber
ftptype          A formcode | E formcode | I | L digits
formcode         N | T | C
cellname         hostname
hostname         ialpha [ . hostname ]
hostnumber       digits . digits . digits . digits
port             digits
gcommand         path
path             void | segment [ / path ]
segment          xpalphas
search           xalphas [ + search ]
user             alphanum2 [ user ]
password         alphanum2 [ password ]
fragmentid       xalphas
gtype            xalpha
alphanum2        alpha | digit | - | _ | . | +
xalpha           alpha | digit | safe | extra | escape
xalphas          xalpha [ xalphas ]
xpalpha          xalpha | +
xpalphas         xpalpha [ xpalphas ]
ialpha           alpha [ xalphas ]
alpha            a | b | c | d | e | f | g | h | i | j | k |
                 l | m | n | o | p | q | r | s | t | u | v |
                 w | x | y | z | A | B | C | D | E | F | G |
                 H | I | J | K | L | M | N | O | P | Q | R |
                 S | T | U | V | W | X | Y | Z
digit            0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
safe             $ | - | _ | @ | . | & | + | -
extra            ! | * | " | ' | ( | ) | ,
reserved         = | ; | / | # | ? | : | space
escape           % hex hex
hex              digit | a | b | c | d | e | f | A | B | C |
                 D | E | F
national         { | } | vline | [ | ] | \ | ^ | ~
punctuation      < | >
digits           digit [ digits ]
alphanum         alpha | digit
alphanums        alphanum [ alphanums ]
void

Any application that accepts URLs should pre-edit/validate the string
against the above definition and reject validation failures as incorrect
URL strings.

What the OP complains about (his direct complaint) is the result of a
failure to validate, and that can happen in any language.

The OP's /agenda/ seems to be to cause enough dissent that some developers
who use the C language will switch to his quasi-C language variant,
supported by software which he sells.


--
Lew Pitcher

Master Codewright & JOAT-in-training | Registered Linux User #112576
http://pitcher.digitalfreehold.ca/ | GPG public key available by request
---------- Slackware - Because I know what I'm doing. ------
 
Richard Bos

jacob navia said:
Slashdot reports this today:

You keep coming up with the best jokes, jacob.

Slashdot and serious reporting... good one, good one.

Richard
 
jacob navia

Lew Pitcher wrote:

[snip]
What the OP complains about (his direct complaint) is the result of a
failure to validate, and that can happen in any language.

Yes bugs can happen in any language.

Especially in C, using zero terminated strings, which are a badly designed
data structure.

Normal software developers and the associated community should take care
of modifying the corresponding data structures to avoid constructs
that have proved error-prone for decades!

What you refuse to acknowledge is that a change in the language is
needed.

The OP's /agenda/ seems to be to cause enough dissent that some developers
who use the C language will switch to his quasi-C language variant,
supported by software which he sells.

This is just bad faith, since you haven't got any arguments for zero
terminated string support.

(1) I have been arguing ALL ALONG that a change in THE LANGUAGE is
needed. I developed a first implementation and argued in comp.std.c
(where you argued AGAINST those changes) that the language needs
to be modified so that counted strings are accepted in C.

This is completely the OPPOSITE of what you are saying. I am precisely
trying to CHANGE the language and not trying to promote my first
implementation as such. I have always said that it is a proof of
concept, nothing else.

(2) I am not selling anything there. lcc-win (and the source code
of the string library that it contains) are distributed free of charge.

You can read?

In case you don't, here is it again in BIG LETTERS (so that you can take
your glasses off if you want)

FREE OF CHARGE!

Of course you will say that somehow I earn my living indirectly with my
work, but then Linux is a commercial enterprise too, since Red Hat and Suse
make money with it, including Linus.


You have just BAD FAITH because you have NO ARGUMENTS to support a
badly designed data structure that should have been changed DECADES AGO!

You have followed (with your characteristic bad faith) the discussions
in comp.lang.c where you always defended the status quo. You know
exactly that what you are saying is untrue.
 
jacob navia

Eric said:
I've read lots (and lots and lots) of Jacob's posts over the
years. A few have been constructive. This thread hasn't been.

Look Mr Sossman, in another thread I told you that I have
developed a counted string implementation.

Isn't that constructive?

You didn't answer THAT message because you have nothing to say. Zero
terminated strings aren't arguable as a serious position today and you
confirm it.
 
jacob navia

Eric said:
jacob said:
[...] the problem is that proper use of
zero terminated strings is impossible. [...]

Thus, *every* use of a zero-terminated string is improper.

Ri-i-i-ight.

In the long run, yes. As Heathfield proved here in this forum!

I am not inventing this.

HEATHFIELD PROVED that using zero terminated strings correctly
is impossible in the long run.

You agree with me now?

:)
 
Nobody

Zero terminated strings are a continuing security nightmare.
C is like that
etc etc.

(Note that C++ uses zero terminated strings too)

As does every popular operating system.

[Although the "NT API" on which Windows NT/2K/XP/Vista are built uses
stored-length strings, which means that you can e.g. create registry keys
which cannot be accessed using the Win32 API.]
 
Lew Pitcher

On July 31, 2009 13:17, in comp.lang.c, jacob navia
wrote:

[snip]
(2) I am not selling anything there. lcc-win (and the source code
of the string library that it contains) are distributed free of charge.

You can read?

In case you don't, here is it again in BIG LETTERS (so that you can take
your glasses off if you want)

FREE OF CHARGE!

From your website:

License:

This software is not freeware, it is copyrighted by Jacob Navia. It's
free for non-commercial use, if you use it professionally you have to
have to buy a licence.

Professional use is:
* Related to business (e.g you use it in a corporation)
* If you sell your software.


Let me repeat the key phrase: "If you use it professionally, you have to buy
a licence". That doesn't look free to me. That looks like a paid-for
product, no different than MS Visual C++ (which you only buy a licence
for). That makes you a vendor, sir.

[snip]
--
Lew Pitcher

Master Codewright & JOAT-in-training | Registered Linux User #112576
http://pitcher.digitalfreehold.ca/ | GPG public key available by request
---------- Slackware - Because I know what I'm doing. ------
 
Chris M. Thomasson

jacob navia said:
Lew said:
Zero terminated strings are a continuing security nightmare.
[snip]

What bugs me the most about this thread (and many others) is that it
pushes
a single person's (or in this case, a single vendor's) agenda, using
arguments that have nothing to do with the proper use of the C language.

as I said elsewhere Mr Pitcher, the problem is that proper use of
zero terminated strings is impossible.
[...]

100% impossible? Humm...
 
jameskuyper

Lew said:
On July 31, 2009 13:17, in comp.lang.c, jacob navia
wrote:

[snip]
(2) I am not selling anything there. lcc-win (and the source code
of the string library that it contains) are distributed free of charge.

You can read?

In case you don't, here is it again in BIG LETTERS (so that you can take
your glasses off if you want)

FREE OF CHARGE!

From your website:

License:

This software is not freeware, it is copyrighted by Jacob Navia. It's
free for non-commercial use, if you use it professionally you have to
have to buy a licence.

Professional use is:
* Related to business (e.g you use it in a corporation)
* If you sell your software.


Let me repeat the key phrase: "If you use it professionally, you have to buy
a licence". That doesn't look free to me. That looks like a paid-for
product, no different than MS Visual C++ (which you only buy a licence
for). That makes you a vendor, sir.

It could be just a language issue. He's certainly got something (his
compiler, for commercial use) up for sale. However, when he says "I'm
not selling anything", it could be that what he means is that he
hasn't had any commercial users who were willing to pay for it.
 
Paul Hsieh

Paul said:
jacob navia wrote:
Zero terminated strings are a continuing security nightmare.
[...]
     Even if we accept the thesis as proven (which I don't, but
let it pass), what remedy or other action would you suggest?
You are kidding right?  1) Deprecate all use of the C library w.r.t.
string processing.  2) Use *any other* programming language to handle
string processing throughout the chain of tools described (even C++'s
STL would probably have worked.)  These are the obvious and implicit
proposals whenever you see this sort of thing.

     How big a programming team are you going to fund to undertake
this work?  They've only got forty years' worth of code to wade
through ...

You should look up the word "deprecate". I highly doubt that even the
tiniest fraction of the code written in the last 40 years needs to be
looked at. We were talking about a DNS server (that would be a widely
used open source central resource right?) and a web browser (Firefox
-- a pretty open source thing as I recall right?).

As an example, the next ANSI C standard is supposedly going to
deprecate gets(). Do you think that's going to cause work for people
to look through 40 years of source code?

     Sorry; what has this got to do with anything?  Why do you
even bring it up?

Because that's the standard solution?
 
jacob navia

Richard said:
jacob navia said:
Eric said:
jacob navia wrote:
[...] the problem is that proper use of
zero terminated strings is impossible. [...]
Thus, *every* use of a zero-terminated string is improper.

Ri-i-i-ight.
In the long run, yes. As Heathfield proved here in this forum!

I am not inventing this.

HEATHFIELD PROVED that using zero terminated strings correctly
is impossible in the long run.

You have a strange idea of "proof".

<snip>

When I explained to you your bug with zero terminated strings
(that was discovered by Han from China) I wrote:

"The writer of this code is a competent C programmer. The fact that
even he could not avoid a bug proves that zero terminated strings
can't be used reliably"
 
jacob navia

Lew said:
On July 31, 2009 13:17, in comp.lang.c, jacob navia
wrote:

[snip]
(2) I am not selling anything there. lcc-win (and the source code
of the string library that it contains) are distributed free of charge.

You can read?

In case you don't, here is it again in BIG LETTERS (so that you can take
your glasses off if you want)

FREE OF CHARGE!

From your website:

License:

This software is not freeware, it is copyrighted by Jacob Navia. It's
free for non-commercial use, if you use it professionally you have to
have to buy a licence.

Professional use is:
* Related to business (e.g you use it in a corporation)
* If you sell your software.


Let me repeat the key phrase: "If you use it professionally, you have to buy
a licence". That doesn't look free to me. That looks like a paid-for
product, no different than MS Visual C++ (which you only buy a licence
for). That makes you a vendor, sir.

[snip]

and not different from Cygwin, that asks for thousands of dollars for
the "right" to use their stuff if you do not disclose your source code

and not different from Red Hat, which asks for US$ 20 000 for technical
help from a gcc professional (yearly technical support license)

and not different from Suse, that sells their stuff too.

and not different from all other products that live from professional
users that allow them to keep expenses down!

AND SO WHAT?

Does my license make zero terminated strings any better?

YOU HAVE NO ARGUMENTS Mr. Pitcher, that is why you HAVE
to get personal.
 
