Can my program request 4GB of memory in Debian?

Xiaoning He

Hi, I'm currently using a crawler called larbin to fetch some pages. It
hashes each URL to an integer. This is not a good method compared to
MD5, but it's enough for me. Currently I set the hash value to be
a 31-bit integer, which is used as the index into a bit string in
memory, so it needs 256MB of memory. I have 4GB+ of free memory, so my
question is: can I ask for 4GB of memory in my C++ program? Or sixteen
256MB arrays? What about 8GB?
 
Ulrich Eckhardt

Note up front: the issues here, while they crop up on any system, have
very little to do with C per se, so you should take this to a
group dedicated to programming under Debian (note that it's a name, hence
the capital letter, and that I'm assuming you are using Debian/Linux).
Also, asking about C++ in a C newsgroup shows that you
didn't take the time to get familiar with the Usenet practice of first
finding out what a group is about and what is considered on-topic there
(search for "Usenet etiquette"). Please do that before posting further to
Usenet.

Xiaoning said:
Hi, I'm currently using a crawler called larbin to fetch some pages. It
hashes each URL to an integer. [...] Currently I set the hash value to be
a 31-bit integer, which is used as the index into a bit string in
memory, so it needs 256MB of memory.

Okay, so you have a bit for every 31-bit hash value that tells you if the
URL was e.g. already visited. While this works, it fails when you have
hash collisions: two different URLs with the same hash share one bit, so
the second one is wrongly treated as already visited. A hash map (or maybe
even just a std::map<>, but that requires C++, which is not the topic
here) works correctly even with hash collisions, though the overhead is
bigger. I'd use that until my requirements actually say that the overhead
is too large and that collisions don't matter.
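
To make the scheme above concrete, here is a minimal C sketch of a bit table indexed by the low bits of a hash. It is not larbin's actual code; the type `bitset` and the functions `bitset_bytes`, `bitset_init`, `bitset_test_and_set` and `bitset_free` are hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper type: one bit per possible hash value. */
typedef struct {
    unsigned char *bits;  /* nbits / 8 bytes, all zero after init */
    uint64_t nbits;       /* always a power of two */
} bitset;

/* Bytes needed for a table with 2^hash_bits bits. */
uint64_t bitset_bytes(unsigned hash_bits)
{
    assert(hash_bits >= 3 && hash_bits < 64);
    return ((uint64_t)1 << hash_bits) / 8;
}

/* Allocate a zeroed table; returns 0 on success, -1 on failure. */
int bitset_init(bitset *b, unsigned hash_bits)
{
    uint64_t nbytes = bitset_bytes(hash_bits);
    if (nbytes > SIZE_MAX)          /* size does not fit in size_t */
        return -1;
    b->bits = calloc((size_t)nbytes, 1);
    if (b->bits == NULL)
        return -1;
    b->nbits = (uint64_t)1 << hash_bits;
    return 0;
}

/* Mark hash as seen; returns 1 if its bit was already set, else 0. */
int bitset_test_and_set(bitset *b, uint32_t hash)
{
    uint64_t i = hash & (b->nbits - 1);  /* keep the low hash_bits bits */
    unsigned char mask = (unsigned char)(1u << (i & 7));
    int was_set = (b->bits[i >> 3] & mask) != 0;
    b->bits[i >> 3] |= mask;
    return was_set;
}

void bitset_free(bitset *b)
{
    free(b->bits);
    b->bits = NULL;
    b->nbits = 0;
}
```

With 31 hash bits, `bitset_bytes` gives 2^31 / 8 = 268435456 bytes, which is the 256MB mentioned in the question. Two different hashes that agree in their low bits land on the same bit, so `bitset_test_and_set` reports the second one as already seen; that is the collision problem described above.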

Now for a completely unrelated topic...
I have 4GB+ of free memory, so my question is: can I ask for 4GB of memory
in my C++ program? Or sixteen 256MB arrays? What about 8GB?

The amount of allocatable memory depends on the available virtual address
space. For 32-bit Linux systems that is at most 2 or 3 GiB (I'm not sure
which, but I think it depends on some settings with which the kernel was
compiled); on 64-bit systems it is much larger. Note that not all of this
memory has to be backed by RAM, so the amount of available RAM will not
affect whether this works (rather, RAM+swap set the limit), but it will
affect how well it performs. Lastly, the amount of available contiguous
memory might be limited by the fragmentation of the virtual address space.
This is independent of the programming language and depends on the
operating system, so it's off topic here.
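
Since the answer depends on the system, the simplest check is to ask and look at the result. The following is a rough C sketch under that assumption; `try_alloc` and `try_alloc_chunks` are hypothetical helper names, not part of any library. Be aware that on Linux a successful malloc() does not by itself prove the pages are backed by RAM or swap, because the kernel may overcommit.

```c
#include <stdint.h>
#include <stdlib.h>

/* Try to allocate `bytes` bytes as one contiguous block.
 * Returns  1 if malloc succeeded (the block is freed again at once),
 *          0 if malloc returned NULL,
 *         -1 if the size cannot even be expressed as a size_t,
 *            e.g. 4 GiB in a 32-bit build. */
int try_alloc(unsigned long long bytes)
{
    void *p;

    if (bytes > SIZE_MAX)
        return -1;
    p = malloc((size_t)bytes);
    if (p == NULL)
        return 0;
    free(p);
    return 1;
}

/* Try to allocate `count` separate blocks of `chunk` bytes each and
 * return how many were obtained before the first failure.
 * All blocks are freed before returning. */
unsigned try_alloc_chunks(unsigned count, unsigned long long chunk)
{
    void **blocks;
    unsigned got = 0, i;

    if (count == 0 || chunk == 0 || chunk > SIZE_MAX)
        return 0;
    blocks = calloc(count, sizeof *blocks);
    if (blocks == NULL)
        return 0;
    while (got < count && (blocks[got] = malloc((size_t)chunk)) != NULL)
        got++;
    for (i = 0; i < got; i++)
        free(blocks[i]);
    free(blocks);
    return got;
}
```

Calling `try_alloc(4ULL << 30)` asks for one 4 GiB block, while `try_alloc_chunks(16, 256ULL << 20)` asks for sixteen 256 MiB blocks. The second form does not need one contiguous range, so it may fare better when the address space is fragmented.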

Uli
 
Flash Gordon

Xiaoning He wrote, On 05/01/08 11:51:
Hi, I'm currently using a crawler called larbin to fetch some pages. It
hashes each URL to an integer. This is not a good method compared to
MD5, but it's enough for me. Currently I set the hash value to be
a 31-bit integer, which is used as the index into a bit string in
memory, so it needs 256MB of memory. I have 4GB+ of free memory, so my
question is: can I ask for 4GB of memory in my C++ program? Or sixteen
256MB arrays? What about 8GB?

Firstly, this is comp.lang.c; C++ is a different language which is not
topical here. Secondly, in C it would depend entirely on the
implementation and I expect the same is true in C++, so you probably need
to ask in a group dedicated to your implementation rather than comp.lang.c++.
 
Xiaoning He

In the end I decided to request 2GB of memory. Sorry for the inappropriate
points; I hardly ever post on Usenet (this seems to be my 2nd message)
and I've only learned C, although this program, written in C++, is
not too hard to understand. Thank you for answering my question.
 
CBFalconer

Xiaoning said:
Hi, I'm currently using a crawler called larbin to fetch some pages. It
hashes each URL to an integer. This is not a good method compared to
MD5, but it's enough for me. Currently I set the hash value to be
a 31-bit integer, which is used as the index into a bit string in
memory, so it needs 256MB of memory. I have 4GB+ of free memory, so my
question is: can I ask for 4GB of memory in my C++ program? Or sixteen
256MB arrays? What about 8GB?

C99 guarantees you can receive 64 kbytes total. C90 guarantees 32
kbytes. Anything more depends on the actual installation. Most
provide more. See your system documentation.
 
Randy Howard

C99 guarantees you can receive 64 kbytes total. C90 guarantees 32
kbytes. Anything more depends on the actual installation. Most
provide more. See your system documentation.

I thought this minimum guarantee had to do with something like this:

int foo[32768];

And not the behavior of malloc(). I'm not arguing, I'm asking for a
clarification.
 
CBFalconer

Randy said:
CBFalconer wrote
.... snip ...
C99 guarantees you can receive 64 kbytes total. C90 guarantees 32
kbytes. Anything more depends on the actual installation. Most
provide more. See your system documentation.

I thought this minimum guarantee had to do with something like this:

int foo[32768];

And not the behavior of malloc(). I'm not arguing, I'm asking for a
clarification.

Don't ask me for C&V. I believe the guarantee is for _all_ object
storage, including malloc.
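
For reference, the figures quoted match the "bytes in an object" minimum in the standards' translation limits (65535 in C99, 32767 in C90, hosted environments only); whether that says anything about malloc() is exactly what is being disputed here. Either way, the practical difference can be sketched in C as follows. `OBJ_BYTES`, `fixed_table`, `fixed_table_size` and `make_dynamic_table` are hypothetical names used only for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* The C99 minimum for "bytes in an object" in a hosted environment. */
#define OBJ_BYTES 65535u

/* A statically sized object: its size is fixed at translation time. */
static unsigned char fixed_table[OBJ_BYTES];

size_t fixed_table_size(void)
{
    return sizeof fixed_table;
}

/* A dynamically allocated object of the same size. The request may
 * fail at run time, so the result has to be checked by the caller.
 * Returns a zeroed block, or NULL if malloc refused. */
unsigned char *make_dynamic_table(void)
{
    unsigned char *p = malloc(OBJ_BYTES);

    if (p != NULL)
        memset(p, 0, OBJ_BYTES);
    return p;
}
```

Whatever the standard does or does not promise, code that calls malloc() has to handle a NULL result, so the caller of `make_dynamic_table` should test the pointer before using it and free() it afterwards.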
 
cr88192

CBFalconer said:
C99 guarantees you can receive 64 kbytes total. C90 guarantees 32
kbytes. Anything more depends on the actual installation. Most
provide more. See your system documentation.

a limit, yes, but in a way, an arbitrary and silly one...


reason:
on pretty much any 32 bit system, you can allocate way more than this.

on a 16 bit system (say, DOS), you can only allocate one such object (if you
are lucky enough to even get this, i.e. a sufficiently large, and empty,
data segment). for multiple objects of this "incredible" size, you would have
to use something else (fmalloc or something like that...).


as a result, in some cases, standardizing on minimums or limits is silly.

it is much the same as including system requirements on mice and keyboards.
they are arbitrary, and have almost nothing to do with the device (why
should my PS2 mouse even care if I have a 1GHz CPU with 256MB of ram and a
10GB HD, when the mouse doesn't even come with a driver CD?...).


and, even more amusing:
getting some chinese-made MP3 player (with a manual written in Engrish and
all that), and it having system requirements like a 486SX33 with 32MB RAM
and a 20MB HD.

yes, now go find a 486 with USB 2.0...

then again "you only installdriver if you using the Windows 95 or 98" (or
something like that...).


maybe it is better to say that how much memory malloc can allocate is
undefined...

not so good for people that get all worked up over every word and letter in
the standard though...


hell, there is probably at least some embedded system somewhere, where:
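#include <stddef.h>  /* size_t, NULL */

/* a malloc that never succeeds */
void *malloc(size_t sz)
{
    (void)sz;  /* the requested size is ignored */
    return NULL;
}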

....


or such...
 
