Unaligned access

aleksa

I'm relatively new to C.

How should I define a ptr to monochrome bitmap?
I currently have char*.

So far, I'm only allocating (w/o OS functions) memory for a
series of monochrome bitmaps. Every bitmap is different in size.

int size; // bitmap size (in bytes)
char* pbitmap; // ptr to current bitmap

size = GetBitmapSize(...);

// align next bitmap address on 16 bytes
pbitmap += (size + 16) & ~15;

When I had void* as ptr, this aligning thing was troublesome,
so I switched to char*.

Later, I will create functions PGet and PSet which will
operate on bits (pixels) within selected byte.

Now, suppose the code will run on CPU that
doesn't have byte memory access.

Is it my job or compiler's job to correctly access bytes?

The same thing is reading an ascii file. Do I have to take
any special precautions or just use char* and read through?
 
Malcolm McLean

I'm relatively new to C.

How should I define a ptr to monochrome bitmap?
I currently have char*.

So far, I'm only allocating (w/o OS functions) memory for a
series of monochrome bitmaps. Every bitmap is different in size.

        int size;               // bitmap size (in bytes)
        char* pbitmap;  // ptr to current bitmap

        size = GetBitmapSize(...);

        // align next bitmap address on 16 bytes
        pbitmap += (size + 16) & ~15;

When I had void* as ptr, this aligning thing was troublesome,
so I switched to char*.

Later, I will create functions PGet and PSet which will
operate on bits (pixels) within selected byte.

Now, suppose the code will run on CPU that
doesn't have byte memory access.

Is it my job or compiler's job to correctly access bytes?

The same thing is reading an ascii file. Do I have to take
any special precautions or just use char* and read through?

Use unsigned char * for a pointer to arbitrary byte data. This is one of
C's little quirks. Plain char can be either signed or unsigned, and if
signed can contain trap representations. Also, unsigned char documents
your intention.

By definition processors can access individual bytes. However on some
machines the hardware bytes are 32 bits whilst C chars are 8 bits. The
compiler handles this transparently by bit twiddling. This is
efficient in memory use, inefficient in processor usage. Whilst such
designs are rare, often it will be faster to handle image data as 32
bit pixels rather than as separate 8-bit channels. However these days
framebuffer operations are seldom the bottleneck.

Images tend to have sizes that are not known until runtime. Allocate a
flat buffer using malloc and calculate x y offsets yourself. Don't try
to use a 2d array.
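
A minimal sketch of that flat-buffer approach for the monochrome case. PGet and PSet are the names from the original post; MonoBitmap and mono_init are hypothetical, and the layout (rows padded to whole bytes, leftmost pixel in the most significant bit) is an assumption, not something the thread specifies.

```c
#include <stdlib.h>

/* Assumed layout: 1 bit per pixel, rows padded to whole bytes,
   leftmost pixel in the most significant bit of each byte. */
typedef struct {
    size_t width;          /* in pixels */
    size_t height;         /* in pixels */
    size_t stride;         /* bytes per row */
    unsigned char *bits;   /* stride * height bytes */
} MonoBitmap;

/* Returns 0 on success, -1 if allocation fails. */
int mono_init(MonoBitmap *bm, size_t width, size_t height)
{
    bm->width = width;
    bm->height = height;
    bm->stride = (width + 7) / 8;
    bm->bits = calloc(bm->stride, height);   /* all pixels start at 0 */
    return bm->bits != NULL ? 0 : -1;
}

/* Caller must ensure x < bm->width and y < bm->height. */
void PSet(MonoBitmap *bm, size_t x, size_t y, int on)
{
    unsigned char *byte = bm->bits + y * bm->stride + x / 8;
    unsigned char mask = (unsigned char)(0x80u >> (x % 8));
    if (on)
        *byte |= mask;
    else
        *byte &= (unsigned char)~mask;
}

int PGet(const MonoBitmap *bm, size_t x, size_t y)
{
    const unsigned char *byte = bm->bits + y * bm->stride + x / 8;
    return (*byte >> (7 - x % 8)) & 1;
}
```

The offset arithmetic is done by hand, so the same code works for any width and height chosen at runtime.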
 
William Hughes

I'm relatively new to C.

How should I define a ptr to monochrome bitmap?
I currently have char*.

So far, I'm only allocating (w/o OS functions)

Ok, so we are now way, way, out into undefined
territory.
memory for a
series of monochrome bitmaps. Every bitmap is different in size.

        int size;               // bitmap size (in bytes)
        char* pbitmap;  // ptr to current bitmap

size = GetBitmapSize(...);

At this point, pbitmap is not defined. I assume
that at some point you do something (e.g. assign
an integer) to define it. Be aware that this
may not work, and even if it does you probably
want a different integer for each platform and perhaps
for different implementations on the same platform.

pbitmap = malloc(size);

will always work, but then you lose control over where
pbitmap is. If you want portability, define a macro

ASSIGN_BITMAP(pbitmap, size)

and change it for each implementation used, with the
above malloc as the default.
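
A rough sketch of such a macro, with the malloc default. The override mechanism (defining ASSIGN_BITMAP before this point, for example from a per-platform header) is an assumption about how a port might plug in its own placement.

```c
#include <stdlib.h>

/* Default: take the memory from malloc. A port that must place bitmaps
   in a particular region can define ASSIGN_BITMAP before this point. */
#ifndef ASSIGN_BITMAP
#define ASSIGN_BITMAP(pbitmap, size) ((pbitmap) = malloc(size))
#endif
```

With the default, pbitmap is a null pointer if the allocation fails, so the caller should still check it.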



        // align next bitmap address on 16 bytes
        pbitmap += (size + 16) & ~15;

When I had void* as ptr, this aligning thing was troublesome,
so I switched to char*.


Not surprising, adding an integer to a void* has no meaning
in C. Adding an integer to a char* may have meaning and on
many platforms it will do exactly what you expect.
(I am not sure what will happen if (size + 16) will not fit
in an int, but I think that ~15 will not be what you want)
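
One way to sidestep the width question is to do the rounding in size_t. A minimal sketch, assuming sizes are held in size_t; round_up16 is a hypothetical helper name. It adds 15 rather than 16, so a size that is already a multiple of 16 is left unchanged, whereas the quoted expression would add a further 16 bytes in that case.

```c
#include <stddef.h>

/* Round n up to the next multiple of 16 (n itself if already a multiple).
   The mask is built in size_t so it is as wide as the value it masks.
   Values within 15 of the maximum size_t wrap around. */
size_t round_up16(size_t n)
{
    return (n + 15) & ~(size_t)15;
}
```

Advancing a pointer by the rounded size keeps it 16-byte aligned only if the pointer was 16-byte aligned to begin with.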

Later, I will create functions PGet and PSet which will
operate on bits (pixels) within selected byte.

Now, suppose the code will run on CPU that
doesn't have byte memory access.


Is it my job or compiler's job to correctly access bytes?


If x is a char* then it is the compiler's job to get the
"correct" byte if you ask for x[47]. However, bear in mind that the
compiler may not do what you want. For one thing, the compiler may not
use 8 bit bytes. So what the compiler gets may not be
the 8 bits at bit offset 8*47 from x.
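
If the code really does depend on 8 bit bytes, it can say so. A small sketch using CHAR_BIT from <limits.h>; PIXELS_PER_BYTE is a hypothetical name introduced for illustration.

```c
#include <limits.h>

/* Refuse to compile where a char is not exactly 8 bits wide, so that
   code assuming 8 pixels per byte fails loudly instead of silently. */
#if CHAR_BIT != 8
#error "this code assumes 8-bit bytes"
#endif

/* Number of 1-bit pixels that fit in one byte of the bitmap. */
enum { PIXELS_PER_BYTE = CHAR_BIT };
```
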
The same thing is reading an ascii file. Do I have to take
any special precautions or just use char* and read through?

If you fopen the file and read character by character, you will
get the values you expect. However, the processor may not do this
in the way you expect.

- William Hughes
 
aleksa

The compiler handles this transparently by bit twiddling. This is
efficient in memory use, inefficient in processor usage.

In other words, it will *always* work, regardless of CPU used, right?

Currently, I plan this to execute only on
x86 and ARM9 and both can access bytes.


P.S.
I just wanted to be sure, in case I choose some ARM7 in the future.
That ARM7 probably won't work on bitmaps (too slow), but will most
probably
read some ascii files and I wouldn't want to change the sources later.
 
Chad

Ok, so we are now way, way, out into undefined
territory.




At this point, pbitmap is not defined.  I assume
that at some point you do something (e.g. assign
an integer) to define it. Be aware that this
may not work, and even if it does you probably
want a different integer for each platform and perhaps
for different implementations on the same platform.

         pbitmap = malloc(size);

will always work, but then you lose control over where
pbitmap is.  

How do you lose control over where bitmap is? I mean, it's pointing at
some 'valid' area of memory, isn't it?
 
Malcolm McLean

In other words, it will *always* work, regardless of CPU used, right?

Currently, I plan this to execute only on
x86 and ARM9 and both can access bytes.

void setpixel(unsigned char *buff, int width, int height, int x, int y,
              unsigned char red, unsigned char green, unsigned char blue)
{
    unsigned char *pixel = buff + ((y * width) + x) * 3;
    pixel[0] = red;
    pixel[1] = green;
    pixel[2] = blue;
}

will always work, as long as buff points to a big enough area of
memory, and other functions treat the image buffer as an array of 8
bit rgbs.
(There will be a chorus of demands to use size_ts instead of ints,
which are correct. The total size of the buffer mustn't overflow the
width of an int).

You don't need to worry about buff's alignment.

However other systems may well be faster. An obvious problem with the
above is that it takes too many parameters, a compiler may not always
optimise the call out.
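
One possible way to cut the parameter count is to bundle the image and the colour into structs. This is only a sketch: Image, Rgb and image_setpixel are hypothetical names, and the offset is computed in size_t here, which is a different choice from the int version above.

```c
#include <stddef.h>

/* Hypothetical image descriptor: 3 bytes (R, G, B) per pixel, rows packed. */
typedef struct {
    unsigned char *buff;
    size_t width;
    size_t height;
} Image;

typedef struct {
    unsigned char red, green, blue;
} Rgb;

/* Caller must ensure x < img->width and y < img->height. */
void image_setpixel(const Image *img, size_t x, size_t y, Rgb c)
{
    unsigned char *pixel = img->buff + (y * img->width + x) * 3;
    pixel[0] = c.red;
    pixel[1] = c.green;
    pixel[2] = c.blue;
}
```

Whether this is any faster depends on the compiler; it mainly makes the call sites shorter.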
 
William Hughes

How do you lose control over where bitmap is? I mean, it's pointing at
some 'valid' area of memory, isn't it?

Yes, you know the memory is valid, but you have no idea where it
is or what it is. It might be a little old man in China
with a brush and pad who communicates by mail.
A more reasonable example is a machine with some fast
memory that is never saved to disk, and some slower memory
that can be paged to disk. If you need your bitmaps to be
in the fast memory, malloc may not cut it.

- William Hughes
 
aleksa

At this point, pbitmap is not defined.

And it really isn't, I didn't get to that point yet.
 I assume that at some point you do something (e.g. assign
an integer) to define it.

How can I assign an integer to a pointer, isn't that invalid?

This is what I have planned:

1. pbitmap is (will be) initialized from a void* that
points to free memory.

2. store (somewhere) current pbitmap.

3. GetBitmapSize (in bytes) and adjust pbitmap.

4. get next bitmap and goto 2.
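
In C, the four steps above might be sketched roughly as follows. Arena, arena_init and arena_take are hypothetical names, and the alignment step assumes that converting a pointer to uintptr_t gives a plain linear address, which is true on x86 and ARM but is not promised by the standard.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical arena: a cursor into free memory plus its end. */
typedef struct {
    unsigned char *next;   /* first unused byte */
    unsigned char *end;    /* one past the last usable byte */
} Arena;

/* Step 1: start from a pointer to free memory. */
void arena_init(Arena *a, void *mem, size_t len)
{
    a->next = mem;
    a->end = a->next + len;
}

/* Steps 2 and 3: return the current position (aligned to 16 bytes) and
   advance past `size` bytes. Returns NULL when the arena is exhausted. */
unsigned char *arena_take(Arena *a, size_t size)
{
    /* Bytes needed to reach the next 16-byte boundary (0 if on one).
       Assumes the uintptr_t value is a linear address. */
    size_t pad = (size_t)(-(uintptr_t)a->next & 15u);
    size_t room = (size_t)(a->end - a->next);
    unsigned char *p;

    if (pad > room || size > room - pad)
        return NULL;
    p = a->next + pad;
    a->next = p + size;
    return p;
}
```

Step 4 is then a loop in the caller: get the size of the next bitmap, call arena_take, store the returned pointer, repeat.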

Actually, I have this already working, but written in ASM.

Now I'm converting it to C, and my first problem was void* or char*
as a ptr to bitmap. It seems that it must be char* although that
is a bit confusing since I'm not really accessing ascii characters.

Be aware that this
may not work, and even if it does you probably
want a different integer for each platform and perhaps
for different implementations on the same platform.

I don't understand this.

Adding an integer to a char* may have meaning and on
many platforms it will do exactly what you expect.

There are platforms that will behave differently?
Are x86 and ARM in that platform-list?

(I am not sure what will happen if (size + 16) will not fit
in an int, but I think that ~15 will not be what you want)

int is 32-bits, why wouldn't it fit? I don't plan to use 16 bitters
anymore :)

~15 is to align the memory ptr to 16 bytes, and the code generated
for x86 is correct (checked).

How else can I align my ptr?

If you fopen the file and read character by character, you will
get the values you expect.  However, the processor may not do this
in the way you expect.

I don't have any OS. My ASM code received the file over RS-232.

The C code will do the same, and then scan it with char*.
Any problems with that on x86, ARM?

P.S.
I'm switching from x86 ASM to C and my experience so far is:

- the sources are more readable and shorter.

- generated code is 5% faster than my hand-written ASM.
(I've tested the speed on one project only, easy-writing in both
langs)

- portability... that's why I even started learning C and now I hear
that bytes can be longer than 8 bits... How long is a nibble, then?

- and yeah, C can be very frustrating at times, and always wants
to be smarter than me...
 
aleksa

void setpixel(unsigned char *buff, int width, int height, int x, int
y, unsigned char red, unsigned char green, unsigned char blue)
{
  unsigned char *pixel = buff + ((y * width) + x) * 3;
  pixel[0] = red;
  pixel[1] = green;
  pixel[2] = blue;
}

Thanks, pretty straightforward, but I'll have to convert it to
monochrome BMP.
(There will be a chorus of demands to use size_ts instead of ints,
which are correct.

My English is poor here..
Do you say that people would say size_t is a better choice here,
but *you* stick with ints (as in, sorry folks, ints are correct)?

Why would size_t be better? From what I read here:
http://bytes.com/topic/c/answers/220206-what-size_t
size_t should not be used here..

Also, width, height, x and y are always positive values,
so int could/should be unsigned int (from my POV).

I've seen examples where always-positive variables are
not defined as unsigned int, only int, so I'm doing the
same, even though I don't understand why.
 
Keith Thompson

aleksa said:
void setpixel(unsigned char *buff, int width, int height, int x, int
y, unsigned char red, unsigned char green, unsigned char blue)
{
  unsigned char *pixel = buff + ((y * width) + x) * 3;
  pixel[0] = red;
  pixel[1] = green;
  pixel[2] = blue;
}

Thanks, pretty straightforward, but I'll have to convert it to
monochrome BMP.
(There will be a chorus of demands to use size_ts instead of ints,
which are correct.

My English is poor here..
Do you say that people would say size_t is a better choice here,
but *you* stick with ints (as in, sorry folks, ints are correct)?

Yes, I think that's what Malcolm is saying.

Incidentally, I had to go back to the parent article to confirm that
Malcolm was the one who wrote it. Please leave attribution lines in
place for any quoted text.

Why would size_t be better? From what I read here:
http://bytes.com/topic/c/answers/220206-what-size_t
size_t should not be used here..

Speaking of unattributed quotations, that web page appears to be a
copy of a thread from comp.lang.c, published with no indication of
where it came from and with a strong implication that the articles
were posted by bytes.com's "community of C / C++ experts". So this
article will probably appear on bytes.com as well.

To anyone reading this on bytes.com: I am not a member of this
"community", and bytes.com does not have my permission to re-post
this article or falsely claim credit for it.

Anyway, the thread you refer to includes several different opinions on
the topic. The statement that you should use int rather than size_t
appears to be from Malcolm, the same person you were replying to,
so I wouldn't say it's supporting evidence.

My own opinion is that size_t is the appropriate type to use for
an array index. It's ok to use int if you're reasonably sure
that indices cannot exceed INT_MAX, but it can be difficult to be
sure of that as the program is modified in the future. size_t,
on the other hand, is very nearly guaranteed to be big enough
(the rationale for that has been discussed here at length before).

Also, width, height, x and y are always positive values,
so int could/should be unsigned int (from my POV).

I've seen examples where always-positive variables are
not defined as unsigned int, only int, so I'm doing the
same, even though I don't understand why.

unsigned int typically (almost always) can represent a wider range
of positive values than int. For example, on a 16-bit system
INT_MAX is typically 32767 and UINT_MAX is typically 65535.
So that might be a slight advantage for unsigned int over int.
But size_t (another unsigned type) has the further advantage that
it's guaranteed to be able to represent any valid array index.

There are some pitfalls in using unsigned types (unsigned int,
unsigned long, size_t, etc.). Integer types, whether signed
or unsigned, represent a finite subrange of the infinite set of
mathematical integers. If you stay well within that subrange, you
can safely pretend that you're dealing with mathematical integers.
As you approach the endpoints of the range, you can run into cases
where the results of a calculation don't match the mathematical
result, and may not even be well defined. For signed types, those
endpoints are at large negative and positive values that you're often
not likely to reach. For unsigned types, one of the endpoints is at
0, well within the range of values you're likely to be dealing with.

An example:

    unsigned int count = 10;
    while (count >= 0) {
        /* ... */
        count--;
    }

This is an infinite loop, because the condition "count >= 0" is always
true.
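
For contrast, here is one common way to count down with an unsigned index without hitting that trap. It is only a sketch; reverse_copy is a made-up example function.

```c
#include <stddef.h>

/* Copies src[0..n-1] into dst in reverse order, walking src from the end.
   The test `i-- > 0` compares before decrementing, so the loop body sees
   i = n-1 down to 0 and the loop ends when i was 0 at the test. */
void reverse_copy(int *dst, const int *src, size_t n)
{
    size_t i;
    size_t out = 0;
    for (i = n; i-- > 0; )
        dst[out++] = src[i];
}
```

After the loop i has wrapped around to the maximum size_t value, which is well defined for unsigned types but should not be used as an index.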

Using int for array indices is quite common, and I'm not saying
that it's wrong. But I do think that size_t is the safest and most
sensible type to use for array indices. You just have to keep in
mind that it's an unsigned type and watch out for any pitfalls.
Obviously not everyone agrees.
 
aleksa

Hmm, after reading again, I now too think that size_t
should be used, since it is unsigned and "width, height,
x and y are always positive values".

At first, I just stopped at:
"`size_t' is a type suitable for representing the amount
of memory a data object requires, expressed in units of `char'."
 
Keith Thompson

aleksa said:
Hmm, after reading again, I now too think that size_t
should be used, since it is unsigned and "width, height,
x and y are always positive values".

At first, I just stopped at:
"`size_t' is a type suitable for representing the amount
of memory a data object requires, expressed in units of `char'."

Note that the same rationale would justify using unsigned int rather
than size_t.

The reason size_t is better is this:

The size in bytes of any object can be expressed as a value within the
range of size_t. For a declared object, ``sizeof obj'' yields a size_t
result; for an allocated object, malloc()'s argument is of type size_t.

Nitpick, feel free to ignore: {
There have been lengthy discussions here questioning this
assumption. An implementation might permit a declared object to
be bigger than size_t bytes, and ``sizeof obj'' could overflow
like any other operator. calloc() takes two size_t arguments,
and could in theory create an object bigger than SIZE_MAX bytes.
Some implementation-specific mechanism could be used to create or
access such objects. In practice, though, I know of no systems
where this is an issue. My own argument is that if objects bigger
than size_t bytes are possible, the implementation just needs to
choose a bigger type for size_t. And if size_t isn't guaranteed
to be big enough, no other type is either -- not even uintmax_t.
}

Since array elements are always at least 1 byte (C has no arrays of
bits or of bit fields), it follows that size_t is also guaranteed
to be big enough to hold any valid array index.

Some have argued that this is ugly, because the name "size_t" implies a
size in *bytes*, and perhaps because the "_t" suffix is just clutter.
I disagree.
 
bart.c

aleksa said:
Hmm, after reading again, I now too think that size_t
should be used, since it is unsigned and "width, height,
x and y are always positive values".

At first, I just stopped at:
"`size_t' is a type suitable for representing the amount
of memory a data object requires, expressed in units of `char'."

For working with images, it's unlikely that width, height, x or y will need
more than 16 bits, so that int will always be enough.

(But calculations with these to work out an offset within the entire image
will usually need more.)

Using size_t for x,y might also be problematic, if x,y can ever be negative.
For example, to draw geometric elements with some points to the left of or
above your image (if (0,0) is the top left). Negative coordinates allow
clipping to be applied; always-positive coordinates make this harder.
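
A small sketch of what that clipping can look like with signed coordinates. GreyImage and setpixel_clipped are hypothetical names, and the one-byte-per-pixel layout is assumed just to keep the example short.

```c
#include <stddef.h>

/* Hypothetical 8-bit greyscale image, one byte per pixel, rows packed. */
typedef struct {
    unsigned char *buff;
    int width;
    int height;
} GreyImage;

/* Sets the pixel if (x, y) lies inside the image; ignores it otherwise.
   Returns 1 if the pixel was written, 0 if it was clipped. */
int setpixel_clipped(GreyImage *img, int x, int y, unsigned char value)
{
    if (x < 0 || y < 0 || x >= img->width || y >= img->height)
        return 0;
    /* The offset is computed in size_t so it cannot overflow int. */
    img->buff[(size_t)y * (size_t)img->width + (size_t)x] = value;
    return 1;
}
```

With unsigned x,y a point at -1 would arrive as a huge positive value; the upper-bound test would still reject it, but the caller could no longer tell "left of the image" from "right of the image".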
 
William Hughes

For working with images, it's unlikely that width, height, x or y will need
more than 16 bits


Highly unlikely. Why, such an image would probably be several
gigabytes in size. Such images are very unlikely until the mid
1990's.

On the other hand it is unlikely you would try to manipulate
such an image using an implementation with 16 bit ints.

- William Hughes
 
bart.c

William said:
Highly unlikely. Why, such an image would probably be several
gigabytes in size.

I was arguing against using size_t. My point was int would normally suffice,
probably even when ints were 16-bits.

Such images are very unlikely until the mid
1990's.

16-bits unsigned allows addressing of up to 4000 Mpixel images. Of course in
the 1990's we were all dealing with much bigger images than that...

However 32-bit signed x,y do make more sense (than both 16-bits and size_t)
allowing for degenerate image sizes, and virtual coordinates.
 
