the main reason is:
- this architecture consists of two units, central (A) unit and
calculation unit(B)
- in my program I need to send the information of the address of A
where the data for the calculation is stored [snippage]
- the way in B to retrieve info from A is by DMA transfer using a
special function in B which needs to know the address in A in order to
transfer this data
Given all of this information, simply "taking the address" is
quite possibly insufficient.
Consider, for instance, the situation in which the calculation
device ("unit B" here) is on a 32-bit PCI bus, while the
system running the C program ("unit A" here) is a 64-bit-memory PC.
Suppose that the physical address of item A, on the PC, is actually
0x0000_0002_410C_59B4 (underscores inserted here for readability;
this is not a C-language constant). This address lies beyond the
four-gigabyte limit of the 32-bit address range of the PCI bus, so
it is physically impossible for unit B to put it on the bus in the
first place.
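To make the arithmetic concrete, here is a minimal sketch of the "does
this address even fit on the bus" test. The function name
addr_fits_bus is made up for illustration; it is not part of any
real driver interface, and a real system would get the bus width
from the platform rather than from a parameter.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if phys_addr can be expressed
 * using only bus_bits address bits.  A width of 64 or more always
 * fits, and is handled separately because shifting a 64-bit value
 * by 64 is undefined in C.
 */
bool addr_fits_bus(uint64_t phys_addr, unsigned bus_bits)
{
    if (bus_bits >= 64)
        return true;
    /* Any bit set at or above position bus_bits is unreachable. */
    return (phys_addr >> bus_bits) == 0;
}
```

With the example address, addr_fits_bus(0x00000002410C59B4, 32) is
false, because bit 33 is set, while the same call with a width of 64
is true.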
Moreover, the *virtual* address of item A within the process running
on the PC may well not be 0x0000_0002_410C_59B4. This is merely
the *physical* address that must be sent when the device on the
PCI bus wants to access the physical RAM that will hold the
representation of the value calculated by unit B. That is, when
unit B has a final answer, it must write it to physical address
0x0000_0002_410C_59B4, which the process running on unit A will
read when it reads some other virtual address (perhaps something
like 0x000719B4, for instance).
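The two example addresses above differ in their upper bits but agree
in their low bits. Assuming 4 KiB pages (an assumption on my part;
page sizes vary by machine), a virtual-to-physical mapping replaces
the page number and leaves the offset within the page alone. A small
sketch, with made-up names, that splits an address this way:

```c
#include <stdint.h>

#define PAGE_SHIFT      12u   /* assumption: 4 KiB pages */
#define PAGE_OFFSET_MASK ((UINT64_C(1) << PAGE_SHIFT) - 1)

/* Offset of addr within its page: the part a mapping preserves. */
uint64_t page_offset(uint64_t addr)
{
    return addr & PAGE_OFFSET_MASK;
}

/* Page number of addr: the part a mapping replaces. */
uint64_t page_number(uint64_t addr)
{
    return addr >> PAGE_SHIFT;
}
```

Both 0x00000002410C59B4 and 0x000719B4 have page offset 0x9B4; only
their page numbers (0x2410C5 and 0x71) differ.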
What is required, in this particular case, is to set up an "I/O
map" (or IOMMU or some similar name) that sits between the 32-bit
PCI bus and the 64-bit physical address space. (If there is no
IOMMU, some alternative strategy must be devised.) The IOMMU will
take care of mapping from a 32-bit PCI bus address to a 64-bit
address. To build the mapping, you start with the virtual address(es)
as seen by the C code running on "unit A", and translate that
(those) to the ultimate 64-bit physical address(es). Then you
obtain a sufficiently large range of virtual addresses in the IOMMU,
and map that range to the same physical page(s). You
then supply, to unit B, the "PCI side virtual address" you obtained
and mapped. When the DMA completes, you can free the virtual
address range on the IOMMU (or, depending on device access patterns,
you might allocate this range permanently, and simply remap it as
needed).
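The map / hand-to-device / unmap sequence can be sketched with a toy
model. Everything here is invented for illustration: the toy_iommu
names, the 16-slot table, the 4 KiB page size, and the bus-side
window at 0x80000000 are all assumptions, and no real IOMMU is
programmed this way. The virtual-to-physical step is left out, since
only the operating system can do it; the sketch starts from a
physical address that has already been looked up.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IOMMU_PAGE_SHIFT 12u                        /* assumption: 4 KiB pages */
#define IOMMU_PAGE_SIZE  (UINT64_C(1) << IOMMU_PAGE_SHIFT)
#define IOMMU_SLOTS      16u                        /* toy-sized table */
#define IOMMU_BUS_BASE   UINT32_C(0x80000000)       /* arbitrary bus window */

struct toy_iommu {
    bool     used[IOMMU_SLOTS];
    uint64_t phys_page[IOMMU_SLOTS];   /* physical page base per slot */
};

/* Find the slot for a bus address; returns false if outside the window. */
static bool toy_slot_of(uint32_t bus_addr, size_t *slot)
{
    if (bus_addr < IOMMU_BUS_BASE)
        return false;
    *slot = (size_t)((bus_addr - IOMMU_BUS_BASE) >> IOMMU_PAGE_SHIFT);
    return *slot < IOMMU_SLOTS;
}

/*
 * Map the page containing phys_addr into a free slot.  Returns the
 * 32-bit bus address to hand to the device, or 0 if the table is full.
 */
uint32_t toy_iommu_map(struct toy_iommu *mmu, uint64_t phys_addr)
{
    for (size_t i = 0; i < IOMMU_SLOTS; i++) {
        if (!mmu->used[i]) {
            mmu->used[i] = true;
            mmu->phys_page[i] = phys_addr & ~(IOMMU_PAGE_SIZE - 1);
            return IOMMU_BUS_BASE
                 + (uint32_t)(i << IOMMU_PAGE_SHIFT)
                 + (uint32_t)(phys_addr & (IOMMU_PAGE_SIZE - 1));
        }
    }
    return 0;
}

/* What the hardware would do on each DMA access from the device. */
bool toy_iommu_translate(const struct toy_iommu *mmu, uint32_t bus_addr,
                         uint64_t *phys_out)
{
    size_t slot;

    if (!toy_slot_of(bus_addr, &slot) || !mmu->used[slot])
        return false;
    *phys_out = mmu->phys_page[slot] | (bus_addr & (IOMMU_PAGE_SIZE - 1));
    return true;
}

/* Release the mapping once the DMA has completed. */
void toy_iommu_unmap(struct toy_iommu *mmu, uint32_t bus_addr)
{
    size_t slot;

    if (toy_slot_of(bus_addr, &slot))
        mmu->used[slot] = false;
}
```

Starting from an all-zero table, mapping 0x00000002410C59B4 yields
the bus address 0x800009B4, which fits in 32 bits and which the toy
translation turns back into the original 64-bit physical address.
After toy_iommu_unmap, the same bus address no longer translates.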
Each of these steps is quite machine-dependent. In addition, some
devices have restrictions on DMA addressing (e.g., even though a
bus might be officially "32 bit", or perhaps even wider, some
devices might only be able to work with 24-bit addresses).
If one is lucky, all of the various steps fade away: perhaps
unit A deals only in 64-bit physical addresses in
the first place, and perhaps unit B has a 64-bit address bus, so
that the "virtual address" in the C code *is* the physical address
and the mapping for unit B is 1-to-1 as well. In this case, all
the intermediate steps drop out. But in principle, this is how
the task is normally accomplished. Worse, none of the steps has
anything to do with Standard C, so the C Standard is no help.