This document describes the binary encoding of messages allowing communication between different hardware platforms.
The uniform inter-platform message (IPM) format is used for communication between processes on systems with different representations of hardware data types.
There are two approaches to platform-independent communications.
The simplest (and most widely used) approach is to choose some
"standard" way of encoding elementary data types (usually
referred to as network format), encode
messages into that format before transmission and decode them
back into host format on reception.
However, this approach is inefficient when host data format is not the same as network data format. In this case exchange of messages within the host (or between nodes of a MIMD machine) causes costly and completely unnecessary conversions to network format and back. As a result this approach cannot be used to provide a uniform way for communication with both local and remote processes.
The second approach involves tagging messages with some description
of how data is represented within those messages, so sender may
transmit messages in its local host format, without any encoding.
If sender's data representation corresponds to the recipient's internal
representation no conversion on receipt is necessary.
If representations are different, the recipient must use sender's
data format description to convert message to the local format.
This approach requires all hosts to be able to convert data from all other hosts' representations. Although it may seem very difficult the reality is that practically all computers have similar data formats, differing mostly in such details as byte order, sizes of data types, etc. The Grâl IPM data format description covers practically all modern machines. In any case, a program running on a really non-trivial machine may revert to pre-encoding messages to make them parseable by Grâl message recipients.
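The receiver-side decision described above can be sketched in C. This is only an illustration under simplifying assumptions: the one-octet tag, the names `host_tag`, `swap32` and `receive_u32`, and the restriction to byte order of 32-bit values are all hypothetical and are not part of the Grâl headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical one-octet format tag: 0 = big-endian, 1 = little-endian. */
enum { TAG_BIG_ENDIAN = 0, TAG_LITTLE_ENDIAN = 1 };

/* Detect the byte order of the host at run time. */
static uint8_t host_tag(void)
{
    const uint16_t probe = 1;
    return *(const uint8_t *)&probe ? TAG_LITTLE_ENDIAN : TAG_BIG_ENDIAN;
}

/* Reverse the octets of one 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Convert n 32-bit values in place, but only if the sender's tag differs
   from the host's. Returns the number of values actually converted. */
static size_t receive_u32(uint32_t *data, size_t n, uint8_t sender_tag)
{
    assert(data != NULL || n == 0);
    if (sender_tag == host_tag())
        return 0;                 /* same representation: nothing to do */
    for (size_t i = 0; i < n; i++)
        data[i] = swap32(data[i]);
    return n;
}
```

When both ends share a representation the call costs one comparison, which is the point of tagging messages instead of always encoding them into a network format.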
The Grâl message format is designed to simplify conversions when communicating hosts belong to the dominant species of binary computers with integer type sizes being powers of 2, twos-complement representation for negative integers and IEEE Std 754-like floating point numbers. To make conversions more efficient some restrictions are placed on ranges of values and alignment of elements within messages.
Grâl inter-platform messages are composed from the following elementary data types (names of types below are symbolic and may vary in different languages):
| Type Name | Range Of Values (min) | Range Of Values (max) | Precision |
|---|---|---|---|
| char | $-(2^{7}-1)$ | $2^{7}-1$ | |
| uchar | $0$ | $2^{8}-1$ | |
| short | $-(2^{15}-1)$ | $2^{15}-1$ | |
| ushort | $0$ | $2^{16}-1$ | |
| long | $-(2^{31}-1)$ | $2^{31}-1$ | |
| ulong | $0$ | $2^{32}-1$ | |
| xlong | $-(2^{63}-1)$ | $2^{63}-1$ | |
| uxlong | $0$ | $2^{64}-1$ | |
| float | $-3.4 \times 10^{38}$ | $3.4 \times 10^{38}$ | 24 bits |
| double | $-1.79 \times 10^{308}$ | $1.79 \times 10^{308}$ | 53 bits |
The corresponding host data types should be able to accommodate all values within the specified ranges; transmission of values outside of those ranges is not guaranteed. Note that signed integer values do not include $-2^{n-1}$ (where n is the number of bits), as it cannot be represented on machines with ones-complement negative numbers. Conversely, -0 is converted to +0 if the destination host uses twos-complement representation. To avoid conversion problems, bit masks must always be represented as unsigned values.
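A sender can check that a value lies in the symmetric range before placing it in a message. The helpers below are a minimal sketch; the names `ipm_signed_max` and `ipm_signed_in_range` are hypothetical and do not come from the Grâl headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Largest magnitude of an n-bit IPM signed integer: 2^(n-1) - 1. */
static int64_t ipm_signed_max(unsigned bits)
{
    assert(bits >= 2 && bits <= 64);
    return (int64_t)((UINT64_C(1) << (bits - 1)) - 1);
}

/* True if v can be transmitted as an n-bit IPM signed integer.
   The most negative twos-complement value, -2^(n-1), is rejected. */
static bool ipm_signed_in_range(int64_t v, unsigned bits)
{
    const int64_t max = ipm_signed_max(bits);
    return v >= -max && v <= max;
}
```

For example, `ipm_signed_in_range(-128, 8)` is false even though -128 fits in a twos-complement octet.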
Floating point numbers may be reduced in precision or reduced to infinities or zeroes in the process of conversion; however it is safe to assume that there won't be any unnecessary reduction of precision or limitation of range. Preservation of NaNs, infinite values and unnormalized numbers is not guaranteed.
Host representations of longer data types must be at least as large as those of the corresponding shorter types. For example, the number of bits in the host representation of long cannot be less than in the representation of short, but can be less than in the representation of ushort.
Data format conversion does not include transliteration or any other conversion of character strings. If any such conversion is necessary, it is a responsibility of the application.
Special care must be taken to ensure that elements within messages are always aligned on a boundary corresponding to their size, relative to the beginning of the message, even if the local host hardware does not require such alignment. Composite elements (arrays, structures, unions) must always be aligned according to the size of their largest elementary member.
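The alignment rule can be checked by computing element offsets by hand. The sketch below uses hypothetical helper names (`ipm_align_up`, `ipm_place`) and works in whatever unit the caller chooses (octets here, bits on hosts with unusual type sizes).

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of size (same unit for both). */
static size_t ipm_align_up(size_t offset, size_t size)
{
    assert(size > 0);
    return (offset + size - 1) / size * size;
}

/* Place one element of the given size at or after *offset.
   Returns the aligned position and advances *offset past the element. */
static size_t ipm_place(size_t *offset, size_t size)
{
    const size_t at = ipm_align_up(*offset, size);
    *offset = at + size;
    return at;
}
```

With a 2-octet header followed by a uchar, a long, a short and a double (1, 4, 2 and 8 octets), the elements land at offsets 2, 4, 8 and 16, and the message ends at offset 24.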
An application programmer must reserve a special header field at the beginning of the message structure for the description of the local binary representation, and initialize that field with the local host's constant before sending the message. That field takes 2 octets (groups of 8 bits) if the source host has 8/16/32/64 bit integer numbers and IEEE Std 754-compliant floating point numbers. Otherwise, the long (16-octet) field should be used.
Message conversion must not cause more than a two-fold increase in message size (plus 14 octets if the longer version of the header is required); this limits the choice of hardware data types used to represent elementary data types. This limitation makes it possible to pre-allocate sufficient memory for decoding of incoming messages (if messages are variable-size) or for reception of messages (if messages are fixed-size). Obviously, messages originating from machines with standard integer sizes and IEEE 754 floating point formats will have the minimal possible size.
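The size limit translates directly into a worst-case buffer size. The function below is a sketch with a hypothetical name, `ipm_max_converted_size`; it returns 0 when the bound would not fit in `size_t`.

```c
#include <stddef.h>
#include <stdint.h>

/* Upper bound on the size of a converted message, in octets:
   twice the original size plus 14 octets for a possible long header.
   Returns 0 if the result would overflow size_t. */
static size_t ipm_max_converted_size(size_t original_size)
{
    if (original_size > (SIZE_MAX - 14) / 2)
        return 0;
    return 2 * original_size + 14;
}
```

A receiver of variable-size messages could allocate `ipm_max_converted_size(n)` octets once and decode into that buffer without reallocation.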
Format of the full 16-octet header is:
| Octet # | Type | Description |
|---|---|---|
| 0 | bitmask | Integer format flags |
| 1 | bitmask | Floating point format flags |
| 2 | unsigned | Total size of float, bits |
| 3 | unsigned | float's significand size, bits (excluding integer bit) |
| 4 | 2s-complement signed | Exponent bias value defect for float |
| 5 | unsigned | Total size of double, bits |
| 6 | unsigned | double's significand size, bits (excluding integer bit) |
| 7 | 2s-complement signed | Exponent bias value defect for double |
| 8 | unsigned | Size of char, bits |
| 9 | unsigned | Size of uchar, bits |
| 10 | unsigned | Size of short, bits |
| 11 | unsigned | Size of ushort, bits |
| 12 | unsigned | Size of long, bits |
| 13 | unsigned | Size of ulong, bits |
| 14 | unsigned | Size of xlong, bits |
| 15 | unsigned | Size of uxlong, bits |
Short (2-octet) message header is used when most-significant bits (0200) of both integer and floating point format octets are zero (see definitions below). Format of short message header is:
| Octet # | Type | Description |
|---|---|---|
| 0 | bitmask | Integer format flags |
| 1 | bitmask | Floating point format flags |
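Since the first two octets are the same in both header forms, a receiver can tell the forms apart by testing bit 0200 of each. A minimal sketch, with the hypothetical names `IPM_EXTENDED` and `ipm_header_size`:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Most-significant bit of a format flag octet (octal 0200). */
#define IPM_EXTENDED 0200

/* Size of the message header in octets: 2 if bit 0200 is clear in both
   the integer and the floating point flag octets, 16 otherwise. */
static size_t ipm_header_size(const uint8_t *msg)
{
    assert(msg != NULL);
    return ((msg[0] | msg[1]) & IPM_EXTENDED) ? 16 : 2;
}
```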
Integer format flags (octet 0) are:
| Bit (octal) | Meaning If Clear | Meaning If Set |
|---|---|---|
| 1 | Integers are big-endian (most significant byte is first) | Integers are little-endian (most significant byte is last) |
| 2 | Negative numbers are twos-complement | Negative numbers are ones-complement |
| 4 | Natural order of bytes | short-sized pairs are swapped in long integers |
| 200 | Integer sizes are 8/16/32/64 bits | Integer sizes are in header octets 8-15 |
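The following sketch shows how a receiver might apply these flags to one 32-bit long taken from a message with standard sizes (bit 0200 clear). The constant and function names are hypothetical. It also assumes that the pair swap of bit 4 is undone first and the byte order of bit 1 is applied afterwards, so that a PDP-11 style layout corresponds to bits 1 and 4 both set; the text above does not spell out that ordering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    IPM_INT_LITTLE   = 01,  /* set: least significant byte comes first */
    IPM_INT_ONESCOMP = 02,  /* set: negative numbers are ones-complement */
    IPM_INT_PAIRSWAP = 04   /* set: 16-bit halves of a long are swapped */
};

/* Decode one 32-bit IPM long stored in 4 octets into a host int32_t. */
static int32_t ipm_decode_long(const uint8_t p[4], unsigned flags)
{
    assert(p != NULL);
    uint8_t b[4] = { p[0], p[1], p[2], p[3] };
    if (flags & IPM_INT_PAIRSWAP) {          /* undo the half-word swap */
        b[0] = p[2]; b[1] = p[3]; b[2] = p[0]; b[3] = p[1];
    }
    const uint32_t raw = (flags & IPM_INT_LITTLE)
        ? ((uint32_t)b[0] | (uint32_t)b[1] << 8 |
           (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24)
        : ((uint32_t)b[3] | (uint32_t)b[2] << 8 |
           (uint32_t)b[1] << 16 | (uint32_t)b[0] << 24);
    if (raw <= INT32_MAX)
        return (int32_t)raw;                 /* sign bit clear */
    const uint32_t inv = UINT32_MAX - raw;   /* bitwise complement of raw */
    if (flags & IPM_INT_ONESCOMP)
        return -(int32_t)inv;                /* -0 becomes +0 */
    return -(int32_t)inv - 1;                /* twos-complement */
}
```

Working on the complemented bit pattern avoids relying on implementation-defined conversion of out-of-range unsigned values to `int32_t`.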
The implicit assumptions about integer data representations are:
There are no restrictions on relative bit sizes of integer data types, except for the requirement that longer types should not have shorter representations. This means that sizes of longer types do not have to be divisible by sizes of shorter types, e.g. a combination of 12-bit char, 18-bit short and 36-bit long is acceptable. Field alignment is always counted in bits, not bytes.
Floating-point format flags (octet 1) are:
| Bit(s) (octal) | Meaning If Clear | Meaning If Set |
|---|---|---|
| 1 | FP numbers are big-endian (most significant byte is first) | FP numbers are little-endian (most significant byte is last) |
| 6 | Binary exponent base (two-bit field; value table not preserved) | |
| 10 | Exponent base is $2^n$ | Exponent base is 10, binary exponent base bits must be 0 |
| 20 | Exponent is unsigned and biased | Exponent is ones-complement signed |
| 40 | Fields of FP are laid out as: (figure not preserved) | Fields of FP are laid out as: (figure not preserved) |
| 100 | Integer bit of significand is hidden | Integer bit of significand is explicit |
| 200 | FP data types are IEEE 754-compliant, bits 0176 must be clear | FP data types are not IEEE 754-compliant (values of header octets 2-7 are used) |
The floating point numbers are assumed to be represented as

$$\mathit{sign} \times \mathit{exponent\_base}^{\mathit{exponent}} \times \mathit{significand}$$

where sign can be 1 or -1, and significand is a fixed-point binary (or binary-coded decimal, if exponent base is 10) number which is either 0 or 1+x where x is non-negative and less than 1. (In other words, floating point numbers are assumed to be normalized). If the integer part of significand is hidden, zero is represented with minimal value of exponent and zero fractional part of significand.

Exponent base may be 2, 4, 8, 16 or 10. The exponent may be represented as ones-complement, with the most significant bit of the exponent field used as its sign, in which case the binary representation is

$$E = (\mathit{sign}_{\mathit{exponent}} \times \mathit{abs}_{\mathit{exponent}}) - \mathit{bias\_defect};$$

or as biased unsigned representation

$$E = \mathit{exponent} + 2^{n-1} - 1 - \mathit{bias\_defect},$$

where n is the number of bits in the exponent field.
A zero sign bit is assumed to mean positive values; the same applies to the exponent sign bit if the exponent field is ones-complement. If the host's native floating point data format does not conform to these assumptions, an additional pre-encoding must be used.
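For the biased unsigned representation, the relation between the exponent and the stored field E can be checked with a few lines of C. The function names below are hypothetical. With n = 8 and a bias defect of 0 the formula gives the familiar IEEE 754 single-precision bias of 127, and with n = 11 the double-precision bias of 1023.

```c
#include <assert.h>

/* Stored field E for a biased, unsigned exponent field of n bits:
   E = exponent + 2^(n-1) - 1 - bias_defect. */
static long ipm_biased_exponent(long exponent, unsigned n, int bias_defect)
{
    assert(n >= 1 && n <= 30);
    return exponent + (1L << (n - 1)) - 1 - bias_defect;
}

/* Inverse of ipm_biased_exponent: recover the exponent from E. */
static long ipm_true_exponent(long stored, unsigned n, int bias_defect)
{
    assert(n >= 1 && n <= 30);
    return stored - (1L << (n - 1)) + 1 + bias_defect;
}
```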
The Grâl IPM format is defined in #include-file <message.h>. Message structures should be declared as
Usually C types correspond to Grâl IPM types as follows:
| IPM Type | C Type |
|---|---|
| char | signed char |
| uchar | unsigned char |
| short | signed short |
| ushort | unsigned short |
| long | signed long |
| ulong | unsigned long |
| xlong | signed long long |
| uxlong | unsigned long long |
| float | float |
| double | double |
Note that the C type int is not used; the reason is that its length is usually implementation-dependent even on "standard" machines. Most C compilers align fields according to the Grâl IPM requirements, but do not take that for granted.
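Whether a given compiler's types have the standard 8/16/32/64 bit sizes can be checked rather than assumed. The sketch below does not use `<message.h>`; the function names are hypothetical, and the widths are storage widths computed with `sizeof`, which would also count padding bits on exotic hosts.

```c
#include <limits.h>
#include <stdbool.h>

/* Fill out[0..7] with the storage widths, in bits, of the C types that
   usually stand for char, uchar, short, ushort, long, ulong, xlong, uxlong
   (the same order as header octets 8-15). */
static void host_int_bits(unsigned out[8])
{
    out[0] = (unsigned)(sizeof(signed char) * CHAR_BIT);
    out[1] = (unsigned)(sizeof(unsigned char) * CHAR_BIT);
    out[2] = (unsigned)(sizeof(signed short) * CHAR_BIT);
    out[3] = (unsigned)(sizeof(unsigned short) * CHAR_BIT);
    out[4] = (unsigned)(sizeof(signed long) * CHAR_BIT);
    out[5] = (unsigned)(sizeof(unsigned long) * CHAR_BIT);
    out[6] = (unsigned)(sizeof(signed long long) * CHAR_BIT);
    out[7] = (unsigned)(sizeof(unsigned long long) * CHAR_BIT);
}

/* True if the host types have exactly the 8/16/32/64 bit sizes. */
static bool host_has_standard_int_sizes(void)
{
    static const unsigned standard[8] = { 8, 8, 16, 16, 32, 32, 64, 64 };
    unsigned bits[8];
    host_int_bits(bits);
    for (int i = 0; i < 8; i++)
        if (bits[i] != standard[i])
            return false;
    return true;
}
```

On hosts where the C type long is 64 bits wide this check returns false, which illustrates why the usual correspondence of types should not be taken for granted.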
Before sending message out its format header must be initialized with the description of local machine data types; this is done as
On reception, the message must be explicitly decoded:
Conversion code is a segment of C code using the following macros:
| Macro | Definition |
|---|---|
| CVT_CHAR | Convert one char |
| CVT_CHARS(n) | Convert n chars |
| CVT_UCHAR | Convert one uchar |
| CVT_UCHARS(n) | Convert n uchars |
| CVT_SHORT | Convert one short |
| CVT_SHORTS(n) | Convert n shorts |
| CVT_USHORT | Convert one ushort |
| CVT_USHORTS(n) | Convert n ushorts |
| CVT_LONG | Convert one long |
| CVT_LONGS(n) | Convert n longs |
| CVT_ULONG | Convert one ulong |
| CVT_ULONGS(n) | Convert n ulongs |
| CVT_XLONG | Convert one xlong |
| CVT_XLONGS(n) | Convert n xlongs |
| CVT_UXLONG | Convert one uxlong |
| CVT_UXLONGS(n) | Convert n uxlongs |
| CVT_FLOAT | Convert one float |
| CVT_FLOATS(n) | Convert n floats |
| CVT_DOUBLE | Convert one double |
| CVT_DOUBLES(n) | Convert n doubles |
| CVT_EOM | True if end of message is reached |
Every call to a conversion macro converts one or more data elements of a corresponding type, starting from the beginning of the message. Since previous fields are already converted into local host's representation, the conversion code may choose different course of action. This can be used to decode messages with unions or variable-size elements within.
If a message contains only integer data elements, use IDECODE instead of DECODE, as it may produce significantly faster conversion code.