eCAL Zero Copy#
Note
The eCAL Zero Copy mode was introduced in two steps:
eCAL 5.10 (zero-copy on subscriber side only, still one memory copy on the publisher side)
eCAL 5.12 (zero-copy on publisher and subscriber side, see Full Zero Copy Behavior)
In all versions it is turned off by default.
Enabling eCAL Zero Copy#
Use zero-copy as system-default:
Zero Copy can be enabled as the system default in the ecal.ini file as follows:
[publisher]
memfile_zero_copy = 1
Use zero-copy for a single publisher (from your code):
Zero-copy can be activated (or deactivated) for a single publisher from the eCAL API:
// Create a publisher (topic name "person")
eCAL::protobuf::CPublisher<pb::People::Person> pub("person");
// Enable zero-copy for this publisher
pub.ShmEnableZeroCopy(true);
Keep in mind that using protobuf for serialization will still:
Require you to set the data in the protobuf object
Later cause a copy by serializing into the SHM buffer.
If you want to avoid this copy, you can use the low-level API to directly operate on the SHM buffer.
Full Zero Copy behavior#
The Full eCAL Zero Copy mechanism works only for local (inner-host) publish-subscribe connections. Data sent over a network connection does not benefit from this feature.
Mixed Layer connection#
This describes the case where a publisher publishes its data in parallel via shared memory and network (tcp or udp), so there is at least one local subscription and one external (network) subscription on the provided topic.
Publisher:
Regardless of whether the data is generated by a Low Level Binary Publisher or by a Protobuf Publisher, it is always written to a process-internal cache first. This memory cache is then passed sequentially to the connected transport layers “shared memory”, “innerprocess”, “udp” and “tcp”, in this order.
Compared to the Full Zero Copy behavior described above with only local (shm) connections, we have a copy of the user payload on the publisher side again.
This leads to the following publication sequence for a local connection:
Protobuf API Level:
The user sets the data in the protobuf object
The publisher serializes the protobuf object into a process internal data cache
The publisher locks the SHM buffer.
The publisher copies the process internal data cache to the SHM buffer.
The publisher unlocks the SHM buffer
Binary API Level:
The publisher copies the binary user data into a process internal data cache
The publisher locks the SHM buffer
The publisher copies the process internal data cache to the SHM buffer.
The publisher unlocks the SHM buffer
The publisher informs all connected subscribers
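The binary-level sequence above can be sketched without eCAL as follows. This is a minimal model and not eCAL's implementation: `MockShmBuffer` and `MockMixedLayerPublisher` are hypothetical names, the SHM buffer is a plain `std::vector` guarded by a `std::mutex`, and informing subscribers is reduced to a counter. It only illustrates that the payload is copied twice on the publisher side in the mixed-layer case.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a shared memory buffer guarded by a lock.
struct MockShmBuffer
{
  std::mutex                mtx;
  std::vector<std::uint8_t> data;
};

// Hypothetical model of a publisher in the mixed-layer case.
class MockMixedLayerPublisher
{
public:
  explicit MockMixedLayerPublisher(MockShmBuffer& shm) : shm_(shm) {}

  // Sends a binary payload and returns the number of payload copies made.
  int Send(const void* payload, std::size_t size)
  {
    assert(payload != nullptr || size == 0);
    int copies = 0;

    // 1. copy the binary user data into the process-internal cache
    const auto* bytes = static_cast<const std::uint8_t*>(payload);
    cache_.assign(bytes, bytes + size);
    ++copies;

    // 2. lock the SHM buffer, 3. copy the cache into it, 4. unlock
    {
      std::lock_guard<std::mutex> lock(shm_.mtx);
      shm_.data = cache_;
      ++copies;
    }

    // 5. inform the connected subscribers (modelled as a counter only)
    ++notifications_;
    return copies;
  }

  int Notifications() const { return notifications_; }

private:
  MockShmBuffer&            shm_;
  std::vector<std::uint8_t> cache_;
  int                       notifications_ = 0;
};
```

In this model the same `cache_` would also be handed to the network layers, which is why the extra copy cannot be avoided once a network subscription exists.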
Subscriber:
Subscribers always use Zero Copy if it is enabled, so they read directly from the SHM buffer.
The subscriber locks the SHM buffer
The subscriber calls the callback function directly with the SHM buffer as parameter
After the callback has finished, the subscriber unlocks the SHM buffer
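The subscriber steps above can be modelled in the same spirit. The sketch below is an illustration under stated assumptions, not eCAL code: `MockShmBuffer`, `DeliverZeroCopy` and `DeliverWithCopy` are hypothetical names. It contrasts the zero-copy path (callback runs on the SHM buffer while the lock is held) with a copying path (one memcpy under the lock, callback on a private copy).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a shared memory buffer guarded by a lock.
struct MockShmBuffer
{
  std::mutex                mtx;
  std::vector<std::uint8_t> data;
};

using ReceiveCallback = std::function<void(const void* buffer, std::size_t size)>;

// Zero-copy delivery: the callback sees the SHM memory itself and the
// buffer stays locked until the callback has returned.
inline void DeliverZeroCopy(MockShmBuffer& shm, const ReceiveCallback& callback)
{
  assert(static_cast<bool>(callback));
  std::lock_guard<std::mutex> lock(shm.mtx);   // lock the SHM buffer
  callback(shm.data.data(), shm.data.size());  // callback on the SHM buffer
}                                              // unlock after the callback

// Copying delivery: the buffer is locked only for the duration of one copy,
// the callback then works on a private copy.
inline void DeliverWithCopy(MockShmBuffer& shm, const ReceiveCallback& callback)
{
  assert(static_cast<bool>(callback));
  std::vector<std::uint8_t> local;
  {
    std::lock_guard<std::mutex> lock(shm.mtx);
    local = shm.data;
  }
  callback(local.data(), local.size());
}
```

The zero-copy variant saves the copy, but a slow callback keeps the buffer locked for its whole runtime, which is the trade-off the comparison table further down refers to.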
Low Level Memory Access#
To unleash the full power of Full eCAL Zero Copy, the user needs to work directly on the eCAL shared memory via the CPayloadWriter API. The idea behind the CPayloadWriter API is to let the user modify only the data in memory that has changed since the last time the data was sent. The aim is to avoid writing the complete memory and thus save computing time and reduce the latency of data transmission.
The payload type CPayloadWriter looks like this (all functions unnecessary for the explanation have been omitted):
/**
 * @brief Base payload writer class to allow zero copy memory operations.
 *
 * This class serves as the base class for payload writers, allowing zero-copy memory
 * operations. The `WriteFull` and `WriteModified` calls may operate on the target
 * memory file directly in zero-copy mode.
 *
 * A partial writing / modification of the memory file is only possible when zero-copy mode
 * is activated. If zero-copy is not enabled, the `WriteModified` method is ignored and the
 * `WriteFull` method is always executed (see CPublisher::ShmEnableZeroCopy)
 *
 */
class CPayloadWriter
{
public:
  /**
   * @brief Perform a full write operation on uninitialized memory.
   *
   * This virtual function allows derived classes to perform a full write operation
   * when the provisioned memory is uninitialized. Typically, this is the case when a
   * memory file had to be recreated or its size had to be changed.
   *
   * @param buffer_  Pointer to the buffer containing the data to be written.
   * @param size_    Size of the data to be written.
   *
   * @return True if the write operation is successful, false otherwise.
   */
  virtual bool WriteFull(void* buffer_, size_t size_) = 0;

  /**
   * @brief Perform a partial write operation to modify existing data.
   *
   * This virtual function allows derived classes to modify existing data when the provisioned
   * memory is already initialized by a WriteFull call (i.e. contains the data from that full write operation).
   *
   * The memory can be partially modified and does not have to be completely rewritten, which leads to significantly
   * higher performance (lower latency).
   *
   * If not implemented (by default), this operation will just call the `WriteFull` function.
   *
   * @param buffer_  Pointer to the buffer containing the data to be modified.
   * @param size_    Size of the data to be modified.
   *
   * @return True if the write/update operation is successful, false otherwise.
   */
  virtual bool WriteModified(void* buffer_, size_t size_) { return WriteFull(buffer_, size_); }

  /**
   * @brief Get the size of the required memory.
   *
   * This virtual function allows derived classes to provide the size of the memory
   * that eCAL needs to allocate.
   *
   * @return The size of the required memory.
   */
  virtual size_t GetSize() = 0;
};
The user must derive their own payload data class and implement at least the WriteFull function. This WriteFull function is called by the low-level eCAL SHM layer when the shared memory file needs to be written for the first time (initial full write action).
For writing partial content (modifying the memory content), the user may define a second function called WriteModified. This function is called by the eCAL SHM layer if the shared memory file is in an initialized state, i.e. if it was written with the previously mentioned WriteFull method. As you can see, the WriteModified function simply calls the WriteFull function by default if it is not overridden.
The implementation of the GetSize method is mandatory. This method is used by the eCAL SHM layer to obtain the size of the memory file that needs to be allocated.
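The call rules described above can be modelled in a few lines. The sketch below is a mock built on assumptions, not eCAL code: `PayloadWriterSketch` mirrors the CPayloadWriter interface so that the snippet compiles without eCAL, and `MockMemFile` and `CounterPayload` are hypothetical names. It shows WriteFull being used for uninitialized memory (or whenever zero copy is off) and WriteModified for all subsequent writes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Local stand-in that mirrors the CPayloadWriter interface.
class PayloadWriterSketch
{
public:
  virtual ~PayloadWriterSketch() = default;
  virtual bool WriteFull(void* buffer_, std::size_t size_) = 0;
  virtual bool WriteModified(void* buffer_, std::size_t size_) { return WriteFull(buffer_, size_); }
  virtual std::size_t GetSize() = 0;
};

// Hypothetical model of a memory file that decides which method to call.
class MockMemFile
{
public:
  explicit MockMemFile(bool zero_copy) : zero_copy_(zero_copy) {}

  bool Write(PayloadWriterSketch& payload)
  {
    const std::size_t required = payload.GetSize();
    if (required != buffer_.size())
    {
      // a size change recreates the memory, so it counts as uninitialized
      buffer_.assign(required, 0);
      initialized_ = false;
    }
    assert(buffer_.size() == required);

    // partial writes are only used in zero-copy mode on initialized memory
    const bool ok = (zero_copy_ && initialized_)
                      ? payload.WriteModified(buffer_.data(), buffer_.size())
                      : payload.WriteFull(buffer_.data(), buffer_.size());
    initialized_ = ok;
    return ok;
  }

  const std::vector<std::uint8_t>& Buffer() const { return buffer_; }

private:
  bool                      zero_copy_;
  bool                      initialized_ = false;
  std::vector<std::uint8_t> buffer_;
};

// Example payload: 16 bytes, of which only the first one changes per send.
class CounterPayload : public PayloadWriterSketch
{
public:
  bool WriteFull(void* buffer_, std::size_t size_) override
  {
    if (buffer_ == nullptr || size_ < GetSize()) return false;
    auto* bytes = static_cast<std::uint8_t*>(buffer_);
    std::fill(bytes, bytes + GetSize(), std::uint8_t{0xFF});  // static part
    bytes[0] = counter_++;                                    // changing part
    ++full_writes;
    return true;
  }

  bool WriteModified(void* buffer_, std::size_t size_) override
  {
    if (buffer_ == nullptr || size_ < GetSize()) return false;
    static_cast<std::uint8_t*>(buffer_)[0] = counter_++;      // changing part only
    ++modified_writes;
    return true;
  }

  std::size_t GetSize() override { return 16; }

  int full_writes     = 0;
  int modified_writes = 0;

private:
  std::uint8_t counter_ = 0;
};
```

Calling `Write` three times on a `MockMemFile(true)` results in one WriteFull and two WriteModified calls in this model, while a `MockMemFile(false)` only ever calls WriteFull.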
Example:
The following primitive example shows how to use the CPayloadWriter API to send a simple binary struct efficiently by implementing a WriteFull method and a WriteModified method that modifies a few struct elements without memcopying the whole structure into memory again. Note that if Full Zero Copy mode is not enabled, only the WriteFull function will be called by eCAL.
This is the customized payload writer class. The WriteFull method creates a new SSimpleStruct struct, updates its content and copies the whole structure into the opened shared memory file buffer. The WriteModified method gets a view of the opened shared memory file and modifies the struct elements clock and bytes by just applying UpdateStruct.
#include <ecal/ecal.h>

#include <cstddef>
#include <cstdint>

// a simple struct to demonstrate
// zero copy modifications
struct alignas(4) SSimpleStruct
{
  uint32_t version      = 1;
  uint16_t rows         = 5;
  uint16_t cols         = 3;
  uint32_t clock        = 0;
  uint8_t  bytes[5 * 3] = { 0 };
};

// a binary payload object that handles
// SSimpleStruct WriteFull and WriteModified functionality
class CStructPayload : public eCAL::CPayloadWriter
{
public:
  // Write the complete SSimpleStruct to the shared memory
  bool WriteFull(void* buf_, size_t len_) override
  {
    // check available size and pointer
    if (len_ < GetSize() || buf_ == nullptr) return false;
    // create a new struct and update its content
    SSimpleStruct simple_struct;
    UpdateStruct(&simple_struct);
    // copy complete struct into the memory
    *static_cast<SSimpleStruct*>(buf_) = simple_struct;
    return true;
  }

  // Modify the SSimpleStruct in the shared memory
  bool WriteModified(void* buf_, size_t len_) override
  {
    // check available size and pointer
    if (len_ < GetSize() || buf_ == nullptr) return false;
    // update the struct in memory
    UpdateStruct(static_cast<SSimpleStruct*>(buf_));
    return true;
  }

  size_t GetSize() override { return sizeof(SSimpleStruct); }

private:
  void UpdateStruct(SSimpleStruct* simple_struct)
  {
    // modify the simple_struct
    simple_struct->clock = clock;
    for (int i = 0; i < (simple_struct->rows * simple_struct->cols); ++i)
    {
      // only the low byte of the clock fits into one element
      simple_struct->bytes[i] = static_cast<uint8_t>(simple_struct->clock);
    }
    // increase internal state clock
    clock++;
  }

  uint32_t clock = 0;
};
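Because the payload class writes SSimpleStruct straight into a raw buffer, the struct has to be safe to copy byte-wise. A minimal sketch of compile-time checks for that assumption follows. The helper `CopyToRawBuffer` is a hypothetical name and not part of eCAL, and the struct is repeated only to keep the snippet self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Same layout as the example struct, repeated to keep this snippet standalone.
struct alignas(4) SSimpleStruct
{
  std::uint32_t version      = 1;
  std::uint16_t rows         = 5;
  std::uint16_t cols         = 3;
  std::uint32_t clock        = 0;
  std::uint8_t  bytes[5 * 3] = { 0 };
};

// A struct that is copied byte-wise between processes should contain
// no pointers or virtual functions and be trivially copyable.
static_assert(std::is_trivially_copyable<SSimpleStruct>::value,
              "SSimpleStruct must be trivially copyable");
static_assert(std::is_standard_layout<SSimpleStruct>::value,
              "SSimpleStruct must have standard layout");
static_assert(sizeof(SSimpleStruct::bytes) == 5 * 3,
              "bytes must hold rows * cols elements");

// Hypothetical helper: copies the struct into a raw buffer if it fits.
// std::memcpy makes no assumption about the alignment of the target.
inline bool CopyToRawBuffer(const SSimpleStruct& src, void* buf, std::size_t len)
{
  if (buf == nullptr || len < sizeof(SSimpleStruct)) return false;
  std::memcpy(buf, &src, sizeof(SSimpleStruct));
  return true;
}
```

If one of the static assertions fails after the struct is extended, the direct in-memory update used by WriteModified should be reconsidered.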
To send this payload you just need a few lines of code:
#include <chrono>
#include <thread>

int main(int argc, char** argv)
{
  // initialize eCAL API
  eCAL::Initialize(argc, argv, "binary_payload_snd");

  // publisher for topic "simple_struct"
  eCAL::CPublisher pub("simple_struct");

  // turn zero copy mode on
  pub.ShmEnableZeroCopy(true);

  // create the simple struct payload
  CStructPayload struct_payload;

  // send updates every 100 ms
  while (eCAL::Ok())
  {
    pub.Send(struct_payload);
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
  }

  // finalize eCAL API
  eCAL::Finalize();

  return 0;
}
Default eCAL SHM vs. Full Zero Copy SHM#
| | Default eCAL SHM | Full Zero Copy SHM |
|---|---|---|
| Memcopies | ❌ 2 additional memcpy (1 for publishing, 1 for each subscriber) | ✅ No memcpy (if Low Level API is used) |
| Partial changes | ❌ Changing only 1 byte causes the entire updated message to be copied to the buffer again | ✅ Changing only 1 byte costs only as much as changing that 1 byte in the target memory, independent of the message size |
| Subscriber decoupling | ✅ Good decoupling between subscribers. Subscribers only block each other for the duration of that 1 memcpy | ❌ Subscribers need to wait for each other to finish their callbacks |
| Pub/Sub decoupling | ✅ Good decoupling between publisher and subscribers | ❌ Subscribers may block publishers |
Combining Zero Copy and Multibuffering#
For technical reasons, the Full Zero Copy mode described above is turned off if the Multibuffering option CPublisher::ShmSetBufferCount is activated.
Default (subscriber side) Zero Copy works in combination with Multibuffering as described.