zahakj
Posts: 23
Joined: Tue Dec 14, 2021 8:00 am

Alignment Trap Error with Xenomai and EtherCAT

Tue Nov 29, 2022 8:58 am

Dear Community,

I have hit a problem which I have little expertise to solve. Please bear with my explanation, since the system setup, the application build process, and the nature of the problem are complicated and difficult to explain.

I have patched the Raspberry Pi Linux kernel (version 5.15.32) for Xenomai, built Xenomai on it, and then built the EtherCAT Master on the same system (an RPi CM4 on a custom carrier board). For reference, we want to use EtherCAT communication to control a multi-joint robot system in which each joint has its own slave. There are 6 slaves per master port (two EtherCAT masters over two PCI Ethernet ports).
Our programming tool is Codeblocks and we want to use it in Release Mode.
The EtherCAT communication along with Xenomai real time thread works well with the Codeblocks debug mode; however, the algorithm run time per loop is slow as expected in Debug mode.

The problem:
While the Xenomai and EtherCAT Master applications work perfectly well in Debug mode, they fail in Release mode.
When we run our code in Release mode, the EtherCAT Master fails to initialize after slave configuration, printing only an ambiguous "bus error" on the console, and dmesg shows:

Code: Select all

[2211.855899] EtherCAT 0: Starting EtherCAT-OP thread
[2211.861444] Alignment trap: not handling ed9e6a00 at [<00013047>]
[2211.861469] 8<--- cut here ---
[2211.861495] Unhandled fault: alignment exception (0x221) at 0xb6f48012
[2211.861524] pgd = 08b91656
[2211.861545] [b6f48012] *pgd=059dd003, *pmd=1fe3b5003
[2211.862181] EtherCAT 0: Releasing Master
As evident, the EtherCAT Master is successfully requested at Master 0 and starts the OP thread successfully as well. However, the data transfer portion creates problems with alignment exceptions.

I have read that Release Mode can expose strange errors known as "release-only bugs." I hope there is a solution to this.

Additional Observation:
Xenomai thread without EtherCAT works perfectly in release mode.

I am stuck and I would be grateful for any kind of help. I can share more details along the conversation if required.

Thank you and awaiting your responses in anticipation.

PhilE
Raspberry Pi Engineer & Forum Moderator
Posts: 4820
Joined: Mon Sep 29, 2014 1:07 pm
Location: Cambridge

Re: Alignment Trap Error with Xenomai and EtherCAT

Tue Nov 29, 2022 3:48 pm

I'm not familiar with EtherCAT, but alignment errors like that usually arise from compiling code for a platform other than the one it was written for. This can be because the sizes of some data structure elements change, or simply because the original platform doesn't care about alignment (e.g. x86 and derivatives) while the new target does (e.g. ARM).

Writing software to be portable requires a bit more care, and is not high on the list of priorities if the intended use is on a platform where it "just works".

zahakj
Posts: 23
Joined: Tue Dec 14, 2021 8:00 am

Re: Alignment Trap Error with Xenomai and EtherCAT

Wed Nov 30, 2022 12:57 am

Thank you very much for your response.

Yes, your observation is completely correct. The EtherCAT code was originally written on an x86 platform (Beckhoff C6015) and was transferred as-is to the Raspberry Pi ARM platform and then run.

I will review the code and check for bugs. However, I feel that this would come down to trial and error. Is there a rule book, tutorial, or guide I can consult on which aspects of the structures used in the code should be amended, to make my task easier?

PhilE
Raspberry Pi Engineer & Forum Moderator
Posts: 4820
Joined: Mon Sep 29, 2014 1:07 pm
Location: Cambridge

Re: Alignment Trap Error with Xenomai and EtherCAT

Wed Nov 30, 2022 9:09 am

The usual rule for portable code is that data elements should be aligned to at least their own size; chars/bytes can be at any address, shorts (16-bit values) must be at even addresses, 32-bit values must be at multiples of 4 etc. Compilers usually take care of alignment within structures by inserting padding where necessary, but it's good practice to do it explicitly in the code. What compilers can't do is fix code like:

Code: Select all

val = *((int *)p);
where an arbitrary pointer (or even integer!) is treated as a pointer to a type with an alignment requirement. On a processor such as an ARM it may be necessary to break that single read down into 2 or more smaller aligned accesses and assemble the results into the required value.
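
A minimal sketch of that idea in C, assuming the data is little-endian and using byte-sized loads as the "smaller aligned accesses" (a byte can sit at any address). The names `read_le32_bytewise` and `read_le32_bytewise_demo` are illustrative and do not come from any library.

```c
#include <assert.h>
#include <stdint.h>

/* Read a 32-bit little-endian value from any address, one byte at a time.
 * Byte loads have no alignment requirement, so this is safe even when p
 * is not a multiple of 4. */
uint32_t read_le32_bytewise(const void *p)
{
    const uint8_t *b = (const uint8_t *)p;
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}

/* Illustrative self-check: place a value at an odd offset and read it back. */
void read_le32_bytewise_demo(void)
{
    uint8_t buf[8] = {0};
    buf[1] = 0x78; buf[2] = 0x56; buf[3] = 0x34; buf[4] = 0x12;
    assert(read_le32_bytewise(buf + 1) == 0x12345678u);
}
```

The cost is four loads and a few shifts per value instead of one word load, so it is best reserved for pointers that may genuinely be misaligned.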

zahakj
Posts: 23
Joined: Tue Dec 14, 2021 8:00 am

Re: Alignment Trap Error with Xenomai and EtherCAT

Tue Jan 03, 2023 8:15 am

Thank you very much for your kind replies. So I have taken some time to investigate the problem in our code. I found out that the problem was in the Xenomai real time cyclic process and within it the EtherCAT Read Data Function (EC_READ_S32()). The change required was very minor and seemingly trivial. I just transferred a few lines from the corresponding class concerning data addresses to the real time thread as follows (I hope it makes sense):

Before:

Code: Select all

for (int i = 0; i < SLAVES_NUM; i++)
{
    joint_pos[i] = EC_READ_S32(g_addrSlavePos[i]);
}
After:

Code: Select all

for (int i = 0; i < SLAVES_NUM; i++)
{
    g_addrSlavePos[i] = domain1_pd + PosOffset[i]; // line copied from an imported class
    joint_pos[i] = EC_READ_S32(g_addrSlavePos[i]);
}
This is the only change I made, and it allowed the code to run normally in Release Mode. So I set about investigating why this works (because it seems too easy!!). You mentioned that data elements need to be aligned to their own size. I changed the setting in "/proc/cpu/alignment" from the default 2 to 3 (i.e. fix + warning), which gave me the following warning in addition to the alignment error:

Code: Select all

ctrlTask (25809) PC=0x0001381c Instr=0xed995a00 Address=0xb6fe7012 FSR 0x221
Here, "ctrlTask" is the Xenomai real-time task function. I am not sure about the PC, Instr, Address, and FSR terms. I would be grateful if you could explain them.
For further investigation, I added the compiler flag "-Wcast-align=strict" alongside the optimization flag "-O2" in the Makefile. Running "make" then produced the following warnings from the Xenomai library headers.

Code: Select all

/opt/xenomai-3.2/include/copperplate/threadobj.h:86:9: warning: cast from 'caddr_t' {aka 'char*'} to 'xnthread_user_window*' increases required alignment of target type [-Wcast-align]
86 | return (struct xnthread_user_window*)
87 | ((caddr_t)cobalt_umm_shared + corespec->u_winoff);

/opt/xenomai-3.2/include/copperplate/threadobj.h:86:9: warning: cast from 'pthread_key_t*' {aka 'unsigned int*'} to 'threadobj*'  increases required alignment of target type [-Wcast-align]
180 | #define THREADOBJ_IRQCONTEXT ((struct threadobj *)&threadobj_tskey) 
 
I am not sure if the above lines are a source of the problem. I have a few additional questions:
1) Is the one-line change made in our code enough to run our application in a stable manner? I am worried about the application crashing in Release Mode.
2) It is an interesting problem, since Xenomai processes run normally in our short test programs with EtherCAT Master, and the code also runs normally in Debug mode. I would welcome any comments on your part.
Thank you very much.

PhilE
Raspberry Pi Engineer & Forum Moderator
Posts: 4820
Joined: Mon Sep 29, 2014 1:07 pm
Location: Cambridge

Re: Alignment Trap Error with Xenomai and EtherCAT

Tue Jan 03, 2023 9:59 am

zahakj wrote:
I am not sure about the PC, Instr, Address, and FSR terms. I would be grateful if you could explain them.
PC is the program counter in the application: the memory address (in the process's virtual address space) of the instruction that caused the alignment exception. Instr is the integer representation of the instruction that was being executed, which could be:

Code: Select all

00 5A 99 ED    vldr s10, [sb]
Address is the memory location the instruction was trying to access, in this case 0xb6fe7012. This address is 2-byte aligned, but not 4-byte aligned, which is likely the cause of the problem.

I suggest you enable debugging symbols in the compiler and linker (usually '-g') and don't run "strip" on the output ELF file. You can then disassemble it (e.g. "objdump -d ctrlTask > ctrlTask.dis"), then jump to address 0x0001381c in the disassembly to a) verify the instruction, and b) see which function it is in, and perhaps work out which data structure the access corresponds to.

FSR is just a value that encodes the cause of the exception, in this case an unaligned access.
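
To make the Instr and Address values concrete, the sketch below pulls the fields out of 0xed995a00 and checks the low bits of 0xb6fe7012. It assumes the A32 VLDR bit layout as given in the Arm Architecture Reference Manual; the struct and function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an A32 VLDR instruction word (layout assumed from the Arm ARM). */
struct vldr_fields {
    int is_vldr;        /* 1 if the word matches the VLDR bit pattern */
    int double_prec;    /* 0: single precision (sN), 1: double (dN) */
    unsigned base_reg;  /* Rn, the register holding the address */
    unsigned dest_reg;  /* index N of the sN or dN destination */
    int32_t offset;     /* signed byte offset added to Rn */
};

struct vldr_fields decode_vldr(uint32_t instr)
{
    struct vldr_fields f = {0};
    unsigned u    = (instr >> 23) & 1;   /* 1: add offset, 0: subtract */
    unsigned d    = (instr >> 22) & 1;   /* extra bit of the register index */
    unsigned vd   = (instr >> 12) & 0xF;
    unsigned imm8 = instr & 0xFF;

    /* bits 27:24 == 1101, bits 21:20 == 01, bits 11:9 == 101 */
    f.is_vldr = ((instr >> 24) & 0xF) == 0xD
             && ((instr >> 20) & 0x3) == 0x1
             && ((instr >> 9) & 0x7) == 0x5;
    f.double_prec = (instr >> 8) & 1;
    f.base_reg = (instr >> 16) & 0xF;
    f.dest_reg = f.double_prec ? ((d << 4) | vd) : ((vd << 1) | d);
    f.offset = (int32_t)(imm8 * 4) * (u ? 1 : -1);
    return f;
}

/* Returns 1 if the 32-bit address is a multiple of 4. */
int is_word_aligned(uint32_t addr)
{
    return (addr & 3u) == 0;
}

/* Illustrative self-check using the values from the warning line. */
void decode_demo(void)
{
    struct vldr_fields f = decode_vldr(0xed995a00u);
    assert(f.is_vldr && !f.double_prec);
    assert(f.base_reg == 9 && f.dest_reg == 10 && f.offset == 0);
    assert(!is_word_aligned(0xb6fe7012u));
}
```

Decoded this way, the word is a single-precision load into s10 from the address held in r9 (sb) with a zero offset, which agrees with the disassembly above. VLDR is generally documented as requiring a word-aligned address, which would be consistent with a fault at an address ending in 0x12.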

zahakj
Posts: 23
Joined: Tue Dec 14, 2021 8:00 am

Re: Alignment Trap Error with Xenomai and EtherCAT

Tue Jan 10, 2023 5:12 am

Thank you so much for your reply.

Referring to your posts, I have carried out an investigation into the problem with some study. Please bear with my explanation below. As I have mentioned before, the alignment error occurs as follows:

Code: Select all

[  278.525109] Alignment trap: not handling instruction edd27a00 at [<00012fdc>]
[  278.525135] 8<--- cut here ---
[  278.525169] Unhandled fault: alignment exception (0x221) at 0xb6f28012
[  278.525200] pgd = a1421cae
[  278.525227] [b6f28012] *pgd=05654003, *pmd=bbfec003
I checked where this address (0xb6f28012) comes from. The EtherCAT Master library uses it to fetch data (specifically, 32-bit position data) from the EtherCAT process domain via the EC_READ_S32 macro defined in the library. The address is only 2-byte aligned, yet it is used to retrieve 32-bit (4-byte) data; since it is not divisible by 4, this is where the problem appears to occur in the optimized build (gcc -O2). The address 0xb6f28012 is the sum of the EtherCAT process domain pointer (0xb6f28000) and the offset (0x12), both of which are provided by the EtherCAT application. The process domain pointer (0xb6f28000) is a uint8_t* retrieved through the following function from the EtherCAT library (ecrt.h):

Code: Select all

/** Returns the domain's process data.
 *
 * - In kernel context: If external memory was provided with
 * ecrt_domain_external_memory(), the returned pointer will contain the
 * address of that memory. Otherwise it will point to the internally allocated
 * memory. In the latter case, this method may not be called before
 * ecrt_master_activate().
 *
 * - In userspace context: This method has to be called after
 * ecrt_master_activate() to get the mapped domain process data memory.
 *
 * \return Pointer to the process data memory.
 */
uint8_t *ecrt_domain_data(
        ec_domain_t *domain /**< Domain. */
        );
The address (0xb6f28012) is held in a uint32_t* pointer and is passed to the EC_READ_S32 macro, which is defined in ecrt.h as follows:

Code: Select all

/** Read a 32-bit signed value from EtherCAT data.
 *
 * \param DATA EtherCAT data pointer
 * \return EtherCAT data value
 */
#define EC_READ_S32(DATA) \
     ((int32_t) le32_to_cpup((void *) (DATA)))
Hence it seems that the EtherCAT library returns a uint8_t* process domain pointer which after pointer arithmetic with the process data offset value (0x12) becomes 0xb6f28012 and is stored in a uint32_t* pointer.
Backtracking to the le32_to_cpup in ecrt.h, we find

Code: Select all

#define le32_to_cpup(x) le32_to_cpu(*((uint32_t *)(x)))
Hence, the original uint8_t* pointer to the EtherCAT process domain plus the offset (0xb6f28000 + 0x12 = 0xb6f28012) is being dereferenced as *((uint32_t*)(x)). This might be the cause of the alignment fault when the program is compiled with -O2.
EC_READ_S32 successfully retrieves process data from addresses 0xb6f28000 (0xb6f28000 + 0x00), 0xb6f28008 (0xb6f28000 + 0x08), and 0xb6f2800c (0xb6f28000 + 0x0c), since they are 4-byte aligned. However, the program stops with an alignment trap at 0xb6f28012.
Here lies the problem: both the process domain pointer and the offsets are returned by the EtherCAT application, and together they produce a misaligned address. I am stuck here, as the misalignment seems to be caused by the addresses returned by the EtherCAT application, not by our code.
I hope my explanation makes sense. Note that the program runs well with the -O2 option disabled. However, we need the program to execute faster, since we want to run fast processes using Xenomai. Would you have a comment on this problem? I would be very grateful for your help.
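
The base-plus-offset arithmetic described in this post can be checked mechanically. The sketch below is only an illustration: the base address and offsets are the ones quoted above, while `is_aligned`, `count_word_aligned`, and `offsets_demo` are hypothetical helper names, not part of the EtherCAT library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if addr is a multiple of align (align must be a power of two). */
int is_aligned(uintptr_t addr, size_t align)
{
    return (addr & (align - 1)) == 0;
}

/* Counts how many offsets give a 4-byte-aligned address when added to base. */
size_t count_word_aligned(uintptr_t base, const unsigned *offsets, size_t n)
{
    size_t aligned = 0;
    for (size_t i = 0; i < n; i++) {
        if (is_aligned(base + offsets[i], sizeof(uint32_t)))
            aligned++;
    }
    return aligned;
}

/* Illustrative self-check with the process domain base and offsets quoted
 * in the post: 0x00, 0x08 and 0x0c are word aligned, 0x12 is not. */
void offsets_demo(void)
{
    const unsigned offsets[] = {0x00, 0x08, 0x0c, 0x12};
    const uintptr_t base = (uintptr_t)0xb6f28000u;

    assert(count_word_aligned(base, offsets, 4) == 3);
    assert(is_aligned(base + 0x12, 2));   /* 2-byte aligned */
    assert(!is_aligned(base + 0x12, 4));  /* but not 4-byte aligned */
}
```

Because the base is page aligned, the alignment of each final address is decided entirely by the offset, so only the 0x12 entry can fault on a 32-bit access.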

Memotech Bill
Posts: 470
Joined: Sun Nov 18, 2018 9:23 am

Re: Alignment Trap Error with Xenomai and EtherCAT

Tue Jan 10, 2023 8:39 am

There were similar alignment problems when porting BBC BASIC to the Pico viewtopic.php?t=316761.

The solution there was the following set of macros:

Code: Select all

// Alignment helper types:
typedef __attribute__((aligned(1))) int unaligned_int;
typedef __attribute__((aligned(1))) intptr_t unaligned_intptr_t;
typedef __attribute__((aligned(1))) unsigned int unaligned_uint;
typedef __attribute__((aligned(1))) unsigned short unaligned_ushort;
typedef __attribute__((aligned(1))) void* unaligned_void_ptr;
typedef __attribute__((aligned(1))) char* unaligned_char_ptr;
typedef __attribute__((aligned(1))) VAR unaligned_VAR;

// Helper macros to fix alignment problem:
#define ILOAD(p)    *((unaligned_int*)(p))
#define ISTORE(p,i) *((unaligned_int*)(p)) = i
#define TLOAD(p)    *((unaligned_intptr_t*)(p))
#define TSTORE(p,i) *((unaligned_intptr_t*)(p)) = i 
#define ULOAD(p)    *((unaligned_uint*)(p))
#define USTORE(p,i) *((unaligned_uint*)(p)) = i 
#define SLOAD(p)    *((unaligned_ushort*)(p))
#define SSTORE(p,i) *((unaligned_ushort*)(p)) = i 
#define VLOAD(p)    *((unaligned_void_ptr*)(p))
#define VSTORE(p,i) *((unaligned_void_ptr*)(p)) = i 
#define CLOAD(p)    *((unaligned_char_ptr*)(p))
#define CSTORE(p,i) *((unaligned_char_ptr*)(p)) = i 
#define NLOAD(p)    *((unaligned_VAR*)(p))
#define NSTORE(p,i) *((unaligned_VAR*)(p)) = i
You may try defining a similar unaligned uint32_t type, and then use that in the macro definition:

Code: Select all

typedef __attribute__((aligned(1))) uint32_t unaligned_uint32;
#define le32_to_cpup(x) le32_to_cpu(*((unaligned_uint32 *)(x)))
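
A self-contained sketch of the same idea, assuming GCC or Clang (the `aligned` attribute is a compiler extension). `load_u32_unaligned`, `store_u32_unaligned`, and `unaligned_u32_demo` are hypothetical helper names, and the byte-order conversion done by le32_to_cpu is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* A uint32_t whose required alignment is lowered to 1, so the compiler
 * may not assume 4-byte alignment when it is dereferenced. */
typedef __attribute__((aligned(1))) uint32_t unaligned_u32;

/* Read a 32-bit value (host byte order) from an arbitrarily aligned address. */
uint32_t load_u32_unaligned(const void *p)
{
    return *(const unaligned_u32 *)p;
}

/* Write a 32-bit value (host byte order) to an arbitrarily aligned address. */
void store_u32_unaligned(void *p, uint32_t v)
{
    *(unaligned_u32 *)p = v;
}

/* Illustrative self-check: round-trip a value through offset 0x12 of a
 * buffer, mirroring the misaligned offset seen in this thread. */
void unaligned_u32_demo(void)
{
    uint8_t pd[32] = {0};
    store_u32_unaligned(pd + 0x12, 0xCAFEF00Du);
    assert(load_u32_unaligned(pd + 0x12) == 0xCAFEF00Du);
}
```

Only accesses made through the `unaligned_u32` type are affected; ordinary `uint32_t` accesses elsewhere keep their normal alignment assumptions.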

zahakj
Posts: 23
Joined: Tue Dec 14, 2021 8:00 am

Re: Alignment Trap Error with Xenomai and EtherCAT

Wed Jan 11, 2023 8:10 am

Thank you so much PhilE and Memotech Bill for your replies. Applying the compiler alignment attribute in ecrt.h solved the problem. I had been stuck on this problem for weeks, and I am so grateful.

As an additional test, I tried the compiler flags "-munaligned-access" and "-mno-unaligned-access", which enable or disable the compiler's support for unaligned data access. However, neither option solved the problem. What may be the reason for it?

PhilE
Raspberry Pi Engineer & Forum Moderator
Posts: 4820
Joined: Mon Sep 29, 2014 1:07 pm
Location: Cambridge

Re: Alignment Trap Error with Xenomai and EtherCAT

Wed Jan 11, 2023 8:45 am

The GCC documentation says:
-munaligned-access
-mno-unaligned-access
Enables (or disables) reading and writing of 16- and 32- bit values from addresses that are not 16- or 32- bit aligned. By default unaligned access is disabled for all pre-ARMv6 and all ARMv6-M architectures, and enabled for all other architectures. If unaligned access is not enabled then words in packed data structures will be accessed a byte at a time.
Note the use of the phrase "packed data structures", which I think refers to the "packed" attribute that you can use with structures.
packed
The packed attribute specifies that a variable or structure field should have the smallest possible alignment—one byte for a variable, and one bit for a field, unless you specify a larger value with the aligned attribute.
Here is a structure in which the field x is packed, so that it immediately follows a:

Code: Select all

          struct foo
          {
            char a;
            int x[2] __attribute__ ((packed));
          };
I don't know how the pointer is derived in your case, but the compiler needs some hint that the "int" it points to may not be aligned - it's not reasonable to expect it to assemble all 32-bit quantities a byte at a time, or even to do an alignment check on each access.
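
A small sketch of what the `packed` attribute changes in the layout, assuming GCC or Clang. `struct foo` is the structure from the documentation quoted above; `struct foo_padded`, `read_second`, and `packed_demo` are added here only for comparison and are not from the GCC documentation.

```c
#include <assert.h>
#include <stddef.h>

/* The documented example: x is packed, so it immediately follows a. */
struct foo {
    char a;
    int x[2] __attribute__((packed));
};

/* Same fields without the attribute: the compiler pads after a. */
struct foo_padded {
    char a;
    int x[2];
};

/* Accessing x through the struct type tells the compiler the field may be
 * misaligned, so it can choose a suitable access sequence. */
int read_second(const struct foo *f)
{
    return f->x[1];
}

/* Illustrative self-check of the two layouts. */
void packed_demo(void)
{
    struct foo f = { 'a', { 1, 2 } };

    assert(offsetof(struct foo, x) == 1);
    assert(offsetof(struct foo_padded, x) == _Alignof(int));
    assert(read_second(&f) == 2);
}
```

The hint travels with the type: a plain `int *` pointing into `f.x` would lose it, which is the situation the thread's `uint32_t *` cast was in.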

Bill may have more experience in this area.

Memotech Bill
Posts: 470
Joined: Sun Nov 18, 2018 9:23 am

Re: Alignment Trap Error with Xenomai and EtherCAT

Wed Jan 11, 2023 10:44 am

PhilE wrote:
Wed Jan 11, 2023 8:45 am
Bill may have more experience in this area.
No, probably less. I did not find the solution for BBC Basic. I only happened to recognise that the solution was probably also applicable to this case.

As Phil remarks, using

Code: Select all

__attribute__((aligned(1)))
tells the compiler that this one location needs to be accessed byte-wise. All other memory locations can still be accessed using efficient aligned reads / writes.

zahakj
Posts: 23
Joined: Tue Dec 14, 2021 8:00 am

Re: Alignment Trap Error with Xenomai and EtherCAT

Fri Jan 20, 2023 1:23 am

Again thank you for the replies. I think that the compiler attribute ("__attribute__((aligned(1)))") has solved our problem. The program runs without crashing thus far. I will keep you posted if I run into further trouble.
Thank you so much. I am very grateful for your help. :)
