All tasks failing on Linux host

Profile [AF>Le_Pommier] Jerome_C2005
Joined: 20 Oct 09
Posts: 5
Credit: 2,021,992
RAC: 14,454
Message 421 - Posted: 23 May 2020, 18:42:32 UTC
Last modified: 23 May 2020, 18:43:52 UTC

Hi

I have a Debian 10 Linux host with 2 Intel Xeon E5645 CPUs @ 2.40GHz and 20 GB RAM where *all the tasks* (more than 250) are failing after a few seconds:

<core_client_version>7.16.6</core_client_version>
<![CDATA[
<message>
process got signal 11</message>
<stderr_txt>
SIGSEGV: segmentation violation
SIGABRT: abort called
SIGABRT: abort called
SIGABRT: abort called
SIGABRT: abort called
SIGABRT: abort called
SIGABRT: abort called

(repeated many times)

I have another Debian 10 Linux host with a slightly better CPU (2 Intel Xeon X5650 CPUs @ 2.67GHz, with only 8 GB RAM) where it works fine!

Any idea what might be happening?

Profile [AF>Le_Pommier] Jerome_C2005
Joined: 20 Oct 09
Posts: 5
Credit: 2,021,992
RAC: 14,454
Message 422 - Posted: 28 May 2020, 8:36:40 UTC

For information, this was fixed by:

1. sudo nano /etc/default/grub
2. replace the line
GRUB_CMDLINE_LINUX_DEFAULT="<existing text>"
with
GRUB_CMDLINE_LINUX_DEFAULT="<existing text> vsyscall=emulate"
3. save
4. sudo update-grub
5. sudo reboot
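
For anyone who prefers not to edit the file by hand, steps 1 to 3 above could be scripted roughly as below. This is only a sketch: the function name add_vsyscall_emulate is made up for illustration, and it assumes the GRUB_CMDLINE_LINUX_DEFAULT line uses double quotes as shown above. It leaves a .bak copy of the file next to the original.

```shell
#!/bin/sh
# Hypothetical helper (not part of GRUB or Debian): append vsyscall=emulate
# to the GRUB_CMDLINE_LINUX_DEFAULT line of the given file, unless the
# option is already present on that line.
add_vsyscall_emulate() {
    grub_file="$1"
    # Nothing to do if the option is already there.
    if grep -q '^GRUB_CMDLINE_LINUX_DEFAULT=.*vsyscall=emulate' "$grub_file"; then
        return 0
    fi
    # Insert the option just before the closing double quote;
    # the previous version of the file is kept as "$grub_file.bak".
    sed -i.bak 's/^\(GRUB_CMDLINE_LINUX_DEFAULT="[^"]*\)"/\1 vsyscall=emulate"/' "$grub_file"
}

# Example use, as root (steps 4 and 5 are still needed afterwards):
#   add_vsyscall_emulate /etc/default/grub
#   update-grub
#   reboot
```

Running it twice should not add the option a second time. After the reboot, grep vsyscall /proc/cmdline should show the option if the change was picked up.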

Profile bearnol
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 3 Dec 06
Posts: 280
Credit: 7,389,137
RAC: 5,731
Message 423 - Posted: 28 May 2020, 18:39:16 UTC - in response to Message 422.

Thanks very much for that, Jerome...

Profile [AF>Le_Pommier] Jerome_C2005
Joined: 20 Oct 09
Posts: 5
Credit: 2,021,992
RAC: 14,454
Message 424 - Posted: 29 May 2020, 15:26:34 UTC

You're welcome :)

It seems the same issue happened on other projects (WCG and ibercivis, that I'm aware of) with Debian 10 hosts.

I *think* that on my other host mentioned above, where "it was working fine", I had already applied exactly that same fix, found in the ibercivis forum, which is why it was already working with your project.
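
A quick way to compare two hosts is to look at the kernel command line each one booted with. The sketch below is only an illustration: vsyscall_mode is a made-up name, and it simply reads /proc/cmdline (or a file given as argument) and prints the vsyscall= value, or "unset" when the option is not there.

```shell
#!/bin/sh
# Hypothetical helper: print the vsyscall= value found in a kernel command
# line file (default: /proc/cmdline), or "unset" if the option is absent.
vsyscall_mode() {
    cmdline_file="${1:-/proc/cmdline}"
    # Put each option on its own line, keep only the vsyscall= value,
    # and take the last occurrence if the option appears several times.
    mode=$(tr ' ' '\n' < "$cmdline_file" | sed -n 's/^vsyscall=//p' | tail -n 1)
    echo "${mode:-unset}"
}

# Example use on a running host:
#   vsyscall_mode
```

A host where the fix above was applied and rebooted should print "emulate"; "unset" only means the option was not passed at boot, so the kernel's built-in default applies.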

The forum topic is here (it includes a link to a WCG topic where the initial solution was given).

I reported there that it was working, and then it stopped working again, but the errors were completely different, so I think it was another issue with their app.




Copyright © 2022 M+2 Group