Posts by VietOZ

Mersenneplustwo home page
1) Message boards : Number crunching : Hi - FB Sprint (Message 486)
Posted 499 days ago by VietOZ
The trick to avoid most, though not all, of the errors is to start out with fewer than 32 threads. Say, start 30 threads first, let them run for about 10 minutes or so, then start another 30 threads, and so on. It's annoying to babysit each machine whenever I run this project, but it's better than wasting electricity.
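
For anyone who wants to plan that ramp-up instead of eyeballing it, here is a minimal sketch of the schedule described above. It only computes when to raise the thread limit and to what value; how you actually apply the limit (for example, the CPU-usage preference in BOINC Manager) is left out. The function name `ramp_schedule` and its parameters are made up for illustration, and the defaults of 30 threads and 10 minutes are simply the numbers from this post.

```python
def ramp_schedule(total_threads, batch=30, wait_minutes=10):
    """Return a list of (minute_offset, allowed_threads) steps.

    Hypothetical helper: starts `batch` threads, waits `wait_minutes`,
    adds another `batch`, and so on until `total_threads` is reached.
    """
    if total_threads < 0 or batch <= 0 or wait_minutes < 0:
        raise ValueError("total_threads and wait_minutes must be >= 0, batch > 0")

    steps = []
    allowed = 0
    minute = 0
    while allowed < total_threads:
        # Never allow more threads than the machine is meant to run.
        allowed = min(allowed + batch, total_threads)
        steps.append((minute, allowed))
        minute += wait_minutes
    return steps


if __name__ == "__main__":
    # Example: a 64-thread machine, as mentioned elsewhere in this thread.
    for minute, allowed in ramp_schedule(64):
        print(f"t+{minute} min: allow {allowed} threads")
```

With the defaults, a 64-thread box reaches full load in three steps spread over 20 minutes, which is still far less than half a day of errors.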

Issues like that need to be addressed. I know you're a one-man show, but work on it when you can instead of ignoring it. Each error causes BOINC to increase the time until the next "update". You know it ... and if we get enough errors, the next update could be a day later, which means the computer just sits there idle waiting for it. More advanced users will then write scripts to force an update every N minutes/seconds. And if a bunch of users do that, it's like a DDoS on your server.
My point is that ignoring the problem can backfire. You force users to find ways to make it work for themselves because no one is addressing the problem.
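
To put rough numbers on both halves of that argument, here is a small sketch. It is NOT BOINC's actual backoff code: the doubling rule, the 60-second base, and the one-day cap are assumptions picked only to show how a handful of consecutive errors can push the next update out to a day. The second function estimates the request load if users script forced updates; both function names are hypothetical.

```python
def backoff_delay(consecutive_errors, base_s=60, cap_s=86400):
    """Illustrative exponential backoff (assumed constants, not BOINC's).

    The delay doubles with every consecutive error and is capped at one day.
    """
    if consecutive_errors < 0:
        raise ValueError("consecutive_errors must be >= 0")
    return min(cap_s, base_s * 2 ** consecutive_errors)


def server_requests_per_hour(users, interval_s):
    """Requests per hour if `users` hosts each force an update every `interval_s` seconds."""
    if users < 0 or interval_s <= 0:
        raise ValueError("users must be >= 0 and interval_s > 0")
    return users * 3600 / interval_s


if __name__ == "__main__":
    for errors in (0, 5, 10, 11):
        print(f"{errors} errors in a row: wait {backoff_delay(errors)} s")
    # 100 hosts forcing an update every 30 seconds (made-up figures).
    print(server_requests_per_hour(100, 30), "requests per hour")
```

Under these assumed constants the delay hits the one-day cap after 11 errors in a row, and 100 hosts polling every 30 seconds would send 12,000 requests per hour.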
2) Message boards : Number crunching : Hi - FB Sprint (Message 479)
Posted 499 days ago by VietOZ
One can only guess at the amount of wasted electricity that has been consumed... :-(


Tim, since this project is quorum 2, I believe, you'll also have to account for the users who are waiting for their pendings to clear. If they ever will??

As for the server issues outside of challenges/competitions, I can confirm them. A while back, SaM (our team captain) and I were trying to gain some points on the Marathon. We were each running about 500 threads, got a "could not open database" error that lasted for a few days, and all of our work went down the drain.
Another time, and I won't mention how many WUs, I gambled and did a speculative bunker. The server went poop for almost a week ... all work gone again. OK, I took a chance ... can't blame anybody ... cool.

There's also another issue that could be more of a BOINC code issue, but if other projects can minimize the errors, I don't understand why WEP couldn't. The issue was/is that if you have a high-core-count machine, I'd say more than 32 threads, you're almost guaranteed to get a bunch of errors saying something like "file exit too long" (something like that, I can't remember the exact words). The more threads you have, the longer it takes for the errors to phase out before everything runs smoothly. My 64-thread machines usually spend about half a day getting errors before they can crunch at full speed. Y'all don't have to take my word for it: attach a 64-thread machine, run it for a day, and you'll see the issue still exists to this day. So basically, if you want to contribute to this project and have a high-thread-count machine, expect to waste half a day of electricity.
3) Message boards : Number crunching : Hi - FB Sprint (Message 458)
Posted 500 days ago by VietOZ
Fair enough. At least we now know what to do instead of waiting and hoping for the next few days. 12k WUs lost, but all good. Good luck with everything in the future.
4) Message boards : Number crunching : Hi - FB Sprint (Message 452)
Posted 500 days ago by VietOZ
It's not just this Sprint. I've lost tons of WUs that were never credited due to either server downtime or "database could not open" errors OUTSIDE of challenges. Meaning that on a regular basis, these errors occur from time to time and last long enough for the WUs to expire. I understand that it's a small project, and maybe a project that doesn't need any more crunching, just one that's available for those who want to hit milestones. But if it's a live project, then we need communication. We need to address the issues when they come up, even if it's just a FUN project. Suggestions have been made; if you like or don't like them, just give an answer/response. Not too hard, is it?


Copyright © 2021 M+2 Group