  1. #1
    Joined
    Jul 2001
    Location
    UK
    Age
    51
    Posts
    20,229

    Compiling Boinc on Linux - Discussion thread

    I have recently added a Linux BOINC guide to the main BOINC FAQ sticky.

    As part of that guide, I touched on the fact that the Linux boinc client is slower than the Windows version and described how to compile the boinc client on Linux with optimisations to improve its speed. There was a lot of additional information that would be useful to anyone wanting to do this, which I left out of the guide to keep it clear and concise. I've started this thread to make that extra information available and to provide a place for discussion.

    Note: A link to the guide has been posted on the SETI@Home message boards, and the guide has also been linked from Paul Buck's Boinc FAQ.

    Benchmarks

    First off, a word about benchmarks. The boinc client's internal benchmarks run a lot slower for the Linux client than for the Windows client on the same hardware; the Windows benchmarks are approximately twice as fast. This means that on the same hardware, the Linux client will request half the credit that the Windows client would if the work unit took the same time to complete. In addition, the seti client that actually does the science shows differences in performance across the Windows and Linux platforms, although here the differences are not so great.
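    As a rough illustration of why halved benchmarks mean halved credit requests, here is a minimal sketch. It assumes that claimed credit scales linearly with the sum of the two benchmark scores and with CPU time; the scale factor and the scores below are assumptions for illustration, not values taken from the boinc source.

```python
# Assumption: claimed credit is proportional to (Whetstone + Dhrystone) * CPU time.
# ASSUMED_SCALE is an illustrative divisor, not a value confirmed by this thread.
ASSUMED_SCALE = 1_728_000


def claimed_credit(whetstone, dhrystone, cpu_seconds):
    """Credit a client would request under the linear assumption above."""
    return (whetstone + dhrystone) * cpu_seconds / ASSUMED_SCALE


cpu_seconds = 3 * 3600  # hypothetical work unit taking three hours on both clients

windows = claimed_credit(2000, 4000, cpu_seconds)  # hypothetical Windows scores
linux = claimed_credit(1000, 2000, cpu_seconds)    # same hardware, half the scores

print(linux / windows)
```

    Whatever the real scale factor is, it cancels out in the ratio, so under this assumption the only things that matter are the benchmark scores and the time taken.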

    Let me try to explain where these differences come from. The Windows clients are compiled using the Visual C++ compiler whereas the Linux clients are compiled using the GNU gcc compiler. The Linux gcc compiler is designed to produce binaries compatible across a greater range of hardware platforms than the Windows compiler, and speed is sacrificed for portability. This explains the relatively small differences between the Linux and Windows versions of the SETI@Home client, but it doesn't explain the huge differences in the benchmarks returned by the different boinc clients.

    Here we need to look at compiler optimisations. Compilers are told which optimisations to use when compiling, to help produce code that runs faster. These optimisations can be quite simple, like taking advantage of a feature of a specific processor such as 3dnow or sse instructions, or they can be quite complex and involve steps such as trying to pre-emptively predict patterns in the code. It would appear that the Visual C++ compiler does such a good job of optimising the relatively simple benchmark code in the boinc client that it effectively cheats and does not perform a full and proper benchmark. The Linux gcc compiler is not so efficient, so the benchmark runs in a way that gives a more accurate indication of the processor's ability to process a work unit, which after all is the purpose of the benchmark in the first place. Optimisations are generally a good thing, but here we see an example that defeats the purpose.

    I found this interesting quote on the Boinc_opt message list:


    Benchmark code (and I suspect all recent Whetstone benchmarks) was cheating.

    Not intentionally on the benchmark designer or boinc developer's parts, but because the optimizing compiler was too clever for OUR own good.

    For example...below is a section of the whetstone benchmark source code (there are 8 sections). Its goal is to see how well a particular CPU (and indirectly the compiler) can calculate integer math and do array offset calculations. (how well = how fast).

    The Visual C++ compiler figured out that the E1 array is only 4 elements long. It also figured out by simulation at compile time what the 4 values in those array elements would be at the end of the loop. It therefore just generated code to put 4 values into the array, and never generated any code for integer math or array indexing.

    Well in a small case optimization like this is good, but our BIG program has arrays and math that the compiler can't figure out at compile time. And the benchmark was supposed to give us an idea of how long such things would really take.
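
    The effect described in the quote can be sketched as follows. This is a hypothetical stand-in for a benchmark section, not the actual Whetstone or boinc source, and it is written in Python purely to show the idea: when every input is known in advance, the final array contents do not depend on actually running the loop, so an optimising compiler is free to replace the loop with a handful of stores.

```python
def integer_section(iterations):
    """Hypothetical benchmark-style section: integer maths plus array indexing.

    Not the real Whetstone code; the names and formulas are illustrative.
    """
    e1 = [0, 0, 0, 0]  # a 4-element array, as in the quote
    j, k, l = 1, 2, 3  # all inputs are constants known before the loop runs
    for _ in range(iterations):
        j = j * (k - j) * (l - k)
        k = l * k - (l - j) * k
        l = (l - k) * (k + j)
        e1[l - 2] = j + k + l
        e1[k - 2] = j * k * l
    return e1


def folded_section():
    """What the loop reduces to for any iteration count >= 1: plain stores."""
    return [6, 6, 0, 0]


print(integer_section(100_000) == folded_section())
```

    The second function does no integer maths and no index calculation at all, yet it produces the same array. A benchmark timed on that version says very little about how fast the processor really is.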

    Optimisations used for compiling the boinc client on Linux

    The final optimisations used in my guide for CFLAGS and CXXFLAGS were "-march=athlon-xp -O3 -fomit-frame-pointer -funroll-loops -fforce-addr -ffast-math -ftracer".
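
    A minimal sketch of how these flags might be handed to a build. It assumes the source tree uses a conventional ./configure and make build that reads CFLAGS and CXXFLAGS from the environment; those build steps are an assumption here and are left commented out. The helper name build_env is hypothetical.

```python
import os

# The flag set quoted above; -march=athlon-xp should be changed for other processors.
FLAGS = ("-march=athlon-xp -O3 -fomit-frame-pointer -funroll-loops "
         "-fforce-addr -ffast-math -ftracer")


def build_env(flags, base=None):
    """Return a copy of the environment with CFLAGS and CXXFLAGS both set to flags."""
    env = dict(os.environ if base is None else base)
    env["CFLAGS"] = flags
    env["CXXFLAGS"] = flags
    return env


env = build_env(FLAGS)
print(env["CFLAGS"])

# Assumed build steps (not taken from this post), run from the source directory:
# import subprocess
# subprocess.run(["./configure"], env=env, check=True)
# subprocess.run(["make"], env=env, check=True)
```

    Setting both variables to the same string matches the way the flags are described above, where one set is used for CFLAGS and CXXFLAGS alike.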

    Here I want to discuss how I arrived at these, what options I tried along the way, and what effect they had. The base system was an AthlonXP 2100+ at 133MHz/266DDR fsb with 266DDR memory, Fedora Core 2, kernel 2.6.8-1.521 and gcc-3.3.3-7. I used the v4.09 downloaded boinc client and the boinc_public-2004-9-30 nightly source code to build my clients.

    Here are the stock benchmark results:

    Downloaded BOINC client v4.09:
    Whetstone - 896
    Dhrystone - 2155

    First I just compiled the source code with no optimisations other than the standard optimisations defined by my Linux installation:

    Whetstone - 890
    Dhrystone - 2028

    As we can see, the benchmarks have actually got slightly worse. This is probably due to the fact that my default optimisations use -O2 whereas the downloaded client may have been compiled with -O3. So, just compiling the client yourself is not enough to get a performance increase - we are going to need to use some further optimisations.

    Next I tried a baseline set of optimisations that are generally considered safe (some optimisation options can break the code or cause errors during compilation). These made a large difference and were thus kept as the default set of optimisations:

    -march=athlon-xp -O3 -fomit-frame-pointer
    Whetstone - 1197
    Dhrystone - 2694

    Next I added -funroll-loops, which may or may not improve performance, to the default set of optimisations. This increased the Whetstone score but also slightly decreased the Dhrystone score. Given the relative increase to one versus decrease to the other I decided to keep this option.

    -march=athlon-xp -O3 -fomit-frame-pointer -funroll-loops
    Whetstone - 1301
    Dhrystone - 2625

    I then added -falign-functions=4 which is said to be the optimal setting for AthlonXP processors. This had no effect on the Whetstone score but decreased the Dhrystone score so was not used further.

    -march=athlon-xp -O3 -fomit-frame-pointer -funroll-loops -falign-functions=4
    Whetstone - 1301
    Dhrystone - 2401

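    I next tested the -s and -static options (which strip symbols from the binary and link it statically, so they are not optimisations in the strict sense). These caused decreases to both benchmark scores and were not kept: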

    -march=athlon-xp -O3 -fomit-frame-pointer -funroll-loops -s -static
    Whetstone - 1271
    Dhrystone - 2509

    I then added the -ffast-math optimisation. This option speeds up floating-point maths by relaxing strict conformance to the IEEE and ISO rules, so results may differ slightly from a standards-compliant build. As such it should be used with caution in programs performing mathematical calculations (such as SETI). I reasoned it safe to include here as the boinc client is not actually performing any maths for the chosen project (other than the benchmark), but simply calculating how much credit to request for each work unit processed. As expected, inclusion of -ffast-math had a positive effect and it was retained.

    -march=athlon-xp -O3 -fomit-frame-pointer -funroll-loops -ffast-math
    Whetstone - 1585
    Dhrystone - 2627

    Seeing the improvements -ffast-math made, I next looked at -ffinite-math-only and -funsafe-math-optimizations. The comments above cautioning against using in programs performing mathematical calculations apply to both these options. As it turns out, neither option had any effect so they were not used any further.

    Then I added the -fprefetch-loop-arrays option which had a slightly negative influence on the Dhrystone score and was dropped:

    -march=athlon-xp -O3 -fomit-frame-pointer -funroll-loops -fprefetch-loop-arrays -ffast-math
    Whetstone - 1585
    Dhrystone - 2618

    Next I looked at -fforce-addr which had a positive influence on the Whetstone result and only a slightly negative influence on Dhrystone so was included:

    -march=athlon-xp -O3 -fomit-frame-pointer -funroll-loops -fforce-addr -ffast-math
    Whetstone - 1656
    Dhrystone - 2614

    Next was -ftracer, which improved the Dhrystone score:

    -march=athlon-xp -O3 -fomit-frame-pointer -funroll-loops -fforce-addr -ffast-math -ftracer
    Whetstone - 1656
    Dhrystone - 2640

    I then went on to examine other optimisations that had been suggested on the SETI@Home forums. -mfpmath=sse and -pipe made no difference. -mcpu=athlon-xp is implied by -march=athlon-xp so does not need to be specifically included, and when specifically included made no difference as expected. Likewise, -m3dnow, -msse and -mmmx are also all implied by -march=athlon-xp and again had absolutely no effect as one would expect. Other optimisations suggested were also implied by either -O2 or -O3 so were not individually tested. They would be of no further benefit as they are already included as part of the general -O3 level optimisations.
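
    To put the kept steps side by side, here is a small sketch that expresses each result as a percentage change against the downloaded v4.09 client. The scores are the ones posted above; the helper names percent_change and summarise are only for illustration.

```python
STOCK = (896, 2155)  # downloaded v4.09 client: (Whetstone, Dhrystone)

# Only the steps that were kept, in the order they were added.
RESULTS = [
    ("default distro flags", 890, 2028),
    ("-march=athlon-xp -O3 -fomit-frame-pointer", 1197, 2694),
    ("+ -funroll-loops", 1301, 2625),
    ("+ -ffast-math", 1585, 2627),
    ("+ -fforce-addr", 1656, 2614),
    ("+ -ftracer", 1656, 2640),
]


def percent_change(new, old):
    """Relative change of new against old, in percent."""
    return 100.0 * (new - old) / old


def summarise(results, stock):
    """Return (label, Whetstone % change, Dhrystone % change) for each step."""
    return [
        (label, percent_change(whet, stock[0]), percent_change(dhry, stock[1]))
        for label, whet, dhry in results
    ]


for label, whet_pct, dhry_pct in summarise(RESULTS, STOCK):
    print(f"{label:45s} Whetstone {whet_pct:+6.1f}%  Dhrystone {dhry_pct:+6.1f}%")
```

    With the final flag set, that works out to roughly 85% more on Whetstone and roughly 23% more on Dhrystone than the downloaded client on this machine.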

    This thread is the place to discuss the Linux BOINC FAQ and the optimisations I outlined in this post.

    Ned
    Last edited by Ned Slider; 10-03-2004 at 12:22 PM.

  2. #2
    Joined
    Dec 2000
    Location
    myrtle beach,south carolina, U. S. of A.!
    Posts
    12,696

    Re: Compiling Boinc on Linux - Discussion thread

    hey ned, you got any windows optimizations?

    great job bro, looks like you put alot of time into figuring all that stuff out!

    someday i will get the penguin going, until then, i am stuck on the darkside!

  3. #3
    Joined
    Jul 2001
    Location
    UK
    Age
    51
    Posts
    20,229

    Re: Compiling Boinc on Linux - Discussion thread

    Quote Originally Posted by thelmores
    hey ned, you got any windows optimizations?
    Try this:

    Code:
    format C:\
    are you sure? [Y/N] Y
    Insert Linux CD



    Quote Originally Posted by thelmores
    great job bro, looks like you put alot of time into figuring all that stuff out!

    someday i will get the penguin going, until then, i am stuck on the darkside!
    Thanks thelmores

    Yes, spent best part of the last week figuring this stuff out, testing and writing it up for the FAQ. The hardest part was finding the info in the first place as there's no instructions included and only snippets of info posted elsewhere. Everyone says things like "I eventually managed to get it to compile" but doesn't bother to post HOW they managed it

    I figured I'd just try and make it a little easier for the next person

    Ned
    Last edited by Ned Slider; 10-03-2004 at 12:24 PM.

  4. #4
    Joined
    Dec 2000
    Location
    myrtle beach,south carolina, U. S. of A.!
    Posts
    12,696

    Re: Compiling Boinc on Linux - Discussion thread

    Quote Originally Posted by Ned Slider
    Try this:

    Code:
    format C:\
    are you sure? [Y/N] Y
    Insert Linux CD






    yea, but after that last step, i'd probably be lost!

    Quote Originally Posted by Ned Slider
    Everyone says things like "I eventually managed to get it to compile" but doesn't bother to post HOW they managed it
    see what i mean!

  5. #5
    Joined
    Nov 2000
    Location
    Toronto, Canada
    Age
    48
    Posts
    13,852

    Re: Compiling Boinc on Linux - Discussion thread

    Quote Originally Posted by Ned Slider
    Try this:

    Code:
    format C:\
    are you sure? [Y/N] Y
    Insert Linux CD
    Q6600 @ 3.0Ghz
    Asus P5K Deluxe
    4GB Corsair PC6400
    ATI X1650 Pro

  6. #6
    Joined
    Mar 2002
    Location
    bergen county, new jersey
    Posts
    784

    Re: Compiling Boinc on Linux - Discussion thread

    Quote Originally Posted by Ned Slider
    Try this:

    Code:
    format C:\
    are you sure? [Y/N] Y
    Insert Linux CD
    Ned

    LOL I did exactly that!! See sig...Linux newb here. Oh btw, you wouldnt have a compiled code of your optimized Bionic client would you? I know nothing about compiling yet...

  7. #7
    Joined
    Jul 2001
    Location
    UK
    Age
    51
    Posts
    20,229

    Re: Compiling Boinc on Linux - Discussion thread

    Quote Originally Posted by centered effect
    LOL I did exactly that!! See sig...Linux newb here. Oh btw, you wouldnt have a compiled code of your optimized Bionic client would you? I know nothing about compiling yet...
    Yep, sure thing. Can be downloaded here

    Right click on link and choose Save Link As.

    Could you please let me know if it runs OK or not, and what (linux) system you tried it on. It was compiled on Fedora Core 2 (FC2) and I've also tested it on FC1 and it worked fine, but I haven't tested it on any other systems yet. I'm unsure how portable it will be to other systems.

    Ned
    Last edited by Ned Slider; 10-24-2004 at 02:52 AM.

  8. #8
    Joined
    Oct 2004
    Posts
    3

    Re: Compiling Boinc on Linux - Discussion thread

    Hi Ned,

    Thanks for the HOW-TO/FAQ. I followed your directions and my benchmarks went from

    511 double precision MIPS (Whetstone) per CPU
    1135 integer MIPS (Dhrystone) per CPU

    to

    1339 double precision MIPS (Whetstone) per CPU
    1336 integer MIPS (Dhrystone) per CPU

    Then I had a "D'oh!" moment when I realised that by using cut and paste I'd included the athlon optimisation instead of for i686. But here's the thing, when I recompiled the benchmarks weren't as good for the i686 optimisation as for the athlon.

    1316 double precision MIPS (Whetstone) per CPU
    1167 integer MIPS (Dhrystone) per CPU

    How come? Would you expect the actual workunit processing to take longer with the athlon optimisation? Or maybe to even crash at some point?

    Thanks,

    John

  9. #9
    Joined
    Jul 2001
    Location
    UK
    Age
    51
    Posts
    20,229

    Re: Compiling Boinc on Linux - Discussion thread

    Quote Originally Posted by jkobrien
    Then I had a "D'oh!" moment when I realised that by using cut and paste I'd included the athlon optimisation instead of for i686. But here's the thing, when I recompiled the benchmarks weren't as good for the i686 optimisation as for the athlon.

    1316 double precision MIPS (Whetstone) per CPU
    1167 integer MIPS (Dhrystone) per CPU

    How come? Would you expect the actual workunit processing to take longer with the athlon optimisation? Or maybe to even crash at some point?
    Hi John,

    First off, welcome to our forums

    I assume you're using an Intel Pentium of some description?

    I played around with different -march settings and was somewhat surprised at the results I got. On an athlonXP, using athlon and athlon-xp gave identical results. Using -march=i686 actually improved the Whetstone benchmark very slightly on my AthlonXP but Dhrystone was down significantly. I guess the best thing to do is use my optimisations as a starting point for further experimentation if you're so inclined

    These optimisations are for the boinc client only and as such they only affect the benchmark score. This in turn will affect how much credit your computer will request and will also more accurately predict how much work to download to your machine. It has absolutely no effect on the processing of work. That is done by a second client, the seti_boinc client. We are currently working on recompiling the seti_boinc client and at the moment are trying to validate results from the compiled client against results from the downloaded client. Once this testing procedure is complete, we will post details of how to compile it yourself and provide download links to the precompiled client.

    Regards,

    Ned

  10. #10
    Joined
    Oct 2004
    Posts
    3

    Re: Compiling Boinc on Linux - Discussion thread

    Thanks Ned,

    Yeah, I'm running a dual Xeon system. Interesting to see you had similar findings. I forgot about the boinc/seti_boinc separation but wasn't so much interested in how my processing would be affected as in what was going on with the compiler (I'm teaching myself C at the moment). I guess it's just one of those things.

    I'll look forward to your post on compiling the seti_boinc client!

    Thanks again,

    John

  11. #11
    Joined
    Jul 2001
    Location
    UK
    Age
    51
    Posts
    20,229

    Re: Compiling Boinc on Linux - Discussion thread

    So you got better results using -march=athlon-xp on a Xeon than using -march=i686?

    Wow - that's interesting!

    Could you post your benchmark scores for both please if you have them available.

    Ned

  12. #12
    Joined
    Mar 2002
    Location
    bergen county, new jersey
    Posts
    784

    Re: Compiling Boinc on Linux - Discussion thread

    Ned, my results:

    Duron 1.1
    512 PC2100
    Abit Kg7
    Mandrake 10.0

    using your code:
    CPU benchmarks
    2004-10-07 02:10:09 [---] Benchmark results:
    2004-10-07 02:10:09 [---] Number of CPUs: 1
    2004-10-07 02:10:09 [---] 1040 double precision MIPS (Whetstone) per CPU
    2004-10-07 02:10:09 [---] 1664 integer MIPS (Dhrystone) per CPU
    2004-10-07 02:10:09 [---] Finished CPU benchmarks
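
    For anyone collecting results like these from several machines, a minimal sketch of pulling the two scores out of the client output. It assumes the message lines look exactly like the ones pasted above; the pattern and the function name parse_benchmarks are only for illustration.

```python
import re

LOG = """\
2004-10-07 02:10:09 [---] Benchmark results:
2004-10-07 02:10:09 [---] Number of CPUs: 1
2004-10-07 02:10:09 [---] 1040 double precision MIPS (Whetstone) per CPU
2004-10-07 02:10:09 [---] 1664 integer MIPS (Dhrystone) per CPU
2004-10-07 02:10:09 [---] Finished CPU benchmarks
"""

# Matches e.g. "1040 double precision MIPS (Whetstone) per CPU".
PATTERN = re.compile(
    r"(\d+) (?:double precision|integer) MIPS \((Whetstone|Dhrystone)\) per CPU"
)


def parse_benchmarks(text):
    """Return a dict mapping benchmark name to its integer score."""
    return {name: int(score) for score, name in PATTERN.findall(text)}


scores = parse_benchmarks(LOG)
print(scores)
```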

  13. #13
    Joined
    Jul 2001
    Location
    UK
    Age
    51
    Posts
    20,229

    Re: Compiling Boinc on Linux - Discussion thread

    Quote Originally Posted by centered effect
    Ned, my results:

    Duron 1.1
    512 PC2100
    Abit Kg7
    Mandrake 10.0

    using your code:
    CPU benchmarks
    2004-10-07 02:10:09 [---] Benchmark results:
    2004-10-07 02:10:09 [---] Number of CPUs: 1
    2004-10-07 02:10:09 [---] 1040 double precision MIPS (Whetstone) per CPU
    2004-10-07 02:10:09 [---] 1664 integer MIPS (Dhrystone) per CPU
    2004-10-07 02:10:09 [---] Finished CPU benchmarks

    Great, thanks for posting your results. Glad my client worked for you. I didn't know how well it would port to other linux systems. It was compiled on a Fedora Core 2 system, so we'll chalk up a success on Mandrake 10.0

    Ned

  14. #14
    Joined
    Mar 2002
    Location
    bergen county, new jersey
    Posts
    784

    Re: Compiling Boinc on Linux - Discussion thread

    Well... one question.. and this may be a Boinc newb question but....

    I get message when running the Linux Boinc:

    Resuming computation for result 27ap04aa.24346.20482.717330.74_3 using setiathome version 4.02

    But I am running your version 4.11...or should I not worry about it?

    The whole console looks like this so as I am crunching:

    [jason@localhost jason]$ cd /home/jason/boinc
    [jason@localhost boinc]$ chmod 774 boinc_4.11_i686-pc-linux-gnu
    [jason@localhost boinc]$ ./boinc_4.11_i686-pc-linux-gnu
    2004-10-08 00:36:35 [---] Starting BOINC client version 4.11 for i686-pc-linux-gnu
    2004-10-08 00:36:35 [SETI@home] Project prefs: using your defaults
    2004-10-08 00:36:35 [SETI@home] Host ID is ******* (masked id by me)
    2004-10-08 00:36:35 [---] General prefs: from SETI@home (last modified 2004-06-09 14:56:39)
    2004-10-08 00:36:35 [---] General prefs: using your defaults
    2004-10-08 00:36:35 [SETI@home] Resuming computation for result 27ap04aa.24346.20482.717330.74_3 using setiathome version 4.02
    2004-10-08 02:27:11 [---] May run out of work in 0.10 days; requesting more
    2004-10-08 02:27:11 [SETI@home] Requesting 8378 seconds of work
    2004-10-08 02:27:11 [SETI@home] Sending request to scheduler: http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
    2004-10-08 02:27:33 [SETI@home] Scheduler RPC to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi succeeded
    2004-10-08 02:27:33 [SETI@home] General preferences have been updated
    2004-10-08 02:27:33 [---] General prefs: from SETI@home (last modified 2004-10-07 02:24:15)
    2004-10-08 02:27:33 [---] General prefs: using your defaults
    2004-10-08 02:27:33 [SETI@home] Project prefs: using your defaults
    2004-10-08 02:27:33 [SETI@home] Started download of 14mr04aa.17223.11633.373564.13
    2004-10-08 02:27:34 [SETI@home] Finished download of 14mr04aa.17223.11633.373564.13
    2004-10-08 02:27:34 [SETI@home] Throughput 225317 bytes/sec
    2004-10-08 05:39:11 [---] Received signal 2
    2004-10-08 05:39:11 [---] Exit requested by user

    Not sure what I hit on the last part to cancel it...

  15. #15
    Joined
    Jul 2001
    Location
    UK
    Age
    51
    Posts
    20,229

    Re: Compiling Boinc on Linux - Discussion thread

    Quote Originally Posted by centered effect
    Well... one question.. and this may be a Boinc newb question but....

    I get message when running the Linux Boinc:

    Resuming computation for result 27ap04aa.24346.20482.717330.74_3 using setiathome version 4.02

    But I am running your version 4.11...or should I not worry about it?

    The whole console looks like this so as I am crunching:

    [jason@localhost jason]$ cd /home/jason/boinc
    [jason@localhost boinc]$ chmod 774 boinc_4.11_i686-pc-linux-gnu
    [jason@localhost boinc]$ ./boinc_4.11_i686-pc-linux-gnu
    ---Edited for brevity----
    2004-10-08 00:36:35 [---] Starting BOINC client version 4.11 for i686-pc-linux-gnu
    2004-10-08 00:36:35 [SETI@home] Project prefs: using your defaults

    2004-10-08 05:39:11 [---] Received signal 2
    2004-10-08 05:39:11 [---] Exit requested by user

    Not sure what I hit on the last part to cancel it...

    OK, there are TWO clients. One is the boinc client, which handles the projects, downloads work units, returns them and requests credit. The latest downloadable version is 4.09 and your compiled version is called 4.11. Then there is the science client, which in our case is the setiathome client. This is automatically downloaded by the boinc client and the latest version on Linux is 4.02. So what you're seeing is perfectly correct.

    The last part means the client has exited. You get this when you hit Ctrl-C.

    Ned
