
Don't upgrade that NVidia card just yet...


49 replies to this topic

#41 bisenberger

bisenberger

    Novice

  • Member
  • 207 posts
  • Gender:Male
  • Location:Springfield, MO

Posted 17 May 2013 - 09:32 PM

 

That happens because you cannot use the same card for viewport and CUDA. If the CUDA kernel runs too long, the driver will time out. In fact, the reason the driver times out is that the kernel runs too long and should be split into multiple parts.

With two cards you wouldn't get this problem. Alternatively, you can edit the registry settings, but that can have consequences.

 

That shows you how unoptimized the 3DCoat CUDA kernel is.
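
For anyone wondering what "split into multiple parts" means in practice, here is a rough sketch of the idea in plain Python rather than CUDA. Everything in it is an assumption made for illustration: the names `chunk_ranges`, `run_in_chunks` and `process_chunk` are made up, and it says nothing about how 3D-Coat's kernels are actually written. The point is only that one long job becomes several short launches, so control goes back to the caller (on a GPU, to the display driver and its watchdog) between pieces.

```python
def chunk_ranges(total_items, max_items_per_launch):
    """Yield (start, count) pairs that cover total_items in bounded pieces."""
    if max_items_per_launch <= 0:
        raise ValueError("max_items_per_launch must be positive")
    start = 0
    while start < total_items:
        count = min(max_items_per_launch, total_items - start)
        yield start, count
        start += count


def run_in_chunks(data, max_items_per_launch, process_chunk):
    """Process data in several short launches instead of one long one.

    process_chunk is a hypothetical stand-in for a single kernel launch:
    it takes a slice of the data and returns a list of results.
    """
    results = []
    for start, count in chunk_ranges(len(data), max_items_per_launch):
        # Control returns here after every piece, so no single call
        # has to run for the whole duration of the job.
        results.extend(process_chunk(data[start:start + count]))
    return results


if __name__ == "__main__":
    voxels = list(range(10))
    print(list(chunk_ranges(len(voxels), 4)))
    print(run_in_chunks(voxels, 4, lambda part: [v * 2 for v in part]))
```

The chunk size is the tuning knob: small enough that each piece finishes well inside the driver's timeout, large enough that launch overhead doesn't eat the gain.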

 

Just for the record: I wasn't using 3D-Coat (or any other CUDA-enabled app) when this happened.


Edited by bisenberger, 17 May 2013 - 09:33 PM.

GeForce GTX 460, Intel Core 2 Quad Q6600, Windows 7 Professional 64-bit.

#42 ravenzep

ravenzep

    Newbie

  • New Member
  • 5 posts

Posted 17 May 2013 - 09:48 PM

Well, that's what usually happens :) ^_^

 

http://stackoverflow...m-cuda-run-time



#43 ravenzep

ravenzep

    Newbie

  • New Member
  • 5 posts

Posted 18 May 2013 - 02:52 PM

Also, while we're on this whole CUDA discussion: 3DCoat supports the version below.

 

CUDA 3.0 toolkit released
March 20th, 2010

 

 

 

 



#44 L'Ancien Regime

L'Ancien Regime

    Novice

  • Member
  • 492 posts
  • Gender:Male

Posted 18 May 2013 - 03:09 PM

I'm glad this thread happened. I was looking at Nvidia and listening to their hype a lot, and rejecting AMD Radeon out of hand solely because I like 3D-Coat the best of all the sculpting programs (though I haven't tried Blender yet). This thread basically laid it all out on the table for me: 3D-Coat's use of CUDA isn't really that important to me now. A new CPU (say dual Xeons, or maybe a 5 GHz i7, or perhaps the AMD 8350 or its successor, though that will only handle 2 GPUs due to bandwidth issues) with 64 GB of RAM (or moar) will be all I need to sculpt all the voxels I'll ever need, regardless of whether CUDA or OpenCL offloads parallel tasks to the GPU. 100 million polys in Live Clay, no problem.

 

If this is the case, then the choice of GPU comes down solely to real-time rendering, and there AMD becomes far more competitive: the AMD 7970 has 6 GB of GDDR5 RAM and about 2500 cores, and the Nvidia Titan is about the same. The difference is that Nvidia runs both CUDA and OpenCL, while AMD runs only OpenCL, but runs it a lot faster.

 

AMD 7970 = $500.00

NVidia Titan = over $1000.00

 

Now I agree AMD has had problems in the past with drivers. I had an ATI Radeon 4870 X2, and not only did it screw up the interface in Vue, but it was loud as hell, literally a screamer on hot days.

 

But 3 Titans or 3 AMD 7970s is RTR (real-time rendering).

 

So that's $3000+ Nvidia or $1500 for AMD.

 

I think for me the best bet is to try one AMD 7970 at $500 and see how the drivers are. If it's crap, no big deal: sell it at a loss on eBay.

 

Because CUDA for 3d Coat is no longer a consideration.


Edited by L'Ancien Regime, 18 May 2013 - 03:17 PM.


#45 AbnRanger

AbnRanger

    Expert

  • Member
  • 3,537 posts
  • Gender:Male

Posted 18 May 2013 - 10:02 PM

I'm still not ready to go with AMD just yet. Too many GPU renderers only use CUDA for me to limit myself in this regard. Holding out hope that more applications will get on board with OpenCL instead is a fruitless endeavor. AMD was trumpeting its Stream capability back when I had a 4850, and that card just would not work with some of my CG apps. If Andrew just solved the large-brush bottleneck with CUDA, or optimized the code in OpenGL or DX, that would solve my issues on the performance side.

 

I just recently discovered that some of the newer RAM modules have an XMP profile embedded, and most newer (aftermarket) motherboards can detect it. So if you buy RAM listed at 1866-2100 MHz, for example, you can select "Auto" (each motherboard uses its own terminology, but you can probably find it easily enough) and the motherboard will run on the timing settings in that embedded profile. Previously, adjusting the settings manually, I could never seem to get anywhere near the max speed listed by the RAM or motherboard manufacturers. There are 4 timing settings listed on each RAM module, but the BIOS has a bunch of other timing settings too, so setting the timings manually is not going to get the average techy/geek very far.

 

You set this in your BIOS, and if you've got 1866-2100 memory (and a pretty good multi-core CPU running around 4 GHz+), you'll be surprised how much livelier the performance is. So fast RAM and a beefy CPU will benefit a 3D-Coat user more than anything else. Word of warning, though: if you have an off-the-shelf system from Dell, HP or the like, forget all of this. You are pretty much stuck at the rate you have, because those vendors lock the motherboard against any overclocking of the CPU or RAM. Your board might support a better CPU model from the same line, but faster RAM won't help when the BIOS won't give you access to adjust its timings (auto or otherwise).



#46 L'Ancien Regime

L'Ancien Regime

    Novice

  • Member
  • 492 posts
  • Gender:Male

Posted 22 May 2013 - 11:13 AM

http://www.extremete...-bitcoin-mining



#47 Zeddicus

Zeddicus

    Novice

  • Member
  • 192 posts
  • Gender:Male
  • Location:Midlands

Posted Today, 07:52 AM

I followed your instructions, AbnRanger, and the experience was exactly as you predicted. It's a very jarring, sudden change in performance completely dependent on brush size. It's speedy and just fine, then totally drops off with just a tiny change to the brush size. Shrink it back down just a tiny fraction and it goes right back to being speedy.

 

Knowing Nvidia (and how corporate types think in general), their gaming cards are probably designed to work great only in games, while their pro cards are designed to work well only in CG apps. That way you're forced to buy both, or so they hope. Greed makes people do strangely illogical things. It would be interesting to see how AMD's 7970 would perform with 3DC if Andrew were to add OpenCL support.

 

The article posted by L'Ancien Regime is an interesting read. Thanks for sharing it with us! I'll probably replace my GTX 670 with whatever blows away the AMD 7970. I try not to upgrade too often because even though it can be fun, it's often also time consuming and I do so hate the inevitable troubleshooting that tends to go with it lol. :)

 

About memory with XMP, I had to turn it off because the timings it set would prevent my PC from getting past the BIOS screen, and sometimes not even that far. What I did was write down the settings it wanted to use, then entered identical settings into the BIOS myself using its manual override mode. Then it would boot perfectly fine and even ended up being super stable that way. Don't know why one way would work and the other wouldn't when the settings were identical, but there you have it. Fwiw they were Mushkin Enhanced Blackline Frostbyte DDR3-1600 rated for 9-9-9-24 timings at 1.5v. They easily ran at higher clock speeds so long as the timings were loosened, but after a lot of benchmarking I found that a slower frequency with tighter timings was actually a fair bit faster than a higher frequency with loose timings. Naturally YMMV.
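
One way to sanity-check the "slower frequency with tighter timings" result is to convert CAS latency from clock cycles into nanoseconds. The sketch below is only that arithmetic: the function name `first_word_latency_ns` is made up, the DDR3-1866 CL13 kit is a hypothetical example of a loose overclock (not a setting from this thread), and CAS latency is just one of several timings, so real benchmarks remain the final word.

```python
def first_word_latency_ns(data_rate_mts, cas_latency):
    """Return CAS latency in nanoseconds for DDR memory.

    DDR transfers data twice per clock, so the I/O clock in MHz is half
    the data rate in MT/s, and one clock cycle lasts 2000 / data_rate ns.
    """
    if data_rate_mts <= 0:
        raise ValueError("data_rate_mts must be positive")
    return cas_latency * 2000.0 / data_rate_mts


if __name__ == "__main__":
    kits = [
        ("DDR3-1600 CL9 (rated timings from the post)", 1600, 9),
        ("DDR3-1866 CL13 (hypothetical loose overclock)", 1866, 13),
    ]
    for name, rate, cl in kits:
        print(f"{name}: {first_word_latency_ns(rate, cl):.2f} ns")
```

With these numbers the higher-clocked kit waits longer for the first word of data, which is at least consistent with tight timings winning in latency-sensitive work; bandwidth-bound tasks can still favor the higher frequency.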


4.0 BETA 16 x64 DX CUDA || Windows 7 SP1 x64 (WEI 7.9) || Nvidia GTX670 v314.22 || Intel 3770K @ 4.5 GHz || 16 GB DDR3-1600 || SpaceMouse Pro v10 Beta 15 || Intuos4 v6.3.5-3


#48 AbnRanger

AbnRanger

    Expert

  • Member
  • 3,537 posts
  • Gender:Male

Posted Today, 08:15 AM


XMP is running fine here, but what I had to do was adjust the main CPU speed plus the memory ratio to get the CPU speed I wanted and keep the memory speed close to the max listed on the RAM. The motherboard needs to be able to handle the extra speed. For example, most of the LGA 1366 (Intel X58) boards are rated only for a max of 1600. I put some Patriot Viper Extreme memory (rated 2000 MHz) in an Intel board for a render box and got no XMP profile option. All I could do was run the setup at optimized defaults, and it ran like crap.

 

As soon as I put in some sticks rated at 1600, I got an XMP profile, set it to that, and it ran fine. Maybe a BIOS update would help your situation, but oftentimes it's just the motherboard, period. The upper-tier boards (high-end ASUS, MSI, Gigabyte, EVGA) generally work the best.
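
The balancing act between CPU speed and memory ratio can be sketched as simple arithmetic, under the simplifying assumption that both clocks are multiples of one base clock (BCLK): CPU MHz = BCLK x CPU multiplier, and RAM data rate = BCLK x memory multiplier. The function name `clock_combos` and every number in the example are made up for illustration; the real options depend on the CPU and what the BIOS exposes.

```python
def clock_combos(target_cpu_mhz, ram_max_mts, bclk_options, cpu_mults, mem_mults):
    """List (bclk, cpu_mult, mem_mult, cpu_mhz, ram_mts) combinations.

    Assumes both clocks derive from the same base clock:
        cpu_mhz = bclk * cpu_mult
        ram_mts = bclk * mem_mult
    Only combinations that keep the RAM at or below its rated speed are
    kept. They are sorted by closeness to the target CPU speed first,
    then by highest RAM speed.
    """
    combos = []
    for bclk in bclk_options:
        for cpu_mult in cpu_mults:
            for mem_mult in mem_mults:
                ram_mts = bclk * mem_mult
                if ram_mts <= ram_max_mts:
                    combos.append((bclk, cpu_mult, mem_mult, bclk * cpu_mult, ram_mts))
    combos.sort(key=lambda c: (abs(c[3] - target_cpu_mhz), -c[4]))
    return combos


if __name__ == "__main__":
    # Hypothetical BIOS options, not taken from any specific board.
    ranked = clock_combos(
        target_cpu_mhz=4000,
        ram_max_mts=1600,
        bclk_options=[133, 160, 200],
        cpu_mults=[20, 21, 25, 30],
        mem_mults=[6, 8, 10, 12],
    )
    print(ranked[0])
```

Raising BCLK to reach the CPU target drags the RAM speed up with it, which is why the memory ratio usually has to come down at the same time.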


Edited by AbnRanger, Today, 08:16 AM.


#49 L'Ancien Regime

L'Ancien Regime

    Novice

  • Member
  • 492 posts
  • Gender:Male

Posted Today, 08:43 AM


Actually after all this discussion, I'm thinking that the whole GPU card parallel programming business may not be the right way to go. Andrew thinks it's a bitch to program and so do the guys over at Vray.

 

Vray has a much more interesting take on it; they're going with Intel Xeons and the Xeon Phi coprocessor, which is much easier to program for multithreading and parallel computing. For not much more than an Nvidia Titan you get a lot more cores: 240 threads, with 8 GB of GDDR5 RAM and 320 GB/s of max memory bandwidth. One Xeon Phi will thus be = 4 x 8-core Xeons, for under $2000. And it will have MKL, the Math Kernel Library, built in.

 

I'm still not sure how many of these you could plug into your PCIe slots for each Xeon CPU... I would think at least two.

 

 

This is the route SGI is going with its SGI UV chassis.

 

http://www.sgi.com/p...cts/servers/uv/

 

 

INTC_XeonPhiKnightsCorner.jpg

 

So forget CUDA, and pass on OpenCL and go for the Xeon Phi, and just get an AMD 7970 or a Titan for viewport..or if you've got money to burn a Quadro Pro or FireGL..

 

If Andrew can make 3D-Coat scale to all those threads (and the new Xeon Phi coprocessor coming out in July will have 480 threads), that will be the real deal, not trying to transform your GPUs into CPU functionality.

 

And the Nvidia 780 will be crippled just like the 680...what a joke..

 

 

Somebody send Andrew this for his birthday;

 

xeonphib.png

 

:D :D :D :D :D

http://www.amazon.ca...g/dp/0124104142

 

 

"Reinders and Jeffers have written an outstanding book about much more than the Intel® Xeon Phi™. This is a comprehensive overview of the challenges in realizing the performance potential of advanced architectures, including modern multi-core processors and many-core coprocessors. The authors provide a cogent explanation of the reasons why applications often fall short of theoretical performance, and include steps that application developers can take to bridge the gap. This will be recommended reading for all of my staff." -James A. Ang, Ph.D. Senior Manager, Extreme-scale Computing, Sandia National Laboratories


Edited by L'Ancien Regime, Today, 09:07 AM.


#50 AbnRanger

AbnRanger

    Expert

  • Member
  • 3,537 posts
  • Gender:Male

Posted Today, 12:12 PM

I think this is probably one reason why Cebas (a 3rd-party software vendor for 3ds Max, Maya and C4D) has dragged its feet for the past 3-4+ years with the upgrade to R4 GPU. It looked like they were going to be all CUDA based (hybrid CPU + GPU), but at one point this Intel card came into the picture, and I think that made them take another look. Who knows? GPU computing IS indeed the future, because CPU tech has been stuck in the mud, with no significant advances for the past 3-5 years. We've had 4-8 core CPUs since 2009...and we're still here.





