Roughly six months ago, we covered the debut of Ashes of the Singularity, the first DirectX 12 title to launch in any form. With just a month to go before the game launches, the developer, Oxide, has released a major new build with a heavily updated benchmark that's designed to mimic final gameplay, with updated assets, new sequences, and all of the enhancements to the Nitrous Engine Oxide has baked in since last summer.

Ashes of the Singularity is a spiritual successor to games like Total Annihilation, and the first DirectX 12 title to showcase AMD and Nvidia GPUs working side-by-side in a multi-GPU configuration.

The new build of the game released to press now allows for multi-GPU configuration testing, but time constraints limited us to evaluating general performance on single-GPU configurations. With Ashes launching in just under a month, the data we see today should be fairly representative of final gameplay.

AMD, Nvidia, and asynchronous compute

Ashes of the Singularity isn't just the first DirectX 12 game; it's also the first PC title to make extensive use of asynchronous computing. Support for this capability is a major difference between AMD and Nvidia hardware, and it has a significant impact on game performance.

A GPU that supports asynchronous compute can use multiple command queues and execute these queues simultaneously, rather than switching between graphics and compute workloads. AMD supports this functionality via its Asynchronous Compute Engines (ACE) and HWS blocks on Fiji.

AMD-HWS

Fiji's architecture. The HWS blocks are visible at top.

Asynchronous computing is, in a very real sense, GCN's secret weapon. While every GCN-class GPU since the original HD 7970 can use it, AMD quadrupled the number of ACEs per GPU when it built Hawaii, then modified the design once more with Fiji. Where the R9 290 and 290X use eight ACEs, Fiji has four ACEs and two HWS units. Each HWS can perform the work of two ACEs, and they appear to be capable of additional (but as-yet unknown) work as well.
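As a quick check on the ACE counts above, the arithmetic can be sketched in Python. The "ACE-equivalent" tally and the function name are illustrative assumptions rather than an AMD metric; the only inputs are the counts and the two-ACEs-per-HWS figure quoted in the text.

```python
# Tally of front-end scheduling capacity using the counts quoted in the article.
# "ACE-equivalents" is a hypothetical unit invented for this sketch.
ACE_WORK_PER_HWS = 2  # per the article: each HWS can do the work of two ACEs


def ace_equivalents(aces: int, hws_units: int = 0) -> int:
    """Return ACEs plus the ACE-equivalent capacity of any HWS units."""
    return aces + hws_units * ACE_WORK_PER_HWS


hawaii = ace_equivalents(aces=8)             # R9 290 / 290X
fiji = ace_equivalents(aces=4, hws_units=2)  # Fiji (Fury X)
pre_hawaii = hawaii // 4                     # implied by "quadrupled" in the text

print(f"Pre-Hawaii GCN: {pre_hawaii}, Hawaii: {hawaii}, Fiji: {fiji}")
```

By this count Fiji matches Hawaii at eight ACE-equivalents, before considering whatever additional work the HWS units may take on.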

The exact state and nature of Nvidia's asynchronous compute capabilities is still unclear. We know that Nvidia's Maxwell can't perform anything like the concurrent execution that AMD GPUs can manage. Maxwell can benefit from some light asynchronous compute workloads, as it does in Fable, but the benefits on Team Green hardware are modest.

NV-Preemption

Nvidia's async compute support is limited compared to AMD's.

The Nitrous Engine that powers Ashes of the Singularity makes extensive use of asynchronous compute and uses it for up to 30% of a given frame's workload. Oxide has stated that they believe this will be a common approach in future games and game engines, since DirectX 12 encourages the use of multiple engines to execute commands from separate queues in parallel.
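To give a feel for why that 30% figure matters, here is a deliberately simplified Python sketch. It assumes a frame's work splits cleanly into a graphics part and a compute part, and that some fraction of the compute part can be hidden behind graphics work. The function name, the `overlap` parameter, and the 10 ms frame are all hypothetical; this is a toy model, not a description of how the Nitrous Engine or any GPU actually schedules work.

```python
# Toy model only: real hardware rarely overlaps graphics and compute perfectly,
# and this ignores contention for shared resources such as memory bandwidth.
def frame_time_ms(total_ms: float, compute_fraction: float, overlap: float) -> float:
    """Estimate frame time when part of the compute work runs concurrently.

    total_ms         -- frame time if graphics and compute run back to back
    compute_fraction -- share of the frame's work that is compute (0.0 to 1.0)
    overlap          -- share of compute work hidden behind graphics (0.0 to 1.0)
    """
    if not 0.0 <= compute_fraction <= 1.0:
        raise ValueError("compute_fraction must be between 0 and 1")
    if not 0.0 <= overlap <= 1.0:
        raise ValueError("overlap must be between 0 and 1")
    graphics_ms = total_ms * (1.0 - compute_fraction)
    compute_ms = total_ms * compute_fraction
    # Compute work cannot hide behind more graphics work than actually exists.
    hidden_ms = min(compute_ms * overlap, graphics_ms)
    return graphics_ms + compute_ms - hidden_ms


# Hypothetical 10 ms frame with 30% of its work done in compute.
serial = frame_time_ms(10.0, 0.30, overlap=0.0)
ideal = frame_time_ms(10.0, 0.30, overlap=1.0)
print(f"serial: {serial:.1f} ms, fully overlapped: {ideal:.1f} ms")
```

In this idealized case, a GPU that executes the two workloads concurrently finishes the frame sooner than one that has to switch between them, which is the behavior the benchmark is probing.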

Test setup and performance:

We tested both the AMD Fury X and the Nvidia GeForce GTX 980 Ti in a Haswell-E system with 16GB of DDR4-2667 and Windows 10 with all updates installed. AMD distributed a new driver for this review; Nvidia did not, so we used the WHQL 361.91 driver, released on 2/16/2016, for our performance testing.

We confined ourselves to DirectX 12 testing this time out, but Anandtech did cover DX11. The performance data there suggests that both AMD and Nvidia improved in all modes. Nvidia continues to outperform AMD in DX11 compared to DX12, but the gap is much smaller than it was previously. As before, however, DirectX 12 gives Nvidia no performance improvement over and above DX11.

We tested Ashes of the Singularity in three detail modes (High, Extreme, and Crazy) and with asynchronous computing enabled and disabled to measure the impact on AMD versus Nvidia cards. The feature is enabled by default.

We're going to show you results first with asynchronous compute enabled versus disabled, then by resolution.

Ashes-1080p-Async Ashes-1080p-NoAsync

With asynchronous compute disabled, AMD's R9 Fury X leads the GTX 980 Ti by 7-8% across all three detail levels. Enable asynchronous compute, however, and AMD roars ahead, beating its Nvidia counterpart by 24-28%. The GeForce GTX 980 Ti's performance, in contrast, drops by 5-8% if asynchronous compute is enabled. This accounts for some of the gap between the two manufacturers, but by no means all of it.
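The 1080p percentages above can be combined to estimate how much of the widened gap comes from the Fury X speeding up rather than the GTX 980 Ti slowing down. The Python sketch below normalizes the GTX 980 Ti with async disabled to 1.0 and pairs the low end of each quoted range with the low end of the others (and high with high). That pairing is an assumption, since the per-preset frame rates live in the charts rather than the text, and the function name is hypothetical.

```python
def implied_amd_async_gain(lead_off: float, lead_on: float, nv_change_on: float) -> float:
    """Return the Fury X's own fractional gain from enabling async compute.

    lead_off     -- Fury X lead over the GTX 980 Ti with async disabled (0.07 = 7%)
    lead_on      -- Fury X lead over the GTX 980 Ti with async enabled
    nv_change_on -- GTX 980 Ti change when async is enabled (negative = slower)
    """
    nv_off = 1.0                            # normalization baseline (assumption)
    amd_off = nv_off * (1.0 + lead_off)
    nv_on = nv_off * (1.0 + nv_change_on)
    amd_on = nv_on * (1.0 + lead_on)
    return amd_on / amd_off - 1.0


# Low ends of the quoted ranges paired together, then high ends (assumed pairing).
low = implied_amd_async_gain(0.07, 0.24, -0.05)
high = implied_amd_async_gain(0.08, 0.28, -0.08)
print(f"Implied Fury X gain from async compute: {low:.1%} and {high:.1%}")
```

Under these pairings the Fury X itself gains on the order of 9-10% from async compute, which is consistent with Nvidia's slowdown explaining only part of the gap. Pairing the range endpoints differently shifts the estimate, so treat it as a rough illustration rather than a measurement.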

Let's shift to 4K and check performance there:

Ashes-4K-Async

Ashes-4K-NoAsync

Higher resolutions have often favored AMD cards, and this is no exception. With asynchronous compute disabled, AMD GPUs are still running 11-15% faster than their Nvidia counterparts. Enable async compute, and that gap doubles: the Radeon R9 Fury X is no less than 31-33% faster than the Nvidia GTX 980 Ti. Given how the Fury X struggled out of the gate, that's got to be a welcome sight for Team Red.

Is Ashes of the Singularity biased?

Ashes of the Singularity is the first DX12 game on the market, and the performance delta between AMD and Nvidia is going to court controversy from fans of both companies. We won't know if its performance results are typical until we see more games in market. But is the game intrinsically biased to favor AMD? I think not, for multiple interlocking reasons.

First, there's the fact that Oxide shares its engine source code with both AMD and Nvidia and has invited both companies to see and suggest changes for most of the time Ashes has been in development. The company's Reviewer's Guide includes the following:

[W]e have created a special branch where not only can vendors see our source code, but they can even submit proposed changes. That is, if they want to suggest a change our branch gives them permission to do so…

This branch is synchronized directly from our main branch so it's usually less than a week from our very latest internal main software development branch. IHVs are free to make their own builds, or test the intermediate drops that we give our QA.

Oxide also addresses the question of whether or not it optimizes for specific engines or graphics architectures directly.

Oxide primarily optimizes at an algorithmic level, not for any specific hardware. We also take care to avoid the proverbial known "glass jaws" which every hardware has. However, we do not write our code or tune for any specific GPU in mind. We find this is simply too time consuming, and we must run on a wide variety of GPUs. We believe our code is very typical of a reasonably optimized PC game.

We reached out to Dan Baker of Oxide regarding the decision to turn asynchronous compute on by default for both companies and were told the following:

"Async compute is enabled by default for all GPUs. We do not want to influence testing results by having different default setting by IHV, we recommend testing both ways, with and without async compute enabled. Oxide will choose the fastest method to default based on what is available to the public at ship time."

Second, we know that asynchronous compute takes advantage of hardware capabilities AMD has been building into its GPUs for a very long time. The HD 7970 was AMD's first card with an asynchronous compute engine, and it launched in 2012. You could even argue that devoting die space and engineering effort to a feature that wouldn't be useful for four years was a bad idea, not a good one. AMD has consistently said that some of the benefits of older cards would appear in DX12, and that appears to be what's happening.

Asynchronous computing is not itself part of the DX12 specification, but it's one method of implementing a DirectX 12 multi-engine. Multi-engines are explicitly part of the DX12 specification. How these engines are implemented may well affect relative performance between AMD and Nvidia, but they're one of the advantages to using DX12 as compared with previous APIs.

Third, every bit of independent research on this topic has confirmed that AMD and Nvidia have greatly different asynchronous compute capabilities. Nvidia's own slides illustrate this as well. Nvidia cards cannot handle asynchronous workloads the way that AMD's can, and the differences between how the two cards function when presented with these tasks can't be bridged with a few quick driver optimizations or code tweaks. Beyond3D forum member and GPU developer Ext3h has written a guide to the differences between the two platforms. It's a work-in-progress, but it contains a significant amount of useful data.

Fourth, Nvidia PR has been silent on this topic. Questions about Maxwell and asynchronous compute have been bubbling for months; we've requested additional information on several occasions. Nvidia is historically quick to respond to either wrong information or misunderstandings, often by making highly placed engineers or company personnel available for interview. The company has a well-deserved reputation for being proactive in these matters, but we've heard nothing through official channels.

Fifth and finally, we know that AMD GPUs have always had enormous GPU compute capabilities. Those capabilities haven't always been displayed to their best advantage for a variety of reasons, but they've always existed, waiting to be tapped. When Nvidia designed Maxwell, it prioritized rendering performance; there's a reason why the company's highest-end Tesla SKUs are still based on Kepler (aka the GTX 780 Ti / Titan Black).

It's fair to say that the Nitrous Engine's design runs better on AMD hardware, but there's no proof that the engine was designed to disadvantage Nvidia hardware, or to prevent Nvidia cards from executing workloads effectively.

Conclusion

Ashes of the Singularity launches in a month. It's going to be a major DX12 data point for several years, at least, and we don't yet know if the shift to that API means that more engines will move to using asynchronous compute or not. It's certainly possible, particularly given that both the Xbox One and PS4 can make use of asynchronous compute already.

For now, we recommend treating these results as an interesting example of how a new API can open up performance capabilities and breathe new life into older hardware. While time constraints prevented us from testing older AMD or NV cards, data we've seen suggests that AMD GPUs see advantages from async compute across the company's entire product stack. It's not a miracle cure for an otherwise-slow card, but it gives a solid benefit.

If you already own a GeForce card, we still recommend waiting before rushing out to purchase new hardware. Both AMD and Nvidia have 14nm refreshes coming this year, and relative rankings could change depending on the architectures of the new cards. For now, however, AMD seems to be gaining more from the DX12 shift than Nvidia is, and the Fury X is an absolute titan in Ashes of the Singularity.

Update: (2/24/2016) Nvidia reached out to us tonight to confirm that while the GTX 9xx series does support asynchronous compute, it does not currently have the feature enabled in-driver. Given that Oxide has pledged to ship the game with defaults that maximize performance, Nvidia fans should treat the asynchronous compute-disabled benchmarks as representative at this time. We'll revisit performance between Teams Red and Green if Nvidia releases new drivers that substantially change performance between now and launch day.