This is what's expected to come with Navi next year (2018).
As far as DCS, DX11, DX12, etc. are concerned, nothing changes.
GPUs today are already divided into distinct shader groups, each with its own subprocessors, caches, etc.
The difference with Navi is that instead of cramming everything onto one huge GPU, they use 4 smaller ones linked on the interposer with HBM and a new memory interface.
So, using a general reference of, say, $1000 per wafer, where one wafer fits either 4 of the big dies or 16 of the small ones:
A huge 4096-core, 500mm^2 GPU would cost, say, $250 a pop.
A smaller 1024-core GPU at a quarter of that area (about 125mm^2) would cost about $62.50 a pop.
The company can take four of the $62.50 chips, link them on the interposer, and create the same 4096-core processor for the same $250 of raw silicon, while still being able to scale the chip count to fit the needs of their entire graphics card family/range,
scaling from 1 chip (1024 cores) to 2 (2048), 4 (4096), 6 (6144), or 8 (8192).
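If you want to play with the numbers, here's a rough Python sketch of that arithmetic. It assumes the round figures used in this post ($1000 per wafer, 4 big dies or 16 small dies per wafer) and ignores interposer, HBM, and packaging costs entirely. The names `cost_per_die` and `product_stack` are just made up for the illustration.

```python
WAFER_COST = 1000.0          # dollars per wafer (the post's round reference figure)
CORES_PER_SMALL_DIE = 1024   # cores on one small die


def cost_per_die(dies_per_wafer, wafer_cost=WAFER_COST):
    """Raw silicon cost of one die, ignoring defects, interposer and packaging."""
    return wafer_cost / dies_per_wafer


def product_stack(chip_counts, small_die_cost):
    """Return (chips, cores, silicon cost) for each card in a hypothetical range."""
    return [(n, n * CORES_PER_SMALL_DIE, n * small_die_cost) for n in chip_counts]


big_die = cost_per_die(4)      # 4096-core die, assumed 4 per wafer
small_die = cost_per_die(16)   # 1024-core die, assumed 16 per wafer
stack = product_stack((1, 2, 4, 6, 8), small_die)

for chips, cores, cost in stack:
    print(f"{chips} chip(s): {cores} cores, ${cost:.2f} of raw silicon")
```

Note that four small dies come out at the same raw silicon cost as one big die, so on these assumptions the savings have to come from reusing one design and from yields, not from the silicon area itself.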
They would only have to pay to fabricate ONE chip design, and would not have to lease multiple production lines to run multiple wafer designs.
This doesn't even take into account the money saved on imperfections in the wafers, which cause defects and low yields.
So, say each of those example wafers has 2 defects in it.
Depending on where they land:
On the big design (4 dies per wafer), that can render 1 or 2 chips defective. That's 25-50% of the yield, or up to 2 of 4 fabbed chips ($250 to $500 lost to defects).
On the small design (16 dies per wafer), that can render 1 or 2 chips defective. That's about 6-12.5% of the yield, or up to 2 of 16 fabbed chips ($62.50 to $125 lost to defects).
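The defect math above can be sketched like this. The best and worst cases follow directly from the example (all defects on one die vs. each defect on a different die). The "expected" figure is my own added assumption that each defect lands on a uniformly random die, independently of the others; real defect distributions are messier. `defect_loss` is a hypothetical helper name.

```python
def defect_loss(dies_per_wafer, defects=2, wafer_cost=1000.0):
    """Dies and dollars lost per wafer for a given number of point defects."""
    die_cost = wafer_cost / dies_per_wafer
    best = 1 if defects > 0 else 0            # every defect hits the same die
    worst = min(defects, dies_per_wafer)      # every defect hits a different die
    # Assumption: defects land on uniformly random dies, independently.
    expected = dies_per_wafer * (1 - (1 - 1 / dies_per_wafer) ** defects)
    return {
        "best_fraction": best / dies_per_wafer,
        "worst_fraction": worst / dies_per_wafer,
        "expected_dies_lost": expected,
        "best_dollars": best * die_cost,
        "worst_dollars": worst * die_cost,
    }


big = defect_loss(4)     # big design, assumed 4 dies per wafer
small = defect_loss(16)  # small design, assumed 16 dies per wafer

for name, loss in (("big", big), ("small", small)):
    print(
        f"{name}: {loss['best_fraction']:.1%} to {loss['worst_fraction']:.1%} of dies lost, "
        f"${loss['best_dollars']:.2f} to ${loss['worst_dollars']:.2f} per wafer"
    )
```

Either way you slice it, the same two defects cost a much smaller share of the wafer when the dies are small.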
Now extrapolate that to the real-life cost of a 14nm production wafer, which is somewhere around $5-6K each depending on size (typically a 300mm-diameter wafer, with far more area than the example wafer above).
Then extrapolate that to how many runs the company does to meet supply demand.
You're looking at thousands of wafers.
If you could minimize how many chips you throw away to wafer defects and poor fabrication yields, you would not only save on production costs, you'd also be able to sell the product cheaper.
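That's a big part of why Ryzen is significantly cheaper than the Intel i9s etc. The 8-core/16-thread chips aren't laid out as one monolithic 8-core block; they are two 4-core/8-thread core complexes (CCXs) linked by Infinity Fabric, and the bigger Threadripper and EPYC parts link several of those dies together on one CPU substrate package.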
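To put rough numbers on the wafer-run argument, here's a sketch of cost per sellable 4096-core GPU over a production run. Everything in it is an assumption for illustration: 1000 wafers, $5500 per wafer (middle of the $5-6K guess), the toy die counts from the example (4 big or 16 small per wafer), and the worst case of 2 dies lost per wafer. Real wafers hold far more dies, and interposer and packaging costs are ignored, so only the ratio between the two results means anything. `cost_per_good_gpu` is a hypothetical helper.

```python
def cost_per_good_gpu(wafers, wafer_cost, dies_per_wafer, dies_lost_per_wafer, dies_per_gpu):
    """Total wafer spend divided by the number of complete GPUs that can be built."""
    good_dies = wafers * (dies_per_wafer - dies_lost_per_wafer)
    gpus = good_dies // dies_per_gpu   # leftover dies that don't fill a GPU are not counted
    return wafers * wafer_cost / gpus


WAFERS = 1000        # assumed size of the production run
WAFER_COST = 5500.0  # assumed cost per 14nm wafer, in dollars

# One big die per GPU vs. four small dies per GPU, 2 dies lost per wafer in both cases.
monolithic = cost_per_good_gpu(WAFERS, WAFER_COST, dies_per_wafer=4, dies_lost_per_wafer=2, dies_per_gpu=1)
chiplet = cost_per_good_gpu(WAFERS, WAFER_COST, dies_per_wafer=16, dies_lost_per_wafer=2, dies_per_gpu=4)

print(f"monolithic: ${monolithic:.2f} per good GPU")
print(f"chiplet:    ${chiplet:.2f} per good GPU")
print(f"chiplet silicon cost is {chiplet / monolithic:.0%} of monolithic")
```

Under those assumptions the multi-chip version needs a bit over half the silicon spend per working GPU, which is the whole point of the argument.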