Oliverda
Topic owner
"Interesting to note are the "two tightly linked cores" sharing resources. The shared FPU (which seems to be capable of 1x256-bit FMAC or 2x128-bit FMAC/FADD/FMUL in any combination per cycle) was proposed many years ago by Fred Weber (AMD's CTO at that time). He already said that two cores might share an FPU sitting between them. The whole thing is a CMT-capable processor, as speculated before. And if we look at the core counts of Bulldozer-based MPUs, we should remember that 2 such cores are accompanied by 1 FPU, and an 8-core Zambezi actually contains 4 of the blocks shown on the Bulldozer slide.
Furthermore, the 4 int pipelines per core/cluster aren't detailed, while for Bobcat they are. In the latter case we see 2 ALU + 1 Store + 1 Load pipes. For BD I still think that we'll see 2 ALU + 2 AGU (combined Load/Store) pipes. Those "multi mode" AGUs would simply be a better fit for achieving higher bandwidth and more flexibility, because the FPU will also make use of these pipes. BTW, compared to the combined 48B/cycle L1 bandwidth (which could be used e.g. as 48B load bandwidth) of Sandy Bridge, we might only see 32B/cycle L1 bandwidth per core but up to 64B/cycle combined L1 bandwidth per FPU (although shared by 2 threads). Finally, nobody knows the clock speeds of these processors, so no real comparison is possible right now.
Today the rumour site Fudzilla posted some rumoured details of BD, e.g. DDR3-1866 compatibility, 8 MB L3 (for 8 cores) and "APM Boost Technology". A German site even mentioned some mysterious patents pointing in the same direction... Well, given the time frame and filing dates, I see both a chance for simple core-level overclocking like Intel's Turbo Boost, because this is covered in AMD patents, and a chance for more complex power management at the core-component level, as described in my second-to-last blog posting."
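A quick sanity check of the core-count and L1 bandwidth figures in the quote above, as a small Python sketch. The function and constant names are made up for illustration, and the numbers are the quote's own speculation, not confirmed specifications.

```python
# Figures taken from the quoted speculation; none are confirmed specs.
CORES_PER_MODULE = 2          # two integer cores share one FPU
SNB_L1_BYTES_PER_CYCLE = 48   # Sandy Bridge combined L1 bandwidth (from the quote)
BD_L1_BYTES_PER_CORE = 32     # guessed Bulldozer L1 bandwidth per core


def modules_for(cores):
    """Number of two-core blocks (each with one shared FPU) in a chip."""
    if cores % CORES_PER_MODULE:
        raise ValueError("core count must be a multiple of 2")
    return cores // CORES_PER_MODULE


def l1_bandwidth_per_fpu(per_core=BD_L1_BYTES_PER_CORE):
    """Combined L1 bytes/cycle reachable by one shared FPU (both cores' L1)."""
    return per_core * CORES_PER_MODULE


print(modules_for(8))            # 4 blocks in an 8-core Zambezi
print(l1_bandwidth_per_fpu())    # 64 B/cycle, versus 48 B/cycle quoted for Sandy Bridge
```

Without clock speeds, the per-cycle numbers say nothing about absolute throughput, which is the quote's own caveat.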
"Take the current decoders and add a fourth. This would work most of the time but would work better if you add another small buffer between the predecode and decode stages or if you increase the depth of the pick buffer. Then you continue with 4 pipelines to the reorder buffer and down to four dispatch and four ALU/AGU units. You can't break these apart without redesigning the front end decoder and the dispatch units.
The FMAC units are now twice as fast so you really don't need more than two per pair of cores. Besides you can't keep the same ratio while doubling the speed without busting your thermal limits. Even at that, the L1 Data bus is too small. So, you double the width of the L2 cache bus and get your SSE data directly from there. This leaves the existing L1 Data caches to service the Integer units. That increases the data bandwidth by 50%. The current limit is two 128 bit loads per core. If we allow the same for FP then we have six 128 bit loads per core pair. The current limit is two 64 bit saves per core. They could leave this unchanged on the integer side and beef up the FP unit to allow two 128 bit loads, one 128 bit load plus one 128 bit save, or two 128 bit saves. That would give the FP sufficient bandwidth. The front end doesn't really have to be changed since current FP instructions are already sent to another bus. It's a good question of whether each core gets one FMAC all to itself or whether they can intermingle. Either would be possible since threading the FP pipeline would only require extra tags to tell the two threads apart. Two threads would also break some dependencies and partially make up for the extra volume due to tighter pipeline packing and fewer stalls.
Presumably, widening the pipelines to four would give a good boost to Integer performance. I suppose 33% would be the max unless they beef up the decoders, but 20% would be enough. I'm guessing they would also change the current behavior of the decoders, which now tend to stall when decoding complex instructions, and would probably reduce the decode time on more of the Integer instructions.
The only reason for a complete redesign would be if the architecture extensions can't fit into the thermal limits."
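The percentages in the second quote can be reproduced with simple ratios. The Python sketch below assumes the "current" design is three pipelines wide (implied by "add a fourth") and that the shared FP unit gets two 128-bit loads per core pair on top of the existing integer loads. All names are hypothetical.

```python
from fractions import Fraction

LOADS_PER_CORE = 2      # 128-bit loads per cycle per core (the quote's "current limit")
CORES_PER_PAIR = 2
FP_LOADS_PER_PAIR = 2   # assumption: the shared FP unit adds two loads fed from L2


def loads_per_pair():
    """Return (current, proposed) 128-bit loads per cycle for a core pair."""
    current = LOADS_PER_CORE * CORES_PER_PAIR
    proposed = current + FP_LOADS_PER_PAIR
    return current, proposed


def relative_gain(old, new):
    """Fractional increase when going from old to new."""
    return Fraction(new - old, old)


current, proposed = loads_per_pair()
print(current, proposed, float(relative_gain(current, proposed)))  # 4 6 0.5
print(round(float(relative_gain(3, 4)) * 100))                     # 33
```

So six loads per pair is the 50% data bandwidth increase mentioned, and going from three to four pipelines caps the integer gain at roughly 33%.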

