#### Integrated Power Delivery for High Performance Server Based Microprocessors

#### J. Ted DiBene II, Ph.D.- Intel, Dupont-WA

#### International Workshop on Power Supply on Chip, Cork, Ireland, Sept. 24-26





THIS PRESENTATION AND RELATED MATERIALS AND INFORMATION ARE PROVIDED "AS IS" WITH NO WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, NON-INFRINGEMENT OF INTELLECTUAL PROPERTY RIGHTS, OR ANY WARRANTY OTHERWISE ARISING OUT OF ANY PROPOSAL, SPECIFICATION, OR SAMPLE. INTEL ASSUMES NO RESPONSIBILITY FOR ANY ERRORS CONTAINED IN THIS PRESENTATION AND HAS NO LIABILITIES OR OBLIGATIONS FOR ANY DAMAGES ARISING FROM OR IN CONNECTION WITH THE USE OF THIS PRESENTATION.

#### NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED HEREIN.

- All products, dates, and figures specified are preliminary based on current expectations, provided for planning purposes only, and are subject to change without notice.
- No promises are made, express or implied, nor are any obligations assumed or created by Intel or you solely as a result of this presentation to sell or purchase from the other party any products and you should not make any commitments to do so or otherwise rely on this presentation or on related materials or information.
- Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
- \*Other names and brands may be claimed as the property of others.
- Copyright © 2008 Intel Corporation



# Agenda

- Where Processor Loads Occur
- Multi-core Server Based Processors
  - Reasons for fine grain power management
- MCP Power Delivery Constraints
- Silicon Based Power Delivery Design
- Conclusions



# **MICROPROCESSOR LOADS**

Where they occur on the processor



#### **Processors General**

#### Processors functioning...

- Fetch, Decode, Execute, Write
- Power burned in discrete units often non-sequentially
- Load concentrates in active logical units

| Cycle              | 1     | 2      | 3      | 4       | 5       | 6       | 7       | 8     | 9 |
|--------------------|-------|--------|--------|---------|---------|---------|---------|-------|---|
| Instr <sub>1</sub> | Fetch | Decode |        | Execute |         | Write   |         |       |   |
| Instr <sub>2</sub> | Fetch | Decode |        | Wait    |         | Execute | Write   |       |   |
| Instr <sub>3</sub> |       | Fetch  | Decode | Execute | Write   |         |         |       |   |
| Instr <sub>4</sub> |       | Fetch  | Decode |         | Wait    |         | Execute | Write |   |
| Instr <sub>5</sub> |       |        | Fetch  | Decode  | Execute | Write   |         |       |   |
| Instr <sub>6</sub> |       |        | Fetch  | Decode  | Execute | Write   |         |       |   |
| Instr <sub>7</sub> |       |        |        | Fetch   | Decode  | Execute | Write   |       |   |
| Instr <sub>8</sub> |       |        |        | Fetch   | Decode  | Execute | Write   |       |   |

#### **Execution Cycle**



## Load currents can concentrate in regions throughout the execution of a process

Load Cycle






#### Modeled Results – Package Static Analysis - Example

#### Package Planes

- Funneling can occur into the load where current density is highest







2D View of Package Current Density – J(x,y)




### MULTI-CORE SERVER BASED PROCESSORS

Justification for Fine-Grain Power Management




#### **Server MBVR Today**

- Caps distributed around socket but static current comes from inductor 'nodes'
- Distribution typically looks good for *uniform* loads
- However, for multi-core, this results in large impedance as discussed.



MBVR region



## **Power in Distribution – LF**

Static power distribution is determined from the current density distribution in the path; J(r,z) predominantly determines the impedance.

$$\oint_{S} (E \times H) dS = -\left\{ \int_{V} H \bullet \frac{\partial B}{\partial t} dV + \int_{V} E \bullet \frac{\partial D}{\partial t} dV + \int_{V} E \bullet J dV \right\}$$

At low frequency, the power loss in the plane structures dominates.
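As an illustration of why current density matters for static loss: in the static limit the two time-derivative terms of the power balance vanish, leaving the ohmic term, which for a conductor with $E = J/\sigma$ becomes $\int_V J^2/\sigma \, dV$. The sketch below is a minimal discretised version of that integral. The cell values, cell volume, and copper conductivity are assumptions chosen for illustration and are not taken from the presentation.

```python
# Minimal sketch: static ohmic loss as a discretised integral of J^2 / sigma.
# All numeric values are hypothetical, for illustration only.

SIGMA_CU = 5.8e7  # S/m, approximate conductivity of copper (assumed material)


def ohmic_loss(j_cells, sigma, cell_volume):
    """Sum J^2 / sigma over equal-volume cells; J in A/m^2, result in watts."""
    return sum(j * j for j in j_cells) / sigma * cell_volume


cell_volume = 1e-12                      # m^3 per cell (hypothetical)
uniform = [1.0e6, 1.0e6, 1.0e6, 1.0e6]   # current spread evenly over four cells
funnelled = [4.0e6, 0.0, 0.0, 0.0]       # same total, crowded into one cell

loss_uniform = ohmic_loss(uniform, SIGMA_CU, cell_volume)
loss_funnelled = ohmic_loss(funnelled, SIGMA_CU, cell_volume)
print(loss_funnelled / loss_uniform)
```

Because the loss scales with the square of the local current density, the same total current dissipates more power when it funnels into a small region than when it spreads uniformly.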



#### **Movement in Server Microprocessors**

- Industry direction is now towards multiple cores rather than fewer cores





#### **Multi-Core Server Processors**

- For multi-core operation, MBVR distribution may result in higher loss and impedance due to larger parasitics, distance, and current density increase.



#### **Processor Activities**

- In generic multi-threaded server processors, activity can overlap in time windows (speeds and feeds)
  - Activity Factors are key
  - Load overlap is key
- Performance and Power metrics are key
  - Process quickly // shut down fast
- Power Density has increased
  - Control power at load not at source



#### **Processor – Low Level**

Simple Transistor Operations...

- At the low level, current is drawn both 'statically' and dynamically






#### Potential Power Savings Estimated - Leakage

Power savings are estimated from the leakage current reduction (thermal and $V_{dd}$) as well as the set-point.

- Sub-threshold currents tend to dominate in leakage equation
- Reductions may also occur due to thermals not estimated here.

$$\Delta P = \left( P_{MBVR} - P_{PSOC} \right) / P_{MBVR}$$

$$\frac{\Delta P}{P} \cong \left(\frac{V_{dd}\, k\, 10^{\frac{V_{dd}}{V_{low}}} - V_{dd}\, k\, 10^{\frac{V_{dd}'}{V_{low}}}}{V_{dd}\, k\, 10^{\frac{V_{dd}}{V_{low}}}}\right) = \left(\frac{V_{dd}\, 10^{\frac{V_{dd}}{V_{low}}} - V_{dd}\, 10^{\frac{V_{dd}'}{V_{low}}}}{V_{dd}\, 10^{\frac{V_{dd}}{V_{low}}}}\right)$$

using the approximation

$$I_{off} \cong k\, 10^{\frac{V_{dd}}{V_{low}}}$$
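The fractional savings expression above can be evaluated numerically. The following is a minimal sketch that follows the slide's form literally (the voltage prefactor is kept at the original $V_{dd}$ in both terms, and only the exponent uses the reduced voltage). The function names and the example voltages are hypothetical; the presentation gives no numeric values.

```python
# Minimal sketch of the leakage savings estimate, assuming I_off ~ k * 10**(Vdd / Vlow).
# Function names and example values are hypothetical, for illustration only.


def leakage_power(v_prefactor, v_exponent, v_low, k=1.0):
    """Leakage power term: V * k * 10**(V_exp / V_low)."""
    return v_prefactor * k * 10 ** (v_exponent / v_low)


def fractional_leakage_savings(vdd, vdd_reduced, v_low, k=1.0):
    """Delta P / P in the form written on the slide.

    The prefactor stays at vdd in both terms, as in the original expression;
    only the exponent uses the reduced supply voltage.
    """
    p_old = leakage_power(vdd, vdd, v_low, k)
    p_new = leakage_power(vdd, vdd_reduced, v_low, k)
    return (p_old - p_new) / p_old


# Hypothetical example: supply lowered from 1.10 V to 1.00 V, with V_low = 0.5 V.
savings = fractional_leakage_savings(1.10, 1.00, 0.5)
print(f"estimated fractional leakage savings: {savings:.3f}")
```

In this form the constant $k$ and the prefactor cancel, so the estimate reduces to $1 - 10^{(V_{dd}' - V_{dd})/V_{low}}$ and depends only on the voltage reduction relative to $V_{low}$.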



#### Loadline – Where Power is Saved

Loadline = Representation of impedance between power source and load

When close to load, power may be saved due to lowering voltage at source
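To make the loadline argument concrete, the sketch below treats the loadline as a simple series resistance between source and load, which is an assumption made here for illustration. It computes the source voltage needed to hold a minimum voltage at the load and the resistive loss in the path. All numeric values and function names are hypothetical.

```python
# Minimal sketch: loadline modelled as a series resistance (an assumption).
# All values below are hypothetical, for illustration only.


def required_source_voltage(v_min_load, i_load, r_loadline):
    """Source set-point needed so the load still sees v_min_load at i_load."""
    return v_min_load + i_load * r_loadline


def path_loss(i_load, r_loadline):
    """Resistive loss in the delivery path, I^2 * R, in watts."""
    return i_load ** 2 * r_loadline


V_MIN = 1.0      # V, minimum voltage the load must see (hypothetical)
I_LOAD = 100.0   # A, load current (hypothetical)

for label, r_ll in (("distant source", 1.0e-3), ("source near load", 0.2e-3)):
    v_src = required_source_voltage(V_MIN, I_LOAD, r_ll)
    loss = path_loss(I_LOAD, r_ll)
    print(f"{label}: set-point {v_src:.3f} V, path loss {loss:.1f} W")
```

A shallower loadline lets the source set-point sit closer to the voltage the load actually needs, and it also reduces the power dissipated in the delivery path.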





## Leakage Power Revisited

Leakage (drain-source)

- Leakage current dominated by sub-threshold dimensions and DIBL.
- Threshold voltage lowering
- Dependent upon internal state q and input vector v.
- Temperature dependence estimation
- Other leakage components exist e.g. gate leakage not evaluated here.

#### Standard NMOS sub-threshold current equation

$$P_{Leak} = V_{DD} \cdot I_{Leak}(v, q) \quad [1]$$

$$I_{N} = \mu_{N} C_{ox} \frac{W}{L} V_{t}^{2} e^{\frac{V_{GS} - V_{THN}}{nV_{t}}} \left( 1 - e^{-\frac{V_{DS}}{V_{t}}} \right) [2]$$

$$P_{new} \approx (\% P_{leak} \bullet P_{old}) 10^{\frac{\Delta T}{T_H}}$$
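The sub-threshold current equation and the temperature scaling above can be sketched numerically as follows. Here $V_t$ is taken to be the thermal voltage $kT/q$, and every device parameter in the example (the $\mu_N C_{ox}$ product, $W/L$, threshold voltage, slope factor $n$, and $T_H$) is a hypothetical placeholder, not a value from the presentation.

```python
import math

# Minimal sketch of the sub-threshold current and temperature scaling equations.
# Device parameters are hypothetical placeholders, for illustration only.

K_B = 1.380649e-23       # J/K, Boltzmann constant
Q_E = 1.602176634e-19    # C, elementary charge


def thermal_voltage(temp_k):
    """V_t = k*T/q in volts."""
    return K_B * temp_k / Q_E


def subthreshold_current(v_gs, v_ds, v_thn, n, mu_cox, w_over_l, temp_k=300.0):
    """I_N = mu*Cox*(W/L)*Vt^2 * exp((Vgs - Vthn)/(n*Vt)) * (1 - exp(-Vds/Vt))."""
    v_t = thermal_voltage(temp_k)
    return (mu_cox * w_over_l * v_t ** 2
            * math.exp((v_gs - v_thn) / (n * v_t))
            * (1.0 - math.exp(-v_ds / v_t)))


def scaled_leakage_power(p_old, leak_fraction, delta_t, t_h):
    """P_new ~ (leak_fraction * P_old) * 10**(delta_t / T_H)."""
    return leak_fraction * p_old * 10 ** (delta_t / t_h)


# Hypothetical device: off-state (Vgs = 0) with Vds = 1.0 V.
i_off = subthreshold_current(v_gs=0.0, v_ds=1.0, v_thn=0.3, n=1.5,
                             mu_cox=300e-6, w_over_l=10.0)
print(f"off-state current: {i_off:.3e} A")
```

The exponential dependence on $V_{GS} - V_{THN}$ is what makes threshold lowering (for example through DIBL) so costly: a shift of $n V_t \ln 10$ in the exponent changes the current by a factor of ten.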



### **Multi-Core Processors**

If power delivery does not change

- Power to load will increase
- Cost of power delivery solution will increase
- Requirements
  - Shallower loadline
  - Segment power delivery to each load
  - Smaller power delivery, implying Power SOC technology on or near package



### MULTI-CHIP PACKAGE POWER DELIVERY CONSTRAINTS

Power delivery where the load is



## **MCP Power Delivery**

- To combat the emerging issues of multicore, power delivery must get closer to the load. This requires delivery on package.
- Many rails require many VRs
- Reliability is a key metric
- Size is a key metric
- Cost is a key metric



Example large die on package.

#### Example Power SOC Device next to die





## **Constraints of MCP**

- Package
  - Must be compatible with manufacturing design rules (spacings, Cu layers, etc.)
  - Must be able to use current capacitor technology
  - Must be compatible with manufacturing rules for existing silicon.
- System
  - Must be compatible with VRs on platform for higher input voltages
  - Must be cooled with existing thermal solutions and must not impact the thermals of the microprocessor.
  - Must be cost effective over existing platform designs and reduce cost at platform.



# Many Rails, Many VRs

#### Architecture Requirements

- Power SOC device must be able to have many rails to supply the many loads
- Number of phases must be high: activity of the load modulates from off to full on, so efficiency must be flat through the range.
- Response must be very fast (the change in impedances requires less local cap [and different!]; thus, loop response is > 5 MHz).



### SILICON BASED POWER DELIVERY DESIGN

Power SOC Design Considerations



# If a Power SOC

#### Integration of Magnetics, Capacitors

- Low resistance vertical connections an advantage
- Backend compatibility with Silicon CMOS process required.
- Power Density must meet manufacturing requirements
- Must be compatible with Assembly manufacturing

Example Cross-section of Power SOC

Metal 'L' layer




# **Physics Requirements**

$$W_{M} = \frac{1}{2} \iiint_{\upsilon} \vec{B} \cdot \vec{H} d\upsilon \cong \frac{B^{2} \upsilon}{2\mu_{r} \mu_{0}}$$

- Power SOC Energy Storage
  - Energy Density
  - Power Loss
  - Dimensional Constraints

$$\frac{W_{_{M}}}{W_{_{A}}} \cong \mu_{r}$$

$$W_a \cong W_m \Longrightarrow \upsilon_m \mu_r \cong \upsilon_a$$

$$\frac{R_a}{R_m} = \frac{P_a}{P_m} \approx \frac{l_m \sqrt{\mu_r}}{l_a}$$
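The energy storage relations above can be checked with a short numerical sketch. It assumes a linear magnetic material with a uniform field, and it compares the magnetic-core and air-core cases at the same field strength $H$; that equal-$H$ comparison is an assumption under which the ratio $W_M / W_A \cong \mu_r$ follows. The numeric values and function names are hypothetical.

```python
import math

# Minimal sketch of the stored-energy relations, assuming a linear material,
# a uniform field, and the same H in both cores. Values are hypothetical.

MU_0 = 4e-7 * math.pi  # H/m, permeability of free space (classical value)


def magnetic_energy_from_b(b, volume, mu_r=1.0):
    """W = B^2 * v / (2 * mu_r * mu_0), in joules."""
    return b ** 2 * volume / (2.0 * mu_r * MU_0)


def magnetic_energy_from_h(h, volume, mu_r=1.0):
    """Same energy expressed through H, using B = mu_0 * mu_r * H."""
    b = MU_0 * mu_r * h
    return magnetic_energy_from_b(b, volume, mu_r)


def air_core_volume_for_equal_energy(volume_magnetic, mu_r):
    """Air-core volume storing the same energy at the same H: v_a = mu_r * v_m."""
    return mu_r * volume_magnetic


H_FIELD = 1.0e3   # A/m (hypothetical)
V_M = 1.0e-9      # m^3, magnetic-core volume (hypothetical)
MU_R = 50.0       # relative permeability (hypothetical)

w_m = magnetic_energy_from_h(H_FIELD, V_M, MU_R)
w_a_same_volume = magnetic_energy_from_h(H_FIELD, V_M, 1.0)
v_a = air_core_volume_for_equal_energy(V_M, MU_R)
w_a_equal_energy = magnetic_energy_from_h(H_FIELD, v_a, 1.0)
print(w_m / w_a_same_volume, v_a / V_M)
```

Under these assumptions an air-core inductor needs roughly $\mu_r$ times the volume of a magnetic-core one to store the same energy, which is why the dimensional constraints push toward integrated magnetic materials.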



## Performance

- Efficiency of Power SOC must be high to combat extra stage.
- Speed of Power SOC must be better than MBVR due to less energy storage near die.
- Size of Power SOC must be much smaller to allow getting closer to load.



# **Silicon Design Constraints**

- Integration is key
  - Integrate everything!
  - Passives (magnetics, capacitors, etc.) should be integrated to ensure parasitics are reduced.
  - Layout constraints of bridges, drivers, and bias circuitry are key to limiting noise generation on die.
  - Cannot affect operation of the load: emissions, thermals, and electrical noise to PLLs are extremely important, so the location of these devices on the Power SOC is critical.



# **Silicon Design Constraints**

- Bridge Design
  - Routing thru to metal stack and bumps
  - Current densities
- Driver Design
  - Proximity to segmented bridges
- Bias Control Circuitry Design
  - Noise immunity
  - Isolation
  - Speed





# **Silicon Design Constraints**

- High Volume Manufacturing Design: things to consider...
  - Manage your transistor sizes for different functions
  - Key analog circuits (BW range is typically 5-400 MHz): keeping gain, low noise, and HF is key here.
  - PSRR and CMRR
    - Need high rejection on sensitive op-amps and references (bandgaps).



## **Other Considerations**

#### Digital Control

- Bias power in digital control can often exceed 10x that of analog control for similar BW. Breakthroughs are needed here.

#### Materials

Compatibility with silicon manufacturing is important.

#### System

Compatibility with existing infrastructure on platforms



## Conclusions

- Architecture and technology in microprocessors drive multiple rails
- There is justification for moving closer to load in server based microprocessor designs
- Power SOC architecture must support multiple rails, phases, and be quick
- Need to match packaging constraints of load
- Cost drives CMOS implementation
- Architecture for Power SOC Silicon design must take into account inter-system analog/digital design constraints.

