# **Introduction to System-on-Chip**

COE838/EE8221 Systems-on-Chip Design http://www.ecb.torontomu.ca/~courses/coe838/

### Dr. Gul N. Khan

### http://www.ecb.torontomu.ca/~gnkhan

Electrical, Computer & Biomedical Engineering, Toronto Metropolitan University

## **Overview**

- Course Management
- Introduction to SoC
- SoC Applications
- On-Chip Interconnections
- Bus and NoC based SoC Interconnects

Introductory articles on SoC are available at the course webpage.

## COE838/EE8221: Systems-on-Chip Design

http://www.ecb.torontomu.ca/~courses/coe838/

- Instructor: Dr. Gul N. Khan
- Email: gnkhan@torontomu.ca
- URL: http://www.ecb.torontomu.ca/~gnkhan
- Telephone: 416 979-5000 ext. 556084
- Office: ENG448
- Consultation: Monday 1:45-3:00 PM, or by appointment


## Lectures, Labs and Projects

### Half Notes

• Students need to take their own notes and will also need the text/reference books and some research articles identified by the instructor.

### **Labs and Project**

• Aimed at concept reinforcement and practical experience. Lectures, labs, projects and other support material are available at the course website:

http://www.ecb.torontomu.ca/~courses/coe838/

### **Assessment and Evaluation**

- Labs/Project: 32% (20% Labs and 12% Project; for EE8221 students, 32% Project)
- Midterm Exam: 25% (Monday, February 10, 2025, during the lecture timeslot)
- Final Exam: 43%

### **Course Text/Reference Books and other Material**

### **Text and Other Books**

1. D.C. Black, J. Donovan, B. Bunton, A. Keist, SystemC: From the Ground Up, 2nd Edition, Springer 2010, ISBN 978-0-387-69958-5.

2. M.J. Flynn, W. Luk, Computer System Design: System on Chip, John Wiley and Sons Inc. 2011, ISBN 978-0-470-64336-5.

3. M. Wolf, Computers as Components: Principles of Embedded Computing System Design, 3rd or 4th Edition, Morgan Kaufmann-Elsevier Publishers 2012, 2016, ISBN 978-0-12-388436-7, ISBN 978-0-12-805387-4.

4. S. Pasricha, N. Dutt, On-Chip Communication Architectures: System on Chip Interconnect, Morgan Kaufmann-Elsevier Publishers 2008, ISBN 978-0-12-373892-9.

5. Z. Navabi, Embedded Core Design with FPGAs, McGraw-Hill 2007, ISBN 978-0-07-147481-8, ISBN 0-07-147481-1.

*Some Articles, Embedded Processors and other Data Sheets* are available at the course website: http://www.ecb.torontomu.ca/~courses/coe838/

## **Main Lecture Topics**

1. Introduction to System on Chip (SoC)
   - An SoC Design Approach
2. SystemC and SoC Design
   - Co-Specification, System Partitioning, Co-simulation, and Co-synthesis
   - SystemC for Co-specification and Co-simulation
3. Hardware-Software Co-Synthesis, Accelerator-based SoC Design
4. Basics of Chips and SoC ICs
   - Cycle Time, Die Area-and-Cost, Power
   - Area-Time-Power Tradeoffs and Chip Reliability
5. System-on-Chip and SoPC (System on Programmable Chips)
6. SoC Interconnection Structures: Network on Chip
   - NoC Interconnection and NoC Systems
7. Bus-based Interconnection
   - AMBA Bus, IBM CoreConnect, Avalon, Interconnection Structures
8. SoC CPU/IP Cores
   - ARM Cortex A9, NIOS-II, OpenRISC, Leon4 and OpenSPARC
9. SoC Verification and UVM
10. SoC Application Case Studies (time permitting)

# System on a Chip

- An IC that integrates multiple components of a system onto a single chip.
- MPSoC addresses performance requirements.



# Samsung S3C6410 Platform



**Introduction to SoC Design** 

# S3C6410 System-on-Chip

- A 16/32-bit RISC low-power, high-performance microprocessor.
- Applications include mobile phones, portable navigation devices and other general applications.
- Provides optimized H/W performance for 2.5G and 3G communication services.
- Includes many powerful hardware accelerators for motion video processing, display control and scaling.
- An integrated Multi Format Codec (MFC) supports encoding and decoding of MPEG4/H.263 and H.264.
- Many hardware peripherals such as camera interface, TFT 24-bit LCD controller, power management, etc.

## S3C6410 based Mobile Processor

Navigation System



## iPhone based on ARM1176JZ S3C6410



## Samsung S5PC100 SoC used in iPhone 3GS




# S5PC100 Samsung SoC

S5PC100 has various functionalities:

- Integrates wireless communication, personal navigation, camera, portable gaming, video player and mobile TV into one device.
- Has a 32-bit ARM Cortex A8 RISC microprocessor that operates at up to 833 MHz.
- 64/32-bit internal bus architecture.
- Used in the iPhone 3GS and the 3rd generation iPod touch.



# Technology Roadmap in the past

| Year of Technology Node                            | 1999    | 2002    | 2005    | 2008    | 2011    | 2014    |
|----------------------------------------------------|---------|---------|---------|---------|---------|---------|
| Technology                                         | 180nm   | 130nm   | 100nm   | 70nm    | 50nm    | 35nm    |
| DRAM (introduction)                                | 1G      | 2~4G    | 8G      | -       | 64G     | -       |
| Transistors/chip (µP) (M)                          | 110     | 220~441 | 882     | 2,494   | 7,053   | 19,949  |
| Chip size (µP) (mm²)                               | 450     | 450~567 | 622     | 713     | 817     | 937     |
| Number of signal I/O (µP)                          | 768     | 1,024   | 1,024   | 1,280   | 1,408   | 1,472   |
| Power/Ground I/O (µP)                              | 1,536   | 2,018   | 2,018   | 2,560   | 2,816   | 2,944   |
| On-chip local clock (MHz) (high performance)       | 1,250   | 2,100   | 3,500   | 6,000   | 10,000  | 13,500  |
| On-chip across-chip clock (MHz) (high performance) | 1,200   | 1,600   | 2,000   | 2,500   | 3,000   | 3,600   |
| Off-chip speed (MHz) (high perf., peripheral buses) | 480    | 885     | 1,035   | 1,285   | 1,540   | 1,800   |
| Power (W) H.P./H.H.                                | 90/1.4  | 130/2.0 | 160/2.4 | 170/2.0 | 174/2.2 | 183/2.4 |
| Power supply (V) H.P./H.H.                         | 1.8/1.5 | 1.5/1.2 | 1.2/0.9 | 0.9/0.6 | 0.6/0.5 | 0.6/0.3 |
| Metal levels # (µP/SoC)                            | 7/6     | 8/7     | 9/8     | 9/9     | 10/10   | 10/10   |

H.P.: High Performance; µP: Microprocessor

H.H.: Hand-Held Devices
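As a rough illustration of what the roadmap implies, the transistor density of the microprocessor can be estimated by dividing the "Transistors/chip" row by the "Chip size" row. The sketch below uses only values copied from the table; the function and variable names are illustrative, and 2002 is left out because the table gives ranges for that year.

```python
# Values copied from the roadmap table:
# year -> (transistors per chip in millions, chip size in mm^2).
# 2002 is omitted because the table lists ranges for that year.
ROADMAP = {
    1999: (110, 450),
    2005: (882, 622),
    2008: (2494, 713),
    2011: (7053, 817),
    2014: (19949, 937),
}


def density_m_per_mm2(transistors_m, area_mm2):
    """Millions of transistors per square millimetre."""
    return transistors_m / area_mm2


for year, (transistors_m, area_mm2) in ROADMAP.items():
    d = density_m_per_mm2(transistors_m, area_mm2)
    print(f"{year}: {d:6.2f} M transistors/mm^2")

# Ratio of the 2014 projection to the 1999 value.
growth = density_m_per_mm2(*ROADMAP[2014]) / density_m_per_mm2(*ROADMAP[1999])
print(f"Density growth 1999 -> 2014: about {growth:.0f}x")
```

With these table values, the chip area roughly doubles between 1999 and 2014 while the estimated density grows by a much larger factor, so most of the projected transistor growth comes from scaling rather than from larger dies.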

## Number of Transistors on a Chip (SoC)



# Evolution: Boards to SoC

### **Evolution:**

- IP based design
- Platform-based design

### **Some Challenges**

- HW/SW Co-design
- Integration of analog (RF) IPs
- Mixed Design
- Productivity

### **Emerging new technologies**

- Greater complexity
- Increased performance
- Higher density
- Lower power dissipation



# What is System-on-Chip

SoC: More of a System not a Chip

\* In addition to the IC itself, an SoC includes software and an interconnection structure for integration.

An SoC may consist of all or some of the following:

- Processor/CPU cores
- On-chip interconnection (busses, network, etc.)
- Analog circuits
- Accelerators or application specific hardware modules
- ASIC logic
- Software: OS, applications, etc.
- Firmware

# System on a Chip

## **On-Chip Components?**

- A processor or multiple processors
  - Including DSPs, microprocessors, microcontrollers
- Cores (IPs): on-chip memory, accelerators, peripherals (e.g. USB, Ethernet), PLLs, power management, etc.






## ASIC to System-on-Chip

**ASICs:** Application-Specific ICs are close to SoCs: they are designed to perform a specific function for embedded and other applications.

\* ASIC vendors supply libraries for each technology they provide. Mostly, these libraries contain pre-designed and pre-verified logic circuits.

\* An SoC is an IC designed by combining multiple stand-alone VLSI designs to provide a functional IC for an application. It is composed of pre-designed models of complex functions, e.g. cores (IP blocks, virtual components, etc.), that serve various embedded applications.





# ASIC Design Flow




# System-on-Chip Design Flow

- Specify: What does the customer really want?
- Architect:
  - What is the most cost-effective and performance-effective architecture to implement it?
  - Which existing components can we adapt and re-use?
- Evaluate: What is the performance impact of a cheaper architecture?
- Implement: What can we generate automatically from libraries and customization?

Use separate computation, communication, etc.

# SoC Design Flow

## SoC -- Typical Design Steps



- Because of chip complexity and shrinking IC area, it is difficult to reduce the time spent in the placement, layout and fabrication steps.
- The time of the other steps, those before placement, layout and fabrication, therefore needs to be reduced.
- One should consider chip layout issues up-front.

System-on-Chip



## SOC Structure



## SOC: System on Chip

- SOC cannot be considered as a large ASIC
  - Architectural approach involving significant design reuse
  - Addresses the cost and time-to-market problems
- SOC design is significantly more complex
  - Need cross-domain optimizations
  - IP reuse will increase productivity, but not enough
  - Even with extensive IP reuse, many ASIC design problems will remain, and more ...





## **SOC** Applications

- SOC designs include embedded processor cores and a significant software component, which leads to additional design challenges.
- An SOC is a system on an IC that integrates software and hardware Intellectual Property (IP) using more than one design methodology.
- The designed system on a chip is application specific.

Typical applications of SOC:

- Consumer devices.
- Networking and communication.
- Biomedical Devices.
- Other segments of electronics industry.

Microprocessor, Media processor, GPS controllers, Cellular/Smart phones, ASICs, HDTV, Game Consoles, PC-on-a-chip

## IP: Intellectual Property Cores

IP cores can be classified into three types:

*Hard IP* cores are hard layouts using physical design libraries. The integration of hard IP cores is simple and easy. However, they are technology dependent and lack flexibility.

*Soft IP* cores are generally delivered as VHDL/Verilog code providing functional descriptions of the IPs. These cores are flexible and reconfigurable. However, soft IP cores must be synthesized and verified by the user before they are integrated.

*Firm IP* cores offer the advantages of both, balancing the high performance and optimization of hard IPs with the flexibility of soft IPs. These cores are provided as netlists targeted to specific physical libraries after synthesis.

# Multi-Core (Processor) System-on-Chip

Inter-node communication between CPUs/cores can be performed by message passing or shared memory. The number of processors on the same chip die increases at each technology node (CMP and MPSoC).

- Shared memory requires a shared bus
  - Large multiplexers
  - Cache coherence
  - Not scalable
- Message passing uses a NoC (Network-on-Chip)
  - Scalable
  - Requires data transfer transactions
  - Overhead of extra communication
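The scalability contrast above can be illustrated with a toy model. The sketch below assumes a k x k 2D mesh NoC with minimal (Manhattan-distance) routing, a topology the slide does not specify, and compares it with a single shared bus that offers one communication channel regardless of core count. The function names are hypothetical and the model ignores contention, router latency and cache-coherence traffic.

```python
from itertools import product


def mesh_links(k):
    """Number of bidirectional links in a k x k 2D mesh."""
    return 2 * k * (k - 1)


def mean_hops(k):
    """Mean Manhattan distance between distinct nodes of a k x k mesh (k >= 2)."""
    nodes = list(product(range(k), repeat=2))
    total = 0
    pairs = 0
    for x1, y1 in nodes:
        for x2, y2 in nodes:
            if (x1, y1) != (x2, y2):
                total += abs(x1 - x2) + abs(y1 - y2)
                pairs += 1
    return total / pairs


for k in (2, 4, 8):
    cores = k * k
    print(f"{cores:3d} cores: shared bus channels = 1, "
          f"mesh links = {mesh_links(k)}, mean hops = {mean_hops(k):.2f}")
```

In this model the number of links that can carry traffic in parallel grows with the core count, while the mean hop count, which stands in for the extra communication overhead, also grows. The shared bus keeps a single channel that all cores must share.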

## Buses to Networks



- Architectural paradigm shift: Replace wire spaghetti by network
- Usage paradigm shift: Pack everything in packets
- Organizational paradigm shift
  - Confiscate communications from logic designers
  - Create a new discipline, a new infrastructure responsibility

## MPSoC

MPSoC is a System-on-Chip and it contains multiple instruction-set processors (CPUs).

- A typical MPSoC is a **heterogeneous multiprocessor** with several different types of processing elements (PEs).
- The memory system may also be heterogeneously distributed around the machine, and the interconnection structure between the PEs and the memory may also be heterogeneous.
- MPSoCs often have large memory. The application device can have embedded on-chip memory and may also rely on off-chip commodity memory.

# SOC: System on Chip

Several CPUs are now actually considered as SoCs!

• CPU chips now contain the CPU cores themselves along with integrated graphics processors, PCI Express, memory controllers, etc., all on a single die.



Advantages? Disadvantages?



iPad 3's CPU SoC circuit → A5

PC Motherboard – CPU with support ICs

## Technology Roadmap - Latest

| Processor/SoC                                    | Year | Designer      | Technology | SoC Area (mm²) | Transistor density (tr./mm²) |
|--------------------------------------------------|------|---------------|------------|----------------|------------------------------|
| AMD Zeppelin SoC Ryzen                           | 2017 | AMD           | 14 nm      | 192            | 25,000,000                   |
| Xbox One X main SoC                              | 2017 | Microsoft/AMD | 16 nm      | 360            | 19,440,000                   |
| HiSilicon Kirin 970 (octa-core ARM64 mobile SoC) | 2017 | Huawei        | 10 nm      | 96.72          | 56,900,000                   |
| Snapdragon 845 (octa-core ARM64 mobile SoC)      | 2017 | Qualcomm      | 10 nm      | 94             | 56,400,000                   |
| Apple A12 Bionic (hexa-core ARM64 mobile SoC)    | 2018 | Apple         | 7 nm       | 83.27          | 82,900,000                   |
| Snapdragon 865 (octa-core ARM64 mobile SoC)      | 2019 | Qualcomm      | 7 nm       | 83.54          | 123,300,000                  |
| Apple M1 (octa-core 64-bit ARM64 SoC)            | 2020 | Apple         | 5 nm       | 119            | 134,500,000                  |
| M1 Pro (10-core, 64-bit)                         | 2021 | Apple         | 5 nm       | 245            | 137,600,000                  |
| Snapdragon 8 Gen 2 (8-core ARM64 mobile SoC)     | 2022 | Qualcomm      | 4 nm       | 268            | 59,701,492                   |
| Apple M2 Ultra (2 M2 Max)                        | 2023 | Apple         | 5 nm       | ?              |                              |
| Apple A17                                        | 2023 | Apple         | 3 nm       | 103.8          | 183,044,315                  |
| M4 (10-core ARM64 SoC)                           | 2024 | Apple         | 3 nm       | ?              |                              |
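The table lists area and density but not the total transistor count, which follows from multiplying the two columns. The sketch below does this for a few rows, using values copied from the table; the dictionary and function names are illustrative only.

```python
# (SoC area in mm^2, transistor density in transistors/mm^2),
# copied from the table above.
SOCS = {
    "AMD Zeppelin SoC Ryzen": (192, 25_000_000),
    "Apple A12 Bionic": (83.27, 82_900_000),
    "Apple M1": (119, 134_500_000),
    "Apple A17": (103.8, 183_044_315),
}


def transistor_count(area_mm2, density_per_mm2):
    """Estimated total transistors = die area x transistor density."""
    return area_mm2 * density_per_mm2


for name, (area, density) in SOCS.items():
    billions = transistor_count(area, density) / 1e9
    print(f"{name}: about {billions:.1f} billion transistors")
```

The rows for Apple M2 Ultra and M4 cannot be treated this way because the table gives no area or density for them.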


## More Recent SoCs

- Apple M2 Max SoC: 12-core CPU, 38-core GPU and 96 GB unified memory.
- M2 Ultra has more than **134 billion** transistors.
  - It combines 2 M2 Max chips: 24-core CPU, 76-core GPU and 192 GB unified memory.
- Qualcomm Snapdragon X: 8-core Oryon CPU.

The SoC is equipped with the Qualcomm® Hexagon NPU (Neural Processing Unit), which delivers 45 TOPS (trillion operations per second).




# SC2A11: Multi-core Processor

A multi-core processor SoC with 24 ARM Cortex-A53 cores.

SC2A11 is suitable for low-power server systems. It also suits edge computing, where data is processed at the edge of the cloud.



SC2A11 block diagram (figure):

- Processing element: 24 ARM Cortex-A53 cores (#0 to #23)
- Interfaces: PCIe Gen2 4-lane (×2), GbE (×2), DDR4 (×2), GPIO, UART, eMMC, SPI

# SC2A11-Media Transcoder System

- A highly energy-efficient processing element is realized with a multicore configuration of ARM Cortex-A53 cores.
- Large amounts of video data can be processed faster in memory.




## Exynos 5410 Octa Processor SoC



- Octa-core CPU with big.LITTLE processing
- Released in 2013/14
- 3D graphics: fast, efficient operation for smartphones and tablets
- 12.8 GB/s memory bandwidth, 1080p 60 fps video
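To put the 12.8 GB/s memory bandwidth and the 1080p 60 fps figure in relation, the following back-of-the-envelope sketch estimates the raw traffic of one uncompressed 1080p60 stream. The pixel formats (4 bytes per pixel for RGBA, 1.5 bytes per pixel for YUV 4:2:0), the reading of GB as 10^9 bytes, and the assumption that each frame crosses the memory interface exactly once are illustrative assumptions, not figures from the slide.

```python
MEM_BANDWIDTH = 12.8e9  # bytes/s, from the slide (assuming 1 GB = 10^9 bytes)


def frame_traffic(width, height, fps, bytes_per_pixel):
    """Bytes per second needed to move every frame through memory once."""
    return width * height * fps * bytes_per_pixel


# Assumed pixel formats; the slide does not name one.
rgba = frame_traffic(1920, 1080, 60, 4)      # 4 bytes per pixel
yuv420 = frame_traffic(1920, 1080, 60, 1.5)  # 1.5 bytes per pixel

for label, traffic in (("RGBA", rgba), ("YUV 4:2:0", yuv420)):
    share = traffic / MEM_BANDWIDTH
    print(f"{label}: {traffic / 1e6:.1f} MB/s, {share:.1%} of memory bandwidth")
```

A real video pipeline reads and writes each frame several times (decode, scaling, composition, display), so actual traffic would be a multiple of these single-pass numbers.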

## Exynos 2400 SoC (2024 release)

| Feature               | Samsung Exynos 2400                                                                              | Snapdragon 8 Gen 2                                                                          |
|-----------------------|--------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------|
| Architecture          | 1x 3.21 GHz Cortex-X4, 2x 2.9 GHz Cortex-A720, 3x 2.6 GHz Cortex-A720, 4x 1.95 GHz Cortex-A520 | 1x 3.2 GHz Cortex-X3, 2x 2.8 GHz Cortex-A715, 2x 2.8 GHz Cortex-A710, 3x 2 GHz Cortex-A510 |
| Cores                 | 10                                                                                               | 8                                                                                           |
| Frequency             | 3210 MHz                                                                                         | 3200 MHz                                                                                    |
| Instruction set       | ARMv9.2-A                                                                                        | ARMv9-A                                                                                     |
| L2 cache              | -                                                                                                | 1 MB                                                                                        |
| L3 cache              | 8 MB                                                                                             | 8 MB                                                                                        |
| Process               | 4 nanometers                                                                                     | 4 nanometers                                                                                |
| Sustained Power Limit | 6 W                                                                                              | 6.3 W                                                                                       |
| Manufacturing         | Samsung                                                                                          | TSMC                                                                                        |
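The Architecture row describes each SoC as a set of CPU clusters. As a small consistency check, the sketch below records the clusters as (core count, clock in GHz, core type) tuples copied from the table and derives the Cores and Frequency rows from them. The names are illustrative only.

```python
# (number of cores, clock in GHz, core type) per cluster,
# copied from the Architecture row of the table above.
EXYNOS_2400 = [
    (1, 3.21, "Cortex-X4"),
    (2, 2.9, "Cortex-A720"),
    (3, 2.6, "Cortex-A720"),
    (4, 1.95, "Cortex-A520"),
]
SNAPDRAGON_8_GEN_2 = [
    (1, 3.2, "Cortex-X3"),
    (2, 2.8, "Cortex-A715"),
    (2, 2.8, "Cortex-A710"),
    (3, 2.0, "Cortex-A510"),
]


def total_cores(clusters):
    """Sum of the core counts over all clusters."""
    return sum(count for count, _, _ in clusters)


def peak_mhz(clusters):
    """Highest cluster clock, converted to MHz."""
    return round(max(ghz for _, ghz, _ in clusters) * 1000)


for name, soc in (("Exynos 2400", EXYNOS_2400),
                  ("Snapdragon 8 Gen 2", SNAPDRAGON_8_GEN_2)):
    print(f"{name}: {total_cores(soc)} cores, peak {peak_mhz(soc)} MHz")
```

The derived values match the table: the Frequency row reports the clock of the fastest cluster, not an average over all cores.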

### Exynos 2400 vs Snapdragon SoC

Used in the Samsung Galaxy S24 and S24 Ultra

| Features      | Samsung Exynos 2400                                                                                         | Qualcomm Snapdragon 8 Gen 3                                                       |
|---------------|-------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------|
| Technology    | 4 nm                                                                                                        | 4 nm                                                                              |
| CPU           | 10 cores: 1 Cortex X4 @ 3.21 GHz, 2 Cortex A720 @ 2.9 GHz, 3 Cortex A720 @ 2.6 GHz, 4 Cortex A520 @ 2 GHz | 8 cores: 1 Cortex X4 @ 3.3 GHz, 3 Cortex A720 @ 2.96 GHz, 4 Cortex A520 @ 2.26 GHz |
| GPU           | Samsung Xclipse 940                                                                                         | Adreno 750                                                                        |
| Max memory    | 24 GB                                                                                                       | 24 GB                                                                             |
| Video capture | 8K @ 30 FPS                                                                                                 | 8K @ 30 FPS                                                                       |

## Where are we heading?

- Introduction to System on Chip: An SoC Design Approach.
- SystemC for SoC Design: Co-Specification and Simulation.
- Hardware-Software Co-synthesis and Accelerator based SoCs.
- Basics of Chips and SoC ICs.
- SoC Interconnection Structures: NoC (Network on Chip)
- SoC Interconnection Structures: Bus-based Interconnection
- SoC CPU/IP Cores: ARM Cortex A9
- SoC Verification
- SoC Case Studies (if time permits)