

# TMS320C6474 Module Throughput

Jon Bradley

#### ABSTRACT

This document provides information on the C6474 module throughput.

## Contents

1. Introduction
2. C6474 Overview
3. Main Switch Fabric Connections
4. Peripherals Throughput
5. References

#### List of Figures

1. C6474 Block Diagram
2. Switched Central Resource Block Diagram

#### List of Tables

1. C6474 Bridges
2. AIF Uplink Delayed Streams Data
3. EDMA Maximum Measured Throughput
4. Maximum Theoretical Throughput Values
5. EDMA Measured Throughput as Percentage of Theoretical
6. EMAC Throughput on Silicon With 1 GHz CPU Frequency
7. SRIO Throughput on Silicon With 1 GHz CPU and 3.125 Gbps Link

All trademarks are the property of their respective owners.





## 1 Introduction

This document analyzes various C6474 scenarios to provide data on the device's throughput. It is intended to complement the *TMS320C6474 Common Bus Architecture Throughput* (SPRAAX6) by reporting on the CBA protocols, switched central resources (SCRs), and bridges, as applied to the C6474.

The next sections provide a brief overview followed by a list of the different peripherals and their corresponding throughput analysis and measurements.

## 2 C6474 Overview

C6474 is a DSP designed to meet the requirements for wide-band code division multiple access (WCDMA) and Universal Mobile Telecommunications System (UMTS) wireless infrastructure baseband systems.

The DSP comprises three 1+GHz C64x+ cores, three baseband accelerators, and many high-speed interfaces. All cores on the device have full access to the memory map and all resources on the device. All resources outside of the megamodule core are equally accessible by each core. The C6474 block diagram is shown in Figure 1.





Figure 1. C6474 Block Diagram


The C6474 peripherals include:

- Level-three 64 KB read-only memory (ROM) for boot control and internet protocol (IP) security support
- Main switch fabric operating at one-third (1/3) the central processing unit (CPU) clock rate partitioned into 128-bit and 64-bit crossbars, including a 64-channel enhanced direct memory access (DMA) controller running at 1/3 the CPU clock rate
- 32-bit configuration (CFG) switch fabric operating at 1/3 the CPU clock rate.
- Turbo co-processor (TCP2) for processing data channels
- Viterbi co-processor (VCP2) for processing voice channels
- Configuration module (CFGC) for latching device settings
- 1.8-V, 32-bit DDR2 synchronous dynamic random access memory (SDRAM) external memory interface (EMIF) capable of 200-MHz (DDR2-400) to 400-MHz (DDR2-800) operation.
- Six-link antenna interface (AIF) capable of up to 3.072 Gbps operation per link, compliant to the Open Base Station Architecture Initiative Reference Point 3 Specification (OBSAI RP3) (768 Mbps, 1.536 Gbps, 3.072 Gbps link rates) and Common Public Radio Interface (CPRI) (614.4 Mbps, 1.2288 Gbps, 2.4576 Gbps link rates) standards.
- Two high-speed 1x Serial RapidIO<sup>®</sup> (SRIO) ports, capable of 1.25, 2.5, or 3.125 Gbps raw data rate
- Two 1.8-V 128-channel multichannel buffered serial ports (McBSP0/1)
- 1.8-V inter-integrated circuit (I2C) control module
- Frame synchronization module (FSYNC), compliant to OBSAI RP1 and CPRI standards
- Six 64-bit timers (Timer64)
- Two standard phase-locked loops (PLLs) and PLL controllers
- 16-pin general-purpose I/O module (GPIO)
- 10/100/1000M-bit Ethernet media access controller (EMAC) with serial gigabit media independent interface (SGMII)
- Integrated interrupt controller to combine Ethernet MAC and management data input/output (MDIO) interrupts
- Semaphore peripheral block for enforcing atomic accesses to shared chip-level resources



## 3 Main Switch Fabric Connections

Figure 2 shows the main switch fabric connection between slaves and masters through the data switched central resource (SCR). Masters are shown on the right and slaves on the left.



Figure 2. Switched Central Resource Block Diagram



Table 1. C6474 Bridges

| Bridge | Type<br>(Mstr:Slv) | Width<br>(Mstr:Slv) | CLK<br>(Mstr:Slv) | Read<br>FIFO Ctl | Read<br>FIFO<br>Data | Read<br>FIFO<br>Burst<br>Size | Write<br>FIFO<br>Data | Write<br>FIFO<br>Burst<br>Size | CMD<br>FIFO<br>Depth | Status<br>FIFO<br>Depth |
|--------|--------------------|---------------------|-------------------|------------------|----------------------|-------------------------------|-----------------------|--------------------------------|----------------------|-------------------------|
| 1      | P-M                | 128:128             | 1:1               | N/A              | N/A                  | N/A                           | N/A                   | N/A                            | N/A                  | N/A                     |
| 2      | M-M                | 64:128              | 1:1               | 8                | 8                    | 1*                            | 3                     | 2                              | 3                    | 2                       |
| 3      | M-M                | 64:128              | 1:1               | 8                | 8                    | 1*                            | 3                     | 2                              | 3                    | 2                       |
| 4      | M-M                | 128:64              | 1:1               | 2                | 2                    | 1                             | 2                     | 0                              | 2                    | 2                       |
| 5      | M-M                | 128:64              | 1:1               | 2                | 2                    | 1                             | 2                     | 0                              | 2                    | 2                       |
| 6      | P-M                | 32:32               | 1:1               | N/A              | N/A                  | N/A                           | N/A                   | N/A                            | N/A                  | N/A                     |
| 7      | M-M                | 32:64               | 1:2               | 3                | 7                    | 1                             | 6                     | 4                              | 2                    | 2                       |
| 9      | M-P                | 64:32               | 2:1               | 2                | 2                    | 1                             | 3                     | 0                              | 2                    | 2                       |
| 10     | M-P                | 64:32               | 1:1               | 2                | 4                    | 4                             | 2                     | 0                              | 2                    | 2                       |
| 11     | M-P                | 64:32               | 1:1               | 2                | 9                    | 8                             | 2                     | 0                              | 2                    | 2                       |
| 12     | M-P                | 64:32               | 1:1               | 2                | 9                    | 8                             | 2                     | 0                              | 2                    | 2                       |
| 16     | P-M                | 32:32               | 1:1               | N/A              | N/A                  | N/A                           | N/A                   | N/A                            | N/A                  | N/A                     |
| 17     | M-M                | 32:64               | 1:1               | 1                | 2                    | 1                             | 2                     | 2                              | 1                    | 1                       |
| 18     | P-M                | 64:64               | 1:1               | N/A              | N/A                  | N/A                           | N/A                   | N/A                            | N/A                  | N/A                     |
| 22     | M:2M               | 128:128R+128W       | 1:1               | N/A              | N/A                  | N/A                           | N/A                   | N/A                            | N/A                  | N/A                     |
| 23     | M:P<br>(Read)      | 128:128             | 1:1               | 5                | 5                    | 1                             | N/A                   | N/A                            | 2                    | 2                       |
| 24     | M:P<br>(Write)     | 128:128             | 1:1               | N/A              | N/A                  | N/A                           | 3                     | 1                              | 2                    | 2                       |
| 25     | M-M                | 128:64              | 1:1               | 2                | 2                    | 1                             | 2                     | 0                              | 2                    | 2                       |
| 27     | M-P                | 64:64               | 1:1               | 2                | 2                    | 1                             | 5                     | 0                              | 2                    | 2                       |
| 28     | M-M                | 64:64               | 1:1               | 2                | 2                    | 0                             | 2                     | 0                              | 2                    | 2                       |
| 29     | M-M                | 64:64               | 1:1               | 2                | 2                    | 0                             | 2                     | 0                              | 2                    | 2                       |

The definitions of the bridges' parameters can be found in the TMS320C6474 Common Bus Architecture Throughput (SPRAAX6).

## 4 Peripherals Throughput

Theoretically, the number of bus cycles needed for a transfer is calculated by dividing the number of bytes transferred through a peripheral by the bus width of that peripheral; the throughput follows from that cycle count and the bus clock rate.

## 4.1 AIF Throughput

Enhanced direct memory access (EDMA) controls all antenna data movement in the C6474. Antenna data enters and leaves the device through any of six bidirectional serializer/deserializer (SerDes) links, each with a maximum capacity of 3.125 Gbps.

The antenna interface block has internal memories, called inbound and outbound memories, which respectively hold antenna data that is intended to enter and exit C6474.

AIF supports both the OBSAI RP3 and the CPRI standards. All six links must follow the same protocol, but the frequency of each link may be configured independently.

The antenna data-flow can be classified as:

- Uplink on-time streams
- Uplink delayed streams
- Downlink streams



#### 4.1.1 Uplink On-Time Streams

Uplink data received by AIF needs to be routed to the GEM to perform uplink chip rate (CR) processing. The data-flow would be:

AIF (inbound memory) → EDMA → GEM (0, 1, or 2)

The processing time is divided among the various components within the path of the transfer.

- Transfer from the AIF (inbound memory) → EDMA: the uplink data, arriving at a 3.84 Mcps chip rate, is divided into packets that contain 8 chips per antenna stream. Each antenna stream packet contains 8 × 2 (OSF) × 2 (I, Q) × 8 (bit-width) = 256 bits = 32 bytes. For N antenna streams, N × 32 bytes must be transferred before a transfer is complete. It takes 2 bus (VBUS) cycles to transfer a packet from AIF → EDMA.
- It takes 2 additional bus (VBUS) cycles to transfer a packet from EDMA → GEM.
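The packet-size arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the 16-byte bus width is an assumption inferred from the statement that a 32-byte packet needs 2 VBUS cycles.

```python
def packet_bytes(chips, osf=2, components=2, bits_per_sample=8):
    """Bytes in one antenna-stream packet.

    chips: chips per packet; osf: oversampling factor (OSF);
    components: I and Q; bits_per_sample: sample bit-width.
    """
    bits = chips * osf * components * bits_per_sample
    return bits // 8


def vbus_cycles(num_bytes, bus_width_bytes=16):
    """Whole bus cycles needed to move num_bytes (rounded up).

    bus_width_bytes=16 is an assumption (32 bytes in 2 cycles).
    """
    return -(-num_bytes // bus_width_bytes)


uplink_packet = packet_bytes(chips=8)
print(uplink_packet, vbus_cycles(uplink_packet))
```

The same helper with `chips=4` gives the packet size used for the downlink streams later in this section.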

#### 4.1.2 Uplink Delayed Streams

Uplink antenna streams can be buffered in external memory for one or more frames and routed back to the AIF at an appropriate time:

AIF → EDMA → External memory :: External memory → EDMA → GEM

To study bottlenecks, on-time and delayed stream traffic were superimposed, with different transfer controllers controlling each type of traffic. The measured transfer latencies are noted in Table 2.

#### Table 2. AIF Uplink Delayed Streams Data

| Data-Flow                                           | Number of Uplink<br>Antenna Streams | Number of<br>Downlink Antenna<br>Streams | Theoretical Minimum Time<br>Required in UMTS Chips | Measured Time in<br>UMTS Chips |
|-----------------------------------------------------|-------------------------------------|------------------------------------------|----------------------------------------------------|--------------------------------|
| AIF-EDMA-GEM On-Time<br>Streams                     | 24                                  | 24                                       | 1.459 UMTS chips                                   | 1.57 UMTS chips                |
| AIF-EDMA-DDR2 +<br>DDR2-EDMA-GEM Delayed<br>Streams | 24                                  | 24                                       | 2.92 UMTS chips                                    | 3.58 UMTS chips                |

Comparing the on-time streams' transfer time between the theoretical estimates (calculated in the previous paragraph) and the measured values in Table 2, the increase in time can be attributed to occasional conflicts between commands directed at the AIF for on-time and delayed traffic; it can also be attributed to EDMA overhead.
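As a rough illustration of the size of that increase, the sketch below computes how far each measured value in Table 2 lies above its theoretical minimum. The helper name and dictionary layout are hypothetical; the numbers are copied from Table 2.

```python
def overhead_percent(theoretical, measured):
    """Percentage by which the measured time exceeds the theoretical time."""
    return (measured - theoretical) / theoretical * 100.0


# (theoretical, measured) in UMTS chips, taken from Table 2
table2 = {
    "on-time": (1.459, 1.57),
    "delayed": (2.92, 3.58),
}

for name, (theoretical, measured) in table2.items():
    print(f"{name}: {overhead_percent(theoretical, measured):.1f}% above theoretical")
```

With these inputs the on-time streams come out at roughly 7.6% above the theoretical minimum and the delayed streams at roughly 22.6%.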

#### 4.1.3 Downlink Streams

Downlink streams are transferred in 4-chip packets for each antenna stream. Each downlink antenna stream packet contains 4 × 2 (OSF) × 2 (I, Q) × 8 (bit-width) = 128 bits = 16 bytes. With a 16-byte-wide TX GEM → EDMA bus (assuming the downlink antenna buffer is in internal memory; if it is in DDR2 memory, the bus width is 8 bytes), each antenna stream needs 1 VBUS cycle to pass from the TX GEM to the EDMA.

The transfer controller typically has a burst size of 64 bytes; first it waits for 4 VBUS cycles to get 64 bytes before beginning.

For the nominal example of 24 antenna streams, and assuming that all antenna streams on a given SerDes link are in contiguous locations in the outbound memory, 24 VBUS transactions are needed to transfer all streams.

The theoretical minimum transport time for 24 on-time downlink streams is given by:

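24 VBUS cycles (TX GEM → EDMA) + 4 VBUS cycles (time to fill the transfer controller queue)

= 28 VBUS cycles

= 28 / ((1000/3) × 0.260) = 0.323 UMTS chips

Here 1000/3 MHz is the VBUS clock (one-third of the 1 GHz CPU clock) and 0.260 µs is the duration of one UMTS chip at 3.84 Mcps.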
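The conversion from VBUS cycles to UMTS chips used in this calculation can be sketched as follows. The constant and function names are hypothetical; the values (1 GHz CPU, CPU/3 bus clock, 0.260 µs per chip) are the ones used in the formula above.

```python
CPU_MHZ = 1000.0          # CPU clock in MHz
VBUS_MHZ = CPU_MHZ / 3    # main switch fabric runs at CPU/3
CHIP_US = 0.260           # one UMTS chip in microseconds (rounded, as in the text)


def vbus_cycles_to_chips(cycles, vbus_mhz=VBUS_MHZ, chip_us=CHIP_US):
    """Convert a VBUS cycle count into UMTS chip periods."""
    cycles_per_chip = vbus_mhz * chip_us  # MHz x us = cycles in one chip
    return cycles / cycles_per_chip


streams = 24
total_cycles = streams * 1 + 4  # 1 cycle per stream + 4 cycles to fill the TC queue
print(round(vbus_cycles_to_chips(total_cycles), 3))
```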



Because of blocking in the bridge when both on-time and delayed uplink antenna streams are used, the measured time required to complete the transfers was almost double the value calculated for on-time downlink streams only.

## 4.2 EDMA Throughput

Worst-case SCR activity arises when all data transactions line up to the same memory end-point. Because the SCR allows concurrent transfers between non-conflicting master/slave pairs, it can support a very high total data rate across any end-point. If transfers line up such that the source or destination memory is the same, collisions occur and certain transactions are blocked.

For 1 GHz CPU frequency and DDR2 (DDR2-667), silicon measurements give the maximum EDMA throughput for passing data between different L2s and DDR2 as shown in Table 3. The measured throughput values are obtained by sending packets from 16 channels at a time on the same queue that corresponds to a particular TC. In this way, 16 channels run on every TC in a sequential manner but, at a given time, all are using the same TC. The channels perform the same operation; that is, the source and destination points of all the channels are the same.

|     | Maximum Throughput in MBytes/s |                          |                          |                          |                      |
|-----|--------------------------------|--------------------------|--------------------------|--------------------------|----------------------|
|     | 16-Bit DDR2 -<br>GEM0 L2       | 32-Bit DDR2 -<br>GEM0 L2 | GEM0 L2 -<br>16-Bit DDR2 | GEM0 L2 -<br>32-Bit DDR2 | GEM0 L2 -<br>GEM1 L2 |
| TC0 | 1224.8                         | 2191.52                  | 1226.08                  | 2443.84                  | 2058.72              |
| TC1 | 1224.48                        | 2178.56                  | 1226.08                  | 2444.64                  | 2058.72              |
| TC2 | 1224.8                         | 2179.52                  | 1226.08                  | 2444.64                  | 2058.72              |
| TC3 | 1226.24                        | 2456.48                  | 1227.36                  | 2454.88                  | 3681.76              |
| TC4 | 1226.56                        | 2456.16                  | 1227.36                  | 2455.68                  | 3681.76              |
| TC5 | 1226.24                        | 2456.00                  | 1227.36                  | 2454.88                  | 3681.76              |

#### Table 3. EDMA Maximum Measured Throughput

The maximum theoretical EDMA throughput values for passing data between different L2s and DDR2 are shown in Table 4. For a 16-bit DDR2 memory controller the theoretical throughput is 666.7 × 16/8 = 1333 MBps, while for a 32-bit DDR2 memory controller it is 666.7 × 32/8 = 2666.8 MBps. For a 1 GHz CPU with both SCRs running at CPU/3 frequency, the theoretical throughput is 8 × 1000/3 = 2667 MBps for TC0-2 and 16 × 1000/3 = 5333 MBps for TC3-5. Note that the maximum theoretical throughput for the 128-bit TCs (TC3, TC4, or TC5) is the same as that of the corresponding 64-bit TCs (TC0, TC1, or TC2) for all but one scenario (GEM0 L2 - GEM1 L2), because this is the only scenario that contains only 128-bit data in the path.
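The calculation above can be sketched as a small bottleneck model. The function names are hypothetical, and treating the slowest segment as the limit for the whole path is an assumption used here for illustration; it is not a description of how the SCR arbitrates.

```python
def ddr2_peak_mbytes(data_rate_mtps, width_bits):
    """Peak DDR2 rate in MBytes/s: transfers per second x bytes per transfer."""
    return data_rate_mtps * width_bits / 8


def tc_peak_mbytes(bus_bytes, cpu_mhz=1000.0, scr_divider=3):
    """Peak TC rate in MBytes/s for a bus clocked at CPU/scr_divider."""
    return bus_bytes * cpu_mhz / scr_divider


def path_peak_mbytes(*segment_rates):
    """Assumption: the slowest segment bounds the whole path."""
    return min(segment_rates)


ddr16 = ddr2_peak_mbytes(666.7, 16)
ddr32 = ddr2_peak_mbytes(666.7, 32)
tc64 = tc_peak_mbytes(8)     # TC0, TC1, TC2
tc128 = tc_peak_mbytes(16)   # TC3, TC4, TC5

print(round(path_peak_mbytes(ddr16, tc64)))    # 16-bit DDR2 path, 64-bit TC
print(round(path_peak_mbytes(ddr32, tc128)))   # 32-bit DDR2 path, 128-bit TC
print(round(path_peak_mbytes(tc128)))          # L2 to L2, 128-bit TC
```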

#### Table 4. Maximum Theoretical Throughput Values

|                           | Maximum Throughput in MBytes/s |                          |                          |                          |                      |
|---------------------------|--------------------------------|--------------------------|--------------------------|--------------------------|----------------------|
|                           | 16-Bit DDR2 -<br>GEM0 L2       | 32-Bit DDR2 -<br>GEM0 L2 | GEM0 L2 -<br>16-Bit DDR2 | GEM0 L2 -<br>32-Bit DDR2 | GEM0 L2 -<br>GEM1 L2 |
| 64-bit<br>(TC0, 1, or 2)  | 1333                           | 2667                     | 1333                     | 2667                     | 2667                 |
| 128-bit<br>(TC3, 4, or 5) | 1333                           | 2667                     | 1333                     | 2667                     | 5333                 |

Table 5 shows the EDMA throughput measured on silicon (from Table 3) as a percentage of the theoretical values (from Table 4). The GEM0 L2 - GEM1 L2 throughput is 77.20% of the theoretical value when using TC0, TC1, or TC2, while it is 69.03% of the theoretical value when using TC3, TC4, or TC5; however, the actual amount of data transferred using TC3, TC4, or TC5 is approximately 179% of the amount transferred using TC0, TC1, or TC2 (3681.76 / 2058.72 ≈ 1.79).

|     | Measured Throughput as Percentage of Theoretical |                            |                            |                            |                        |
|-----|--------------------------------------------------|----------------------------|----------------------------|----------------------------|------------------------|
|     | 16-Bit DDR2 -<br>GEM0 L2 %                       | 32-Bit DDR2 -<br>GEM0 L2 % | GEM0 L2 -<br>16-Bit DDR2 % | GEM0 L2 -<br>32-Bit DDR2 % | GEM0 L2 -<br>GEM1 L2 % |
| TC0 | 91.88                                            | 82.17                      | 91.98                      | 91.63                      | 77.20                  |
| TC1 | 91.86                                            | 81.68                      | 91.98                      | 91.66                      | 77.20                  |
| TC2 | 91.88                                            | 81.72                      | 91.98                      | 91.66                      | 77.20                  |
| TC3 | 91.99                                            | 92.11                      | 92.07                      | 92.05                      | 69.03                  |
| TC4 | 92.01                                            | 92.09                      | 92.07                      | 92.07                      | 69.03                  |
| TC5 | 91.99                                            | 92.09                      | 92.07                      | 92.05                      | 69.03                  |

#### Table 5. EDMA Measured Throughput as Percentage of Theoretical
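The GEM0 L2 - GEM1 L2 column can be checked with a short sketch. The names are hypothetical; the measured values come from Table 3, and the theoretical values are computed as 8 × 1000/3 and 16 × 1000/3 MBytes/s rather than taken from the rounded entries in Table 4.

```python
def percent_of_theoretical(measured, theoretical):
    """Measured throughput as a percentage of the theoretical maximum."""
    return measured / theoretical * 100.0


# GEM0 L2 - GEM1 L2, measured MBytes/s (Table 3)
measured_64bit_tc = 2058.72    # TC0, TC1, or TC2
measured_128bit_tc = 3681.76   # TC3, TC4, or TC5

theoretical_64bit_tc = 8 * 1000 / 3
theoretical_128bit_tc = 16 * 1000 / 3

pct_64 = percent_of_theoretical(measured_64bit_tc, theoretical_64bit_tc)
pct_128 = percent_of_theoretical(measured_128bit_tc, theoretical_128bit_tc)
speedup = measured_128bit_tc / measured_64bit_tc

print(f"{pct_64:.2f}% {pct_128:.2f}% ratio {speedup:.3f}")
```

The 128-bit TCs reach a lower fraction of their theoretical limit, but in absolute terms they still move considerably more data than the 64-bit TCs.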

Among the various scenarios, the difference in throughput results can be attributed to one or more of the following reasons:

- EDMA throughput varies with the configuration and selection of the source and destination resources; for example, 16-bit DDR2 and 32-bit DDR2 give different throughput.
- Six transfer controllers (TCs) are responsible for the data transfer between source and destination. Of these six TCs:
  - TC0, TC1, and TC2 have a 64-bit data bus.
  - TC3, TC4, and TC5 have a 128-bit data bus.

As a result, the percent utilization differs among TCs. For instance, using TC3 instead of TC1 to pass data between 32-bit DDR2 and GEM0 L2 improves the EDMA throughput from 81.68% to 92.11% of the theoretical value.

All data movement within the C6474 follows the pre-defined path allocated to each source/destination pair through the SCR. Even though the GEM0-GEM1 pair looks close, the path contains several bridges that accommodate differences in bus width, which is the primary cause of the reduced throughput.

## 4.3 Ethernet EMAC Throughput

The Ethernet MAC interface operates with a 125-MHz input clock from the SGMII SerDes, with 125 Mbytes/second sustained throughput to any on-chip L2 or L1D RAM, or to the DDR2 interface.

Ethernet packets contain 64-byte payloads. The worst-case scenario occurs on the Ethernet interface when operating with a 1000 Mbps PHY. In this case, the peripheral needs to transfer a packet (64 bytes) once every 1  $\mu$ s (1 MHz packet rate).

EMAC throughput testing results on C6474 running at a clock rate of 1 GHz are captured in Table 6. Measurements are collected for two scenarios: PHY external loopback and internal loopback. For external loopback, one Gigabit or 100 Mbps throughput via SGMII is used, with descriptors in/to L2 or DDR. For internal loopback, EMAC mode is set to Gigabit or 100 Mbps in order to verify full-duplex operation with a SerDes loopback.

| Source/<br>Destination | Packet Length<br>(Including<br>Overhead<br>Bytes) | Number of<br>Packets<br>TX/RX | Theoretical<br>(Mbps) | Transmit<br>(Mbps) | Receive<br>(Mbps) | % Utilization<br>(TX) | % Utilization<br>(RX) |
|------------------------|---------------------------------------------------|-------------------------------|-----------------------|--------------------|-------------------|-----------------------|-----------------------|
| PHY External Loopback  |                                                   |                               |                       |                    |                   |                       |                       |
| L2                     | 1538                                              | 5                             | 1000                  | 999.29798          | 999.2979836       | 99.929798             | 99.92979836           |
| L2                     | 1538                                              | 5                             | 100                   | 99.943638          | 99.94426005       | 99.943638             | 99.94426005           |
| DDR                    | 1538                                              | 5                             | 1000                  | 999.29798          | 999.2979836       | 99.929798             | 99.92979836           |
| DDR                    | 1538                                              | 5                             | 100                   | 99.943793          | 99.94286034       | 99.943793             | 99.94286034           |
| Internal Loopback      |                                                   |                               |                       |                    |                   |                       |                       |
| L2                     | 1538                                              | 5                             | 1000                  | 999.31353          | 999.2979836       | 99.931353             | 99.92979836           |
| L2                     | 1538                                              | 5                             | 100                   | 99.943793          | 99.94379347       | 99.943793             | 99.94379347           |

Table 6. EMAC Throughput on Silicon With 1 GHz CPU Frequency

## 4.4 SRIO Throughput

The RapidIO interface operates at a raw data rate of 3.125 Gbps per differential pair. After 8b/10b line encoding, this leaves 2.5 Gbps; assuming a further 10% packet overhead, this equals a 2.25 Gbps data throughput rate per 1x link. With both links operating simultaneously, at most 4.5 Gbps of data can flow in each of the transmit and receive directions.
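The per-link figure can be reproduced with the sketch below. The function name is hypothetical, and the 8b/10b encoding efficiency of 0.8 is an assumption added here to bridge the 3.125 Gbps raw rate and the 2.25 Gbps data rate; the 10% packet overhead is the value stated in the text.

```python
def srio_data_gbps(line_gbps, encoding_efficiency=0.8, packet_overhead=0.10):
    """Approximate data rate of one 1x link.

    encoding_efficiency=0.8 assumes 8b/10b line coding (an assumption);
    packet_overhead is the fraction of the decoded rate spent on headers.
    """
    decoded_gbps = line_gbps * encoding_efficiency
    return decoded_gbps * (1.0 - packet_overhead)


per_link = srio_data_gbps(3.125)
both_links = 2 * per_link
print(f"{per_link:.2f} Gbps per link, {both_links:.2f} Gbps for two links")
```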

The worst-case scenario occurs when both 1x RapidIO ports run at 3.125 Gbps and the packet payloads are small (32 bytes). In this case, the peripheral needs to transfer a packet (32 bytes) once every 14.3 ns (70 MHz packet rate). This assumes the full packet rate on both links in both the transmit and receive directions.

The RapidIO peripheral has sufficient buffering to hold up to four full-size packets of 256 bytes each, or 1 KByte; equivalently, the peripheral can hold up to thirty-two 32-byte packets. Multiple transfers can be outstanding at a time to the DMA switch fabric.

The SRIO throughput measurements from the C6474 running at a clock rate of 1 GHz and for 3.125 Gbps link rate are captured in Table 7.

| Test Case | Source/<br>Destination | Data | # Packets<br>Transmitted | Throughput<br>(Mbps)         | % Utilization           |
|-----------|------------------------|------|--------------------------|------------------------------|-------------------------|
| SWRITE    | L2                     | 4096 | 1                        | 2361.723                     | 94.47                   |
| NWRITE    | L2                     | 4096 | 1                        | 2366.420                     | 94.66                   |
| TxuRxu    | L2                     | 2048 | 16                       | TX: 2618.041<br>RX: 2433.394 | TX: 104.72<br>RX: 97.33 |
| SWRITE    | DDR                    | 4096 | 1                        | 2351.221                     | 94.05                   |
| NWRITE    | DDR                    | 4096 | 1                        | 2355.501                     | 94.22                   |
| TxuRxu    | DDR                    | 2048 | 16                       | TX: 2608.53<br>RX: 2426.206  | TX: 104.34<br>RX: 97.05 |

Table 7. SRIO Throughput on Silicon With 1 GHz CPU and 3.125 Gbps Link
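The utilization figures in Table 7 appear to be expressed relative to 2500 Mbps (3.125 Gbps × 0.8). That reference value is an inference from the table's own numbers, not something the table states, and the names in the sketch below are hypothetical.

```python
# Assumed reference rate: 3125 Mbps line rate x 0.8 (8b/10b), an inference
REFERENCE_MBPS = 3125 * 0.8


def utilization_percent(measured_mbps, reference_mbps=REFERENCE_MBPS):
    """Measured throughput as a percentage of the reference rate."""
    return measured_mbps / reference_mbps * 100.0


# Values copied from Table 7 (source/destination in L2)
for label, mbps in [("SWRITE", 2361.723), ("NWRITE", 2366.420), ("TxuRxu TX", 2618.041)]:
    print(f"{label}: {utilization_percent(mbps):.2f}%")
```

Under this assumption, a utilization above 100% simply means the measured rate exceeded the 2500 Mbps reference.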

## 4.5 EMIF DDR2 Throughput

As long as data requests are pending from the SCR, the 333 MHz, 32-bit DDR2 EMIF (DDR2-667) operates with sustained throughput at the memory data rate. For example, a linear burst of 128 32-bit words from any GEM L2 to external memory (or vice versa) should complete in approximately 64 EMIF clock cycles (data is clocked on both rising and falling edges). This assumes that the first command is pending at the EMIF controller boundary and ignores the latency of performing the first transfer.

To maintain full utilization of a 32-bit DDR2-667 interface, the SCR must meet or exceed the EMIF data rate; therefore, the CPU clock rate should be ≥ 1.0 GHz. Since the SCR runs at 1/3 the CPU frequency with a 64-bit data bus width, x/3 × 64 ≥ 667 × 32, where x is the CPU frequency in MHz, which gives x ≥ 1000. When running below 1.0 GHz, the EMIF data rate should be reduced by lowering either the frequency or the data width.
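The inequality above can be solved for the CPU frequency with a short sketch. The function and parameter names are hypothetical; the default values (64-bit SCR bus, CPU/3 clock) are the ones given in the text.

```python
def min_cpu_mhz(emif_mtps, emif_bits, scr_bits=64, scr_divider=3):
    """Lowest CPU clock (MHz) at which the SCR matches the EMIF data rate.

    Solves (cpu_mhz / scr_divider) * scr_bits >= emif_mtps * emif_bits.
    """
    return emif_mtps * emif_bits / scr_bits * scr_divider


print(min_cpu_mhz(667, 32))   # 32-bit DDR2-667
print(min_cpu_mhz(667, 16))   # same data rate with the width halved
```

Halving the EMIF data width halves the required CPU clock, which is one of the two options the text gives for operation below 1.0 GHz.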

The worst-case EMIF cycle latency occurs when running at 400 MHz (2.5 ns period). The default burst length for all transactions through the DMA SCR is 64 bytes, or 16 external memory data phases. With a read burst size of 16, the EMIF must be serviced once every 20 ns (8 EMIF clock cycles × 2.5 ns, with 2 data phases per cycle).

When running the EMIF at 333 MHz (3.0 ns period), the EMIF must be serviced once every 24 ns. With a CPU clock rate of 1 GHz (1 ns period), this corresponds to once every 24 CPU cycles.
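The service intervals and the 64-cycle burst estimate in this section can be reproduced as follows. This is a minimal sketch with hypothetical names; it assumes a 32-bit interface that moves two data phases per EMIF clock and ignores command latency, as the text does.

```python
def emif_clocks_for_burst(burst_bytes, bus_bits=32):
    """EMIF clock cycles to move burst_bytes on a double-data-rate bus."""
    bytes_per_clock = bus_bits / 8 * 2  # two data phases per clock
    return burst_bytes / bytes_per_clock


def service_interval_ns(emif_mhz, burst_bytes=64, bus_bits=32):
    """Time between bursts that keeps the EMIF continuously busy."""
    period_ns = 1000.0 / emif_mhz
    return emif_clocks_for_burst(burst_bytes, bus_bits) * period_ns


print(emif_clocks_for_burst(128 * 4))   # 128 32-bit words
print(service_interval_ns(400.0))       # DDR2-800
print(service_interval_ns(1000.0 / 3))  # DDR2-667
```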



## 4.6 McBSP Throughput

A minimum of 1 McBSP can support CPU/10 = 100 Mbps transfers in full duplex mode using 32-bit data words in a McBSP interface operation. Under optimal conditions, both McBSPs are supported with CPU/8 Mbps bidirectional data transfer based on a single data word transfer per McBSP.

The worst-case scenario is when the McBSP is running full duplex at 100 MHz with 8-bit contiguous data. In this case, both the transmitter and the receiver need to be serviced by the EDMA with 8-bit data once every 64 CPU cycles per McBSP, per direction. McBSP silicon measurement with 8-bit data showed that it can only achieve ≈ 30 Mbps throughput, while 16-bit operation only achieved ≈ 43 Mbps; this is due to EDMA overhead.

## 4.7 TCP Throughput

The TCP coprocessor operates at 1/3 the CPU frequency. This allows for ≈ 17 Mbps (decoded data-rate when the CPUs are operating at 1 GHz), assuming eight iterations for 384 kbps data channels.

## 4.8 VCP Throughput

The VCP coprocessor operates at 1/3 the CPU frequency. This allows for ≈ 9 Mbps (decoded data-rate when the CPUs are operating at 1 GHz), assuming 12.2 kbps adaptive multi-rate (AMR) (class B&C) channels.

## 4.9 I2C Throughput

The worst-case scenario is when the I2C is running at 400 kHz with 8-bit data. In this case, the device needs to service the I2C module at a packet rate of 3.2 MHz.

## 5 References

*TMS320C6474 Common Bus Architecture Throughput* (SPRAAX6)

Mailing Address: Texas Instruments, Post Office Box 655303, Dallas, Texas 75265 Copyright © 2008, Texas Instruments Incorporated