The C66x subsystem is based on the TI's standard TMS320C66x™ DSP CorePac module. It includes subsystem logic to ease the C66x CorePac integration into the SoC, while maximizing software reuse from previous devices.
The C66x DSP extends the performance of the C64x+ and C674x DSPs through enhancements and new features. Many of the new features target increased performance for vector processing. The C64x+ and C674x DSPs support 2-way SIMD operations for 16-bit data and 4-way SIMD operations for 8-bit data. On C66x DSP, the vector processing capability is improved by extending the width of the SIMD instructions.
The C66x DSP can execute instructions that operate on 128-bit vectors. For example, the QMPY32 instruction is able to perform the element-to-element multiplication between two vectors of four 32-bit data each. The C66x DSP also supports SIMD for floating-point operations. Improved vector processing capability (each instruction can process multiple data in parallel) combined with the natural instruction level parallelism of C6000 architecture (for example, execution of up to eight instructions per cycle) results in a very high level of parallelism that can be exploited by DSP programmers through the use of TI's optimized C/C++ compiler.
For more information, see C66x DSP Subsystem section in Processors and Accelerators chapter in the device TRM.