This slide compares a C2000 device on the left, built on the 180 nanometer F05 flash technology, with the F28M35x device on the right, built on the 65 nanometer F021 technology. The left side shows the current flash system, as found on a device such as a Delfino. With the 180 nanometer F05 technology, flash access time is about 36 nanoseconds. To smooth the effect of flash access time on code that executes consecutive instructions, a two-level, 64-bit pre-fetch buffer has been implemented. This pre-fetch mechanism is what the presentation refers to as the flash pipeline. Depending on the frequency the system is running at, a 36 nanosecond flash access time means configuring one to five wait states for every flash access.

TI has run a PID benchmark with the CPU running at various frequencies and compared the results with the same PID code running from on-chip RAM, which has single-cycle access. With the previous flash pipeline, execution from flash was almost identical to RAM when running at 50 MHz. However, performance degrades as the CPU frequency increases. For example, at 150 MHz, execution from flash achieves only about 62% of the efficiency of execution from RAM.

Concerto benefits from a new flash pipeline. This pipeline achieves 94% efficiency when Concerto is run at 100 MHz and 91% efficiency when the C28x subsystem is run at 150 MHz. So there is quite a performance upgrade just from this flash pipeline. A data cache is also present on the C28x side, and it will also have a positive impact on the performance of Concerto against Delfino, but TI has not had a chance to measure that yet.
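
To make the "one to five wait states" figure concrete, here is a minimal sketch of the arithmetic. It assumes a simple model in which a flash read takes one base cycle plus the wait states, and the total must cover the 36 nanosecond access time. The function name `flash_wait_states` is hypothetical, and the values a real device requires should be taken from its datasheet, not from this model.

```python
def flash_wait_states(access_time_ns: int, cpu_freq_mhz: int) -> int:
    """Wait states needed so that (wait_states + 1) CPU cycles cover the access time.

    Assumed model, not a datasheet formula: one base cycle plus N wait states.
    """
    # access_time_ns * cpu_freq_mhz / 1000 is the access time in CPU cycles;
    # keep it scaled by 1000 so the ceiling is computed in integer arithmetic.
    cycles_x1000 = access_time_ns * cpu_freq_mhz
    cycles_needed = -(-cycles_x1000 // 1000)  # integer ceiling division
    return max(cycles_needed - 1, 0)


# 36 ns access time (the figure quoted for F05 flash) at three CPU frequencies
for mhz in (50, 100, 150):
    print(mhz, "MHz ->", flash_wait_states(36, mhz), "wait states")
```

Under this model, 36 nanoseconds works out to one wait state at 50 MHz, three at 100 MHz, and five at 150 MHz, which matches the one-to-five range mentioned above. The growing wait-state count is why flash execution falls behind RAM as the CPU frequency rises, unless the pre-fetch hides it.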

