|M.Sc Student||Manor Shimon|
|Subject||Multi-Synchronous Clocking for Low Power|
|Department||Department of Electrical Engineering|
|Supervisor||Professor Ran Ginosar|
Today’s standard System on a Chip (SoC) uses synchronous communication between system blocks. To enable this, the clock trees and interconnect dissipate high levels of power and occupy a large silicon area.
This research investigates the Multi-Synchronous SoC as a way to reduce clock and interconnect power. Multi-Synchronous clock domains are driven by clocks that share the same frequency but differ in phase, and the relative phase may drift slowly over time as a result of voltage and temperature changes. When data is transferred from one such domain to another, a phase compensation circuit lets the receiver sample the data at a time that avoids conflict with the receiving clock. Phase detector circuits operate continuously and adjust the phase compensator whenever the phase changes.
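The phase compensation idea can be illustrated with a minimal behavioral sketch. It is not the circuit developed in the thesis: it assumes, purely for illustration, that the compensator chooses between sampling on the rising or the falling edge of the receiving clock, and that the phase detector reports where incoming data transitions fall as a fraction of the clock cycle. The function names and the `threshold` parameter are hypothetical.

```python
# Behavioral sketch only (assumed scheme, not the thesis circuit):
# phases are expressed as fractions of one clock cycle, in [0, 1).

RISING_PHASE = 0.0   # receiver rising edge
FALLING_PHASE = 0.5  # receiver falling edge


def circular_distance(a: float, b: float) -> float:
    """Distance between two phases measured around the clock cycle."""
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)


def margin(edge: str, data_phase: float) -> float:
    """How far the data transition is from the chosen sampling edge."""
    edge_phase = RISING_PHASE if edge == "rising" else FALLING_PHASE
    return circular_distance(data_phase, edge_phase)


def choose_sampling_edge(data_phase: float) -> str:
    """Pick the edge that lies farthest from the data transition."""
    if margin("rising", data_phase) >= margin("falling", data_phase):
        return "rising"
    return "falling"


def update_edge(current_edge: str, data_phase: float, threshold: float = 0.2) -> str:
    """Keep the current edge while its margin is adequate; otherwise re-select.

    The threshold adds hysteresis so that slow phase drift does not cause
    the selection to toggle repeatedly near the decision boundary.
    """
    if margin(current_edge, data_phase) >= threshold:
        return current_edge
    return choose_sampling_edge(data_phase)
```

For example, if data transitions arrive very close to the rising edge, the sketch moves sampling to the falling edge, and it keeps that choice until drift brings the transitions within the threshold of the falling edge.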
In a normal single-clock synchronous SoC, a powerful global clock tree is used to ensure that all blocks receive the same clock phase, so that data can be moved freely from one block to another. To achieve this, the global clock tree consumes considerable power and area.
With Multi-Synchronous clocking, each block receives an unbuffered version of the single clock. Although each block may receive a different (and slowly varying) clock phase, the global clock requires less power and less area.
In addition to the savings in the global clock tree, power and area are also saved in the inter-block data interconnect. In a normal synchronous SoC, the data transfer time between blocks is restricted to a fraction of the clock cycle. In Multi-Synchronous systems, the only restriction is the inter-line skew of multi-bit parallel transfers. This relaxed constraint makes additional savings in area and power possible.
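The difference between the two interconnect constraints can be sketched as two simple checks. This is an illustrative model only: the `usable_fraction` parameter, the skew budget, and the delay values in the usage note are assumptions and are not taken from the thesis.

```python
from typing import Sequence


def meets_synchronous_timing(
    line_delays_ns: Sequence[float],
    clock_period_ns: float,
    usable_fraction: float = 0.5,  # assumed share of the cycle left for the wires
) -> bool:
    """Synchronous case: the slowest line must arrive within a fraction of the cycle."""
    return max(line_delays_ns) <= usable_fraction * clock_period_ns


def meets_multisync_timing(
    line_delays_ns: Sequence[float],
    max_skew_ns: float,  # assumed skew budget for the parallel bus
) -> bool:
    """Multi-Synchronous case: only the spread between the lines is constrained."""
    return max(line_delays_ns) - min(line_delays_ns) <= max_skew_ns
```

As a hypothetical usage example, a three-bit bus with line delays of 4.1, 4.2 and 4.3 ns fails the synchronous check at a 300 MHz clock (period of about 3.33 ns), yet passes the Multi-Synchronous check with a 0.5 ns skew budget, because the absolute delay no longer matters and only the 0.2 ns spread is constrained.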
Two SoC architectures were used to evaluate Multi-Synchronous clocking. A ring of AES blocks was designed for high-frequency operation in order to stress the method. It achieved 300 MHz in 0.18 µm CMOS, a relatively high frequency for that technology. A large matrix of very simple units with long interconnects was also studied at up to 500 MHz, in order to investigate a large and fast clock tree.
We demonstrate a reduction of up to 68% in interconnect and clock power (up to 33% in interconnect combinational power and up to 67% in clock network power). Alternatively, we demonstrate a clock frequency that is up to 23% higher. In contrast with other advanced clocking methods, the Multi-Synchronous method is applied using standard VLSI tools.