[Eda-dev] Experiment with gscl45nm with Verilog output from an LRU implemented in Chisel

Øyvind Harboe oyvind.harboe at zylin.com
Thu Jul 11 06:45:42 EDT 2019


Did something change recently that improved fMax for the LRU case?

I have no other explanation for the improvement I'm seeing...

After applying the attached patch, which speeds up qrouter, and setting the
following options, I was surprised to see fMax go from 108MHz to 560MHz. If
trivial, well-written Verilog runs at 2.5GHz on gscl45nm, then 560MHz for
LRU.v makes sense: because of how LRU.v is generated from Chisel, pipelining
stages have been left out of the LRU, so I would expect a significant
slowdown compared to trivial, well-written Verilog.

fanout_options="-l 200 -c 20 -F 1000"
initial_density="0.1"

Increasing initial_density to 0.5 didn't change fMax.

I tried the same settings with a different design and its fMax didn't
change. I also tried removing fanout_options and fMax didn't change
(still 560MHz).
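
As a cross-check on how these fMax figures relate to the timing report in
the quoted message further down, here is a small Python sketch. It assumes
(this is inferred from the numbers, not from the tool's documentation) that
the reported path delay is the last arrival time plus the clock skew and
setup at the destination. The variable and function names are made up for
illustration.

```python
# Figures copied from the quoted qflow timing report for the LRU.
last_arrival_ps = 1906.0   # arrival time at _343_/D, last line of the path
clock_skew_ps = 10.697     # "clock skew at destination"
setup_ps = 7263.67         # "setup at destination"

# Assumption: the reported path delay is the sum of these three terms.
path_delay_ps = last_arrival_ps + clock_skew_ps + setup_ps


def fmax_mhz(period_in_ps):
    """Zero-margin clock frequency in MHz for a clock period given in ps."""
    return 1e6 / period_in_ps


def period_ps(freq_in_mhz):
    """Clock period in ps for a frequency given in MHz."""
    return 1e6 / freq_in_mhz


print(f"path delay: {path_delay_ps:.1f} ps")
print(f"fMax:       {fmax_mhz(path_delay_ps):.3f} MHz")
print(f"period at 560 MHz:  {period_ps(560.0):.1f} ps")
print(f"period at 2500 MHz: {period_ps(2500.0):.1f} ps")
```

Under that assumption the sum comes to about 9180.4 ps, which matches the
reported path delay and the 108.928 MHz figure. The setup term alone is
larger than the roughly 1786 ps period that 560MHz implies.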

Attached is a Dockerfile that describes the exact versions and
environment I'm using.


On Sat, Jul 6, 2019 at 5:58 AM Øyvind Harboe <oyvind.harboe at zylin.com>
wrote:

>  I've implemented a very simple 4x4 LRU in Chisel, which I thought would
> be a good smoketest for qflow. The resulting Verilog is attached. All
> output from Chisel is very similar, but usually MUCH bigger.
>
> The LRU is a classic n x n matrix implementation. Upon access of n, set
> row n to all 1's and column n to all 0's, leaving 0 at the intersection.
> The n with the most zeros is the least recently used.
>
> I was surprised to see an fMax of 108MHz. This is a very simple circuit;
> I would have expected it to come pretty close to the highest fMax
> achievable for any non-trivial design at gscl45nm.
>
> I believe I'm doing something silly w.r.t. how I've set up this
> "smoketest" of fMax for gscl45nm or that I'm not reading it right. What I'm
> looking for in qflow is to do P&R of individual modules of my design to
> give me pushback on the design. My thinking is that fMax of the design will
> be lower than the fMax of an individual tiny module.
>
> I'm unsure how to read the output. I can find matrix_0_1 in the
> Verilog, but where's the "other end" of this "tallest pole in the tent"?
>
> # qflow build --tech gscl45nm LRU
> [deleted]
> Top 20 maximum delay paths:
> Path _321_/CLK to _343_/D delay 9180.4 ps
>       0.0 ps  clock_bF$buf3: CLKBUF1_insert1/Y -> _321_/CLK
>     108.8 ps     matrix_0_1:           _321_/Q -> _235_/A
>     152.5 ps           _67_:           _235_/Y -> _238_/A
>     225.7 ps           _70_:           _238_/Y -> _256_/B
>     388.2 ps           _88_:           _256_/Y -> _257_/A
>     865.9 ps           _89_:           _257_/Y -> _258_/B
>    1058.8 ps           _90_:           _258_/Y -> _267_/A
>    1447.9 ps           _99_:           _267_/Y -> _268_/A
>    1457.6 ps          _100_:           _268_/Y -> _269_/B
>    1906.0 ps            _1_:           _269_/Y -> _343_/D
>
>    clock skew at destination = 10.697
>    setup at destination = 7263.67
> [deleted]
> Computed maximum clock frequency (zero margin) = 108.928 MHz
>
>
>
> --
> Øyvind Harboe, General Manager, Zylin AS, +47 917 86 146
>
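
The matrix scheme described in the quoted message can be modelled in a few
lines of Python. This is only a behavioural sketch under the assumptions
stated in the comments, not the attached Chisel or Verilog; the class and
method names are invented for illustration.

```python
class MatrixLRU:
    """Behavioural model of an n x n matrix LRU (illustrative names)."""

    def __init__(self, n=4):
        self.n = n
        # All-zero start state is an assumption; the post does not say
        # how the hardware is reset.
        self.matrix = [[0] * n for _ in range(n)]

    def access(self, i):
        """Mark entry i as most recently used."""
        # Set row i to all 1's ...
        for col in range(self.n):
            self.matrix[i][col] = 1
        # ... then column i to all 0's, which also clears the intersection.
        for row in range(self.n):
            self.matrix[row][i] = 0

    def least_recently_used(self):
        """Return the entry whose row holds the most zeros.

        Ties (possible before every entry has been accessed) resolve to
        the lowest index; that tie-break is a choice made for this sketch.
        """
        return max(range(self.n), key=lambda row: self.matrix[row].count(0))


lru = MatrixLRU(4)
for entry in (0, 1, 2, 3):
    lru.access(entry)
oldest_before = lru.least_recently_used()   # entry 0 was accessed first
lru.access(0)
oldest_after = lru.least_recently_used()    # entry 1 is now the oldest
```

After entries 0 through 3 are accessed in order, row 0 is all zeros, so
entry 0 is reported as least recently used; accessing 0 again moves that
status to entry 1.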


-- 
Øyvind Harboe, General Manager, Zylin AS, +47 917 86 146
-------------- next part --------------
141c141,143
<     if {$result < 10} {
---
>     #
>     # HACK! speed up qrouter by eliminating cleanup
>     if {0} {
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Dockerfile
Type: application/octet-stream
Size: 4448 bytes
Desc: not available
URL: <http://www.opencircuitdesign.com/pipermail/eda-dev/attachments/20190711/20e7bf9b/attachment.obj>

