Notes about FPGAs, Digital Design, Verilog and SystemVerilog concepts
Last Updated: November 10, 2018 by Pepe Sandoval
If you find the information in this page useful and want to show your support, you can make a donation
Use PayPal
This will help me create more stuff and fix the existent content...
In a Digital System the information is in discrete and finite values
Combinatorial Logic - Logic that does not rely on the previous state of the system to set the current output state. The outputs are purely dependent on the inputs
Sequential Logic - Logic that typically uses flip flops and the current output state influences future output states.
Truth Table - A table showing a logic circuit's possible inputs and the outputs that will result.
reset types:
The clock of a system must NOT be controlled inside the module by any type of logic; it can be controlled outside the module with clock gating techniques
A digital system must be an LTI (Linear Time-Invariant) system, which means its input-output characteristics do not change over time
In a synchronous design everything is related to the clock signal, and it must behave in a predictable and reliable way by not violating setup and hold times. It needs to avoid delay chains and combinational chains
Types of HDL description:
FPGA - Field Programmable Gate Array. Programmable logic blocks interconnected through a programmable switch fabric or mesh
LUT (Look-Up Tables) are a key component on the creation of some FPGAs since they can be used to create combinational logic, they are implemented using some memory (ROM, RAM, etc.) and a series of multiplexers cascaded
Most manufacturers combine LUTs with other components (like registers, adders, multipliers, etc.) to create the basic logic modules used to implement an FPGA
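As a rough illustration of the two points above, a LUT can be modeled as a small stored truth table whose bits are selected by the inputs, optionally followed by a register. This is only a sketch: the module name `lut4_cell` and the parameters `INIT` and `USE_FF` are made up for this example and do not correspond to any vendor primitive.

```systemverilog
// Hypothetical basic logic cell: a 4-input LUT followed by an optional flip-flop.
module lut4_cell #(
  parameter logic [15:0] INIT   = 16'h8000, // stored truth table; 16'h8000 behaves as a 4-input AND
  parameter bit          USE_FF = 0         // 1: registered output, 0: combinational output
) (
  input  logic       clk,
  input  logic [3:0] in,
  output logic       out
);
  logic lut_out, ff_out;

  // The inputs select one bit of the truth table (in effect a 16:1 multiplexer).
  assign lut_out = INIT[in];

  // Optional register stage, as found next to the LUT in many logic cells.
  always_ff @(posedge clk) ff_out <= lut_out;

  assign out = USE_FF ? ff_out : lut_out;
endmodule
```

Changing `INIT` changes the implemented function without changing the structure, which is the property that makes LUTs useful as programmable logic.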
FPGAs components/organization:
SoC FPGAs can contain an FPGA, a processor (e.g. an ARM core) and other peripherals in the same chip
The main inputs of the designer are the design files (usually HDL code) and the tool settings & constraints (timing constraints, IO timing requirements, pin placements, etc.)
TCL (Tool Command Language) - scripting language used to configure design tools
FPGA development process:
FPGA Design
constraint
Power and thermal are main considerations in an FPGA design
The time scale setup is done with the `` `timescale `` directive. For example, `` `timescale 1s/1ms `` says that delays are expressed in seconds and that the simulator should keep track of time down to 1 millisecond.
In other words, delays are written in seconds and the minimum precision the simulator must keep is 1 ms, so delays like `#0.1` and `#0.5` would be valid (they would be 100 ms and 500 ms)
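A minimal simulation-only sketch of the directive in use. The module name `timescale_demo` is an assumption made for this example; `$realtime` is used because it returns the current time as a real number in the declared time unit (seconds here).

```systemverilog
`timescale 1s/1ms   // delays are written in seconds, resolved down to 1 ms

module timescale_demo;
  initial begin
    #0.1 $display("after #0.1: %0.3f s", $realtime); // 100 ms after time 0
    #0.5 $display("after #0.5: %0.3f s", $realtime); // 500 ms later, 600 ms in total
    $finish;
  end
endmodule
```

A delay finer than the precision (for example `#0.0004`) would be rounded to the nearest millisecond by the simulator.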
Setup Slack = (Time it takes clock to arrive) - (Time it takes data to arrive)
If setup slack is positive, data arrives before the clock, so the timing requirements are met
If setup slack is negative, data arrives after the clock, so there is a timing violation
Hold Slack = (Time it takes data to arrive) - (Time it takes clock to arrive)
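A worked example of the setup slack definition. The arrival times are invented purely for illustration, and the symbols $t_{\text{clock}}$ and $t_{\text{data}}$ are shorthand introduced here for the two arrival times.

```latex
\begin{aligned}
\text{Setup Slack} &= t_{\text{clock}} - t_{\text{data}} = 10\,\text{ns} - 7.5\,\text{ns} = 2.5\,\text{ns} > 0 \quad (\text{requirement met}) \\
\text{Setup Slack} &= t_{\text{clock}} - t_{\text{data}} = 10\,\text{ns} - 11\,\text{ns} = -1\,\text{ns} < 0 \quad (\text{timing violation})
\end{aligned}
```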
Timing Constraints categories:
Other & Exceptions - paths where timing for that particular path or IO doesn't matter (like LEDs, switches, static lines, etc.) or where the signal has multiple cycles to settle
Clocking - define master clocks and generated clocks
IO Timing - define setup, hold and clock to output timing requirements for each IO port. These constraints specify timing requirements of an external device
Bitstream (binary configuration file) - you write Verilog and the tools create a bitstream that you download into the FPGA
Some FPGAs are actually very similar to memory devices. Some are nonvolatile and some require something (like a configuration memory chip or a CPU) to load them every time, but essentially you write to them the bits (bitstream) that configure the connections and options that wire up the circuit you've designed in an HDL
The process of taking your Verilog and creating a file that can be sent to the FPGA or a configuration memory is a workflow: an FPGA toolchain converts Verilog code into a configuration bitstream
The bitstream isn't an executable. It just defines how the internal blocks of the FPGA connect together and possibly sets options for some of the blocks.
During simulation, Verilog & SystemVerilog act like programming languages, allowing you to write constructs that would not be transferable to the FPGA
Blocking assignment (`=`)
Non-blocking assignment (`<=`)
RHS = right-hand side equation and LHS = left-hand side expression
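The difference between the two assignment types is easiest to see with a swap. The sketch below is a simulation-only example with made-up signal names: with non-blocking assignments every RHS is sampled before any LHS is updated, so the values really swap; with blocking assignments the second statement already sees the updated value.

```systemverilog
module assign_demo;
  logic       clk = 0;
  logic [3:0] a_nb = 4'd1, b_nb = 4'd2; // swapped with non-blocking assignments
  logic [3:0] a_b  = 4'd1, b_b  = 4'd2; // "swapped" with blocking assignments

  always @(posedge clk) begin
    a_nb <= b_nb;  // both RHS values are sampled first,
    b_nb <= a_nb;  // then both LHS are updated: a real swap
  end

  always @(posedge clk) begin
    a_b = b_b;     // a_b is updated immediately
    b_b = a_b;     // reads the new a_b, so both end up equal
  end

  initial begin
    #1 clk = 1;    // single rising edge
    #1;
    assert (a_nb == 4'd2 && b_nb == 4'd1) else $error("non-blocking swap failed");
    assert (a_b  == 4'd2 && b_b  == 4'd2) else $error("unexpected blocking result");
    $finish;
  end
endmodule
```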
The `always` statement tells Verilog that the code following should execute whenever any of the inputs you use in it change
The Verilog HW description code becomes connecting wires that wire up circuit elements just as though you had a sea of gates on a PCB and you connected them with wire wrap
Statements can either be synthesizable, which means they can produce HW, or non-synthesizable, which means they cannot produce HW but can be used to describe elements and logic in a test-bench
A module represents an abstraction of a specific task. The highest level module (Top Level) should only have the instantiation of the other modules that make up the design
Continuous assignments are done using the `assign` keyword
SystemVerilog is a superset of Verilog
Altera's recommendation is to use `logic` and `bit` instead of `reg` and `wire` with procedural blocks for HW synthesis
When there are multiple assignments (multiple drivers), using `wire` is recommended
A PB (procedural block) is a behavioral description, but when interpreting it, it can be read as a sequential block. We cannot assign to nets in a PB
Conditional statements (`if`, `case`) must include all possible combinations to avoid generating latches and errors
`always`: use `logic` and blocking assignment (`=`) for combinational logic and non-blocking assignment (`<=`) for sequential logic; the sensitivity list of signals must be specified. E.g. `always @(sel, dat_1, dat_0) begin ... end`
`always_comb`: used to describe combinational logic, use `logic` and blocking assignment (`=`), no need to add a sensitivity list
`always_ff`: used to describe a FF, use `logic` and non-blocking assignment (`<=`); it is required to put `clk` and `reset` in the list of signals (`reset` only when it is asynchronous)
`always_latch`: used to describe latches
Verilog allows numeric representation of the 4 states that can occur in HW:
`X` value is used to represent unknown or don't care
`Z` value is used to represent unconnected or tri-state design logic
`0` logic zero
`1` logic one
A number is represented by the format: `<number of bits>'<base identifier><digits>`. The underscore (`_`) can be used as a separator
`4'd5` => `0101`
`8'b1_0101` => `00010101`
`8'hF` => `00001111`
`-8'b101` => `11111011` (2's complement of `101`)
`12'hXA` => `xxxxxxxx1010` (Hex fill with X)
`-4'shA` => `0110` (4 bits Hex with sign)
A signal is either a net (`wire`, `uwire`, `wand`, `wor`, `tri`, `triand`, `trior`, `tri0`, `tri1`, `trireg`, `supply0` and `supply1`) or a variable (`var`).
Its data type is either 2-state (`bit`, `byte`, `shortint`, `int`, `longint`) or 4-state (`logic`, `integer`, `reg`, `time`)

Variable (`var`) data type | Description |
---|---|
`integer` | 4-state, 32-bit signed integer |
`real` | 64-bit floating point |
`time` | 4-state, 64-bit unsigned integer |
`realtime` | 64-bit floating point used for storing time |
`reg` | 4-state, user defined size |
`logic` | 4-state, user defined size |
`shortint` | 2-state, 16-bit signed integer |
`int` | 2-state, 32-bit signed integer |
`longint` | 2-state, 64-bit signed integer |
`bit` | 2-state, user defined size (1 bit by default) |
`byte` | 2-state, 8-bit signed integer |
`shortreal` | 32-bit floating point |
`void` | indicates no storage (or returns nothing) |
A `wire` has to be constantly driven to some value because it is a net. A `reg` is more like a regular variable: you can set a value in a `reg` and it sticks.
Variables cannot be driven by multiple sources so it is NOT possible to have multiple continuous assignments write to the same variable or to combine procedural assignments with continuous assignments. In general a signal declared as a variable only has a single source
Only nets can have multiple sources, such as multiple continuous assignments and/or connections to multiple output ports of module
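A small simulation sketch of the single-source rule, using a shared tri-state bus as the typical case where a net with several drivers is needed. All names are hypothetical.

```systemverilog
module multi_driver_demo;
  logic       en_a, en_b;
  logic [7:0] data_a, data_b;
  wire  [7:0] bus;                      // net: may have several drivers

  // Two continuous assignments drive the same net; a disabled driver releases it with Z.
  assign bus = en_a ? data_a : 8'bz;
  assign bus = en_b ? data_b : 8'bz;

  // The same pattern on a variable is illegal:
  //   logic [7:0] v;
  //   assign v = data_a;
  //   assign v = data_b;   // error: a variable can only have a single source

  initial begin
    data_a = 8'hA5; data_b = 8'h3C;
    en_a = 1; en_b = 0; #1 assert (bus === 8'hA5);
    en_a = 0; en_b = 1; #1 assert (bus === 8'h3C);
    en_a = 0; en_b = 0; #1 assert (bus === 8'bz);  // nobody drives: high impedance
    $finish;
  end
endmodule
```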
`integer` and `time` are variables that are 4-state data types with predefined vector sizes. `int`, `byte`, `shortint` and `longint` are variables that are 2-state data types with predefined vector sizes
SystemVerilog uses the more intuitive `logic` keyword to represent a general purpose, hardware-centric data type
There is no direct correlation between declaring a `reg` variable and the hardware that will be inferred. It is the context that determines the hardware that will be represented (combinational logic or sequential logic). A `reg` may, but doesn't always, infer a flip flop
The `logic` data type is identical to the Verilog `reg` type except that the `reg` keyword cannot be paired with net type keywords
The `real` (64-bits) and `shortreal` (32-bits) types are used to store floating point numbers; they are not synthesizable
`logic`, `bit`, `byte`, ... are data types; they indicate the number of signal states the variable can have. However, when any of these is used by itself, a `var` (variable) is implied.
// Equivalent declarations but with state representation difference
// Uses 4-state
var logic [63:0] addr;
logic [63:0] addr;
var [63:0] addr;
// Uses 2-state
var bit [63:0] addr;
bit [63:0] addr;
A Verilog net type (`wire`, `wor`, `wand`) defaults to being a 4-state `logic` data type but you can be explicit and use a 4-state data type. There are no 2-state net types
// Equivalent declarations for a 64-bit wide net
wire logic [63:0] data;
wire [63:0] data;
The C-like 2-state types, like `int` and `byte`, cannot represent `X` or `Z`
One problem you wind up with in Verilog is that if you make up a name, the compiler (by default) will assume you mean for it to be a `wire` unless you tell it otherwise. If you misspell a name, it just becomes a new `wire` and then you can't figure out why your code doesn't do what you want. That's why we use `` `default_nettype none ``
The 4-state variables, such as `reg`, `logic`, and `integer`, default to beginning simulation with all bits at a logic `X`.
In this state they are considered uninitialized until a first value is assigned to the variable, for example by the design reset logic
All 2-state data types begin simulation with a logic `0`. It is legal to assign 4-state values to 2-state variables: bits that have an `X` or `Z` value in the 4-state type will be translated to a logic `0`
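The initial values and the 4-state to 2-state conversion described above can be checked with a short simulation sketch (variable names are made up for the example):

```systemverilog
module init_demo;
  logic [3:0] four_state;  // 4-state: starts simulation as X
  bit   [3:0] two_state;   // 2-state: starts simulation as 0

  initial begin
    assert (four_state === 4'bxxxx);  // uninitialized 4-state variable
    assert (two_state  ==  4'b0000);  // 2-state variables begin at 0

    four_state = 4'b1x0z;
    two_state  = four_state;          // X and Z bits are converted to 0
    assert (two_state == 4'b1000);
    $finish;
  end
endmodule
```

This silent conversion is one reason a 2-state variable can hide an uninitialized signal that a 4-state variable would expose as `X`.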
It is preferable to use 4-state types to represent synthesizable RTL models. The 4-state `logic` type and the 2-state `bit`, `byte`, `shortint`, `int`, and `longint` types are all synthesizable and synthesis compilers treat them the same way; choosing 4-state or 2-state primarily affects simulation.
The general rule is that a variable must be used when modeling using procedural blocks (like `always`, `always_comb`, `always_ff` or `always_latch`), and a net must be used when modeling using continuous assignments and module instances
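A minimal sketch of that rule, combining the three styles in one module. The module name `var_vs_net` and the active-low asynchronous reset `rst_n` are assumptions for this example; `sel`, `dat_1` and `dat_0` reuse the names from the sensitivity list example earlier in these notes.

```systemverilog
module var_vs_net (
  input  logic clk, rst_n, sel, dat_1, dat_0,
  output logic q
);
  logic mux_out;  // variable: assigned inside a procedural block
  wire  mux_inv;  // net: driven by a continuous assignment

  // Combinational logic: blocking assignments, every branch assigns (no latch).
  always_comb begin
    if (sel) mux_out = dat_1;
    else     mux_out = dat_0;
  end

  // Continuous assignment driving a net.
  assign mux_inv = ~mux_out;

  // Sequential logic: non-blocking assignments, clock and async reset in the list.
  always_ff @(posedge clk or negedge rst_n) begin
    if (!rst_n) q <= 1'b0;
    else        q <= mux_inv;
  end
endmodule
```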
SystemVerilog supports the `signed` and `unsigned` modifiers
reg [63:0] u; // unsigned 64-bit variable
reg signed [63:0] s; // signed 64-bit variable
int s_int; // signed 32-bit variable
int unsigned u_int; // unsigned 32-bit variable
SystemVerilog `static` and `automatic`
In modules, `begin...end` blocks, `fork...join` blocks, and non-automatic tasks and functions, all storage defaults to `static`, unless explicitly declared as `automatic`. Meanwhile all storage in an `automatic` task or function is automatic.
Storage defaults to `static`, with the expectation that these variables are for modeling hardware, which is also static in nature. At the module level, all variables are static.
`automatic` means that the variable storage is dynamically allocated by the software tool when required, and deallocated when no longer needed.
A variable declared as `automatic` in a non-automatic task or function will be dynamically created each time the task or function is entered, and only exists until the task or function exits
If we declare a `static` variable with an in-line initial value, this value will be assigned one time, before the start of simulation. Calls to the task or function will not re-initialize the variable.
SystemVerilog Constants:
`parameter`: Accessible outside of the module; it can be overwritten when instantiating a module and is assigned at compile time
`localparam`: Private parameter, cannot be overwritten
`specparam`: used to define timing and delay constants, typically inside `specify` blocks
`const`: used to declare constants of any data type
parameter port_id = 5;
parameter cache_line_width = 256;
localparam initial_state = 2;
localparam binlocalconst = 4'b1100;
const logic [23:0] C1 = 7; // 24-bit constant
const int C2 = 15; // 32-bit constant
const real C3 = 3.14; // real constant
Type | Operator | Description | # of operands | Example |
---|---|---|---|---|
Arithmetic | `+` / `+=` | Sum | Two or more | `a + b` |
Arithmetic | `-` / `-=` | Subtract | Two or more | `a - b` |
Arithmetic | `*` / `*=` | Multiply | Two or more | `a * b` |
Arithmetic | `/` / `/=` | Division | Two or more | `a / b` |
Arithmetic | `%` / `%=` | Modulo | Two or more | `a % b` |
Arithmetic | `**` | Exponent | Two or more | `a ** b` |
Logic | `!` | Not | One | `!a` |
Logic | `&&` | And | Two | `a && b` |
Logic | `\|\|` | Or | Two | `a \|\| b` |
Relational | `>` | Greater than | Two | `a > b` |
Relational | `<` | Less than | Two | `a < b` |
Relational | `>=` | Greater than or equal | Two | `a >= b` |
Relational | `<=` | Less than or equal | Two | `a <= b` |
Equality | `==` | Equality | Two | `a == b` |
Equality | `!=` | Inequality | Two | `a != b` |
Equality | `===` | Case Equality | Two | `a === b` |
Equality | `!==` | Case Inequality | Two | `a !== b` |
Bitwise | `~` | Not | One | `~a` |
Bitwise | `&` / `&=` | And | Two | `a & b` |
Bitwise | `\|` / `\|=` | Or | Two | `a \| b` |
Bitwise | `^` / `^=` | Xor | Two | `a ^ b` |
Bitwise | `~^` | Xnor | Two | `a ~^ b` |
Reduction | `&` | And | One | `&a` |
Reduction | `~&` | Nand | One | `~&a` |
Reduction | `\|` | Or | One | `\|a` |
Reduction | `~\|` | Nor | One | `~\|a` |
Reduction | `^` | Xor | One | `^a` |
Reduction | `~^` | Xnor | One | `~^a` |
Shift | `>>` / `>>=` | Right Shift | Two | `a >> 2` |
Shift | `<<` / `<<=` | Left Shift | Two | `a << 2` |
Shift | `>>>` / `>>>=` | Arithmetic Right Shift | Two | `a >>> 2` |
Shift | `<<<` / `<<<=` | Arithmetic Left Shift | Two | `a <<< 2` |
Concatenation | `{ }` | Concatenation | Two or more | `{a, b}` |
Replication | `{{ }}` | Replication | Two or more | `{4{a}}` |
Conditional | `?:` | Conditional | Three | `sel ? a : b` |
Auto inc/dec | `++` / `--` | Auto inc/dec | One | `i++` |
Logic Eval | `->` / `<->` | Logic Evaluation | TBD | `a -> b` |
Packaging | `{<<{}}` / `{>>{}}` | Packaging | TBD | `{<<8{a}}` |
Verilog had only one type of array, while SystemVerilog arrays can be either packed or unpacked.
A packed array is treated as both an array and a single value. The packed dimensions are specified before the variable name and increase horizontal size (describe columns). E.g. `bit [3:0][7:0] mybytes;` is a packed array that is stored as a contiguous set of bits
An unpacked array is treated only as an array and it is stored as an independent set of bits. The unpacked dimensions are specified after the variable name and increase vertical size (describe rows). E.g. `bit mybytes [7:0][3:0];` is an unpacked array whose elements are stored independently
Packed and unpacked arrays can be mixed. E.g. `bit [3:0][7:0] barray [2:0];` declares 3 elements of 4 8-bit elements (in other words, each of the 3 elements is packed into 32 bits)
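The practical difference shows up in how the arrays can be assigned and indexed. A small simulation sketch (the module name `array_demo` and the data values are arbitrary):

```systemverilog
module array_demo;
  bit [3:0][7:0] packed_bytes;          // 4 x 8 = 32 contiguous bits
  bit [7:0]      unpacked_bytes [3:0];  // 4 independent 8-bit elements
  bit [3:0][7:0] barray [2:0];          // 3 unpacked elements, 32 packed bits each

  initial begin
    packed_bytes = 32'hDEAD_BEEF;       // packed: can be written as a single value
    assert (packed_bytes[3] == 8'hDE);  // ...and still indexed as an array
    assert (packed_bytes[0] == 8'hEF);
    assert ($bits(packed_bytes) == 32);

    unpacked_bytes[0] = 8'h44;          // unpacked: accessed element by element
    assert (unpacked_bytes[0][2] == 1'b1);

    barray[2] = 32'h0102_0304;          // each unpacked element is a 32-bit packed value
    assert (barray[2][1] == 8'h03);
    assert ($bits(barray) == 96);
    $finish;
  end
endmodule
```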
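Vector:
reg [7:0] aReg;                 // 8-bit vector
wire [2:0] sub_vector;          // a net, so it can be driven by assign
assign sub_vector = aReg[5:3];  // part-select of bits 5 down to 3
Array:
reg aMem [7:0];                 // 8 elements of 1 bit
reg bMem [7:0][3:0];            // 8 x 4 elements of 1 bit
reg [7:0] cMem [3:0];           // 4 elements of 8 bits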
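`function`
Functions cannot contain delays (`#4`) nor `posedge` or `negedge` statements
`task` calls are not allowed inside a function
Functions can be used in continuous assignments (`assign`)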
// C Style function declaration
function integer DoubleWordLength(input integer Word);
  begin
    DoubleWordLength = 2*Word;
  end
endfunction
// Module Style function declaration
function integer DoubleWordLength;
  input integer Word;
  begin
    DoubleWordLength = 2*Word;
  end
endfunction
`initial` & `task`
`initial`
An `initial` statement executes once; all `initial` blocks in a module start at time 0 and run concurrently
`repeat` (to repeat a statement `N` times) and `forever` (to repeat a statement until simulation ends)
// initial with delays
initial
begin
#31 rst = 1;
#23 rst = 0;
rst = #31 1;
rst = #23 0;
end
// initial with repeat
initial repeat (44) #7 enable = ~enable;
// initial with forever
initial
begin
clk = 1'b0;
forever #2 clk = !clk;
end
Storage for an `automatic` task or function is effectively allocated each time it is called.
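The effect of the storage class can be observed with two otherwise identical counters. This is a simulation-only sketch with hypothetical function names; the static version keeps its value between calls, the automatic version starts over on each call.

```systemverilog
module storage_demo;
  // Static function (the default lifetime inside a module).
  function int count_static();
    static int calls = 0;  // initialized once, before simulation starts
    calls++;
    return calls;
  endfunction

  // Automatic function: storage is allocated on every call.
  function automatic int count_auto();
    int calls = 0;         // re-created and re-initialized on every call
    calls++;
    return calls;
  endfunction

  initial begin
    int s, a;
    repeat (3) begin
      s = count_static();
      a = count_auto();
    end
    assert (s == 3);  // static storage remembered the previous calls
    assert (a == 1);  // automatic storage did not
    $finish;
  end
endmodule
```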
initial Task_Clk;
task Task_Clk;
begin
clk = 1'b1;
forever #1 clk = !clk;
end
endtask
initial begin
fork
// All tasks start executing simultaneously
Task1;
Task2;
Task3;
join
end
// fork...join equivalent to separate initial statements
//initial Task1; initial Task2; initial Task3;
task Task1;
forever #1 A = !A;
endtask
task Task2;
forever #2 B = !B;
endtask
task Task3;
forever #4 C = !C;
endtask
A `fork...join` block schedules each statement in the block to run concurrently and blocks execution in the parent thread until all of them complete
initial begin
fork
// Task3 starts only after Task1 and Task2 complete
Task1;
Task2;
join
Task3;
end
task Task1;
repeat (8) #1 A = !A;
endtask
task Task2;
repeat (16) #1 B = !B;
endtask
task Task3;
repeat (4) #1 C = !C;
endtask
`fork...join_any` schedules each statement like a `fork...join` block, then when the first statement completes, execution continues in the parent thread
initial begin
fork
// Task 1 and 2 start at the same time,
// when Task1 finishes Task3 starts
Task1;
Task2;
join_any
Task3;
end
task Task1;
repeat (8) #1 A = !A;
endtask
task Task2;
repeat (16) #1 B = !B;
endtask
task Task3;
repeat (4) #1 C = !C;
endtask
`fork...join_none` schedules each statement like a `fork...join` block, but doesn't block execution in the parent thread; in other words, execution continues normally in the parent thread
initial begin
fork
// All tasks start executing simultaneously
Task1;
Task2;
join_none
Task3;
end
task Task1;
repeat (8) #1 A = !A;
endtask
task Task2;
repeat (16) #1 B = !B;
endtask
task Task3;
repeat (4) #1 C = !C;
endtask
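The three join variants can be compared side by side by checking the simulation time at which the parent thread resumes. This is a self-checking sketch; the helper task `wait_for` is hypothetical and simply consumes `n` time units.

```systemverilog
module fork_demo;
  // Hypothetical helper: a thread that takes n time units to complete.
  task automatic wait_for(input int n);
    #n;
  endtask

  initial begin
    // join: parent resumes when ALL threads are done (the slowest takes 16).
    fork
      wait_for(8);
      wait_for(16);
    join
    assert ($time == 16);

    // join_any: parent resumes when the FIRST thread is done (8 later).
    fork
      wait_for(8);
      wait_for(16);
    join_any
    assert ($time == 24);

    // join_none: parent resumes immediately, threads keep running in the background.
    fork
      wait_for(8);
    join_none
    assert ($time == 24);

    // wait fork: block until every thread spawned above has finished.
    wait fork;
    assert ($time == 32);
    $finish;
  end
endmodule
```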
SystemVerilog Interfaces offers a new paradigm for modeling ports and interconnections, they are an abstraction of an interconnection between modules
Interfaces help us to model communication buses and interconnection in general
They avoid duplication of port declarations in multiple modules and avoid changes in multiple modules if a change in the port is required
Allows signals to be grouped together and represented as a single port. Each module that uses this group of signals has as single port of the interface type instead of many port definitions
With the use of Interfaces common signals are encapsulated into a single location, this eliminates the need of redundant declarations in other modules and simplify connections between modules.
An Interface can contain type declarations, tasks, functions, procedural blocks, program blocks and assertions
Interfaces cannot contain a design hierarchy (this means instances of other modules)
Interfaces can have external signals coming into the interface, and these signals can then be connected to other modules through the interface without explicitly connecting the signals to each module
The `.*` syntax indicates that all ports, nets and variables of the same name in the module should automatically be connected together
`modport`
A `modport` defines the direction of the interface signals as seen by each connecting module. E.g. for an `enable` signal, the control unit will need to see the `enable` signal as an output but the counter needs to see it as an input
`modport` definitions only define whether the connecting module sees a signal as an `input`, `output` or `inout`; modport declarations do not contain vector sizes or types, this information must be defined as part of the signal type declarations in the interface
Signals from an interface can be used inside a module as a normal signal using the syntax `<interface_name>.<signal_name>` or `<interface_name>.<modport_name>.<signal_name>`
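A minimal sketch of the control unit and counter case mentioned above. Everything here (the interface `counter_if`, its signals, the modport names and the stop-at-9 behavior) is invented for illustration; only the `enable` direction split comes from the notes.

```systemverilog
interface counter_if (input logic clk, rst_n);
  logic       enable;   // sizes and types are declared once, here
  logic [3:0] count;

  // Same signals, different directions depending on who connects.
  modport ctrl    (input clk, rst_n, count, output enable);
  modport counter (input clk, rst_n, enable, output count);
endinterface

module control_unit (counter_if.ctrl bus);
  // Keep counting until the counter reaches 9.
  assign bus.enable = (bus.count != 4'd9);
endmodule

module counter (counter_if.counter bus);
  always_ff @(posedge bus.clk or negedge bus.rst_n)
    if (!bus.rst_n)      bus.count <= '0;
    else if (bus.enable) bus.count <= bus.count + 1'b1;
endmodule

module top;
  logic clk = 0, rst_n = 0;
  always #1 clk = ~clk;

  counter_if   bus    (.clk(clk), .rst_n(rst_n));
  control_unit u_ctrl (.bus(bus));  // a single port carries the whole signal group
  counter      u_cnt  (.bus(bus));

  initial begin
    #2  rst_n = 1;
    #38 assert (bus.count == 4'd9); // counter stopped once enable dropped
    $finish;
  end
endmodule
```

Adding a signal to the bus only requires changing the interface, not the port lists of both modules.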
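`for` & `generate`
`for` and `generate` are used to describe hardware that has a structure with a well defined regularity.
`generate`
Generate constructs can use `for`, `if` and `case`
A generate loop (`for` loop) uses the `genvar` type to iterate
`for`
A procedural `for` loop iterates with a `reg` or `integer` variable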
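A sketch of both loop styles: a generate `for` that replicates one full-adder stage per bit, and a procedural `for` in a small testbench that sweeps the inputs. The ripple-carry adder is just a convenient example of regular hardware; all names are hypothetical.

```systemverilog
module ripple_adder #(parameter int N = 4) (
  input  logic [N-1:0] a, b,
  output logic [N:0]   sum
);
  wire [N:0] carry;
  assign carry[0] = 1'b0;

  genvar i;  // generate loops iterate with a genvar
  generate
    for (i = 0; i < N; i++) begin : g_full_adder
      // One full adder is instantiated per bit at elaboration time.
      assign sum[i]     = a[i] ^ b[i] ^ carry[i];
      assign carry[i+1] = (a[i] & b[i]) | (carry[i] & (a[i] ^ b[i]));
    end
  endgenerate

  assign sum[N] = carry[N];
endmodule

module ripple_adder_tb;
  logic [3:0] a, b;
  wire  [4:0] sum;

  ripple_adder #(.N(4)) dut (.a(a), .b(b), .sum(sum));

  initial begin
    // Procedural for loop with an integer index: try every input combination.
    for (integer x = 0; x < 256; x++) begin
      {a, b} = x[7:0];
      #1 assert (sum == a + b) else $error("mismatch for a=%0d b=%0d", a, b);
    end
    $finish;
  end
endmodule
```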
`typedef`
With the `typedef` keyword SystemVerilog adds the ability for the user to define new net and variable types. E.g. `typedef logic [7:0] uint_t; uint_t a;`
`typedef` data can be synthesizable
`struct`
A structure is a collection of variables and/or constants under a single name
Structures are synthesizable depending on the context
In order to initialize a structure, SystemVerilog uses the tokens '{ }
We can declare a `struct` type using `typedef struct { ... } MyStruct_t;` and then create an instance, or define the structure and create an instance in one step using `struct { ... } mystruct;`
// structure variable
typedef struct {
bit[31:0] a, b;
byte opcode;
logic [23:0] address;
} Instruction_t;
module StructModule(
input Instruction_t ModuleInputs,
output logic out
);
// Select bit 0 of a or b depending on the opcode
assign out = (ModuleInputs.opcode == 8'h01) ? ModuleInputs.a[0] : ModuleInputs.b[0];
endmodule
By default, a structure is unpacked, which means the members of the structure are treated as
independent variables or constants that are grouped together under a common name. However, it can be explicitly
declared as packed in order to store all the members of the structure as contiguous bits in a specified order
module FrameGenerator (
input bit error,
input byte data,
input bit [3:0] address,
output [13:0] frame
);
struct packed {
logic valid;
logic [3:0] address;
logic [7:0] data;
logic parity;
} FrameStruct;
always_comb begin
if(error)
FrameStruct = 14'h0FDF;
else begin
FrameStruct.valid = 1'b1;
FrameStruct.address = address + 1'b1;
FrameStruct.data = data;
FrameStruct.parity = ^data;
end
end
assign frame = FrameStruct;
endmodule
`union`
A `union` is similar to a `struct`: it is also a collection of variables and/or constants under a single name, but the amount of reserved space is dictated only by the largest variable in the union
By default, a `union` is unpacked, which means there is no guaranteed bit-level storage layout for its members. It can also be declared as packed (members stored as contiguous bits) as long as the number of bits in all the union members is the same
Unpacked unions are not synthesizable and packed unions can be synthesizable depending on the context and also depending on the synthesis tools
Packed unions cannot contain `real`, `shortreal`, unpacked structs, unpacked unions or unpacked arrays
typedef struct packed {
logic [15:0] source_address;
logic [15:0] destination_address;
logic [23:0] data;
logic [ 7:0] opcode;
} data_packet_t;
union packed {
data_packet_t packet; // packed structure
logic [7:0][7:0] bytes; // packed array
} dreg;
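Because both members of this packed union are 64 bits wide and share the same storage, a value written through one member can be read through the other. A self-checking sketch reusing the declarations above (the module name `union_demo` and the data values are arbitrary):

```systemverilog
module union_demo;
  typedef struct packed {
    logic [15:0] source_address;       // most significant field
    logic [15:0] destination_address;
    logic [23:0] data;
    logic [ 7:0] opcode;               // least significant field
  } data_packet_t;

  union packed {
    data_packet_t    packet;  // view as named fields
    logic [7:0][7:0] bytes;   // view as 8 bytes
  } dreg;

  initial begin
    dreg.packet = '{source_address:      16'h1234,
                    destination_address: 16'h5678,
                    data:                24'hABCDEF,
                    opcode:              8'h01};

    assert (dreg.bytes[7] == 8'h12);   // upper byte of source_address
    assert (dreg.bytes[0] == 8'h01);   // opcode

    dreg.bytes[0] = 8'hFF;             // write through the byte view...
    assert (dreg.packet.opcode == 8'hFF); // ...and read it back as a field
    $finish;
  end
endmodule
```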
`package`
A package can contain `parameter`, `localparam`, `const`, `task` and `function` definitions
package definitions;
parameter Word_Length = 8;
typedef logic [7:0] Uint;
typedef struct packed {
logic valid;
logic [3:0] address;
logic [7:0] data;
logic parity;
} Frame_t;
function integer DoubleWordLength(input integer Word);
begin
DoubleWordLength = 2*Word;
end
endfunction
endpackage
Use the scope resolution operator `::` and `import` to include a package's definitions; a module can import all or just certain definitions
import definitions::*;
import definitions::Word_Length;
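A compact sketch showing the two ways of reaching package items: an explicit `import` and a direct `::` reference. The package below is a reduced copy of the `definitions` package from these notes so that the example stands alone; the module name `package_demo` is made up.

```systemverilog
package definitions;
  parameter int Word_Length = 8;
  typedef logic [Word_Length-1:0] Uint;

  function automatic int DoubleWordLength(input int Word);
    return 2 * Word;
  endfunction
endpackage

module package_demo;
  import definitions::Word_Length;   // import a single definition by name

  definitions::Uint value;           // reference without importing, using ::
  localparam int Double = definitions::DoubleWordLength(Word_Length);

  initial begin
    assert ($bits(value) == 8);
    assert (Double == 16);
    $finish;
  end
endmodule
```

Importing individual names keeps it obvious where each identifier comes from, while the wildcard form `import definitions::*;` is shorter when many items are needed.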
For synchronous logic, the clock rate needs to be at least twice as fast as the fastest input you need to process.
The clock period must be longer than the delay of the longest combinational path to avoid capturing glitches. Timing and path delays cause glitches
Inputs to the flip flop must be stable for the setup time before the active clock edge. The inputs also have to stay stable for at least the hold time, in other words the signal into the flip flop has to be stable a little bit before the clock edge (set up time) and a little bit after the clock edge (hold time)
Most common flip flop implementations
A state machine is one of the strategies used to control a digital system, the control unit of a digital system usually can be implemented with a state machine
Usually implemented in two parts
State machines types
In SystemVerilog we can be descriptive and define the state variable as an `enum` type. E.g. `enum logic [1:0] {Waiting_State, Multiplying, Close_Channels} state;`
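A sketch of a small state machine using that `enum`, split into a state register and next-state logic. The inputs `start` and `done`, the output `busy` and the transitions themselves are assumptions made for the example; only the state names come from the notes.

```systemverilog
module fsm_demo (
  input  logic clk, rst, start, done,
  output logic busy
);
  typedef enum logic [1:0] {Waiting_State, Multiplying, Close_Channels} state_t;
  state_t state, next_state;

  // Part 1: state register (sequential logic).
  always_ff @(posedge clk or posedge rst)
    if (rst) state <= Waiting_State;
    else     state <= next_state;

  // Part 2: next-state logic (combinational logic).
  always_comb begin
    next_state = state;  // default: stay in the current state (avoids latches)
    case (state)
      Waiting_State:  if (start) next_state = Multiplying;
      Multiplying:    if (done)  next_state = Close_Channels;
      Close_Channels:            next_state = Waiting_State;
      default:                   next_state = Waiting_State;
    endcase
  end

  // Output depends only on the current state.
  assign busy = (state != Waiting_State);
endmodule
```

With named states, waveform viewers and `$display` with `state.name()` show `Multiplying` instead of a raw encoding.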
EDIF (Electronic Design Interchange Format) - if you buy IP (intellectual property) from a vendor, it will probably come as an EDIF file. That means you won't be able to easily edit it, but you can add it into your designs.
FPGAs usually have a speed rating and the tools have great models of how long a signal will take to propagate from point A to point B at a certain temperature.
Simulate/test/debug system behavior: doesn't take timing into account
High-fidelity simulate/test/debug on the actual device-specific configuration: takes timing into account
POF (Programmer Object File) - file used to download a design to an FPGA
Testbench - environment to test a logic design
Declare a `reg` for each input we want to feed the device under test and a `wire` for each output
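A minimal testbench sketch following the testbench guideline above: a `reg` per input of the device under test and a `wire` per output. The `and_gate` module is a hypothetical device under test chosen only to keep the example short.

```systemverilog
// Hypothetical device under test.
module and_gate (input wire a, b, output wire y);
  assign y = a & b;
endmodule

module and_gate_tb;
  reg  a, b;  // one reg per DUT input: the testbench sets these
  wire y;     // one wire per DUT output: the DUT drives this

  and_gate dut (.a(a), .b(b), .y(y));

  initial begin
    a = 0; b = 0; #1 assert (y === 1'b0);
    a = 0; b = 1; #1 assert (y === 1'b0);
    a = 1; b = 0; #1 assert (y === 1'b0);
    a = 1; b = 1; #1 assert (y === 1'b1);
    $finish;
  end
endmodule
```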