Porting OpenHW’s CVA6 to LiteX

I’ve just seen the LiteX console appear on my terminal after booting my Digilent Nexys A7 board with a LiteX bitfile containing OpenHW’s CVA6 core.

        __   _ __      _  __
       / /  (_) /____ | |/_/
      / /__/ / __/ -_)>  <
   Build your hardware, easily!

 (c) Copyright 2012-2022 Enjoy-Digital
 (c) Copyright 2007-2015 M-Labs

 BIOS built on May 19 2022 12:08:20
 BIOS CRC passed (c172daf0)

 LiteX git sha1: 48b523cf

--=============== SoC ==================--
CPU:            CVA6 @ 75MHz
BUS:            WISHBONE 32-bit @ 4GiB
CSR:            32-bit data
ROM:            128KiB
SRAM:           8KiB
MAIN-RAM:       8KiB 

--========== Initialization ============--
Memtest at 0x40000000 (8.0KiB)...
  Write: 0x40000000-0x40002000 8.0KiB   
   Read: 0x40000000-0x40002000 8.0KiB   
Memtest OK
Memspeed at 0x40000000 (Sequential, 8.0KiB)...
  Write speed: 17179869183.10GiB/s
   Read speed: 17179869183.10GiB/s

--============== Boot ==================--
Booting from serial...
Press Q or ESC to abort boot completely.
No boot medium found

--============= Console ================--

litex> help
LiteX BIOS, available commands:

help                     - Print this help
ident                    - Identifier of the system
crc                      - Compute CRC32 of a part of the address space
flush_cpu_dcache         - Flush CPU data cache                                   
leds                     - Set Leds value                                         
boot                     - Boot from Memory                                       
reboot                   - Reboot
serialboot               - Boot from Serial (SFL)

mem_list                 - List available memory regions
mem_read                 - Read address space
mem_write                - Write address space
mem_copy                 - Copy address space
mem_test                 - Test memory access
mem_speed                - Test memory speed
mem_cmp                  - Compare memory content

Ident: LiteX SoC on Nexys4DDR 2022-05-19 12:07:58
crc <address> <length>
boot <address> [r1] [r2] [r3]
Available memory regions:
ROM       0x10000000 0x20000 
SRAM      0x20000000 0x2000 
MAIN_RAM  0x40000000 0x2000 
CSR       0x80000000 0x10000 


Since this was my first experiment with the tool, I got pretty excited and decided it was worth a post on PlanV’s brand new blog page. So, here we go!

I’ve already mentioned that I am new to LiteX, but I have been a long-time friend of CVA6, having contributed to its integration in Hensoldt Cyber’s MiG-V. Using it as the starting point for my adventure with LiteX was simply a natural choice.

Adding a core is actually an easy process; what you have to do is:

  • pack the source files in a dedicated pythondata repository
  • create a cva6 class which connects the Verilog ports to the LiteX ports

Wrapping CVA6

By looking at the cores already included in the tool, I noticed that the interface between the CPU and the rest of the system is normally composed of:

  • one or more “data” buses (Wishbone or AXI or OBI)
  • a bus of interrupt sources
  • JTAG signals
  • clock and reset

CVA6’s ports are:

  • AXI interface
  • hart ID
  • boot address
  • IRQ (2 bits, for M and S mode) – these normally come from the PLIC
  • IPI – this is normally generated by the CLINT
  • timer IRQ – this is normally generated by the CLINT
  • debug request – coming from the debug module
  • clock and reset

Hart ID and boot address can be hardcoded. The routing of the interrupt and debug request signals, on the other hand, puzzled me. The cv32e40p core shows an example of how to integrate the debug module (DM) at core level. I could have followed the same approach and integrated DM, PLIC and CLINT at LiteX level, but there were (and still are) some open questions (1) which prevented me from following this path. Therefore I decided to integrate all these components in a Verilog wrapper. The approach is not new in the LiteX ecosystem: I’ve seen the same done, e.g., for Rocket Chip.


The architecture is probably not optimal, since the data must traverse two interconnects (the AXI interconnect in the wrapper and the main Wishbone interconnect in the LiteX system), but it is good enough for a first attempt.

The memory mapping within the wrapper is:

Target   Start addr   Length
DM       0x00000000   0x1000
CLINT    0x02000000   0xC0000
PLIC     0x0C000000   0x3FFFFFF
others   0x10000000   0xEFFFFFFF
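
To make the table concrete, here is a minimal Python sketch of the address decoding it implies. It is not the wrapper's actual SystemVerilog. The names WRAPPER_MAP and decode are made up for this example, the PLIC length is read as 0x3FFFFFF, and treating each range as [start, start + length) is an assumption.

```python
# Address map of the wrapper, copied from the table above.
# Each entry is (target, start, length). The end of a range is treated as
# exclusive (start + length), which is an assumption of this sketch.
WRAPPER_MAP = [
    ("DM",     0x0000_0000, 0x1000),
    ("CLINT",  0x0200_0000, 0xC0000),
    ("PLIC",   0x0C00_0000, 0x3FF_FFFF),
    ("others", 0x1000_0000, 0xEFFF_FFFF),
]


def decode(addr):
    """Return the name of the target that claims addr, or None for a hole."""
    for name, start, length in WRAPPER_MAP:
        if start <= addr < start + length:
            return name
    return None


if __name__ == "__main__":
    for a in (0x0, 0x0200_4000, 0x0C00_0000, 0x1000_0000, 0x8000_0000):
        print(f"0x{a:08X} -> {decode(a)}")
```

Everything that decodes to "others" leaves the wrapper through the AXI port and reaches the LiteX system bus. Addresses in the gaps between DM, CLINT and PLIC match no target in this model.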

The pythondata-cpu-cva6 repository contains a snapshot of the original CVA6 repository. Together with the original source files and cva6_wrapper.sv, the pythondata repository also includes two SystemVerilog packages: ariane_pkg.sv, which is a copy of the one included in the CVA6 repository and specifies some parameters specific to the current implementation, and cva6_wrapper_pkg.sv, which defines the memory mapping within the wrapper as well as the configuration parameters for CVA6.

CVA6 class

This part of the design is quite trivial. I just had to connect the IOs of the cva6_wrapper module to the appropriate signals in the CVA6 class (derived from the CPU class). On top of this, the AXI interface has to be converted to Wishbone (since the system bus uses this protocol) and the memory map for the system peripherals has to be defined. The system peripherals have to be mapped to addresses at or above 0x10000000 (see the mapping done in the CVA6 wrapper).

Target   Start addr
ROM      0x10000000
SRAM     0x20000000
CSR      0x80000000

Note that the CSR address range must fall within the region declared in the io_region parameter.
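
A minimal sketch of the containment check behind this note, in plain Python rather than the actual LiteX code. The start addresses come from the table above and the sizes from the mem_list output at the top of the post. The IO_REGIONS value (upper 2 GiB treated as I/O space) and the helper name in_io_region are assumptions made for illustration.

```python
# Start addresses from the table above; sizes from the BIOS "mem_list" output.
MEM_MAP = {"rom": 0x1000_0000, "sram": 0x2000_0000, "csr": 0x8000_0000}
SIZES = {"rom": 0x20000, "sram": 0x2000, "csr": 0x10000}

# Assumption: the CPU treats the upper 2 GiB as uncached I/O space.
IO_REGIONS = {0x8000_0000: 0x8000_0000}  # origin: length


def in_io_region(origin, size, io_regions=IO_REGIONS):
    """True if [origin, origin + size) lies entirely inside one I/O region."""
    return any(
        io_origin <= origin and origin + size <= io_origin + io_length
        for io_origin, io_length in io_regions.items()
    )


if __name__ == "__main__":
    for name, origin in MEM_MAP.items():
        print(name, in_io_region(origin, SIZES[name]))
```

With these values only the CSR range lies inside the I/O region, while ROM and SRAM stay outside of it.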

The Verilog source files are included in the project by parsing the flist file in pythondata-cpu-cva6. CVA6 offers several versions of this file list, depending on the project configuration. For the time being I have only used the cv64a6_imafdc_sv39 one.
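
As a rough illustration of what parsing such a file list involves, here is a small Python sketch. It is not the code used in LiteX. The function name parse_flist, the sample content, the paths and the ${CVA6_REPO_DIR} placeholder are assumptions for this example. Only blank lines, comment lines, +incdir+ entries and plain source paths are handled.

```python
def parse_flist(text, variables):
    """Split a file list into (sources, include_dirs).

    Skips blank lines and lines starting with '//' or '#', expands ${NAME}
    placeholders, collects '+incdir+<path>' entries and ignores any other
    directive starting with '+' or '-'.
    """
    sources, include_dirs = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("//", "#")):
            continue
        for name, value in variables.items():
            line = line.replace("${" + name + "}", value)
        if line.startswith("+incdir+"):
            include_dirs.append(line[len("+incdir+"):])
        elif line.startswith(("+", "-")):
            continue  # unknown directive: skipped in this sketch
        else:
            sources.append(line)
    return sources, include_dirs


# Hypothetical excerpt, not the real file list.
SAMPLE = """\
// packages first
+incdir+${CVA6_REPO_DIR}/include
${CVA6_REPO_DIR}/include/ariane_pkg.sv

${CVA6_REPO_DIR}/src/example.sv
"""

if __name__ == "__main__":
    srcs, incs = parse_flist(SAMPLE, {"CVA6_REPO_DIR": "/path/to/cva6"})
    print(srcs)
    print(incs)
```

The order of the source entries is preserved, which matters because SystemVerilog packages have to be compiled before the modules that import them.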


I’ve tested the code both in Verilator and on an FPGA board.

To start a simulation with Verilator you just have to run

litex_sim --cpu-type=cva6 --trace

--trace is necessary to generate the waveforms for debugging.

To implement the code on FPGA, the command to run is

python3 -m litex_boards.targets.digilent_nexys4ddr --cpu-type=cva6 --build

from within the litex-boards folder.

The behaviour with Verilator and the FPGA is slightly different. The simulation showed no problems, while on the FPGA the console froze after printing a few characters.

The issue seems to be related to interrupt handling, but I’ve not yet found the root cause. I’ve only found that compiling the code with the flag -DUART_POLLING works around the problem.

Another problem I encountered is a failure during the SDRAM initialization:

--========== Initialization ============--
Initializing SDRAM @0x40000000...
Switching SDRAM to software control.
Read leveling:
  m0, b00: |00000000000000000000000000000000| delays: -
  m0, b01: |00000000000000000000000000000000| delays: -
  m0, b02: |11111111111110000000000000000000| delays: 08+-08
  m0, b03: |00000000000001111111111111111111| delays: 22+-09
  m0, b04: |00000000000000000000000000000000| delays: -
  m0, b05: |00000000000000000000000000000000| delays: -
  m0, b06: |00000000000000000000000000000000| delays: -
  m0, b07: |00000000000000000000000000000000| delays: -
  best: m0, b03 delays: -
  m1, b00: |00000000000000000000000000000000| delays: -
  m1, b01: |00000000000000000000000000000000| delays: -
  m1, b02: |11111111111100000000000000000000| delays: 07+-07
  m1, b03: |00000000000001111111111111111111| delays: 22+-09
  m1, b04: |00000000000000000000000000000000| delays: -
  m1, b05: |00000000000000000000000000000000| delays: -
  m1, b06: |00000000000000000000000000000000| delays: -
  m1, b07: |00000000000000000000000000000000| delays: -
  best: m1, b03 delays: -
Switching SDRAM to hardware control.
Memtest at 0x40000000 (2.0MiB)...
  Write: 0x40000000-0x40200000 2.0MiB     
   Read: 0x40000000-0x40200000 2.0MiB     
  bus errors:  14/256
  addr errors: 0/8192
  data errors: 520296/524288
Memtest KO
Memory initialization failed

I have not yet debugged this issue. It is possible to work around the problem by leaving the SDRAM out of the design, i.e. by building the system with the --integrated-main-ram-size option:

python3 -m litex_boards.targets.digilent_nexys4ddr --cpu-type=cva6 --integrated-main-ram-size=8192 --build
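
As a quick sanity check on the numbers, this small Python snippet relates the option value to the MAIN_RAM figures shown in the BIOS output at the top of the post. The variable names are only for illustration.

```python
size_bytes = 8192            # value passed to --integrated-main-ram-size
main_ram_base = 0x4000_0000  # MAIN_RAM origin reported by mem_list

print(hex(size_bytes))                  # 0x2000, the length shown by mem_list
print(size_bytes / 1024)                # 8.0, i.e. 8 KiB
print(hex(main_ram_base + size_bytes))  # 0x40002000, end of the memtest range
```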



(1) The open questions:

  • how to connect the interrupt variable of the CPU class (which contains all the interrupt sources at platform level) if the actual CPU expects a single IRQ bit (generated by the PLIC)?
  • how to define the memory mapping of the DM, PLIC and CLINT?
