RISC-V Von Neumann Architecture with 5-Stage Pipeline, Branch Predictor, L1 Cache and UART
HAVE FUN!
.
├── docs
│   ├── pic                     # pictures used
│   ├── Architecture.drawio     # design draft
│   ├── project_desciption.pdf  # project description
│   ├── Report.md               # report of this project
│   └── riscv-card.pdf          # ISA reference
├── generated
│   └── *.bit                   # bitstream files for different configurations
├── program
│   ├── lib                     # library of hardware API (driver)
│   ├── pacman                  # game in C++ (easier cross-compiling to RV)
│   └── testcase                # used for project presentation
├── sources
│   ├── assembly
│   │   ├── *.asm               # assembly code for test and fun
│   │   ├── *.coe               # hex version of machine code
│   │   └── *.txt               # data to be put into memory through UART
│   ├── constrain
│   │   └── constr.xdc          # constraint file
│   ├── core
│   │   ├── *.sv                # code of CPU core
│   │   └── *.svh               # header files for constants
│   ├── io
│   │   └── *.sv                # code related to IO and clock
│   ├── sim
│   │   ├── *.cpp               # Verilator simulation
│   │   └── *.sv                # Vivado simulation
│   └── Top.sv                  # top module of MineCPU
├── test
│   ├── DiffTest.cpp            # differential test of CPU
│   ├── *.sv                    # on-board test code
│   └── *.xdc                   # on-board test constraints
├── tools
│   ├── inst2txt.py             # instruction to text file for UART
│   ├── ecall2sv.py             # coe to code for burning into ROM
│   └── UARTAssist.exe          # tool for UART
├── .gitignore
├── LICENSE
└── README.md
An asterisk (*) marks a bonus feature.
- Core
  - IF Stage
    - Branch Prediction *
      - Pre-Decode *
      - Branch History Table, BHT *
      - Return Address Stack, RAS *
    - Instruction Cache *
      - Direct Mapping *
      - Pre-Fetch *
  - ID Stage
    - Immediate Generation
    - Register File
    - Control Unit
    - Hazard Detection
  - EX Stage
    - ALU
      - RV32I
      - RV32M *
    - BRU
    - Forward Unit *
  - MEM Stage
    - Byte / Halfword / Word Memory Access
    - Data Cache *
      - Direct Mapping *
      - Write Back *
  - WB Stage
  - Memory
    - BRAM
    - MMIO
    - ROM
  - ecall & sret *
- IO
  - Switch & Button
  - 4*4 Builtin Keyboard *
  - Led & 7 Segment Display
  - UART *
  - VGA *
- Software
  - Testcase #1
  - Testcase #2
  - Pacman *
Powered by draw.io
- Von Neumann architecture, RISC-V RV32IM ISA support, 5-stage pipeline, IPC ~0.95
- 32-bit registers
- Memory space: 32-bit (4-byte)
- Clock:
  - CPU: maximum 50 MHz
  - MEM: shared with CPU
  - VGA: 40 MHz
- Branch Predictor:
  - BHT: 32 entries, 2 bits
  - RAS: 32 entries, 32 bits
- Cache:
  - ICache: direct-mapped, 1472 bits, 32 entries
  - DCache: direct-mapped, write-back, 1504 bits, 32 entries
- Trap:
  - ecall: internally triggered to access MMIO, see Environment Call for detail
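
A BHT with 32 entries of 2 bits each has the shape of a classic 2-bit saturating-counter predictor. The following is a minimal Python sketch under that assumption. The class name `Bht`, the indexing by PC bits [6:2] and the initial counter value are hypothetical choices for illustration and are not taken from the MineCPU sources.

```python
class Bht:
    """Toy 32-entry branch history table of 2-bit saturating counters."""

    ENTRIES = 32

    def __init__(self):
        # Assumption: every counter starts at 1 ("weakly not taken").
        self.counters = [1] * self.ENTRIES

    def _index(self, pc):
        # Assumption: index with PC bits [6:2] (instructions are 4-byte aligned).
        return (pc >> 2) % self.ENTRIES

    def predict(self, pc):
        # Counter values 2 and 3 mean "predict taken".
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        # Move the counter one step toward the actual outcome, saturating at 0 and 3.
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


bht = Bht()
history = [True, True, True, False, True]  # outcomes of one branch at PC 0x40
hits = 0
for taken in history:
    hits += bht.predict(0x40) == taken
    bht.update(0x40, taken)
print(hits)  # number of correct predictions
```

The point of the two bits is hysteresis: a single not-taken outcome in a mostly taken loop branch does not flip the next prediction.
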
Supports RV32IM ISA
Instruction | Type | Operation |
---|---|---|
add rd, rs1, rs2 | R | rd = rs1 + rs2 |
sub rd, rs1, rs2 | R | rd = rs1 - rs2 |
xor rd, rs1, rs2 | R | rd = rs1 ^ rs2 |
or rd, rs1, rs2 | R | rd = rs1 \| rs2 |
and rd, rs1, rs2 | R | rd = rs1 & rs2 |
sll rd, rs1, rs2 | R | rd = rs1 << rs2 |
srl rd, rs1, rs2 | R | rd = rs1 >> rs2 |
sra rd, rs1, rs2 | R | rd = rs1 >> rs2 (sign-extend) |
slt rd, rs1, rs2 | R | rd = ( rs1 < rs2 ) ? 1 : 0 |
sltu rd, rs1, rs2 | R | rd = ( (u)rs1 < (u)rs2 ) ? 1 : 0 |
addi rd, rs1, imm | I | rd = rs1 + imm |
xori rd, rs1, imm | I | rd = rs1 ^ imm |
ori rd, rs1, imm | I | rd = rs1 \| imm |
andi rd, rs1, imm | I | rd = rs1 & imm |
slli rd, rs1, imm | I | rd = rs1 << imm[4:0] |
srli rd, rs1, imm | I | rd = rs1 >> imm[4:0] |
srai rd, rs1, imm | I | rd = rs1 >> imm[4:0] (sign-extend) |
slti rd, rs1, imm | I | rd = (rs1 < imm) ? 1 : 0 |
sltiu rd, rs1, imm | I | rd = ( (u)rs1 < (u)imm ) ? 1 : 0 |
lb rd, imm(rs1) | I | Read 1 byte and sign-extend |
lh rd, imm(rs1) | I | Read 1 half-word (2 bytes) and sign-extend |
lw rd, imm(rs1) | I | Read 1 word (4 bytes) |
lbu rd, imm(rs1) | I | Read 1 byte and zero-extend |
lhu rd, imm(rs1) | I | Read 1 half-word (2 bytes) and zero-extend |
sb rs2, imm(rs1) | S | Store 1 byte |
sh rs2, imm(rs1) | S | Store 1 half-word (2 bytes) |
sw rs2, imm(rs1) | S | Store 1 word (4 bytes) |
beq rs1, rs2, label | B | if (rs1 == rs2) PC += (imm << 1) |
bne rs1, rs2, label | B | if (rs1 != rs2) PC += (imm << 1) |
blt rs1, rs2, label | B | if (rs1 < rs2) PC += (imm << 1) |
bge rs1, rs2, label | B | if (rs1 >= rs2) PC += (imm << 1) |
bltu rs1, rs2, label | B | if ( (u)rs1 < (u)rs2 ) PC += (imm << 1) |
bgeu rs1, rs2, label | B | if ( (u)rs1 >= (u)rs2 ) PC += (imm << 1) |
jal rd, label | J | rd = PC + 4; PC += (imm << 1) |
jalr rd, rs1, imm | I | rd = PC + 4; PC = rs1 + imm |
lui rd, imm | U | rd = imm << 12 |
auipc rd, imm | U | rd = PC + (imm << 12) |
ecall | I | Transfer control to firmware (ROM) |
sret * | I | Transfer back to user program |
mul rd, rs1, rs2 * | R | rd = (rs1 * rs2)[31:0] |
mulh rd, rs1, rs2 * | R | rd = (rs1 * rs2)[63:32] |
mulhsu rd, rs1, rs2 * | R | rd = (rs1 * (u)rs2)[63:32] |
mulhu rd, rs1, rs2 * | R | rd = ( (u)rs1 * (u)rs2 )[63:32] |
div rd, rs1, rs2 * | R | rd = rs1 / rs2 |
rem rd, rs1, rs2 * | R | rd = rs1 % rs2 |
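
The RV32M rows are the easiest to get wrong in a reference model, because the signedness of each operand and the rounding direction matter. Below is a minimal Python sketch of the semantics as defined by the RISC-V specification (division truncates toward zero, division by zero yields all ones, remainder by zero yields the dividend). Whether MineCPU handles the corner cases identically is not stated in this README, so treat it as a reference model rather than a description of the hardware. The function names are illustrative.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x):
    """Interpret the low 32 bits of x as a two's-complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def mul(a, b):
    return (to_signed(a) * to_signed(b)) & MASK32

def mulh(a, b):      # signed x signed, upper 32 bits
    return ((to_signed(a) * to_signed(b)) >> 32) & MASK32

def mulhsu(a, b):    # signed x unsigned, upper 32 bits
    return ((to_signed(a) * (b & MASK32)) >> 32) & MASK32

def mulhu(a, b):     # unsigned x unsigned, upper 32 bits
    return (((a & MASK32) * (b & MASK32)) >> 32) & MASK32

def div(a, b):
    a, b = to_signed(a), to_signed(b)
    if b == 0:
        return MASK32                 # spec: quotient is -1 (all ones)
    q = abs(a) // abs(b)              # truncate toward zero, unlike Python's //
    if (a < 0) != (b < 0):
        q = -q
    return q & MASK32                 # also wraps the -2**31 / -1 overflow case

def rem(a, b):
    a, b = to_signed(a), to_signed(b)
    if b == 0:
        return a & MASK32             # spec: remainder is the dividend
    r = abs(a) % abs(b)
    return (-r if a < 0 else r) & MASK32   # sign follows the dividend

print(hex(div(-7, 2)), hex(rem(-7, 2)))
```

Python integers are unbounded, so every result is masked back to 32 bits. The right shift of a negative product is arithmetic in Python, which is what extracting bits [63:32] of a signed 64-bit product requires.
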
No. (a7) | Arguments | Operation | Return Value |
---|---|---|---|
0x01 | a0 | Write 1 byte to LED #1 | N/A |
0x02 | a0 | Write 1 byte to LED #2 | N/A |
0x03 | a0 | Write 4 bytes to 7Seg | N/A |
0x05 | N/A | Read 1 byte from switch #1 | a0 |
0x06 | N/A | Read 1 byte from switch #2 | a0 |
0x07 | N/A | Read 1 byte from switch #3 | a0 |
0x0A | N/A | End of program (Idle loop) | N/A |
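
To make the calling convention in the table concrete, here is a small host-side Python model of an ecall dispatcher. It is a sketch, not the ROM firmware: the pairing of each call number with an MMIO address is an assumption based on the MMIO table in this README, and the `regs` and `mmio` dictionaries are hypothetical stand-ins for the register file and the device registers.

```python
# Assumed pairing of ecall numbers with MMIO addresses from the MMIO table.
WRITE_CALLS = {
    0x01: (0xFFFFFF0C, 0xFF),        # LED #1, 1 byte
    0x02: (0xFFFFFF10, 0xFF),        # LED #2, 1 byte
    0x03: (0xFFFFFF28, 0xFFFFFFFF),  # 7-segment display, 4 bytes
}
READ_CALLS = {
    0x05: 0xFFFFFF00,  # switch #1
    0x06: 0xFFFFFF04,  # switch #2
    0x07: 0xFFFFFF08,  # switch #3
}
END_OF_PROGRAM = 0x0A

def ecall(regs, mmio):
    """Emulate one ecall; return False once the program has ended."""
    number = regs["a7"]
    if number in WRITE_CALLS:
        address, mask = WRITE_CALLS[number]
        mmio[address] = regs["a0"] & mask
    elif number in READ_CALLS:
        regs["a0"] = mmio.get(READ_CALLS[number], 0) & 0xFF
    elif number == END_OF_PROGRAM:
        return False
    else:
        raise ValueError(f"unsupported ecall number {number:#x}")
    return True

# Copy switch #1 to LED #1, then end the program.
mmio = {0xFFFFFF00: 0x2A}
regs = {"a0": 0, "a7": 0x05}
ecall(regs, mmio)
regs["a7"] = 0x01
ecall(regs, mmio)
regs["a7"] = 0x0A
running = ecall(regs, mmio)
print(hex(mmio[0xFFFFFF0C]), running)
```
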
- MMIO (memory-mapped IO) is used to perform IO and to support UART
- UART
  - Burn program and data into memory through UART without reprogramming the FPGA
  - Baud rate: 115200, 8 data bits, 1 stop bit
  - Data is written directly into memory; the CPU starts after data has been received and the line has been idle for more than 0.5 s
- Input
  - 24 switches
  - 5 buttons
  - 4 × 4 builtin keyboard
- Output
  - 24 LEDs, 8 of which show CPU status
  - 7-segment display to print 4 bytes
- VGA
  - buffers with builtin fonts and colors
  - 800×600 60Hz
  - font: 8×16, 96×32 characters fullscreen with ratio 3 : 2
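
The UART settings above give a rough estimate of how long an upload takes. The sketch below assumes a standard frame of one start bit, 8 data bits and one stop bit with no parity (parity is not mentioned in this README), and adds the 0.5 s idle period after which the CPU starts. The function name is illustrative.

```python
BAUD = 115200
BITS_PER_FRAME = 1 + 8 + 1   # start + data + stop; assumes no parity bit
IDLE_TIMEOUT_S = 0.5         # idle time before the CPU starts

def upload_seconds(num_bytes):
    """Estimated time from the first byte sent until the CPU starts."""
    return num_bytes * BITS_PER_FRAME / BAUD + IDLE_TIMEOUT_S

# A full 64 KiB image (32-bit words x 16384 entries of block RAM).
print(f"{upload_seconds(64 * 1024):.2f} s")
```

At 10 bits per byte the link carries 11520 bytes per second, so small test programs load almost instantly and the idle timeout dominates.
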
MMIO Address
Address | R/W | Note | Range |
---|---|---|---|
0xFFFFFF00 | R | Switch #1 (8) | 0x00 - 0xFF |
0xFFFFFF04 | R | Switch #2 (8) | 0x00 - 0xFF |
0xFFFFFF08 | R | Switch #3 (8) | 0x00 - 0xFF |
0xFFFFFF0C | W | LED #1 (8) | 0x00 - 0xFF |
0xFFFFFF10 | W | LED #2 (8) | 0x00 - 0xFF |
0xFFFFFF14 | R | Button 1 (Center) | 0x00 - 0x01 |
0xFFFFFF18 | R | Button 2 (Up) | 0x00 - 0x01 |
0xFFFFFF1C | R | Button 3 (Down) | 0x00 - 0x01 |
0xFFFFFF20 | R | Button 4 (Left) | 0x00 - 0x01 |
0xFFFFFF24 | R | Button 5 (Right) | 0x00 - 0x01 |
0xFFFFFF28 | W | 7 Segment Display | 0x00000000 - 0xFFFFFFFF |
0xFFFFFF2C | R | 4*4 Keyboard Status | 0x00 - 0x01 |
0xFFFFFF30 | R | 4*4 Keyboard Location | 0x00 - 0x0F |
0xFFFFE___ (000-BFF) | W | VGA Font | 0x00 - 0xFF |
0xFFFFD___ (000-BFF) | W | VGA Color | 0x00 - 0xFF |
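
The VGA ranges line up with the 96×32 character grid: 96 × 32 = 3072 = 0xC00 cells, which is exactly offsets 0x000 to 0xBFF. A minimal Python sketch of the address calculation follows, assuming one byte per cell and a row-major layout (the layout is an assumption, as it is not spelled out here). The function name is illustrative.

```python
VGA_FONT_BASE = 0xFFFFE000
VGA_COLOR_BASE = 0xFFFFD000
COLS, ROWS = 96, 32

def cell_addresses(row, col):
    """Return (font address, color address) of one character cell."""
    if not (0 <= row < ROWS and 0 <= col < COLS):
        raise ValueError("cell outside the 96x32 text grid")
    offset = row * COLS + col  # assumption: row-major, one byte per cell
    return VGA_FONT_BASE + offset, VGA_COLOR_BASE + offset

font_addr, color_addr = cell_addresses(31, 95)  # bottom-right cell
print(hex(font_addr), hex(color_addr))
```

With an 8×16 font the grid covers 768×512 pixels of the 800×600 frame, which is where the 3 : 2 ratio comes from.
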
A little game in assembly, as a showcase of the capability of MineCPU.
- Create a Vivado project: set the project device to xc7a100tfgg484-1 and the target language to SystemVerilog, import all code in sources, sources/core and sources/io, and then import constr.xdc from sources/constrain.
- Create IP
  - Create Clocking Wizard
    - Rename to VGAClkGen
    - Select PLL Clock
    - Modify clk_in1: Source to Global buffer
    - Set frequency of clk_out1 to 40MHz and uncheck reset and locked signal
  - Create Block Memory Generator
    - Name it Mem
    - Select True Dual Port RAM as Memory Type
    - Modify Write Width of Port A to 32 and Write Depth to 16384 (Read Width, Read Depth and Port B will also be changed automatically)
- Run Synthesis -> Implementation -> Generate Bitstream to generate the bitstream (*.bit)
- Alternatively, use a pre-generated bitstream to program the FPGA
Compile the source code to a binary: open the source with RARS, run it, click File -> Dump Memory, select Hexadecimal Text as the Dump Format, and click Dump To File...
Alternatively, cross-compile C++ code for RISC-V and then convert it to a binary:
riscv64-unknown-elf-g++ -march=rv32im -mabi=ilp32 --static your-program.c -o your-program
riscv64-unknown-elf-objcopy -O binary your-program your-program.bin
hexdump -v -e '1/4 "%08x" "\n"' your-program.bin > your-program.txt
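
On hosts without `hexdump`, the last command can be approximated in Python. This is a sketch that assumes a little-endian host (`hexdump` formats 4-byte units in host byte order) and pads a trailing partial word with zero bytes. The function name is illustrative and is not part of the project's tools.

```python
import struct

def words_to_hex_lines(data):
    """Format raw bytes as one 8-digit hex word per line (little-endian words)."""
    padded = data + b"\x00" * (-len(data) % 4)  # pad to a whole number of words
    return [f"{word:08x}" for (word,) in struct.iter_unpack("<I", padded)]

# addi x0, x0, 0 (nop) followed by a second all-zero word.
image = bytes([0x13, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00])
for line in words_to_hex_lines(image):
    print(line)
```

To convert a real file, read it with `open("your-program.bin", "rb").read()` and write the returned lines to `your-program.txt`, one per line.
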
Then convert the hex file to txt with inst2txt.py: change the path inside the script and run python inst2txt.py
Open UARTAssist.exe, select COM6 with a baud rate of 115200, open the connection and send the text file generated above as hex.
Or if you're using Linux/Mac, you can use minicom. You know how to do it.
- Program: riscv-gnu-toolchain cross compiler
- Simulation: Verilator, Unicorn for simulation of instruction in differential test
- Serial: UARTAssist serial tool, inst2txt to translate the binary to hex