WF FPGA Ideas: Difference between revisions
Revision as of 14:35, 27 December 2025
SDCard
Auto-tx on read. Superseded by an auto-read of a 16-bit length to a storage pointer, running in the background.
Stream an MP3 or MIDI file from disk (or DDR3?) straight to the chips.
FPGA-based CPU
65816 but with a genuine 16-bit data bus
Bitstream readers/writers
Write a byte or word to an FPGA location; the write takes one CPU cycle and bumps the location's pointer.
For a bit stream, a 32-bit bit pointer covers 4 Gbit = 512 MB. Each write needs to know how many bits to write. Maybe 16 registers: writing a value to one of them selects how many bits of the written value go into the stream. This lets the CPU's index register choose the width dynamically, which is nice. Both the read and write interfaces should use this.
Separate read and write contexts, so copies, decompression, etc. can be done. The bit pointers can also be read and written directly. Pointers only ever advance in the positive direction, though, at least for now.
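A minimal software model of the idea above may help pin down the semantics. It is only a sketch: the class name `BitStreamContext`, the mapping of register index `n` to a width of `n + 1` bits, and the most-significant-bit-first packing are all assumptions, not decisions recorded on this page.

```python
class BitStreamContext:
    """Software model of one bit-pointer context (hypothetical naming)."""

    POINTER_MASK = 0xFFFFFFFF  # 32-bit bit pointer: 2**32 bits = 512 MiB

    def __init__(self, memory, bit_pointer=0):
        self.memory = memory  # shared bytearray standing in for RAM
        self.bit_pointer = bit_pointer & self.POINTER_MASK

    @staticmethod
    def _width(index):
        # Assumption: register n transfers n + 1 bits, so 16 registers give 1..16.
        if not 0 <= index <= 15:
            raise ValueError("register index must be 0..15")
        return index + 1

    def write_register(self, index, value):
        """Append the low (index + 1) bits of value, MSB first (assumed order)."""
        for i in range(self._width(index) - 1, -1, -1):
            byte_index, bit_index = divmod(self.bit_pointer, 8)
            mask = 0x80 >> bit_index
            if (value >> i) & 1:
                self.memory[byte_index] |= mask
            else:
                self.memory[byte_index] &= ~mask & 0xFF
            # The pointer only ever moves forward.
            self.bit_pointer = (self.bit_pointer + 1) & self.POINTER_MASK

    def read_register(self, index):
        """Return the next (index + 1) bits as an integer and advance the pointer."""
        value = 0
        for _ in range(self._width(index)):
            byte_index, bit_index = divmod(self.bit_pointer, 8)
            bit = (self.memory[byte_index] >> (7 - bit_index)) & 1
            value = (value << 1) | bit
            self.bit_pointer = (self.bit_pointer + 1) & self.POINTER_MASK
        return value


def copy_bits(reader, writer, count):
    """Copy count bits between two independent contexts, up to 16 bits at a time."""
    while count > 0:
        step = min(count, 16)
        writer.write_register(step - 1, reader.read_register(step - 1))
        count -= step
```

Because the width comes from the register index, a CPU loop could pick the width per access with an indexed store, and `copy_bits` shows why separate read and write contexts are useful.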
8-bit interface: 2 bit pointers, then 8 byte locs for pointer 0, and 8 byte locs for pointer 1.
16-bit interface: 2 bit pointers, then 16 word locs for pointer 0, and 16 word locs for pointer 1. Writes are triggered on the high byte write. Reads trigger on the low byte read, which readies the high byte.
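The byte-triggered behaviour of a word location can be sketched as a pair of latches. The `WordPort` class and its `commit`/`fetch` callbacks are hypothetical stand-ins for the stream logic, and the sketch assumes the CPU touches the low byte before the high byte.

```python
class WordPort:
    """Model of one 16-bit location accessed over an 8-bit bus (hypothetical)."""

    def __init__(self, commit, fetch):
        self._commit = commit  # called with the full 16-bit value on a high-byte write
        self._fetch = fetch    # called to obtain a 16-bit value on a low-byte read
        self._low_latch = 0
        self._high_latch = 0

    def write_low(self, byte):
        # The low byte is only latched; nothing reaches the stream yet.
        self._low_latch = byte & 0xFF

    def write_high(self, byte):
        # The high byte write triggers the actual stream write.
        self._commit(((byte & 0xFF) << 8) | self._low_latch)

    def read_low(self):
        # The low byte read triggers the stream read and readies the high byte.
        word = self._fetch() & 0xFFFF
        self._high_latch = word >> 8
        return word & 0xFF

    def read_high(self):
        # Returns the byte latched by the preceding low-byte read.
        return self._high_latch
```

A consequence of this scheme is that reading only the high byte returns stale data, so software would always need to read the low byte first.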
This is a CPU-blocking interface for reads, buffered for writes.
RLE Format(s)
RLE layers, DMA, and potentially sprites can use RLE encoding.
Span-based RLE formats

| bpp | Layout | Length | Max compression | Breakeven |
|---|---|---|---|---|
| 1 | `clllllll` | 1-128 | 16:1 byte | 8 pixels |
| 2 | `ccllllll` | 1-64 | 16:1 byte | 4 pixels |
| 4 | `ccccllll` | 1-16 | 8:1 byte | 2 pixels |
| 4 | `ccccCCCC llllllll llllllll` | 1-256 | about 85:1 byte (256:3; 512 pixels in 3 bytes) | 3+3 pixels |
| 8 | `cccccccc llllllll` | 1-256 | 128:1 byte | 2 pixels |
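The compression and breakeven columns follow from simple arithmetic, sketched here as a check. The function names are made up for illustration, and "max compression" is taken to mean raw bytes of pixel data per encoded byte. By this arithmetic the one-byte 4bpp format comes out at 8:1, and the two-colour 4bpp format at 256:3 in bytes (512 pixels in 3 bytes).

```python
from fractions import Fraction


def max_compression(bpp, colours, max_span, record_bytes):
    """Raw bytes covered by one full-length record, per encoded byte."""
    raw_bits = bpp * colours * max_span
    return Fraction(raw_bits, 8 * record_bytes)


def breakeven_pixels(bpp, record_bytes):
    """Pixel count whose raw size equals the size of one record."""
    return (8 * record_bytes) // bpp


# (bpp, colours per record, max span per colour, record size in bytes)
FORMATS = {
    "clllllll": (1, 1, 128, 1),
    "ccllllll": (2, 1, 64, 1),
    "ccccllll": (4, 1, 16, 1),
    "ccccCCCC llllllll llllllll": (4, 2, 256, 3),
    "cccccccc llllllll": (8, 1, 256, 2),
}

for layout, (bpp, colours, max_span, size) in FORMATS.items():
    ratio = max_compression(bpp, colours, max_span, size)
    print(f"{layout}: {float(ratio):.1f}:1, breakeven {breakeven_pixels(bpp, size)} px")
```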
However, it would be useful to have spans of literal pixels as well, instead of just solid color span fills.
0lllllll cccccccc = span length L of color C, 0 = transparent
1lllllll cccccccc... = L count of individual pixels
For a bpp less than 8, probably require literal runs to fill a whole number of bytes or words.
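The two record types above could be decoded roughly as follows for the 8bpp case. This is a sketch under assumptions: the 7-bit field is taken to hold the length directly (so a stored 0 produces no pixels), and `decode_rle` is a hypothetical name. The page does not say whether lengths are stored as L or L-1.

```python
def decode_rle(data: bytes) -> list[int]:
    """Expand span and literal records into a flat list of 8bpp pixels."""
    pixels = []
    pos = 0
    while pos < len(data):
        header = data[pos]
        length = header & 0x7F  # assumed: stored value is the length itself
        pos += 1
        if header & 0x80:
            # Literal run: the next `length` bytes are individual pixels.
            literal = data[pos:pos + length]
            if len(literal) != length:
                raise ValueError("truncated literal run")
            pixels.extend(literal)
            pos += length
        else:
            # Solid span: one colour byte repeated `length` times.
            if pos >= len(data):
                raise ValueError("truncated span record")
            pixels.extend([data[pos]] * length)
            pos += 1
    return pixels
```

For example, the bytes `03 07 82 01 02 02 00` would expand to three pixels of colour 7, the literals 1 and 2, then two pixels of colour 0.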
For now, RLE layers should be simple length + 8bpp aligned words. RLE bitmaps would be something different; that may be too flexible, so perhaps we should just leave it to the CPU.
DMA/Blitter
Maybe separate out 2d mode into its own blitter?
Flag to mask out the 'fill/mask' color (default 0)
Clip to output screen dimensions.
Xflip, yflip, maybe 90° rotation, but that means dest dimensions change? Scaling? Full affine transform?
Unpack RLE graphics, for better memory usage. Could still do x/y flip because this isn't raster-dependent. Must know the total x/y though if clipping is supported.
Fields:
- bpp (could expand from src to dest given an offset?)
- src w/h/stride
- dest w/h/stride
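As a rough illustration of how these fields interact, here is a software sketch of an 8bpp blit with a mask colour and x/y flips. The function name and parameter list are hypothetical, clipping is left out (the caller must keep the rectangle inside the destination), and bpp expansion is not modelled.

```python
def blit(src, src_w, src_h, src_stride,
         dest, dest_stride, dest_x, dest_y,
         mask_color=None, xflip=False, yflip=False):
    """Copy a src_w x src_h block of 8bpp pixels into dest at (dest_x, dest_y)."""
    for y in range(src_h):
        sy = src_h - 1 - y if yflip else y      # flip by reading rows backwards
        for x in range(src_w):
            sx = src_w - 1 - x if xflip else x  # flip by reading columns backwards
            pixel = src[sy * src_stride + sx]
            if mask_color is not None and pixel == mask_color:
                continue  # masked pixels leave the destination untouched
            dest[(dest_y + y) * dest_stride + dest_x + x] = pixel
```

Keeping stride separate from width lets the source be a sub-rectangle of a larger sheet, and the flips only change the read order, so the destination footprint stays the same (unlike a 90° rotation).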
TODO - V2
Clipping? Or should the src/dest be handled in software?
Ideally, there'd be a clip bounds defined at the dest address, w, h, stride, bpp. The source address is defined, and it's blitted into an x/y in the dest screen, automatically clipped.
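The automatic clipping described here amounts to intersecting the placed source rectangle with the destination bounds. A minimal sketch, assuming the clip region starts at (0, 0) in the destination and using a hypothetical function name:

```python
def clip_blit(src_w, src_h, dest_x, dest_y, clip_w, clip_h):
    """Return (src_x, src_y, out_x, out_y, w, h) for the visible part, or None."""
    x0 = max(dest_x, 0)
    y0 = max(dest_y, 0)
    x1 = min(dest_x + src_w, clip_w)
    y1 = min(dest_y + src_h, clip_h)
    if x0 >= x1 or y0 >= y1:
        return None  # entirely outside the clip bounds
    # The source offset is how far the top-left corner was pushed inward.
    return (x0 - dest_x, y0 - dest_y, x0, y0, x1 - x0, y1 - y0)
```

The returned source offset and reduced width/height could feed an unclipped blit directly, which is one way to split the work between hardware and software.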
If there end up being a large number of parameters for a src or dest, it would be nice to have multiple profiles. Either read src/dest profiles from RAM, or have, say, 4 srcs and 4 dests saved, and blit from src N to dest M.
RLE graphics should probably save their w/h and mode implicitly as the first 2 words, as they are their own free-form shapes.