Hacking Up an RGB Framebuffer Driver for Wii-Linux – Take Two

Note: For background information and ‘take one’, please see the previous post on this topic. There will be as little repetition as possible in this post.

After the previous post on this topic went online, Malcolm Parsons from DSlinux commented that modifications to the virtual framebuffer content can be tracked using kernel memory management mechanisms, and mentioned the deferred IO support in the kernel framebuffer subsystem as an example. The GC-Linux developers later concurred with this suggestion. In this post, we will take a look at this idea.

Deferred IO Explained, Maybe Poorly

Deferred IO (defio), as the name amply states, is a scheme in which the actual IO operation, which usually involves sending data to a high-latency (slow) device, is deferred (delayed) until certain criteria are met. The criterion can be as simple as a predefined time interval elapsing. Common use cases of defio can be seen in device write routines, for example in filesystem drivers. By reducing the number of high-latency device writes, defio improves driver performance in certain situations.

Defio was introduced into the kernel framebuffer subsystem (referred to as fb_defio from here on) back in 2007 to support e-ink (or e-paper) devices. These devices combine low screen refresh rates with high access latency, making them ideal beneficiaries of defio. Interestingly, the documentation for fb_defio mentions that it could also be useful for ‘a device framebuffer that is in an unusual format’. It is almost certain that the Wii, or the GameCube for that matter, was not what the author had in mind when that part of the documentation was written, but it is certainly a welcome sentence to come upon in our context.

Let’s first briefly discuss how fb_defio works. Since the physical framebuffer on the device is slow and/or costly to write to, a virtual framebuffer is set up in system memory. This is just what was done in the previous attempt for the Wii, although in our case the reason was mainly format incompatibility rather than write latency. The virtual framebuffer can be written to through three different paths:

  • A, the kernel fbcon console driver writes directly to it through fb_ops, including fb_fillrect, fb_copyarea and fb_imageblit;
  • B, a userspace program can open /dev/fb0 as a file and write directly to the virtual framebuffer through fb_write. This is what ‘cat file > /dev/fb0’ actually does behind the scenes;
  • C, a userspace program can mmap the virtual framebuffer into its own address space and write to it. Most well-behaved programs, including the fbdev drivers for X and SDL, do this.
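Path C can be sketched from the userspace side as follows. This is only a minimal illustration: the function name fb_fill_mmap is made up for this example, and the mapping length is passed in by the caller as an assumption. A real program would normally ask the driver for the length (smem_len) through the FBIOGET_FSCREENINFO ioctl before mapping /dev/fb0.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: fill the first `len` bytes of a framebuffer-like
 * file with one 16-bit pixel value through a shared mapping (path C).
 * Returns 0 on success, -1 on failure.
 */
int fb_fill_mmap(const char *path, size_t len, uint16_t pixel)
{
	assert(len % sizeof(uint16_t) == 0);

	int fd = open(path, O_RDWR);
	if (fd < 0) {
		perror("open");
		return -1;
	}

	/* MAP_SHARED so that the stores reach the underlying buffer */
	uint16_t *fb = mmap(NULL, len, PROT_READ | PROT_WRITE,
			    MAP_SHARED, fd, 0);
	if (fb == MAP_FAILED) {
		perror("mmap");
		close(fd);
		return -1;
	}

	/* Plain memory stores; no write() call is involved on this path */
	for (size_t i = 0; i < len / sizeof(uint16_t); i++)
		fb[i] = pixel;

	munmap(fb, len);
	close(fd);
	return 0;
}
```

On a framebuffer device, a call might look like fb_fill_mmap("/dev/fb0", smem_len, 0xF800), with smem_len obtained from the driver. Because the program only performs ordinary memory stores, the kernel never sees individual writes, which is why this path needs page-level tracking.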

For A and B, nothing is deferred: all writes are processed immediately and sent to the device, because there is no trivial way of tracking and aggregating the writes to the virtual framebuffer for later processing. For C, however, defio is possible thanks to kernel memory management, which can detect which page of the mmapped framebuffer is being written to and call a callback function in fb_defio.c. The callback stores the page info in a linked list and starts a timer set to a delay that the driver specifies at compile time. Until the timer expires, all subsequent writes only result in updates to the linked list. When the specified interval has passed, fb_defio calls a callback function in the framebuffer driver with the list of modified pages. The driver traverses the list and writes the modified content to the physical framebuffer. Upon return from the driver callback, fb_defio clears the list of modified pages and waits for the next round of writes.
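The bookkeeping described above can be modelled in plain userspace C. The sketch below is a simulation, not kernel code: all names (defio_sim, defio_sim_write, defio_sim_timer_expired) are invented for illustration, a bitmap stands in for the linked list that fb_defio keeps, and the page size and page count are assumed values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_PAGE_SIZE 4096u	/* assumed page size */
#define SIM_MAX_PAGES 512u	/* assumed framebuffer size, in pages */

struct defio_sim {
	bool dirty[SIM_MAX_PAGES];	/* stands in for fb_defio's page list */
	bool timer_armed;		/* true while a flush is pending */
	unsigned flush_calls;		/* how often the "driver" was called */
};

/* A write hit byte `offset`: remember the page and arm the timer. */
void defio_sim_write(struct defio_sim *s, size_t offset)
{
	size_t page = offset / SIM_PAGE_SIZE;

	assert(page < SIM_MAX_PAGES);
	s->dirty[page] = true;	/* repeated writes to a page cost nothing extra */
	s->timer_armed = true;
}

/*
 * The delay expired: hand every dirty page to the "driver", then clear
 * the bookkeeping. Returns the number of pages that were flushed.
 */
unsigned defio_sim_timer_expired(struct defio_sim *s)
{
	unsigned flushed = 0;

	if (!s->timer_armed)
		return 0;

	for (size_t i = 0; i < SIM_MAX_PAGES; i++) {
		if (s->dirty[i]) {
			flushed++;	/* a real driver would convert and copy this page */
			s->dirty[i] = false;
		}
	}
	s->timer_armed = false;
	s->flush_calls++;
	return flushed;
}
```

Four writes that land in three distinct pages lead to a single flush of three pages in this model. That aggregation is the whole point of deferring.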

What About Wii?

So how can all this benefit the Wii framebuffer driver? The previous attempt made it obvious that display efficiency suffers from the need to convert (or, in partial update, compare) every pixel pair on the screen, even when most of them have not changed, as is the case for the majority of usual desktop usage. Fb_defio suggests using memory management to reduce the wasted conversions and thereby improve performance for userspace programs that mmap the framebuffer. In other words, we can get partial update without having to perform whole-screen comparisons every time. What is useful is not deferred IO per se (we were already doing that by only performing conversions at vtrace time), but rather the monitoring of memory page modifications.
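To see how a modified page translates into a region of the screen, here is a small sketch that maps a page index to the scanlines it overlaps. The helper name page_to_rows and the struct row_span are hypothetical, and the numbers in the usage note (4096-byte pages, a 640×480 virtual framebuffer at 16 bits per pixel, so 1280 bytes per line) are assumptions for illustration rather than values taken from the driver.

```c
#include <assert.h>
#include <stddef.h>

struct row_span {
	unsigned first;	/* first scanline touched by the page */
	unsigned last;	/* last scanline touched by the page (inclusive) */
};

/*
 * Hypothetical helper: which scanlines does page `page_index` of the
 * virtual framebuffer overlap? `line_length` is the number of bytes
 * per scanline and `yres` the number of visible lines.
 */
struct row_span page_to_rows(size_t page_index, size_t page_size,
			     size_t line_length, unsigned yres)
{
	assert(page_size > 0 && line_length > 0 && yres > 0);

	size_t start = page_index * page_size;	/* first byte of the page */
	size_t end = start + page_size - 1;	/* last byte of the page */
	struct row_span r;

	assert(start / line_length < yres);	/* page must begin on screen */
	r.first = (unsigned)(start / line_length);
	r.last = (unsigned)(end / line_length);
	if (r.last >= yres)
		r.last = yres - 1;	/* the last page may run past the screen */
	return r;
}
```

Under the assumed geometry, page_to_rows(1, 4096, 1280, 480) covers lines 3 to 6, so a single dirty page means converting only a handful of lines instead of all 480.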

Hacks III – Deferred IO

Since fb_defio provides a full framework for monitoring an mmapped framebuffer, it is easier to use it directly in gcnfb than to reimplement the same functionality by duplicating code.

+static struct fb_deferred_io gcnfb_defio = {
+	.delay		= HZ / 60,
+	.deferred_io	= gcnfb_deferred_io,
+};

@@ -1975,29 +2076,56 @@ static int __devinit vifb_do_probe(struc
 	ctl->io_base = ioremap(mem->start, mem->end - mem->start + 1);
 	ctl->irq = irq;

+	void *vfb_mem;
+	unsigned long adr;
+	unsigned long size = PAGE_ALIGN(xfb_size);
+	vfb_mem = vmalloc_32(size);
+	if (!vfb_mem) {
+		drv_printk(KERN_ERR, "failed to allocate virtual framebuffer\n");
+		error = -ENOMEM;
+		goto err_framebuffer_alloc;
+	}
+	else {
+		memset(vfb_mem, 0, size);
+		drv_printk(KERN_INFO,
+			   "virtual framebuffer at 0x%p, size %luk\n",
+			   vfb_mem, size / 1024);
+	}
 	 * Location and size of the external framebuffer.
-	info->fix.smem_start = xfb_start;
+	info->fix.smem_start = (unsigned long) vfb_mem;
 	info->fix.smem_len = xfb_size;

-	if (!request_mem_region(info->fix.smem_start, info->fix.smem_len,
+	if (!request_mem_region(xfb_start, xfb_size,
 			   "failed to request video memory at %p\n",
-			   (void *)info->fix.smem_start);
+			   (void *)xfb_start);

-	info->screen_base = ioremap(info->fix.smem_start, info->fix.smem_len);
-	if (!info->screen_base) {
+	/* Save the physical fb info */
+	fb_start = xfb_start;
+	fb_size = xfb_size;
+	fb_mem = ioremap(fb_start, fb_size);
+	if (!fb_mem) {
 			   "failed to ioremap video memory at %p (%dk)\n",
-			   (void *)info->fix.smem_start,
+			   (void *)fb_start,
 			   info->fix.smem_len / 1024);
 		error = -EIO;
 		goto err_ioremap;

+	/* Clear screen */
+	int i = fb_size >> 2;
+	uint32_t * j = (uint32_t *)fb_mem;
+	while (i--) {
+		*(j++) = 0x10801080;
+	}
+	info->screen_base = (char __force __iomem *)vfb_mem;

@@ -2048,6 +2176,10 @@ static int __devinit vifb_do_probe(struc
 		goto err_request_irq;

+	/* Init defio */
+	info->fbdefio = &gcnfb_defio;
+	fb_deferred_io_init(info);
 	/* now register us */
 	if (register_framebuffer(info) < 0) {
 		error = -EINVAL;

Compared to the previous attempt, virtual framebuffer allocation is simplified, because the mmap handling is taken care of by fb_defio. Note that the deferral interval (‘delay’) is set to ‘HZ/60’, which theoretically corresponds to the time between vtraces at a 60Hz refresh rate. Values much higher than this could result in GUI jerkiness, because screen updates are not sent to the physical framebuffer frequently enough. Values much lower than this could cause performance drops, because the physical framebuffer is then updated several times between vtraces, which is unnecessary.
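Because ‘HZ/60’ is an integer division in jiffies, the actual delay depends on the kernel’s HZ setting. The sketch below works this out; the function names are made up for this example, and the HZ values in the usage note are common kernel configurations chosen for illustration, since the post does not state which one the Wii kernel uses.

```c
#include <assert.h>

/* Delay in jiffies that `HZ / 60` evaluates to (integer division). */
unsigned defio_delay_jiffies(unsigned hz)
{
	return hz / 60;
}

/* The same delay expressed in microseconds. */
unsigned long defio_delay_us(unsigned hz)
{
	assert(hz > 0);
	/* one jiffy lasts 1000000 / hz microseconds */
	return (unsigned long)defio_delay_jiffies(hz) * 1000000UL / hz;
}
```

With HZ=250 the delay is 4 jiffies, or 16000 µs, and with HZ=1000 it is 16 jiffies, again 16000 µs. Both are slightly shorter than one 60Hz frame (about 16667 µs). With HZ=100 the delay truncates to a single jiffy, or 10000 µs, so the physical framebuffer could be refreshed more often than once per frame.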


12 Responses to Hacking Up an RGB Framebuffer Driver for Wii-Linux – Take Two

  1. Strece says:

    Hey I tested your driver in my own kernel and it works like a charm 😉
    But I wanted to know: is it possible for the Wii graphics card to work in 32bit colour depth? Some X window managers only half work in 16bit mode. Higher resolutions are impossible, I think, because of the PAL or NTSC limitation, or am I wrong?
    I read your text but I’m not so good in understanding the mystical things 😉

    • farter says:

      In short, the hardware can’t work in 32bit. You will have to create 32bit virtual framebuffer and transcode to 16bit on the fly to hardware. It’s possible, but requires a bit more CPU and RAM.

  2. Markus says:

    Hi, is it possible that you can build a kernel with mikep5, your patch and bfs? I tried it but I’m not so good with this. I can follow instructions, but I can’t deal with compile errors, and I get compile errors with this kernel and the three patches (the bfs patch also throws Hunk failed errors). Or is your prebuilt kernel up to date enough for the Wii?

  3. neutronscott says:

    Hi. I am glad that someone is doing new work on Wii Linux. I just got one (again, years later).

    Probably there is a better place to ask such a question, but what does the ARM processor do now in Linux? I assume nothing. Could we bootstrap it to perform this function? (Or is this idea silly and I have more learning to do?)

    I’m stuck at the bottom of the world for a few more months and looking for something geeky to do. gc-linux wiki seems outdated. Where should I be looking for current sources? Maybe is not too difficult to catch up to mainline again…


    • farter says:

      Hi, unfortunately almost all low-level Wii-hacking, especially linux-related, has stalled, and for quite some time.

      I don’t understand much about what happens before the kernel boots, so you should probably try going through the wiibrew, hackmii, bootmii and probably devkitppc sites. You can also try contacting the devs through email or irc if you need any help.

      Good luck hacking!

  4. jeremy says:

    i just wanted to say thank you again for all your hard work. i hope u keep it up. i have 2 questions though. is the debian image you provide using your driver or cube? what are the “real world” differences i will notice between the cube driver and yours? thanx again

    • farter says:

      The pre-installed images both use the defio vfb driver, because cube only works in lenny and in 640×480 non-overscan-safe NTSC mode. Speed-wise, the difference between the two drivers is not very noticeable.

  5. This looks like a good result, is it going to be merged into Wii Linux?

    It might still be useful to allow access to a physical framebuffer for playing video.

    • farter says:

      Eh, it will be up to the GC-Linux devs.

      Video playback from within desktop environment is probably too demanding for Wii, so a driver supporting both vfb and direct access to physical fb may not be able to find many realistic use cases.

  6. “a device framebuffer that is in an unusual format”


    I think the unusual format referred to was JPEG:
