Friday, August 9, 2019
Wednesday, April 17, 2019
You may already know GZX, my humble, venerable ZX Spectrum emulator. It runs on Linux/SDL, Windows and HelenOS and has a collection of interesting features (Spec256, MIDI, HiRes, AY file, I/O recording, child lock, ...). It is free software, under the terms of a BSD-style license.
I am trying to gradually modernize and improve the software. I am currently in the process of revamping the tape support. Right now, GZX tape is mostly read-only. There is an abstraction layer that unifies TZX, TAP and WAV files (not all TZX block types are supported). Above that, we have a layer that either quick-loads data blocks into the Spectrum memory or synthesizes the tape audio to the virtual Spectrum's EAR. Separate from that, you can quick-save blocks to a TAP file, and that's it. This is very simple and uses very little memory (one version of GZX worked in 16-bit DOS, after all), but it is very limited. There is no way to browse the tape, modify it or convert files between different formats.
Now the goal is to move to a different model of in-core tape representation, where we load the file into memory, keep it in an editable format and then write it back to a file. The in-memory representation must capture all TZX features to allow for a perfect round trip. With this we can browse the tape, edit it and perform format conversions. After all, RAM is somewhat cheaper today ^_^ and it is still possible to optimize the code to load stuff lazily from the file.
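To make the idea a bit more concrete, here is a minimal sketch of what such an in-memory tape could look like. This is not the actual GZX code: the names `tape_t`, `tape_block_t`, `tape_append` and friends are made up for illustration. The assumption is that each block keeps its TZX block ID (for example 0x10 for a standard speed data block, 0x30 for a text description) together with an owned copy of its payload, so blocks can be inserted, removed and written back.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory tape model; names are illustrative, not GZX's. */
typedef struct tape_block {
	struct tape_block *prev;
	struct tape_block *next;
	uint8_t type;   /* TZX block ID, e.g. 0x10 = standard speed data */
	size_t size;    /* payload length in bytes */
	uint8_t *data;  /* owned copy of the payload */
} tape_block_t;

typedef struct {
	tape_block_t *first;
	tape_block_t *last;
	size_t nblocks;
} tape_t;

static void tape_init(tape_t *tape)
{
	tape->first = NULL;
	tape->last = NULL;
	tape->nblocks = 0;
}

/* Append a block holding a copy of data. Returns NULL if out of memory. */
static tape_block_t *tape_append(tape_t *tape, uint8_t type,
    const void *data, size_t size)
{
	tape_block_t *blk = malloc(sizeof(*blk));
	if (blk == NULL)
		return NULL;

	/* Allocate at least one byte so that empty blocks are valid too */
	blk->data = malloc(size > 0 ? size : 1);
	if (blk->data == NULL) {
		free(blk);
		return NULL;
	}
	if (size > 0)
		memcpy(blk->data, data, size);

	blk->type = type;
	blk->size = size;
	blk->next = NULL;
	blk->prev = tape->last;
	if (tape->last != NULL)
		tape->last->next = blk;
	else
		tape->first = blk;
	tape->last = blk;
	tape->nblocks++;
	return blk;
}

/* Unlink a block from the tape and free it (this is the "edit" part). */
static void tape_remove(tape_t *tape, tape_block_t *blk)
{
	assert(blk != NULL);
	if (blk->prev != NULL)
		blk->prev->next = blk->next;
	else
		tape->first = blk->next;
	if (blk->next != NULL)
		blk->next->prev = blk->prev;
	else
		tape->last = blk->prev;
	tape->nblocks--;
	free(blk->data);
	free(blk);
}

/* Free all blocks, leaving an empty tape. */
static void tape_fini(tape_t *tape)
{
	while (tape->first != NULL)
		tape_remove(tape, tape->first);
}
```

A loader would call `tape_append()` once per block read from the file, and a saver would walk the list from `first` to `last`. Keeping the payload as raw bytes is just the simplest choice for a sketch; a real implementation would more likely have a decoded structure per block type.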
I have the basic scaffolding for the in-memory tape in place and am now starting on TZX file loading. Once I can load and save a simple file, I can theoretically add a backend for the current tape support to read from this in-memory tape as a starting point for a switch over to the new system.
I will need to add complete support for loading/saving of all TZX blocks. This will be a challenge to test, so I guess I will probably need to fabricate some test files.
Progress on this has been painfully slow, simply because I am working on other things most of the time (such as having Sycek compile C for the ZX Spectrum, but let's leave that for another day). The basic tape and TZX work was added in one commit in August last year, and I finally made some progress on TZX loading yesterday. I hope I will be able to push this forward, as tape support is one of the crucial big pieces.
Thursday, December 13, 2018
- Developed Ccheck, a C-style checker, and integrated it into the HelenOS developer workflow
- Volume mounting automation and fdisk improvements
- Initial support for persistent system volume (/w)
- Unification of character device interfaces
- DDF-ization of legacy device drivers
- Huge amount of work on bringing the C library into alignment with the C standard
- Replaced kernel AVL trees and B+ trees with the ordered dictionary
- Fixed ISA-only PC support (finally!)
- Made XCW work with Coastline and added harbours for Sycek/Ccheck and GZX
- Added perf command to collect benchmarks, new memory allocator benchmark
- Very basic printing with standard PC parallel port driver and lprint command that can send files or messages to the parallel port
- File system management / Installation / Distribution / Packaging
- VSN/UUID support
- Need persistent system volume on wider range of platforms / configurations
- Attack problems around persistence and different location of packages / relocatability of packages
- Initrd minimization and support for booting natively from CD-ROM, HDD
- Network configuration
- We need a new model for network configuration that is more dynamic (think WiFi)
- We need a model that will support persistent network configuration
- PC platform support (old and new)
- Look at issues when running on modern PC HW (interrupts, etc.)
- WiFi, Ethernet drivers
- Old PC support (CPU, devices...)
- Look at user-space memory allocator performance
- User interface (GUI / Console / Shell / Serial / Remote)
- A large rework is needed in this area - big plans
- This is a long-term project
- End user usability
- General focus on usability of the stock system without modifying sources or re-compiling
- Practical use cases (e.g. web server)
Friday, November 2, 2018
I introduced the ordered dictionary (odict) module into HelenOS about two years ago (September 2016). I originally developed this module within another project (where I use it quite a lot) with the intent to use it in HelenOS, too.
It is a fairly versatile data structure based on Red-black trees. It never calls to malloc (all storage is provided by the user via odict_t and odlink_t structures). The key can be anything (the user provides a key comparison function). It also has some additional optimizations and such.
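To illustrate the "no malloc, user-provided storage, user-provided key comparison" style of interface, here is a small self-contained sketch. Note that this is not the odict code or its API: to keep it short, a sorted circular doubly-linked list stands in for the red-black tree, and all names (`olist_t`, `olink_t`, `task_t`, ...) are made up. Only the shape of the interface is the point: the link structure is embedded in the user's own structure, and the container reaches the key through a getkey callback and orders entries through a compare callback.

```c
#include <assert.h>
#include <stddef.h>

/* Link embedded in the user's structure; the container never allocates. */
typedef struct olink {
	struct olink *prev;
	struct olink *next;
} olink_t;

typedef void *(*getkey_fn)(olink_t *);
typedef int (*cmp_fn)(void *, void *);

typedef struct {
	olink_t head;      /* sentinel of a circular list */
	getkey_fn getkey;  /* returns pointer to the key of an entry */
	cmp_fn cmp;        /* returns <0, 0, >0 like strcmp */
} olist_t;

static void olist_init(olist_t *l, getkey_fn getkey, cmp_fn cmp)
{
	l->head.prev = &l->head;
	l->head.next = &l->head;
	l->getkey = getkey;
	l->cmp = cmp;
}

/* Insert link in key order (O(n) here; a balanced tree would be O(log n)). */
static void olist_insert(olist_t *l, olink_t *link)
{
	olink_t *cur = l->head.next;

	assert(link != NULL);
	while (cur != &l->head &&
	    l->cmp(l->getkey(cur), l->getkey(link)) <= 0)
		cur = cur->next;

	/* Insert before cur */
	link->next = cur;
	link->prev = cur->prev;
	cur->prev->next = link;
	cur->prev = link;
}

static olink_t *olist_first(olist_t *l)
{
	return l->head.next != &l->head ? l->head.next : NULL;
}

static olink_t *olist_next(olist_t *l, olink_t *cur)
{
	return cur->next != &l->head ? cur->next : NULL;
}

/* Example user structure with an integer key */
typedef struct {
	int id;
	olink_t link;
} task_t;

/* Convert an embedded link back to the structure that contains it */
static task_t *task_from_link(olink_t *link)
{
	return (task_t *)((char *)link - offsetof(task_t, link));
}

static void *task_getkey(olink_t *link)
{
	return &task_from_link(link)->id;
}

static int task_cmp(void *a, void *b)
{
	int x = *(int *)a;
	int y = *(int *)b;
	return (x > y) - (x < y);
}
```

With this, a caller declares its `task_t` objects wherever it likes (statically, on the stack, inside a bigger structure), initializes the list with `olist_init(&list, task_getkey, task_cmp)` and inserts `&task.link`. The two indirect calls per comparison are the price paid for supporting arbitrary key types.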
It didn't actually have any consumer in HelenOS until SIF recently. In user space I assume it will see more use as the foundation for other data structures (such as the ID allocator, which I will talk about some other time...).
In kernel, we currently have AVL trees and B+ trees. Jakub pointed out that we could replace B+ trees with odict, with the benefit that odict does not call malloc (and B+trees do). I realized that odict can replace AVL trees just as easily -- although the only clear benefit is that we will only have one tree implementation in the kernel -- which is a good reason in itself, right?
The only possible downside is that, while red-black trees are slightly faster at insertion and deletion than AVL trees (at least in theory), our AVL tree has the advantage over odict that it does not need to call the getkey/comparison functions (as the key is always an integer), which should bring some speed advantage. Therefore I was worried that replacing it with Odict could make the system slower -- although I thought the difference in real-world performance probably would not be measurable.
Therefore I decided to create a prototype implementation with Odict replacing AVL and benchmark it. For the measurement I first tried measuring some kernel tests with a stopwatch, but that obviously wasn't precise enough to show any difference. So I went for the only actual benchmark available in HelenOS, that is the IPC ping-pong benchmark.
I ran the benchmark 10 times and recorded the results, then calculated the averages and standard deviations using Gnumeric. I tested with AVL and Odict, both in QEMU with KVM and in QEMU without KVM (softmmu). All measurements in the following table are based on 10 samples.
|Test case|Average RTs / 10 s|Std. dev.|
|---|---|---|
|Odict / KVM|403320|4210|
|AVL / KVM|388700|4962|
|Odict / softmmu|25300|249|
|AVL / softmmu|23520|434|
In the table above we can see Odict beating AVL by about 4% in the KVM case and by as much as 7.6% in the softmmu case, with the standard deviation of the measurements being between roughly 1% and 2% of the average. This is kinda surprising.
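For the record, the percentages can be re-derived from the table with a few lines of C. This is just the arithmetic, with the figures copied from the table above; the helper names are made up.

```c
#include <assert.h>

/* Relative gain of a over b, in percent. */
static double rel_gain_pct(double a, double b)
{
	assert(b > 0.0);
	return (a / b - 1.0) * 100.0;
}

/* Standard deviation expressed as a percentage of the mean. */
static double rel_stddev_pct(double stddev, double mean)
{
	assert(mean > 0.0);
	return stddev / mean * 100.0;
}

/* Average round trips per 10 s, copied from the table above. */
static const double odict_kvm = 403320.0, avl_kvm = 388700.0;
static const double odict_soft = 25300.0, avl_soft = 23520.0;
```

Here `rel_gain_pct(odict_kvm, avl_kvm)` comes out at about 3.8 and `rel_gain_pct(odict_soft, avl_soft)` at about 7.6, while `rel_stddev_pct()` ranges from about 1.0 (Odict in both configurations) to about 1.8 (AVL / softmmu).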
I then made some changes to the ping_pong benchmark. Instead of repeatedly doing 100 iterations until we measure that 10 seconds have expired, it now determines the number of iterations (the lowest power of two) needed for the test to run for at least 10 seconds, then measures the actual time taken and divides the number of iterations by the duration. I hoped this would give me more accurate results and measure less of how getuptime() performs. I also had the ping_pong benchmark run 10 times and calculate the statistics for me. Here are the results; again, all measurements are with 10 samples:
|Test case|Average RTs / s|Std. dev.|
|---|---|---|
|Odict / KVM|38988|241|
|AVL / KVM|38601|285|
|Odict / softmmu|2452|44|
|AVL / softmmu|2290|27|
There still seems to be some improvement when switching from AVL to Odict, although in the KVM case it is now much closer to the noise level of the measurement.
AVL trees are used to hold the global lists of threads and tasks in the kernel. I am not sure if their performance ever comes to play in the ping_pong test. Any ideas how AVL/odict could affect results of the ping-pong test? How come we see any difference? Any suggestion for a benchmark that would touch better upon this area? (e.g. creating and destroying a thread repeatedly)