CDFS
Early on in the RISC OS 4 development CDFS was updated to a new version. Initially during the RISC OS 3.8 development there was a move away from CDFS 2 (which was entirely in Assembler and incredibly difficult to maintain) to a re-implementation, CDFS 3 (which was written in C, and hopefully more maintainable). During the pre-RISC OS 4 development, I tried to get a version of the module which would work with the CD drives that we had to hand but didn't really get anywhere.
Because of time pressures, we decided to step back and keep with CDFS 2. It was also a significant factor that CDFS 3 would have needed replacement CDFS drivers, and it wasn't looking like these were going to be developed. There was an improved version (CDFS 2.34) which hadn't been tested as much but which did address a few of the issues with the earlier version.
CDFS itself was an evil source to work with. It might have started out quite structured, but it was really complicated to work with. Functions had to know what state the system had been left in before they were called, which included the details that were on the stack and in workspace locations. It really was painful to work with, and took a lot of thought. FileCore was complicated, but mostly its constraints and implementation made a kind of sense. Approaching CDFS was something to be done with extreme care.
The handling of non-RISC OS CDs wasn't very friendly - there was a fixed table of extensions which could be updated by setting a system variable to contain the name of a file of extensions and their mappings. This was changed so that the extensions were looked up using the MimeMap module.
CDFS only supported ISO 9660, the basic standard for CDFS formats. The RockRidge extensions allow for filenames in lower case. It was relatively easy to add support for the extra filename handling, although that was as far as I went. Another extension to the ISO 9660 format was the Joliet format, which allowed Unicode characters in filenames. Although CDFS didn't really support these, it was at least partially supported by just allowing the names to be read. In this way, CDFS got support for mixed case names.
At the same time, support for filenames with commas was added so that discs which had filenames with hex suffixed names (',xxx') could be used to store filetypes. This made it easier to burn regular discs, which didn't have the RISC OS extensions for the filetypes, from other systems. The ',xxx' format was commonly used by network filesystems, particularly NFS, to store the filetype.
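As a rough illustration of the ',xxx' convention (not the CDFS code itself, which was assembler), the suffix handling can be sketched as below. The function name is hypothetical, and the sketch assumes the suffix is exactly three hex digits; the real module may have accepted other forms.

```python
import re

# Matches a trailing ",xxx" where xxx is exactly three hex digits (an assumption).
_SUFFIX = re.compile(r"^(?P<name>.+),(?P<type>[0-9A-Fa-f]{3})$")


def split_filetype_suffix(name):
    """Return (leafname, filetype); filetype is None if there is no suffix."""
    match = _SUFFIX.match(name)
    if match is None:
        return name, None
    return match.group("name"), int(match.group("type"), 16)
```

With this sketch, a name such as 'ReadMe,fff' splits into the leafname 'ReadMe' and the filetype &FFF, whilst a name without a recognisable suffix is passed through unchanged.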
DOSFS
DOSFS was quite an ageing piece of software that hadn't been updated in quite a few years. Its reliance on the Image File Systems interface within FileSwitch, which accessed files through a file handle using regular file operations, meant that it was limited to 31bits as the largest file offset it could handle. The solution to this would have been to update FileSwitch to handle larger file operations. I never really got around to this, although I did start to work out what would be necessary.
DOSFS had a small update to add support for MimeMap, just like CDFS, but otherwise nothing useful was done. I started adding some support for the long filename extensions, but never got them working well enough to enable the functionality.
Oh, there was a change for DOSFS to support partition tables if it saw them on the device, which I added whilst I was rewriting the SCSI stack for USB support. It only used the first partition entry that it found, which for most USB devices was sufficient. Really the whole thing should have been reworked in the planned work for DiscCore, but that never happened.
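The 'first partition entry' behaviour can be sketched roughly as follows. This assumes a conventional PC-style MBR partition table (four 16-byte entries at offset 446, boot signature at bytes 510-511); the text above doesn't say how DOSFS actually detected the table, and the function name is hypothetical.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446        # start of the four 16-byte partition entries
ENTRY_SIZE = 16
SIGNATURE = b"\x55\xaa"   # boot signature at bytes 510-511


def first_partition(sector0):
    """Return (type, start_lba, sector_count) of the first used entry, or None."""
    if len(sector0) < SECTOR_SIZE or sector0[510:512] != SIGNATURE:
        return None
    for index in range(4):
        offset = TABLE_OFFSET + index * ENTRY_SIZE
        entry = sector0[offset:offset + ENTRY_SIZE]
        # Layout: status, CHS start (3 bytes), type, CHS end (3 bytes),
        # LBA start (32-bit LE), sector count (32-bit LE).
        _status, part_type, start_lba, count = struct.unpack("<B3xB3xII", entry)
        if part_type != 0 and count != 0:
            return part_type, start_lba, count
    return None
```

A real implementation would need more care than this: an unpartitioned FAT volume carries the same signature in its boot sector, so the signature alone doesn't prove that a partition table is present.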
FileSwitch
The filesystem interfaces were pretty fixed in RISC OS. Changing the way that the filesystems worked had to be done in a way that wouldn't break things. On the other hand, problems had to be fixed where they existed. Much of the time there were no consequences to some of the simple changes except to make things easier to maintain.
Originally FileSwitch would define a whole set of system variables when it started up for load and run actions, and type names. I moved these to a separate module so that updating them wouldn't cycle the module's version more than necessary - separation of function and data.
Path aliases
It has always been possible to use path variables to create aliases within RISC OS. In general these were used to provide multiple locations for a set of resources to be located. The Font$Path variable is a typical use - it allows multiple font directories to be collected together and referred to as one. Less often recognised, but more important, the File$Path and Run$Path variables were also able to be used for multiple components (Select used the latter extensively to separate the library directories).
These paths weren't as useful as they might be, because you couldn't enumerate them and see all the related directories overlaid, but that wasn't (usually) their purpose. The intention was that these be used for referencing resources. Because there were multiple paths present, it wasn't always obvious what would happen when you wrote to such paths.
Additionally, to refer to objects using these aliases it was necessary to use variable:path, for example Font:Homerton.Medium.Italic. This had the disadvantage that it doesn't follow the usual rules for RISC OS filenames - that fully specified filenames (which do not relate to the current directories) be anchored by a $ (root directory).
[ It could be anchored with & (User Root Directory), % (filesystem Library directory), @ (Currently Selected Directory), or \ (Previously Selected Directory). However, these were variable according to configuration and aren't part of the consideration here - they cannot have any meaning on an alias which spans multiple filesystems ]
In a few places these names were used incorrectly - for example the Internet library which had been distributed for some time had the names of the resources embedded as '<InetDBase$Path>Hosts' (for example). This was fine when there was only a single directory present in the path variable, but would fail if multiple directories were specified (as they were in Select). This particular failing was fixed up by the AppPatcher, but it remained something to be aware of when looking at some problems.
In almost all places the checks for 'leafname' are performed by searching for the last component of a filename after a '.' directory separator. This was used pretty much everywhere, not only within the OS itself but in most 3rd party applications. Many users and some applications used the alias prefix as a way of making it easier to access directories, for example, Apps: could be used to access a directory.
The actual specification of the path variables was that any name component specified when the variable was used would be appended to the filenames given in the path variable. This meant that the path variables always had to end in a '.', and failure to terminate the values in this way could provoke surprising results. Some versions of RISC OS would fault access to nonexistent directories during the search of the path, but only if the requested name had not been found in an earlier component - which led to an interesting problem to diagnose.
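A minimal sketch of that expansion rule, assuming the path variable holds a comma-separated list of prefixes and that the rest of the name is appended verbatim to each. The function name and the variable values used to exercise it are hypothetical; this is not FileSwitch's actual code.

```python
def expand_path(name, variables):
    """Expand 'Alias:rest' into one candidate filename per path component."""
    alias, sep, rest = name.partition(":")
    value = variables.get(alias + "$Path")
    if not sep or value is None:
        return [name]   # not a path alias; leave the name untouched
    # Components are comma-separated; the rest of the name is simply appended.
    return [prefix + rest for prefix in value.split(",")]
```

With Font$Path set to 'Resources:$.Fonts.,ADFS::HardDisc4.$.!Fonts.', the name Font:Homerton.Medium yields one candidate under each directory. If a hypothetical Work$Path were set to 'ADFS::HardDisc4.$.Work' without the trailing '.', then Work:MyNotes would expand to 'ADFS::HardDisc4.$.WorkMyNotes', which is the kind of surprising result an unterminated value provokes.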
Almost no application or tool understood that the parent of (for example) Apps:Databases is Apps: which mattered when finding associated directories - and could result in data in the Currently Selected Directory being accessed. Filer would not handle this either, resulting in hacks such as using 'Root' or '£' as the first component of the path alias when used within the desktop, eg 'Apps:£.Databases', which meant that it was still possible to move up to the higher level.
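The problem is easy to see with the usual 'everything before the last directory separator' approach to finding a parent. The function below is a hypothetical sketch of that common idiom, not code from any particular application.

```python
def naive_parent(name):
    """Parent found the common way: everything before the last '.'."""
    head, sep, _leaf = name.rpartition(".")
    return head if sep else ""   # "" means no directory part was found
```

For Apps:Databases there is no '.' at all, so the whole name is treated as a leafname and the 'parent' is empty - which in practice means the Currently Selected Directory. With the 'Root' hack, Apps:Root.Databases gives a parent of Apps:Root, which is why the hack allowed moving up a level.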
The whole reason that enumerating the path worked at all was due to a special case within FileSwitch to allow the enumeration of directories if the directory name was terminated with a directory separator found during path alias expansion.
To try to rationalise the behaviour, I added explicit support for the root specifier '$' on alias paths. This made their format fall in line with the general form of fully specified RISC OS filenames. The old way was still supported, though, because there were still lots of applications and libraries that used the alias without a '$' specifier.
This meant that a correct specification to load from a path might be 'InetDBase:$.Hotlist'. For singly specified directories it was also possible to open directories on these paths, eg 'Apps:$' and they would work as expected whatever application used them.
Applications which followed Acorn's recommended way of handling leafnames saved in a dialogue box (eg just pressing return with a leafname in the dialogue box) were unable to distinguish between the usual specification and a bare leafname, eg Work:MyNotes would report an error. With the new format, Work:$.MyNotes was valid and would be allowed to be saved.
When handling saving through a drag, the usual mechanism was to append the leafname supplied by the sender on to the directory where the file was to be stored, separated by a directory separator. Without the root specifier, if you saved MyNotes to Work: you would end up with Work:.MyNotes which was an invalid filename. With the new form, the directory name would be Work:$ and the resulting filename would be Work:$.MyNotes.
To remove the ambiguity from the operations - where Filer windows and applications appeared to work, up to the point where you tried to do an operation that was invalid - Filer was updated so that it reported an error when the unqualified path alias was used. This ensured that the problematic directory names would never be exposed to applications, and users would not become confused by directories that were clearly visible but wouldn't allow files to be saved to them, etc.
This provoked a little bit of a bad response from a few people. A couple of additional explanations were given to Usenet. I think a few of those who complained accepted it, but it seemed that there were a few who felt that this was a completely retrograde step, and they would never use the system again. I think that at that time I just sighed and moved on - the justification was reasonable to me, and the only argument that was offered to counter was that it wasn't the way that it worked before. As the problems caused both technically and semantically by the old method greatly outweighed any perceived gain, I was happy with my decision.
Related to this, as it was relatively easy to insert into the reworked code in FileSwitch, was the addition of the '$Write' variables. These stemmed from the recommended usage of Choices: which was that you read from Choices:Blah but you wrote to <Choices$Write>.Blah. As this was only done during the configuration operations in applications it was easy for developers to ensure that they wrote to the correct place. This particular requirement stemmed from the 'universal' boot sequence, which was expected to be able to be used for local booting, for remote booting and for booting off read-only systems.
FileSwitch was updated such that at the point at which it had detected a write to a path variable with multiple components, it would check for the path variable's $Write variant. If it existed, that directory would be used as the prefix for the write. If it did not, an error would be raised as previously. This made it easier to write to configuration directories - you didn't need to care about special cases.
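The decision described above can be sketched as follows. This is only an illustration of the logic, assuming the check applies just to multi-component paths and that the $Write value has no trailing '.' (as the <Choices$Write>.Blah usage suggests); the names PathWriteError and resolve_write are hypothetical.

```python
class PathWriteError(Exception):
    """Raised for a write to a multi-component path with no $Write variant."""


def resolve_write(name, variables):
    """Return the filename that a write to 'Alias:rest' should go to."""
    alias, sep, rest = name.partition(":")
    value = variables.get(alias + "$Path")
    if not sep or value is None:
        return name                       # not a path alias
    components = value.split(",")
    if len(components) == 1:
        return components[0] + rest       # single directory: unambiguous
    write_dir = variables.get(alias + "$Write")
    if write_dir is None:
        raise PathWriteError("ambiguous write to " + alias + ":")
    return write_dir + "." + rest         # $Write names the directory itself
```

With a two-component Choices$Path and Choices$Write set to (say) 'Boot:Choices', a write to Choices:Blah resolves to 'Boot:Choices.Blah'; without the $Write variable the same write raises the error.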
Now I think about it, I'm not sure whether the change was at the point at which multiple components were detected, or on all path variables. I have a feeling it was the former, but it should have been the latter to ensure that there was consistent behaviour. Nah, it's just too long ago and I can't remember which it is.
Collusion
FileSwitch was a mess of collusion - colluding mostly with the Kernel's memory allocations. Such things made it more difficult to upgrade components independently. Every change had to be considered in the context of its interactions with other components outside of the documented interfaces. This made it very difficult for maintenance and development of new features. Other components had collusion removed, and FileSwitch was no exception to this.
In the early RISC OS 4 days, we received a bizarre report about the operation of one of the SWI calls that a vendor was relying on in their calls. For no readily apparent reason one Independent Software Vendor was relying on an undefined register when they checked the existence of a file. SWI OS_File 5 returns the file details in R2-R5, and the type of the object (nonexistent, file, directory, image) in R0.
They were relying on R4 containing a pointer to the SVC stack when the file didn't exist, in some of their software. This was despite the fact that R0 was defined to contain 0 in the case of an object which does not exist. It seems baffling now that we didn't just say 'fix your software'. Well, not quite baffling - we did understand that the vendors were what kept things going, but it's still so utterly wrong. Instead, we updated the behaviour so that R4 returned this random value, just to keep their software working.
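For contrast, the documented way to test for existence looks only at R0. The sketch below models the returned registers as a plain dictionary, which is purely an illustrative device; the numeric values 1 to 3 follow the order given above and the usual OS_File object type numbering.

```python
from enum import IntEnum


class ObjectType(IntEnum):
    NOT_FOUND = 0
    FILE = 1
    DIRECTORY = 2
    IMAGE = 3


def object_exists(regs):
    """Decide existence from R0 alone; R2-R5 only mean anything if it exists."""
    return regs["R0"] != ObjectType.NOT_FOUND
```

Whatever happens to be left in R4 for a nonexistent object plays no part in the answer.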
A better solution would probably have been for them to produce a patch to fix their software and for us to distribute it with the OS. This had been done with previous releases and there was a dedicated application for the patching, which could have done the job. Oh well.
As part of the later developments, dynamic areas were removed from the Kernel where they weren't necessary, and those that were created had their access permissions changed. Moving the areas so that they were owned by the components that used them removed the collusion, and reduced the likelihood that the corruption caused by one component would break others.
The System Heap was used throughout the FileSwitch module for its workspace. This wasn't a good idea, and when the System Heap was moved in memory the FileSwitch module had (for a very short time) a number of calls to read the location before using it, through the defined APIs. This was only transitional and FileSwitch soon gained its own Dynamic Area which it used for all its workspace.
Together with the change of the SystemVars module to control its own area, this meant that the System Heap didn't need to be as big as it had been in earlier versions. The FileSwitch area was marked as being read only in user mode, which made it a little safer for applications that ended up referencing it. I intended to investigate making the area abort in user mode, but didn't get around to it.
There were some quite amusing collusions which used zero page to manipulate the environment string. These would look in zero page for the correct address to use, rather than calling the correct SWI. This was made more amusing by references to Arthur within the code. It was quite nice to strip out those things - there had been quite a few places where Arthur (or in some cases Brazil in other components) was referenced in the code or comments, and which could be completely removed.
Some functions were added to SWI OS_EvaluateExpression to manipulate filenames - the DIRNAME and LEAFNAME functions. I felt a little uncomfortable in making these rely on the directory separator '.', and so I created a new SWI OS_FSControl reason which would read the details of the file naming conventions. Whilst I can see good reasons for it, the mass of software which assumes the file name conventions would make any change incredibly difficult. Probably this was a bit of wasted time. Oh well.
In addition to colluding with other components, FileSwitch also performed some tasks that it shouldn't really be doing - they are not directly related to the operation of the filesystem, so shouldn't really be part of the module. The FileTypes module is a good example of one of those cases. In the original versions of FileSwitch, the initialisation would set up the names and actions for the core file types used by the system.
That's not a problem if you don't mind the two being tied together, but it isn't really right - FileSwitch doesn't really do anything with those types except use them. The system variable configuration that FileSwitch used to perform was moved to a separate FileTypes module. As well as separating the functionality of the modules it meant that those types could be updated without change to the core FileSwitch module - which prevents the version number racing up when really the change doesn't affect the functioning of the file systems at all.
Similarly, in some places the location and size of the SVC stack was 'known' in FileSwitch, and so the module wouldn't run on systems where these values differed. Initially these uses were updated to call the SWI OS_ReadDynamicArea which was intended for exactly this purpose. Later, the main uses were replaced with the necessary calls to the Kernel to flatten the SVC stack - as this was the main use of the values. But these, too, were replaced.
FileSwitch handled the invocation of absolute and transient utilities itself, as part of the 'run file' handling code. This meant that if the code to run Absolutes needed to change, the entire FileSwitch module needed to be updated. Moving towards a 32bit system, it would be necessary to run 26bit components in emulation, and there needed to be better checking of the code that was being run. Additionally, the invocation of the absolute code could include debug data - which it was pointless to load if no debugger was present. It would make sense to only load and run what we need.
So the entire handling of Absolutes and Transient Utilities was moved out of the FileSwitch module, through new SWI OS_FSControl calls. This allowed other modules to use the vectored entry point to change the way that the files ran, if needed. As a result, the changes to FileSwitch to flatten the stack through the Kernel calls became redundant as well - the external modules took care of all of that.
Obscure problems
There were many problems in FileSwitch which you probably would never see in real use. Personally I can't leave these things alone - if there's a bug, then it needs to be fixed. Knowing about a problem, and saying "oh, yeah, don't do that" when something should work, isn't really a good answer. Which isn't to say that there weren't 'known bugs' in the system - many were just too large to fix, or their fix exposed far larger problems. FileSwitch had a few of these, and I tried to fix them where I could.
Early on during RISC OS 4 development we found problems with copy operations. It was a simple issue to fix, but one which meant that the copy operations would have been significantly slower than expected. The intention had been to use the Wimp 'next slot size' for the copy operations, but the code would instead force the size to just 640K. We found this whilst we were copying the objects around test machines - they were a lot slower than we expected, and increasing the slot size had no effect, which was surprising.
The 'ioctl' entry point had been added in RISC OS 4, and this had a rather bad effect if you tried to call it on a file handle that had been closed. FileSwitch would abort, because it never even bothered to check the handle before using it. Clear lack of testing... oh well.
Stream redirection could produce a loop if you weren't paying attention - and thus hang the machine. Typical examples would be to use *Build file { < kbd: } (build a file using user input, taking input from the keyboard), or *Type file { > vdu: } (print out the contents of a file, redirecting the output to the screen). They're really just pathological cases of the failure, which could happen elsewhere if the streams weren't used with care.
As I was intending to make the C redirection and OS redirection share functionality, I wanted the redirections to be a lot safer. The output operation was updated to report 'Stream operation in progress', and the input operation returned an EOF, which immediately terminates the redirection. Corner cases they might be, but they're only small examples of where the system was more robust than it had been.
If the TerritoryManager wasn't present, FileSwitch would be unable to find files unless they matched case exactly. This was because it used the case tables provided by the territory to check the filenames case insensitively. Obviously this wouldn't be a problem most of the time, but if, for example, one of your ROM modules didn't use the correct filename when it tried to load its resources from ResourceFS before the TerritoryManager had been initialised, it might not start, or might crash. This happened when the ROM order was changed significantly, and it took a little while to work out what was going on - testing on the command line always worked because the TerritoryManager was present.
I had two choices - fix the module so that it referred to the correct filename, or make sure that FileSwitch could always resolve the names in ASCII if the tables weren't present. I chose both, as that meant that the next time the problem would crop up things would just work, and if the module that was a problem this time was loaded on an older system in such a circumstance it would also work.
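The FileSwitch half of that fix amounts to a fallback in the name comparison. The sketch below shows the idea only: 'upper_table' stands in for a 256-entry upper-case table supplied by the territory, and both function names are hypothetical.

```python
def fold_ascii(byte):
    """Upper-case a single byte value using plain ASCII rules only."""
    return byte - 32 if ord("a") <= byte <= ord("z") else byte


def names_match(a, b, upper_table=None):
    """Compare two filenames (as bytes) case-insensitively.

    When no territory table is available, fall back to ASCII folding
    rather than demanding an exact match.
    """
    fold = upper_table.__getitem__ if upper_table is not None else fold_ascii
    return len(a) == len(b) and all(fold(x) == fold(y) for x, y in zip(a, b))
```

With no table supplied, 'Messages' and 'MESSAGES' still compare as equal, which is enough for a ROM module's resources to be found before the territory has initialised.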
There was also the very fun case where if you loaded FileSwitch within a TaskWindow you could cause corruption of the RMA. I managed to mitigate that a little, but it's such an obscure one that I wasn't convinced that it needed a complete fix.
There were many other little things that were fixed along the way.
FS special table
The FileSwitch module contains a special table which records the bug fixes that are needed for certain file systems. The table is keyed by the file system number, and mostly covers RISC OS 2 file systems which have bugs that the later version of FileSwitch has to work around. Unfortunately, this causes problems for other file systems which reuse the numbers.
The case we found which was a problem was for IDEFS. The file system number had been allocated long ago to someone else, and problems had been found with it, so it had been added to the special table to allow a particular work around to be applied inside FileSwitch. Other file systems had come along and reused this file system number and thus had the special processing applied to them - and they broke, because the workaround was not necessary for the newer file systems.
Obviously, it would be wrong to remove the special case handling because these later file systems broke with it set. The usurper file system is the one in the wrong for reusing a file system number that does not belong to them. It would set the wrong precedent to say that you could just use an allocation that doesn't belong to you, and then have the system fixed because of your mistake.
As well as talking to the authors involved, I added a special flag to the file system registration which allowed the file system to opt out of the special table. Because this could then be applied in FileCore, it meant that the file system became fixed without changes on later versions of the system which included this code. The real fix was for the authors to get a proper allocation themselves instead of using one that did not belong to them.
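In outline, the opt-out looks something like the sketch below. The table contents, the file system number and the flag bit are all hypothetical placeholders; only the shape of the lookup is taken from the description above.

```python
# Hypothetical contents: file system number -> names of workarounds to apply.
SPECIAL_TABLE = {
    0x40: {"example-workaround"},
}

FLAG_NO_SPECIAL_TABLE = 1 << 0   # hypothetical registration flag bit


def workarounds_for(fs_number, registration_flags):
    """Return the workarounds to apply to a file system at registration."""
    if registration_flags & FLAG_NO_SPECIAL_TABLE:
        return set()             # the file system has opted out
    return set(SPECIAL_TABLE.get(fs_number, ()))
```

A file system that registers with the flag set gets no workarounds even if its number appears in the table, whilst the original owner of the number, which never sets the flag, carries on being fixed up as before.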
ResourceFS
ResourceFS wasn't changed a lot. It works pretty well as it is and hasn't needed many changes since it was first introduced. For Select 3, support was added for the 'Free space' operations, so that it could be used with the desktop Free application. It doesn't really mean that much, but it's useful to ensure that the support is consistent across file systems.
The load and data transfer operations were improved as well. Previously they'd used byte copy operations, which made it relatively slow. Updating these to be word operations made things a bit faster. Not that you use ResourceFS that often, but little things like this do improve the general responsiveness of the system when you're loading data. Of course, as I later added more tools into the Library in ResourceFS, this benefited us quite a bit.
PipeFS
PipeFS was a much maligned, but sometimes quite useful filing system which could be used as (very) temporary storage for transient operations. However, it was always quirky, and any odd failures in operating on it could leave the file content in a strange state. The principle was that files written to PipeFS appeared as they were written to, but the files could be opened for read and write at the same time - an operation which was generally disallowed by most other filing systems. Writing always went to the end of the file, and reading always read from the start - and caused the file to shrink as data was removed.
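The read and write behaviour described above is essentially that of a byte queue. As a sketch of the semantics only (the class is hypothetical and ignores blocking, handles and everything else the real module did):

```python
class Pipe:
    """A file where writes append and reads consume from the front."""

    def __init__(self):
        self._data = bytearray()

    def write(self, data):
        self._data += data            # writing always goes to the end

    def read(self, count):
        chunk = bytes(self._data[:count])
        del self._data[:count]        # the file shrinks as data is read
        return chunk

    @property
    def extent(self):
        return len(self._data)
```

Writing 'hello ' and then 'world' and reading five bytes returns 'hello' and leaves six bytes in the file; reading the remainder empties it.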
It also had the notion of blocking operations, which most other file systems didn't need because they didn't have the same kinds of restrictions. Once a sufficiently large amount of data was written to a PipeFS file, it would issue an UpCall to request that the task be slept - which would cause TaskWindow to block until the pollword specified in the UpCall was set. So multiple TaskWindows could pipe data to one another using PipeFS, so long as the data was small.
A very strange system really. I'd heard - I don't know from where, or how reliable it was - that the reason it existed at all was that a customer of Acorn's had required that there be some form of pipes available, and this was a response that was suitable. Not sure how true that is, but given the somewhat strange nature of the module, I can well believe it.
As it was used for some things, I tried to improve its performance by reducing the number of times that the module issued its notifications about file size changes - these notifications would happen as every byte was written to the file, and could make things very slow. This improved the performance quite a bit, but it still wasn't great.
Adding support for GBPB operations directly in the module helped for tasks which supported it. There were also some odd problems caused by the changes which meant that files ended up untyped, which was a little amusing to track down in the end.