Networking

There are many areas in which the networking on RISC OS could be improved. It is hard to remember all the things that I wanted to do, but a few were obviously required and needed addressing.

It might be appealing to upgrade the Internet module to a later release, from one of the BSD variants. There certainly are many updates that could be back ported. A few features and bug fixes had already been back ported during Select, in addition to the many fixes that had been implemented independently. But quite a few customisations had been made for RISC OS specific features in the Internet module. These would need to be retained if a new port was attempted.

For any such work, you have to ask whether the benefit is sufficient to warrant the reasonable amount of work that would be involved. Some of the newer features of the network stack would still need to be tailored for use within RISC OS, and many of them would be far less useful than they are in the Unix variants.

Security and stability fixes would be useful. Some of the more specialised wireless handling might be useful, but would in any case need to be worked into the DCI protocol. This might be able to be done independently of any upgrade of the stack infrastructure, but some features would probably need to be configured centrally.

Wireless

My thoughts for the wireless stack would be that any wireless devices should provide a new configuration interface (a new SWI, similar to how SWI DCI_MulticastFilter configures the behaviour of multicast reception). The new interface could be indicated either through a flag in the device flags (obtained through enumeration or SWI DCI_Inquire), or by providing a valid type (one of the many 802.11 types) in the interface type returned by the statistics.

From there, there are two ways that I think you could go. One is to follow the route of allowing the configuration of the interface through the network stack. This method would require either new Internet SWIs to be created for the control, or for the interface to be exposed through an ioctl operation (on maybe an AF_LINK socket, or similar). The other way, which would be more in keeping with the RISC OS module philosophy, would be to have a separate module which would watch for the arrival of devices, and would control them - separate from the Internet module.

The latter method is far preferable to me. The RISC OS network implementation keeps the devices apart from the protocol - Internet should not be considered the only protocol driver (there is an argument that everything is Ethernet these days, but that is only true until it isn't). Allowing the device's configuration to be performed as a distinct function that is not reliant on the protocols which use it makes sense. There is a reasonable argument that such configuration might include the configuration of other parts of the interface, for example access concentrators (for PPP links), VPN access and similar. Providing a separate module also means that the configuration module can be replaced independently of (and, hopefully, transparently to) the protocols underneath.
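As a rough illustration of the separate-module approach, the sketch below models a configurator that is told about arriving devices and claims only those that advertise themselves as wireless. Everything here is hypothetical: the flag bit, the class names and the device names are invented for the example and are not part of any DCI specification.

```python
from dataclasses import dataclass, field

# Hypothetical flag bit; no such bit is defined by the text above.
DEVICE_FLAG_WIRELESS = 1 << 30


@dataclass
class Device:
    name: str
    flags: int


@dataclass
class WirelessConfigurator:
    """Stands in for a separate module that reacts to devices arriving."""
    managed: list = field(default_factory=list)

    def device_arrived(self, device: Device) -> bool:
        # Only claim devices that advertise the (hypothetical) wireless flag.
        if device.flags & DEVICE_FLAG_WIRELESS:
            self.managed.append(device.name)
            return True
        return False


configurator = WirelessConfigurator()
configurator.device_arrived(Device("eth0", 0))
configurator.device_arrived(Device("wl0", DEVICE_FLAG_WIRELESS))
```

The protocol modules never appear in this sketch, which is the point: the configurator could be swapped for another without the stack noticing.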

IPv6

The issue of the protocols being distinct raises the question of whether future IPv6 interfaces should be distinct from the IPv4 implementation provided by Internet. The distinction is that these are two different stacks - they are often referred to as 'dual stack', despite most systems sharing much of their implementation and interfaces. Certainly there is nothing to prevent IPv6 from being implemented completely independently of the Internet module.

However, such an implementation would not be able to (for example) share socket select operations, and would not be able to be used transparently with the same socket operations. Ideally, it should be possible to write neutral code that does not care whether it is talking to IPv4 or IPv6 - this is possible on other systems. This implies that interacting with IPv6 should be through the same interfaces as for IPv4.

Internally, many of the operations are able to be specialised for the protocol in use, through protocol handling functions. It might be possible to provide an abstraction at this level which allowed a separate module to provide the IPv6 functionality, through the same APIs as the current stack, determining the operations based on whether an AF_INET or AF_INET6 operation was in use.
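A minimal sketch of that kind of dispatch, assuming a table of handler functions keyed on address family. The handler functions and their return strings are invented for illustration; only the `AF_INET` and `AF_INET6` constants come from the sockets API.

```python
import ipaddress
import socket


def handle_ipv4(address):
    # Placeholder for the existing IPv4 code path.
    return f"IPv4 handler: {address}"


def handle_ipv6(address):
    # Placeholder for a handler that a separate module might supply.
    return f"IPv6 handler: {address}"


# One entry per address family; the caller never needs to know which is used.
PROTOCOL_HANDLERS = {
    socket.AF_INET: handle_ipv4,
    socket.AF_INET6: handle_ipv6,
}


def family_of(text):
    """Return the address family for a literal IPv4 or IPv6 address."""
    version = ipaddress.ip_address(text).version
    return socket.AF_INET6 if version == 6 else socket.AF_INET


def handle(text):
    """Protocol-neutral entry point: pick the handler from the family."""
    return PROTOCOL_HANDLERS[family_of(text)](text)
```

Client code calls `handle()` with either kind of address; the split between the two stacks stays hidden behind the table.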

Though this is appealing, the amount of work necessary to separate the two stacks would probably make this prohibitively difficult, for very little gain. Any integral solution should still try to separate out functionality where possible - although link-local addressing was implemented separately as ZeroConf for IPv4, in IPv6 link-local addressing is a core function and cannot be made optional. However, DHCPv6 is an independent protocol and could be implemented separately (although probably with a more sensible interface than the current DHCP method).

Quality-of-Service

Quality-of-Service may not be a significant concern for some users, but it would be useful to at least be able to provide support at the DCI interface (layer 2), which currently does not allow for 802.1Q (VLAN/priority tagging). Introducing a means to support this would at least allow the flags to be set. The existing code already handles the Type Of Service IP options, so this probably doesn't need extra examination.
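For reference, an 802.1Q tag is four bytes inserted after the source MAC address: a 16-bit tag protocol identifier of 0x8100, followed by 16 bits holding a 3-bit priority, a 1-bit drop-eligible indicator and a 12-bit VLAN identifier. The sketch below builds such a tag; the function names are illustrative only and say nothing about how a DCI driver would expose this.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def vlan_tag(priority, vlan_id, drop_eligible=False):
    """Build the 4-byte 802.1Q tag for a priority (0-7) and VLAN id (0-4095)."""
    if not 0 <= priority <= 7:
        raise ValueError("priority must be 0-7")
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("VLAN id must be 0-4095")
    tci = (priority << 13) | (int(drop_eligible) << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)


def tag_frame(frame, priority, vlan_id):
    """Insert the tag after the 6-byte destination and 6-byte source MACs."""
    return frame[:12] + vlan_tag(priority, vlan_id) + frame[12:]
```

A tagged frame is four bytes longer than the original, which is one reason the driver layer has to be aware of tagging rather than leaving it to the protocol modules.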

VLAN/priority tagging probably is of lesser use to most workstations - such settings are often applied at the switch or router, rather than at the workstations. However, it would make the stack a little more capable. Probably not enough to warrant the work, and I am probably being generous in even saying that. I just think it would be fun to have the stack handle it well <smile>.

Firewall

I had already added support for packet logging, and support for NAT (Network Address Translation), and the stateful firewall provided by the filters would make it a lot safer to use a RISC OS system closer to the external network. These were absolute requirements if the system was intended to be used on a portable device. The next stage in their development, aside from tidying up, documenting and improving the back end, would have been to add an easier to understand interface for controlling the firewall.

The standard way to control the built in firewall was to throw an editor at the command file. That is fine for those who want to use the command line, but that isn't the target of the system. There needed to be a way to control the firewall, that normal people might be able to handle. Some simple buttons to block the normal protocols would be nice.

I had envisaged a basic configuration window which included the core firewall rules that were provided by the Internet module's built in processor. These would be the fundamental rules that denied access to the local machine for certain protocols. The default settings would offer rules to block ShareFS, and a few other common ports that offered little benefit. There could easily be options for blocking the ports 'not on the private network', which would catch external cases.
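The 'not on the private network' option could be sketched as below. This is an illustration only: the port number is a placeholder rather than the real port of any RISC OS service, and the rule is evaluated in Python rather than by the Internet module's rule processor.

```python
import ipaddress

# Placeholder value for illustration; not the real port of any service.
EXAMPLE_SERVICE_PORT = 40000


def is_blocked(source, port, blocked_ports, allow_private=True):
    """Decide whether a packet to a local port should be dropped.

    Ports not in blocked_ports are always allowed. With allow_private
    set, sources in private address ranges are still let through.
    """
    if port not in blocked_ports:
        return False
    if allow_private and ipaddress.ip_address(source).is_private:
        return False
    return True
```

A tick box in the configuration window would then map onto little more than adding or removing a port from the blocked set.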

An advanced configuration would allow control over the stateful filtering, when present, which would allow ports to be forwarded to other machines, blocked, or offered as NAT when outgoing. There could be other common settings, and we would always offer a way to get down to the bare configuration file if necessary.

I wanted to investigate how easy it would be to provide plug ins for the filter module, to allow other protocols to extend the operations. On Linux, this was usually achieved by building the additional filter types as modules and loading them at run time. A similar system should be possible for the filtering on RISC OS, albeit using relocatable modules.

Name services

I have mentioned that I wanted to improve the Resolver, and some of the work that had already gone into it. Essentially, I wanted the Resolver to be a switching service that could take input from a number of other sources, which were already available to us. We already had name services using the local Hosts file, DNS, Freeway host names, LanMan host names, and Multicast DNS, but these should have been tidied up so that they could be selected more usefully.

It might be that you never wanted to trust LanMan name service, so being able to turn the queries off might help. This sort of thing is more useful when you're on an insecure network, such as a portable device might be when using wireless networking. Similarly, such traffic is a drain on processor time and battery in such situations.

At its simplest, a basic set of options in the configuration application to enable and disable resolution using different services should be possible. The Multicast DNS services offered information about the services that were available to the system, but without anything using them, it was a little pointless.
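A minimal sketch of the switching idea, assuming each name service can be represented as a lookup function that returns an address or nothing. The class, the source names and the addresses are all invented for the example; the real Resolver interfaces are not shown here.

```python
class ResolverSwitch:
    """Try each enabled name source in order until one answers."""

    def __init__(self):
        self._sources = []      # ordered list of (name, lookup function)
        self._disabled = set()  # names of sources the user has turned off

    def register(self, name, lookup):
        self._sources.append((name, lookup))

    def set_enabled(self, name, enabled):
        if enabled:
            self._disabled.discard(name)
        else:
            self._disabled.add(name)

    def resolve(self, host):
        for name, lookup in self._sources:
            if name in self._disabled:
                continue  # never generates traffic for a disabled source
            address = lookup(host)
            if address is not None:
                return name, address
        return None


switch = ResolverSwitch()
switch.register("hosts", {"fileserver": "192.168.0.2"}.get)
switch.register("lanman", {"fileserver": "192.168.0.99", "pc": "192.168.0.3"}.get)
switch.set_enabled("lanman", False)
```

With the LanMan source disabled, a name known only to LanMan simply fails to resolve, and no query for it is ever issued.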

Network printing

Particular areas that I planned to add were automatic registration of printers based on their service discovery using multicast DNS. This should have been reasonably easy to achieve. The service definitions for the printing protocol were quite stable even when I was looking at it, all those years ago. Mappings would most likely be necessary for certain model types, but those same mappings would be necessary for USB printers, so could have been managed centrally.
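The mapping step might look something like the sketch below, which assumes the printer's service record carries 'key=value' TXT entries and that the make and model appear under a `ty` key, as is common in DNS-SD printer records. The mapping table and the definition names are hypothetical.

```python
# Hypothetical table from advertised model text to a local printer definition.
MODEL_MAP = {
    "example laser 1000": "PostScript-Generic",
}


def parse_txt(entries):
    """Turn 'key=value' TXT strings into a dict with lower-cased keys."""
    record = {}
    for entry in entries:
        key, _, value = entry.partition("=")
        record[key.lower()] = value
    return record


def choose_definition(txt_entries, default="Unknown"):
    """Pick a printer definition from the advertised make and model."""
    record = parse_txt(txt_entries)
    model = record.get("ty", "").strip().lower()
    return MODEL_MAP.get(model, default)
```

The same table could be consulted for USB printers, keyed on whatever model string the USB descriptor provides.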

The existing MDNSService selection gadget might have been used, or extended to handle the different information types supplied in the printer service discovery. Launching the administrative interfaces for printers should also be possible from such an interface. The only real issue with using the gadget was that the main !Printers application does not use Toolbox, so anything that was added on, like this, would need to communicate cleanly as if it were a part of the main application, or the functionality would have to be duplicated in !Printers. There was always the option to rewrite it to use Toolbox, but that way lies madness - it was very unlikely to happen, however much maintenance gain there might be.

Work had been done by others to provide an Internet Printing Protocol driver, and this was functional, although not very friendly. At the time I tested it, it was prone to hang the machine, or just crash. This could have been used as part of the general solutions for printing to network devices or a fresh implementation created (I have this nagging suspicion it was under the GPL, therefore completely incompatible with the rest of the system - I could be misremembering, however).

LDAP

One of the areas that had not been addressed was the use of LDAP on RISC OS. At the very least it could be used as a generalised address book, but there was greater potential than just that. Providing a means for storing and retrieving data in an organised directory, it could have offered a useful program interface for some clients. As a data source, it could have quite easily fit into the generalised data retrieval interface.

A module providing LDAP access would allow queries to be made using a simply defined set of SWI calls. Because of the need to authenticate (bind) with a server, it might be necessary to open connections for communication rather than just offering a plain anonymous query interface. There are a variety of different ways that LDAP can be queried, so I can imagine that the interface might be quite involved for some operations - being 'lightweight' doesn't necessarily mean that the protocol is simple.

SNMP

Which leads me to the 'Simple Network Management Protocol' which is often far from 'simple'. There are two sides to SNMP support - the client and the server (or 'agent'). The client can query the server, and may use a number of defined 'MIBs' (a system of describing the results from the server), to retrieve the state of individual data items on the system. Each data item is identified by a unique number sequence called an 'OID' (Object Identifier). Some of those states are writable, so the client can actually control the remote device. Usually this is done under strict control and requires authentication.
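To give a flavour of why it is not that simple, here is how an OID is encoded on the wire using the ASN.1 basic encoding rules: the first two numbers are folded into one value, and every value is written in base 128 with the top bit marking continuation bytes. This sketch handles only short encodings (under 128 bytes) and a first number of 0 or 1.

```python
def encode_arc(value):
    """Base-128 encode one number, high bit set on all but the last byte."""
    chunks = [value & 0x7F]
    value >>= 7
    while value:
        chunks.append((value & 0x7F) | 0x80)
        value >>= 7
    return bytes(reversed(chunks))


def encode_oid(oid):
    """Encode a dotted OID string as an ASN.1 OBJECT IDENTIFIER."""
    arcs = [int(part) for part in oid.split(".")]
    if len(arcs) < 2:
        raise ValueError("an OID needs at least two numbers")
    # The first two numbers share a single encoded value.
    body = encode_arc(arcs[0] * 40 + arcs[1])
    for arc in arcs[2:]:
        body += encode_arc(arc)
    if len(body) > 127:
        raise ValueError("long-form lengths are not handled in this sketch")
    return bytes([0x06, len(body)]) + body
```

For example, `1.3.6.1.2.1.1.1.0` (the standard system description item) encodes to the ten bytes `06 08 2b 06 01 02 01 01 01 00`.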

Additionally the SNMP agent can issue notifications to clients, unsolicited. These are known as 'traps', and often provide notification that an important event has occurred, or limit has been reached. Therefore, a client implementation as a module could easily provide a means to query the remote system, and to enumerate the details from the remote host, as well as a mechanism for receiving traps, and converting them into service notification and potentially Wimp broadcasts.

I remember that when I was doing some work for Pace, which was required to report details about the number of print jobs performed from an STB, I had offered as one of the solutions adding an SNMP server to report the necessary details. It seemed like a rather nice, and appropriate, solution, but they did not wish to provide the details that way.

Anyhow, a client part of an SNMP system would allow routers, switches and other systems to be queried and monitored. Like LDAP, this would fit nicely into the data source interfaces.

An SNMP server would need to be more specialised. Provided as a module, this would allow clients to register handlers for specific OIDs or ranges of OIDs. Handlers could be registered either as callbacks to other modules, or through the service call interfaces (although I am not at all convinced that this is a good idea). A means for Wimp applications to provide responses is something that I'd like to see, but it introduces a degree of additional complexity to the implementation.
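One way the registration might work is a table of OID prefixes, where the longest registered prefix that matches a request wins. The sketch below is only that; the class is invented, and the `99999` enterprise number is a placeholder rather than a real allocation.

```python
class OidRegistry:
    """Map OID prefixes to handlers; the longest matching prefix wins."""

    def __init__(self):
        self._handlers = {}

    @staticmethod
    def _parse(oid):
        return tuple(int(part) for part in oid.split("."))

    def register(self, prefix, handler):
        self._handlers[self._parse(prefix)] = handler

    def lookup(self, oid):
        arcs = self._parse(oid)
        # Try the full OID first, then progressively shorter prefixes.
        for length in range(len(arcs), 0, -1):
            handler = self._handlers.get(arcs[:length])
            if handler is not None:
                # The handler receives whatever follows its prefix.
                return handler(arcs[length:])
        return None


registry = OidRegistry()
# 99999 is a placeholder enterprise number for illustration.
registry.register("1.3.6.1.4.1.99999", lambda rest: ("subtree", rest))
registry.register("1.3.6.1.4.1.99999.1.0", lambda rest: ("exact", rest))
```

A handler for a whole subtree and a handler for a single item can then coexist, with the more specific one taking precedence.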

Similarly, the SNMP server should be able to send SNMP traps to clients, so there would need to be a configuration interface to allow the traps to be delivered (possibly as a standardised configuration interface which could be instantiated within an OID, allowing a group of OIDs to configure their traps in a similar manner).

SNMP is not easy to get right, and so it would be really useful to ensure that the code interacted well with other common implementations. Being able to query simple things like the interface state on the RISC OS system should be trivial, but it would also be possible to export other state information, possibly even a generalised export which produced details from the network statistics enumeration calls.

Of course, any such functionality would need to be configurable, both coarsely (turn the entire service off), and at a finer level (disable groups or particular OIDs). Obviously, killing the module would be one way to control this, but it should be possible to filter based on host address and similar other features.

SNMP is a little specialised, and the server interface more so - it is probably less useful for RISC OS as it is generally quite a poor server. The client could prove to have some uses in monitoring, however.

Universal Plug and Play

Universal Plug and Play (UPnP) is one of the worst collections of interfaces it has been my misfortune to work with. I didn't get very far in my investigation, and it convinced me that there are always worse ways to do things than you can imagine. To give it its due, it solves some interesting problems, but it solves them in ways that other systems could have done better.

The usual way that UPnP is visible is through notifications that a device is available - a router, or a video source. It can do more than that, but those are the most obvious. Initially, I wanted to be able to provide both the client and service for the Internet Gateway Device (IGD) part of the protocol. This is the bit that makes it possible to request address translation at the router to make applications seamlessly accessible to the world - and with a complete lack of security in doing so.

I had envisaged having a module which would be able to listen for the IGD requests and announce its presence, and then offer details to the machine. The statistics enumeration interface could provide details of the properties of the device, and dedicated interfaces would allow mapping of ports externally. I also wanted to let the RISC OS system become a gateway device, as this is something that it should be able to do.

The problem here is one of the technologies required. Universal Plug and Play requires zero configuration networking (the link-local addressing of interfaces), which I'd already implemented. However, being a Microsoft protocol, it doesn't use multicast DNS for its service discovery, but instead uses Simple Service Discovery Protocol (SSDP). This is an HTTP based protocol, which uses multicast UDP as the transport for announcements and discovery and unicast UDP for responses.
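To show how HTTP-like SSDP is, the sketch below builds a discovery request for an Internet Gateway Device and parses a response. Searches are sent to the multicast group 239.255.255.250 on UDP port 1900; no socket is opened here, and the function names are illustrative only.

```python
SSDP_GROUP = ("239.255.255.250", 1900)
IGD_TARGET = "urn:schemas-upnp-org:device:InternetGatewayDevice:1"


def build_msearch(search_target, max_wait=2):
    """Build the datagram for an SSDP M-SEARCH discovery request."""
    lines = [
        "M-SEARCH * HTTP/1.1",
        f"HOST: {SSDP_GROUP[0]}:{SSDP_GROUP[1]}",
        'MAN: "ssdp:discover"',
        f"MX: {max_wait}",       # seconds a device may delay its reply
        f"ST: {search_target}",  # what kind of device or service we want
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(datagram):
    """Split a unicast SSDP response into its status line and headers."""
    head = datagram.decode("iso-8859-1").split("\r\n\r\n", 1)[0]
    status, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().upper()] = value.strip()
    return status, headers
```

The `LOCATION` header of a response carries the URL of the device description.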

Because only a single client can be using SSDP as a service at a time, it would make sense for there to be an SSDP module which provided registration and query SWI calls to control requests and responses.

Having obtained details about the device, a URL is then extracted to access the definition of the device. This is usually an HTTP request which returns an XML description of the device, its details, what it provides, how it is controlled and how it can provide notifications. Icons are also allowed for the device, and can be in many image formats, but PNG at least must be supported.
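A minimal sketch of reading such a description, using Python's standard XML parser in place of ParseXML or libxml2. The sample document is invented and heavily cut down; only the namespace and element names follow the UPnP device description format.

```python
import xml.etree.ElementTree as ET

NS = {"d": "urn:schemas-upnp-org:device-1-0"}

# Invented, heavily trimmed description for illustration.
SAMPLE = (
    '<root xmlns="urn:schemas-upnp-org:device-1-0"><device>'
    "<friendlyName>Example Router</friendlyName>"
    "<serviceList><service>"
    "<serviceType>urn:schemas-upnp-org:service:WANIPConnection:1</serviceType>"
    "<controlURL>/ctl/IPConn</controlURL>"
    "</service></serviceList>"
    "</device></root>"
)


def summarise(xml_text):
    """Extract the device name and the control URL of each service."""
    root = ET.fromstring(xml_text)
    device = root.find("d:device", NS)
    services = [
        {
            "type": service.findtext("d:serviceType", "", NS),
            "control": service.findtext("d:controlURL", "", NS),
        }
        for service in device.iterfind(".//d:service", NS)
    ]
    return {"name": device.findtext("d:friendlyName", "", NS), "services": services}
```

The control URL is where requests to operate the device are later sent.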

The URL fetchers would be able to handle HTTP requests. XML decoding was already available through ParseXML, or could have used the libxml2 library if necessary. Image format decoding could be done with the ImageFileConvert stack, or using the newer URLImage toolbox gadget. So parts of this were already in place - it would have been a matter of just bringing these together.

It gets to be more fun, though. Device operations can be controlled through SOAP requests - an XML formatting schema that allows Remote Procedure Calls to be performed over a number of transports, in this case HTTP. I had considered, but not implemented, a SOAP module which would allow Remote Procedure Calls to be performed either blocking or with polled or callback responses. This would allow a greater range of operations to be performed than just making the UPnP protocol work.
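As an illustration of what a SOAP call amounts to, the sketch below builds the HTTP headers and XML body for a UPnP action. It only constructs the request; sending it is left to whatever performs the HTTP POST. The function name is invented, and argument names are assumed to be valid XML element names.

```python
from xml.sax.saxutils import escape

SOAP_ENVELOPE = "http://schemas.xmlsoap.org/soap/envelope/"
SOAP_ENCODING = "http://schemas.xmlsoap.org/soap/encoding/"


def build_call(service_type, action, arguments):
    """Return (headers, body) for a SOAP request invoking a UPnP action."""
    args = "".join(
        f"<{name}>{escape(str(value))}</{name}>"
        for name, value in arguments.items()
    )
    body = (
        '<?xml version="1.0"?>'
        f'<s:Envelope xmlns:s="{SOAP_ENVELOPE}" s:encodingStyle="{SOAP_ENCODING}">'
        f'<s:Body><u:{action} xmlns:u="{service_type}">{args}</u:{action}></s:Body>'
        "</s:Envelope>"
    )
    headers = {
        "Content-Type": 'text/xml; charset="utf-8"',
        # The action is named in a header as well as in the body.
        "SOAPAction": f'"{service_type}#{action}"',
    }
    return headers, body
```

A module offering this as a service would add the transport and the decoding of the response, either blocking or through polled or callback completion.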

Notifications of events were provided through another protocol, one which is otherwise a forgotten Internet Draft. The General Event Notification Architecture offers an HTTP based notification of events. As this has no other use (to the best of my knowledge), the notification system doesn't really need to be that extensible.

All these bits make up the basic protocol, and that is before we get to any device specific operations like controlling video playback, or setting up port forwarding. Out of the technologies, however, there are a few interesting and useful things, which is one reason why I was investigating it (in addition, obviously, to the original goal). There existed a few libraries for handling UPnP, but whilst I did investigate a few, they were sufficiently complex that I didn't get very far with them. Breaking the implementation into sections which could be implemented in isolation would have made the job far simpler.

In particular, a SOAP client would probably be quite useful for some operations. It should not be that difficult to produce a modular implementation. I had already made a start on the beginnings of such an interface, but it was far from complete.

ShareFS

As network file systems go, ShareFS was never going to set the world alight, and with its legacy design as a single client service, it was always going to perform worse than other systems.

ShareFS has been much maligned for its poor performance in some network configurations. There is no one cause of this, and the 'fixes' that people have proposed show a lack of understanding of the root cause of the issues - because their understanding of ShareFS is poor, and because there is no single root cause. Many factors conspire to limit the performance, only a few of which are present in ShareFS itself.

Select addresses a significant portion of the problems, although some lie outside its control, in the network drivers themselves. One of my commit messages for improvements to the handling of MBuf scavenging says "Tested on build machine in 32bit OS and appears to work correctly. Transfers over ShareFS of the 6M executables are now considerably more reliable - I have not had any failures from them, as compared to failures around 95% of the time previously." This was just one of many improvements which were intended to address issues with the transfer speeds.

ShareFS, however, suffers from its heritage as a point-to-point file transfer system. It is able to service only a single client at any given time, and other clients will be intentionally queued until it is free to service them. This is the primary reason why ShareFS was never going to be a replacement for NetFS as a general storage protocol. Well, that and its lack of authentication.

The transaction code needed to be rewritten, and that means more than doing spot analysis of the code - the whole algorithm needed significant work to allow it to be used by multiple clients simultaneously. It is a relatively sensible module, which is only let down by the fact that it was not designed for the task to which it has been put.

The internal handling of its protocol needed some work, however. Despite being quite good at what it did, it could still be subverted and behave badly when invalid input was given to it. Validation is limited in the module. Similarly, it is very simple to cause ShareFS to be accessible across the Internet, if the RISC OS system is directly connected without firewalls.

The filing system interfaces were reasonably flexible in ShareFS, but some of the operations treated values as signed where they should have been unsigned. This meant that trying to access very large files failed. However, as the rest of the system did not handle large files well, this issue was not significant. Both could be addressed in line with one another - remembering, of course, that any changes in the protocol required that the ShareFS module fall back to older methods, rather than assuming it would only communicate with capable versions of the module.
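The general shape of this kind of problem is easy to demonstrate, although the sketch below is a generic illustration rather than a description of the actual ShareFS code: a 32-bit file offset of 2GB or more becomes negative when read as a signed value, so any range check against it fails.

```python
import struct


def as_signed32(offset):
    """Reinterpret an unsigned 32-bit file offset as a signed 32-bit value."""
    return struct.unpack("<i", struct.pack("<I", offset))[0]


def offset_in_file(offset, length, signed=True):
    """A bounds check of the sort that goes wrong with signed offsets."""
    if signed:
        offset = as_signed32(offset)
    return 0 <= offset < length
```

With a 3GB file, an offset of exactly 2GB is valid, but the signed form of the check rejects it.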

Sharing

The sharing interface needed to be abstracted further. The initial implementation provided by the ShareFS Filer was only ever intended as a stop gap. Other systems exist which allow sharing, and when the 'Share' item is selected on menus it should invoke a similar dialogue box to the one that ShareFS Filer provides, but controlled by a switching module that allows other sharing systems to be used as well.

The share settings would be savable in a central manner, and the necessary configuration read by other sharing systems. This would allow an NFS or CIFS server, or even a WebDAV server, to be configured by the user in the same way. Advanced options could be provided through a protocol similar to the configuration tool plug ins, allowing more specialised settings.

Splitting out some of the user interface wouldn't save very much in the ShareFS module, but it would give a much better user experience when there were multiple sharing systems available.

SSH

I always wanted to return to !NettleSSH and add in proper support for the modern SSH 2 protocol. This should never have been hard in the first place, but I focused on other areas. There would need to be some areas that were reworked in order for support for tunnelling to be handled nicely. The user interface to configure tunnelling is always a bit complicated, especially as many times you want to configure the tunnel as part of the connection, rather than as a single user tunnel.

So probably the configuration options would need expanding quite a bit, so that the tunnels could be configured on connect. Putty provides a reasonable interface, but it's still not exactly obvious what you need to use in order to set up the tunnel. Any improvement in that area would be useful!

It would be useful if the same code could be used to allow file copies to and from the remote system. It is possible that this could actually be provided through a URL fetcher, rather than directly in !NettleSSH itself. The protocol doesn't allow for directory listings, which might limit its use, but it would still be quite useful as a method of obtaining files.

User key handling would also have to be improved, allowing generation of user keys, and their association with different connections. It could all be quite fun to get working properly.

URL fetchers

After I stopped doing RISC OS things, I did return to reimplement the URL fetcher modules, and the manager that sits above them. This was really just an exercise to remind myself how to write C code, getting some practice in before I started at Picsel. However, I had planned on providing some improved functionality to the fetchers where they were limited previously.

The URL fetcher module itself, which performs the dispatch to the actual fetchers, would be improved by adding buffering for the headers and decoding of header fields. Essentially, the URL fetcher modules could return data in any size of chunks that they liked, returning the header and the entire body if it would fit into the data block. Or they could return a line at a time, as it arrived (or because that's just how they were implemented).

This meant that clients using the fetchers had to implement their own header buffering and parsing to separate the headers and the body. The flags that indicated the progress (eg 'awaiting header', 'received header') were relatively useless if the client had to provide all the decoding itself. They were only slightly better than useless, in that a client could skip decoding if the headers had not yet been received - but the headers usually appear in the first couple of packets, so this is a marginal improvement.

As the data has to pass through the URL_Fetcher in the first place, the very first data blocks received could easily be buffered by it and decoded there, either to be delivered on to the application as a single data block, or decoded through accessor calls to retrieve headers. Amalgamating the entire header decode into the module and allowing it to be read separately would be reasonably easy.

My own implementation already handled the buffering of the headers separate to the body. Primarily this is because the HTTP fetcher that I used did not separate the headers and body, because there is no requirement to do so in the specification. The main problem was that !Browse relied on them being separate. In any case, for most clients, being able to read and decode headers with a single SWI call would be exactly what they wanted. Being able to say something like SYS "URLFetcher_Status", %1, session%, "Content-Type", buffer%, bufferlen% TO status%,,response%,read%,total%,usedlen% to obtain specific information out of the header would be handy.
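The buffering described here can be sketched as follows: chunks of any size are fed in, nothing is passed on until the blank line ending the headers has been seen, and individual header fields can then be read by name. The class and method names are invented; this shows the technique, not the module's interface.

```python
class HeaderBuffer:
    """Buffer incoming data until the headers are complete, then decode them."""

    def __init__(self):
        self._pending = b""
        self.status_line = None
        self.headers = None  # becomes a dict once the header block has arrived

    def feed(self, chunk):
        """Accept a chunk of any size; return any body bytes it completes."""
        if self.headers is not None:
            return chunk  # headers already done, so this is all body
        self._pending += chunk
        head, sep, body = self._pending.partition(b"\r\n\r\n")
        if not sep:
            return b""  # still waiting for the end of the headers
        lines = head.decode("iso-8859-1").split("\r\n")
        self.status_line = lines[0]
        self.headers = {}
        for line in lines[1:]:
            name, _, value = line.partition(":")
            self.headers[name.strip().lower()] = value.strip()
        self._pending = b""
        return body

    def header(self, name):
        """Look up one header field by name, ignoring case."""
        if self.headers is None:
            return None
        return self.headers.get(name.lower())
```

A client then never sees a partial header, regardless of how the fetcher underneath chose to split its data.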

Equally it may be useful to allow the fetcher, or URL_Fetcher to buffer more data than its default. This could particularly help with fast connections or longer periods between poll times, as would happen in Wimp applications.

At the same time, it might be useful to remove the remaining collusion that the Acorn fetcher modules have with the URL_Fetcher, or document them, if necessary. I don't actually remember how the problem presented itself, but the code I have looked at implies that returning a session handle which isn't a pointer to the status word is bad - something tries to use it as the status word pointer, with bad results.

The HTTP fetcher would need its support for ranges, and for different content encodings, to be tested. Ranges are supported only as extra data headers, and so they are protocol specific. In principle, the same headers might be usable for FTP content as well, but this would need to be checked - and in any case would be dependent on the FTP server supporting range requests, just as the HTTP based request would be reliant on the HTTP server supporting them.
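For HTTP, the protocol knowledge in question is small: a `Range` request header naming the bytes wanted, and a `Content-Range` response header saying what was actually returned. A sketch of helpers that a fetcher could provide, with invented names:

```python
import re


def range_header(first, last=None):
    """Build the value of an HTTP Range header for a byte range."""
    return f"bytes={first}-{'' if last is None else last}"


def parse_content_range(value):
    """Parse 'bytes first-last/total'; total is None when sent as '*'."""
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", value.strip())
    if match is None:
        raise ValueError(f"unrecognised Content-Range: {value!r}")
    first, last, total = match.groups()
    return int(first), int(last), None if total == "*" else int(total)
```

A server that does not support ranges will simply return the whole content, so a client still has to check the response rather than assume that the range was honoured.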

Making range requests, or authentication, easier for clients would also be a beneficial improvement to the URL fetcher system. At present it requires knowledge of the protocols in order to be implemented correctly, where the fetcher could take much of the work off the developer's hands.

The AcornSSL module needs quite a bit of work to make it possible to report useful messages about certificates - at the time there was no handling at all available for them other than to provide client certificates. The URL fetcher interface also doesn't allow for any messages about certificates to be returned, which limits their usefulness. I had avoided working in that area because the need for additional checks for export control made it more complex.

I had already implemented the Data URL fetcher. It probably needed a review and check against the most recent standards, but would probably be mostly correct. It would need some checks to make sure that it wasn't itself a security problem for clients - I don't think it should be, but it would be prudent to at least check.
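For comparison, a minimal decoder for the `data:` scheme is sketched below. It is not the fetcher's code; it just shows the format: an optional media type, an optional `;base64` marker, a comma, and then the data, which is percent-encoded unless base64 is in use.

```python
import base64
from urllib.parse import unquote_to_bytes

DEFAULT_MEDIA_TYPE = "text/plain;charset=US-ASCII"


def decode_data_url(url):
    """Return (media type, data bytes) for a data: URL."""
    if not url.startswith("data:"):
        raise ValueError("not a data: URL")
    meta, sep, payload = url[len("data:"):].partition(",")
    if not sep:
        raise ValueError("data: URL has no comma")
    is_base64 = meta.endswith(";base64")
    if is_base64:
        meta = meta[: -len(";base64")]
    raw = unquote_to_bytes(payload)
    # validate=True rejects stray characters rather than silently skipping them.
    data = base64.b64decode(raw, validate=True) if is_base64 else raw
    return meta or DEFAULT_MEDIA_TYPE, data
```

Strict validation of the base64 data is the sort of check that helps keep the fetcher from passing malformed input on to clients.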

None of this is particularly difficult, but the URL fetchers should be core to many new applications using the Internet so they should be featureful and robust.

Routing

Originally the Internet stack was, I believe, supplied with a RouteD which provided routing through the Routing Information Protocol. Vestiges of this exist within the network configuration tool - they existed in the pre-Select version of the tool, and when I created the separate applications I retained the configuration options in the Routing settings, despite not knowing anything more about the module.

The source isn't around, and I believe that parts of it were rolled into NetG, which is used on !Gateway machines to provide routing between networks. I'm not completely sure about that as we had abandoned all the other variants of the Net module's builds in favour of retaining NetI.

RIP is not really suitable for modern networks, and I can't imagine many times that you might ever need it. Router configuration can be made by ICMP, using RouterDiscovery, although this is often not very useful.

At my work we use both BGP and OSPF routing protocols as part of our infrastructure - we are pretty specialised. It would be an interesting project to implement one of these to control routing. Probably not useful for almost any cases, but I'm a sucker for 'interesting' over 'useful'.

Any changes that exercised the routing system would undoubtedly expose more problems with that code - it had already been problematic when I was implementing DHCP and link-local address configuration.