PPPoE

I was talking to a friend at work about tun/tap interfaces and how they work and the like, and the subject of PPPoE (PPP over Ethernet) came up - because obviously a user mode implementation can easily use such interfaces. So I mentioned how I had written the PPPoE implementation (well, the access concentrator) for RISC OS a few years ago. It's not all that complicated, but it has little real use, as there's nowhere you'd actually have to use it.

Apparently I had it all working to a satisfactory extent in June 2005. And - I'd forgotten this - I implemented both ends, the Access Concentrator and the Client. It happily worked with the RASPPPoE server on Windows XP. The notes I've left myself give an example of having 3 simultaneous PPPoE sessions running (local Access Concentrator, PPPoE session connected to it, and a PPPoE session connected to remote Windows XP), alongside 4 wired Ethernet connections (and a virtual eh1 as well, for good measure). There's even some IPv6 address negotiation performed on the PPP link, as well as support for IPX and AppleTalk packets.

As well as a long discussion in my documentation, there's a little log of a session:

*pppoe_client -interface eh0 -service RiscPCService -user testuser -pass testpassword -vj
*pppoeinfo
PPP-over-Ethernet session information:

Interface:     pe unit 0
Type:          Client
Via:           eh unit 0 (0:c0:32:0:88:84)
Peer:          0:c0:32:0:72:6f
Session ID:    &0008
Stage:         Established
AC name:       RISC OS Experimental PPPoE AC
Service:       RiscPCService
TX statistics: Frames=0, Bytes=0, Errors=0
RX statistics: Frames=0, Bytes=0, Errors=0, Unwanted=0
Local IP:      10.0.0.4
Remote IP:     10.0.0.1
Local IPv6:    fe80::0200:a4ff:fe11:5077
Remote IPv6:   fe80::0200:a4ff:fe10:1726
User:          testuser
Max connect:   0 seconds
Idle timeout:  0 seconds
VJ compression:       Disabled
Protocol compression: Disabled
Interfaces:           IP up, IPv6 up
MTU:                  1492
Filters:
  Type=&0800 FrmLvl=1 AddrLvl=0 ErrLvl=0 Handler=&03a75a70
  Type=IEEE  FrmLvl=4 AddrLvl=2 ErrLvl=0 Handler=&0220297c


PPP-over-Ethernet AC information (RISC OS Experimental PPPoE AC):

Service:        RiscPCService
Server address: 10.0.0.1
Address range:  10.0.0.2-10.0.0.5
DNS servers:    192.168.0.100, unset
Users:
  familypc            10.0.0.5
  anotheruser         <dynamic>
  testuser            10.0.0.4

Service:        Experimental
Server address: 172.16.24.1
Address range:  172.16.24.2-172.16.24.5
DNS servers:    192.168.0.100, unset
Users:
  ringo               <dynamic>
  george              <dynamic>
  paul                <dynamic>
  john                <dynamic>


PPP-over-Ethernet transport statistics

Received:     12 packets (584 bytes)
  Discovery:  2 packets (124 bytes)
  Session:    10 packets (460 bytes)
Transmitted:  0 packets (0 bytes)
  Discovery:  0 packets (0 bytes)
  Session:    0 packets (0 bytes)

*ping -f 10.0.0.1
PING 10.0.0.1 (10.0.0.1): 56 data bytes
...............................
--- 10.0.0.1 ping statistics ---
4612 packets transmitted, 4581 packets received, 0% packet loss
round-trip min/avg/max = 0.000/35.583/80.000 ms

Which I think is pretty amusing - especially the fact that my driver has negotiated IPv6 link-local addresses.

I also seem to remember that writing the PPPoE module helped me to find the MBuf leak that had been nagging me for some time, which made the whole restartable Internet stack far more reliable.

GenericPPP

The PPPoE module was actually just one part of a suite of PPP modules. There was GenericPPP, which provided the actual PPP implementation. PPPoE provided the PPP-over-Ethernet implementation - acting as an Access Concentrator providing a number of configured services, and as a client which could connect to such Access Concentrators. Then there was PPPServerConfig, which provided the service configuration for when PPP was acting as a PPP server. PPPoE would accept configuration and then do whatever was necessary to negotiate a session, before attaching itself to GenericPPP, which would in turn register with the Internet module and feed PPP data from it to the PPPoE module for transmission 'on the wire'. PPPoE would obviously feed data from the wire to the GenericPPP module. For the LCP negotiation as a client, configuration would be provided by PPPoE. When acting as a server, however, the GenericPPP module would call an interface which PPPoE registered when it started - which would be the PPPServerConfig module. This would then perform the necessary authentication and return configuration parameters. For example, it might validate the user and password, and based on that (and the service configuration details) return a specific IP address to use. Or it might just return an address from a range. The system supported PAP, CHAP and MS-CHAP, I believe - although I had to faff around quite a bit to make the MS-CHAP implementation work.
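
To give a flavour of that division of labour, here's a hypothetical sketch in C of the sort of interface a server-side configuration provider might register with the PPP implementation. None of these names or signatures are the real module API - they're made up purely to illustrate the shape described above.

/* Hypothetical sketch of a server-side configuration provider interface.
 * None of these names are the real RISC OS module API; they just
 * illustrate the division of labour described above. */
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint32_t local_ip;       /* address for our end of the link */
    uint32_t peer_ip;        /* address assigned to the authenticated peer */
    uint32_t dns[2];         /* DNS servers to hand out (0 = unset) */
    uint32_t idle_timeout;   /* seconds of idle before dropping, 0 = never */
    uint32_t max_connect;    /* maximum session length, 0 = unlimited */
} ppp_peer_config;

typedef struct {
    /* Validate the user's credentials (PAP password, or CHAP/MS-CHAP
     * response) for a named service, and fill in the configuration to
     * use for the link: a fixed address for a known user, or one drawn
     * from the service's dynamic range. */
    bool (*authenticate)(const char *service,
                         const char *user,
                         const void *response, unsigned response_len,
                         ppp_peer_config *out);

    /* Release any per-peer resources (for example, return a dynamic
     * address to the pool) when the session ends. */
    void (*session_closed)(const char *service, const char *user);
} ppp_server_provider;

/* The PPP implementation would hold a pointer to one of these,
 * registered on startup, and call it during negotiation when acting
 * as a server. */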

Whilst looking for this code, I also stumbled upon a reimplementation of MessageTrans and ResourceFS, which were unfinished but had a reasonable amount implemented.

InetFilter

There's other interesting network code lying around too. The InetFilter module, and its friend InetLog, provide an IPFilters interface - supporting packet filtering, NAT and logging. There's a tcpdump port as well, which can record all the packets going through the Internet module (which acts as a switcher in this instance), and a little BASIC packet capture which can write pcap logs readable by other network packet processing tools.
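
The pcap format is simple enough that writing it from BASIC (or anything else) is easy; here's a minimal sketch in C of the standard libpcap on-disk layout - a fixed global header, then a record header before each captured frame. This is just the file format, not the BASIC program itself, and it assumes the compiler doesn't pad these structures (true on common compilers here).

/* Minimal sketch of writing a pcap capture file: the standard libpcap
 * global header, then a per-packet record header before each frame. */
#include <stdio.h>
#include <stdint.h>

struct pcap_global {       /* written once at the start of the file */
    uint32_t magic;        /* 0xa1b2c3d4: lets readers detect byte order */
    uint16_t ver_major;    /* 2 */
    uint16_t ver_minor;    /* 4 */
    int32_t  thiszone;     /* timezone correction, usually 0 */
    uint32_t sigfigs;      /* usually 0 */
    uint32_t snaplen;      /* maximum bytes captured per frame */
    uint32_t linktype;     /* 1 = Ethernet */
};

struct pcap_record {            /* written before every captured frame */
    uint32_t ts_sec, ts_usec;   /* capture timestamp */
    uint32_t incl_len;          /* bytes actually stored in the file */
    uint32_t orig_len;          /* bytes that were on the wire */
};

int main(void)
{
    FILE *f = fopen("capture.pcap", "wb");
    struct pcap_global gh = { 0xa1b2c3d4, 2, 4, 0, 0, 65535, 1 };
    fwrite(&gh, sizeof(gh), 1, f);

    /* A single (dummy) Ethernet frame, for illustration. */
    uint8_t frame[60] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };
    struct pcap_record rh = { 0, 0, sizeof(frame), sizeof(frame) };
    fwrite(&rh, sizeof(rh), 1, f);
    fwrite(frame, sizeof(frame), 1, f);
    fclose(f);
    return 0;
}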

I was relatively certain that the timer code wasn't present, and so NAT connections never timed out and the like - but looking at the code, it seems that I did at least start that implementation. There's even a little 'Skeleton' module source, which was intended as the example to supply to anyone interested in using the API. Communication through the Internet module was - at the time - the most sensible route. Given a bit more time and need, I should really have created a broker module for low level interface access, through which Internet and others would claim interfaces, which would then be registered on their behalf with the Ethernet drivers. This would allow multiple handlers for each protocol. Mostly it'd be useful for logging, but filtering would come in at that level as well. There are downsides to that, but it would mean that the Internet module wouldn't be 'special' from that point of view.

It's not quite right having the log capture within the Internet module, but due to the way that there can only be a single claimant with the drivers (DCI 4 design), that was the only useful way to provide such handling. As many DCI 4 drivers were already around and unlikely to change, there wasn't scope to change the drivers themselves.

The situation with the DCI 4 specification and drivers in general was always a bit strange. The specification was closely guarded for such a long time - so much so that the barrier to actually developing a driver was too high for many people to do anything. The DCI 2 implementation was available, as the FreeNet stack existed and allowed people to use drivers, but it was limited in that it was so old that nobody actually had real DCI 2 drivers for anything. The original EconetA module was DCI 2 only, and was only updated to DCI 4 late in the day (if I remember correctly).

In many cases - such as the DCI 4 specification - I(/we) had to stick by the prior Acorn standpoint that the distribution was heavily restricted. Partly because we had no rights to say otherwise, and partly because the existing practice stands in lieu of any decision to the contrary. It's usually nice to go 'tada, here have all this stuff', but you can't really do that when you're restricted by agreements and the like.

Ping and Traceroute

There were quite a lot of network components that got knocked together, partly out of curiosity, partly out of need, and partly speculatively. The Ping and Traceroute modules were written back in 2000, just to provide a shared way of collecting results from Ping (ICMP Echo) and traceroute (UDP with increasing TTL - no other forms were implemented), rather than having to reinvent the wheel for other tools. The desktop interface never got written for them - partly because the command line front ends did the job quite well. The other reason they existed was to make those functions easily available to other developers. Maybe someone would want to probe a remote system when they were getting no data - not saying that's a great plan, but it'd be easier if there were a shared implementation to do so.
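
For anyone unfamiliar with the UDP form of traceroute, here's a minimal sketch in POSIX C (not the RISC OS module): send UDP datagrams towards the target with an increasing TTL, and listen for the ICMP replies that come back from each hop. The target address, base port and hop limit are illustrative, the matching of replies to probes is omitted, and the raw ICMP socket needs appropriate privileges.

/* Sketch of UDP traceroute: probes with increasing TTL, ICMP replies. */
#include <stdio.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/time.h>

int main(void)
{
    const char *target = "10.0.0.1";                     /* illustrative */
    int udp  = socket(AF_INET, SOCK_DGRAM, 0);
    int icmp = socket(AF_INET, SOCK_RAW, IPPROTO_ICMP);  /* needs privileges */

    struct timeval tv = { 2, 0 };                        /* 2 seconds per hop */
    setsockopt(icmp, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    struct sockaddr_in dst = { .sin_family = AF_INET,
                               .sin_port   = htons(33434) }; /* traditional base port */
    inet_pton(AF_INET, target, &dst.sin_addr);

    for (int ttl = 1; ttl <= 30; ttl++) {
        setsockopt(udp, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl));
        sendto(udp, "probe", 5, 0, (struct sockaddr *)&dst, sizeof(dst));

        /* Each router that drops the TTL to zero returns ICMP Time
         * Exceeded; the destination itself returns Port Unreachable. */
        char buf[512];
        struct sockaddr_in from;
        socklen_t fromlen = sizeof(from);
        if (recvfrom(icmp, buf, sizeof(buf), 0,
                     (struct sockaddr *)&from, &fromlen) > 0)
            printf("%2d  %s\n", ttl, inet_ntoa(from.sin_addr));
        else
            printf("%2d  *\n", ttl);
    }
    return 0;
}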

Ultimately, though, the modules weren't so useful - they could (for example) have made the command line tools far smaller. But why bother when they're so rarely used?

DHCPClient

DHCP is quite interesting - obviously it was created in order to fill a gap back in... um... Select 1? Somewhere around there. It was not only a tick list item, but also a deficiency that prevented some users from having any network connectivity. Testing it was... amusing. It was tested against (if I remember correctly) the Linux DHCP server and a Windows DHCP server to check that it would work with the most common systems. It then went to various people to try it out. Chris Williams, in particular, was interesting, as he had Internet access in his University room but could not get it to work with the DHCP module. Amusingly, the initial email discussions about DHCP appear to be around 15 Feb 2002 - which is the first logged change for the DHCP module. The problem on his network was that the system ended up in the SELECTING state until it timed out, and would then reset to INIT and try resending DISCOVER. Around that time, I added the parameter request list because (seemingly) the Windows DHCP server wouldn't even respond if I didn't send a list. Then there were other things that were fixed, like the server identifier not being sent, T1/T2 values not being defaulted, and the client identifier - which seemed to help with NTL cable at the time.
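
For reference, the parameter request list is just another option in the DHCP packet. A minimal sketch in C of building the options area of a DISCOVER looks like this - the option numbers are the standard RFC 2132 ones, and the particular requested options and the MAC handling are illustrative rather than what the module actually sent.

/* Minimal sketch of the options area of a DHCPDISCOVER, showing the
 * options mentioned above: message type, client identifier, and the
 * parameter request list that some servers (notably Windows) want to
 * see before they will reply. Option numbers are from RFC 2132. */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

size_t build_discover_options(uint8_t *opt, const uint8_t mac[6])
{
    uint8_t *p = opt;

    /* Magic cookie marking the start of the options area. */
    static const uint8_t cookie[4] = { 99, 130, 83, 99 };
    memcpy(p, cookie, 4); p += 4;

    *p++ = 53; *p++ = 1; *p++ = 1;      /* DHCP message type = DISCOVER */

    *p++ = 61; *p++ = 7; *p++ = 1;      /* client identifier: hardware type 1... */
    memcpy(p, mac, 6); p += 6;          /* ...followed by the MAC address */

    *p++ = 55; *p++ = 4;                /* parameter request list */
    *p++ = 1;                           /*   subnet mask */
    *p++ = 3;                           /*   routers */
    *p++ = 6;                           /*   DNS servers */
    *p++ = 15;                          /*   domain name */

    *p++ = 255;                         /* end of options */
    return (size_t)(p - opt);
}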

Previously, when I'd been at Pace, I'd spoken to Stewart Brodie about the issues he'd had with their DHCP implementation. It wasn't particularly helpful in doing mine, but it was useful in convincing me that the same sorts of problems happen whatever implementation you make. Plus, he's way cleverer than me and was fun to talk to at the time <smile>.

Eventually the whole automatic reconfiguration of addresses was made more sane, with much of the common code that went into DHCP and the later ZeroConf (link-local address allocation) module (oh, and Freeway, which did reconfiguration for AUN addresses) being moved into the InetConfigure module. ZeroConf had some shared bits with DHCP (ARP probing and the like) but was pretty much a complete implementation of the RFC - I was quite pleased with it really. It was one of the few places where I actually noted compliance with, and variance from, the specification.
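
The link-local allocation itself is a pleasingly small state machine. Here's a rough sketch of the RFC 3927 behaviour - pick a pseudo-random address in 169.254/16 (excluding the first and last /24), ARP-probe for it, and claim it if nobody defends it. The ARP send/receive is stubbed out here and the RFC's timings are omitted; this isn't the ZeroConf module's code.

/* Sketch of RFC 3927 IPv4 link-local allocation: choose, probe, claim. */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <stdbool.h>

#define PROBE_NUM     3    /* ARP probes before claiming (RFC 3927) */
#define ANNOUNCE_NUM  2    /* gratuitous ARPs once claimed */

/* Stub: send an ARP probe for addr and report whether anyone defended it. */
static bool arp_probe_conflicts(uint32_t addr) { (void)addr; return false; }

static uint32_t pick_linklocal(void)
{
    /* 169.254.1.0 .. 169.254.254.255 */
    uint32_t host = 0x100 + (uint32_t)(rand() % (254 * 256));
    return (169u << 24) | (254u << 16) | host;
}

int main(void)
{
    uint32_t addr;
    for (;;) {
        addr = pick_linklocal();
        int probe;
        for (probe = 0; probe < PROBE_NUM; probe++)
            if (arp_probe_conflicts(addr))
                break;                 /* someone owns it: pick another */
        if (probe == PROBE_NUM)
            break;                     /* unanswered: it is ours */
    }
    printf("claimed %u.%u.%u.%u\n",
           (unsigned)(addr >> 24), (unsigned)((addr >> 16) & 255),
           (unsigned)((addr >> 8) & 255), (unsigned)(addr & 255));
    /* ...then send ANNOUNCE_NUM gratuitous ARPs and keep defending it. */
    return 0;
}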

RouterDiscovery

RouterDiscovery was similar, albeit it was a far older system. It would never have been much use, but it was capable of tracking routers and of announcing itself as a router if so configured. It also worked with DHCP such that when the correct option was specified it came alive on the interface. I thought that was pretty neat, although it would have been nicer if it had been a more extensible interface.

RevARP

The RevARP module provided a Reverse ARP server (listens for RARP requests and responds if the ARP entry is known), which could be used as an alternative means of providing a machine's address. RISC OS has supported it for quite some time as a client. Nothing really uses it, so it's not that useful, but it's handy to have these things lying around.

TFTPd

Similarly unuseful, I had a TFTP server that I was using partly as a test of the TFTP client, and partly as an example of accelerating an existing protocol within the constraints of the specification. Mostly I was focusing on keeping multiple packets in flight, rather than increasing the packet size (which was used elsewhere but not supported by the target clients). The improvements in speed were quite significant, but the server had to track what had been acknowledged more carefully as the number of packets in flight and the number of clients in use increased. It never really worked out, though, so it sat in an experimental area for a long time. So much so that as I talk about it, I can't actually find the code.

Ah, 5 minutes later and I've found it - it's pretty bad in places, but looks like it would do the job. There is a nice '#ifdef notdef' bracketing the "mail" transfer format, which kinda amuses me. I don't think I ever had plans to implement it, although queueing via !Gmail wouldn't have been too hard.
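
For what it's worth, the 'multiple packets in flight' idea is mostly just window bookkeeping on top of ordinary TFTP. Here's a rough sketch of that bookkeeping - not the server's code: the transport is stubbed, and the acknowledgement handling is simplified to a cumulative ACK for brevity, whereas standard TFTP (RFC 1350) is strictly lock-step with per-block ACKs.

/* Sketch of keeping a window of TFTP DATA blocks in flight: track the
 * highest acknowledged block, keep the window full, and fall back to
 * resending from the last ACK on timeout. */
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define WINDOW 4            /* blocks allowed in flight (tunable) */

/* Stubs standing in for the real UDP transfer. */
static void send_data_block(uint16_t block) { printf("DATA %u\n", block); }
static bool wait_for_ack(uint16_t *acked)   { static uint16_t a; *acked = ++a; return true; }

int main(void)
{
    uint16_t total_blocks = 20;    /* illustrative file of 20 x 512-byte blocks */
    uint16_t acked = 0;            /* highest block acknowledged so far */
    uint16_t next  = 1;            /* next block to transmit */

    while (acked < total_blocks) {
        /* Keep up to WINDOW unacknowledged blocks outstanding. */
        while (next <= total_blocks && next <= acked + WINDOW)
            send_data_block(next++);

        uint16_t a;
        if (wait_for_ack(&a)) {
            if (a > acked) acked = a;  /* peer has everything up to block a */
        } else {
            next = acked + 1;          /* timeout: resend the whole window */
        }
    }
    return 0;
}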

WakeOnLAN

There's a little WakeOnLAN module here that will issue the wake-up messages to other systems, given their MAC address. It appears to be pretty simple, but like the Ping and Traceroute modules it is of very limited use. It appears that in order that the WOL packet be ignored by the receiving system, I chose to use an ICMP RouterAdvertisement with no routers present. The idea seems to be that this shouldn't leave the local subnet, and that any RouterDiscovery clients should silently discard the redundant announcement - but, of course, the Ethernet device would wake up when it saw the payload with its MAC address in it. The module certainly worked, but whether using an ICMP RouterAdvertisement was better than any other method, I've no idea.
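
For reference, the part the network card actually matches on is the 'magic' payload - six 0xFF bytes followed by the target MAC repeated sixteen times - regardless of what sort of packet carries it. Here's a minimal sketch of building and sending it; the carrier here is a plain UDP broadcast rather than the ICMP RouterAdvertisement trick described above, and the MAC address is just the example one from the session log.

/* Minimal sketch of a Wake-on-LAN 'magic' payload: 6 bytes of 0xFF
 * followed by the target MAC repeated 16 times, carried here in an
 * ordinary UDP broadcast. */
#include <string.h>
#include <stdint.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    const uint8_t mac[6] = { 0x00, 0xc0, 0x32, 0x00, 0x72, 0x6f }; /* example */

    uint8_t payload[6 + 16 * 6];
    memset(payload, 0xff, 6);
    for (int i = 0; i < 16; i++)
        memcpy(payload + 6 + i * 6, mac, 6);

    int s = socket(AF_INET, SOCK_DGRAM, 0);
    int one = 1;
    setsockopt(s, SOL_SOCKET, SO_BROADCAST, &one, sizeof(one));

    struct sockaddr_in dst = { .sin_family = AF_INET,
                               .sin_port   = htons(9) };  /* discard port, by convention */
    dst.sin_addr.s_addr = htonl(INADDR_BROADCAST);

    sendto(s, payload, sizeof(payload), 0,
           (struct sockaddr *)&dst, sizeof(dst));
    close(s);
    return 0;
}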

InternetTime

The InternetTime module was really a way to get some form of time synchronisation that didn't rely on regular command line invocation, and could support multiple protocols. Normally you would have the RTC on the system keeping your time, and your system clock might drift from that but the RTC should be pretty good. However, you don't want to be probing the RTC too often - it's slow (usually) and in any case you shouldn't be drifting much (so don't leave IRQs turned off for long periods!). The RiscPC RTCAdjust module addressed this by trusting the RTC and changing the period of the centisecond timer to compensate - it should drift back to alignment (assuming no further problems) over a two hour period - it adjusts every hour.

However, this requires knowledge of how to manipulate the internal timer. There were interfaces in the works for manipulating timers so that this could be done in a generic way, but it was still not great. RTCAdjust wasn't so useful - especially on other non-RiscPC hardware (that module should not know anything about how to manipulate the timers that are owned by the Kernel).

Anyhow, aside from RTCAdjust we can use external clock sources, which is what InternetTime did: picking up time from a Time server, or an NTP server (or both), and keeping the clock up to date. The module would use the variables configured by DHCP for the servers.
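
Getting the time from an NTP server is, at its simplest (SNTP), just a matter of noting four timestamps and computing the offset. The usual calculation is shown below; this is the standard SNTP arithmetic, not the module's code.

/* Standard SNTP clock calculations: T1 = client transmit, T2 = server
 * receive, T3 = server transmit, T4 = client receive. Values here are
 * in seconds (double) for clarity. */
double sntp_offset(double t1, double t2, double t3, double t4)
{
    return ((t2 - t1) + (t3 - t4)) / 2.0;   /* how far our clock is out */
}

double sntp_delay(double t1, double t2, double t3, double t4)
{
    return (t4 - t1) - (t3 - t2);           /* round-trip network delay */
}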

Because of the aforementioned lack of interfaces for skewing the timers, the time was just set directly, which is a bit tacky, but it got the job done - the actual implementation would have changed later when the APIs were available.

Like most of the service components - PPP, DHCPClient, RouterDiscovery, ZeroConf, ResolverMDNS, Resolver - InternetTime provided information about its behaviour through the statistics interface, so you could see stuff with *ShowStat. It had always been the intention that the ShowStat tool would be replaced by a desktop display of the details, but nothing ever got written. Chris Williams created !SockStats, which was a collection of the information that the *InetStat tool collected from the Internet module, but I don't think we advanced much further. The ShowStat tool itself actually says in its code that it's just an example of how you might do it.

Resolver

The Resolver module was a bit of a challenge really. I put off doing much with it for a long time, partly because It Did The Job, and partly because there were other more important things to deal with. When I started looking at how we got asynchronous updates as part of the Internet module - pollwords and events - it became clear that we needed some way that clients could know when a resolution was 'complete'. Otherwise everyone had to manually set up their own timers and re-poll the module to check if a result had arrived.

So it came to be that 'DNS Service Complete' events were created - notifying clients about different types of resolution completion once the module had determined their state. They weren't perfect, because the event would be raised for a host name that might not be the one that you queried - because of canonicalisation, or search domain addition, or because the primary name for a resolution that had resolved was different to that expected (eg a host found in the Hosts file). So you still had to ask the Resolver the state of the host you wanted to know about. Not the best of things, but at least you got notified in a timely manner <sigh>.

Also, if you had a negative response and two clients were making the request, the first to check the state would invalidate the entry (assuming that caching of lookup failures remained disabled), so the second would just get back an 'in progress' response because the Resolver tried to look it up again. That's not specific to the asynchronous notifications though - that happens anyhow. There were so many problems with the Resolver that I spent quite a while just cursing problem after problem. Clearly I introduced my own, but things like bad hostent structures being returned, or hostent structures being expired as you look at them, are less of a problem now. Only, the expiry issue was a little of my own making - I made the Resolver actively expire entries, rather than just keeping them forever.

There was also the automatic nameserver discovery that I added, which had limited usefulness. The idea was to probe local addresses to find a resolver that we could use. It worked quite well, but was a little noisy and relied on the particular behaviour of common Operating Systems and DNS implementations in order to work. The automated discovery could also trigger when no resolvers responded within a time limit - which was quite nice if you had (say) a nameserver on your network, and then you moved to another network with a different server on it (there were other issues there, but resilience in the face of a changing network was one of the goals).

Along with the nameserver discovery came the nameserver reordering, which would try to pick a faster nameserver by changing the order of the known servers. Under experimental conditions this performed really well, doing exactly what I had hoped. However, when tested on real systems with 'flakey' nameservers, the recovery wasn't fast enough. Changing the thresholds meant that, when used with normal nameservers on a busy machine, the servers could change around without much useful effect (but obviously doing more work). So really the only use might be to have a distant-but-functioning nameserver as your backup and a near-but-flakey server as your primary - a situation that generally didn't happen because, if your nearby server was flakey, you fixed it.

On the other hand, I've found that when I'm on the train using WiFi (with other OSs), this sort of behaviour from the nameserver is more common - the local being flakey, but the remote (eg Google's server) being more reliable. So maybe it would have helped with that.

Resolver was also multi-protocol, in the sense that it could talk to a few different resolution systems. Well, I say that, but it wasn't as nice as I wanted. I wanted to make Resolver into ResolverDNS, and create a new Resolver which was purely a switcher for FreewayHosts, Hosts file (which might be provided internally), LanManFS's NBNS, Resolver and ResolverMDNS (and other resolution protocols which could register themselves). As it was, Resolver talked to these itself.

Resolver could get the names from FreewayHosts and use them as resolutions if so requested. FreewayHosts would always register your hostname as a host on the network, which meant that any host which had FreewayHosts active should be resolvable by its hostname without any configuration of Hosts files or nameservers. Essentially this gave you link-local name resolution. Because there aren't enough of those protocols already; off the top of my head, Multicast DNS, NetBIOS Name Service, Link Local Multicast Name Resolution, Simple Service Discovery Protocol, Service Location Protocol, AppleTalk Name Binding Protocol, and of course Freeway.

Resolver could get the names from LanManFS's name services, so if there was a Windows machine (or a Samba system) on the network, its advertised name would be resolvable without any configuration. LanManFS had to be extended so that all its resolutions could happen asynchronously (previously they were only ever invoked synchronously).

It had always had the ability to use the Hosts file, so that wasn't any work. But it could also use the ResolverMDNS module to fetch hostnames announced by it. Linux systems running Avahi, or any modern Apple system, or Windows systems running the Rendezvous/Bonjour responder (hmm... does that include any system running iTunes then?) would appear available because of that.

Also, if the Resolver was configured to be a server it would announce itself as such using ResolverMDNS. I don't think I ever got it to pick up the multicast DNS server announcements though. Pretty sure I didn't.

Also never got around to adding extra type resolution through Resolver. It was never really necessary - vaguely interesting though it might have been to obtain TXT results and the like.

ResolverMDNS

I mentioned the ResolverMDNS module earlier, and its ability to decode names using the Apple Multicast DNS resolution protocol. This was a part of the general 'Zero Configuration' networking which was becoming more widely used. The library implementing it was released by Apple under license terms that were not too onerous, which meant that it was possible to use it within a module. The library itself expected compilers that supported packed structures, which was an issue as Norcroft did not. The solution was to do what you usually did with such data - unpack it into the expected structure layout. There weren't actually that many points at which the translation was needed, as the data going in and out happened in quite sensible ways.
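
That workaround is the usual one for on-wire data when the compiler can't express packed structures - read the bytes out explicitly into a native structure. A small sketch, using a DNS-style message header as the example (this is illustrative, not the mDNS library's own code):

/* Rather than casting the received buffer to a packed struct (which
 * Norcroft couldn't express), pull the big-endian fields out byte by
 * byte into a native structure. */
#include <stdint.h>

typedef struct {
    uint16_t id, flags;
    uint16_t qdcount, ancount, nscount, arcount;
} dns_header;

static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

void unpack_dns_header(const uint8_t *wire, dns_header *h)
{
    h->id      = read_be16(wire + 0);
    h->flags   = read_be16(wire + 2);
    h->qdcount = read_be16(wire + 4);
    h->ancount = read_be16(wire + 6);
    h->nscount = read_be16(wire + 8);
    h->arcount = read_be16(wire + 10);
}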

Multicast DNS is intended to be used within a local subnet to announce and locate services which are provided by other systems. At its simplest level, it can announce the name of a host, which allows anyone on that network to locate the machine by name (consider this similar to NetBIOS name resolution used by LanManFS). However, it is more useful than this. The server can announce the services which have been registered with the system. For example, a web server might be announced as running on a server with a given name - say 'Intranet site', or 'Documentation'. This would be distinct from the name of the server itself. Printers can announce not only their name and type but also what capabilities they have, which greatly improves the automation of finding printer drivers. It is similar in many respects to Freeway, albeit Freeway is a very limited system.

Clients can also browse for services, as well as locating explicitly named services. Browsing for a printer is a typical example, which might return many locations where printers can be found, and their interfaces. A printer server might offer multiple views on the same printer if it supported transcoding of content, or could natively support different formats.
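
To make the printer example a little more concrete, the records involved in such a browse look something like the following - the names, port and attributes here are purely illustrative:

; Browsing asks for PTR records on the service type...
_ipp._tcp.local.               PTR  Laser4._ipp._tcp.local.
; ...and each instance then has an SRV record (host and port) and a
; TXT record carrying its capabilities, plus the host's address record.
Laser4._ipp._tcp.local.        SRV  0 0 631 printserver.local.
Laser4._ipp._tcp.local.        TXT  "pdl=application/postscript"
printserver.local.             A    169.254.10.20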

All these functions were provided by the ResolverMDNS module through SWI calls. As mentioned previously, the Resolver module had been updated to use the host lookups to perform simple resolutions.

My notes here say that EtherX and EtherY don't support multicast reception, so whilst they could send packets, they would never receive anything back themselves. This hampered development for a little while - once I'd realised what the problem was, and changed network cards, it became easier.

I created a small toolbox gadget, MDNSService, which would use a scrolling list to present services which had been discovered. The gadget would be updated with new details as servers appeared and were removed externally. The results could be returned to the application, making browsing for services much simpler than it would otherwise be.

Whilst I rather liked the module in general, the API still needed work to make it stable and sensible. In particular, it was continuing the tradition from Resolver of returning pointers to its own workspace. Whilst these were protected, and should have been safe to use most of the time, it really wasn't a good solution. I had intended to provide a proper way of returning the data in a user block.

It might seem like it was a function that wasn't absolutely necessary for the operation of the system, but it made a lot of sense in the general scheme. At the time it was written (2004), Apple systems were being more widely used, and the rise of broadband meant that ease of connection to any systems that users might want to add would be more important. The concept of a 'home hub' or similar set of services available across the entire site had been floated by people so many times - and it was going to happen. The efforts that Apple put into making things simple to the user (however it might be underneath) were obvious, and the multicast DNS system was a part of that.

Systems were going to be used more often in 'zero configuration' modes, and this would become more prevalent as portable devices came into use. Much of the work I had set about doing had been to make portable devices more feasible for use with RISC OS, and this fitted in well. And, of course, in a static desktop, or other embedded type system, it would be important to have the capability to work with other clients.

MimeMap

The MimeMap module is vaguely interesting. I had my own that I had written before I joined RISCOS Ltd, and then we had the ANT one to use. I resisted importing my own because I didn't think it was production quality, it hadn't been tested as much and - more importantly to me - I didn't think it was right to just say 'well I wrote this so it must be better'. The ANT implementation was a lot smaller - mine had reused a few earlier libraries (particularly libraries that weren't especially suitable for the job) - but it was quite a bit slower than mine at searching, and didn't have wildcard support.

The wildcards were quite important - and became more so when the XML types were declared to have +xml as their suffix. Additionally we needed to support Apple type mappings, as we were including AppleTalk in the system. Features like automatically updating when the mappings file is changed, or the system variable is updated, were added too. Some of the searches were sped up by trying the system with 'common' cases, mainly focusing on the extension to filetype conversion as these were the most likely - CDFS and (I think) DOSFS had already been updated to use the calls, and other clients, like ImageNFS, were also using MimeMap as the way to convert types. LanManFS was still stuck in the dark ages, but I believe an option was present for using MimeMap over its internal mappings. Obviously, the MimeMap module was also used in decoding the media types returned from HTTP servers in a number of programs, so any improvement helped.
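
The wildcard matching itself is simple enough. Here's a little sketch in C of the sort of matching that lets a single pattern ending in '+xml' catch a whole family of media types - the pattern syntax here ('*' matching any run of characters) is illustrative, not the module's exact rules:

// Sketch of wildcard matching for media types, of the kind that lets a
// single entry such as "*/*+xml" catch application/rss+xml and friends.
#include <stdbool.h>

bool wild_match(const char *pat, const char *str)
{
    if (*pat == '\0') return *str == '\0';
    if (*pat == '*')
        return wild_match(pat + 1, str) ||          // '*' matches nothing...
               (*str && wild_match(pat, str + 1));  // ...or one more character
    return *pat == *str && wild_match(pat + 1, str + 1);
}

// Example: wild_match("*/*+xml", "application/rss+xml") is true,
//          wild_match("*/*+xml", "text/html") is false.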

Anyhow, there were some speedups in both the common searches and the failed searches - which were actually more likely. Simple returns for empty strings were surprisingly effective as a fast reject (and the calls themselves could be removed in places).

SysLog

The SysLog module had been present for quite some time - obviously it was originally implemented by Doggysoft, and I reimplemented it from their documentation in the initial RISCOS Ltd days. Actually, I have a vague feeling that it was one of the modules I wrote between Hexen and RISCOS Ltd. In any case, it had the ability to act as both a SysLog server (accepting messages from other clients using the 'standard' SysLog protocol) and a client, sending messages to other machines for logging elsewhere. It wasn't the first network module I'd written, but it was the first that was really reliable and incredibly useful. During early development of components, output could be thrown at SysLog for logging. If it was configured to buffer data and direct it to another machine, you could get quite a lot logged and reliably captured elsewhere during the course of a test.
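
The 'standard' protocol really is that simple: a UDP datagram to port 514 whose text starts with a priority value (facility * 8 + severity) in angle brackets. A minimal sketch of a sender - the log host, tag and message here are illustrative, and this is the traditional BSD form rather than the module itself:

/* Minimal sketch of sending a message to a remote syslog server using
 * the traditional BSD syslog protocol over UDP port 514. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

int main(void)
{
    const char *server = "10.0.0.1";    /* illustrative log host */
    int facility = 1;                   /* user-level messages */
    int severity = 6;                   /* informational */

    char msg[256];
    snprintf(msg, sizeof(msg), "<%d>test: hello from the client",
             facility * 8 + severity);

    int s = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in dst = { .sin_family = AF_INET,
                               .sin_port   = htons(514) };
    inet_pton(AF_INET, server, &dst.sin_addr);

    sendto(s, msg, strlen(msg), 0, (struct sockaddr *)&dst, sizeof(dst));
    close(s);
    return 0;
}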

This was particularly useful when debugging the filing systems, as (obviously) you don't want to be writing to disc, and you want a 'fire and forget' logger. For FileCore, I think we actually used the DebugIt system - a module with a SWI that was called to log a character or line - and various implementations of this module existed: one logged to memory, one to file, one just wrote to the screen, and then there was the SysLog variant which logged to SysLog. I think that's the reason that the SWI SysLog_LogCharacter interface exists - DebugIt would just call it, specifying the log name and level as DebugIt and 100 respectively.

Because it was a module developed quite early on, it stayed the same for quite some time. This meant that the module itself was a bad drain on the system when configured to be a server. It would poll for data every few centiseconds, and invariably would have nothing to do, so it was quite wasteful. It wasn't until quite late that the implementation was changed so that it used the Event vector instead. This made it a lot less wasteful.