Wednesday, 25 July 2012

Fowlers Delight?

Fowler published his paper on Continuous Integration (CI) way back in 2000. At the time I was a much more naive developer: I was not really an adherent of the whole XP community and had worked for a series of companies where a release build was a month-long process and the concept that the whole codebase could be built and tested on every commit was a comical notion at best.

By the time the 2006 edition of the paper came out, however, I was convinced. Ever since then I have tried to give the projects I work on as many CI features as possible. It is one reason I ran the ARM Kernel autobuilder for years, as an attempt to provide at least build coverage.

I especially like the way these concepts mean that releases are much less "scary" events and release anxiety is reduced to a minimum. When every commit produces something that is of release quality a project really is doing it right.

The reason I mention any of this is that I have recently set up the NetSurf project's CI server. The heavy lifting is being done by a Jenkins instance running on a VPS, fulfilling the "Every Commit Should Build the Mainline on an Integration Machine" part of the Fowler guidelines.

Setting up Jenkins was much less trouble than many of the previous systems I have used. I have been a little underwhelmed by the support for plain old make-based projects, for which you have to resort to shell scripts, but leaving that aside the rest was straightforward. The documentation could do with a purge pass for the old Hudson name and the error reporting (especially under memory pressure) is not great.

And there lies my one real issue with the tool: memory. The VPS has 512MByte of RAM and runs nothing beyond a web server, Jenkins and the C compiler to build the project components. One would hope that was plenty of memory; doing the builds by hand, gcc did seem happy in that space. Alas, Jenkins is a Java application and guzzles RAM (over 60% of RSS right now) and causes itself out-of-memory exceptions with distressing frequency.

The project is left with the choices:
  • The project is relatively poor (and I already blew the budget) so "living with it" and keeping an eye on it manually is a possibility.
  • Extending the VPS with another 512MBytes of RAM (Due to my VPS provider choice I had not anticipated the need so this option would cost almost as much as the original set-up)
  • Buying another more suitable VPS from somewhere reputable like Mythic Beasts (why I did not go there in the first place...dumb, sometimes I am just dumb)
It is such a pity too; aside from the RAM issues it is all going well and it has already encouraged us to get the toolchains for the less commonly built targets (PPC Amiga and Windows for a start) working.

A sad state of affairs

Surprisingly my previous post on my Debconf trip has gathered more queries over its title than anything else.

Mostly my blog titles are, if I can manage it, relevant quotes or book titles. For the last post I used the title of a collection of Mark Twain scripts surrounding his trip through Nicaragua in 1866.

I have not been able to read the publication myself beyond the quoted excerpts in more modern articles because, somehow, almost 150 years after the words were written the only access to these words is to buy a very expensive and rare 1940 publication.

This appears to stem from the fact that the original Clemens scripts were simply not published before 1940 and hence appear to gain copyright from that date (IANAL and I might be wrong; here is the source I used). Thanks to the US government effectively making copyrighted works published after 1922 be "protected" forever, I may never read it at all.

Rather sad really, but if you do have a copy I could borrow...oh no, that is probably illegal too? Better just let the words fade to dust, eh?

Tuesday, 24 July 2012

Travels with Mr. Brown

My return from Debconf12 has been tinged with a little wistfulness. I had a great time but wish I could have spent a little more time there to justify the seventeen hours travel each way. I took a lot of pictures which gave me a good record of my trip.

The talks, BOFs and discussions were, as usual, very useful. The release team explaining what needed to be done for Wheezy was both informative and amusing.

The numerous BOFs from Steve McIntyre were a great source of discussion and ideas and appear to have generated progress on some quite contentious issues.

I especially enjoyed the Sylvestre Ledru talk on building the archive with clang and how this might be another useful tool in finding bugs.

Hideki Yamane gave a really useful talk, "Let's shrink Debian package archive!" He gave a practical explanation of how Debian could benefit from using xz compression and where it is not appropriate, and had a selection of real numbers to help the discussion. Given this was Hideki's first talk at a Debconf I must congratulate him on doing an excellent job.

There were many other talks which I have not singled out here but that says nothing about their quality or usefulness, more about why I should blog immediately after an event and not leave it a week. Though the video team have managed to capture many of the talks so you can go and watch them too.

The event was well organised and the accommodation was pleasant, if a little crowded with three to a room. The hotel had a pool which was the centre for evening activities most days, though I did miss Neil McGovern (one of my room mates) unintentionally swimming in his kilt.
The lunch and dinner catering was outdoors which was novel. The food was generally good if a little limited for those of us with less straightforward dietary requirements.

Some of us did venture out to have dinner at the continental hotel on one evening for a change of scene.
There was of course the obligatory conference meal by the lakeside and an awesome day trip where I saw a mangrove swamp and (fortunately) no salt water crocodiles.
All in all I had a fabulous and productive time. I would like to thank Collabora for travel sponsorship to the event and Neil, who was a great travelling companion.

Saturday, 26 May 2012

Each morning sees some task begun, each evening sees it close; Something attempted, something done, has earned a night's repose.

Thursday saw all the Collabora employees at the Cambridge office go out and socialise at the beer festival. They seemed to have selected a wonderful day for it: the sun was shining and it was a warm, blue-sky day.

Alas, I had to attend some customer conference calls and work on some time-sensitive research so I could not go to the ball, as it were. At about eight my brain had run out of steam so I decided to call it a day and go and meet up with people at the festival for an hour or two.

The queue when I arrived dissuaded me from that notion. I asked one of the stewards and they indicated it would take at least an hour from where the queue finished.

So I decided to wend my way home along the bank of the Cam. I proceeded slowly along and to my utter surprise bumped into Ben Hutchings and his Solarflare work colleagues having their own soiree. I was immediately invited to sit and converse. Pretty quickly I was inveigled into accepting a glass of wine by John Aspden from his floating bar (AKA houseboat).

From here on my evening was a pleasant one of amusing new people, easy conversation and a definite pondering if the host would be discovering the delights of Cam swimming as he became progressively inebriated!

So although I missed the festival I did manage to have an enjoyable time. A big thanks to the Solarflare guys and especially John, who was the consummate host and provided me with far too much alcohol.

Thursday, 24 May 2012

Interrupt Service Routines

Something a little low level for this post. I have been asked recently how to "test" for the maximum duration of an Interrupt Service Routine (ISR) in Linux.

To do this I probably ought to explain what the heck an ISR is!

A CPU executes one instruction after another and runs your programs. However early in the history of the electronic computer it soon became apparent that sometimes there were events happening, generally caused by a hardware peripheral, that required some other code to be executed without having to wait for the running program to check for the event.

This could have been solved by having a second processor to look after those exceptional events, but that would have been expensive and difficult to synchronize, and the designers took the view that there was a perfectly good processor already sat there just running some user's program. This Interruption in the code flow became known as, well, an Interrupt (and the other approach as polling).

The hardware for supporting interrupts started out very simply: the processor would complete execution of the current instruction and, when the Program Counter (PC) was about to be incremented, if an Interrupt ReQuest (IRQ) was pending the PC would be stored somewhere (often a "special" IRQ stack or register) and execution would then start at some fixed address.

The interrupting event would be dealt with by some code and execution returned to the original program without it ever knowing the CPU just wandered off to do something else. The code that deals with the interrupt is known as the Interrupt Service Routine (ISR).
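The simple model above can be sketched as a toy fetch and execute loop. This is only an illustration under my own assumptions: the names `run`, `toy_isr` and `PROGRAM` are made up for the example, the "instructions" are plain Python functions acting on an accumulator, and real hardware is of course far more involved.

```python
def toy_isr(step):
    """Hypothetical ISR: just records after which instruction it was called."""
    return step


def run(program, irq_after_steps, isr):
    """Toy CPU loop.

    program         -- list of functions, each taking and returning the accumulator
    irq_after_steps -- set of instruction counts after which an IRQ is pending
    isr             -- function called to service the interrupt
    """
    acc = 0
    pc = 0
    executed = 0
    serviced = []
    while pc < len(program):
        # Complete the current instruction first.
        acc = program[pc](acc)
        executed += 1
        next_pc = pc + 1
        # Just before the PC is updated, check for a pending IRQ.
        if executed in irq_after_steps:
            saved_pc = next_pc               # store the PC somewhere safe
            serviced.append(isr(executed))   # execution continues in the ISR
            next_pc = saved_pc               # return to the original program
        pc = next_pc
    return acc, serviced


# A three instruction "program": ((0 + 1) * 2) + 3
PROGRAM = [lambda a: a + 1, lambda a: a * 2, lambda a: a + 3]

if __name__ == "__main__":
    print(run(PROGRAM, set(), toy_isr))
    print(run(PROGRAM, {1, 2}, toy_isr))
```

The point of the sketch is that the accumulator ends up with the same value whether or not the interrupts fire; the program never notices the CPU wandered off.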

Now I have glossed over a lot of issues here (suffice to say there are a huge number of details in which to hide the devil) but the model is good enough for my purpose. A modern CPU has an extraordinarily complex system of IRQ controllers to deal with numerous peripherals requesting the CPU stop what it's doing and look after something else.

This system of controllers will ultimately cause program execution to be delivered to an ISR for that device. If we were living in the old single thread of execution world we could measure how long execution remains within an ISR, perhaps by using a physical I/O line as a semaphore and an external oscilloscope to monitor the line.

You may well ask "Why measure this?" Well, historically, while the ISR was running nothing else could interrupt it, which meant that even if a more important event occurred it would not get the CPU until the first ISR was complete. This delay was known as IRQ latency, and it was undesirable if you were doing something that required an IRQ to be serviced in a timely manner (like playing audio).

This is no longer how things are done. While the top half runs with IRQs disabled, many handlers are threaded interrupt handlers and are preemptible (i.e. can be interrupted themselves). This leads to the first issue with measuring ISR time: the ISR may be executed in multiple chunks if something more important interrupts. Indeed, an ISR may appear to have taken many times longer on one occasion than another because the CPU has been off servicing multiple other IRQs.
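To put some numbers on that, here is a small sketch of the arithmetic. The function name `isr_times` and all the timestamps are invented for illustration (think of them as microseconds from an imaginary trace), and it assumes the preemption intervals do not overlap one another.

```python
def isr_times(start, end, preemptions):
    """Return (wall_time, cpu_time) for one ISR invocation.

    start, end  -- timestamps at which the ISR was entered and left
    preemptions -- list of (begin, finish) intervals during which the CPU
                   was off servicing something else; assumed non-overlapping
    """
    wall = end - start
    stolen = 0
    for begin, finish in preemptions:
        # Only count the part of the preemption that falls inside the ISR.
        overlap = min(end, finish) - max(start, begin)
        if overlap > 0:
            stolen += overlap
    return wall, wall - stolen


if __name__ == "__main__":
    # Undisturbed run: wall time and time actually spent in the ISR agree.
    print(isr_times(100, 110, []))
    # Same work, but preempted twice along the way.
    print(isr_times(100, 150, [(105, 125), (130, 150)]))
```

In the second case an oscilloscope on an I/O line would show the ISR "taking" 50 units when it only did 10 units of its own work, which is exactly why the raw measurement is misleading.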

Then we have the issue that Linux kernel drivers often do as little as possible within their ISR, often only as much as is required to clear the physical interrupt line. Processing is then continued in a "bottom half" handler. This leads to ISRs which take practically no time to execute while processing is still being caused elsewhere in the system.

The next issue is that the world is not uniprocessor any more. How many processors does a machine have these days? Even a small ARM SoC can often have two or even four cores. This makes our timing harder because it is now possible to be servicing multiple interrupts from a single peripheral on separate cores at the same time!

In summary, measuring ISR execution time is not terribly enlightening and almost certainly not what you are interested in. It is much more likely that you really want to examine something that ISR time was a historical proxy for, like IRQ latency or system overheads in locking.
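The split can be modelled with a toy driver. To be clear, this is not kernel code and uses no real kernel API; `ToyDriver` and its methods are hypothetical names, and "summing the data" simply stands in for whatever real processing a driver would do.

```python
from collections import deque


class ToyDriver:
    """Toy model of a driver with a top half and a bottom half."""

    def __init__(self):
        self.irq_line_asserted = False
        self.hw_buffer = None
        self.pending = deque()   # work queued by the top half
        self.processed = []      # results produced by the bottom half

    def raise_irq(self, data):
        # The (imaginary) hardware has data ready and asserts its IRQ line.
        self.hw_buffer = data
        self.irq_line_asserted = True

    def top_half(self):
        # Do the bare minimum: grab the data, clear the line, defer the rest.
        self.pending.append(self.hw_buffer)
        self.hw_buffer = None
        self.irq_line_asserted = False

    def bottom_half(self):
        # Runs later, outside the ISR, and does the expensive part.
        while self.pending:
            self.processed.append(sum(self.pending.popleft()))


if __name__ == "__main__":
    driver = ToyDriver()
    driver.raise_irq([1, 2, 3])
    driver.top_half()
    print(driver.irq_line_asserted, driver.processed)
    driver.bottom_half()
    print(driver.processed)
```

Timing only `top_half` here would tell you almost nothing about the cost of handling the interrupt, because all the real work happens in `bottom_half`.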

Linux kernel presentation

Recently I was asked to present a short introduction to the Linux kernel for our project managers. I put together a short slide deck for the presentation which I have decided to share.
I feel it's important to note that I had a lot more to say about each section and the slides were more an aid for my memory to cover the important points. Of special note would be the diagram showing the "hierarchy" of contributors; this is of course nowhere near as well stratified as portrayed.

Tuesday, 8 May 2012

NetSurf at a show

The Wakefield RISC OS show is an event the NetSurf project has attended for a long time. In fact, since 2005, when the "stand" was a name on an A4 sheet, through 2006, 2007, 2008, 2009 and 2010 to 2011, we have always been present.

The event has changed in that time from a large affair with many exhibitors to a small specialist interest event with a handful of stands. I took some pictures this year which give a fair impression of the event.

We were seriously considering not attending this year as 2011 had seen us barely break even on donations versus expenses to attend. However we decided that the project's annual Grey Ox Inn post-event dinner was probably worth making the effort.

So we all met up in a hotel just off the M1 near Wakefield and set up our table. And although NetSurf as a project now has much more usage on other platforms we still represent the principal browser for the RISC OS platform!

We had a pleasant time, talked to a lot of users and made our expenses back in donations. Overall an amusing Saturday. Based on the size of the event and the number and age of the attendees, I fear RISC OS may be destined for the history books.