Sunday, 30 September 2018

All I wanted to do is check an error code

I was feeling a little under the weather last week and did not have enough concentration to work on developing a new NetSurf feature as I had planned. Instead I decided to look at a random bug from our worryingly large collection.

This led me to consider the HTML form submission function, at which point it was "can open, worms everywhere". The code in question has a job that is fairly simple to explain:
  1. A user submits a form (by clicking a button or such) and the Document Object Model (DOM) is used to create a list of information in the web form.
  2. The list is then converted to the appropriate format for sending to the web site server.
  3. An HTTP request is made using the correctly formatted information to the web server.
However the code I was faced with, while generally functional, was impenetrable having accreted over a long time.
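
As a rough illustration of step two, here is a minimal sketch of turning a list of name/value pairs into the application/x-www-form-urlencoded format used when form data is sent as part of the URL. This is not the NetSurf implementation: struct form_field, encode_component() and form_urlencode() are hypothetical names, and the set of characters left unencoded is an assumption based on my reading of the URL specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical name/value pair standing in for one entry in the list. */
struct form_field {
	const char *name;
	const char *value;
};

/* Write src to dst in urlencoded form and return the new end of output. */
static char *encode_component(char *dst, const char *src)
{
	static const char hex[] = "0123456789ABCDEF";
	const unsigned char *s = (const unsigned char *)src;

	for (; *s != '\0'; s++) {
		if ((*s >= 'A' && *s <= 'Z') || (*s >= 'a' && *s <= 'z') ||
		    (*s >= '0' && *s <= '9') ||
		    *s == '*' || *s == '-' || *s == '.' || *s == '_') {
			*dst++ = (char)*s;      /* sent unchanged */
		} else if (*s == ' ') {
			*dst++ = '+';           /* space is a special case */
		} else {
			*dst++ = '%';           /* everything else is %XX */
			*dst++ = hex[*s >> 4];
			*dst++ = hex[*s & 0xf];
		}
	}
	return dst;
}

/* Build "name=value&name=value". The caller frees the result.
 * Returns NULL only if memory could not be allocated. */
char *form_urlencode(const struct form_field *fields, size_t count)
{
	size_t size = 1; /* terminating NUL */
	size_t i;
	char *out;
	char *p;

	assert(fields != NULL || count == 0);

	for (i = 0; i < count; i++) {
		/* worst case every byte becomes three, plus '=' and '&' */
		size += 3 * (strlen(fields[i].name) +
			     strlen(fields[i].value)) + 2;
	}

	out = malloc(size);
	if (out == NULL)
		return NULL;

	p = out;
	for (i = 0; i < count; i++) {
		if (i > 0)
			*p++ = '&';
		p = encode_component(p, fields[i].name);
		*p++ = '=';
		p = encode_component(p, fields[i].value);
	}
	*p = '\0';

	return out;
}
```

Given the fields q="a b&c" and lang="en-GB" this produces q=a+b%26c&lang=en-GB, which would then be appended to the request URL as its query string.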

screenshot of NetSurf test form
At this point I was forced into a diversion to fix up the core URL library handling of query strings (this is used when the form data is submitted as part of the requested URL) which was necessary to simplify some complicated string handling and make the implementation more compliant with the specification.

My next step was to add some basic error reporting instead of warning the user that the system was out of memory for every failure case, which was making debugging somewhat challenging. I was beginning to think I had discovered a series of very hairy yaks, although at least I was not trying to change a light bulb, which can get very complicated.

At this point I ran into the form_successful_controls_dom() function which performs step one of the process. This function had six hundred lines of code, hundreds of conditional branches, 26 local variables and five levels of indentation in places. These properties combined resulted in a cyclomatic complexity metric (CCM) of 252. For reference, programmers generally try to keep a single function to no more than a hundred lines of code with as few local variables as possible, resulting in a CCM of 20.

I now had a choice:

  • I could abandon investigating the bug because, even if I could find the issue, changing such a function without adequate testing is likely to introduce several more.
  • I could refactor the function into multiple simpler pieces.
I slept on this decision and decided to at least try to refactor the code in an attempt to pay back a little of the technical debt in the browser (and maybe let me fix the bug). After several hours of work the refactored source has the desirable properties of:

  • multiple straightforward functions
  • no function much more than a hundred lines long
  • resource lifetime is now obvious and explicit
  • errors are correctly handled and reported
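
To illustrate the last two properties, here is a minimal sketch of a small function that reports a specific error code for each failure and releases its resources on a single cleanup path. It is not the refactored NetSurf source: form_error, struct control_data and every function name are hypothetical, chosen only to show the shape.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical error codes: each failure gets its own value
 * instead of everything being reported as "out of memory". */
typedef enum {
	FORM_OK,
	FORM_ERR_NOMEM,     /* an allocation really did fail */
	FORM_ERR_BAD_PARAM, /* caller passed unusable data */
	FORM_ERR_NO_NAME    /* control has no name so is not submitted */
} form_error;

/* Hypothetical record, owned by the caller once it has been returned. */
struct control_data {
	char *name;
	char *value;
};

const char *form_error_string(form_error err)
{
	switch (err) {
	case FORM_OK:            return "no error";
	case FORM_ERR_NOMEM:     return "out of memory";
	case FORM_ERR_BAD_PARAM: return "bad parameter";
	case FORM_ERR_NO_NAME:   return "control has no name";
	}
	return "unknown error";
}

static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy != NULL)
		memcpy(copy, s, len);
	return copy;
}

/* The single place where a control_data and its strings are released. */
void control_data_free(struct control_data *cd)
{
	if (cd == NULL)
		return;
	free(cd->name);
	free(cd->value);
	free(cd);
}

form_error control_data_create(const char *name, const char *value,
		struct control_data **out)
{
	struct control_data *cd;

	assert(out != NULL); /* a programming error, not a run time failure */

	if (name == NULL || value == NULL)
		return FORM_ERR_BAD_PARAM;
	if (name[0] == '\0')
		return FORM_ERR_NO_NAME;

	cd = calloc(1, sizeof(*cd));
	if (cd == NULL)
		return FORM_ERR_NOMEM;

	cd->name = dup_string(name);
	cd->value = dup_string(value);
	if (cd->name == NULL || cd->value == NULL)
		goto cleanup;

	*out = cd; /* ownership passes to the caller here */
	return FORM_OK;

cleanup:
	control_data_free(cd); /* safe on a partially built record */
	return FORM_ERR_NOMEM;
}
```

Because the output pointer is only written on success and there is exactly one free function, a reader can see at a glance who owns the memory at every point.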

I carefully examined the change in generated code and was pleased to see the compiler output had become more compact. This is an important point that less experienced programmers sometimes miss: if your source code is written such that a compiler can reason about it easily you often get much better results than from the compact alternative. However, even if the resulting code had been larger, the improved source would have been worth it.

After spending over ten hours working on this bug I have not resolved it yet, indeed one might suggest I have not even directly considered it yet! I wanted to use this to explain a little to users who have to wait a long time for their issues to get resolved (in any project not just NetSurf) just how much effort is sometimes involved in a simple bug.

Tuesday, 7 August 2018

The brain is a wonderful organ; it starts working the moment you get up in the morning and does not stop until you get into the office.

I fear that I may have worked in a similar office environment to Robert Frost. Certainly his description is familiar to those of us who have been subjected to modern "open plan" offices. Such settings may work for some types of job but for me, as a programmer, they have a huge negative effect.

My old basement office
When I decided to move on from my previous job my new position allowed me to work remotely. I have worked from home before so knew what to expect. My experience led me to believe the main aspects to address when home working were:
Isolation
This is difficult to mitigate but frequent face to face meetings and video calls with colleagues can address this, providing you are aware that some managers have a terrible habit of "out of sight, out of mind" management.
Motivation
You are on your own a lot of the time which means you must motivate yourself to work. Mainly this is achieved through a routine. I get dressed properly, start work the same time every day and ensure I take breaks at regular times.
Work life balance
This is often more of a problem than you might expect and not in the way most managers assume. A good motivated software engineer can have a terrible habit of suddenly discovering it is long past when they should have finished work. It is important to be strict with yourself and finish at a set time.
Distractions
In my previous office testers, managers, production and support staff were all mixed in with the developers, resulting in a lot of distractions. However, when you are at home there are also a great number of possible distractions. It can be difficult to avoid friends and family assuming you are available during working hours to run errands. I find I need to carefully budget time for such tasks and take it out of my working time as if I were actually in an office.
Environment
My previous office had "tired" furniture and decoration in an open plan which often had a negative impact on my productivity. When working from home I find it beneficial to partition my working space from the rest of my life and ensure family know that when I am in that space I am unavailable. You inevitably end up spending a great deal of time in this workspace and it can have a surprisingly large effect on your productivity.
Being confident I was aware of what I was letting myself in for, I knew I required a suitable place to work. In our previous home the only space available for my office was a four by ten foot cellar room with artificial lighting. Despite its size I was generally productive there as there were few distractions and the door let me "leave work" at the end of the day.

Garden office was assembled June 2017
This time my resources to create the space are larger and I wanted a place I would be comfortable to spend a lot of time in. Initially I considered using the spare bedroom which my wife was already using as a study. This was quickly discounted as it would be difficult to maintain the necessary separation of work and home.

Instead we decided to replace the garden shed with a garden office. The contractor ensured the structure selected met all the local planning requirements while remaining within our budget. The actual construction was surprisingly rapid. The previous structure was removed and a concrete slab base was placed in a few hours on one day and the timber building erected in an afternoon the next.

Completed office in August 2018
The building arrived on a truck, separated into large sections which the workmen assembled rapidly. They then installed wall insulation, glazing and roof coverings. I had chosen to have the interior finished in a hardwood plywood as it is hard wearing and easy to apply a finish to as required.

Work desk in July 2017
Although the structure could have been painted at the factory Melodie and I applied this ourselves to keep the project in budget. I laid a laminate floor suitable for high moisture areas (the UK is not generally known as a dry country) and Steve McIntyre and Andy Simpkins assisted me with various additional tasks to turn it into a usable space.

To begin with I filled the space with furniture I already had, for example the desk was my old IKEA Jerker which I have had for over twenty years.

Work desk in August 2018
Since then I have changed the layout a couple of times but have finally returned to having my work desk in the corner looking out over the garden. I replaced the Jerker with a new IKEA Skarsta standing desk, PEXIP bought me a nice work laptop and I acquired a nice print from Lesley Mitchell, but overall little has changed in my professional work area in the last year and I have a comfortable environment.

Cluttered personal work area
In addition the building is large enough that there is space for my electronics bench. The bench itself was given to me by Andy. I purchased some inexpensive kitchen cabinets and worktop (white is cheapest) to obtain a little more bench space and storage. Unfortunately all those flat surfaces seem to accumulate stuff at an alarming rate and it looks like I need a clear out again.

In conclusion I have a great work area which was created at a reasonable cost.

There are a couple of minor things I would do differently next time:
  • Position the building better with respect to the boundary fence. I allowed too much distance on one side of the structure which has resulted in an unnecessary two foot wide strip of unusable space.
  • Ensure the door was made from better materials. The first winter in the space showed that the door was a poor fit as it was not constructed to the same standard as the rest of the building.
  • The door should have been positioned on the end wall instead of the front. Use of the building showed moving the door would make the internal space more flexible.
  • Plan the layout more effectively ahead of time, ensuring I know where services (electricity) will enter and where outlets will be placed.
  • Ensure I have an electrician on site for the first fix so electrical cables could be run inside the walls instead of surface trunking.
  • Budget for air conditioning as so far the building has needed heating in winter and cooling in summer.
In essence my main observation is that better planning of the details matters. If I had been more aware of this a year ago perhaps I would not be budgeting to replace the door and fit air conditioning now.

Wednesday, 1 August 2018

Irony is the hygiene of the mind

While Elizabeth Bibesco might well be right about the mind, software cleanliness requires a different approach.

Previously I have written about code smells which give a programmer hints where to clean up source code. A different technique, which has recently become readily available, is using tool-chain based instrumentation to perform run time analysis.

At a recent NetSurf developer weekend Michael Drake mentioned a talk he had seen at the GUADEC conference which referenced the use of sanitizers for improving the security and correctness of programs.

Sanitizers differ from other code quality metrics such as compiler warnings and static analysis in that they detect issues when the program is executed rather than in the source code. There are currently two commonly used instrumentation types:
address sanitizer
This instrumentation detects several common errors when using memory such as "use after free".
undefined behaviour sanitizer
This instruments computations whose behaviour the language standard leaves undefined, for example left shifts of negative values (ISO 9899:2011 6.5.7 Bitwise shift operators).
As these are runtime checks it is necessary to actually execute the instrumented code. Fortunately most of the NetSurf components have good unit test coverage so Daniel Silverstone used this to add a build target which runs the tests with the sanitizer options.

The previous investigation of this technology had been unproductive because of the immaturity of support in our CI infrastructure. This time the tool chain could be updated to be sufficiently robust to implement the technique.

Jobs were then added to the CI system to build this new target for each component in a similar way to how the existing coverage reports are generated. This resulted in failed jobs for almost every component which we proceeded to correct.

An example of how most issues were addressed is provided by Daniel fixing the bitmap library. Most of the fixes ensured correct type promotion in bit manipulation, however the address sanitizer did find a real out of bounds access when a malformed BMP header is processed. This is despite this library being run with a fuzzer and electric fence for many thousands of CPU hours previously.
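
The kind of type promotion fix described above can be sketched as follows. This is a hypothetical helper, read_u32_le(), written for illustration and not the code from the bitmap library. It reads a little-endian 32-bit field as a BMP header parser might, with a bounds check for the address sanitizer's class of error and explicit casts for the undefined behaviour sanitizer's class. The -fsanitize=address,undefined option named in the comment is the usual GCC and Clang spelling; the flags used by the NetSurf build target are not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Read a little-endian 32-bit value from a buffer.
 *
 * Building with, for example, -fsanitize=address,undefined makes the
 * sanitizers check code like this each time it runs.
 */
bool read_u32_le(const uint8_t *data, size_t len, size_t offset,
		uint32_t *out)
{
	assert(data != NULL && out != NULL);

	/* Bounds check arranged so that offset + 4 can never overflow. */
	if (len < 4 || offset > len - 4)
		return false;

	/*
	 * A uint8_t operand is promoted to signed int before a shift, so
	 * data[offset + 3] << 24 may move a set bit into the sign bit,
	 * which is undefined. Converting to uint32_t first keeps all of
	 * the arithmetic unsigned and well defined.
	 */
	*out = (uint32_t)data[offset] |
	       ((uint32_t)data[offset + 1] << 8) |
	       ((uint32_t)data[offset + 2] << 16) |
	       ((uint32_t)data[offset + 3] << 24);

	return true;
}
```

A truncated or malformed header then produces a false return the caller can report, rather than a read past the end of the buffer.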

Although we did find a small number of real issues the majority of the fixes were to tests which failed to correctly clean up the resources they used. This seems to parallel what I observed with the other run time testing, like AFL and Valgrind, in that often the test environment has the largest impact on detected issues to begin with.

In conclusion it appears that an instrumented build combined with our existing unit tests gives another tool to help us improve our code quality. Given the very low amount of engineering time the NetSurf project has available, automated checks like these are a good way to help us avoid introducing issues.

Friday, 1 June 2018

You can't make a silk purse from a sow's ear

Pile of network switches
I needed a small Ethernet network switch in my office so went to my pile of devices and selected an old Dell PowerConnect 2724 from the stack. This seemed the best candidate as the others were intended for data centre use and known to be very noisy.

I installed it into place and immediately ran into a problem: the switch was not quiet enough, in fact I could not concentrate at all with it turned on.

Graph of quiet office sound pressure
Believing I could not fix what I could not measure I decided to download an app for my phone that measured raw sound pressure. This would allow me to empirically examine what effects any changes to the switch made.

The app is not calibrated so can only be used to examine relative changes so a reference level is required. I took a reading in the office with the switch turned off but all other equipment operating to obtain a baseline measurement.

All measurements were made with the switch and phone in the same positions about a meter apart. The resulting yellow curves are the average for a thirty second sample period with the peak values in red.

The peak between 50Hz and 500Hz initially surprised me but after researching how a human perceives sound it appears we must apply the equal loudness curve to correct the measurement.

Graph of office sound pressure with switch turned on
With this in mind we can concentrate on the data between 200Hz and 6000Hz as the part of the frequency spectrum with the most impact. So in the reference sample we can see that the audio pressure is around the -105dB level.

I turned the switch on and performed a second measurement which showed a level around -75dB with peaks at -50dB. This is a difference of some 30dB; if we assume our reference is a "calm room" at 25dB(SPL) then the switch is raising the ambient noise level to something similar to a "normal conversation" at 55dB(SPL).

Something had to be done if I were to keep using this device so I opened the switch to examine the possible sources of noise.

Dell PowerConnect 2724 with replacement Noctua fan
There was a single 40x40x20mm 5V high capacity Sunon brand fan in the rear of the unit. I unplugged the fan and the noise level immediately returned to ambient, indicating that all the noise was being produced by this single device. Unfortunately the switch soon overheated without the cooling fan operating.

I thought the fan might be defective so purchased a high quality "quiet" NF-A4x20 replacement from Noctua. The fan has rubber mounting fixings to further reduce noise and I was hopeful this would solve the issue.

Graph of office sound pressure with modified switch turned on
The initial results were promising with noise above 2000Hz largely being eliminated. However the way the switch enclosure was designed caused the airflow to make sound which produced a level around 40dB(SPL) between 200Hz and 2000Hz.

I had the switch in service for several weeks in this configuration but eventually the device proved impractical on several points:

  • The management interface was dreadful to use.
  • The network performance was not very good especially in trunk mode.
  • The lower frequency noise became a distraction for me in an otherwise quiet office.

In the end I purchased an 8 port Zyxel switch which is passively cooled, and therefore silent in operation, and has none of the other drawbacks.

From this experience I have learned some things:

  • Higher frequency noise (2000Hz and above) is much more difficult to ignore than other types of noise.
  • As I have become older my tolerance for equipment noise has decreased and it actively affects my concentration levels.
  • Some equipment has a design which means its audio performance cannot be improved sufficiently.
  • Measuring and interpreting noise sources is quite difficult.