Wednesday 27 November 2019

Twice and thrice over, as they say, good is it to repeat and review what is good.

Three years ago I wrote about using the AFL fuzzer to find bugs in several NetSurf libraries. I have repeated this exercise a couple of times since then and thought I would summarise what I found with my latest run.

I started by downloading the latest version of AFL (2.52b) and compiling it. This went as smoothly as one could hope for and I experienced no issues although having done this several times before probably helps.

libnsbmp

I started with libnsbmp which is used to render windows bmp and ico files which remains a very popular format for website Favicons. The library was built with AFL instrumentation enabled, some output directories were created for results and a main and four subordinate fuzzer instances started.

vince@workshop:libnsbmp$ LD=afl-gcc CC=afl-gcc AFL_HARDEN=1 make VARIANT=debug test
afl-cc 2.52b by <lcamtuf@google.com>
afl-cc 2.52b by <lcamtuf@google.com>
afl-cc 2.52b by <lcamtuf@google.com>
 COMPILE: src/libnsbmp.c
afl-cc 2.52b by <lcamtuf@google.com>
afl-as 2.52b by <lcamtuf@google.com>
[+] Instrumented 633 locations (64-bit, hardened mode, ratio 100%).
      AR: build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/libnsbmp.a
 COMPILE: test/decode_bmp.c
afl-cc 2.52b by <lcamtuf@google.com>
afl-as 2.52b by <lcamtuf@google.com>
[+] Instrumented 57 locations (64-bit, hardened mode, ratio 100%).
    LINK: build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/test_decode_bmp
afl-cc 2.52b by <lcamtuf@google.com>
 COMPILE: test/decode_ico.c
afl-cc 2.52b by <lcamtuf@google.com>
afl-as 2.52b by <lcamtuf@google.com>
[+] Instrumented 71 locations (64-bit, hardened mode, ratio 100%).
    LINK: build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/test_decode_ico
afl-cc 2.52b by <lcamtuf@google.com>
Test bitmap decode
Tests:1053 Pass:1053 Error:0
Test icon decode
Tests:609 Pass:609 Error:0
    TEST: Testing complete
vince@workshop:libnsbmp$ mkdir findings_dir graph_output_dir
vince@workshop:libnsbmp$ afl-fuzz -i test/ns-afl-bmp/ -o findings_dir/ -S f02 ./build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/test_decode_bmp @@ /dev/null > findings_dir/f02.log 2>&1 &
vince@workshop:libnsbmp$ afl-fuzz -i test/ns-afl-bmp/ -o findings_dir/ -S f03 ./build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/test_decode_bmp @@ /dev/null > findings_dir/f03.log 2>&1 &
vince@workshop:libnsbmp$ afl-fuzz -i test/ns-afl-bmp/ -o findings_dir/ -S f04 ./build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/test_decode_bmp @@ /dev/null > findings_dir/f04.log 2>&1 &
vince@workshop:libnsbmp$ afl-fuzz -i test/ns-afl-bmp/ -o findings_dir/ -S f05 ./build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/test_decode_bmp @@ /dev/null > findings_dir/f05.log 2>&1 &
vince@workshop:libnsbmp$ afl-fuzz -i test/ns-afl-bmp/ -o findings_dir/ -M f01 ./build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/test_decode_bmp @@ /dev/null

AFL master instance after six days

The number of subordinate fuzzer instances was selected to allow the system in question (an AMD 2600X) to keep all the cores in use with a clock of 4GHz, which gave the highest number of executions per second. This might be improved with better cooling but I have not investigated this.
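
AFL ships with a small utility for checking whether a machine has any idle CPU capacity left, which is a quick way to sanity check the instance count before adding another secondary fuzzer. A minimal sketch of that check:

# afl-gotcpu (part of the AFL distribution) reports whether there are
# idle cores available; run it before starting an extra -S instance.
afl-gotcpu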

After five days and six hours the "cycle count" field on the master instance had changed to green, which the AFL documentation suggests means the fuzzer is unlikely to discover anything new, so the run was stopped.

Just before stopping, the afl-whatsup tool was used to examine the state of all the running instances.

vince@workshop:libnsbmp$ afl-whatsup -s ./findings_dir/
status check tool for afl-fuzz by <lcamtuf@google.com>

Summary stats
=============

       Fuzzers alive : 5
      Total run time : 26 days, 5 hours
         Total execs : 2873 million
    Cumulative speed : 6335 execs/sec
       Pending paths : 0 faves, 0 total
  Pending per fuzzer : 0 faves, 0 total (on average)
       Crashes found : 0 locally unique

Just for completeness, there is also the graph of how the fuzzer performed over the run.

AFL fuzzer performance over libnsbmp run

There were no crashes at all (and none have been detected through fuzzing since the original run). The 78 reported hangs were checked and all of them actually decode in a reasonable time; it seems the fuzzer's default "hang" detection is simply a little aggressive for larger images.
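
If that default proves a nuisance, the per-execution timeout can be relaxed when starting the fuzzer. A minimal sketch against the same libnsbmp target; the 5000ms figure is purely illustrative, and the trailing + tells afl-fuzz to skip, rather than abort on, seed test cases that still time out:

# Raise the timeout to 5 seconds and make seed timeouts non-fatal ('+' suffix).
afl-fuzz -t 5000+ -i test/ns-afl-bmp/ -o findings_dir/ -M f01 ./build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/test_decode_bmp @@ /dev/null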

libnsgif

I went through a similar setup with libnsgif, which is used to render the GIF image format. The run was performed on a similar system for five days and eighteen hours. The outcome matched libnsbmp: no hangs or crashes.


vince@workshop:libnsgif$ afl-whatsup -s ./findings_dir/
status check tool for afl-fuzz by <lcamtuf@google.com>

Summary stats
=============

       Fuzzers alive : 5
      Total run time : 28 days, 20 hours
         Total execs : 7710 million
    Cumulative speed : 15474 execs/sec
       Pending paths : 0 faves, 0 total
  Pending per fuzzer : 0 faves, 0 total (on average)
       Crashes found : 0 locally unique

libsvgtiny

AFL fuzzer results for libsvgtiny
I then ran the fuzzer on the SVG rendering library, using a dictionary to help the fuzzer cope with a sparse textual input format. The run was allowed to continue for almost fourteen days with no crashes or hangs detected.
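
For reference, a dictionary of tokens is supplied to afl-fuzz with the -x option. A minimal sketch of how such a run might be started; the dictionary file, seed directory and test binary names here are illustrative rather than the exact ones used:

# svg.dict would contain common SVG element and attribute names to help
# the fuzzer synthesise plausible markup.
afl-fuzz -x svg.dict -i test/ns-afl-svg/ -o findings_dir/ -M f01 ./build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/test_decode_svg @@ /dev/null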

In an ideal situation this run would have been allowed to continue but the system running it required a restart for maintenance.

Conclusion

The aphorism "absence of proof is not proof of absence" seems to apply to these results. While the new fuzzing runs revealed no additional failures, that does not mean there are no defects in the code left to find. All I can really say is that the AFL tool was unable to find any failures within the time available.

Additionally, the AFL test corpus produced did not significantly change the code coverage metrics, so the existing set was retained.
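
Making that comparison involves reducing the queue the fuzzers built up to a minimal corpus and re-measuring coverage with it. A sketch of the first step using AFL's corpus minimisation tool; the output directory name is simply my choice here:

# Reduce the generated corpus to the smallest set of inputs that still
# exercises every edge AFL observed for this target.
afl-cmin -i findings_dir/f01/queue/ -o corpus_min/ -- ./build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/test_decode_bmp @@ /dev/null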

Will I spend the time in future to re-run these tests? Perhaps, but I think more would be gained from enabling fuzzing of the other NetSurf libraries and picking the low-hanging fruit there than from expending thousands of hours performing these runs again.

Thursday 11 July 2019

We can make it better than it was. Better...stronger...faster.

It is not a novel observation that computers have become so powerful that a reasonably recent system has a relatively long life before obsolescence. This is in stark contrast to the period between the nineties and the teens, when it was not uncommon for users with even moderate needs to upgrade their computers every few years.

This upgrade cycle was mainly driven by huge advances in processing power, memory capacity and ballooning data storage capability. Of course the software engineers used up more and more of the available resources and with each new release ensured users needed to update to have a reasonable experience.

And then sometime in the early teens this cycle slowed almost as quickly as it had begun, as systems had become "good enough". I experienced this at a time I was relocating for a new job and had moved most of my computer use to my laptop, which was just as powerful as my desktop but far more flexible.

As a software engineer I used to have a pretty good computer for myself, but I was never prepared to spend the money on "top of the range" equipment because it would always become obsolete, and generally I had access to much more powerful servers if I needed extra resources for a specific task.

To illustrate, the system specification of my desktop PC at the opening of the millennium was:
  • Single core Pentium 3 running at 500MHz
  • Socket 370 motherboard with 100MHz Front Side Bus
  • 128 Megabytes of memory
  • A 25 Gigabyte Deskstar hard drive
  • 150MHz TNT2 graphics card
  • 10 Megabit network card
  • Unbranded 150W PSU
But by 2013 the specification had become:
  • Quad core i5-3330S Processor running at 2700MHz
  • FCLGA1155 motherboard running memory at 1333MHz
  • 8 Gigabytes of memory
  • Terabyte HGST hard drive
  • 1,050MHz Integrated graphics
  • Integrated Intel Gigabit network
  • OCZ 500W 80+ PSU

2013 PC build still using an awesome beige case from 1999

The performance change between these systems was more than tenfold in fourteen years with an upgrade roughly once every couple of years.

I recently started using that system again in my home office mainly for Computer Aided Design (CAD), Computer Aided Manufacture (CAM) and Electronic Design Automation (EDA). The one addition was to add a widescreen monitor as there was not enough physical space for my usual dual display setup.

To my surprise I increasingly returned to this setup for programming tasks. Firstly, being at my desk acts as an indicator to family members that I am concentrating, whereas the laptop no longer had that effect. Secondly, I really like the ultra-wide display for coding; it has become my preferred display and I had been saving for a UWQHD monitor.

Alas, last month the system started freezing. Sometimes it would be stable for several days and then, without warning, the mouse pointer would stop, my music would cease and a power cycle was required. I tried several things to rectify the situation: replacing the thermal compound and the CPU cooler, and trying different memory, all to no avail.

As fixing the system cheaply appeared unlikely, I began looking for a replacement and was immediately troubled by the size of the task. Somewhere in the last six years, while I was not paying attention, the world had moved on; after a great deal of research I managed to come to an answer.

AMD have recently staged something of a comeback with their Ryzen processors after almost a decade of very poor offerings when compared to Intel. The value for money when considering the processor and motherboard combination is currently very much weighted towards AMD.

My timing also seems fortuitous as the new Ryzen 2 processors have just been announced which has resulted in the current generation being available at a substantial discount. I was also encouraged to see that the new processors use the same AM4 socket and are supported by the current motherboards allowing for future upgrades if required.

New PC assembled and wired up

I purchased a complete new system for under five hundred pounds, comprising:
  • Hex core Ryzen 5 2600X Processor running at 3600MHz
  • MSI B450 TOMAHAWK AMD Socket AM4 Motherboard
  • 32 Gigabytes of PC3200 DDR4 memory
  • Aero Cool Project 7 P650 80+ platinum 650W Modular PSU
  • Integrated RTL Gigabit networking
  • Lite-On iHAS124 DVD Writer Optical Drive
  • Corsair CC-9011077-WW Carbide Series 100R Silent Mid-Tower ATX Computer Case
to which I added some recycled parts:
  • 250 Gigabyte SSD from laptop upgrade
  • GeForce GT 640 from a friend
I installed a fresh copy of Debian and all my CAD/CAM applications and have been using the system for a couple of weeks with no obvious issues.

An example of the performance difference: compiling NetSurf from clean with an empty ccache used to take 36 seconds and now takes 16, which is a nice improvement. However, a clean build with the results already cached has gone from 6 seconds to 3, which is far less noticeable, and during development a normal edit, build, debug cycle affecting only a small number of files has gone from 400 milliseconds to 200, which simply feels instant in both cases.
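
For anyone wanting to make the same comparison, the measurements are straightforward to reproduce. A minimal sketch, assuming ccache is configured as the compiler wrapper and a plain make builds the tree (the NetSurf makefiles take further options which are omitted here):

# Clean build with an empty compiler cache
ccache -C && make clean
time make

# Clean build again, now with a warm compiler cache
make clean
time make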

My conclusion is that the new system is completely stable but that I have gained very little in common usage. Objectively the system is over twice as fast as its predecessor but aside from compiling large programs or rendering huge CAD drawings this performance is not utilised. Given this I anticipate this system will remain unchanged until it starts failing and I can only hope that will be at least another six years away.

Tuesday 19 February 2019

A very productive weekend

I just hosted a NetSurf Developer weekend, which is an opportunity for us to meet up and make use of all the benefits of working together. We find the ability to plan work and discuss solutions without losing the nuances of body language generally results in better outcomes for the project.

NetSurf Development build
Due to other commitments on our time, the group has not been able to do more than basic maintenance activities in the last year, which has resulted in the developer events becoming a time to catch up on maintenance rather than making progress on features.

Because of this, the July and November events last year did not feel terribly productive: there were discussions about what we should be doing and bugs were considered, but there was a distinct lack of committed code.

As can be seen from our notes this time was a refreshing change. We managed to complete a good number of tasks and actually add some features while still having discussions, addressing bugs and socialising.

We opened on the Friday evening by creating a list of topics to look at over the following days and updating the wiki notes. We also reviewed the cross compiler toolchains which had been updated to include the most recent releases for things like openssl, curl etc.

As part of this review we confirmed the decision to remove the Atari platform from active support as its toolchain builds have remained broken for over two years with no sign of any maintainer coming forward.

While it is a little sad to see a platform removed, it had become a burden on our strained resources by requiring us to maintain a CI worker with a very old OS using tooling that can no longer be replicated. The tooling issue meant a developer could not test changes locally before committing, so testing changes that affected all frontends was difficult.

Saturday saw us clear all the topics from our list which included:
  • Fixing a bug preventing compiling our reference counted string handling library.
  • Finishing the sanitizer work started the previous July.
  • Fixing several bugs in the Framebuffer frontend installation.
  • Making the Framebuffer UI use the configured language for resources.
The main achievement of the day however was implementing automated system testing of the browser. This was a project started by Daniel some eight years ago but worked on by all of us so seeing it completed was a positive boost for the whole group.

The implementation consisted of a frontend named monkey. This frontend to the browser takes textual commands to perform operations (e.g. open a window or navigate to a URL) and generates results in a structured text format. Monkey is driven by a python program named monkeyfarmer which runs a test plan, ensuring the results are as expected.
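
To give a flavour of the approach, the sketch below shows the general shape of running such a plan; it is purely illustrative and does not reproduce the real monkey command set or the monkeyfarmer interface, and the file and binary names are hypothetical:

# Feed a plan of textual commands to the monkey frontend and check the
# structured output for an expected event (names here are made up).
./nsmonkey < plans/open-and-quit.plan > monkey-output.log
grep -q "EXPECTED-EVENT" monkey-output.log && echo "plan passed"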

This allows us to run a complete browsing session in an automated way; previously someone would have to manually build the browser and check the tests by hand. This manual process was tedious and was rarely completed across our entire test corpus, generally concentrating on just those areas that had been changed, such as the JavaScript output.

We have combined the monkey tools and our test corpus into a CI job which runs the tests on every commit giving us assurance that the browser as a whole continues to operate correctly without regression. Now we just have the task of creating suitable plans for the remaining tests. Though I remain hazy as to why, we became inordinately amused by the naming scheme for the tools.

Google webp library gallery rendered in NetSurf

We rounded the Saturday off by going out for a very pleasant meal with some mutual friends. Sunday started by adding a bunch of additional topics to consider and we made good progress addressing these.

We performed a bug triage and managed to close several issues and commit to fixing a few more. We even managed to create a statement of work of things we would like to get done before the next meetup.

My main achievement on the Sunday was to add WEBP image support. This uses the Google libwebp library to do all the heavy lifting, and adding a new image content handler to NetSurf is pretty straightforward.
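
For anyone wanting to build it, the only new dependency is the libwebp development package. A minimal sketch of checking for the library and enabling the handler; the NETSURF_USE_WEBP option name is my assumption based on the usual NETSURF_USE_* pattern for optional content handlers and may differ:

# Confirm libwebp is installed and visible to pkg-config.
pkg-config --modversion libwebp

# Build with the (assumed) option enabling the WEBP content handler.
make NETSURF_USE_WEBP=YES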