Showing posts with label Netsurf. Show all posts
Showing posts with label Netsurf. Show all posts

Wednesday, 27 November 2019

Twice and thrice over, as they say, good is it to repeat and review what is good.

Three years ago I wrote about using the AFL fuzzer to find bugs in several NetSurf libraries. I have repeated this exercise a couple of times since then and thought I would summarise what I found with my latest run.

I started by downloading the latest version of AFL (2.52b) and compiling it. This went as smoothly as one could hope for and I experienced no issues although having done this several times before probably helps.

libnsbmp

I started with libnsbmp which is used to render windows bmp and ico files which remains a very popular format for website Favicons. The library was built with AFL instrumentation enabled, some output directories were created for results and a main and four subordinate fuzzer instances started.

vince@workshop:libnsbmp$ LD=afl-gcc CC=afl-gcc AFL_HARDEN=1 make VARIANT=debug test
afl-cc 2.52b by <lcamtuf@google.com>
afl-cc 2.52b by <lcamtuf@google.com>
afl-cc 2.52b by <lcamtuf@google.com>
 COMPILE: src/libnsbmp.c
afl-cc 2.52b by <lcamtuf@google.com>
afl-as 2.52b by <lcamtuf@google.com>
[+] Instrumented 633 locations (64-bit, hardened mode, ratio 100%).
      AR: build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/libnsbmp.a
 COMPILE: test/decode_bmp.c
afl-cc 2.52b by <lcamtuf@google.com>
afl-as 2.52b by <lcamtuf@google.com>
[+] Instrumented 57 locations (64-bit, hardened mode, ratio 100%).
    LINK: build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/test_decode_bmp
afl-cc 2.52b by <lcamtuf@google.com>
 COMPILE: test/decode_ico.c
afl-cc 2.52b by <lcamtuf@google.com>
afl-as 2.52b by <lcamtuf@google.com>
[+] Instrumented 71 locations (64-bit, hardened mode, ratio 100%).
    LINK: build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/test_decode_ico
afl-cc 2.52b by <lcamtuf@google.com>
Test bitmap decode
Tests:1053 Pass:1053 Error:0
Test icon decode
Tests:609 Pass:609 Error:0
    TEST: Testing complete
vince@workshop:libnsbmp$ mkdir findings_dir graph_output_dir
vince@workshop:libnsbmp$ afl-fuzz -i test/ns-afl-bmp/ -o findings_dir/ -S f02 ./build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/test_decode_bmp @@ /dev/null > findings_dir/f02.log >&1 &
vince@workshop:libnsbmp$ afl-fuzz -i test/ns-afl-bmp/ -o findings_dir/ -S f03 ./build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/test_decode_bmp @@ /dev/null > findings_dir/f03.log >&1 &
vince@workshop:libnsbmp$ afl-fuzz -i test/ns-afl-bmp/ -o findings_dir/ -S f04 ./build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/test_decode_bmp @@ /dev/null > findings_dir/f04.log >&1 &
vince@workshop:libnsbmp$ afl-fuzz -i test/ns-afl-bmp/ -o findings_dir/ -S f05 ./build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/test_decode_bmp @@ /dev/null > findings_dir/f05.log >&1 &
vince@workshop:libnsbmp$ afl-fuzz -i test/ns-afl-bmp/ -o findings_dir/ -M f01 ./build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/test_decode_bmp @@ /dev/null

The number of subordinate fuzzer instances was selected to allow the system in question (AMD 2600X) to keep all the cores in use with a clock of 4GHz which gave the highest number of
AFL master instance after six days
executions per second. This might be improved with better cooling but I have not investigated this.

After five days and six hours the "cycle count" field on the master instance had changed to green which the AFL documentation suggests means the fuzzer is unlikely to discover anything new so the run was stopped.

Just before stopping the afl-whatsup tool was used to examine the state of all the running instances.

vince@workshop:libnsbmp$ afl-whatsup -s ./findings_dir/
status check tool for afl-fuzz by <lcamtuf@google.com>

Summary stats
=============

       Fuzzers alive : 5
      Total run time : 26 days, 5 hours
         Total execs : 2873 million
    Cumulative speed : 6335 execs/sec
       Pending paths : 0 faves, 0 total
  Pending per fuzzer : 0 faves, 0 total (on average)
       Crashes found : 0 locally unique

Just for completeness there is also the graph of how the fuzzer performed over the run.

AFL fuzzer performance over libnsbmp run

There were no crashes at all (and none have been detected through fuzzing since the original run) and the 78 reported hangs were checked and all actually decode in a reasonable time. It seems the fuzzer "hang" detection default is simply a little aggressive for larger images.

libnsgif

I went through a similar setup with libnsgif which is used to render the GIF image format. The run was performed on a similar system running for five days and eighteen hours. The outcome was similar to libnsbmp with no hangs or crashes.


vince@workshop:libnsgif$ afl-whatsup -s ./findings_dir/
status check tool for afl-fuzz by <lcamtuf@google.com>

Summary stats
=============

       Fuzzers alive : 5
      Total run time : 28 days, 20 hours
         Total execs : 7710 million
    Cumulative speed : 15474 execs/sec
       Pending paths : 0 faves, 0 total
  Pending per fuzzer : 0 faves, 0 total (on average)
       Crashes found : 0 locally unique

libsvgtiny

AFL fuzzer results for libsvgtiny
I then ran the fuzzer on the SVG render library using a dictionary to help the fuzzer cope with a sparse textural input format. The run was allowed to continue for almost fourteen days with no crashes or hangs detected.

In an ideal situation this run would have been allowed to continue but the system running it required a restart for maintenance.

Conclusion

The aphorism "absence of proof is not proof of absence" seems to apply to these results. While the new fuzzing runs revealed no additional failures it does not mean there are no defects in the code to find. All I can really say is that the AFL tool was unable to find any failures within the time available.

Additionally the AFL test corpus produced did not significantly change the code coverage metrics so the existing set was retained.

Will I spend the time again in future to re-run these tests? perhaps, but I think more would be gained from enabling the fuzzing of the other NetSurf libraries and picking the low hanging fruit from there than expending thousands of hours preforming these runs again.

Tuesday, 7 January 2014

Healthy discontent is the prelude to progress

Back in the mists of Internet time (2002) when spirits were brave, the stakes were high, men were real men, women were real women and small furry creatures from Alpha Centauri were real small furry creatures from Alpha Centauri there were few hosting options for a budding open source project.

Despite the meteoric rise of many dot com companies in the early noughties a project either had to go it alone by running everything themselves or pick from a small selection of companies that had not yet worked out how to turn free hosting into money.

One such company, arguably the first, was VA Research with their SourceForge system. Many projects used their platform including a small niche web browser called NetSurf. For years the service was great, there were rough edges but nothing awful.

Netsurf issue tracker in the new SourceForge interface
Over time the NetSurf projects requirements grew beyond what SourceForge could provide and service after service was migrated away eventually all that was left was the bug tracking system. This remained the state of affairs until mid 2013 when SourceForge forced a migration to their new platform which made them unsuitable for the projects use case.

Aside from the issue trackers questionable user interface SourceForge had started aggressively placing advertising throughout their platform. Some of the placements so inadvisable that projects started taking the decision to leave.

While I appreciate that SourceForge had to make money to provide a service they appear to have sown discontent within a large part of their user base without understanding that there are a number of alternatives solutions with a much less onerous funding model.

NetSurf used this as an opportunity to move the remaining issue tracking service to our own infrastructure. Rob Kendrick proceeded to evaluate several solutions and in December 2013 I finally found the time to migrate an XML dump of the old data from SourceForge into MantisBT.

Migrating data from one database into another via incompatible formats took me back to my roots. My early career started with programming tasks moving historical business data from ancient large systems which were about to be scrapped to modern Sparc based systems. Later I would be in a role where financial data needed to be retrieved from obsolete proprietary systems and moved into databases on x86 servers.

My experience in this field was not really stretched as it turns out that modern systems can process a few tens of megabytes in seconds rather than the days for a run in my youth! So some ugly perl scripts and a few hours later I had a nice shiny SQL database filled with NetSurfs bugs and MantisBT instance configured to use them.

NetSurf MantisBT instance showing most recently updated open bugs
Then came the hard bit, triaging all the open bugs, fixing up all the bugs submitted as an anonymous user but with email addresses in, removing duplicates and checking every open bug was still valid took almost two weeks of tedious drudge.

I set up an initial bug workflow within the system that the project developers are still fine tuning to better suit their needs but overall Mantis is proving a very flexible tool. The main deficiencies center around configuration for the projects useage, especially removing unused fields from filters and making the workflow more intuitive..

The resulting system is now getting bug reports submitted again where the sourceforge system had had three in the six months since the forced migration.

The issue tracker is once more a useful tool for the developers allowing us to focus on areas actually causing problems for our users and allowing us to see progress we are making fixing issues.

Overall this was a successful migration and provides a platform the NetSurf project can control where we can offer guarantees to our users about their personal information usage and having a clean rapid interface with no advertisements.

Monday, 6 January 2014

NetSurf Developer Workshop Redux

The NetSurf Developers bringing you alternative LOLcat viewing software since 2002Once again the NetSurf developers congregated in Cambridge at the Collabora offices where we were made welcome in a nice environment for the event.

Five developers managed to attend from around the UK Rob Kendrick, Vincent Sanders, Daniel Silverstone, John-Mark Bell and Michael Drake. We also had Chris Young and François Revol providing some bug fixes remotely.

This was the first time we had all met since the previous event towards the end of 2012 and we took full advantage of this to discuss a pretty extensive agenda in addition to the practical programming tasks.

From Friday lunchtime through to Sunday evening we managed 30 hours of work consisting of over 70 commits to over 100 files.

The whiteboard of our notes

Our main focus was working towards a 3.1 release which is scheduled for early April. Along with the source the release will have binary builds for RISC OS, AmigaOS, Windows and Mac OS X (x86 and ppc). Although the NetSurf project will not be directly releasing binaries for the GTK and Framebuffer frontends we will be ensuring the Debian packages are updated which is our prefered method of distribution for those targets.

We analysed the 3.0 release and formulated an improved process for the future. The 3.1 release will be generated automatically by the CI system ensuring constant results and removing the problems we encountered previously.

A set of release blocking issues was derived which we used as a task list during the workshop.The majority of these were completed including:
DOM based forms
Web forms are a feature Netsurf has supported for a long time and their implementation has not kept up with the rest of the browser. This is a long standing problem area which has resulted in numerous strange bugs with form submission. With this change the form system has been reworked to correctly operate directly from the DOM resulting in the squashing of a large number of bugs and a much improved user experience.

DOM based image loading
Up to now image fetching was performed only during the rendering of a page. With this change when the image link is placed into the DOM during the page parse it is scheduled to be fetched, this should give an improved user experience as images should be available earlier in a pages render.

Removal of MNG support
NetSurf has supported MNG since the 1.0 release, indeed the MNG library used to provide the PNG support too though we have long ago transitioned to libPNG. Alas the web has moved on and MNG has been largely forgotten, the libMNG library that performs the image decoding is old and generally unsupported specifically lacking security updates.

The build issues with libMNG (lack of pkg-config, reliance on libcms1 etc.) were causing maintenance issues in code nobody was actually using (there were crash bugs discovered during its removal!). Because of these issues it was decided to join the vast majority of browsers and remove support for this format.
The developers also addressed several issues with toolchain construction and a number of annoying usability bugs.

Plans for how to improve printing support we made. Initially we intend to fix the existing haru based pdf generation using this to print via pdf and in future have correct css styled page paginated printing render output.

The perennial issue of javascript was discussed however, while efforts to improve the existing support are ongoing, our usage of the spidermonkey library continues to raise various challenges including platform support and API changes between versions.

Due to these issues it has been suggested that we might add support for using the duktape JS engine instead, initial results are promising but given the size of the task of implementing an additional javascript engine binding further investigation is necessary before making a commitment.

Amongst the other discussions the group has also agreed that we will once again apply to be a GSoC organisation for a single student with some very focused projects:
  • Improving our HTML5 parser (hubbub)
  • Improving the DOM library implementing missing functionality.
While neither of these projects are as fashionable as some of our previous proposals they are well defined enough that as a group we believe we could offer enough support to the student to make their experience a pleasurable one and get the resulting code reviewed and merged promptly.

This event was very successful with a great deal achieved, the project is much more likely to be in a good shape to release 3.1 by April now and the meeting has given the developers a much welcomed boost.

I would like to extend the groups thanks to Robert McQueen for letting us use the Collabora offices, Dorée Carrier for organising all the administrative things and to Vivek Dasmohapatra for coming out on his Sunday afternoon to let us back in after locking ourselves out.


Thursday, 7 March 2013

Man cannot discover new oceans unless he has the courage to lose sight of the shore

Continuing with my whirlwind introduction to NetSurf Development now is the time to start examining the code, how its arranged and how to interact with the existing developers.

The way the NetSurf source is structured is around the idea of the frontends each being a native browser. While this implies that there are nine separate browsers that happen to share a common code base the separation is not quite that well defined.

Each frontend provides the OS entry point (the main() function in a c program) and calls out to standard browser initialisation entry function netsurf_init() and then starts running the browsers main dispatch loop with  netsurf_main_loop() when that exits the frontend cleans up with netsurf_exit().

The frontends provide a large selection of functions which are called from the core code. These routines run from running the event scheduler through to rendering graphics and text.

Finding your way around

The browsers directory layout is fairly shallow consisting of some Makefiles, the nine frontend directories and eleven others.

The Makefiles are GNU make and represent a pretty straightforward linear build system. We do not use recursive make or autotools. There are plans to use the core buildsystem that all the other components use.

The frontend directories contain the code for the frontend and the makefile fragments to build them which are included by the top level Makefile.

In addition there is:
desktop
This contains the non frontend specific code that actually behaves like a browser. For example desktop/netsurf.c contains the three primary functions we outlined in the introduction. You will also find much of the function and data structures interfaces the frontend must provide. It is unlikely someone new to the project will need to change anything in here (there be dragons) but is an important set of routines.
utils
Here you will find the utility and compatibility interfaces, things like url handling, logging, user messages, base64 handling. These are utility interfaces that do not justify splitting out their functionality to a separate library but are useful everywhere. Changing an interface in here would likely result in a major refactor.
For example one quirk is the logging macro was created before varadic C preprocessor macros were universal so it must be called as LOG(("hello %s world", world_type)) e.g. with double brackets. Fixing this and perhaps improving the logging functionality would be "nice" but the changes would be massive and potentially conflict with ongoing work.
content
This contains all the core code to handle contents i.e. html, css, javascript, images. The handling includes retrieval of the resources from URI , correct caching of the received objects and managing the objects state. It should be explicitly stated that the content handlers are separate and use this core functionality. The actual content handlers to deal with object contents such as the routines to decode image files or render html are elsewhere.
image
This is where the content handlers for the various image types are kept. The majority of these image types (jpeg, png, webp, gif, bmp) use a standard library to perform the actual decode (libjpeg, libpng etc.). One special feature used by most image handlers is that of a decoded image cache which is distinct and separate from the content cache.
The decoded image cache manages decoding of the source images (the jpegs and pngs) into frontend specific "render" bitmaps. For example the gtk frontend keeps the decoded images as a cairo surface ready for immediate plotting.
The cache uses a demand based (when the browser actually displays the image) just in time strategy which has been carefully balanced, with real world input, to reduce the overhead of unnecessary image decoding and processing against memory usage for the render bitmaps.
css
The css content handlers provide for the processing of css source text and use the NetSurf libcss library to process into a bytecode suitable for applying style selection at render time.
javascript
The javascript handlers (strictly speaking this should be named the "script" directory as all types of script are handled here) provide basic functionality to bind a javascript engine to the rest of the browser, principally the Document Object Model (DOM) accessed with libdom. The only engine that currently has bindings is Spidermonkey from the Mozilla foundation.
render
This is the heart of the browser containing the content handler for html (and plain text). This handler deals with:
  • Acquiring the html.
  • Running the base parser as data arrives which generates the DOM and hence DOM events from which additional resource (stylesheets etc.) fetches are started.
  • Deal with script loading
  • Constructing the box model used for layout
  • Performing the Layout and rendering of the document.
Because this module has so many jobs to do it has inevitably become very complex and involved, it is also the principle area of core development . Currently NetSurf lacks a dynamic renderer so changes made by scripts post document load event are not visible. This also has the side effect that the render is only started after the DOM has finished construction and all the resources on a page have completed their fetch which can lead to undesirable display latency.
Docs
Documentation about building and using NetSurf. If anyone wants a place to start improving NetSurf, this is it, it is very incomplete. It must be noted this is not where dynamically generated documentation is found. For the current Doxygen output the best place to look is the most recent build on the Continuous Integration system.
resources
These are runtime resources which are common to all frontends. To be strictly correct they may be the sources which get converted into runtime resources e.g. The FatMessages file which is teh message text for all frontends in all languages, this gets processed at build time into separate files ready.
!NetSurf
This is another resources directory and technically the resources for the riscos frontend. The naming and reliance on this directory are historical. To allow the RISC OS frontend to be run directly from the source directory and an inability of RISC OS to process symbolic links most common runtime resources end up in here and linked to from elsewhere.
test
These are some basic canned test programs and files, principally to test elements of the utils and perform specific exercise of various javascript components.

Getting started

Once a developer has a checked out working build environment and can run the executable for their chosen frontend (and maybe done some web browsing :-) it is time to look at contributing. 

If a developer does not have a feature or bug in mind when they begin the best way to get started is often to go bug hunting. The NetSurf bugtacker has lots to choose from unfortunately. Do remember to talk to us (IRC is the best bet if you are bug hunting) about what you are up to but do not be impatient. Some of those bugs are dirty great Shelob types and are not being fixed because even the core developers are stumped!

When first getting going I cannot recommend reading the code enough, this seems to be a skill that many inexperienced open source developers have yet to acquire, especially if they are from a predominately proprietary development background. One wonderful feature of open source software is you get to see all of it, all the elegant nice code and all the "what the hell were they thinking" code too.

One important point is to use your tools well the source is in git, if you learn how to use git well you will gain a skill that is readily portable, not just for NetSurf. And not just revision control tools, learn to use your debugger well and tools like valgrind. Those skills will replay the time spent learning them themselves many times over.

When using git one thing to remember is commit early and often, it does not matter if you have lots of junk commits and dead ends, once you have something viable you can use
git rebase --interactive
and rewrite it all into a single sensible set of commits. Also please do remember to develop on a branch, or if you did not
git pull --rebase
is your friend to avoid unnecessary merges.

Playing nicely with others

The NetSurf community is a small band of unpaid volunteers. On average we manage to collectively put in, perhaps, ten hours a week with the occasional developer weekend where we often manage over twenty hours each.

The result is that developer time is exceptionally valuable, add in a mature codebase with its fair share of technical debt and you get a group who, when they get to work on NetSurf, are incredibly busy. To be fair we are just like every other small open source project in that respect.

What this means to a new contributor is that they should try and do their homework before asking the obvious questions. The documentation is there for a reason and in spite of its obvious shortcomings please read it!

When asking questions it should be noted that currently the majority of active contributors are in the Europe so if you visit the IRC channel or post questions to lists the time difference is something to keep in mind.

I carefully said contributor above and not developer, users trying the CI builds and reporting results are welcome...as long as they report useful bugs to the bug tracker. Simon Tatham has produced an excellent resource on this subject.

Also we are always happy to receive translations to new languages (diff against the FatMessages file would be outstandingly useful but anything is welcome), artwork, documentation. Just recall what I mentioned about busy developers. Surest way to get us to see something is the development mailing list, you will probably get a reply, though I will not promise how fast!

Some of the more common mistakes when interacting with the community are:
  • Demanding we fix or add a feature. At best we will ignore you...though merciless sarcasm is not an unusual response to this. Perhaps a polite suggestion to the users mailing list would get better response? This is simple case of misunderstanding the relationship with the developers, you got the software for free so demanding we spend our leisure time to change it for you is impolite, or at least that is how I see it (and I am British, we do polite to excess).
  • Request write access to the git repository without a proven track record. We are fairly open to new developers once they have a track record but initially contributions should be via patch series on the mailing list we can feed to git-am. Eventually we may give you commit access to your own personal branch space and from there extend to the rest of the repository.
  • Developing a feature without talking to the team first and then being upset when we reject it. This is especially aggravating for all concerned as effort is wasted all around. If you have a great idea for a feature talk to us first! And if we indicate in our typically polite way that it is not going to be accepted listen to us! Of course you are free to ignore us, just please do not be upset later on.
  • Non-constructive criticism. What I refer to here is finding fault in our software without logging a bug or otherwise providing something to respond to. We try to provide the best software we can and by extension have a great deal of pride in our project. This antisocial behaviour helps no one but can have a large negative impact on developer productivity.

In conclusion

Hopefully this has been of some use although I had hoped to cover more and provide deeper insights and advice on the codebase but there is only so much generalisation to be done before it is just easier for the developer to go read the code for themselves. 

I look forward to lots of new contributions :-) though I fear this may all end up as more of a crib sheet for next time we do GSOC, time will tell.

The way to get started is to quit talking and begin doing

When Walt Disney said that he almost certainly did not have software developers in mind. However it is still good advice, especially if you have no experience with a piece of software you want to change.

Others have written extensively on the topic of software as more than engineering and the creative aspects, comparing it to a craft, which is a description I am personally most comfortable with. As with any craft though you have to understand the material you have to work with and an existing codebase is often a huge amount of material.

While developing NetSurf we get a lot of people who talk a lot about what we "ought" or "should" do to improve the browser but few who actually get off their collective posteriors and contribute. In fact according to the ohloh analysis of the revision control system there are roughly six of us who contribute significantly on a regular basis and of those only four who have interests beyond a specific frontend.

It has been mentioned by a couple of new people who have recently compiled the browser from source that it is somewhat challenging to get started. To address this criticism I intend to give a whirlwind introduction to getting started with the NetSurf codebase, perhaps we can even get some more contributors!

This first post will cover the mechanics of acquiring and building the source and the next will look at working with the codebase and the Netsurf community.

This is my personal blog, the other developers might disagree with my approach, which is why this is in my blog and not on the NetSurf website. That being said comments are enabled and I am sure they will correct anything I get wrong.

Resources

NetSurf has a selection of resources which are useful to a new developer:

Build environment

The first thing a new developer has to consider is their build environment. NetSurf supports nine frontends on several Operating Systems (OS) but is limited on the build environment that can be used.

The developer will require a Unix like system but let's be honest, we have not tried with anything other than Linux distributions in some time or MAC OS X for the cocoa frontend because its a special snowflake.

Traditionally at this point in this kind of introduction it would be traditional to provide the command line for various packaging systems to install the build environment and external libraries. We do have documentation that does this but no one reads it, or at least it feels like that. Instead we have chosen to provide a shell fragment that encodes all the bootstrap knowledge in one place, its kept in the revision control system so it can be updated.

To use: download it, read it (hey running random shell code is always a bad idea), source it into your environment and run ns-apt-get-install on a Debain based system or ns-yum-install on Fedora. The rest of this posting will assume the functionality of this script is available, if you want to do it the hard way please refer to the script for the relevant commands and locations.

For Example:
$ wget http://git.netsurf-browser.org/netsurf.git/plain/Docs/env.sh
$ less env.sh
$ source env.sh
$ ns-apt-get-install

Historically NetSurf built on more platforms natively but the effort to keep these build environments working was extensive and no one was prepared to do the necessary maintenance work. This is strictly a build setup decision and does not impact the supported platforms.

Since the last release NetSurf has moved to the git version control system from SVN. This has greatly improved our development process and allows for proper branching and merging we previously struggled to implement.

In addition to the core requirements external libraries NetSurf depends on will need to be installed. Native frontends where the compiled output is run on the same system it was built on are pretty straightforward in that the native package management system can be used to install the libraries for that system.

For cross building to the less common frontends we provide a toolchain repository which will build the entire cross toolchain and library set (we call this the SDK) direct from source. This is what the CI system uses to generate its output so is well tested.

External Libraries

NetSurf depends upon several external development libraries for image handling, network fetching etc. The libraries for the GTK frontend are installed by default if using the development script previously mentioned.

Generally a minimum of libcurl, libjpeg and libpng are necessary along with whatever libraries are required for the toolkit.

Project Source and Internal Libraries

One important feature of NetSurf is that a lot of functionality is split out into libraries. These are internal libraries and although technically separate projects, releases bundle them all together and for development we assume they will all be built together.

The development script provides the ns-clone function which clones all the project sources directly from their various git repositories. Once cloned the ns-make script can be used to build and install all the internal libraries into a local target ready for building the browser.

For Example:

$ source env.sh
$ ns-clone
$ ns-make-libs install

Frontend selection

As I have mentioned NetSurf supports several windowing environments (toolkits if you like) however on some OS there is only one toolkit so the two get conflated together.

NetSurf currently has nine frontends to consider:
  • amiga
    This frontend is for Amiga OS 4 on the power PC architecture and is pretty mature. It is integrated into the continuous integration (CI) system and has an active maintainer. Our toolchain repository can build a functional cross build environment, the target is ppc-amigaos.
  • atari
    This frontend is for the m68k and m5475 (coldfire) architecture. It has a maintainer but is still fairly limited principally because of the target hardware platform. It is integrated into the continuous integration system. Our toolchain repository can build a functional cross build environment for both architectures.
  • beos
    This frontend is for beos and the Haiku clone. It does have a maintainer although they are rarely active. It is little more than a proof of concept port and there is no support in the CI system because there is currently no way to run the jenkins slave client or to construct a viable cross build environment. This frontend is unusual in that it is the only one written in C++ 
  • cocoa
    NetSurf Mac OS X build boxes for PPC and X86This frontend supports the cocoa, the windowing system of MacOS X, on both PPC (version 10.5) and X86 (10.6 or later). The port is usefully functional and is integrated into the CI system, built natively on Mac mini systems as a jenkins slave. The port is written in objective C and currently has no active maintainer. 
  • framebuffer
    This frontend is different to the others in that it does not depend on a system toolkit and allows the browser to be run anywhere the projects internal libnsfb library can present a linear framebuffer. It is maintained and integrated into the CI system.
  • gtk
    This frontend uses the gtk+ toolkit library and is probably the most heavily used frontend by the core developers.  The port is usefully functional and is integrated into the CI system, there is no official maintainer. 
  • monkey
    This frontend is a debugging and test framework. It can be built with no additional library dependencies but has no meaningful user interface. It is maintained and integrated into the CI system.
  • riscos
    This frontend is the oldest from which the browser evolved. The port is usefully functional and is integrated into the CI system. There is an official maintainer for this frontend although they are not active very often. Our toolchain repository can build a functional cross build environment for this target.
  • windows
    This frontend would more accurately be called the win32 frontend as it specifically targets that Microsoft API. The port is functional but suffers from a lack of a maintainer. The port is integrated into the CI system and the toolchain repository can build a functional cross build environment for this target.

Building and running NetSurf

For a developer new to the project I recommend that the gtk version be built natively which is what I describe here.

Once the internal libraries have been installed, building NetSurf itself is as simple as running make.

For Example:
$ source env.sh
$ ns-make -C ${TARGET_WORKSPACE}/${NS_BROWSER} TARGET=gtk

Though generally most developers would change into the netsurf source directory and run make there. The target (frontend) selection defaults to gtk on Linux systems so that can also be omitted  Once the browser is built it can be run from the source tree to test.

For Example:
$ source env.sh
$ cd ${TARGET_WORKSPACE}/${NS_BROWSER}
$ ns-make
$ ./nsgtk

The build can be configured by editing a Makefile.config file. An example Makefile.config.example can be copied into place and the configuration settings overridden as required. The default values can be found in Makefile.defaults which should not be edited directly.

Logging is enabled with the command line switch -v and user options can be specified on the command line, options on the command line will override those sourced from a users personal configuration, generally found in ~/.netsurf/Choices although this can be compile time configured.

Monday, 26 March 2012

NetSurf Developer Workshop

NetSurf DevelopersOver the weekend we held the NetSurf developer workshop. The event was kindly hosted by Collabora at their Cambridge offices. The provided facilities were agreed to be excellent and contributed to the success of the event.

Five developers managed to attend from around the UK John-Mark Bell, Vincent Sanders, Michael Drake, Daniel Silverstone and Rob Kendrick. In addition James Shaw, one of the project founders, made a special appearance on Saturday.

Starting from Friday afternoon we each put in over 25 hours of actual useful work and made almost 170 commits changing over 350 files, added 10,000 new lines of code and altered another 18,000.

The main aim of the event was to make the transition from using libxml2 to our own libdom library for the browser DOM tree. This was done to improve the browsers performance and size when manipulating the DOM but also gives us the ability to extend the browsers features to include dynamic rendering and Javascript

We also took the opportunity to discuss and plan other issues including:
  • User interface message handling and translation.
  • User preference handling.
  • Toolchain support.
  • Disc caching.
  • Javascript binding
  • Electronic book content handler.
Rob tackled the first parts of the messages conversion from numerous separate files into a single easy to handle file which will in future allow for easier translation and reduce message proliferation.

We made decisions on the ongoing rework of user preference handling which will be implemented in future.

The decision on the toolchain was slightly changed to be that any core or library code (non frontend/toolkit specific parts) are required to conform to the C99 standard. Frontends are permitted to recommend and use whatever tools their maintainer selects but they cannot enforce those restrictions on core code. This issue is principly because the BeOS maintainer is compiling NetSurf with g++ 2.95 which is missing several important language features we wish to use.

Developers at workThe recurring issue of disc caching was raised again and we have come up with what we hope is a reasonably elegant solution to be implemented over the forthcoming months.

Now there is a suitable DOM to bind against, the existing Javascript engine support will be properly integrated and should result in basic script support before the 3.0 release. This support will remain a build time option so NetSurf can continue to be used on platforms where the interpreter is too resource intensive to be used.

A short discussion about the possibility of integrating a basic page based content handler for epub and mobi type documents was discussed and while the idea was well received no decisions on implementation were made.

Overall the event was a resounding success and we are left with only a small handful of regressions which appear straightforward to solve. We also have a clear set of decisions on what we need to do to improve the browser.