Thursday 7 March 2013

Man cannot discover new oceans unless he has the courage to lose sight of the shore

Continuing my whirlwind introduction to NetSurf development, now is the time to start examining the code, how it is arranged and how to interact with the existing developers.

The NetSurf source is structured around the idea of each frontend being a native browser. While this implies that there are nine separate browsers that happen to share a common code base, the separation is not quite that well defined.

Each frontend provides the OS entry point (the main() function in a C program), calls the standard browser initialisation function netsurf_init() and then starts running the browser's main dispatch loop with netsurf_main_loop(). When that exits, the frontend cleans up with netsurf_exit().
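To make that call sequence concrete, here is a minimal sketch of the lifecycle. Only the three function names come from the text above. Their signatures and stub bodies, the frontend_run() helper and the loop_iterations counter are all assumptions made for illustration and do not match the real declarations in desktop/netsurf.c.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Stand-ins for the core entry points. The signatures and bodies are
 * assumptions for this sketch, not the real NetSurf declarations.
 */
int loop_iterations = 0;
static bool core_running = false;

bool netsurf_init(void)
{
    /* The real function sets up the browser core; here we just flag it. */
    core_running = true;
    return true;
}

void netsurf_main_loop(void)
{
    /* The real loop dispatches events until the user quits. This stub
     * makes three passes and then stops. */
    while (core_running) {
        loop_iterations++;
        if (loop_iterations >= 3) {
            core_running = false;
        }
    }
}

void netsurf_exit(void)
{
    /* The real function tears the core down again. */
    core_running = false;
}

/* Hypothetical helper: roughly what a frontend's main() boils down to. */
int frontend_run(void)
{
    if (!netsurf_init()) {
        fprintf(stderr, "initialisation failed\n");
        return 1;
    }
    netsurf_main_loop();
    netsurf_exit();
    return 0;
}
```

In this sketch a frontend's main() would do its own toolkit setup and then simply return frontend_run().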

The frontends provide a large selection of functions which are called from the core code. These routines range from running the event scheduler through to rendering graphics and text.

Finding your way around

The browser's directory layout is fairly shallow, consisting of some Makefiles, the nine frontend directories and eleven others.

The Makefiles are GNU make and represent a pretty straightforward linear build system. We do not use recursive make or autotools. There are plans to use the core buildsystem that all the other components use.

Each frontend directory contains the code for that frontend and the makefile fragments to build it, which are included by the top level Makefile.

In addition there are:
desktop
This contains the non frontend specific code that actually behaves like a browser. For example desktop/netsurf.c contains the three primary functions we outlined in the introduction. You will also find many of the function and data structure interfaces the frontend must provide. It is unlikely someone new to the project will need to change anything in here (there be dragons) but it is an important set of routines.
utils
Here you will find the utility and compatibility interfaces, things like URL handling, logging, user messages and base64 handling. These are utility interfaces that do not justify splitting out their functionality to a separate library but are useful everywhere. Changing an interface in here would likely result in a major refactor.
For example one quirk is that the logging macro was created before variadic C preprocessor macros were universal, so it must be called as LOG(("hello %s world", world_type)), i.e. with double brackets. Fixing this and perhaps improving the logging functionality would be "nice" but the changes would be massive and potentially conflict with ongoing work.
content
This contains all the core code to handle contents, i.e. html, css, javascript and images. The handling includes retrieval of the resources from a URI, correct caching of the received objects and managing each object's state. It should be explicitly stated that the content handlers are separate and use this core functionality. The actual content handlers to deal with object contents, such as the routines to decode image files or render html, are elsewhere.
image
This is where the content handlers for the various image types are kept. The majority of these image types (jpeg, png, webp, gif, bmp) use a standard library to perform the actual decode (libjpeg, libpng etc.). One special feature used by most image handlers is that of a decoded image cache which is distinct and separate from the content cache.
The decoded image cache manages decoding of the source images (the jpegs and pngs) into frontend specific "render" bitmaps. For example the gtk frontend keeps the decoded images as a cairo surface ready for immediate plotting.
The cache uses a demand based (when the browser actually displays the image) just in time strategy which has been carefully balanced, with real world input, to reduce the overhead of unnecessary image decoding and processing against memory usage for the render bitmaps.
css
The css content handlers provide for the processing of css source text and use the NetSurf libcss library to process it into a bytecode suitable for applying style selection at render time.
javascript
The javascript handlers (strictly speaking this should be named the "script" directory as all types of script are handled here) provide basic functionality to bind a javascript engine to the rest of the browser, principally the Document Object Model (DOM) accessed with libdom. The only engine that currently has bindings is SpiderMonkey from the Mozilla Foundation.
render
This is the heart of the browser containing the content handler for html (and plain text). This handler deals with:
  • Acquiring the html.
  • Running the base parser as data arrives which generates the DOM and hence DOM events from which additional resource (stylesheets etc.) fetches are started.
  • Dealing with script loading.
  • Constructing the box model used for layout.
  • Performing the layout and rendering of the document.
Because this module has so many jobs to do it has inevitably become very complex and involved. It is also the principal area of core development. Currently NetSurf lacks a dynamic renderer, so changes made by scripts after the document load event are not visible. This also has the side effect that rendering only starts after the DOM has finished construction and all the resources on a page have completed their fetch, which can lead to undesirable display latency.
Docs
Documentation about building and using NetSurf. If anyone wants a place to start improving NetSurf, this is it: the documentation is very incomplete. Note that this is not where dynamically generated documentation is found. For the current Doxygen output the best place to look is the most recent build on the Continuous Integration system.
resources
These are runtime resources which are common to all frontends. To be strictly correct, they may be the sources which get converted into runtime resources. For example the FatMessages file, which holds the message text for all frontends in all languages, gets processed at build time into separate files ready for use.
!NetSurf
This is another resources directory and technically holds the resources for the riscos frontend. The naming and reliance on this directory are historical. Because the RISC OS frontend needs to be runnable directly from the source directory, and RISC OS cannot process symbolic links, most common runtime resources end up in here and are linked to from elsewhere.
test
These are some basic canned test programs and files, principally to test elements of the utils and to exercise various javascript components.
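The double-bracket logging call mentioned under utils is easier to see with a small example. This is only a sketch of the general pre-C99 technique, not NetSurf's actual macro: sketch_log(), LOG_OLD, LOG_NEW, last_log and the two demo functions are all hypothetical names invented here.

```c
#include <stdarg.h>
#include <stdio.h>

/* Last formatted message, kept in a buffer so the result can be inspected. */
static char last_log[128];

/* Hypothetical log sink for this sketch; NetSurf's real logging differs. */
void sketch_log(const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    vsnprintf(last_log, sizeof last_log, fmt, ap);
    va_end(ap);
    fprintf(stderr, "%s\n", last_log);
}

/*
 * Pre-C99 style: the macro takes exactly ONE argument. The caller wraps
 * the whole argument list in its own brackets, so "sketch_log x" expands
 * to an ordinary call: sketch_log ("fmt", args...).
 */
#define LOG_OLD(x) do { sketch_log x; } while (0)

/* C99 style: a variadic macro forwards the arguments directly. */
#define LOG_NEW(...) do { sketch_log(__VA_ARGS__); } while (0)

const char *demo_old(const char *world_type)
{
    LOG_OLD(("hello %s world", world_type));  /* note the double brackets */
    return last_log;
}

const char *demo_new(const char *world_type)
{
    LOG_NEW("hello %s world", world_type);    /* single brackets suffice */
    return last_log;
}
```

Both forms produce the same message. The difference is purely at the call site, which is why converting an existing codebase means touching every single logging call.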

Getting started

Once a developer has checked out the source, has a working build environment and can run the executable for their chosen frontend (and has maybe done some web browsing :-) it is time to look at contributing.

If a developer does not have a feature or bug in mind when they begin, the best way to get started is often to go bug hunting. Unfortunately the NetSurf bug tracker has lots to choose from. Do remember to talk to us (IRC is the best bet if you are bug hunting) about what you are up to, but do not be impatient. Some of those bugs are dirty great Shelob types and are not being fixed because even the core developers are stumped!

When first getting going I cannot recommend reading the code enough. This seems to be a skill that many inexperienced open source developers have yet to acquire, especially if they are from a predominantly proprietary development background. One wonderful feature of open source software is you get to see all of it, all the elegant nice code and all the "what the hell were they thinking" code too.

One important point is to use your tools well. The source is in git, and if you learn how to use git well you will gain a skill that is readily portable, not just for NetSurf. And not just revision control tools: learn to use your debugger well, and tools like valgrind. Those skills will repay the time spent learning them many times over.

When using git, one thing to remember is to commit early and often. It does not matter if you have lots of junk commits and dead ends; once you have something viable you can use
git rebase --interactive
and rewrite it all into a single sensible set of commits. Also please do remember to develop on a branch or, if you did not,
git pull --rebase
is your friend to avoid unnecessary merges.

Playing nicely with others

The NetSurf community is a small band of unpaid volunteers. On average we manage to collectively put in, perhaps, ten hours a week with the occasional developer weekend where we often manage over twenty hours each.

The result is that developer time is exceptionally valuable. Add in a mature codebase with its fair share of technical debt and you get a group who, when they get to work on NetSurf, are incredibly busy. To be fair we are just like every other small open source project in that respect.

What this means to a new contributor is that they should try and do their homework before asking the obvious questions. The documentation is there for a reason and in spite of its obvious shortcomings please read it!

When asking questions it should be noted that currently the majority of active contributors are in Europe, so if you visit the IRC channel or post questions to the lists the time difference is something to keep in mind.

I carefully said contributor above and not developer. Users trying the CI builds and reporting results are welcome...as long as they report useful bugs to the bug tracker. Simon Tatham has produced an excellent resource on this subject.

Also we are always happy to receive translations to new languages (a diff against the FatMessages file would be outstandingly useful but anything is welcome), artwork and documentation. Just recall what I mentioned about busy developers. The surest way to get us to see something is the development mailing list. You will probably get a reply, though I will not promise how fast!

Some of the more common mistakes when interacting with the community are:
  • Demanding we fix or add a feature. At best we will ignore you...though merciless sarcasm is not an unusual response to this. Perhaps a polite suggestion to the users mailing list would get a better response? This is a simple case of misunderstanding the relationship with the developers: you got the software for free, so demanding we spend our leisure time to change it for you is impolite, or at least that is how I see it (and I am British, we do polite to excess).
  • Requesting write access to the git repository without a proven track record. We are fairly open to new developers once they have a track record but initially contributions should be via patch series on the mailing list that we can feed to git-am. Eventually we may give you commit access to your own personal branch space and from there extend to the rest of the repository.
  • Developing a feature without talking to the team first and then being upset when we reject it. This is especially aggravating for all concerned as effort is wasted all around. If you have a great idea for a feature talk to us first! And if we indicate in our typically polite way that it is not going to be accepted listen to us! Of course you are free to ignore us, just please do not be upset later on.
  • Non-constructive criticism. What I refer to here is finding fault in our software without logging a bug or otherwise providing something to respond to. We try to provide the best software we can and by extension have a great deal of pride in our project. This antisocial behaviour helps no one but can have a large negative impact on developer productivity.

In conclusion

Hopefully this has been of some use. I had hoped to cover more and provide deeper insights and advice on the codebase, but there is only so much generalisation to be done before it is just easier for the developer to go and read the code for themselves.

I look forward to lots of new contributions :-) though I fear this may all end up as more of a crib sheet for next time we do GSOC, time will tell.
