
Parts needed for Addon infrastructure for Firefox for iOS


I've been thinking a lot about addons recently.

This post is exploratory and technical in nature.

Context: I work on the Firefox for iOS team; we have an inkling that Firefox for iOS addons may be a Good Thing™, but have nothing concrete on our current road map. This particular post is just about finding the things that will be hard or impossible, and working out what to ask the WebKit developers next time they come to speak to us.

Disclaimer: We may want to adopt the proto-spec that is WebExtensions, supporting some or all of it; we may want to contribute to the spec by adding our own innovations. We may want to ignore the spec altogether. It is too early even for these decisions.

My assumptions on this matter for this post are:

  • We do want to support extensions (otherwise, it's a short conversation, and not go-faster).
  • We are convinced that Apple won't reject the app, or we know what to do if they do.
  • We'd like to be able to follow along with the rest of the Firefox addons ecosystem, for porting reasons.
  • Tradeoffs mean that we cannot, or will never, offer 100% support across all the platforms where Firefox has a presence.

Instead, here I'm going to focus on sketching out the major assumptions that any addons need and the pile of technology that iOS provides.

WebExtensions non-functional requirements start with:

  • Porting add-ons to and from other browsers should be easier.
  • Reviewing add-ons for addons.mozilla.org should be easier.
  • WebExtensions must be compatible with multiprocess Firefox (Electrolysis).
  • Changes to Firefox's internal code should be less likely to break add-ons.
  • WebExtensions should be easier to use than the existing Firefox XPCOM/XUL APIs.

Additionally:

  • Install and uninstall should be done without a restart. The concept of 'restart' doesn't really exist on mobile.
  • Given that it's our stated aim that we want to ship major components as addons, and we may want to extend those major components with other addons, we should allow limited and controlled communication between addons.
  • Addon authors should have a reasonable expectation of secure separation between addons.
  • Speed is always a feature.

Some interesting addons worth thinking about:

  • WhimseyPro
  • uBlock Origin
  • PDF.js (it doesn't need to be done, but /could it/).

From an architectural view, my assumptions are:

  • an addon has a manifest.json which declares the components of the addon.
    • html pages loaded in a new tab or popup
    • browser chrome implemented in Swift with event listeners in JavaScript.
    • addon reviewers should have a strong expectation that the addon's capabilities are limited to what it declares in manifest.json.
  • an addon can have javascript and css that runs in the content – in the WKWebView
    • it is conditionally run based on the page's location.
    • the js can be run before or after the page is loaded
    • the js can fetch resources from domains specified
  • an addon can have javascript that runs in the background
    • the background can interact with the browser chrome,
  • javascript from the two different contexts can communicate, but by message passing.
  • Wiring javascript to chrome events and access to the addon api is done through the chrome object.
    • the chrome object has api objects added to it, via permissions declared in each addon's manifest.
    • non-persistent background scripts are said to be event scripts.
  • The actual api that developers will use is not being discussed here: just the foundations on which that API will be built.
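
To make these assumptions a little more concrete, here is a minimal sketch of a manifest and of the "conditionally run based on the page's location" rule. The manifest keys are WebExtensions-style; the addon name, the file names and the helpers `globToRegExp`, `matchesPattern` and `contentScriptsFor` are hypothetical. The pattern matcher is deliberately simplified (http/https schemes, hosts with an optional leading `*.`, and `*` globbing in the path), and the `document_idle` default for `run_at` is an assumption borrowed from other browsers.

```javascript
// Hypothetical manifest for an imaginary addon; the keys are WebExtensions-style.
const exampleManifest = {
  manifest_version: 2,
  name: "Example Addon",
  version: "0.1",
  permissions: ["tabs", "storage"],
  background: { scripts: ["background.js"] },
  content_scripts: [
    { matches: ["*://*.example.com/*"], js: ["early.js"], run_at: "document_start" },
    {
      matches: ["https://example.org/articles/*"],
      js: ["late.js"],
      css: ["late.css"],
      run_at: "document_end",
    },
  ],
};

// Escape regex metacharacters (except "*"), then turn each "*" into ".*".
function globToRegExp(glob) {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return escaped.replace(/\*/g, ".*");
}

// Simplified match-pattern test of the form <scheme>://<host><path>.
function matchesPattern(pattern, url) {
  const parts = /^(\*|https?):\/\/(\*|(?:\*\.)?[^/*]+)(\/.*)$/.exec(pattern);
  if (!parts) return false; // pattern shape not supported by this sketch
  const [, scheme, host, path] = parts;
  const { protocol, hostname, pathname, search } = new URL(url);
  const urlScheme = protocol.slice(0, -1); // "https:" -> "https"
  if (urlScheme !== "http" && urlScheme !== "https") return false;
  if (scheme !== "*" && scheme !== urlScheme) return false;
  if (host !== "*") {
    if (host.startsWith("*.")) {
      const base = host.slice(2);
      if (hostname !== base && !hostname.endsWith("." + base)) return false;
    } else if (hostname !== host) {
      return false;
    }
  }
  return new RegExp("^" + globToRegExp(path) + "$").test(pathname + search);
}

// Collect the scripts and styles that should run for `url` at the given phase.
function contentScriptsFor(manifest, url, runAt) {
  const js = [];
  const css = [];
  for (const entry of manifest.content_scripts || []) {
    const when = entry.run_at || "document_idle"; // assumed default
    if (when !== runAt) continue;
    if (!entry.matches.some((pattern) => matchesPattern(pattern, url))) continue;
    js.push(...(entry.js || []));
    css.push(...(entry.css || []));
  }
  return { js, css };
}
```

Because the decision only needs the manifest and the final URL, the same function could run on either side of the bridge, which leaves open where the matching ends up living.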

The pile of bits provided by iOS 8 consists of:

  • JSVirtualMachine, which can host multiple JSContext objects, which must all run in the same thread, which doesn't have to be the main thread.
    • Each context has its own global object, but not much else. JSON is provided, but XHR isn't.
    • Objects, including functions, can be shared between JSContexts in the same JSVirtualMachine. Functions called by functions in a shared object run in the context they were declared in, and side effects stay with that context. I have not experimented with twiddling this. From a JavaScript point of view, it feels almost exactly like the separation provided by CommonJS modules.
    • There is a rich API for sharing Swift mutable objects with JavaScript, and vice versa. This is up to and including passing function literals back and forth.
  • webView.configuration.userContentController: WKUserContentController, WKUserScript, WKScriptMessage and its associated WKScriptMessageHandler.
    • User scripts can be executed each time the content is loaded, before the content or after the document is loaded: there is no sharing of state across reloads.
    • evaluateJavaScript:completionHandler: provides a bridge from Swift to JS.
    • WKScriptMessage and WKScriptMessageHandler provide a bridge from js to swift.
    • WKUserContentController can add WKUserScripts one at a time, but can only remove all scripts at once.
    • Out of the box, I haven't found a way to allow user scripts to XHR a domain other than those allowed by the server the content came from. There are workarounds, but none of them are pretty, or cheap. Research and help needed.
  • GCDWebServer. This would be very useful if content scripts could load resources via HTTP, but since we can't change the same-origin policy of web content, it is limited to loading addon pages into content. If we can change it, then we could:
    • implement chrome.runtime.connect with WebSockets into the GCDWebServer
    • load addon CSS asynchronously by writing script and style tags
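
As an illustration of the JS-to-Swift bridge listed above: inside a WKWebView, a script reaches a registered WKScriptMessageHandler through `window.webkit.messageHandlers.<name>.postMessage(body)`. The sketch below wraps that call. The handler name `addons`, the envelope fields (`addonId`, `type`, `payload`) and the helper `makeNativeBridge` are hypothetical, and `fakeWindow` only stands in for the real object so the sketch can run outside a web view.

```javascript
// Build a function that posts addon messages to a native message handler.
// `handlerName` is whatever name the Swift side registered (hypothetical here).
function makeNativeBridge(win, handlerName) {
  const handlers = win.webkit && win.webkit.messageHandlers;
  const handler = handlers ? handlers[handlerName] : undefined;
  if (!handler) {
    throw new Error("No native message handler named " + handlerName);
  }
  return function postToNative(addonId, type, payload) {
    // Round-tripping through JSON keeps the body to plain data:
    // functions are dropped and cyclic structures throw here, not in the bridge.
    const body = JSON.parse(JSON.stringify({ addonId, type, payload }));
    handler.postMessage(body);
    return body;
  };
}

// Stand-in for the object WKWebView injects, so the sketch runs anywhere.
const received = [];
const fakeWindow = {
  webkit: {
    messageHandlers: {
      addons: { postMessage: (body) => received.push(body) },
    },
  },
};

const postToNative = makeNativeBridge(fakeWindow, "addons");
postToNative("example-addon", "runtime.sendMessage", { greeting: "hello" });
```

The reverse direction would be Swift calling evaluateJavaScript:completionHandler: with a call into a function the user script has installed; that half is not shown here.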

Our job now is to decide how to make the first list of things out of the second list of things.

Having done a lot of this research, I'm pretty convinced that we should run all background and chrome event scripts in a single virtual machine, preferably off the main thread. This would mean all addons run on the same thread; any extra thread separation would come at the cost of implementation speed and flexibility.

Since the manifest is meant to be the central auditable glue that ties the addon together, it should almost certainly be parsed in Swift, and the Source of Truth About Addons should be kept there. However, it should be accessible from JavaScript, both for the WebExtensions spec and to allow us to implement the chrome object in terms of JavaScript as well as Swift. I refer to the JS and Swift access to this source of truth, collectively, as the registry.

Given how easy it is to move objects between JSContexts, and the extra separation it provides, the obvious choice is to have one JSContext object per addon (shared between all of that addon's background and event scripts).

However, given that manifest.json access is expected by the addons, and we'll be constructing some of the chrome object in Javascript, we should also have a tying-the-js-knot JSContext that manages the JS side of manifest semantics, and the JS-implemented bits of the chrome object.
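
A minimal sketch of what that knot-tying context might do: build a per-addon chrome object whose API objects are limited to the permissions declared in the manifest. `chrome.runtime.id` and `chrome.runtime.getManifest()` exist in WebExtensions-style APIs; everything else here (`apiFactories`, `buildChromeObject`, the stub `storage` and `tabs` objects, and the idea that `runtime` is always present) is a hypothetical illustration, not a proposed API.

```javascript
// Hypothetical factories, keyed by permission name. Each returns the API
// object for one addon; real ones would call across the bridge into Swift.
const apiFactories = {
  storage: (addonId) => ({ owner: addonId, get: () => undefined }),
  tabs: (addonId) => ({ owner: addonId, query: () => [] }),
};

// Build the chrome object for one addon from its parsed manifest.
function buildChromeObject(addonId, manifest, factories) {
  const chrome = {
    // Assumed to be available to every addon, whatever it declares.
    runtime: Object.freeze({
      id: addonId,
      // Hand out a copy so addons cannot mutate the registry's manifest.
      getManifest: () => JSON.parse(JSON.stringify(manifest)),
    }),
  };
  for (const permission of manifest.permissions || []) {
    // Own-property check so names like "constructor" never resolve.
    if (Object.prototype.hasOwnProperty.call(factories, permission)) {
      chrome[permission] = factories[permission](addonId);
    }
  }
  return Object.freeze(chrome);
}

// An addon that only declares "storage" never sees chrome.tabs.
const chromeForExample = buildChromeObject(
  "example-addon",
  { name: "Example Addon", version: "0.1", permissions: ["storage"] },
  apiFactories
);
```

Keeping the construction in one place also gives reviewers a single spot to check that capabilities really do follow from manifest.json.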

I haven't looked at this yet, but we may be able to extend WKUserContentController to allow finer-grained control over which WKUserScripts are loaded: I do not know if these can be changed after the new location is known (after redirects) and before the scripts are loaded. This would be the ideal situation.

If not, we have to load all content scripts for all addons for all pages, which later becomes low-hanging fruit for optimization. The worst case will be passing JavaScript over the bridge to an AddonContentHelper.js user script; this would be horrible, especially when simulating AtDocumentStart.

CSS contributed by addons could go either route.

There needs to be some addon runtime made of two WKUserScripts, one run AtDocumentStart and one AtDocumentEnd.

This would, minimally, construct the chrome object and metadata for each addon before running it. Maximally, it needs to be able to receive JavaScript and CSS, or coordinate which of the addon scripts are going to run and which CSS is applied.
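
A rough sketch of the minimal version of such a runtime, under some loud assumptions: script sources arrive as strings (how they get there is the open question above), each script is wrapped in a function so it receives its own `chrome` object, and the page does not block `new Function` via its content security policy. `createContentRuntime`, the `addons` shape and the `log` are hypothetical. A function wrapper is only a convenience for scoping; it is not a security boundary between addons.

```javascript
// `addons` is a list of { id, scripts: [{ runAt, source }] } (hypothetical shape).
function createContentRuntime(addons) {
  const log = [];

  // Called once from the AtDocumentStart script and once from the AtDocumentEnd one.
  function runPhase(phase) {
    for (const addon of addons) {
      // Each addon gets its own chrome object, built before its scripts run.
      const chrome = Object.freeze({ runtime: Object.freeze({ id: addon.id }) });
      for (const script of addon.scripts) {
        if (script.runAt !== phase) continue;
        try {
          // The wrapper keeps the script's top-level declarations out of the page's globals.
          new Function("chrome", script.source)(chrome);
          log.push({ addon: addon.id, phase, ok: true });
        } catch (error) {
          // One failing addon should not stop the others.
          log.push({ addon: addon.id, phase, ok: false, message: String(error) });
        }
      }
    }
  }

  return { runPhase, log };
}
```

The two WKUserScripts would then be little more than `runtime.runPhase("document_start")` and `runtime.runPhase("document_end")`, with the interesting work being how `addons` is populated.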

We have enough bridges between Swift and the various Javascript contexts that chrome.runtime.sendMessage can be done with a little bookkeeping. I'm not sure about chrome.runtime.connect.
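
The "little bookkeeping" for sendMessage is mostly a table of pending response callbacks keyed by a request id. Here is a minimal sketch, assuming each context owns one router and that `transport` is whatever carries an envelope to the other side (in the app that would be the Swift bridges; here the two routers are wired directly together). `createMessageRouter`, the envelope shape and `pendingCount` are hypothetical; the listener signature `(message, sender, sendResponse)` mirrors the WebExtensions-style onMessage listener.

```javascript
// One router per JavaScript context. `transport(envelope)` delivers to the other side.
function createMessageRouter(selfId, transport) {
  let nextId = 1;
  const pending = new Map(); // request id -> response callback
  const listeners = [];

  return {
    onMessage(listener) {
      listeners.push(listener);
    },
    sendMessage(message, responseCallback) {
      const id = nextId++;
      if (responseCallback) pending.set(id, responseCallback);
      transport({ kind: "request", id, sender: { id: selfId }, message });
      return id;
    },
    // Called by the transport when an envelope arrives from the other side.
    receive(envelope) {
      if (envelope.kind === "response") {
        const callback = pending.get(envelope.id);
        pending.delete(envelope.id);
        if (callback) callback(envelope.message);
        return;
      }
      let responded = false;
      const sendResponse = (message) => {
        if (responded) return; // only the first response is delivered
        responded = true;
        transport({ kind: "response", id: envelope.id, message });
      };
      for (const listener of listeners) {
        listener(envelope.message, envelope.sender, sendResponse);
      }
    },
    pendingCount() {
      return pending.size;
    },
  };
}

// Wire a "content" router and a "background" router directly together.
let background;
const content = createMessageRouter("content", (envelope) => background.receive(envelope));
background = createMessageRouter("background", (envelope) => content.receive(envelope));

background.onMessage((message, sender, sendResponse) => {
  sendResponse({ echoed: message.text, from: sender.id });
});

const replies = [];
content.sendMessage({ text: "ping" }, (reply) => replies.push(reply));
```

A real version would also need to expire callbacks whose page has gone away. chrome.runtime.connect is harder because the port is long-lived, which is why it stays an open question.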

Javascript Build tools

I can't decide whether to go balls out bleeding edge and futuristic, or so simple it's painful but won't scale. This would be primarily for the more capable background and event JavaScript runtimes.

The JS side of addon management can be cleanly locked away in a different context to the addons, so it can be structured however we want.

A require function would be a prerequisite for nice code and developer productivity, and once we have a require step running, there is essentially no limit to the tooling JS and npm can bring to the table.

For the lay reader: require is the CommonJS way of doing imports. Namespacing and encapsulation in webdev JavaScript before ES6 modules has always been a shit show.

In fantasy dream land, I'd like to be writing this code in ES6, with Flow Types.

Of course, we are not the first to be aggressively using JSContext: react-native packager may be of use here, which does the ES6->plain old JS->Flowtypes->Packaging, ready for execution in JSContext.

Getting require to work in JSContext may take some work, but would yield benefits almost immediately, not least by sparing us 1000-line JavaScript files.
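
For a sense of scale, the core of a CommonJS-style require is small. The sketch below assumes module sources are already available as strings keyed by name (in the app, Swift would presumably read them from the bundle), and it skips relative path resolution, JSON modules and everything else a real loader does. `makeRequire`, `localRequire` and the module names are hypothetical.

```javascript
// `sources` maps module names to their source text.
function makeRequire(sources) {
  const cache = {};

  function localRequire(name) {
    if (Object.prototype.hasOwnProperty.call(cache, name)) {
      return cache[name].exports;
    }
    if (!Object.prototype.hasOwnProperty.call(sources, name)) {
      throw new Error("Cannot find module '" + name + "'");
    }
    const module = { exports: {} };
    cache[name] = module; // cache before running so cyclic requires terminate
    // Each module body runs in its own function scope, CommonJS style.
    const wrapper = new Function("require", "module", "exports", sources[name]);
    wrapper(localRequire, module, module.exports);
    return module.exports;
  }

  return localRequire;
}

// Two tiny modules: "main" depends on "registry".
const load = makeRequire({
  registry: "exports.addons = ['example']; exports.count = () => exports.addons.length;",
  main: "const registry = require('registry'); module.exports = 'addons: ' + registry.count();",
});

const mainExports = load("main");
```

Each module is evaluated once and then served from the cache, which is what gives the module-level encapsulation that makes smaller files practical.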

There will be a bunch of 'standard' APIs we will need to shim, including local storage and XHR.
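
As an example of how small some of these shims could start out, here is a sketch of an in-memory stand-in for local storage, following the web Storage interface (`getItem`, `setItem`, `removeItem`, `clear`, `key`, `length`). `createStorageShim` and the `toJSON` snapshot are hypothetical; persistence is not handled, and in practice the Swift side would have to save and restore the snapshot per addon.

```javascript
// In-memory shim with the shape of the web Storage interface.
function createStorageShim(initial) {
  // Storage only ever holds strings, so coerce any initial values.
  const items = new Map(
    Object.entries(initial || {}).map(([key, value]) => [key, String(value)])
  );
  return {
    get length() {
      return items.size;
    },
    key(index) {
      const keys = Array.from(items.keys());
      return index >= 0 && index < keys.length ? keys[index] : null;
    },
    getItem(key) {
      const name = String(key);
      return items.has(name) ? items.get(name) : null;
    },
    setItem(key, value) {
      items.set(String(key), String(value));
    },
    removeItem(key) {
      items.delete(String(key));
    },
    clear() {
      items.clear();
    },
    // Hypothetical extra: a plain-object snapshot the native side could persist.
    toJSON() {
      return Object.fromEntries(items);
    },
  };
}

const shim = createStorageShim({ theme: "dark" });
shim.setItem("visits", 42);
```

XHR is a much bigger job, since the request itself has to be performed natively and the result passed back asynchronously.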

UX Considerations

Addons bring a lot of choice for the user, and there isn't the same real estate that desktop has.

Addons can contribute to the UIView UI, and can be filtered aggressively (url, mimetype?) though it is possible (likely?) that there isn't a clear winner (e.g. pageAction). Additionally, browserActions can come from zero or more addons, so we should expect multiples of these choosable by the user – either by preferences or habits, or pickers.

I have been thinking in terms of a OneOfSome pattern of OneOfSome<T?> views, with UserPreferencedOneOfSome, URLFilteredOneOfSome, FrecencyOneOfSome; perhaps with a submenu to select from when there's more than one, or an ordered list of choices (toggleable or not) maintained in a preference page.

We are already seeing this play out when extending about:home tiles.

Additionally, in the same way Android doesn't force each app to invent its own preference pages, we may want to offer a generic preferences template into which localized content can be squirted from the addon registry.

Furthermore, addons are exciting! We get to concrete the cowpaths and invent different places to put other people's content, all while handing off most of the work to other people!

Summary

I've explored some of the iOS APIs with which we can implement a WebExtensions-like API, and sketched out a possible route to get there. I've identified some places which need further research, and problems that need to be solved.

I am optimistic that this can be done.

Links