A new browser for the internet

How might we design a browser for creation, not consumption?

This question has haunted me for years. The first hypertext systems and web browsers like Mosaic were tools for both creation and consumption. Yet modern web browsers do little to support the creation of knowledge other than rendering HTML. The last interface innovation in web browsers was adding tabs back in 2003.

In the physical world, thinking, wandering through a space, or having a conversation happens seamlessly and instantaneously. Searching for and capturing something in a “modern” web browser is like running through a maze with silos in every corner.

What if we revisit the idea of embedding basic tools for thinking like “Word Processing” or “Knowledge Management” inside the web browser? From editing videos and brainstorming ideas to designing and developing apps, we already do most of our creative work inside the browser. A modern browser should be more like an operating system that provides the basic tools for capturing, editing, sharing and discussing thoughts (not documents!).

@tyler I saw many similar, but also new ideas in your post 🌳 Information Forest. I find the idea of “Super Command-F” with natural language queries similar to RegEx fascinating, as well as the idea of “Predictive search paths”. This reminded me of the “Associative trails” Vannevar Bush proposed in As We May Think.

Curious what others think about a creation-first web browser.

what do you think about the idea of “inline” browsing? the more i’m thinking about this, the more i think that a really great “creation first” browser might resemble a writing environment or jupyter notebook more than current browsers. I don’t really think it’s just about annotation-- it’s really about extraction / analysis / synthesizing that into new info

Exactly, I think inline browsing creates a paradigm shift: from push to workspace (google → website → copy → workspace → paste) to pull from workspace (workspace → /search → select). Not having to leave your workspace to do research will allow us to do more focused work and create personal browsing narratives that are easier to share with others. It will reduce the cognitive load of navigating ever-changing environments on the web. If there’s one thing the human brain doesn’t like, it’s surprise. That’s why all-in-one, when done well (Notion), is such a powerful idea.

For effective inline browsing I think we need content to be:

  • structured as a graph
  • atomically addressable
  • meaningful on its own (standalone like a Tweet or Building Block in Notion)
  • dependent blocks of meaning should not be statically “baked” into another meaning context, but interlinked or quoted and, if necessary, supplemented with an alias label that works better for the new context (e.g. grammar adjustments).
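
To make the list above a bit more concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: the names `Block`, `Link`, `BlockGraph` and `render` are hypothetical and do not come from Notion, Roam or any other existing tool. Each block has a stable address, carries its own text, and points to other blocks through links that can carry an alias for the new context instead of a pasted copy.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Block:
    """An atomic, addressable unit of content (hypothetical model)."""
    id: str                                   # stable address of the block
    text: str = ""                            # meaning the block carries on its own
    links: List["Link"] = field(default_factory=list)


@dataclass
class Link:
    """A reference to another block instead of a 'baked in' copy."""
    target_id: str
    alias: Optional[str] = None               # label adjusted for the new context


class BlockGraph:
    """Blocks stored by address; links between them form the graph."""

    def __init__(self) -> None:
        self.blocks: Dict[str, Block] = {}

    def add(self, block: Block) -> Block:
        self.blocks[block.id] = block
        return block

    def render(self, block_id: str) -> str:
        """Show a block's text followed by its linked blocks (one level deep)."""
        block = self.blocks[block_id]
        parts = [block.text] if block.text else []
        for link in block.links:
            target = self.blocks[link.target_id]
            # Prefer the alias when the quoting context supplies one.
            parts.append(link.alias if link.alias is not None else target.text)
        return " ".join(parts)


graph = BlockGraph()
graph.add(Block(id="b1", text="associative trails"))
graph.add(Block(id="b2", text="Inline browsing builds on",
                links=[Link(target_id="b1", alias="the associative trail")]))
print(graph.render("b2"))
```

The original block `b1` is never modified; the alias lives on the link, so the grammar adjustment belongs to the quoting context only.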

So I imagine the future browser consisting of a powerful natural language command line/space that pops up wherever one hits # or /. It will in the long run make the traditional browser & web obsolete, because all we really do is remix existing stuff and give things a new name.
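
As a rough sketch of the “workspace → /search → select” flow, assuming a small in-memory list of snippets stands in for the web (the names `TRIGGER`, `parse_trigger`, `search` and `pull` are made up for illustration, and plain substring matching stands in for natural language search):

```python
import re
from typing import List, Optional, Tuple

# A trigger is "/" or "#" at the start of the line or after whitespace,
# followed by the query typed up to the end of the line.
TRIGGER = re.compile(r"(?:^|\s)([/#])(\w[\w ]*)$")


def parse_trigger(line: str) -> Optional[Tuple[str, str]]:
    """Return (symbol, query) if the line ends in a command, else None."""
    match = TRIGGER.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2)


def search(query: str, snippets: List[str]) -> List[str]:
    """Case-insensitive substring search over the snippets."""
    needle = query.lower().strip()
    return [s for s in snippets if needle in s.lower()]


def pull(line: str, snippets: List[str], choice: int = 0) -> str:
    """Replace the trailing command with the selected search result."""
    match = TRIGGER.search(line)
    if match is None:
        return line                      # nothing to pull
    results = search(match.group(2), snippets)
    if not results:
        return line                      # keep the text as typed
    return line[:match.start(1)] + results[choice]


NOTES = ["Vannevar Bush proposed associative trails in As We May Think."]
print(pull("This reminds me of /associative trails", NOTES))
```

The writer never leaves the line they are typing in: the command is parsed, searched and replaced in place.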

An inspiration here is Textio’s augmented writing experience:

love the idea of pulling from the workspace and reducing overall cognitive load! I imagine it being really easy to zoom in and out of how much of a web page you’re seeing, from the title/description, to a simplified reader view, to expanding the whole page, and being able to open it right at the cursor.

can you explain what you mean by:

not really following this! can you give an example?

but yes, i agree generally. can you expand more on what you mean by “content” has to be meaningful on its own? I think many tweets are meaningful on their own because of the nature of how people tweet; however, if you take a tweet from the middle of a thread it doesn’t make sense unless you read the previous ones. I imagine in a system like this most content actually wouldn’t be meaningful on its own because of how it would connect to other information. but I think I might be misunderstanding what you mean.

Baking a cake

Diagram of zooming into the universe of a cake: 
🎂→ 🍰→ 👩‍🍳+🥛+🧂+🧈+🍒(🔍🍒→ 🌳(🔍🌳→ 🌍))

I think the metaphor of baking a cake could explain my vision here:
When we’re presented with a typical website or PDF it resembles what I’d call a “baked cake 🎂”. Form, structure, content and style are presented to us as a baked whole (due to the nature of HTML). We can’t see its parts/ subparts/ … or relations to the system the cake is part of (JSON/ graphs).

Now if I want to understand how that cake was created, I have to invest a lot of cognitive effort and willpower to break the cake from its baked state into its ingredient state. I’ll probably need to look at many other cakes to fully understand all ingredients of my cake. What’s almost impossible to extract from the baked cake in front of me is the process and all the agents that were involved in making it.

In comparison, if a document was structured as a composition of raw blocks, i.e. a block made of other blocks, with “baked views”* that can be broken down into the “ingredient blocks”, we will not only understand the document aka cake better, we will be able to build upon it and remix its ingredients with ease.

*(the whole is, of course, greater than the sum of its parts, e.g. through spatial or temporal properties that emerge when composing blocks)

When we remix blocks, there’s great value in preserving the link to the origin (RE: Transclusion/ n-directional links). It’s almost like we should ban “copy & paste” and only offer “quote & redefine” (or move from “pointer down, select start & end, pointer up” to “drag & drop” only).

The “redefine” action can have several levels of change, from just adjusting the grammar to using synonymous vocabulary / an alias, to summarising or expanding a concept, to completely redefining the meaning of a word or a series of words. However, all these changes should be treated as interlinked versions, not as overwrites.
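
A minimal sketch of “quote & redefine” in Python, assuming a hypothetical `Version` record (the names `Version`, `quote`, `redefine` and `lineage` are mine, not an existing API). Every change produces a new version that points back to what it was derived from, so nothing is overwritten and the origin stays reachable.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Version:
    """One immutable state of a block; `origin` links to what it derives from."""
    text: str
    origin: Optional["Version"] = None
    change: str = "original"    # e.g. "quote", "grammar", "alias", "summary"


def quote(source: Version) -> Version:
    """Reuse a block in a new context while keeping the link to its origin."""
    return Version(text=source.text, origin=source, change="quote")


def redefine(source: Version, new_text: str, change: str) -> Version:
    """Create an interlinked version instead of overwriting the source."""
    return Version(text=new_text, origin=source, change=change)


def lineage(version: Version) -> List[str]:
    """Walk back through the origins, newest change first."""
    chain = []
    current: Optional[Version] = version
    while current is not None:
        chain.append(current.change)
        current = current.origin
    return chain


original = Version("Transclusion includes part of a document in another by reference.")
quoted = quote(original)
summary = redefine(quoted, "Transclusion: inclusion by reference.", "summary")
print(lineage(summary))
```

Because `Version` is frozen, “copy & paste” style edits that silently mutate the source are not possible in this model; the only way to change text is to add a new linked version.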

Blocks & Meaning

The point you made about the nature of tweeting is a very interesting one. It sparked some ideas for me, about how artificial restrictions can drastically shape the many nature(s) of communication.

I’m thinking here about a restriction to the scope of the content of a block, similar to the character limit on twitter, but instead of characters, it would limit the number of concepts to a composition that can be understood in isolation.

There are probably various levels of understanding depending on who reads it.
For example, we probably all have a clear idea what’s meant when we read the word “Transclusion”. But to someone who hasn’t developed a mental map of this concept, a definition like the one on Wikipedia is recommended, with the option to zoom in and expand to even more in-depth definitions. So I expect this “limitation” to be a visual UI limitation that adapts based on who’s looking at it, e.g. considering their culture, language, domain or interest.

The purpose of such limitations to the scope of a block is to increase the ability to remix. For the individual, it’s more work to express something very concisely (like summarising my answer in <100 words will take ~10min), but I think for the collective consuming the information it’s worth it.


Tyler mentions in 🌳 Information Forest:

If there could be one product that would satisfy all of these requirements, I’m not totally confident about what it would look like. Would it have to be 3D? Does this beg for VR? Should it be super minimal and focused on search as the main interaction method?

It can be interesting to think about what technologies and mediums would be best suited for exploring highly interconnected “forests” of information. However, I imagine the browser/creation tool of the future to not be constrained to any particular form of display or workflow. Everyone may have their own particular ideal workflow that fits their needs.

Tools like Notion and Roam are primarily oriented towards working with textual information, and tools like Figma are specifically oriented towards design. Every tool, however, works to manipulate underlying data. It is that data, and its manipulation, that should be the primary focus of any tool which strives to be a “seamless thinking system”. How this data is displayed should be interchangeable and customizable to each user’s individual preferences and workflow.

Let me give you an example:

Video Editing

The process of making a video is a long one, from writing a script and storyboarding to collecting footage and stitching it all together. As such, the tools for video editing come at the very end of the process; they are not tools for thought. You cannot jump into Premiere Pro and start “brainstorming” a video.

So I wondered, “how could a video editor support every step of the process of creating a video?” Here is the workflow I imagined:

  1. Start with text, at this point the editor looks just like any other text editor

  2. While writing I might imagine visuals or sound effects, I can start linking these to the text; whether this includes drawing little storyboards, or pulling videos or sound effects from the internet (to quote @johannesmutter “workspace → /search → select”)

  3. At this point I will have a draft, of the script AND the video itself. All these storyboards, images, and videos I have collected can be played back, and the script could even be narrated by a synthesized voice (after all, this is the video equivalent of a sketch, it doesn’t have to be perfect)

  4. Transform from vertical text-layout into something which more closely resembles a traditional video editor (still the same underlying data, just a different representation). Here we might start replacing the stand-in videos and storyboarding with some more finalized footage

  5. Publish; although this will really be a transcluded video-player-representation of your data. Any quotes or borrowed footage are linked to their original sources; text in the video can be quoted or commented on; etc.

Here’s a drawing to visualize each step I describe

This is not one “app” but rather data linked together and displayed in different formats. Each storyboard drawing, for example, is a rendering/representation which, if zoomed into, gives tools for drawing and erasing; or if you zoom into a sound block you can trim and edit that sound. I could even take a 3D model (itself a set of interconnected data) and drop it in my video directly, and then keyframe its rotation?

If instead of treating text arranged in graphs or sequences as the base building block of content, you treat data and links (in Ted Nelson’s sense of a link) as the base building blocks which can be displayed and rendered in various ways, the browser/creation tool of the future can be infinitely more powerful and sophisticated as a tool for thought.
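
Here is a small sketch of “same underlying data, different representation” in Python. The `Clip` record and the two view functions are hypothetical names chosen for this example, and the file name is a placeholder. The script view and the timeline view are both computed from the same list of clips, so switching from the text layout to the video-editor layout does not copy or convert anything.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Clip:
    """One piece of the draft: script text plus optional linked media."""
    script: str
    media: Optional[str] = None     # e.g. a storyboard drawing or stand-in video
    seconds: float = 0.0


def as_script(clips: List[Clip]) -> str:
    """Text-editor view: just the script, top to bottom."""
    return "\n".join(clip.script for clip in clips)


def as_timeline(clips: List[Clip]) -> List[Tuple[float, float, str]]:
    """Video-editor view: (start, end, media) rows laid out in time."""
    rows = []
    start = 0.0
    for clip in clips:
        end = start + clip.seconds
        rows.append((start, end, clip.media or "(placeholder)"))
        start = end
    return rows


draft = [
    Clip("Open on the empty workspace.", media="storyboard-1.png", seconds=2.0),
    Clip("Type /search and pull in a source.", seconds=3.5),
]
print(as_script(draft))
print(as_timeline(draft))
```

Replacing a stand-in with finalized footage then means changing `media` on one clip, and both views pick it up.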

A basic function like commenting would not be a manipulation of text but a manipulation of data. The process to comment on a block of data should be the same whether you’re adding a comment to an essay, a video, a CAD model, a picture, or even commenting on another comment. The same feature does not have to be re-implemented across every program but could simply be customized to display differently across different workflows.
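
One way to picture this, as a sketch only: if every piece of content is a node in one store, a comment is just another node attached to a target id, whatever kind the target is. The names `Node`, `Store`, `comment` and `comments_on` are assumptions made for this example.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Node:
    """Any piece of data: an essay, a video, a CAD model, or a comment."""
    id: str
    kind: str
    payload: str


class Store:
    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}
        self.comments: Dict[str, List[str]] = {}   # target id -> comment ids
        self._ids = itertools.count(1)

    def add(self, kind: str, payload: str) -> Node:
        node = Node(id=f"n{next(self._ids)}", kind=kind, payload=payload)
        self.nodes[node.id] = node
        return node

    def comment(self, target_id: str, text: str) -> Node:
        """Attach a comment to any node, including another comment."""
        if target_id not in self.nodes:
            raise KeyError(target_id)
        node = self.add("comment", text)
        self.comments.setdefault(target_id, []).append(node.id)
        return node

    def comments_on(self, target_id: str) -> List[Node]:
        return [self.nodes[i] for i in self.comments.get(target_id, [])]


store = Store()
video = store.add("video", "draft-cut.mp4")
note = store.comment(video.id, "Trim the intro")
reply = store.comment(note.id, "Agreed")
print([c.payload for c in store.comments_on(video.id)])
```

The `comment` method never inspects `kind`, which is the point: how a comment is displayed can differ per workflow, but the operation on the data is the same.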

I don’t know if this is something you have all considered deeply before but I thought I’d throw it out there. I think this kind of interoperability could be really, really powerful, and I hope I have managed to communicate a fragment of this to you in this post to further the discussion around a new kind of modern browser. Would love to hear your thoughts.


I’m enjoying every bit of this conversation, so thank you all.

I see the problem here: the interruption you face when you need to find the right reference can become a pit you fall into, a kind of short-term memory overflow.

Yet, I think it’s unlikely that you easily find what you’re looking for after typing a few words. I know this is the promise of Google. But it seems to be ignoring that /search is a multi-step process as Amy Hoy suggests (see the “berry picking” paper).

When we search we are forced to figure out the search terms and we cannot easily benefit from hyperlinking (found wrong thing but yay the right thing is linked to it) without getting in the hyperweeds. We may also just accept that most searches will be somewhat tedious, multi-step processes and that we’ll need some sort of external help not to get lost in the side-mission.

To give some alternatives, these are the strategies I see to conserve focus:

  • Finding ways to save your original intent to make it easier to get back to it (the “saving the game” strategy).
  • Writing down the forks in the trail but deferring taking them until later (you write a placeholder like “add paper about monkeys here” and it becomes a todo item for later).
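
Both strategies can be sketched as a tiny data structure. This is an illustration with made-up names (`Trail`, `visit`, `defer`, `resume`), not a description of any existing tool: the trail stores the original intent, and forks are written down as placeholders instead of being followed.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Trail:
    """A search session that remembers why it started."""
    intent: str                                       # the saved game
    visited: List[str] = field(default_factory=list)
    deferred: List[str] = field(default_factory=list)

    def visit(self, page: str) -> None:
        self.visited.append(page)

    def defer(self, note: str) -> None:
        """Record a fork in the trail as a todo instead of taking it now."""
        self.deferred.append(note)

    def resume(self) -> str:
        """Remind the searcher what they set out to do."""
        return f"Back to: {self.intent} ({len(self.deferred)} deferred)"


trail = Trail(intent="find a reference for inline browsing")
trail.visit("search results")
trail.defer("add paper about monkeys here")
print(trail.resume())
```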

@hanbzu wow thank you so much for these links!

your point about being “forced to figure out the search terms” is probably the worst when you’re just starting out in a search process. search becomes easier the more context you have and the more information you uncover through the process; unfortunately it’s the first few steps in the beginning where people get slowed down the most IMO.

i like your strategies for conserving focus. maybe I would also add when you are down in a fork in the search trail (in the rabbit hole), getting suggestions of where to go back to. this is similar to saving your original intent.

more concretely, i think these problems can be summarized as primarily UX, UI, and data science problems. how do you effectively cluster and suggest the right information at the right time, and how do you build a UI that isn’t overwhelming around it.

Should browsers exist at all? That is a question that I think will lead to more innovative outcomes than revisiting the status quo.

If a browser application exists, then applications exist. If applications exist, our future systems might be no better than our current systems.

If a browser is extensible enough, then the browser becomes a system. Should not a system be its own browser?
