Proposal: A common spec

The category of tools for thought is a very broad one. I’m assuming we will be building more things to test different workflows and pin down our principles and concepts.

I see shared aesthetics and complicit mindsets around here, but I suspect we also enjoy the freedom of pursuing alternate rabbit holes, which will help explore the territory. And yet, we’d like to help build a collective alternate reality where “copy and paste” and “platform and app” are not the end of the trail.

For this reason I propose we attempt a common spec, minimal enough that we can explore different approaches in parallel and still benefit from developing the basics together, leaving the door open for future convergence.

Here’s what my extremely rough sketch for such a spec looks like:

  • Structured like a graph. Content is a node. Edges are directed but can be traversed in both directions. You can think of a document like a series of content nodes laid out by traversing the graph.

  • Content is atomically addressable. I’m thinking of unique ids, collision-free on creation, wrapped in some sort of URL-like thing. But @johannesmutter, is there anything else you have in mind?

  • Content is self-contained, meaningful on its own, to ease reusability. Containerising the building blocks makes it easier to build with them. If I start a paragraph with “on the other hand…” that will make it harder for me to reuse that paragraph elsewhere. This does not mean that one shouldn’t be able to create such a paragraph, but rather that there’s a native way of creating content that unlocks the benefits of this category of tools. Maybe affordances could help users keep that in mind.

  • Encourages mashup over editing. When you transclude content over time you assume the content will not suddenly disappear or change in meaning after an aggressive edit. Thus manipulation systems should at least try to encourage creation and linking over editing. Maybe we don’t need to go all the way to immutability (?) but fixing a typo is one thing and changing the meaning of a shared building block is another.
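To make the first two points concrete, here is a minimal sketch, assuming random UUIDs as ids and an in-memory store. The class and method names are illustrative only and not part of any agreed spec.

```python
import uuid
from collections import defaultdict


class Graph:
    """Content nodes joined by directed edges that can be walked both ways."""

    def __init__(self):
        self.content = {}                  # node id -> text
        self.outgoing = defaultdict(list)  # node id -> ids it points to
        self.incoming = defaultdict(list)  # node id -> ids pointing at it

    def add(self, text):
        node_id = uuid.uuid4().hex  # random id (assumption); collisions are practically impossible
        self.content[node_id] = text
        return node_id

    def link(self, source, target):
        # Record the edge in both indexes so it can be traversed backwards too.
        self.outgoing[source].append(target)
        self.incoming[target].append(source)

    def document(self, start):
        """Lay out a document by following the first outgoing edge from each node."""
        parts, seen, current = [], set(), start
        while current is not None and current not in seen:
            seen.add(current)
            parts.append(self.content[current])
            following = self.outgoing[current]
            current = following[0] if following else None
        return "\n\n".join(parts)


g = Graph()
a = g.add("Trails are lists of addresses.")
b = g.add("Content nodes are self-contained.")
g.link(a, b)
```

Here `g.document(a)` lays out both nodes in order, while `g.incoming[b]` answers "who points at me?" without scanning the whole graph.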

As you see, there’s a big overlap with

What do you say?

  • Do you see any of them problematic?

  • Do you see ways of making these more clear?

  • Are these minimal, basic enough or have I polluted the list with ideas about the implementation that I have in mind?

  • Do you see any other basic element that absolutely needs to be here?

I’d love to hear your opinions.

I love the idea of a minimal common spec with shared & compatible principles.

  • Structured like a graph: What do you think is the most suitable graph structure for general purpose thinking tools? Should it support many-to-many relationships like a hypergraph (Demystifying Graph Databases)? Should there be a difference between a relationship and an entity?
  • Addressing content: Besides machine readable UIDs that are globally collision free (across systems), I think URLs should be more like natural language and permit optional ambiguity (scope might be wider than one concept only and change over time). So instead of protocols (https/ ftp/ …), reserved namespaces and parameters (?index=2) I’m thinking of chains of concepts in natural language that point towards a targeted resource, similar to Triangulation, with cool filter features like negation ("… not: XXX" …), synonyms, antonyms, hypernyms, etc.
  • Mutation, Versioning & Variations: I agree there should be the option to transclude either the latest version or a specific version. If a concept changes its meaning, connected concepts should receive a notification. Smaller changes like grammar, typos, more in-depth details or summaries should have a different “change” status.
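The versioning point could be sketched roughly like this, assuming each edit is tagged by hand as "minor" or "meaning". The status names and the subscriber list are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Version:
    text: str
    change: str  # "initial", "minor" (typo, grammar) or "meaning"


@dataclass
class Node:
    versions: list = field(default_factory=list)
    subscribers: list = field(default_factory=list)  # ids of nodes transcluding this one

    def edit(self, text, change):
        """Append a version; return the subscribers that should be notified."""
        self.versions.append(Version(text, change))
        return list(self.subscribers) if change == "meaning" else []

    def transclude(self, version=None):
        """Return the latest text, or a pinned version when an index is given."""
        chosen = self.versions[-1] if version is None else self.versions[version]
        return chosen.text


node = Node(subscribers=["essay-1"])
node.edit("Trails are hyperedges.", "initial")
quiet = node.edit("Trails are hyperedges!", "minor")
notified = node.edit("Trails are ordinary edges.", "meaning")
```

Only the last edit returns `["essay-1"]`; a transcluder that pinned version 0 keeps seeing the original sentence.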

This is great. Well yes, at least that’s what I have in mind, but I hadn’t realised that it’s called a hypergraph!

The simplest structure I can imagine for the use cases I’m considering is that trails are an EDL or Edit Decision List. This is very Ted Nelson. Basically, you write in a list (the EDL) the addresses of the pieces you walk through when traversing. That’s a trail. This means it’s a hypergraph and trails are hyperedges.
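As a rough sketch of that idea, assuming pieces live in a simple address-to-text store (the addresses and texts below are placeholders):

```python
# Addresses and texts are placeholders; any addressing scheme would do.
pieces = {
    "a1": "Trails are lists of addresses.",
    "a2": "Content lives in atomic pieces.",
    "a3": "A document is one walk through the graph.",
}

# The EDL: nothing but the addresses, in the order they are walked.
trail = ["a2", "a1", "a3"]


def walk(edl, store):
    """Transclude each addressed piece in order; the trail itself holds no content."""
    return [store[address] for address in edl]


document = "\n\n".join(walk(trail, pieces))
```

Reordering or trimming `trail` changes the document without touching any piece, which is what lets trails be edited aggressively while content stays stable.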

I’m not sure I fully understand.

In my prototype Lotu, I’m not allowing referencing a trail in another trail because I want to know how expressive you can get without the power of referencing trails within trails. And also because I’m thinking about content as tending to immutability but trails being aggressively edited over time. So in that sense, I’m considering trails to be a different thing from pieces of content.

Regarding trails, the metaphor I like to use is a very spatial one. In a way, more like Vannevar Bush, you directly walk them. In a way you’re dynamically building a new trail at every fork, at every crossroads. As it happens in urbanism, trails can be side trails if they are seldom walked. They can even disappear behind bushes. A former side trail can also become a main trail if it’s pragmatic to take that route often.

This is the metaphor I choose to use when I think about Zettelkasten-like systems. What was once a busy junction or trail can become forgotten because new places and new trails between those places emerge over time.

Going to more concrete land, one of these trails (or hyperedges), content-wise, could be the Wikipedia article for Vannevar Bush. The document itself would be a trail, its EDL transcluding different paragraphs. Then another trail could connect the 1st paragraph (written in a concise summary style) with descriptions of other people, in a ride through, say, the “originators of hypertext” (you get a people list with key info). Another trail could connect elements within the article to other elements in a narrative way (maybe a Manhattan Project trail). So when you’re walking any of these trails you can switch to any other. This seems a bit like the WWW, doesn’t it?
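Switching trails at a shared piece might look like the following sketch. The trail names and piece addresses are invented for the example; only the three trails themselves come from the description above.

```python
# Trails as EDLs over shared pieces; the addresses and trail names are placeholders.
trails = {
    "vannevar-bush-article": ["bush/summary", "bush/memex", "bush/manhattan-project"],
    "originators-of-hypertext": ["bush/summary", "nelson/summary", "engelbart/summary"],
    "manhattan-project": ["bush/manhattan-project", "oppenheimer/summary"],
}


def switches(current_trail, position):
    """Other trails you could step onto from the piece you are standing on."""
    here = trails[current_trail][position]
    return sorted(name for name, edl in trails.items()
                  if name != current_trail and here in edl)
```

Standing on the first paragraph of the article, `switches("vannevar-bush-article", 0)` offers the originators trail; on the last paragraph it offers the Manhattan Project trail.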

Well, the key element here is: trails take you through atomic pieces of content that can be easily navigated, previewed, and transcluded because of their atomicity. You contribute by adding new atomic pieces and by creating trails to and through them.

loving this already, have a lot of thoughts and will get back to this later today :slight_smile:

I was just thinking about this today. But in a slightly different context.

I was wondering how to capture the context when trying to capture a thought (see How to capture thoughts efficiently? 🐢). And I realised I’d like to be able to link to a physical book, or a verbal mention of someone, or some file somewhere, or some physical object somewhere, etc. Then I thought, it’s not a complicated URL I need, it’s a note saying how to evoke the context (whether that means reopening a website or thinking about what someone said when they spilled wine on the floor).

So what if we stored just text, and we interpreted that text (NLP, or predefined structure such as the claims/wishes in Dynamicland’s Realtalk). So I could write:

  • book “The Death and Life of Great American Cities” by Jane Jacobs, page 14
  • Proposal: A common spec excerpt “Should it support many-to-many relationships like a hypergraph?”
  • looking for chocolate at the supermarket
  • feeling good

To me this works for specifying context but it does not work for the references in an EDL… or yes? I’m not sure.

What do you think of URL portability? Considering URLs must behave and look different when mixed with other text, and that they would also need to be easily shareable in cases copying is not possible – think URLs printed on products…

The problem with natural language URLs is that different languages require different structuring and as such, some URLs might be unrecognizable for foreign users and might even be hard for computers to understand.

Should, for instance, the same book in different languages have different addresses?

I find it interesting how the Bible does it, though. You have concepts like books, chapters, and verses. The books being in order also allow positional addressing, beyond named addressing.

I do not have many insights regarding addressing for now, but my take on it is that it might be acceptable to have short, hexadecimal URLs, being a subset of the document hash, just like git commits. They might be easier to use, remember, and transport than a full line of text.

Bob: What is the address for this quarter’s report, again?
Alice: Let me see…it is B39CE

Later on some print document:

[…] as seen in the B39CE report.

Whenever it is contained in a digital medium, it can be replaced by a widget having the name and preview of it.

If the hashes of two documents start with the same characters, it is just a matter of displaying both at acquisition time. If only one document has a hash starting with D3, then D3 is enough to address it unambiguously.
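A minimal sketch of this git-like prefix lookup, assuming SHA-256 as the document hash (the post does not name one) and an in-memory store:

```python
import hashlib


def address_of(document: bytes) -> str:
    """Full address: uppercase SHA-256 hex digest (the hash choice is an assumption)."""
    return hashlib.sha256(document).hexdigest().upper()


def resolve(prefix: str, store: dict) -> list:
    """Every stored address starting with the prefix; a single hit is unambiguous."""
    return sorted(addr for addr in store if addr.startswith(prefix.upper()))


def shortest_unique_prefix(address: str, store: dict, minimum: int = 2) -> str:
    """Grow the prefix until it matches only this address."""
    for length in range(minimum, len(address) + 1):
        if resolve(address[:length], store) == [address]:
            return address[:length]
    return address


store = {address_of(doc): doc for doc in (b"Q1 report", b"Q2 report", b"Meeting notes")}
```

When `resolve` returns more than one address, the interface would show all candidates, as described above; `shortest_unique_prefix` gives the code you would write on paper.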

It should also be better for safety-critical and for audit purposes. It should also be fine for writing on small pieces of paper and even on the hand.

A proposal: From hyperlinks (1965/1993) → to keyword‑only hashtags (2007) → to explorable hashlinks with UUIDs (2020)

#text #metatext #hypertext #multi-line-hashtags #whitespace-hashtag #hashlink


What follows are some of my thoughts on how links (~ URLs, addresses, …) should be designed in 2020.


The design challenge is to make URLs human readable, writable (natural language) and explorable, machine readable (failsafe), space-efficient (short), portable to different contexts (digital ←→ analog) and democratic. Democratic means, there’s no central authority controlling the namespace and in consequence an address is no longer a tradable object like domains are today ($$). In most cases we also want an address to be unique / precise, but in some we want them to be dynamic, to point to the latest version or to a system of related versions. UUIDs are here to help, they guarantee uniqueness across space and time and “support high allocation rates (for use as transaction IDs)”.


I imagine these hashlinks to be rendered like breadcrumbs. I also imagine a simplified single source of truth for navigation structure in browsers / the OS. What if we combine the browser address bar, webpage navigation and subpage navigation in one interface?


Let’s take a look at the proposed structure of a link for the English version of a book:

# The Making of Prince of Persia / EN

To create the link, a user would write it exactly as shown above:

  1. To add an address type ‘#’ (triggers the address input of App; press ‘ESC’ / ‘arrow keys’ to abort)
  2. Followed by writing a label
  3. To specify a browsable relation, segment or hierarchy write ‘ / ’ or ‘ → ’ or ‘ ( ) ’
  4. To confirm & finish editing the address, press enter or spacebar twice.

The handwritten process would be the same.

If multiple items/ versions exist related to the current context (different publishers/ mediums/ …) a dropdown with suggested choices allows the user to select the exact existing resource, otherwise a new resource can be created.


Hidden by default from the GUI we append a UUID to the address.

# The Making of Prince of Persia / EN / F5F58D1B-E617-4EFC-BA15-FBE8935632E9

'#' (or ‘@’?) and 'UUID' would mark the start and end of an address in plain text. A much simpler syntax compared to:
<a href="https://www.example.com/?param=EN">The Making of Prince of Persia</a>
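A parser for such a hashlink could be sketched as follows, assuming ' / ' is the only separator in use and that the UUID, when present, is always the last segment. The function name and the exact grammar are assumptions.

```python
import re
import uuid

# Assumed grammar: '#', then segments separated by ' / ', optionally ending in a UUID.
UUID_RE = re.compile(r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$")


def parse_hashlink(text):
    """Split a hashlink into its browsable segments and optional UUID."""
    if not text.startswith("#"):
        raise ValueError("a hashlink starts with '#'")
    segments = [part.strip() for part in text[1:].split(" / ")]
    resource_id = None
    if segments and UUID_RE.match(segments[-1]):
        resource_id = uuid.UUID(segments.pop())
    return segments, resource_id


segments, resource_id = parse_hashlink(
    "# The Making of Prince of Persia / EN / F5F58D1B-E617-4EFC-BA15-FBE8935632E9"
)
```

The segments stay available for breadcrumb-style rendering, while the UUID can be hidden from the GUI and used only for lookup.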

The address would be rendered differently depending on its typographic context:
In digital:
– In a paragraph: e.g. underlined
– In a list or when in isolation: with a rich meta-data-preview of the resource
– In all formats: segmented, so parts of the URL are browsable individually (like a breadcrumb navigation)

In analog:
I assume displaying the UUID for an address is in most analog cases unnecessary. Especially if your APP / OS for thinking and opening an address stores a history of how previous addresses were labeled. In case of multiple resources for that label, a recommendation algorithm could then prioritise those labels for you and your closer network of friends, colleagues and followers.

Also, if we hash the label (“The making of …”), the appended UUID could be shorter (contextual label + short UUID).

# The Making of Prince of Persia / F5F58D1B-…


Intermezzo

Incomplete list of UX problems with URLs/ hashtags:

Interfaces for URLs are disconnected:

  • The browser’s address bar (URLs)
  • The website navigation (breadcrumbs, menus)
  • Subpage navigation (sections, page numbers, in-sentence references)

They are disconnected because:

  • Different location, appearance and interaction in interface (Application UI vs. Website UI)
  • Not synchronised by default (it’s up to the developer to update the address bar)

URLs are not human readable.

  • Cause: Browsers don’t decode URLs into a readable format (Chrome only formats ‘https://www’)
  • Cause: Limited to 95 characters as defined in ASCII (reserved: / ? # [ ] @ : $ & ’ ( ) * + , ; =)
  • Effect: E.g. whitespace is encoded as %20, all concepts are chained together, no formatting/ typography.
  • Effect: #hashtagsOnlyWorkWithoutWhitespacesAndAreTerribleToRead
  • Effect: URLs are not directly portable to e.g. analog contexts; QR codes or URL shorteners are necessary.
  • Effect: Handwriting URLs is inconvenient. Reading handwritten URLs is error-prone (URLs are case-sensitive, misinterpretation of whitespace).
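The %20 effect is easy to reproduce with Python’s standard library; the label is the book title used earlier in this post.

```python
from urllib.parse import quote, unquote

label = "The Making of Prince of Persia / EN"
encoded = quote(label)      # spaces become %20; "/" is left alone by default
decoded = unquote(encoded)  # the readable form a browser could display instead
```

Here `encoded` is `The%20Making%20of%20Prince%20of%20Persia%20/%20EN`, and `decoded` round-trips back to the original label.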

URLs are long and single line only.

  • Cause: New line character or whitespaces not allowed (or encoded but not decoded for display)
  • Effect: Long URLs get cut off in the address bar. Getting an overview is impossible.
  • Effect: Manual editing is error-prone, especially on touch screens where the cursor is difficult to position.
  • Effect: Dependency on third-party URL shorteners like tinyURL/ bitly or QR codes in analog contexts.

URLs break (404). URLs are stupid. URLs are one-directional.

  • Cause: Changing a single character in a URL breaks it; there’s no tolerance for errors.
  • A change in one place is not pushed to every other place the URL is embedded. No bi-directionality.
  • Effect: Content gets lost, becomes undiscoverable, users need to start a manual investigation.

URLs are not explorable.

  • Segments or hierarchies of a URL in the browser’s address bar cannot be explored on their own. On websites, breadcrumbs solve this issue.

URLs are strings, but should behave more like objects.

  • URLs point to one resource only
  • What if we could add a variety of resources as a list to an address?

URLs are not democratic.

  • Domains are tradable goods. So are IPs. We need equal access for all, without a centralised authority managing the URLs (therefore UUIDs).
  • Writing a link in HTML or any word-processing tool requires either programming or advanced application knowledge, which the majority of users lack. URLs are not encrypted / always reveal the resource location (e.g. domain)

If the meaning of a piece of content is different from another one, its address should differ (# Lost in Translation). Ideally the only thing we have to change for such an address would be the language code (ISO 639-1).

Great insights @johannesmutter!

I wonder one thing about UUIDs. UUIDs usually cannot be derived from content, as opposed to hashes.

If that is the case, it will be harder to enforce certain properties of authorship and intellectual property in a given information system.

Considering a truly distributed and universal knowledge system, every meaningful piece of content should be independent of context. So any absolute address should point, exclusively, to a unique entity. In that case, paths are necessary only for contextualizing the entity.

Let’s say we have an entity like this:

To be or not to be?

The sha512 hash of this entity is

4390e019365fc87534141e15cf4424abd308a1cfff8b12985b65af0a612f6e339dd33e2c523b934b354eb2e461a1443f248a19dc289b393ad23a041962a55941

Suppose the hash is the absolute address of the entity. Then, no matter where the entity is included, we would be able to track down its origins through a central registration authority.

We would be able to know at which point in time this hash first appeared, where it originated, and who claimed it. We also would be able to track how many references are being made to this entity and, more importantly, where it is referenced. Effectively giving us bidirectional references.
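A toy version of such a registry, as a sketch only: the class, its fields and its methods are hypothetical, and a real system would need signatures, persistence and distribution.

```python
import hashlib
from datetime import datetime, timezone


class Registry:
    """Toy stand-in for the registration authority described above."""

    def __init__(self):
        self.claims = {}      # digest -> (owner, first seen)
        self.references = {}  # digest -> places that include the entity

    def register(self, content: str, owner: str) -> str:
        digest = hashlib.sha512(content.encode("utf-8")).hexdigest()
        # Only the first claim is kept, so later copies cannot overwrite it.
        self.claims.setdefault(digest, (owner, datetime.now(timezone.utc)))
        self.references.setdefault(digest, [])
        return digest

    def reference(self, digest: str, place: str) -> None:
        self.references[digest].append(place)

    def backlinks(self, digest: str) -> list:
        return list(self.references[digest])


registry = Registry()
address = registry.register("To be or not to be?", "Hamlet")
registry.register("To be or not to be?", "someone else")  # same digest, claim unchanged
registry.reference(address, "QuoteInvestigator")
```

Note that the digest depends on the exact bytes hashed (encoding, trailing newline), so the value computed here need not match a digest produced by another tool.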

Even further, if you are the registered owner of this entity, you could choose to be notified every time this hash is reconstructed, visited or referenced.

Thus, we could build a P2P network where the same entity could be stored anywhere in a network, with proper integrity verification.

On top of all this, the author might choose to allow plagiarism analysis (which will require the content to be read by an authority) on their entity to further enforce intellectual property checks.

Now if you access the entity by the full absolute address you would always acquire the same entity. But if you want to acquire it under a specific context, then paths are needed.

@Hamlet, Sayings, 4390

This would give the entity in the context of the site Hamlet, where:

@QuoteInvestigator, 4390

Would give the entity as analyzed by the site Quote Investigator.

Since contexts will rarely have entities with digests starting with the same characters, hashes of contextualized entities could have their size reduced to only a very narrow subset of the hash.

Now, of course, the contextualized URLs could also be interpreted as search terms when presented entirely in natural language, but it will not effectively qualify as a proper URL:

@Hamlet, Sayings, To be or

That would be a valid candidate URL for a search operation input. But, ultimately, this should be a partial URL that must resolve to a hash in order to give the exact location of the entity, since there could be many entities that agree on the same terms on the partial URL.
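Resolution of contextual addresses could be sketched like this, assuming a per-context index and a simple rule: try the last segment as a hash prefix first, then fall back to a text search. The index layout and the second sample entity are assumptions.

```python
import hashlib


def digest(text):
    return hashlib.sha512(text.encode("utf-8")).hexdigest()


# Hypothetical index: context path -> {digest: entity text}.
index = {
    ("Hamlet", "Sayings"): {
        digest(t): t for t in ("To be or not to be?", "The rest is silence.")
    },
}


def resolve(address):
    """Resolve '@Site, Path, <hash prefix or words>' to the matching entities."""
    *context, last = [part.strip() for part in address.lstrip("@").split(",")]
    entities = index.get(tuple(context), {})
    by_hash = [text for d, text in entities.items() if d.startswith(last.lower())]
    if by_hash:
        return by_hash
    # Fall back to treating the last segment as search terms.
    return [text for text in entities.values() if last.lower() in text.lower()]
```

A search term that happens to be valid hex (say "be") would be tried as a hash prefix first, which is one reason a partial URL should ultimately resolve to a full hash.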

Furthermore, if the entity is a binary, it would be hard to describe a URL to it in natural language. The user would be obligated to give a name to every non-text entity in order for full NL URLs to work. Which might be inconvenient.

Maybe hashes would also facilitate the tracking of fake-news and illegal material, for instance.

What are your takes on this?

Regarding domain names, of course they should not exist. Though I believe that site identifications are required.

Anyone could register any name for any site. Repetition should not be a problem.

If your site has the same name as other sites, it would be indexed in a list of the sites that share that name. Maybe the index should be ordered by registration date: the earliest registrations are displayed first. That way plagiarists would never appear first.

Unique site names will have an advantage because the visitor would not need to choose the desired site from an index of names, so if your site has a unique name, your visitors would not be presented with the index before reaching your site.

In any case, your site also would have an absolute address, with the name being a simple alias for it.

@Jon is the first site to register the name ‘Jon’
@Jon#2 is the second site to register the name ‘Jon’ and so on…

Accessing @Jon might return a list of sites, but maybe there should be a short timeout in the index, so that, at the end of this timeout, the user is directed automatically to the first site. The user might also cancel the timeout in order to look for the site she intends to visit.

Maybe a syntax for going directly to the first site:

@Jon!

And to go to the index without the timeout

@Jon
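The naming scheme could be sketched as below. The class is hypothetical, the timeout is left to the interface, and each form simply returns a list of absolute addresses.

```python
import re


class SiteNames:
    """Names are aliases; several sites may share one, ordered by registration."""

    def __init__(self):
        self.sites = {}  # name -> absolute addresses, in registration order

    def register(self, name, absolute_address):
        self.sites.setdefault(name, []).append(absolute_address)

    def resolve(self, reference):
        """'@Jon' -> full index, '@Jon!' -> first site, '@Jon#2' -> second site."""
        match = re.fullmatch(r"@([^#!]+)(?:#(\d+)|(!))?", reference)
        if not match:
            raise ValueError(f"not a site reference: {reference}")
        name, ordinal, bang = match.groups()
        index = self.sites.get(name, [])
        if ordinal:
            return index[int(ordinal) - 1 : int(ordinal)]
        if bang:
            return index[:1]
        return list(index)


names = SiteNames()
names.register("Jon", "absolute-address-1")
names.register("Jon", "absolute-address-2")
```

With two sites registered, `@Jon` yields both addresses, `@Jon!` only the first, and `@Jon#2` only the second.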

These are different things:

  • Identity
  • Name
  • Location

I assume that’s why we have URLs, URNs, and URIs. (It doesn’t make it any simpler, that both URLs and URNs are URIs.)

Identity can be established through:

  • a name,
  • a location (i.e. an address; absolute or relative),
  • but also through content (e.g. hash functions),
  • or arbitrarily (e.g. with UUIDs).

If we don’t look closely enough, they tend to blur and it’s easy to see them as one and the same.
But they serve different purposes.
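A small illustration of the difference between content-derived and arbitrary identity, using only Python’s standard library. The choice of SHA-256 and of the URL namespace for the name-based UUID is arbitrary here.

```python
import hashlib
import uuid

content = "To be or not to be?"

# Content-derived identity: the same bytes always give the same digest.
first = hashlib.sha256(content.encode("utf-8")).hexdigest()
second = hashlib.sha256(content.encode("utf-8")).hexdigest()
same_hash = first == second

# Arbitrary identity: two random UUIDs minted for the same content differ.
different_uuids = uuid.uuid4() != uuid.uuid4()

# Name-based UUIDs (version 5) sit in between: derived from a namespace plus a name.
stable_uuid5 = (uuid.uuid5(uuid.NAMESPACE_URL, content)
                == uuid.uuid5(uuid.NAMESPACE_URL, content))
```

All three flags come out true, which shows that a hash or a name-based UUID identifies *what* something is, while a random UUID identifies *which* registration it was.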

In response to the initial idea of “A common spec”:

I love the idea of learning about what each of our values are and which direction we’re trying to take things. Talking about what we believe a solution looks like is one approach. I totally support that and think it’ll be super interesting to learn about all the different visions represented in this community.

Another useful exercise could be to talk more directly about something that will heavily influence the kinds of solutions you are open to and which ones you will likely reject:
What are your values that you want to see reflected in the solutions you want to build and why?

For myself, I did that here: https://stefan-lesser.com/2019/12/24/a-future-of-programming/

If you feel motivated to comment on values in general or my values in particular, we should probably move that to a new thread so we don’t derail this one further…?

I believe that the distinction between location and identity should be blurred. Having one less channel to deal with information offers several benefits at the implementation level of large systems. As you can see, I am very fond of immutability, particularly regarding content-addressable systems.

My view is that the location-identity distinction is a limitation of physical reality. Computers do not impose these categories. Names, on the other hand, are useful in both realities.

My brief notes (bold) on your list:

There can be, of course, multiple ways to achieve information acquisition at the interface level, but ultimately a system should have its own internal/implicit way of implementing it. I believe hashes offer good usability in both contexts: for interfaces and internally.

I think you answered your own question really well in the blog post. It is hard to think of other fundamental values that you did not already underline. If you decide to create a new thread I will definitely try to expand.

Just came across this article, which seems to fit perfectly into this conversation:

https://blog.ilyasterin.com/your-artificial-brain-6735e2366790