
The enriched vault

In the shape of languages we started from a collection of notes, made a poset of text-snippets from them, and turned this into a category enriched over the unit interval $[0,1]$, following the paper An enriched category theory of language: from syntax to semantics by Tai-Danae Bradley, John Terilla and Yiannis Vlassopoulos.

This allowed us to view the text-snippets as points in a Lawvere pseudoquasi metric space, and to define a ‘topos’ of enriched presheaves on it, including the Yoneda-presheaves containing semantic information of the snippets.

In the previous post we looked at ‘building a second brain’ apps, such as LogSeq and Obsidian, and hoped to use them to test the conjectured ‘topos of the unconscious’.

In Obsidian, a vault is a collection of notes (with their tags and other meta-data), together with all links between them.

The vault of the language-poset will have one note for every text-snippet, and a link from note $n$ to note $m$ if $m$ is a text-fragment in $n$.

In their paper, Bradley, Terilla and Vlassopoulos use the enrichment structure where $\mu(n,m) \in [0,1]$ is the conditional probability that the fragment $m$ is extended to the larger text $n$.

Most Obsidian vaults are a lot more complicated, possibly having oriented cycles in their internal link structure.



Still, it is always possible to turn the notes of the vault into a category enriched over $[0,1]$, in multiple ways, depending on whether we want to focus on the internal link-structure or rather on the semantic similarity between notes, or any combination of these.

Let $X$ be a set of searchable data from your vault. Elements of $X$ may be

  • words contained in notes
  • in- or out-going links between notes
  • tags used
  • YAML-frontmatter

Assign a non-negative real number $r_x \geq 0$ to every $x \in X$. We see $r_x$ as the ‘relevance’ we attach to the search term $x$. So, it is possible to emphasise certain key-words or tags, find certain links more important than others, and so on.

For this relevance function $r : X \rightarrow \mathbb{R}_+$, we have a function defined on all subsets $Y$ of $X$

$$f_r~:~\mathcal{P}(X) \rightarrow \mathbb{R}_+ \qquad Y \mapsto f_r(Y) = \sum_{x \in Y} r_x$$

Take a note $n$ from the vault $V$ and let $X_n$ be the set of search terms from $X$ contained in $n$.

We can then define a (generalised) Jaccard distance for any pair of notes $n$ and $m$ in $V$:

$$ d_r(n,m) = \begin{cases}
0~\text{if $f_r(X_n \cup X_m)=0$} \\ 1-\frac{f_r(X_n \cap X_m)}{f_r(X_n \cup X_m)}~\text{otherwise} \end{cases}$$

This distance is symmetric, $d_r(n,n)=0$ for all notes $n$, and the crucial property is that it satisfies the triangle inequality, that is, for all triples of notes $l$, $m$ and $n$ we have

$$d_r(l,n) \leq d_r(l,m)+d_r(m,n)$$

For a proof in this generality see the paper A note on the triangle inequality for the Jaccard distance by Sven Kosub.
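As a concrete illustration, here is a minimal Python sketch of $f_r$ and $d_r$, assuming each note has already been reduced to its set of search terms (the names and the toy relevance values are hypothetical):

```python
def f_r(relevance, Y):
    """Total relevance f_r(Y) of a set Y of search terms."""
    return sum(relevance.get(x, 0.0) for x in Y)

def d_r(relevance, X_n, X_m):
    """Generalised Jaccard distance between two notes,
    given their sets of search terms X_n and X_m."""
    union = f_r(relevance, X_n | X_m)
    if union == 0:
        return 0.0
    return 1.0 - f_r(relevance, X_n & X_m) / union

# toy example: emphasise the tag '#topos' over ordinary words
relevance = {"#topos": 5.0, "presheaf": 1.0, "vault": 1.0}
print(d_r(relevance, {"#topos", "vault"}, {"#topos", "presheaf"}))  # 1 - 5/7
```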

How does this help to make the vault $V$ into a category enriched over $[0,1]$?

The poset $([0,1],\leq)$ is the category with objects all numbers $a \in [0,1]$, and a unique morphism $a \rightarrow b$ between two numbers iff $a \leq b$. This category has limits (infs) and colimits (sups), has a monoidal structure $a \otimes b = a \times b$ with unit object $1$, and an internal hom

$$Hom_{[0,1]}(a,b) = (a,b) = \begin{cases} \frac{b}{a}~\text{if $b \leq a$} \\ 1~\text{otherwise} \end{cases}$$
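As a quick sanity check (a sketch, not something from the paper), one can verify numerically that this internal hom is exactly what makes the adjunction $a \otimes b \leq c \Leftrightarrow b \leq Hom_{[0,1]}(a,c)$ work:

```python
from fractions import Fraction
from random import randint

def hom(a, b):
    """Internal hom in ([0,1], x, 1): b/a if b <= a (and a > 0), else 1."""
    return b / a if 0 < a and b <= a else Fraction(1)

# the tensor-hom adjunction:  a*b <= c  if and only if  b <= hom(a, c)
for _ in range(1000):
    a, b, c = (Fraction(randint(0, 10), 10) for _ in range(3))
    assert (a * b <= c) == (b <= hom(a, c))
```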



We say that the vault is a category enriched over $[0,1]$ if for every pair of notes $n$ and $m$ we have a number $\mu(n,m) \in [0,1]$ satisfying $\mu(n,n)=1$ for all notes $n$, and

$$\mu(m,l) \times \mu(n,m) \leq \mu(n,l)$$

for all triples of notes $l$, $m$ and $n$.

Starting from any relevance function $r : X \rightarrow \mathbb{R}_+$ we define for every pair $n$ and $m$ of notes the distance function $d_r(m,n)$ satisfying the triangle inequality. If we now take

$$\mu_r(m,n) = e^{-d_r(m,n)}$$

then the triangle inequality translates for every triple of notes $l,m$ and $n$ into

$$\mu_r(m,l) \times \mu_r(n,m) \leq \mu_r(n,l)$$

That is, every relevance function makes $V$ into a category enriched over $[0,1]$.
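Reusing the `d_r` sketch from above, the enrichment values and the multiplicative form of the triangle inequality can be checked on toy data (again hypothetical):

```python
from math import exp

def mu_r(relevance, X_n, X_m):
    """Enrichment value mu_r(m, n) = exp(-d_r(m, n))."""
    return exp(-d_r(relevance, X_n, X_m))

# three toy notes, given by their sets of search terms
l, m, n = {"#topos", "vault"}, {"#topos", "presheaf"}, {"presheaf", "vault"}
assert mu_r(relevance, m, l) * mu_r(relevance, n, m) <= mu_r(relevance, n, l)
```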

Two simple relevance functions, and their corresponding distance and enrichment functions are available from Obsidian’s Graph Analysis community plugin.

To get structural information on the link-structure take as $X$ the set of all incoming and outgoing links in your vault, with relevance function the constant function $1$.

‘Jaccard’ in Graph Analysis computes for the current note $n$ the value of $1-d_r(n,m)$ for all notes $m$, so if this value is $a \in [0,1]$, then the corresponding enrichment value is $\mu_r(m,n)=e^{a-1}$.



To get semantic information on the similarity between notes, let $X$ be the set of all words in all notes and take again as relevance function the constant function $1$.

To access ‘BoW’ (Bags of Words) in Graph Analysis, you must first install the (non-community) NLP plugin which enables various types of natural language processing in the vault. The install is best done via the BRAT plugin (perhaps I’ll do a couple of posts on Obsidian someday).

If it gives for the current note $n$ the value $a$ for a note $m$, then again we can take as the enrichment structure $\mu_r(n,m)=e^{a-1}$.



Graph Analysis offers more functionality, and a good introduction is given in this clip:

Calculating the enrichment data for custom designed relevance functions takes a lot more work, but is doable. Perhaps I’ll return to this later.

Mathematically, it is probably more interesting to start with a given enrichment structure $\mu$ on the vault $V$, describe the category of all enriched presheaves $\widehat{V_{\mu}}$ and find out what we can do with it.

(tbc)

Previously in this series:

Next:

The super-vault of missing notes


Loading a second brain

Before ChatGPT, the hype among productivity boosters was the PKM, or Personal Knowledge Management system.

It gained popularity through Tiago Forte’s book ‘Building a second brain’, and (for academics perhaps a more useful read) ‘How to take smart notes’ by Sönke Ahrens.



These books promote new techniques for note-taking (and for storing these notes) such as the PARA-method, the CODE-system, and Zettelkasten.

Unmistakable Creative has some posts on the principles behind the ‘second brain’ approach.

Your brain isn’t like a hard drive or a dropbox, where information is stored in folders and subfolders. None of our thoughts or ideas exist in isolation. Information is organized in a series of non-linear associative networks in the brain.

Networked thinking is not just a more efficient way to organize information. It frees your brain to do what it does best: Imagine, invent, innovate, and create. The less you have to remember where information is, the more you can use it to summarize that information and turn knowledge into action.

and

A network has no “correct” orientation and thus no bottom and no top. Each individual, or “node,” in a network functions autonomously, negotiating its own relationships and coalescing into groups. Examples of networks include a flock of birds, the World Wide Web, and the social ties in a neighborhood. Networks are inherently “bottom-up” in that the structure emerges organically from small interactions without direction from a central authority.

-Tiago Forte, Tagging for Personal Knowledge Management

There are several apps you can use to start building your second brain, the more popular seem to be Roam Research, LogSeq, and Obsidian.

These systems allow you to store, link and manipulate a large collection of notes, query them as a database, modify them in various ways via plugins or scripts, and navigate the network created via graph-views.

Exactly the kind of things we need to modify the simple system from the shape of languages-post into a proper topos of the unconscious.

I’ve been playing around with Obsidian which I like because it has good LaTeX plugins, powerful database tools via the Dataview plugin, and one can execute codeblocks within notes in almost any programming language (python, haskell, lean, Mathematica, ruby, javascript, R, …).

Most of all it has a vibrant community of users, an excellent forum, and a well-documented Obsidian hub.

There’s just one problem, I’m a terrible note-taker, so how can I begin to load my ‘second brain’?

Obsidian has several plugins to import data, such as your Kindle highlights, your Twitter feed, your Readwise-data, and many others, but having been too lazy in the past, I cannot use any of them.

In fact, the only useful collection of notes I have are my blog-posts. So, I’ve uploaded NeverEndingBooks into Obsidian, one note per post (admittedly, not very Zettelkasten-like), half a million words in total.

Fortunately, I did tag most of these posts at the time. Together with other meta-data this results in the Graph view below (under ‘Files’ toggled tags, under ‘Groups’ three tag-colours, and under ‘Display’ toggled arrows). One can add colour-groups based on tags or other information (here, red dots are posts tagged ‘Grothendieck’, the blue ones are tagged ‘Conway’, the purple ones tagged ‘Connes’, just for the sake of illustration). In Obsidian you can zoom into this graph, place a pointer on a node to highlight the connecting dots, and much more.



Because I tend to forget such things, and as it may be useful to other people running a WordPress-blog making heavy use of MathJax, here’s the procedure I followed:

1. Follow the instructions from Convert wordpress articles to markdown.

In the wizard I’ve opted to go only for yearly folders, to prefix posts with the date, and to save all images.

2. This gives you a directory with one folder per year containing markdown versions of your posts, and in each year-folder a subfolder ‘img’ containing all images.

Turn this directory into an Obsidian-vault by opening Obsidian, click on the ‘open another vault’ icon (third from bottom-left), select ‘Open folder as vault’ and navigate to your directory.

3. You will notice that most of your LaTeX cannot be parsed because during the markdown-process backslashes are treated as special characters, resulting in two backslashes for every LaTeX-command…

A remark before trying to solve this: another option might be to use the wordpress-to-hugo-exporter, resulting in clean LaTeX, but lacking the possibility to opt for yearly-folders (it dumps all posts into one folder), and it makes a mess of the image-files.

4. So, we will need to do a lot of search&replaces in all files, and need a convenient tool for this.

First option was the Sublime Text app, which is free and does the search&replaces quickly. The problem is that you have to save each of the files, one at a time! This may take hours.

I’ve done it using the Search and Replace app ($3), which allows you to make several searches/replaces at the same time (I had messed up LaTeX code in previous exports, so needed to do many more changes). It warns you that it is dangerous to replace strings in all files (which is the reason why Sublime Text makes it difficult); you can ignore this, but only after you put the ‘img’ folders away in a safe place. Otherwise it will also try to make the changes to these files, recognise that they are not text-files, and drop them altogether…
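If you prefer to script this step, the whole batch replacement fits in a few lines; a minimal sketch, assuming the exported vault sits in ./vault and that only the doubled backslashes need fixing (back up the vault first; the path and the skipping of the ‘img’ folders are my assumptions):

```python
import pathlib

VAULT = pathlib.Path("vault")        # hypothetical path to the exported markdown

for md in VAULT.rglob("*.md"):
    if "img" in md.parts:            # leave the image folders alone
        continue
    text = md.read_text(encoding="utf-8")
    fixed = text.replace("\\\\", "\\")   # collapse the doubled backslashes from the export
    if fixed != text:
        md.write_text(fixed, encoding="utf-8")
```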

That’s it.

I now have a backup network-version of this blog.



As we mentioned in the previous post a first attempt to construct the ‘topos of the unconscious’ might be to start with a collection of notes (the ‘conscious’) and work on the semantics of text-snippets to unravel (a part of) the unconscious underpinning of these notes. We also mentioned that the poset-structure in that post should be replaced by a more involved network structure.

What interests me most is whether such an approach might be doable ‘in practice’, and Obsidian looks like the perfect tool to try this out.

What we need is a sufficiently large set of notes, of independent interest, to inject into Obsidian. The more meta it is, the better…

(tbc)

Previously in this series:

Next:

The enriched vault


The shape of languages

In the topology of dreams we looked at Sibony’s idea to view dream-interpretations as sections in a fibered space.

The ‘points’ in the base-space and fibers consist of chunks of text, perhaps connected by links. The topology and shape of this fibered space is still shrouded in mystery.

Let’s look at a simple approach to turn a large number of texts into a topos, and define a loose metric on it.

There’s this paper An enriched category theory of language: from syntax to semantics by Tai-Danae Bradley, John Terilla and Yiannis Vlassopoulos.

Tai-Danae Bradley is an excellent communicator of everything category related, so probably it is more fun to read her own blogposts on this paper:

or to watch her Categories for AI talk: ‘Category Theory Inspired by LLMs’:

Let’s start with a collection of notes. In the paper, they consider all possible texts written in some language, but it may be a set of webpages to train a language model, or a set of recollections by someone.

Next, shred these notes into chunks of text, and point each of these to all the texts obtained by deleting some words at the start and/or end of it. For example, the note ‘a red rose’ will point to ‘a red’, ‘red rose’, ‘a’, ‘red’ and ‘rose’ (but not to ‘a rose’).

You may call this a category, but to me it is just a poset $(\mathcal{L},\leq)$. The maximal elements are the individual words, the minimal elements are the notes, or websites, we started from.



A down-set $A$ of this poset $(\mathcal{L},\leq)$ is a subset of $\mathcal{L}$ closed under taking smaller elements, that is, if $a \in A$ and $b \leq a$, then $b \in A$.

The intersection of two down-sets is again a down-set (or empty), and the union of down-sets is again a down-set. That is, down-sets define a topology on our collection of text-snippets, or if you want, on language-fragments.

For example, the open determined by the word ‘red’ is the collection of all text-fragments containing this word.
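A small Python sketch of this shredding, and of the open set determined by a word (toy notes, hypothetical names):

```python
def snippets(note):
    """All text-fragments obtained by deleting words at the start and/or end."""
    words = note.split()
    return {" ".join(words[i:j])
            for i in range(len(words)) for j in range(i + 1, len(words) + 1)}

notes = ["a red rose", "a red dwarf"]
L = set().union(*(snippets(note) for note in notes))

def open_of(word):
    """The open set determined by `word`: all snippets containing it."""
    return {s for s in L if word in s.split()}

print(open_of("red"))
# {'red', 'a red', 'red rose', 'a red rose', 'red dwarf', 'a red dwarf'}
```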

The corresponding presheaf topos $\widehat{\mathcal{L}}$ is then just the category of all (set-valued) presheaves on this topological space.
As an example, the Yoneda-presheaf $\mathcal{Y}(p)$ of a text-snippet $p$ is the contra-variant functor

$$(\mathcal{L},\leq) \rightarrow \mathbf{Sets}$$

sending any $q \leq p$ to the one-point set $\{ \ast \}$ consisting of the unique map from $q$ to $p$, and any $q \not\leq p$ to $\emptyset$. If $A$ is a down-set (an open set of our topological space) then the sections of $\mathcal{Y}(p)$ over $A$ are $\{ \ast \}$ if for all $a \in A$ we have $a \leq p$, and $\emptyset$ otherwise.
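A tiny continuation of the sketch above (reusing `snippets` and `open_of`) makes the order relation and these sections concrete:

```python
def leq(q, p):
    """q <= p iff p is a text-fragment of q (q is the larger text)."""
    return p in snippets(q)

def yoneda_sections(p, A):
    """Sections of the Yoneda presheaf Y(p) over a down-set A."""
    return {"*"} if all(leq(a, p) for a in A) else set()

print(yoneda_sections("rose", open_of("rose")))  # {'*'}
print(yoneda_sections("red", open_of("rose")))   # set(): 'rose' does not contain 'red'
```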

The presheaf $\mathcal{Y}(p)$ already contains some semantic information about the snippet $p$ as it gives all contexts in which $p$ appears.

Perhaps interesting is that the ‘points’ of the topos $\widehat{\mathcal{L}}$ are the notes we started from.

Recall that Connes and Gauthier-Lafaye want to construct a topos describing someone’s unconscious, and points of that topos should be the connection with that person’s consciousness.

Suppose you want to unravel your unconscious. You start by writing down a large set of notes containing all relevant facts of your life. Then you construct from these notes the above collection of snippets and its corresponding pre-sheaf topos. Clearly, you wrote your notes consciously, but probably the exact phrasing of these notes, or recurrent themes in them, or some text-combinations are ruled by your unconscious.

Ok, it’s not much, but perhaps it’s a germ of a potential approach…




Now we come to the interesting part of the paper, the ‘enrichment’ of this poset.

Surely, some of these text-snippets will occur more frequently than others. For example, in your starting notes the snippet ‘red rose’ may appear ten times more often than the snippet ‘red dwarf’, but this is not visible in the poset-structure. So how can we bring in this extra information?

If we have two text-snippets $p$ and $q$ with $q \leq p$, that is, $p$ is a connected sub-string of $q$, we can compute the conditional probability $\pi(q|p)$, which tells us how likely it is that if we spot an occurrence of $p$ in our starting notes, it is part of the larger sentence $q$. These numbers can easily be computed, and from the rules of probability we get that for snippets $r \leq q \leq p$ we have

$$\pi(r|p) = \pi(r|q) \times \pi(q|p)$$

so these numbers (all between $0$ and $1$) behave multiplicatively along paths in the poset.

Nice in theory, but it requires an awful lot of computation. From the paper:

The reader might think of these probabilities $\pi(q|p)$ as being most well defined when $q$ is a short extension of $p$. While one may be skeptical about assigning a probability distribution on the set of all possible texts, it’s reasonable to say there is a nonzero probability that cat food will follow I am going to the store to buy a can of and, practically speaking, that probability can be estimated.

Indeed, existing LLMs successfully learn these conditional probabilities $\pi(q|p)$ using standard machine learning tools trained on large corpora of texts, which may be viewed as providing a wealth of samples drawn from these conditional probability distributions.
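On a toy corpus one can already play with such estimates by brute-force counting; a crude sketch, reusing the toy `notes` list from the earlier sketch (naive frequency counting, not the corpus-scale machinery the paper has in mind):

```python
def count(snippet, notes):
    """Occurrences of `snippet` as a contiguous word-sequence in the notes."""
    target = snippet.split()
    k = len(target)
    total = 0
    for note in notes:
        words = note.split()
        total += sum(words[i:i + k] == target for i in range(len(words) - k + 1))
    return total

def pi(q, p, notes):
    """Naive estimate of pi(q|p), for p a fragment of q."""
    cp = count(p, notes)
    return count(q, notes) / cp if cp else 0.0

print(pi("a red rose", "red", notes))  # 0.5: one of the two occurrences of 'red' extends to 'a red rose'
```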

It may be easier to have an estimate $\mu(q|p)$ of this conditional probability for immediate successors (that is, if $q$ is obtained from $p$ by adding one word at the beginning or end of it), and then extend this measure to all arrows in the poset by taking the maximum of products along paths. In this way we have for all $r \leq q \leq p$ that

$$\mu(r|p) \geq \mu(r|q) \times \mu(q|p)$$
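A sketch of this extension step (hypothetical names: `succ[p]` lists the one-word extensions of $p$, and `mu_step[(q, p)]` is the estimated $\mu(q|p)$ for such an immediate extension):

```python
from functools import lru_cache

def extend_mu(nodes, succ, mu_step):
    """Extend mu, given only on immediate (one-word) extensions,
    to all pairs by taking the maximum of products along paths."""
    @lru_cache(maxsize=None)
    def best(p, r):
        # maximum product of step-values over all paths from p to r
        if p == r:
            return 1.0
        return max((mu_step[(q, p)] * best(q, r) for q in succ.get(p, ())),
                   default=0.0)
    return {(r, p): best(p, r) for p in nodes for r in nodes if best(p, r) > 0.0}
```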

The upshot is that this measure $\mu$ turns our poset (or category) $(\mathcal{L},\leq)$ into a category ‘enriched’ over the unit interval $[ 0,1 ]$ (suitably made into a monoidal category).

I’ll spare you the details and just flesh out the corresponding notion of ‘enriched presheaves’, the objects of the semantic category $\widehat{\mathcal{L}}^s$ in the paper, which is the enriched version of the presheaf category $\widehat{\mathcal{L}}$.

An enriched presheaf is a function (not functor)

$$F~:~\mathcal{L} \rightarrow [0,1]$$

satisfying the condition that for all text-snippets $r,q \in \mathcal{L}$ we have that

$$\mu(r|q) \leq [F(q),F(r)] = \begin{cases} \frac{F(r)}{F(q)}~\text{if $F(r) \leq F(q)$} \\ 1~\text{otherwise} \end{cases}$$

Note that the enriched (or semantic) Yoneda presheaf $\mathcal{Y}^s(p)(q) = \mu(q|p)$ satisfies this condition, and now this data not only records the contexts in which $p$ appears, but also measures how likely it is for $p$ to appear in a certain context.
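A small checker for this condition (a sketch; here `mu` is a dict with `mu[(q, p)]` $= \mu(q|p)$, for instance as produced by `extend_mu` above, and `F` is a dict of values in $[0,1]$):

```python
def hom_01(a, b):
    """The internal hom [a, b] in [0,1]: b/a if b <= a (and a > 0), else 1."""
    return b / a if 0 < a and b <= a else 1.0

def is_enriched_presheaf(F, mu, L):
    """Check that mu(r|q) <= [F(q), F(r)] for all snippets q, r in L."""
    return all(mu.get((r, q), 0.0) <= hom_01(F[q], F[r]) for q in L for r in L)

def semantic_yoneda(p, mu, L):
    """The enriched Yoneda presheaf Y^s(p): q |-> mu(q|p)."""
    return {q: mu.get((q, p), 0.0) for q in L}
```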

Another cute application of the condition on the measure $\mu$ is that it allows us to define a ‘distance function’ (satisfying the triangle inequality) on all text-snippets in $\mathcal{L}$ by

$$d(q,p) = \begin{cases} -\ln(\mu(q|p))~\text{if $q \leq p$} \\ \infty~\text{otherwise} \end{cases}$$

So, the higher $\mu(q|p)$ the closer $q$ lies to $p$, and now the snippet $p$ (example ‘red’) not only defines the open set in $\mathcal{L}$ of all texts containing $p$, but now we can structure the snippets in this open set with respect to this ‘distance’.



In this way we can turn any language, or a collection of texts in a given language, into what Lawvere called a ‘generalized metric space’.

It looks as if we are progressing slowly in our, probably futile, attempt to understand Alain Connes’ and Patrick Gauthier-Lafaye’s claim that ‘the unconscious is structured like a topos’.

Even if we accept the fact that we can start from a collection of notes, there are a number of changes we need to make to the above approach:

  • there will be contextual links between these notes
  • we only want to retain the relevant snippets, not all of them
  • between these ‘highlights’ there may also be contextual links
  • texts can be related without having to be concatenations
  • we need to implement changes when new notes are added
  • … (much more)

Perhaps, we should try to work on a specific ‘case’, and explore all technical tools that may help us to make progress.

(tbc)

Previously in this series:

Next:

Loading a second brain
