Thread by @JoshuaLelon, I love @RoamResearch, but if I were to make it *JUST* for [...]

I love @RoamResearch, but if I were to make it *JUST* for me, I'd let the entire product be constructed on two principles:

- inline, recursive tags as manifestations of set theory (think: multi-dim venn diagrams)
- Taking DRY (Don't Repeat Yourself) to an absurd extreme

1/

2/ Here's my wishlist for a notes database:

- have I written something like this before? What're the most similar things to this?
- This is similar to thing x I wrote 3 months ago, can I merge their commonalities and branch their differences into new ideas?

3/

- I want structure to emerge organically, but I also want to enforce structure here and there when I know it exists.
- I want my notes to coalesce into common themes and general principles, with a long tail of fragmented thoughts. I want entropy to be my friend.

4/

- I want combining my database with someone else's to be like taking a set union e.g. {1, 2, 3} U {3, 4} = {1, 2, 3, 4}
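
A minimal sketch of that merge in Python, assuming each database can be modeled as a plain set of entries (the names `mine` and `theirs` are made up for illustration):

```python
# Hypothetical stand-ins for two people's note databases.
mine = {1, 2, 3}
theirs = {3, 4}

# Set union keeps a single copy of anything both databases share.
merged = mine | theirs
print(sorted(merged))  # [1, 2, 3, 4]
```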

So the question is... how the heck would you make something that satisfies this?

5/ Have you ever heard the debate: Is Math created or discovered?

We obviously created languages, but once a language is created, new permutations of it are discovered constantly.

Those permutations are static. We rewrite the same words all the time.

6/ If I find myself re-writing the same thing again 6 months later, I want to know. The trouble is: it will almost never be the exact same thing.

Let's assume anything you want to find in your database is instantly find-able. Let's abstract that away for now.

7/ You write thing B.

You know that thing A that you wrote earlier is similar to B.

We want to fight duplication, so you coalesce A and B, and now you have 3 things:

- Things in A not in B (A - B)
- Things in B not in A (B - A)
- Things in both A and B (A intersect B)

/8 You now replace A with (A intersect B) + (A - B), and you replace B with (A intersect B) + (B - A).

Mathematically, A and B are unchanged.

But physically, you've removed all redundancy (the A intersect B parts of A and B are replaced with a reference)

Did I lose you?
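
If so, here is a rough sketch of the coalescing step in Python. It assumes a note is a set of sentences and that "similar" means "exactly equal", which is a big simplification (real notes will almost never match exactly). The function name `coalesce` and the sample notes are invented for illustration.

```python
def coalesce(a: set[str], b: set[str]) -> tuple[set[str], set[str], set[str]]:
    """Split two notes into (only in a, only in b, shared)."""
    return a - b, b - a, a & b

# Hypothetical notes, modeled as sets of sentences.
note_a = {"Tags are sets.", "Roam has backlinks.", "DRY applies to notes."}
note_b = {"Tags are sets.", "DRY applies to notes.", "Merging is a union."}

only_a, only_b, shared = coalesce(note_a, note_b)

# Rebuilding each note from the shared part plus its own remainder
# gives back exactly the original content: A and B are unchanged.
assert shared | only_a == note_a
assert shared | only_b == note_b
```

The shared part is stored once, and both notes would hold a reference to it instead of their own copy.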

/9 Long story short, the idea is to have a single source of truth for everything you ever write.

The moment you find yourself writing something similar to what you previously wrote, the commonality between the two entries gets coalesced.

/10 If you want a better idea of what this looks like, imagine writing your 100th note in this imaginary database.

By the time you're finished, lines 2-5 will be a chunk from note 33.

Lines 18-19 will be a chunk from note 29 + a chunk from note 68.

etc.
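
One way to picture this, as a sketch only: every piece of text lives in a single chunk store, and a note is just an ordered list of chunk ids. The names `chunks`, `store_chunk`, `write_note`, and `render` are hypothetical, and matching is by exact text here, which is an assumption.

```python
# Hypothetical chunk store: every piece of text lives here exactly once.
chunks: dict[int, str] = {}

def store_chunk(text: str) -> int:
    """Return the id of `text`, storing it only if it is new."""
    for chunk_id, existing in chunks.items():
        if existing == text:
            return chunk_id
    chunk_id = len(chunks)
    chunks[chunk_id] = text
    return chunk_id

def write_note(lines: list[str]) -> list[int]:
    """A note is an ordered list of chunk ids, not a copy of the text."""
    return [store_chunk(line) for line in lines]

def render(note: list[int]) -> str:
    """Rebuild the readable note by following its references."""
    return "\n".join(chunks[i] for i in note)

note_33 = write_note(["Tags are sets.", "Sets can be unioned."])
note_100 = write_note(["A new thought.", "Tags are sets."])

# The repeated line points at the same chunk instead of being stored twice.
assert note_100[1] == note_33[0]
assert len(chunks) == 3  # four lines written, three chunks stored
```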

/11 Your entire note database is a giant multidimensional venn diagram.

Okay, so how do you manage this exponentially growing mess?

Tags!

But not just any tags.

Inline, recursive tags! Let's break that down.

/12 If you've ever used a highlighter, that's what I mean. Imagine multiple colored highlighters.

When you write your first note, you'll "highlight" it (i.e. tag it) with various tags. Make it really granular.

Here's another area where I think people get tags wrong:

/13 Tags don't have to be flimsy and broad. Like, "education" or "physics" or "recipes". That doesn't really help anybody.

You can start that way. But I'm removing the assumption that tags have to only be a word or two.

I want them to be as specific as possible (sentences?)

/14 I also want them to be documents ***themselves***!

That way, you can go through your tags, and you can TAG YOUR TAGS!

And so on, recursively!

Ok, what fresh Hell is this?

/15 Let's say you write your first note.

You tag stuff.

- Some broad tags ("Math")
- some more specific tags ("Almost the Bolzano–Weierstrass theorem except without the assumption of boundedness.")

/16 Then you go through and tag your tags.

(The second tag gets tagged as "B-W Theorem" and "Math")

Okay, now you go write your second note.

That note gets some tags, which get tagged as "B-W Theorem", which get tagged as "Math"
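
A small Python sketch of tagging your tags, assuming the tag-of-a-tag relationships are kept in a dictionary. The names `tag_parents` and `all_tags` are made up; the tags themselves are the ones from the example above.

```python
SPECIFIC = ("Almost the Bolzano–Weierstrass theorem except "
            "without the assumption of boundedness.")

# Hypothetical tag graph: each tag maps to the tags applied to it.
tag_parents: dict[str, set[str]] = {
    "Math": set(),
    "B-W Theorem": {"Math"},
    SPECIFIC: {"B-W Theorem", "Math"},
}

def all_tags(direct: set[str]) -> set[str]:
    """Expand direct tags with every tag reachable by following tags of tags."""
    seen: set[str] = set()
    stack = list(direct)
    while stack:
        tag = stack.pop()
        if tag not in seen:
            seen.add(tag)
            stack.extend(tag_parents.get(tag, ()))
    return seen

# A chunk tagged only with the very specific tag still ends up under "Math".
print(sorted(all_tags({SPECIFIC})))
```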

/17 Rinse and repeat 100 times, and you end up with this weird ontology of thoughts. Your thoughts.

And if you've done a lot of tagging, it's going to be a huge, weird, interconnected mess.

Which is great for serendipitous discoveries!

But bad for navigation. Or is it?

/18 We can divide searching into two classes of intent:

- Seeking
- Wandering

If you're seeking a specific entry, or a specific document, there's always the old ways, like keyword search and CTRL + F.

Now, you can *also* start with general tags, and get more granular.
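
As a sketch, "start general, get more granular" could be a filter that requires more and more tags. The chunk names and the helper `with_tags` are invented for illustration, and each chunk's tag set is assumed to be already expanded.

```python
# Hypothetical chunks with their (already expanded) tag sets.
chunk_tags = {
    "limit proof": {"Math", "B-W Theorem"},
    "group theory aside": {"Math"},
    "pancake ratio": {"Recipes"},
}

def with_tags(required: set[str]) -> list[str]:
    """Chunks that carry every tag in `required`."""
    return sorted(name for name, tags in chunk_tags.items() if required <= tags)

print(with_tags({"Math"}))                 # ['group theory aside', 'limit proof']
print(with_tags({"Math", "B-W Theorem"}))  # ['limit proof']
```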

/19 But, EVEN BETTER, it also kind of works like Roam as well, in the bi-directional linking sense.

How?

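Well, if you haven't heard of something called Hamming distance, it's essentially the number of positions at which two equal-length strings differ, i.e. the number of substitutions you need to make to change word X into word Y. (It's a close cousin of edit distance, which also allows insertions and deletions.)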

/20 Hamming distance between 101 and 111 is 1. Between 000 and 111 is 3.

Imagine a list of all the tags you've ever used in your database.

Every chunk of text in your database either has that tag, or it doesn't.

Hence, each chunk has a "tag string":

/21 If your chunk has no tags, then its tag string is:

0000... (for however many tags you have)

If your chunk has 3 tags, it looks something like:

111000...

or maybe 10010100000...

(some string with exactly three 1s, which correspond to those 3 tags)
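
Here is a minimal Python sketch of tag strings and Hamming distance. The tag list `ALL_TAGS` and the helper names are assumptions for illustration; a real database would have far more tags.

```python
# Hypothetical global tag list; position i in a tag string means ALL_TAGS[i].
ALL_TAGS = ["Math", "B-W Theorem", "Physics", "Recipes"]

def tag_string(tags: set[str]) -> str:
    """Encode a chunk's tags as a string of 0s and 1s."""
    return "".join("1" if t in tags else "0" for t in ALL_TAGS)

def hamming(x: str, y: str) -> int:
    """Count the positions where two equal-length strings differ."""
    if len(x) != len(y):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(a != b for a, b in zip(x, y))

print(tag_string(set()))                    # 0000
print(tag_string({"Math", "B-W Theorem"}))  # 1100
print(hamming("101", "111"))                # 1
print(hamming("000", "111"))                # 3
```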

/22 If it has *all* the tags, then it's 11111111...

What does all this mean?

Well, if we changed one thing in our tag string, that's equivalent to either adding a tag or removing a tag.

This would give us a whole new set of chunks that are pretty similar to our previous chunk!

/23 If we have a robust, inline tag system, then changing what tags we filter for means we're effectively using the Hamming distance as a "nearest neighbor" query.

This is how we would "wander" our database! One tag change at a time.
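
A sketch of one "wandering" step, assuming every chunk already has a tag string of the same length. The chunk names and the `neighbors` helper are hypothetical, and this brute-force scan is just the simplest way to show the idea.

```python
def hamming(x: str, y: str) -> int:
    """Count the positions where two equal-length strings differ."""
    return sum(a != b for a, b in zip(x, y))

# Hypothetical chunks, each with a 4-tag tag string.
chunk_tags = {
    "chunk_a": "1100",
    "chunk_b": "1000",
    "chunk_c": "1110",
    "chunk_d": "0011",
}

def neighbors(query: str, max_distance: int = 1) -> list[str]:
    """Chunks whose tag string is 1..max_distance tag changes away from query."""
    return sorted(
        name for name, bits in chunk_tags.items()
        if 0 < hamming(query, bits) <= max_distance
    )

# Adding or removing a single tag from "1100" reaches these chunks.
print(neighbors("1100"))  # ['chunk_b', 'chunk_c']
```

Raising `max_distance` widens the wander: more tag changes allowed, more distant chunks returned.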

/24 Okay, so we've solved search. We've solved wandering. No duplication.

What else is left?

What happens when you edit or delete something? Does it break the system?

Well, if you edit, then you do a Venn Diagram synthesis.

/25 If you delete it, then you don't have to worry about dangling connections. Since traversing is dependent on tags (not content), it won't matter.

What about UX? I want to make this *magical* to use.

I want to be like a surgeon with nurses handing me things as I need them.

/26 Still working on that (I mean, this is all just an idea at the moment anyhow), but my present solution is:

A word bank. Specifically: a tag bank. Floating on the right, or something.

As you write stuff, some magnificent form of NLP model throws tag recommendations at you.

/27 The more I write this, the more I just want to build it myself. I kind of hope someone just does it... but then again, that's schlep blindness.

Reference: http://www.paulgraham.com/schlep.html

*cracks knuckles*
