• 0 Posts
  • 36 Comments
Joined 1 year ago
Cake day: June 16th, 2023

  • I have dozens of projects in varying levels of completion and maybe like 2 finished projects. Here’s my list, steal to your liking because I come up with ideas I want to see in the world, and clearly I’m not a great medium for that:

    • Philotic - p2p network of Python servers based on a generalization of process forking. Every server runs the same file (global scope is initialization), but some id-based guard like an annotation lets them do different things (rough sketch after this list). I designed this to act as a lower layer for eventually splitting an LLM across multiple computers, something very obnoxious to do manually but relatively easy to code.
    • Servitor - Actually probably will continue working on this, it’s a library which makes it easy to use LLMs as “semantic functions”, effectively letting you describe a natural language task and execute it as if it were a normal function.
    • TAO - Type Annotated Objects, more or less CBOR with some personal improvements. 0-0x7f are tiny ints, the first nibble is a type, and the second nibble is usually an embedded length prefix (or a signal for a longer length prefix) - rough sketch after this list. Being nibble-based and having a dedicated DEBUG type makes it a lot easier to read in hexdumps, and gives twice as many type prefixes to work with. I rearranged the types a bit to be more sane than CBOR (which eg has separate types for negative and positive integers), and also added streaming and varint support.
    • STM/Snow - Structured Text Markup (in-progress name; “Snow” is a bit too informal?), a text serialization format where all data is either (un)quoted text; a “tag”, which is { followed by data or text: data pairs, then }; or “markup”, which is [ followed by plaintext interspersed with tags and ending with ]. The mixed positional/named attribute model of tags makes its object model a generalization of XML using JSON-like syntax, and I’ve found it very easy to write parsers for.
      • My “pie in the sky” dream is to completely overhaul HTML/CSS/JS for STM/simplified CSS/WASM, but that’s never going to happen 😞
    • Munchy - IDL-like language for representing file formats as an executable schema. Eventual goal was for it to be powerful enough to parse textual formats like JSON, which tend to be more contextual. At some point I found a similar project using YAML to define the schemas, but not being a DSL made it more confusing IMO.
    • RetroArch file - A common file format for RetroArch to combine ROMs, patches, cheats, saves, etc into one cohesive format. Never got far with this one.
    • Binary MIME aka contype. I even wrote an RFC apparently? Adorable.
    • LLM modification - A paper I wrote about a potential LLM modification replacing the FF layers with a shared vector database to decouple memorization objectives from semantic and syntactic objectives, resulting in smaller foundation models. Predictably no one cared and probably no one should care, but it might be an interesting experiment to implement.
      • Probably a more useful modification I haven’t seen yet would be to implement kv caching using a prefix tree rather than a per-request cache. That would make semantic functions a lot faster, since it could cache the prompt between requests and it would only have to process the data.
    • Preference vectors - Simple stochastic updating of “preference” and “feature” vectors to transparently associate preferences with content (rough sketch after this list). This would allow people to essentially build their own “The Algorithm”, since the update operation can be designed to create a linear space, so you can eg request content close to “my preferences + my mood + heavy metal + randomness” and share feature vectors on social media. I think when I tested it I made a weird modular space where d(0, 255) = 1, and it still worked. Stochastic updates work, even in a distributed context, because they’re a kind of “simulated annealing”.
    • Wika - Simplified and standardized WikiText parsing (which is surprisingly not actually standardized and MediaWiki essentially defines it as “whatever our PHP parser can read”). Follow-up is naturally a wiki written in anything other than PHP.
    • i2cec - ATtiny11 firmware for bridging the i2c and CEC lines of an HDMI cable so you can send remote control commands via your SMBus to an attached monitor (I accidentally got a TV instead of a normal computer monitor). Never got it to work quite right, timing was very tight.
    • U413 - A Unix-terminal-themed BBS forum with a looong history of makes and remakes, and a community that has been whittled down to a handful of people.
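
    A few of these are easier to show than to describe, so here are some rough sketches (all names and details are hypothetical illustrations, not code from the actual projects). For Philotic, the id-based guard boils down to something like this:

    import os

    # Every node runs this same file; the global scope acts as initialization.
    # NODE_ID is a hypothetical way to tell nodes apart (env var, config, etc.).
    NODE_ID = int(os.environ.get("NODE_ID", "0"))

    def on_node(*ids):
        """Run the decorated function only on the listed node ids; no-op elsewhere."""
        def decorator(func):
            if NODE_ID in ids:
                return func
            return lambda *args, **kwargs: None
        return decorator

    @on_node(0)
    def serve_embeddings():
        print(f"node {NODE_ID}: serving the embedding layers")

    @on_node(1, 2)
    def serve_transformer_blocks():
        print(f"node {NODE_ID}: serving a shard of transformer blocks")

    serve_embeddings()
    serve_transformer_blocks()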
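
    For TAO, here is a toy encoder in the same spirit. Only the tiny-int rule and the nibble/length-prefix layout come from the description above; the concrete type nibble for byte strings (0x8) and the varint format are stand-ins:

    def encode_varint(n):
        """LEB128-style varint: 7 bits per byte, high bit = continuation (stand-in format)."""
        out = bytearray()
        while True:
            byte = n & 0x7F
            n >>= 7
            out.append(byte | (0x80 if n else 0))
            if not n:
                return bytes(out)

    def encode_tiny_int(n):
        """0x00-0x7f encode small non-negative integers directly."""
        assert 0 <= n <= 0x7F
        return bytes([n])

    def encode_bytes(data, type_nibble=0x8):
        """High nibble = type, low nibble = embedded length (0xF signals a varint length follows)."""
        if len(data) < 0xF:
            return bytes([(type_nibble << 4) | len(data)]) + data
        return bytes([(type_nibble << 4) | 0xF]) + encode_varint(len(data)) + data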
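
    And for the preference vectors, the update I have in mind is roughly a stochastic nudge of a preference vector toward (or away from) a piece of content’s feature vector. This sketch uses a plain Euclidean space rather than the modular one I mentioned, and all the numbers are arbitrary:

    import random

    RATE = 0.1  # how hard each interaction nudges the preference vector

    def update(preference, features, liked):
        """Move the preference vector toward liked content, away from disliked content."""
        sign = 1.0 if liked else -1.0
        return [
            p + sign * RATE * (f - p) + random.gauss(0, 0.01)  # small noise ~ "simulated annealing"
            for p, f in zip(preference, features)
        ]

    def score(preference, features):
        """Similarity used to rank content; vectors compose by simple addition."""
        return sum(p * f for p, f in zip(preference, features))

    heavy_metal = [random.uniform(-1, 1) for _ in range(8)]
    my_prefs = [0.0] * 8
    my_prefs = update(my_prefs, heavy_metal, liked=True)
    print(score(my_prefs, heavy_metal))  # higher after a "like"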

    And finally there’s my magnum opus, Espresso, my favorite project, the one I keep coming back to time and time again, bikeshedding and refining over many years. If anyone else takes it up I’d be ecstatic.

    • Influences: TypeScript, Python, Lua, Rust, C/++, Julia
    • Self-hosted prototype-based scripting language with its JIT written in itself (eventually)
    • Emphasis on simple rules which build up arbitrary complexity, a pathological avoidance of repetition, conciseness, and near Lispian customizability. SMOL.
    • ASCII lexing, with unicode support deferred to other stages (compare to Lua, which treats bytes > 0x7e as identifier characters - I also treat bytes <= 0x20 as whitespace).
    • PDA tokenization (used to be FSA/regex but nested format-strings required more power).
    • LR(1) parsing with concurrent bytecode emission (à la Lua); the AST is only built up until it can be converted to bytecode. The most extreme case is extensive destructuring assignment (Rust, Python, [P2392]), which shares a syntax with object literals but can be treated as LR(1) by keeping track of a distinction between “lvalue”- and “rvalue”-compatible syntax.
    • All types defined with proto[...T] Super Sub(...args) { ... }, eg proto class Animal { ... } and proto Animal Monkey { ... }
      • The higher-order types include class, enum, union, interface, struct, etc. Compare to [P0707]
      • Note that these kinds are objects, not keywords. They define how to convert the body to a type constructor and prototype chain(s).
      • It took a few months to figure out this is possible by maintaining separate prototype chains for classes and instances.
    • Statements implicitly end, ; is an optional operator to explicitly disambiguate their ending.
      • after operator, x() after y() == (var t = x(); y(); t) - surprisingly useful for conciseness.
      • “Everything is an expression” - loops are generators, eg list comprehension works like [...for(var x in 10) x] == list(range(10))
    • Operator overloads use the operator itself as the method name, eg proto class { +(rhs) { console.log("\{this} + \{rhs}"); } }
    • Type annotations define compiler semantics: Type objects have a delegate() method which defines how to represent the variable on the stack. Untyped variables use an inferred type or an implicit any, which wraps a dynamic object. This lets you create objects like int32 while still using the prototype semantics.
    • Recently I’ve been thinking the syntax itself could be made partially dynamic by using static scoping and macros which hook into the compiler when they’re encountered, but I’ve tried similar things a few times and it tends to lead to failure. This would need something like C#'s unsafe compilation distinction to avoid catastrophic vulnerabilities.
    • “Initialization is compilation” - When you run a file, executing the global scope (“initialization”) is treated as a stage of compilation, and the resulting module object is what is saved, not just its bytecode. Compare this to Python, which saves just the bytecode of the module.
    • Lifetime semantics (à la Rust and friends) are still a WIP.
    • Based on [P0709] and Rust’s try! semantics, exceptions are actually wrapped in a returned Result[T, E] type: try is an operator which unwraps the result and returns early if it’s an error. Thus you get var value = try can_fail();. Using type object operator overloading, the Result type doesn’t need to be explicitly annotated, because Result[T, E] == T | Result[T, E] == T | fail E.
      • fail keyword instead of throw/raise.
    • Really want proper coroutines using PyPy’s continulet/stacklet abstraction. Also maybe delimited continuations as the implementation for panics.
    • Structured process forking.
    • GC based on ideas from this LuaJIT 2.0 GC document.

    I could go on for hours about all of the stuff I’ve thought of for this language. If you want to know more, the README.md and ideas.md are usually the most authoritative, and specification.md is a very formal description of a subset of the stuff that is absolutely 100% decided (mostly syntax). I’ve written Turing-complete subsets of it before. First usable implementation to arrive by sometime in 2200 lmao. 🫠 I can also unpack the other projects if you want to know more.


  • The key search terms are “userChrome” (userChrome.css and userChrome.js) and XUL, the HTML-like language Firefox uses to define its chrome. “Chrome” is a term that predates Google’s browser; it refers to the interface surrounding the displayed web content, and Firefox still uses it internally.

    Right now mine is pretty minimal, but there’s a lot you can change. Essentially, the interface is a kind of HTML page which can use the same features as normal HTML and can even run custom JavaScript. Also look into BetterFox for how to remove Mozilla’s own telemetry and bloat.

    My userChrome.css for reference:

    /* Move findbar to the top */
    .browserContainer > findbar {
        -moz-box-ordinal-group:0 !important; /*  for 112 and older */
        order: -1 !important; /* for 113 and newer */
        border-top: none !important;
        border-bottom: 1px solid ThreeDShadow !important;
    }
    
    /* Remove "Open All In Tabs" button in bookmarks folders */
    #PlacesToolbarItems .openintabs-menuitem,
    #placesContext>menuitem[id="placesContext_openContainer:tabs"],
    #placesContext>menuitem[id="placesContext_openContainer:tabs"]:not([hidden])+menuitem+#placesContext_openSeparator {
        visibility: collapse !important;
    }
    
    /* Tabs are attached on the bottom */
    .tab-background {
      border-radius: var(--tab-border-radius) var(--tab-border-radius) 0 0 !important;
      margin-top: 1px !important;
      margin-bottom: 0 !important;
      padding-bottom: 31px !important;
    }
    .tabbrowser-tab[multiselected=true]:not([selected=true]) .tab-background {
      border-radius: var(--tab-border-radius) !important;
      margin-top: 2px !important;
      margin-bottom: 1px !important;
      padding-bottom: 29px !important;
    }
    .tabbrowser-tab[selected=true] .tab-background ,
    .tabbrowser-tab[multiselected=true] .tab-background {
      background-color: var(--toolbar-bgcolor) !important;
      background-image: var(--toolbar-bgimage) !important;
    }
    

    I found this repo which is supposed to apply Chromium styles in line with Google’s Material Design guidelines.

    Here’s an article I found with some simple tweaks.






  • Let me flip it around again - humans regularly “hallucinate”, it’s just not something we recognize as such. There are neuro-atypical hallucinations, yes, but there are also misperceptions, misunderstandings, brain farts, and “glitches” which regularly occur in healthy cognition, and we have an entire rest of the brain to prevent those. LLMs are most comparable to “Broca’s area”, which neurological case studies suggest naturally produces a stream of nonsense (see: split-brain patients explaining the actions of their mute half). It’s the rest of our “cognitive architecture” which conditions that raw language model to remain self-consistent and form a coherent notion of self. Honestly, this discussion of “conceptualization” is poorly conceived because it’s unfalsifiable and says nothing about the practical applications. Why do I care whether the LLM can conceptualize if it does whatever subset of conceptualization I need to complete a natural language task?

    AI is being super overhyped right now, which is unfortunate because it really is borderline miraculous, yet somehow the hype still manages to overshoot it. Emergent properties are empirical observations of behaviors the models can at least semi-consistently demonstrate. Where it becomes “eye of the beholder” is when we dither on about psychology and philosophy, about whether or not they’re some kind of “conscious” - I would argue they aren’t, and the architecture makes that impossible without external aid, but “conscious(ness)” is such a broad term that it barely has a definition at all. I guess to speedrun the overhype misinformation I see:

    • “They just predict one token at a time” is reductive and misleading even though it’s technically true - the loss function for language modeling inevitably requires learning abstract semantic operations. For instance, to complete “The capital of France is” a language model must in some way “know” about countries, cities, and the ontology of France.
    • “It’s just a chatbot” - ChatGPT is a chatbot, GPT-4 is a language model. Language models model how the likelihood of words and language changes over time. When I said “causal” before, this is an arbitrary restriction of the math such that the model only predicts the “next” word. If you remove this restriction, you can give it a sentence with a hole in it and it’ll tell you what words are most likely to be in that hole. You can think of it as being like a physics model, which describes how objects change over time. Putting these into a “generative” context allows you to extract latent semantic information generalized from the training corpus, including higher-order relationships. tl;dr “chatbot” is the first and least interesting application - anything which relates to “understanding” natural language is a potential application.
    • “Hallucinations show that they’re broken” - Hallucinations are actually what you’d expect from these sorts of models. If I had to broadly class the sorts of hallucinations I see, they would be:
      1. Model inaccuracy - Inevitable, but not the only reason. Essentially it failed to generalize in that specific way, like Stable Diffusion and hands.
      2. Unlikely sampling - It’s possible for the code which picks the next word from the probability distribution to accidentally pick one (or a series) with a very low chance (a toy sampling sketch follows this list). When this happens, the LLM has no way to “undo” that, which puts it in a very weird position where it has to keep predicting even though it’s already in a space that shouldn’t really be possible. There are actually some papers which attempt to correct that, like adding an “undo token” (unfortunately I can’t find the paper) or detecting OOD conditions.
      3. Extrapolation - Especially for the earlier models with small context windows, if it needs information which is now outside that window, it’s still modeling language, just without the necessary context. Lacking that context, it will instead pick a plausible one at random and talk about something unrelated. Compare this to eg dementia patients.
      4. Imagination - When you give it some kind of placeholder, like “<…>”, “etc etc etc” or “## code here ##”, most text in the training data like that will continue as if there was information in that place. Lacking context, just like with “extrapolation”, it picks one at random. You can mitigate this somewhat by telling it to only respond to things that are literally in the text, and GPT-4 doesn’t seem to have this problem much anymore, probably from the RLHF.
      5. Priming - If you prompt the LLM authoritatively enough, eg “find me a case that proves X” which implies such a case exists, if it doesn’t know of any such case, it will create one at random. Essentially, it’s saying “if there was a case that proved X it would look like this”. This is actually useful when properly constrained, eg if you want it to recursively generate code it might use an undefined function that it “wishes” existed.
    • “GPT-5 could be Roko’s basilisk!” - No. This architecture is fundamentally incapable of iterative thought processes; for it to develop those itself would require trillions more parameters, if it’s even possible. What’s more, LLMs aren’t utility-maximizers or reinforcement learning agents like we thought AGI would be; they do whatever you ask and have no will or desires of their own. There’s almost 0 chance this kind of model would go rogue, offset only slightly by people using RLHF, but that’s human-oriented, so the worst you get is the model catering to humans being dumb.
    • “They tek er jerbs!” - Yes, but not because they’re “as good as humans” - they are better when given a specific task to narrowly focus on. The models are general, but they need to be told exactly what to do, which makes them excellent for capitalism’s style of alienated labor. I would argue this would actually be desirable if working weren’t tied to people’s privilege to continue living - no living human should have to flip burgers when a robot can do it better, otherwise you’re treating the human like a robot.
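
    To make the “unlikely sampling” point above concrete, here is a toy version of what happens at each generation step. The distribution is made up, and real samplers add temperature, top-k/top-p, etc.:

    import random

    # Toy next-token distribution; a real model outputs one probability per vocabulary entry.
    next_token_probs = {"Paris": 0.92, "Lyon": 0.05, "banana": 0.03}

    def sample(probs):
        """Draw one token in proportion to its probability."""
        r = random.random()
        cumulative = 0.0
        for token, p in probs.items():
            cumulative += p
            if r < cumulative:
                return token
        return token  # guard against floating-point rounding

    # About 3% of the time this returns "banana", and the model has no way to
    # "undo" it - it just keeps predicting as if "banana" were a sane continuation.
    print(sample(next_token_probs))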

    I’ll add more if I see or think of any. And if you have any specific questions, I’d be happy to answer. Also I should note, I’m of course using a lot of anthropomorphizing language here but it’s the closest we have to describing these concepts. They’re not human, and while they may have comparable behaviors in isolation, you can’t accurately generalize all human behaviors and their interactions onto the models. Even if they were AGI or artificial people, they would “think” in fundamentally different ways.

    If you want a more approachable but knowledgeable discussion of LLMs and their capabilities, I would recommend a youtuber named Dave Shapiro. Very interesting ideas; he gets a bit far into hype and futurism, but those are more or less contained within their own videos.


  • How about this, then. You’ve proposed that LLMs are not capable of conceptualizing, while I propose that the specifics of the internals don’t matter in this case because LLMs are made of dozens of layers which can easily explain higher orders of abstraction, and they exist as black boxes beyond the mechanics of the model. For the record, I personally know the mathematics and mechanics of how they work as I’ve written my own implementations (and I can answer any specific questions you might have). Is there an experiment you can propose which would falsify your assertion that LLMs cannot conceptualize? I’m taking for granted that they can as the null hypothesis because they can readily produce outputs that appear for all intents and purposes to conceptualize.


  • Did you try this with an LLM? Because GPT-4 analyzes it exactly the same way you did and then some:

    List everything wrong in the sentence “The king of Italy is completely bald because his hair is currently naturally green.”

    The sentence “The king of Italy is completely bald because his hair is currently naturally green” contains several issues:

    1. Factual Inaccuracy: As of my last update in September 2021, Italy is a republic and does not have a king.
    2. Logical Contradiction: The sentence states that the king is “completely bald” but then goes on to describe the color of his hair. If he is bald, he wouldn’t have hair to describe.
    3. Causality Issue: The sentence implies that the king is bald because his hair is green, which doesn’t make logical sense. The color of one’s hair doesn’t cause baldness.
    4. Ambiguity in “Naturally”: The word “naturally” could be interpreted in multiple ways. It could mean that the green color is natural to him (not dyed), or it could mean that the hair turned green on its own. The ambiguity could lead to confusion.
    5. Tense Mismatch: The sentence uses “is completely bald” (present tense) and “is currently naturally green” (also present tense) for the hair, which is contradictory.
    6. Redundancy: The word “currently” may be considered redundant if the sentence is understood to be describing a present condition.
    7. Clarity: The sentence overall is confusing and lacks clarity due to the contradictions and ambiguities mentioned.

    In summary, the sentence has issues ranging from factual inaccuracies to logical contradictions and ambiguities.


    Part of what is surprising about LLMs is that they have emergent properties you wouldn’t expect from “autocomplete on steroids”. As it turns out, reducing the loss function for natural language readily generalizes to higher-order abstraction and conceptualization. There do need to be additional layers, for instance allowing an internal monologue, the ability to self-censor or self-correct, and mitigation for low-probability sampling (all of these being inherent limitations of the architecture), but apparently conceptualization is less special than we’d like to think.


  • LLMs are not created to chat, they’re literally what the name says - language models. They are very complex statistical models of the joint causal probability of all possible words given the previous words in the context window. The wider public has a common misconception that they’re “made for chat” because ChatGPT was the first “killer application”, but they are much more general than that. What’s so profound about LLMs to AI and NLP engineers is that they’re general purpose. That is, given the right framework, they can be used to complete any task expressible in natural language. It’s hard to convey to people just how powerful that is, and I haven’t seen software engineers really figure this out yet either. As an example I keep going back to, I made a library to create “semantic functions” in Python which look like this:

    @semantic
    def list_people(text) -> list[str]:
        '''List the people mentioned in the given text.'''
    

    That is the entire function, expressed in the docstring. 10 months ago, this would’ve been literally impossible. I could approximate it with thousands of lines of code using SpaCy and other NLP libraries to do NER, maybe a dictionary of known names with fuzzy matching, some heuristics to rule out city names or more advanced sentence structure parsing for false positives, but the result would be guaranteed to be worse for significantly more effort. Here, I just tell the AI to do it and it… does. Just like that. But you can’t hype up an algorithm that does boring stuff like NLP, so people focus on the danger of AI (which is real, but laymen and news focus on the wrong things), how it’s going to take everyone’s jobs (it will, but that’s a problem with our system which equates having a job to being allowed to live), how it’s super-intelligent, etc. It’s all the business logic and doing things that are hard to program but easy to describe that will really show off its power.
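
    If you are wondering how a decorator like that can work at all, here is a minimal sketch of the general idea - this is not Servitor’s actual implementation, and llm_complete is a stand-in for whatever completion call you have available:

    import functools
    import json

    def llm_complete(prompt):
        """Placeholder for a real LLM call (OpenAI, a local model, etc.)."""
        raise NotImplementedError

    def semantic(func):
        """Turn a docstring-only function into an LLM-backed 'semantic function'."""
        task = func.__doc__
        return_type = func.__annotations__.get("return", "str")

        @functools.wraps(func)
        def wrapper(text):
            prompt = (
                f"{task}\n"
                f"Respond with only JSON parseable as {return_type}.\n\n"
                f"Text:\n{text}"
            )
            return json.loads(llm_complete(prompt))

        return wrapper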









  • I don’t think crypto is dead, I think fintech’s usage of crypto is dead. They came in and ruined what could’ve been a unique and revolutionary idea by making prospective currencies into speculative assets. We might see it reemerge in 10 years, with capitalists and right-libertarians staying as far away as possible because they (hopefully) learned their lesson. The point of a currency is to act as a medium to store and exchange value, but the initial spike in fiat value, turning 12 bitcoins from $0.12 to $12,000, attracted investors, get-rich-quick schemers, and scam artists (but I repeat myself). It doesn’t help that it was designed to be deflationary, so people were incentivized to hoard and bet on the market’s volatility, and there was no organization dedicated to keeping it stable like the Fed. Then alternatives to PoW like PoS came about, which further incentivized hoarding and centralization (you lose stake if you spend, so don’t spend).

    What people miss with all the hate about crypto (though the culture around it deserves a lot of it) is that the technology itself is potentially incredibly useful. Bitcoin was a first crack at the “Byzantine Generals Problem”, essentially how to coordinate a totally trustless and decentralized p2p network. Tying it to money was an easy way to get an incentive structure, but for applications like FileCoin it could just as easily allow for abstracted tit-for-tat services (in their case, “you host my file and I’ll host yours”). Stuff like NFTs has less obvious benefit, but the technology itself is a neutral tool that could see some legitimate use 20 years in the future - say, a decentralized DNS system where you need a DHT mapping domains to IPNS hashes with some concept of ownership. Collectible monkeys are not and never were a legitimate use-case, at least not at that price point.


  • First I’d like to be a little pedantic and say LLMs are not chatbots. ChatGPT is a chatbot - LLMs are language models which can be used to build chatbots. They are models (like a physics model) of language, describing the causal joint probability distribution of language. ChatGPT only acts like an agent because OpenAI spent a lot of time retraining a foundation model (which has no such agent-like behavior) to model “language” as expressed by an individual. Then, they put it into a chatbot “cognitive architecture” which feeds it a truncated chat log. This is why smaller models, when improperly constrained, may start typing as if they were you - they have no inherent distinction between the chatbot and yourself. LLMs are a lot more like Broca’s area than a person, or even a chatbot.

    When I say they’re “general purpose”, this is more or less an emergent feature of language, which encodes some abstract sense of problem solving and tool use. Take the library I wrote to create “semantic functions” from natural language tasks - one of the examples I keep going to in order to demonstrate the usefulness is

    @semantic
    def list_people(text) -> list[str]:
        '''List the people mentioned in the given text.'''
    

    A year ago, this would’ve been literally impossible. I could approximate it with thousands of lines of code using SpaCy and other NLP libraries to do NER, maybe a massive dictionary of known names with fuzzy matching, some heuristics to rule out city names or more advanced sentence structure parsing for false positives, but the result would be guaranteed to be worse for significantly more effort. With LLMs, I just tell the AI to do it and it… does. Just like that. I can ask it to do anything and it will, within reason and with proper constraints.
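
    For a sense of what that buys you, calling it looks like any other function (the output shown is hypothetical, but representative):

    text = "After the meeting, Alice emailed Bob and cc'd Dr. Chen."
    print(list_people(text))  # hypothetical output: ['Alice', 'Bob', 'Dr. Chen']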

    GPT-3 was the first generation of this technology and it was already miraculous for someone like me who’s been following the AI field for 10+ years. If you try GPT-4, it’s at least 10x subjectively more intelligent than ChatGPT/GPT-3.5. It costs $20/mo, but it’s also been irreplaceable for me for a wide variety of tasks - Linux troubleshooting, bash commands, ducking coding, random questions too complex to google, “what was that thing called again”, sensitivity reader, interactively exploring options to achieve a task (eg note-taking, SMTP, self-hosting, SSI/clustered computing), teaching me the basics of a topic so I can do further research, etc. I essentially use it as an extra brain lobe that knows everything as long as I remind it about what it knows.

    While LLMs are not people, or even “agents”, they are “inference engines” which can serve as building blocks to construct an “artificial person”, or some gradation thereof. In the near future, I’m going to experiment with creating a cognitive architecture to start approaching it - long term memory, associative memory, internal thoughts, dossier curation, tool use via endpoints, etc., so that eventually I have what Alexa should’ve been, hosted locally. That possibility is probably what techbros are freaking out about; they’re just uninformed about the technology and think GPT-4 is already that, or that GPT-5 will be (it won’t). But please don’t buy into the anti-hype, it robs you of the opportunity to explore the technology and could blindside you when it becomes more pervasive.

    What would AI have to do to qualify as “capable of some interesting new kind of NLP or can create something entirely new”? From where I stand, that’s exactly what generative AI is? And if it isn’t, I’m not sure what even could qualify unless you used necromancy to put a ghost in a machine…


  • It sounds simple, but data conditioning like that is how you get Scunthorpe blacklisted, and the effects on the model, even if perfectly executed, are unpredictable. It could get into issues of “race blindness”, where the model has no idea these words are bad and as a result is incapable of accommodating humans when the topic comes up. Suppose in 5 years there’s a therapist AI (not ideal, but mental health is horribly understaffed and most people can’t afford a PhD therapist) that gets a client who is upset because they were called a f**got at school; it would have none of the cultural context required to help.

    Techniques like “constitutional AI” and RLHF, applied after the foundation model is trained, really are the best approach here: they let you first get an unfiltered view of a very biased culture, then shape the model’s attitudes afterwards.