• by simonw on 9/6/2023, 3:53:04 AM

    There's a video of a previous version of this tool here, which really helped me understand what it's demonstrating: https://twitter.com/thesephist/status/1617747154231259137

    It's really neat to see how this sentence:

    > The first time I write this sentence, the model is quite confused about what token is about to come next, especially if I throw in weird words like pumpkin, clown, tweets, alpha, teddy bear.

    shows the words pumpkin, clown, etc. as really unlikely on the first pass. But when the sentence is repeated a moment later, all of those words become extremely predictable to the model.

    Also worth noting: this demo runs entirely in the browser! It loads a 120MB ONNX version of GPT-2 using Transformers.js.
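
    If you want to poke at the numbers behind the highlighting outside the browser, here's a minimal sketch of the same measurement using the Python transformers library rather than Transformers.js (my own sketch, not the tool's actual code; the example text is a placeholder):

        import torch
        from transformers import GPT2LMHeadModel, GPT2TokenizerFast

        # Same model family the demo ships as ONNX, loaded in PyTorch here.
        tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
        model = GPT2LMHeadModel.from_pretrained("gpt2")
        model.eval()

        text = ("I throw in weird words like pumpkin, clown, tweets. "
                "I throw in weird words like pumpkin, clown, tweets.")
        enc = tokenizer(text, return_tensors="pt")

        with torch.no_grad():
            logits = model(**enc).logits  # [1, seq_len, vocab_size]

        # Logits at position i predict token i+1, so shift by one to line up
        # each prediction with the token that actually came next.
        probs = torch.softmax(logits[0, :-1], dim=-1)
        for i, tok_id in enumerate(enc.input_ids[0][1:]):
            p = probs[i, tok_id].item()
            print(f"{tokenizer.decode([int(tok_id)])!r}  p={p:.4f}")

    The second occurrence of pumpkin, clown, etc. should come out with far higher probability than the first, which is exactly the red-to-green shift the demo visualizes.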

  • by didgeoridoo on 9/6/2023, 2:09:19 AM

    Really interesting! I wonder how well this syncs up with human intuition and general “information density”. If it’s a close match, maybe you could use this as a tool to help with skimming documents: the red (“hard to predict”) areas might be a good hint to slow down and read more carefully, while the green (“easy to predict”) areas could probably be skimmed without losing much information.
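
    For what it's worth, the standard way to turn those probabilities into an "information density" number per token is surprisal, -log2(p). A tiny sketch, with made-up example probabilities:

        import math

        def surprisal_bits(p: float) -> float:
            """Shannon surprisal (in bits) of a token the model gave probability p."""
            return -math.log2(p)

        print(surprisal_bits(0.9))    # ~0.15 bits: "green", safe to skim
        print(surprisal_bits(0.001))  # ~9.97 bits: "red", slow down and read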

  • by thomasfromcdnjs on 9/6/2023, 9:41:30 AM

    This is beautiful. Needs to be a standard tool for all models.

    Great work!

  • by netipulk on 9/8/2023, 9:30:19 PM

    The red and green highlights look nearly identical to me because I have deuteranopia, which makes this a complete nightmare. You should probably fix that.
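
    A sketch of one possible fix: interpolate between orange and blue (the Okabe-Ito colorblind-safe pair) instead of red and green. This is only an illustration, not the tool's actual code:

        def highlight_color(p: float) -> str:
            """Map token probability in [0, 1] to an orange-to-blue hex color.
            Orange = surprising, blue = predictable; the pair stays
            distinguishable under red-green color vision deficiency."""
            orange = (230, 159, 0)  # Okabe-Ito orange: hard to predict
            blue = (0, 114, 178)    # Okabe-Ito blue: easy to predict
            r, g, b = (round(a + (b_ - a) * p) for a, b_ in zip(orange, blue))
            return f"#{r:02x}{g:02x}{b:02x}"

        print(highlight_color(0.05))  # near-orange: surprising token
        print(highlight_color(0.95))  # near-blue: predictable token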

  • by skybrian on 9/6/2023, 2:40:09 PM

    I’m sure it’s neat but it shouldn’t start running on load, because some people are browsing on mobile.

  • by Scene_Cast2 on 9/6/2023, 4:36:13 AM

    Any chance of Llama2 support?

  • by atgctg on 9/6/2023, 5:52:30 AM

    It would be interesting to have attention visualized as well, similar to how it's done in BertViz:

    https://github.com/jessevig/bertviz
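
    For reference, the attention weights BertViz consumes are one flag away in the Python transformers library. A sketch follows; the head_view call reflects my understanding of BertViz's API, so treat those details as assumptions:

        import torch
        from transformers import GPT2LMHeadModel, GPT2TokenizerFast

        tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
        model = GPT2LMHeadModel.from_pretrained("gpt2", output_attentions=True)
        model.eval()

        enc = tokenizer("The cat sat on the mat.", return_tensors="pt")
        with torch.no_grad():
            out = model(**enc)

        # out.attentions: one tensor per layer, each [1, n_heads, seq, seq]
        print(len(out.attentions), out.attentions[0].shape)

        # In a notebook, hand the weights to BertViz (assumed API):
        # from bertviz import head_view
        # tokens = tokenizer.convert_ids_to_tokens(enc.input_ids[0].tolist())
        # head_view(out.attentions, tokens)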