Hi, I’m Tim.
I am a researcher and engineer trying to figure out how machines learn. I am also interested in solving problems arising from the creation and adoption of artificially intelligent systems.
Currently, I work on interpretability, red-teaming, and steering of language models with the Algorithmic Alignment Group. You can find some of my work below or in my projects.
Previous Work
The Effect of Activation Functions On Superposition in Toy Models
Timothy Kostolansky*, Vedang Lad*
Blog Post
Iterative Interactive Inverse Constitutional AI (I^3CAI)
Timothy Kostolansky*, Julian Manyika*
Class Project