I am a researcher and engineer trying to figure out how machines learn. I am also interested in solving problems arising from the creation and adoption of artificially intelligent systems.
Currently, I am working on interpretability, red-teaming, and steering of language models with the Algorithmic Alignment Group. You can find my work below or in my projects.
Timothy Kostolansky*, Vedang Lad*
Blog Post
Timothy Kostolansky*, Julian Manyika*
Class Project