One drawback of modern transformers is that every token receives the same amount of predictive compute, even though some tokens are much easier to predict than others. This work from DeepMind lets models exit early during generation and spend fewer FLOPs on such tokens, effectively opening the door to dynamic compute with a fixed maximum budget. The result: 50% fewer FLOPs at generation time for equivalent performance.
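As a minimal sketch of the general idea, here is a top-k token-routing layer in JAX: one way to realize per-token dynamic compute, not the paper's exact method. The router weights, `capacity` parameter, and MLP block are illustrative assumptions.

```python
import jax
import jax.numpy as jnp

def routed_layer(x, router_w, block_fn, capacity):
    """Run only the top-`capacity` tokens through `block_fn`;
    the rest skip the block via the residual connection.
    x: [seq, d_model], router_w: [d_model]."""
    scores = x @ router_w                    # [seq] per-token routing scores
    top_idx = jnp.argsort(scores)[-capacity:]  # indices of the k highest-scoring tokens
    selected = x[top_idx]                    # [capacity, d_model]
    processed = block_fn(selected)           # heavy compute only on selected tokens
    # scatter processed tokens back; unselected tokens pass through unchanged
    return x.at[top_idx].set(selected + processed)

# toy usage: a GELU MLP block, routing half the tokens
seq, d = 8, 16
x = jax.random.normal(jax.random.PRNGKey(0), (seq, d))
w1 = jax.random.normal(jax.random.PRNGKey(1), (d, 4 * d)) / jnp.sqrt(d)
w2 = jax.random.normal(jax.random.PRNGKey(2), (4 * d, d)) / jnp.sqrt(4 * d)
mlp = lambda h: jax.nn.gelu(h @ w1) @ w2
router_w = jax.random.normal(jax.random.PRNGKey(3), (d,))
y = routed_layer(x, router_w, mlp, capacity=seq // 2)
print(y.shape)  # (8, 16)
```

Because unselected tokens bypass the block entirely, the block's compute scales with `capacity` rather than sequence length, which is where the FLOP savings come from while the maximum cost stays fixed.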
Google DeepMind has open-sourced its differentiable fusion tokamak simulator, written in Python with JAX. The simulator has strong automatic differentiation capabilities and solves the coupled PDEs that govern tokamak plasma transport.
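To illustrate what differentiability buys in this setting, here is a generic JAX sketch (not the simulator's actual API; a 1D heat equation stands in for its transport PDEs): gradients of a simulation output with respect to a physical parameter flow straight through the time-stepping loop.

```python
import jax
import jax.numpy as jnp

def heat_step(u, kappa, dx, dt):
    """One explicit finite-difference step of the 1D heat equation
    du/dt = kappa * d2u/dx2, with fixed (Dirichlet) boundary values."""
    lap = (jnp.roll(u, -1) - 2 * u + jnp.roll(u, 1)) / dx**2
    u_new = u + dt * kappa * lap
    return u_new.at[0].set(u[0]).at[-1].set(u[-1])

def simulate(kappa, u0, dx=0.1, dt=1e-3, steps=100):
    """Roll the solver forward; jax.lax.scan keeps the loop jit- and grad-friendly."""
    def body(u, _):
        return heat_step(u, kappa, dx, dt), None
    u_final, _ = jax.lax.scan(body, u0, None, length=steps)
    return jnp.mean(u_final**2)  # scalar objective on the final state

u0 = jnp.sin(jnp.linspace(0.0, jnp.pi, 64))
# gradient of the objective w.r.t. the diffusivity, through all 100 solver steps
dloss_dkappa = jax.grad(simulate)(1.0, u0)
print(dloss_dkappa)
```

Gradients like this one are what make gradient-based calibration and optimization of simulator parameters possible.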
DeepMind is developing an AI technology called V2A (video-to-audio) to generate synchronized soundtracks for videos. It uses diffusion models trained on audio, dialogue transcripts, and video clips to create music, sound effects, and dialogue.
DeepMind has combined a pre-trained Gemini-style language model with an AlphaGo-style RL algorithm to train a model that solves International Mathematical Olympiad (IMO) problems at silver-medal level. The system solved 4/6 problems in this year's competition.
DeepMind's AI systems, AlphaProof and AlphaGeometry 2, have achieved a significant breakthrough in mathematical reasoning by solving four out of six problems from the International Mathematical Olympiad, earning a silver-medal level score. AlphaProof, a reinforcement learning system, tackled problems in algebra and number theory, including the competition's hardest problem. AlphaGeometry 2, an improved version of its predecessor, handled the geometry problem, proving it within 19 seconds.
This post is a long, in-depth summary of what DeepMind is working on in its AGI safety and alignment research efforts.