Cerebras' new wafer-scale chip can train language models with up to 24 trillion parameters. It natively supports PyTorch.
Tuesday, March 26, 2024

Cerebras, a California-based company, has demonstrated that its second-generation wafer-scale engine is significantly faster than the world's fastest supercomputer at molecular dynamics calculations. It can also perform sparse large language model inference at one-third of the energy cost of a full model without losing any accuracy. Both achievements are possible due to the interconnects and fast memory access enabled by Cerebras' hardware. Cerebras is looking to extend the applications of its wafer-scale engine to a larger class of problems, including molecular dynamics simulations of biological processes and simulations of airflow around vehicles.
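To see where the one-third energy figure could come from, here is a minimal sketch of weight sparsity: if roughly two-thirds of a layer's weights are zeroed out, the multiply-accumulate work (a reasonable proxy for arithmetic energy on hardware that can skip zeros) drops to about one-third of the dense cost. The matrix sizes and sparsity pattern below are illustrative assumptions, not Cerebras' actual sparsification scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dense weight matrix and input (sizes are arbitrary).
W = rng.standard_normal((256, 256))
x = rng.standard_normal(256)

# Zero out roughly two-thirds of the weights (unstructured sparsity).
keep = rng.random(W.shape) >= 2 / 3
W_sparse = np.where(keep, W, 0.0)

# Count multiply-accumulates: dense does one per weight,
# sparsity-aware hardware skips the zeros entirely.
dense_macs = W.size
sparse_macs = int(np.count_nonzero(W_sparse))
ratio = sparse_macs / dense_macs  # roughly 1/3

print(f"dense MACs:  {dense_macs}")
print(f"sparse MACs: {sparse_macs} ({ratio:.2f} of dense)")
```

The sketch only counts arithmetic; the "without losing any accuracy" part depends on how the sparse model is trained or fine-tuned, which the announcement does not detail.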
Cerebras' chipset has massive unified memory. As a result, it can sidestep bandwidth issues and serve models at thousands of tokens per second.
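The bandwidth claim follows from a standard back-of-envelope model: during autoregressive decoding, every generated token requires reading the model's weights once, so token throughput is roughly memory bandwidth divided by model size in bytes. The numbers below are hypothetical, chosen only to show how a large jump in effective bandwidth translates into thousands of tokens per second.

```python
def decode_tokens_per_sec(bandwidth_gb_s: float,
                          params_billions: float,
                          bytes_per_param: float = 2.0) -> float:
    """Upper bound on decode throughput for a memory-bound model:
    each token reads all weights once, so
    tokens/s ~= bandwidth / model_bytes."""
    model_bytes = params_billions * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / model_bytes

# Hypothetical 8B-parameter model in 16-bit weights:
# ~1 TB/s (typical HBM-class GPU bandwidth) vs. a much larger
# aggregate on-chip bandwidth (illustrative figure, not a spec).
gpu_tps = decode_tokens_per_sec(1_000, 8)      # -> 62.5 tokens/s
wafer_tps = decode_tokens_per_sec(100_000, 8)  # -> 6250.0 tokens/s

print(f"HBM-class:   {gpu_tps:.1f} tokens/s")
print(f"wafer-scale: {wafer_tps:.1f} tokens/s")
```

This simple model ignores batching, KV-cache reads, and compute limits, but it captures why keeping weights in fast on-chip memory changes serving speed by orders of magnitude.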