Meta to fast-track release of supercomputer in first half of 2022
Meta, formerly Facebook, said it could be well ahead of competitors when it finishes building its artificial intelligence (AI) supercomputer in the first half of 2022.
The company’s AI Research SuperCluster (RSC) is touted as being among the fastest AI supercomputers running today, and Meta says it will be the fastest when it is fully built in mid-2022.
Supercomputers capable of performing quadrillions of calculations per second (a quadrillion is one followed by 15 zeros) are central to the global technology and information race – one that is particularly fierce between China and the United States.
While a typical computer may have one central processing unit (CPU) with one to 16 cores – individual processing units – a supercomputer can contain thousands or even millions of cores, allowing it to execute very specialised computer operations.
The Fugaku supercomputer, located at the RIKEN Centre for Computational Science in Kobe, Japan, is ranked as the world’s fastest supercomputer by TOP500, which ranks supercomputers around the world. However, Meta sees its AI supercomputer as being capable of carrying out operations never before done by any supercomputer.
The company says its researchers have started using RSC to train large models in natural language processing (NLP) and computer vision for research, with the aim of one day training models with trillions of parameters.
“RSC will help Meta’s AI researchers build new and better AI models that can learn from trillions of examples; work across hundreds of different languages; seamlessly analyze text, images, and video together; develop new augmented reality tools; and much more. Our researchers will be able to train the largest models needed to develop advanced AI for computer vision, NLP, speech recognition, and more,” Meta noted in a statement.
The company’s quest for an AI supercomputer began in 2013 with the creation of the Facebook AI Research lab. AI supercomputers are built by combining multiple GPUs into compute nodes, which are then connected by a high-performance network fabric to allow fast communication between those GPUs.
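The node-and-fabric layout described above can be illustrated with a small sketch. It is a toy model, not Meta's implementation: the `Node` class, the node and GPU counts, and the gradient values are all made up for illustration, and the averaging step merely stands in for the kind of GPU-to-GPU communication a network fabric would carry during training.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """One compute node holding several GPUs (hypothetical model)."""
    gpu_gradients: List[float]  # one illustrative gradient value per GPU


def build_cluster(num_nodes: int, gpus_per_node: int) -> List[Node]:
    """Create nodes whose GPUs each hold a made-up gradient value."""
    return [
        Node([float(n * gpus_per_node + g) for g in range(gpus_per_node)])
        for n in range(num_nodes)
    ]


def all_reduce_mean(nodes: List[Node]) -> float:
    """Average the per-GPU values across every node in the cluster.

    This stands in for the collective communication step; a real system
    would use a dedicated communication library over the network fabric.
    """
    values = [g for node in nodes for g in node.gpu_gradients]
    return sum(values) / len(values)


# Assumed sizes, chosen only to keep the example small.
cluster = build_cluster(num_nodes=4, gpus_per_node=8)
total_gpus = sum(len(node.gpu_gradients) for node in cluster)
mean_gradient = all_reduce_mean(cluster)

# After the exchange, every GPU holds the same averaged value.
for node in cluster:
    node.gpu_gradients = [mean_gradient] * len(node.gpu_gradients)

print(total_gpus, mean_gradient)  # prints: 32 15.5
```

In this toy model, every GPU has to exchange data with all the others at each step, which is why the speed of the connecting network matters as much as the number of GPUs.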
The lab has made significant strides in AI thanks to its leadership in a number of areas, including self-supervised learning, where algorithms can learn from vast numbers of unlabeled examples, and transformers, which allow AI models to reason more effectively by focusing on certain areas of their input.
Fully realizing the benefits of self-supervised learning and transformer-based models in domains such as vision, speech, and language, as well as in critical use cases like identifying harmful content, will require training increasingly large, complex, and adaptable models. Computer vision, for example, needs to process larger, longer videos with higher data sampling rates. Speech recognition needs to work well even in challenging scenarios with lots of background noise, such as parties or concerts. NLP needs to understand more languages, dialects, and accents. And advances in other areas, including robotics, embodied AI, and multimodal AI, will help people accomplish useful tasks in the real world.
High-performance computing infrastructure is a critical component in training such large models, and Meta’s AI research team has been building these high-powered systems for many years. The first generation of this infrastructure, designed in 2017, has 22,000 NVIDIA V100 Tensor Core GPUs in a single cluster that performs 35,000 training jobs a day. Up until now, this infrastructure has set the bar for Meta’s researchers in terms of its performance, reliability, and productivity.
“We hope RSC will help us build entirely new AI systems that can, for example, power real-time voice translations to large groups of people, each speaking a different language, so they can seamlessly collaborate on a research project or play an AR game together. Ultimately, the work done with RSC will pave the way toward building technologies for the next major computing platform — the metaverse, where AI-driven applications and products will play an important role,” the company noted.