Only a week after Nvidia announced its new AI-focused Volta GPU architecture, Google aims to steal some of its thunder with its second-generation Tensor Processing Unit (TPU), which it calls a Cloud TPU. While the first-generation chip was only suitable for inference, and therefore didn’t pose much of a threat to Nvidia’s dominance in machine learning, the new version is equally at home with both the training and running of AI systems.
A new performance leader among machine learning chips
At 180 teraflops (trillion floating-point operations per second), Google’s Cloud TPU packs more punch, at least by that one measure, than the Volta-powered Tesla V100 at 120 teraflops. However, until both chips are available, it won’t be possible to make a real-world comparison. Much as Nvidia has built servers out of multiple V100s, Google has constructed TPU Pods that combine multiple TPUs to achieve 11.5 petaflops (11,500 teraflops) of performance.
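The pod figure is easy to sanity-check with back-of-the-envelope arithmetic, assuming the 64-TPU pod size Google has described:

```python
# Sanity check of the quoted pod throughput
# (assumes a 64-TPU pod, per Google's description).
TPU_TFLOPS = 180      # per second-generation Cloud TPU
TPUS_PER_POD = 64

pod_tflops = TPU_TFLOPS * TPUS_PER_POD
print(pod_tflops)            # 11520 teraflops
print(pod_tflops / 1000)     # 11.52, i.e. roughly the 11.5 petaflops quoted
```

The quoted 11.5 petaflops is simply the per-chip peak multiplied out across the pod, rounded down slightly.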
For Google, this performance is already paying off. As one example, a Google model that required an entire day to train on a cluster of 32 high-end GPUs (probably Pascal) can now be trained in an afternoon on one-eighth of a TPU Pod (a full pod is 64 TPUs, so that means on 8 TPUs). Of course, standard GPUs can be used for all sorts of other things, while the Google TPUs are limited to training and running models written with Google’s tools.
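A rough look at the raw throughput behind that example, with one assumption not in the article: since Google only says "probably Pascal," the GPU figure below uses the Tesla P100's roughly 10.6 single-precision teraflops.

```python
# Rough raw-throughput comparison for Google's training example.
# P100_TFLOPS is an assumption: the article says only "probably Pascal".
TPU_TFLOPS = 180
P100_TFLOPS = 10.6        # assumed Tesla P100 single-precision peak

tpus = 64 // 8            # one-eighth of a 64-TPU pod
gpus = 32

print(tpus)                        # 8 TPUs
print(tpus * TPU_TFLOPS)           # 1440 teraflops on the TPU side
print(gpus * P100_TFLOPS)          # ~339 teraflops on the GPU side
```

Under that assumption, the 8 TPUs offer several times the raw peak throughput of the 32 GPUs, which is consistent with a day's training collapsing into an afternoon, though peak flops alone never tell the whole story.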
You’ll be able to rent Google Cloud TPUs for your TensorFlow applications
Google is making its Cloud TPUs available as part of its Google Compute Engine offering, and says they will be priced similarly to GPUs. That isn’t enough information to say how they will compare in cost with renting time on an Nvidia V100, but I’d expect them to be very competitive. One drawback, though, is that the Google TPUs currently support only TensorFlow and Google’s tools. As powerful as they are, many developers will not want to be locked into Google’s machine learning framework.
Nvidia isn’t the only company that should be worried
While Google is making its Cloud TPU available as part of its Google Compute Engine cloud, it hasn’t said anything about offering the chip outside its own server farms. So it isn’t competing with on-premises GPUs, and it certainly won’t be available on competing clouds from Microsoft and Amazon. If anything, it is likely to push those companies to deepen their partnerships with Nvidia.
The other company that should probably be worried is Intel. It has been woefully behind in GPUs, which means it hasn’t made much of a dent in the rapidly growing market for GPGPU (general-purpose computing on GPUs), of which machine learning is a huge part. This is just one more way that chip dollars that could have gone to Intel won’t.
Big picture, more machine learning applications will be moving to the cloud. In some cases, if you can tolerate being preempted, it’s already less expensive to rent GPU clusters in the cloud than to power them locally. That equation is only going to get more lopsided as chips like the Volta and the new Google TPU are added to cloud servers. Google knows that the key to increasing its share of that market is having more leading-edge software running on its chips, so it is making 1,000 Cloud TPUs available free to researchers willing to share the results of their work.