Artificial intelligence (AI) is a key driver of digital transformation and innovation across many industries. However, building a powerful AI system requires large amounts of data and computing resources. Many AI companies rely on Western hardware such as Nvidia GPUs to train their large language models.
Huawei has responded to US sanctions by developing its own AI processor and framework. It has launched the Ascend 910, which it claims is the world’s most powerful AI processor for training.
- According to Huawei, the Ascend 910 delivers 256 TFLOPS of FP16 performance and 512 TOPS of INT8 performance with a maximum power consumption of 310 W
- That is about twice the FP16 performance of Nvidia’s Tesla V100, which delivers 112 to 125 TFLOPS for deep learning
- It is built on Huawei’s own Da Vinci architecture and is supported by the MindSpore framework, which is designed to make AI development easier, more efficient and more secure
- It is aimed at data centres and supports various scenarios such as edge computing and autonomous driving
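The "about twice" comparison in the list above can be checked with simple arithmetic. The sketch below uses only the peak figures quoted in the list; the function and constant names are illustrative, and vendor peak numbers are not measured benchmark results.

```python
# Peak figures quoted in the list above (vendor numbers, not measurements).
ASCEND_910_FP16_TFLOPS = 256.0
ASCEND_910_MAX_POWER_W = 310.0
V100_FP16_TFLOPS_RANGE = (112.0, 125.0)  # low and high ends of the quoted range


def speedup_range(new_tflops, baseline_range):
    """Return the (smallest, largest) ratio of new_tflops to a baseline range."""
    low, high = baseline_range
    # Dividing by the larger baseline gives the smaller ratio, and vice versa.
    return new_tflops / high, new_tflops / low


def tflops_per_watt(tflops, watts):
    """Peak throughput per watt of maximum power draw."""
    return tflops / watts


if __name__ == "__main__":
    smallest, largest = speedup_range(ASCEND_910_FP16_TFLOPS, V100_FP16_TFLOPS_RANGE)
    efficiency = tflops_per_watt(ASCEND_910_FP16_TFLOPS, ASCEND_910_MAX_POWER_W)
    print(f"FP16 ratio vs V100: {smallest:.2f}x to {largest:.2f}x")
    print(f"Ascend 910 peak FP16 efficiency: {efficiency:.2f} TFLOPS/W")
```

Depending on which end of the V100 range is used, the ratio falls a little above 2x, which is consistent with the "about twice" claim.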
Huawei has also released PanGu-Σ, a trillion-parameter language model that was trained on a cluster of Ascend 910 AI processors using the MindSpore framework.
PanGu-Σ is an extension of PanGu-α, a 200-billion-parameter language model that Huawei released previously. PanGu-Σ inherits parameters from PanGu-α and expands the model with more experts and further training tokens. Experts are sub-networks that specialize in different aspects of natural language processing. Tokens are units of text that represent words or characters.
PanGu-Σ uses Random Routed Experts (RRE), a two-level routing scheme: experts are first grouped by domain or task, and each token is then assigned to an expert within a group by a fixed random mapping rather than a learned gating function. This reduces the communication cost among experts and keeps the model sparse. PanGu-Σ also uses Expert Computation and Storage Separation (ECSS) to separate the computation and storage of experts across different devices. This allows different experts on different devices to process multiple tokens in parallel.
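To make the routing idea concrete, here is a toy sketch. It assumes a two-level scheme in which a group (for example a domain) selects a block of experts and a fixed, untrained random table selects one expert inside that block. The function names, table layout and sizes are hypothetical and are not taken from Huawei's implementation, which may differ in detail.

```python
import random


def build_routing_table(vocab_size, experts_per_group, seed=0):
    """Create a fixed random map from token id to an expert slot in a group.

    The table is drawn once and never trained, so there is no gating network.
    """
    rng = random.Random(seed)
    return [rng.randrange(experts_per_group) for _ in range(vocab_size)]


def route_token(token_id, group_id, table, experts_per_group):
    """Return the global index of the expert that handles this token.

    Level 1: group_id (e.g. a domain or task) selects a block of experts.
    Level 2: the fixed table selects one expert inside that block.
    """
    return group_id * experts_per_group + table[token_id]


if __name__ == "__main__":
    EXPERTS_PER_GROUP = 4  # hypothetical sizes, for illustration only
    VOCAB_SIZE = 10
    table = build_routing_table(VOCAB_SIZE, EXPERTS_PER_GROUP, seed=42)
    for token_id in (3, 7, 3):
        expert = route_token(token_id, group_id=1, table=table,
                             experts_per_group=EXPERTS_PER_GROUP)
        print(f"token {token_id} in group 1 -> expert {expert}")
```

Because the table is fixed, the same token in the same group always reaches the same expert, and no routing scores need to be computed or exchanged at run time.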
Using these techniques, PanGu-Σ achieved a 6.3-fold increase in training throughput through heterogeneous computing compared to dense models. The cluster consisted of 1024 Ascend 910 AI processors, each rated at 256 TFLOPS for half-precision (FP16) floating-point operations. The MindSpore framework provided distributed training support for heterogeneous computing.
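The cluster figures above imply an aggregate peak rate that is easy to work out. The sketch below multiplies the quoted processor count by the quoted per-processor rate, and shows what a 6.3-fold throughput gain means for wall-clock time on a fixed number of tokens. It assumes peak rates and perfect scaling, which real training runs do not reach; the function names are illustrative.

```python
NUM_PROCESSORS = 1024            # as quoted in the text
FP16_TFLOPS_PER_PROCESSOR = 256  # as quoted in the text
THROUGHPUT_GAIN = 6.3            # as quoted in the text


def cluster_peak_pflops(num_processors, tflops_per_processor):
    """Aggregate peak throughput in PFLOPS (1 PFLOPS = 1000 TFLOPS)."""
    return num_processors * tflops_per_processor / 1000.0


def relative_training_time(throughput_gain):
    """Fraction of the baseline wall-clock time needed for the same tokens."""
    return 1.0 / throughput_gain


if __name__ == "__main__":
    peak = cluster_peak_pflops(NUM_PROCESSORS, FP16_TFLOPS_PER_PROCESSOR)
    fraction = relative_training_time(THROUGHPUT_GAIN)
    print(f"Cluster peak FP16 throughput: {peak:.3f} PFLOPS")
    print(f"Time for the same tokens: {fraction:.1%} of the baseline")
```

On these numbers the cluster peaks at roughly 262 PFLOPS, and a 6.3-fold throughput gain cuts training time for the same data to about one sixth of the baseline.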
PanGu-Σ reportedly achieves state-of-the-art zero-shot performance on a range of Chinese natural language processing downstream tasks. Zero-shot learning means that the model can perform well on a new task without task-specific fine-tuning or labeled examples. PanGu-Σ can handle tasks such as open-domain dialogue, question answering, machine translation and code generation.
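The difference between zero-shot and few-shot use is easiest to see in how the prompt is built. The sketch below is a generic illustration: the "Input:/Output:" layout and the function names are assumptions for this example and are not PanGu-Σ's actual prompt format.

```python
def zero_shot_prompt(instruction, query):
    """Build a prompt with a task instruction and the query, but no examples."""
    return f"{instruction}\nInput: {query}\nOutput:"


def few_shot_prompt(instruction, examples, query):
    """Same task, with worked (input, output) pairs placed before the query."""
    shots = "".join(f"Input: {x}\nOutput: {y}\n" for x, y in examples)
    return f"{instruction}\n{shots}Input: {query}\nOutput:"


if __name__ == "__main__":
    task = "Translate the sentence into English."
    print(zero_shot_prompt(task, "你好"))
    print("---")
    print(few_shot_prompt(task, [("谢谢", "Thank you")], "你好"))
```

In the zero-shot case the model sees only the instruction and the query, so any correct answer has to come from what it learned during pre-training.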
According to a research paper, PanGu-Σ can generate Python code from natural language descriptions using zero-shot learning. Here is an example from the paper:
Input: Write a function that takes a list of numbers and returns the sum of its elements.
Output:
def sum_list(numbers):
    # initialize a variable to store the sum
    total = 0
    # loop through each element in the list
    for num in numbers:
        # add the element to the sum
        total += num
    # return the sum
    return total
Another possible output:
def sum_list(numbers):
    # use the built-in sum function to calculate the sum of the list
    return sum(numbers)
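Generated code is normally checked by running it. As a minimal sketch of such a check, the two outputs above are reproduced here under hypothetical names (sum_list_loop and sum_list_builtin) so that both can live in one file, and are compared on a few hand-picked inputs. The test cases are illustrative and are not from the paper.

```python
def sum_list_loop(numbers):
    """First output: accumulate the sum with an explicit loop."""
    total = 0
    for num in numbers:
        total += num
    return total


def sum_list_builtin(numbers):
    """Second output: delegate to Python's built-in sum."""
    return sum(numbers)


# Hand-picked inputs, including the empty list and values that cancel out.
TEST_CASES = [[], [1, 2, 3], [-5, 5], [10], [0.5, 0.25]]


def outputs_agree(cases):
    """Return True if both implementations give the same result on every case."""
    return all(sum_list_loop(case) == sum_list_builtin(case) for case in cases)


if __name__ == "__main__":
    print("Both outputs agree:", outputs_agree(TEST_CASES))
```

Agreement on a handful of inputs does not prove correctness, but it catches the most common failures, such as mishandling an empty list.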
Moreover, PanGu-Σ demonstrates strong abilities when fine-tuned on application data for these tasks. For example, when fine-tuned on open-domain dialogue data from Weibo (a Chinese social media platform), PanGu-Σ can generate coherent and engaging responses that cover diverse topics such as sports, entertainment, health, education and politics.
Huawei has created its own ecosystem of AI products and services that meet the needs of various scenarios and customers.
Huawei’s achievement is all the more notable given the US sanctions imposed on it since May 2019, when the Trump administration accused the company of threatening national security and working with the Chinese Communist Party, accusations that Huawei denies. The sanctions prevent Huawei from doing business with American companies such as Google and from buying components, such as chips and software, that are made using US technology.
The sanctions have been widely criticized as an unfair trade practice aimed at stifling China’s technological rise while protecting America’s hegemony over global markets. The Biden administration has continued to enforce the policy despite calls for dialogue from China and from other affected countries. The sanctions have harmed not only Huawei but also American companies, which have lost a major customer and a potential partner in innovation, and consumers around the world, who have lost access to Huawei’s high-quality, affordable products.
Huawei has not given up in the face of these challenges. Instead, it has invested more in research and development and built stronger partnerships with countries that share its vision of an open digital world.