Abu Dhabi Archives — Carrington Malin

September 1, 2023

G42 Group’s Inception, Mohamed bin Zayed University of Artificial Intelligence and Cerebras Systems announced a 13 billion parameter bilingual Arabic-English large language model, trained in just 21 days.

The supercomputer Condor Galaxy 1, developed by US-based AI chip maker Cerebras Systems and announced just a few weeks ago, was recently used to train a new 13 billion parameter bilingual Arabic-English large language model (or LLM) called Jais. It allowed researchers to complete the ‘production training’ of the new AI model in 21 days, a process that could have taken several months on alternative high performance computer systems.

“It’s common for LLMs to take months to train, but Jais was trained in just 21 days.”

Arabian Gulf Business Insight (AGBI) asked me to comment on the development and the promise of the Cerebras-G42 collaboration to build the world’s biggest supercomputer network.

It’s actually a complex topic, because of not only the speed of development of new artificial intelligence models and the AI-friendly high performance computer systems that run them, but also the rapid rise of Abu Dhabi’s AI R&D ecosystem. Abu Dhabi-based researchers have now developed a series of different LLMs, including Falcon 40B, which was ranked first on Hugging Face’s index of open source LLMs earlier this year.

It is no wonder that G42 has decided to invest in the latest supercomputers to provide for the growing needs of AI researchers. As a result of the expertise gained, both at home and via collaborations such as the one with Cerebras Systems, Abu Dhabi technology organisations are gaining world-class capabilities that they could sell globally. The demand for both AI models and the computers that train them is only going to grow!

You can read UAE-based journalist Megha Merani’s full story in AGBI here.

Meanwhile, you can read my article on Inception’s new Jais 13B LLM here:

Will GenAI champion the Arabic language? (Middle East AI News)

March 18, 2023

In a week full of big technology news, globally and regionally, the Abu Dhabi government’s leading applied research centre has announced that it has developed one of the highest performing large language models (or LLMs) in the world. It’s a massive win for the research institute and proof positive for Abu Dhabi’s fast-growing R&D ecosystem.

Technology firms have gone to great lengths since the public beta of ChatGPT was introduced in November to make it clear that OpenAI’s GPT series of models are not the only highly advanced LLMs in development.

Google in particular has moved fast to try and demonstrate that its supremacy in search is not under threat from OpenAI’s models. On Tuesday, the search giant also announced a host of AI features that will be brought to its Gmail platform, apparently in answer to Microsoft’s plans to leverage both its own AI models and OpenAI’s in products across its portfolio. However, on the same day, OpenAI revealed that it has begun introducing its most advanced and human-like AI model to date: GPT-4.

Whatever way you look at it, OpenAI’s GPT-3 was an incredible feat of R&D. It has many limitations, as users of its ChatGPT public beta can attest, but it also showcases powerful capabilities and hints at the future potential of advanced AI models. The triumph for OpenAI though, and perhaps the whole AI sector in general, was the enormous publicity and public recognition of AI’s potential. Now everyone thinks they understand what AI can do, even though they are sure to be further educated by GPT-4 and the new wave of applications built on new advanced AI models heading their way.

So, what does this emerging wave of LLMs mean for other research labs and R&D institutions developing their own AI models around the world? To begin with, the bar to entry into LLMs is set high, in terms of both the technology and the budget required.

Advanced AI models today are trained, not born. GPT-3 was trained on hundreds of billions of words, numbers and computer code. According to a blog from San Francisco-based Lambda Labs, training GPT-3 might take 355 ‘GPU-years’ at a cost of $4.6 million for a single training run. Meanwhile, running ChatGPT reportedly costs OpenAI more than $100,000 per day. R&D labs competing in the world of LLMs clearly need deep pockets.
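As a rough back-of-the-envelope check, the figures quoted above can be turned into an implied hourly GPU price and an annual running cost. This is only a sketch: it assumes a 365-day year and takes the quoted figures at face value, and the variable names are my own rather than anything from Lambda Labs or OpenAI.

```python
# Back-of-the-envelope arithmetic using the figures quoted in the text.
# Assumption: a year is 365 days (8,760 hours); leap years are ignored.
HOURS_PER_YEAR = 365 * 24

gpu_years = 355                    # quoted estimate for one GPT-3 training run
training_cost_usd = 4_600_000      # quoted cost of that single run
chatgpt_daily_cost_usd = 100_000   # reported lower bound for running ChatGPT

# Convert GPU-years to GPU-hours, then derive the implied price per GPU-hour.
gpu_hours = gpu_years * HOURS_PER_YEAR
implied_rate_usd = training_cost_usd / gpu_hours

# Scale the reported daily running cost up to a full year.
annual_running_cost_usd = chatgpt_daily_cost_usd * 365

print(f"GPU-hours for one training run: {gpu_hours:,}")
print(f"Implied price per GPU-hour: ${implied_rate_usd:.2f}")
print(f"Running cost per year: ${annual_running_cost_usd:,}")
```

On these numbers, a year of running the service would cost several times more than a single training run, which is why deep pockets matter long after training is finished.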

Then, although it perhaps goes without saying, institutions planning to develop breakthrough LLMs must also have the right talent. And of course, global competition for the type of top researchers needed to develop new AI models is fierce, to say the least!

Just as critical as having the right talent and the right budget is having the right vision. R&D institutions are often bogged down in bureaucracy, while most tech firms are, necessarily, focused on short-term rewards. In this game, to win, the players must have the vision to invest in developing AI models that are ‘ahead of the curve’ and the commitment to stick with it.

Therefore, for those following Abu Dhabi’s R&D story, it is not an entirely unexpected discovery that the Technology Innovation Institute (TII) has been investing heavily in the development of LLMs.

Formed in 2020 as the applied research arm of the Abu Dhabi Government’s Advanced Technology Research Council, TII was founded to deliver discovery science and breakthrough technologies that have a global impact. An AI research centre was created in 2021, now called the AI and Digital Science Research Centre (AIDRC), to both support AI plans across the institute’s domain-focused labs and develop its own research. Overall, TII now employs more than 600 people, based at its Masdar City campus.

This week TII announced the launch of Falcon LLM, a foundational large language model with 40 billion parameters, developed by the AIDRC’s AI Cross-Centre Unit. The unit’s team previously built ‘NOOR’, the world’s largest Arabic natural language processing (NLP) model, announced less than one year ago.

However, Falcon is no copy of GPT, or of other LLMs recently announced by global research labs, and has innovations of its own. Falcon uses only 75 percent of GPT-3’s training compute (i.e. the amount of computing resources needed), 80 percent of the compute required by Google’s PaLM-62B and 40 percent of that required by DeepMind’s Chinchilla AI model.
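To put those percentages another way, they can be inverted to show how much more training compute each of the other models needed relative to Falcon. This is a minimal sketch that takes TII’s stated percentages at face value; the dictionary and variable names are hypothetical and purely for illustration.

```python
# Falcon's training compute as a share of each model's, as stated by TII.
falcon_share_of = {
    "GPT-3": 0.75,
    "PaLM-62B": 0.80,
    "Chinchilla": 0.40,
}

# Invert each share: if Falcon used 40% of a model's compute,
# that model used 1 / 0.40 = 2.5 times Falcon's compute.
relative_compute = {name: 1 / share for name, share in falcon_share_of.items()}

for name, multiple in relative_compute.items():
    print(f"{name} used about {multiple:.2f}x Falcon's training compute")
```

Seen this way, the gap with Chinchilla is the most striking of the three comparisons.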

According to TII, Falcon’s superior performance is due to its state-of-the-art data pipeline. The AI model was kept relatively modest in size, but was trained on data of unprecedented quality.

Although no official third party ranking has been published yet, it is thought that Falcon will rank in the world’s top 5 large language models in a classical benchmark evaluation and may even rank number one for some specific benchmarks (not counting the newly arrived GPT-4).

Large language models have proved to be good at generating text, creating computer code and solving complex problems. The models can be used to power a wide range of applications, such as chatbots, virtual assistants, language translation and content generation. As demonstrated by OpenAI’s ChatGPT public beta testing, they can also be trained to process natural language commands from humans.

Now that the UAE has one of the best and highest performing large language models in the world, what is the potential impact of TII’s Falcon LLM?

First, like all LLMs, Falcon could be used for a variety of applications. Although plans for the commercialisation of the new model have not been announced, Falcon could provide a platform for both TII and potential technology partners to develop new use cases across many industry sectors and many functional areas. For development teams in the region, it’s a plus to have the core technology developer close at hand.

Second, Falcon also has technological advantages that businesses and government organisations can benefit from, which aren’t available via existing global platforms. The model’s economical use of compute means that it lends itself to use as an on-premise solution, far more than other models that use more system capacity. In addition, if you’re a government organisation, implementing that on-premise solution means that no national data is going to be transferred outside of the country for processing.

Finally, Falcon is intellectual property developed in the UAE and a huge milestone for a research institute that is less than three years old. The emirate is funding scientifically and commercially significant research and attracting some of the brightest minds from around the world to make it happen.

Of equal importance, if Falcon is anything to go by, at both a government policy level and an institutional level, Abu Dhabi has the vision and the drive to develop breakthrough research.

I don’t think that yesterday’s announcement will be the last we will hear about Falcon LLM! Stay tuned!

This article first appeared in Middle East AI News.

January 25, 2021

Despite the economic pressures of the past few years and the disruption of the pandemic, there is so much going on in tech in the Middle East at the moment. So, there was no shortage of material for Damian Radcliffe’s annual Middle East technology predictions story in ZDNet, which quoted me and others from the region’s tech ecosystem on a wide variety of trends including 5G, emerging technologies, government investment, startups, smart cities, open data and cybersecurity.

Prior to the pandemic, IDC forecast that investments in digital transformation and innovation would account for 30 percent of all IT spending in the Middle East, Turkey, and Africa (META) by 2024, up from 18 percent in 2018. Meanwhile, it has predicted that government enterprise IT spending in META will top $8 billion in 2021.

During the past 12-18 months we have seen significant activity in several key areas of government spending, including digital transformation, creating Government Clouds, introducing open data policies and platforms, digital services and robotics. Then there was the Saudi Data and Artificial Intelligence Authority (SDAIA) announcement of the Kingdom’s National Strategy for Data & AI (NSDAI) in October, revealing plans to raise $20 billion in investment for data and AI initiatives.

My expectation is that some of the government digital platforms and initiatives that have been created over the past 18 months will support the launch of a variety of new initiatives, local and foreign investment, public-private sector partnerships and opportunities for startups during 2021.

You can read Damian’s full article on what 2021 means for tech in the Middle East here.

May 31, 2020

Abu Dhabi is moving its R&D strategy up a gear with the formation of a new Advanced Technology Research Council, or ATRC, to be headed up by DarkMatter founder Faisal Albannai.

With a growing variety of R&D initiatives driven by the likes of ADNOC, DarkMatter, Group 42, Inception Institute of Artificial Intelligence and others, Abu Dhabi is starting to create a significant R&D ecosystem. Last year, Abu Dhabi Investment Office and SenseTime announced that the AI unicorn would open an EMEA R&D centre in Abu Dhabi employing 600 engineers. More recently, ADQ launched a $300 million startup fund aiming to bring promising Asian startups to set up in Abu Dhabi. The mix of big tech, new startups and government-backed R&D initiatives could turn out to be a magic combination.

Read the full article in The National here.

G42’s supercomputer fast-tracks 40 billion parameter LLM training

Can UAE-built Falcon rival global AI models?

What will 2021 mean for tech in the Middle East?

Abu Dhabi’s new advanced tech council will shift R&D ‘into high gear’