In late 2022, OpenAI released ChatGPT, a chatbot built on its GPT-3.5 model, marking the beginning of a new era for generative AI. The chatbot could answer questions, generate creative content, and even mimic the style and tone of famous people.

About a year later, on December 6, 2023, Google CEO Sundar Pichai announced Gemini, a new large language model (LLM) and the company's most significant release since the launch of Bard, its own ChatGPT-like application. The announcement signals Google's ambition to compete with OpenAI and Microsoft in the LLM arms race.

In the past 12 months, the world has seen explosive growth in generative AI technology. Tech giants around the world have been racing to develop and release new LLMs, and new AI startups have continued to emerge.

According to a report by Canalys, a global technology research and advisory firm, as of July 2023, a total of 268 LLMs had been released globally, with 130 from China and 138 from the rest of the world.

Although generative AI grew explosively in 2023, that pace of development is unlikely to be sustainable in the long term.

Jamyn Edis, a professor at New York University with over 25 years of experience in the technology and media industries, believes that the field will slow down as the amount of available data reaches its limits.

"You need data to train machines, and at some point, you're going to start hitting the edge of the horizon as we seek to ingest more and more text, images, video, and other media formats and datasets," he said.

As for the future of generative AI, many in the industry see multimodality as the way forward. It has been a key battleground for LLMs since OpenAI released GPT-4V.

Other analysts believe that one of the battlegrounds for generative AI will shift to the creation and integration of AI ecosystems. According to an article on Nvidia's website predicting AI trends for 2024, advances in LLM research will increasingly be applied to commercial and enterprise applications. AI capabilities such as retrieval-augmented generation (RAG), autonomous intelligent agents, and multimodal interaction will become more accessible and be deployable on almost any platform.


Editor: Alexander