Leader of Computas’ new AI department: “This is the most important AI news so far in 2024”

Simon Isaksen has, after nearly three years in the company, become the leader of Computas’ new AI department. Photo: Computas.

This article was first published on digi.no.

Every time you blink, something happens on the AI front that could change the way you work. This is especially true for generative AI, where Google is now challenging OpenAI’s ChatGPT-4 with its Gemini 1.5. Alongside the large language models, more small language models are emerging, tech stocks are soaring, and text-generated videos are leaping towards a new reality.

The IT company Computas has been working with AI since the mid-80s and has always had strong expertise in the field. Recently, Computas decided to gather employees with specialized skills in AI into a separate department, which now consists of 40 advisors and consultants.

This is a company-wide initiative based on extremely strong interest in the market. This interest is partly because AI models have improved, and partly because AI as a tool has become easier to use.

“In due course, most digitalization projects will have an element of AI in them. At the very least, developers will use AI in their work. And even though we work on digitalization in much the same way as before, AI will be an increasingly large part of our toolkit,” says Simon Isaksen, who leads Computas’ new AI department.

In Brazil, Simon Isaksen often chats with ChatGPT during his stroller walks. Photo: private

Isaksen has sorted through the AI news so far this year and believes there are several trends that will have significant importance.

Keywords: multimodality, larger context window, and increase in intelligence.

The Advantage of Small Language Models

The large language models are becoming increasingly powerful. At the same time, small language models are emerging.

“Google has launched a new series of these called Gemma. Lightweight models are not as general in their intelligence, but they are faster, cheaper, and more sustainable in the sense that they are less energy-intensive. For many use cases, you don’t need more.”

Another advantage of models like Gemma is that they can run locally on mobile phones and PCs.

“This opens up new possibilities: finally, a Siri variant that works, for example. If you have very sensitive data, you can also run these models locally instead of sharing the data with cloud providers.”

Nvidia’s New Stock Record

An illustration of the tremendous momentum AI currently has is how the value of graphics card manufacturer Nvidia’s shares has evolved.

“At the end of February, Nvidia set a new record for a single-day increase in value: 277 billion dollars, a 16 percent rise. For comparison, Equinor’s total value is 78 billion dollars,” says Isaksen.

“This is largely driven by the major technology companies. They have a lot of money and need something to spend it on. Investing in AI infrastructure is seen as a good investment.”

A much smaller company that has received a lot of attention is Groq. Its chip architecture allows large language models to be run more efficiently.

“It’s not just resource-intensive to train the large language models, it’s also resource-intensive to use them. When new solutions emerge that can streamline their use, it gets noticed.”

A lot of new things are happening on the chip front in 2024. Photo: AdobeStock

Video Faster than Expected

With Google’s new Lumiere, text-to-video and image-to-video have become eerily convincing. For now, Lumiere can only generate five-second video snippets, but with the way Lumiere handles time and space, new horizons will open up.

OpenAI’s answer to this is called Sora. The demo version can generate videos up to one minute long.

More lifelike generations make it increasingly difficult to distinguish what is real and what is AI-created. With better technology, the need for ethical guidelines and proper labeling of AI-generated content increases.

“This development has happened faster than I imagined. Recently, Google also launched the demo model Genie, which generates simple games.”

This will enable everyone to create their own content, even in the gaming world.

“For example, I look forward to exploring different parts of the Star Wars universe,” Isaksen smiles.

What Gemini Is Better At

Gemini 1.5 is currently only available to selected users, but this may change soon. Photo: Google

Like ChatGPT, Gemini exists in both a free version and a paid version. When comparing them, it’s worth noting a couple of fundamental differences.

“The Gemini model is more multimodal than ChatGPT. For example, if you ask ChatGPT-4 to generate an image, it uses DALL-E 3, which is another AI model. Gemini processes text, code, sound, image, and video all in one model, opening up some new possibilities.”

One possible use is as a diagnostic tool for doctors. The model can take input in the form of patient records, blood tests, CT scans, etc., detect patterns across them, and thus provide more accurate diagnoses.

“From a developer’s perspective, multimodality means it becomes easier to use AI. Image recognition could be set up in a simpler way, and could be connected to predictive maintenance, among other things. In insurance cases, images could be interpreted into text, simplifying the work. The use cases are many.”

The size of the context window is another difference between the large language models of OpenAI and Google. Gemini 1.5 has a context window of 1 million tokens, which corresponds to 700,000 words. For comparison, a novel has around 90,000 words.
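As a rough back-of-the-envelope check, the figures above can be turned into a words-per-token ratio and a count of novels per context window. This is only a sketch: the constants are the numbers quoted in the article, the variable names are illustrative, and real token-to-word ratios vary with language and tokenizer.

```python
# Figures quoted in the article (approximate; actual ratios depend on
# the language and the tokenizer used).
CONTEXT_TOKENS = 1_000_000   # Gemini 1.5 context window, in tokens
CONTEXT_WORDS = 700_000      # roughly equivalent number of words
NOVEL_WORDS = 90_000         # typical novel length used for comparison

words_per_token = CONTEXT_WORDS / CONTEXT_TOKENS
novels_in_window = CONTEXT_WORDS / NOVEL_WORDS

print(f"words per token: {words_per_token:.2f}")
print(f"novels per context window: {novels_in_window:.1f}")
```

Under these assumptions, the window holds between seven and eight novels' worth of text.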

“This is five times as much as ChatGPT-4. And the most impressive thing is that Gemini has managed to achieve this without losing much precision. Previously, it has been a challenge to keep models accurate as the context window expands.”

The result is that it becomes easier to use the model for larger tasks.

“A common use case we see today is using language models to search and chat with a company’s own data. If you have a small context window, you have to do a lot to work around it. The larger the context window, the smaller this need becomes.”
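One simple form such a workaround can take is splitting a long document into pieces that each fit in the window. The following is a minimal sketch and not Computas’ method: the function name, the word-based size limit, and the overlap parameter are assumptions made for illustration. Real systems count tokens rather than words and usually add a retrieval step to pick the relevant pieces.

```python
def split_into_chunks(words, max_words, overlap=0):
    """Split a list of words into chunks of at most max_words words.

    Consecutive chunks share `overlap` words so that text cut at a
    chunk boundary still appears with some surrounding context.
    (Hypothetical helper, for illustration only.)
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be in the range [0, max_words)")

    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + max_words])
        # Stop once the current chunk has reached the end of the text.
        if start + max_words >= len(words):
            break
    return chunks


document = ("word " * 10).split()

# A small window forces several chunks; a large one needs only one.
small_window = split_into_chunks(document, max_words=4, overlap=1)
large_window = split_into_chunks(document, max_words=10)
print(len(small_window), len(large_window))
```

With a window large enough to hold the whole document, the splitting step disappears entirely, which is the point made in the quote above.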

The Model That Understands Dialects

The National Library’s NB-Whisper, based on OpenAI’s Whisper, is a novelty that, unlike the large international language models, is capable of delivering accurate speech recognition in Norwegian — including dialects.

“ChatGPT-4 is very good in English, but it tends to respond in Swedish if I speak Norwegian to it,” says Isaksen in a distinctive Stavanger dialect.

He emphasizes that there is a trend towards localized models tailored to the language, culture, and norms of a specific area. More and more countries are getting their own version of a large language model. These are significantly better than the existing solutions.

“The National Library is among the world leaders in digitizing content. Whisper is especially useful for writing transcripts and summaries from audio recordings. Use cases could be related to journaling, police interrogations, job meetings, news coverage, and for people with reading and writing difficulties. Perhaps also voice control ‘in the field,’ when you need your hands for something else.”

Moreover, rumors have it that we will get a new version of Apple’s Siri this fall, with improved voice recognition in addition to upgraded AI.

“I assume that initially it will be best in English, and then it will take some time before it works as well in Norwegian. But this will probably happen considerably faster than we are used to.”

Predictive maintenance is one of many use cases for AI technology. Photo: AdobeStock

Moving Forward in 2024

Isaksen believes that the high pace on the AI front will continue. These are his expectations for the rest of 2024:

  1. The models will improve on all dimensions. The level of intelligence will increase, and they will become faster and cheaper to use.
  2. We will see more and more evidence of the efficiency gains of AI. Klarna is a good example of this, where their new AI assistant has achieved impressive results in just one month.
  3. 2024 could be the year where we get the first major AI-native application. What will be AI’s versions of Uber, Instagram, and WhatsApp?

At the same time, Isaksen emphasizes that impressive demos and presentations only show part of the truth about AI.

“To succeed with AI, one must test and learn. It’s not just about understanding the possibilities of the technology, but also about having relevant data. Of course, one must also be careful regarding ethical and legal issues,” says Isaksen, adding that it may be wise to start with internal processes before going external.

“It’s also not the case that AI solves everything. It must be put into context to be used properly. If you manage this, AI can provide significant benefits,” he concludes.
