A team of AI researchers at Apple recently said that the AI system they developed, Reference Resolution As Language Modeling (ReALM), outperforms GPT-4 on certain query tasks, and the team has released a paper detailing the new system and its unique information acquisition capabilities. The results showcase Apple's latest advances in AI and could have a significant impact on future AI technologies and applications.
Over the past few years, large language models (LLMs) such as OpenAI's GPT-4 have become a dominant force in computing, with companies competing fiercely to enhance their own products and attract more users. In this race, Apple's pace has appeared relatively slow - its intelligent assistant Siri has not seen significant enhancements to its AI capabilities. In this latest effort, however, Apple's research team claims that the ReALM system isn't just trying to keep up with the competition - on certain specific types of query tasks, it outperforms every other large language model currently available to the public.
In their paper, Apple's research team explains how the model is able to answer user questions more accurately because it can resolve ambiguous on-screen references and draw on conversational and contextual information. In other words, ReALM can analyze the content of the user's screen at the time of a query, looking for clues about the user's actions prior to asking the question.
The model can also query other processes running on the device to better understand the context leading up to the user's question. By incorporating this information into traditional LLM processing techniques, the team claims that their system can more effectively provide users with the information they need.
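As a rough illustration of this idea (not Apple's actual implementation), the sketch below shows how conversational history, screen-derived entities, and a user request might be folded into a single text prompt for a standard LLM. The function name, prompt format, and example data are all hypothetical.

```python
# Hypothetical sketch: combining conversational history and screen-derived
# context into one text prompt, so a text-only LLM can resolve references
# such as "call the bottom one".

def build_prompt(conversation: list[str], screen_entities: list[str], query: str) -> str:
    """Assemble a single prompt string from the available context."""
    lines = ["Conversation so far:"]
    lines += [f"  {turn}" for turn in conversation]
    lines.append("Entities currently visible on screen:")
    # Each entity is numbered so the model can answer with an index.
    lines += [f"  [{i}] {entity}" for i, entity in enumerate(screen_entities)]
    lines.append(f"User request: {query}")
    lines.append("Which entity does the request refer to? Answer with its index.")
    return "\n".join(lines)

if __name__ == "__main__":
    prompt = build_prompt(
        conversation=["User: find pharmacies near me",
                      "Assistant: I found three pharmacies nearby."],
        screen_entities=["Walgreens - 555-0134", "CVS - 555-0178", "Rite Aid - 555-0192"],
        query="Call the bottom one",
    )
    print(prompt)  # This prompt would then be passed to the language model.
```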
The team also stated that they have tested the ReALM system against several LLMs, including GPT-4, and claimed that ReALM outperformed all other models on certain specific types of tasks. Additionally, they mention that Apple plans to integrate ReALM into Siri on its devices, allowing Siri to provide more accurate answers. However, this feature may be limited to users who have upgraded to the upcoming iOS 18 release this summer.
The main highlights of this AI model are as follows:
- ReALM is able to convert visual information and contextual content on the screen into a text format that the AI can understand. This process involves labeling the various elements on the screen so that the model has explicit contextual information about the function and importance of each one (a rough sketch of this idea follows this list).
- The ReALM models are significantly smaller than other visual language models, even the largest ReALM-3B variant. The smallest model, with only 80 million parameters, still delivers a performance gain of more than 5% when resolving references to on-screen information. This modest size allows the models to run on small devices such as smartphones.
- ReALM's smallest model significantly outperforms GPT-3.5, and its performance matches that of GPT-4. The larger models outperform GPT-4 across all domains, making ReALM a more effective solution.
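The first highlight above, flattening what is on screen into plain text, could look roughly like the following sketch. The ScreenElement structure, the top-to-bottom ordering, and the tag format are illustrative assumptions, not the paper's exact encoding.

```python
# Hypothetical sketch: turning on-screen UI elements into a labeled text
# representation that preserves their rough layout, so a text-only model
# can "see" the screen.

from dataclasses import dataclass

@dataclass
class ScreenElement:
    text: str   # visible text of the element (button label, phone number, ...)
    kind: str   # coarse role of the element, e.g. "button", "phone", "address"
    x: float    # horizontal position on screen
    y: float    # vertical position on screen

def screen_to_text(elements: list[ScreenElement]) -> str:
    """Serialize screen elements top-to-bottom, left-to-right, with type tags."""
    ordered = sorted(elements, key=lambda e: (e.y, e.x))
    return "\n".join(f"[{i}] ({e.kind}) {e.text}" for i, e in enumerate(ordered))

if __name__ == "__main__":
    screen = [
        ScreenElement("Contact Us", "button", x=0.1, y=0.8),
        ScreenElement("123 Main St, Cupertino", "address", x=0.1, y=0.3),
        ScreenElement("555-0134", "phone", x=0.1, y=0.5),
    ]
    print(screen_to_text(screen))
    # [0] (address) 123 Main St, Cupertino
    # [1] (phone) 555-0134
    # [2] (button) Contact Us
```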
Apple has long been committed to research and development in the field of AI, aiming to improve the intelligence of its products and services through continuous technological innovation. The successful development of the ReALM model is undoubtedly another important achievement for Apple in this field.