Technology Advisor Update – Summer 2024

A glimpse into the near future

Futuristic image of a person's eye and a mobile phone

In early summer people from all over the world gathered to attend two of the major developer conferences – Apple’s Worldwide Developers Conference (WWDC) and Google’s Google I/O. These events serve as the platform to announce what new advances we can expect to see on our devices in the near future. Perhaps unsurprisingly the advances and integration of artificial intelligence (AI) dominated both conferences. In this article I have highlighted some of the more interesting announcements.

WWDC 2024

A large number of people all seated watching two large screens at Apple Park in Cupertino, California

Apple’s Worldwide Developers Conference (WWDC) 2024 showcased an impressive array of technological advancements, with a clear emphasis on artificial intelligence (AI). Alongside this, Apple continues its commitment to creating technology that is not only cutting-edge but also inclusive and adaptive to the needs of all users.

Accessibility Innovations

Accessibility has long been a cornerstone of Apple’s design philosophy, and WWDC 2024 was no exception. This year, Apple introduced several groundbreaking features aimed at enhancing the user experience for individuals with disabilities. Below are some of these features. I have included some that appeared in press releases prior to WWDC.

Eye Tracking

This revolutionary feature empowers users with limited mobility by enabling complete device control through eye movements. The iPad or iPhone’s front camera tracks eye positions, allowing users to navigate the interface, interact with apps, and even type using their eyes. This is a significant leap forward in providing independent device access for individuals with physical disabilities. In keeping with Apple’s emphasis on privacy, all data used to set up and control this feature is kept securely on device and is not shared with Apple. How well this compares to dedicated eye tracking systems remains to be seen, but it certainly opens up another exciting way to interact with your device, assuming it supports this feature.

Music Haptics

Designed to broaden the musical experience for those who are deaf or hard of hearing, Music Haptics leverages the iPhone’s Taptic Engine to translate music into a series of vibrations. These vibrations correspond to the music’s rhythm and intensity, creating a new way to feel the music and appreciate its nuances. This innovative approach opens up music enjoyment for a wider audience.

Vocal Shortcuts

Going beyond traditional touch or voice commands, Vocal Shortcuts caters to users who might find these challenging, for example those with atypical speech. This feature allows people to create custom sounds that trigger specific actions on their device. Imagine snapping your fingers to take a photo or uttering an indistinguishable word to activate voice control. Vocal Shortcuts opens doors for a hands-free and potentially voice-free interaction method, empowering users in unique ways.

Vehicle Motion Cues

Depiction of Apple’s Vehicle Motion Cues

Vehicle Motion Cues aim to counteract motion sickness while using your iPhone or iPad in the car. This feature utilizes the device’s sensors to detect motion and subtly adjusts display settings to combat nausea and dizziness. By reducing on-screen motion, Vehicle Motion Cues creates a more comfortable in-car experience for passengers prone to motion sickness, allowing them to enjoy games, movies, or reading without feeling unwell.

VisionOS Advancements

A man wearing a yellow jumper and glasses. He is seated on a sofa, his arm stretched out. He is explaining about getting food. What he is saying is appearing as live captions viewed through Apple’s Vision Pro

While specifics remain undisclosed, Apple indicated upcoming improvements to VisionOS, the operating system powering their assistive technology device, the Vision Pro. These enhancements aim to further empower users with visual impairments. Advancements are anticipated in areas like screen narration, object recognition, and voice control. These would make the Vision Pro an even more valuable tool for daily living, allowing users with visual impairments to navigate their surroundings, access information, and perform everyday tasks with greater ease and independence.

Apple Vision Pro, now available in the UK, is reported by some to be one of the most accessible devices produced by Apple yet, and a testament to Apple’s commitment to accessible and inclusive design.

The Dawn of Apple Intelligence

Various examples of Apple Intelligence AI being shown on a MacBook, iPad and iPhone

Perhaps the most intriguing announcement was Apple Intelligence. While Apple has utilised its unique AI in other forms (machine learning, powered by Apple’s Neural Engine) for years, it has been slow to join the major tech companies in the AI boom. However, legal issues may mean it could be a while before Apple Intelligence appears on supported devices in Europe.

Apple has also taken the approach of working with partners to bring AI to their systems, in particular OpenAI. It has been reported that this approach could allow people to choose which AI (e.g., Google’s Gemini) they wish to use in future.

Irrespective of which LLM (large language model, the underlying artificial intelligence) Apple Intelligence is integrated with, it is an ambitious A.I. system designed to be more than just a digital helper. While specifics are still under development, Apple promises an A.I. experience that goes beyond basic tasks. Imagine an assistant that anticipates your needs, proactively suggests actions, and seamlessly connects tasks across your Apple devices. This personalised approach to A.I. has the potential to significantly alter how we interact with technology in our daily lives.

Unlike virtual assistants that respond to specific commands, Apple Intelligence aspires to be proactive and anticipate your needs. Imagine an A.I. that scans your emails for upcoming travel plans and proactively suggests creating a packing list or downloading a currency converter app. It might interact with your smart fridge, analysing your supplies and recommending items to add to your shopping list.

A major concern with A.I. assistants is your privacy. In keeping with Apple’s drive to ensure your privacy, Apple Intelligence addresses this by prioritizing on-device processing. This means your data stays on your iPhone or iPad, with only anonymised or encrypted information sent to Apple’s secure servers for more complex tasks. This focus on privacy allows you to leverage the power of A.I. with peace of mind.

Apple Intelligence goes beyond simply understanding your words; it aims to grasp your world. By analysing your emails, photos, messages, and even browsing history, it can build a contextual understanding of your life. Imagine asking “What time is mum’s train arriving?” Apple Intelligence, having gleaned “Mum” from your contacts and the train details from your inbox, can provide the answer without you needing to specify where you found the information. This contextual awareness could make interacting with your devices feel more natural and intuitive.

Apple Intelligence is not just about managing tasks; it aspires to be a creative partner. It boasts writing tools powered by A.I. that can help you rewrite sentences for clarity, summarize lengthy articles, or even generate different creative text formats like poems or code. This could be an advantage for students, writers, or anyone who wants to explore different creative avenues.

While specifics are still under development, Apple Intelligence is slated for a developer beta later in 2024 with a full launch in 2025. This glimpse into the future of A.I. assistants suggests a more personalised and helpful way to interact with technology. Apple Intelligence has the potential to become an indispensable partner in our daily lives, streamlining tasks, understanding our needs, and even fostering creativity.

At WWDC 2024 Apple unveiled several AI-driven features designed to enhance user experience across its ecosystem. They include:

Image Playground

Apple’s Image Playground is an AI-powered tool that lets you create playful images directly within Apple’s existing apps. You describe concepts, choose themes, or reference people in your photos, and Image Playground then generates unique illustrations, animations, or sketches. This user-friendly feature prioritizes fun and personalization, offering a range of artistic styles to match your creative vision. With Apple’s on-device processing for privacy, Image Playground empowers you to add a spark of AI-generated flair to your messages, notes, presentations, and more.

Genmoji

While not a standalone app, the Genmoji feature is expected to be included in the Messages app and possibly elsewhere. It will allow you to generate your own custom emojis by entering a descriptive prompt, for example, “a t-rex wearing a tutu on a surfboard”.

AI-Enhanced Photos and Videos

The Photos app will now include advanced AI capabilities that automatically enhance images and videos, making them clearer and more vibrant. This feature is particularly useful for users with visual impairments, as it adjusts the content to be more distinguishable and enjoyable.

Siri 2.0

The latest iteration of Apple’s voice assistant, Siri 2.0, leverages advanced AI to provide more contextually aware and conversational interactions. Siri can now understand and process more complex queries, offering more accurate and relevant responses. This upgrade makes Siri not only more useful but also more accessible to users with varying needs.

Other announcements

While there were many more improvements and innovations announced at WWDC the last two I would like to mention are:

Calculator app for iPad

Apple’s Calculator for iPad in action, including Maths Notes

For years, there wasn’t a native iPad Calculator app. It is reported that Steve Jobs was never satisfied with the calculator app for iPad, feeling it lacked something. However, Apple has finally announced that iPadOS 18 boasts a built-in Calculator app!

This addition is a game-changer for students, professionals, and anyone who needs to crunch numbers on the go. No more hunting for third-party apps or relying on web-based solutions. The built-in Calculator app puts essential calculations at your fingertips, seamlessly integrated into the iPadOS experience.

Apple is not simply porting a phone app to a larger screen. The Calculator app is designed to take advantage of the iPad’s spacious display. Expect a well-organized layout with clear buttons and ample space for calculations. This makes it easier to see what you are doing, reducing errors and improving overall usability.

While the core functionality focuses on addition, subtraction, multiplication, and division, the Calculator app offers additional features:

Scientific Mode: For those who need more advanced functions, a scientific mode could be included, providing access to trigonometry, logarithms, and other complex calculations.

Unit Conversion: Imagine easily converting between units of measurement like temperature, length, or currency right within the app. This eliminates the need for separate conversion tools, simplifying everyday tasks.

History Tape: Keep track of your calculations with a history tape feature. This allows you to review previous calculations, double-check your work, or pick up where you left off on a complex problem.

The built-in Calculator app might integrate with other iPadOS apps, allowing you to seamlessly copy and paste calculations between them. Imagine performing calculations in the Calculator app and then easily pasting the results into a spreadsheet or a notes document. This streamlines workflows and eliminates the need for manual data entry.

To complement the Calculator app, Apple announced the innovative Math Notes feature in iPadOS 18. This built-in Calculator function goes beyond basic calculations. Simply write out your math problems with your Apple Pencil on the iPad screen and watch as Math Notes recognizes your handwriting and solves them in real-time! No more clunky typing or struggling with equations. Math Notes can handle everything from basic arithmetic to complex functions. It even understands variables, allowing you to explore different scenarios within your equations. Plus, the ability to solve problems directly on your notes keeps your work organized and eliminates the need for separate scrap paper. The experience is further enhanced by the new Smart Script feature, which smooths and straightens your handwriting as you write, making it instantly neater and easier to read.

Standalone Passwords App

Screenshot showing Apple’s new Standalone password manager app

Managing passwords securely across a multitude of websites and apps can be a constant struggle. Apple addressed this with the introduction of a standalone Passwords app, a significant improvement on the previously buried functionality within Settings.

No more digging through menus! The Passwords app offers a centralized location to view, manage, and store all your login information. This includes website usernames and passwords, Wi-Fi network passwords, and potentially even passkeys, a new emerging secure login method.

The app categorizes your logins clearly, making it easy to find the specific credentials you need. Imagine separate sections for frequently accessed websites, social media accounts, and email logins, allowing for quick retrieval and organization.

Building on Apple’s existing security features like iCloud Keychain, the Passwords app is designed to keep your data safe. Features like strong password generation and automatic filling of login information across apps streamline the process while maintaining security.
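For readers curious what "strong password generation" can look like in practice, here is a minimal sketch in Python. It is an illustration only and not Apple's method: the function name `generate_password`, the default length of 20, and the choice of character set are all assumptions made for this example. It relies on Python's standard `secrets` module, which is designed for security-sensitive random choices.

```python
import secrets
import string


def generate_password(length: int = 20) -> str:
    """Return a random password of the given length.

    Hypothetical helper for illustration; not how the Passwords app works.
    """
    # Letters, digits and punctuation give a large pool of possible characters.
    alphabet = string.ascii_letters + string.digits + string.punctuation
    # secrets.choice picks each character using a cryptographically
    # secure random source, unlike the general-purpose random module.
    return "".join(secrets.choice(alphabet) for _ in range(length))


password = generate_password()
print(len(password))  # the length requested (20 by default)
```

The point of the sketch is simply that a password manager can create long, random passwords for you, so that you never need to invent or remember them yourself.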

The Passwords app integrates with other Apple products. You can expect features like:

Cross-device Syncing: Access your passwords from any Apple device, be it your iPhone, iPad, or Mac. Your login information stays up-to-date and readily available, no matter which device you’re using.

AutoFill on Browsers: The app integrates with Safari and other browsers, automatically filling in login information when you visit a website. This eliminates the need to remember complex passwords or manually type them in, saving you time and frustration.

Windows Compatibility: Even if you use a Windows PC alongside your Apple devices, you’re not left out. The Passwords app can be accessed through the iCloud for Windows app, ensuring you have your logins at your fingertips regardless of platform.

The Passwords app directly challenges third-party password managers like 1Password and LastPass. With its focus on simplicity, security, and integration within the Apple ecosystem, it has the potential to become a go-to solution for Apple users who want a secure and convenient way to manage their login credentials.

Google I/O

Similar to WWDC, Google’s annual developer conference, I/O, focused heavily on artificial intelligence and its integration into Google products. The announcements focused more on the evolution of Google’s AI than new developments. That said, there have been significant advances to Google’s AI, Gemini. In fact, Gemini seemed to dominate the conference.

Gemini 1.5 signifies Google’s continued commitment to pushing the boundaries of AI. Gemini is a Large Language Model (LLM), which means the model is fed massive amounts of text data so that it can understand and generate human language. The latest version of Gemini 1.5 has a 2 million token context window. In simple terms, a “token” is a small unit of information, such as a word or part of a word, that the model reads and works with. A “context window” is the amount of data the LLM considers when generating a response or completing a task. Imagine it like a window that the LLM uses to peek at the surrounding information to understand the current prompt or question.
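To make the idea of a context window more concrete, here is a small sketch in Python. It is a simplified illustration under stated assumptions, not how Gemini works internally: the `tokenize` function simply splits text on spaces (real models use more sophisticated tokenizers), and `apply_context_window` assumes the simplest possible strategy of keeping only the most recent tokens. Both function names are made up for this example.

```python
def tokenize(text: str) -> list[str]:
    """Toy tokenizer: treat each space-separated word as one token.

    Assumption for illustration; real models split text into sub-word pieces.
    """
    return text.split()


def apply_context_window(tokens: list[str], window_size: int) -> list[str]:
    """Keep only the most recent `window_size` tokens.

    Anything earlier falls outside the window and cannot be considered.
    """
    if window_size <= 0:
        return []
    return tokens[-window_size:]


conversation = "my glasses are on the desk next to the red apple"
tokens = tokenize(conversation)

# With a window of only 5 tokens, the words about the glasses are lost.
visible = apply_context_window(tokens, window_size=5)
print(len(tokens))  # 11
print(visible)      # ['next', 'to', 'the', 'red', 'apple']
```

With a tiny window the model would no longer "see" where the glasses were mentioned. A window of 2 million tokens is large enough to hold many long documents at once, which is why the figure matters.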

One of the key strengths of Gemini 1.5 is its ability to understand and process information within a much larger context. Compared to its predecessor, Gemini 1.0, it boasts a significantly longer context window, allowing it to grasp the nuances of information spread across vast amounts of text, code, audio, or video. Unlike many AI models focused solely on text, Gemini 1.5 is a true multimodal powerhouse. It can process and understand information presented in various formats, including images, audio, and video. This versatility allows it to tackle a wider range of tasks. For example, imagine describing a scene you want to create in a video; Gemini 1.5 could analyse your description and generate visuals based on your input.

Gemini 1.5 is not a single entity, but rather a family of models with varying capabilities. Google offers a mid-sized “Pro” version optimized for a wide range of tasks and a “Flash” version focused on speed and efficiency. This allows developers to choose the Gemini model best suited for their specific needs. The Gemini family also includes Gemini Nano. This lightweight version allows Gemini to be used in the Chrome browser and could significantly enhance web browsing experiences by offering advanced capabilities like real-time translation, content summarisation, and code generation. It also allows for Gemini to be included on mobile devices.

In fact, Gemini will be integrated throughout Google’s products such as Gmail and Docs.

Revamped Search Engine

Screenshot of a web browser showing Google’s revamped AI powered search

The advances also mean a revamped Search Engine built using the AI. This could be a major game-changer in how people find information online. Google is also working on Gemini agents to complete tasks like meal or trip planning. You would be able to type queries like “Plan a meal for a family of four for three days”. The AI will then provide you with recipes and links for the three days.

Ask Photos

Gemini is also making its way into Google Photos. While still in the experimental phase, the Ask Photos feature will allow users to search across their Google Photos collection using natural language queries that leverage the AI’s understanding of their photos’ content and other metadata. While it has already been possible to search for specific people, places, or things in photos, the AI upgrade and its natural language processing will make finding the right content more intuitive and less of a manual search process.

Imagen 3

Imagen 3 is Google’s latest and most advanced text-to-image generation model. It builds upon its predecessors, offering a significant leap in image quality. It can generate incredibly realistic and detailed images that closely resemble photographs. Imagine describing a fantastical landscape with waterfalls cascading down mountains shrouded in mist, and Imagen 3 could generate an image that captures the scene with breathtaking detail.

Google would like this advanced AI model to be a tool that empowers everyone to unleash their creativity. By simply describing your concept in text, you can generate unique and visually captivating images. This opens possibilities for:

Storytelling and illustration: Bring your stories and ideas to life with stunning visuals. Generate illustrations for your blog post, create storyboards for your animation project, or visualize your next marketing campaign.

Design and Prototyping: Imagen 3 can be a valuable tool for designers and product developers. Quickly generate mock-ups and prototypes of your design ideas without needing to spend hours crafting them manually.

Education and Exploration: Imagine exploring historical events or scientific concepts through AI-generated visuals. Imagen 3 has the potential to revolutionize the learning experience by making abstract concepts more tangible and engaging.

Imagen 3 goes beyond just generating images based on simple text descriptions. Imagen 3 allows you to take an existing picture and add elements to it, change the background, or adjust the overall style. Imagine taking a vacation photo and adding a fantastical creature into the scene for a touch of whimsy.

Imagen 3 is designed to run entirely on your device, so your prompts and the generated images remain private and secure. This ensures you maintain control over your creative process and protects your data and privacy.

More about Imagen 3 on the Google DeepMind website.

Veo

One of the more exciting announcements was Veo. Google DeepMind’s Veo is a groundbreaking text-to-video generation model. This innovative AI tool takes your textual descriptions and transforms them into dynamic and visually stunning videos.

While other AI models excel at generating realistic images, Veo goes a step further by creating videos complete with motion, lighting effects, and even camera movements. Describe a bustling city street at night, and Veo might generate a video displaying the neon lights, moving cars, and bustling crowds.

This technology holds immense potential. You could bring your stories and ideas to life with captivating animated sequences. Imagine creating storyboards for your animation project or generating explainer videos for your blog post.

While details about Veo’s public availability are limited, its development signifies a significant leap in AI-powered video creation. As this technology continues to evolve, we can expect even more sophisticated and user-friendly tools that will revolutionize the way we create video content.

Google’s Veo paves the way for a future where creating videos becomes more accessible and intuitive. With the power of AI-powered text-to-video generation, anyone with a creative vision will have the potential to bring their ideas to life on screen.

LearnLM

Google unveiled LearnLM, an interesting use of AI to support education that allows questions to be asked about a YouTube video, or a quiz to be created. While this is still in the experimental phase, it is already powering features across Google products, including YouTube, Google’s Gemini apps, Google Search and Google Classroom.

Project Astra

Finally, Google’s Project Astra is aimed at developing Google’s future vision for AI that combines multiple sensory inputs (sight, sound, voice, text) and has the potential to revolutionize human-computer interaction. It is well worth taking a moment to watch the videos showing Gemini Live. What is impressive about the video is not only the speed of processing but the fact that the system is able to capture, store and use the information to answer a question like, “Do you remember where you saw my glasses?”. This shows the huge potential future digital assistants could have.

Whether from Google, Apple, Microsoft, Amazon or elsewhere it is clear that AI will continue to permeate our lives. As always, I would like to hear about how you are using mobile, and other technology, and AI too. If you would like to have a particular topic covered in the next newsletter, please let me know. Finally, please feel free to contact me if you have a question or need technical help and support.

Martin Pistorius

Karten Network Technology Advisor

Featured in the Karten Summer 2024 Newsletter
This article is listed in the following subject areas: Technology, Update from Technology Advisor