Local Alt Text Generation with Firefox Nightly
Exciting Developments in Firefox Nightly
Firefox 130 is set to introduce a groundbreaking feature: the automatic generation of alt-text for images using an on-device AI model. This experimental capability will initially be accessible through Firefox’s built-in PDF editor, with hopes to expand it for general browsing, specifically aiding users with screen readers.
The Importance of Alt Text
Web pages are designed to be adaptable, interpreting content differently based on user needs. This adaptability makes the web unique, allowing browsers to serve as user agents committed to making the web more accessible for everyone. Alt text is crucial for assistive technologies like screen readers, ensuring users have a comprehensive understanding of images on web pages. Despite its importance, about half of all web images lack proper alt text, as noted by the Web Almanac in 2022.
Until recently, generating alt text inside the browser was impractical: producing descriptions of reasonable quality meant sending potentially sensitive image data to a remote server, which raised privacy concerns. Advances in AI now make efficient on-device image analysis possible. Firefox Nightly will test alt text generation within its PDF editor to refine this local processing capability.
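To make the missing-alt-text problem concrete, here is a minimal sketch that scans an HTML fragment for images with no alt attribute, using only Python's standard html.parser module. The class name, function name, and sample markup are illustrative assumptions, not part of Firefox or the Web Almanac methodology. An empty alt="" is treated as intentional, since that is the usual way to mark a decorative image.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect <img> tags that have no alt attribute at all."""

    def __init__(self):
        super().__init__()
        self.missing = []  # src values of images with no alt attribute
        self.total = 0     # number of <img> tags seen

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        self.total += 1
        attr_map = dict(attrs)
        # alt="" deliberately marks a decorative image, so only a
        # completely absent attribute counts as missing here.
        if "alt" not in attr_map:
            self.missing.append(attr_map.get("src", ""))


def find_images_without_alt(html):
    """Return (list of src values lacking alt, total number of images)."""
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing, finder.total


# Hypothetical sample markup, for illustration only.
sample = """
<p><img src="chart.png" alt="Bar chart of monthly visits"></p>
<p><img src="divider.png" alt=""></p>
<p><img src="photo.jpg"></p>
"""
missing, total = find_images_without_alt(sample)
print(f"{len(missing)} of {total} images have no alt attribute: {missing}")
```

Images flagged this way are the ones for which a generated description would be offered as a starting point.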
Utilizing Compact AI Models
Mozilla’s approach uses Transformer-based machine learning models for on-device alt text generation. These compact models run efficiently on devices with limited resources and produce satisfactory descriptions without heavyweight infrastructure such as cloud-based GPUs. Running locally keeps image data private and reduces reliance on servers, which in turn lowers the environmental cost.
Example of AI-Generated Alt Text
A model tested on an image from the COCO dataset shows varied accuracy compared to human descriptions. While small models may miss some details, they provide a valuable automated starting point for detailed content creation.
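One simple way to quantify how close a generated description is to a human one is word overlap. The sketch below computes a unigram F1 score between a candidate caption and one or more reference captions. This is an illustrative metric chosen for brevity, not the evaluation Mozilla used, and the captions in the usage lines are invented rather than taken from COCO.

```python
import re
from collections import Counter


def tokens(text):
    """Lowercase a caption and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def overlap_f1(candidate, reference):
    """Unigram F1 between a generated caption and one human caption."""
    cand, ref = Counter(tokens(candidate)), Counter(tokens(reference))
    shared = sum((cand & ref).values())  # words in both, with multiplicity
    if shared == 0:
        return 0.0
    precision = shared / sum(cand.values())
    recall = shared / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def best_match(candidate, references):
    """Score the candidate against the closest of several human captions."""
    return max(overlap_f1(candidate, ref) for ref in references)


# Invented captions, for illustration only.
generated = "a dog sitting on a couch"
human = [
    "A brown dog is sitting on a grey couch next to a pillow",
    "Dog resting on a sofa in a living room",
]
print(round(best_match(generated, human), 2))
```

A low score here usually means the small model omitted details (colour, surrounding objects) that a person included, which matches the "starting point" role described above.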
Advantages of Local Inference
Running local model inference grants users privacy, enhances transparency, and allows for precise carbon footprint management. It also simplifies model improvement processes, with Mozilla planning frequent updates based on user feedback and data.
Integration and Expansion
Firefox Nightly will employ Firefox’s existing translation infrastructure adapted for AI-generated alt text. This integration uses the Bergamot and ONNX Runtime projects, offering WASM and soon WebGPU support, enhancing the model’s efficiency and usability.
A custom model caching system will manage these local models, stored and managed separately from user data, ensuring efficiency and privacy.
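The idea of a model cache kept apart from user data can be sketched as a small file store keyed by model name and revision, with an optional checksum on read. Everything here (the ModelCache class, its methods, the directory layout, and the model name) is a hypothetical illustration, not Firefox's actual caching system.

```python
import hashlib
import tempfile
from pathlib import Path


class ModelCache:
    """Stores model files under their own root, keyed by model and revision."""

    def __init__(self, root):
        self.root = Path(root)  # kept separate from any user-data directory

    def _path(self, model, revision, filename):
        return self.root / model / revision / filename

    def put(self, model, revision, filename, data):
        path = self._path(model, revision, filename)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
        # Return a digest the caller can keep to verify the file later.
        return hashlib.sha256(data).hexdigest()

    def get(self, model, revision, filename, expected_sha256=None):
        path = self._path(model, revision, filename)
        if not path.exists():
            return None  # cache miss: caller fetches the file and calls put()
        data = path.read_bytes()
        if expected_sha256 and hashlib.sha256(data).hexdigest() != expected_sha256:
            path.unlink()  # corrupted or stale entry: drop it, report a miss
            return None
        return data

    def evict(self, model, revision=None):
        """Delete one revision, or every revision, and return files removed."""
        target = self.root / model / revision if revision else self.root / model
        if not target.exists():
            return 0
        files = [p for p in target.rglob("*") if p.is_file()]
        for p in files:
            p.unlink()
        return len(files)


# Usage with a hypothetical model name and placeholder bytes.
with tempfile.TemporaryDirectory() as tmp:
    cache = ModelCache(Path(tmp) / "model-cache")
    digest = cache.put("example-captioner", "v1", "model.onnx", b"fake weights")
    print(cache.get("example-captioner", "v1", "model.onnx", digest) is not None)
    print(cache.evict("example-captioner"))
```

Keying entries by revision lets an updated model be stored alongside the old one and the old one evicted afterwards, without touching anything outside the cache root.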
Future Plans
The alt text feature is just a beginning. Mozilla aims to refine this function to enhance user experience, eventually applying the technology to broader web browsing contexts for screen reader users. As the tool develops, Mozilla is committed to ongoing improvements and community involvement through open-source collaboration.
For more in-depth information, visit the original source from Mozilla Hacks.