AI Meets Real-World Tasks 🗨️🤖 Talking AI

Read time: 5 minutes

Welcome back to the Talking AI Newsletter, where we help you cut through the AI hype.

👋🏻 I'm Omar Shanti, CTO at HatchWorks AI.

With so much buzz around AI, it's easy to feel overwhelmed.

So let's break it down. In every issue, we'll dive into:

The latest AI advancements
Practical applications to enhance your business
How these AI trends can shape your future growth

This Week's Insights

Phones and AI as Samsung and Apple lead the next frontier
AI Image Editing sparks deepfake and privacy concerns
Claude 3.5 and Jarvis bridge tech with real tasks
Balancing fair detection with AI in education

🖼️ Picture This... Picture This Again (this time, AI edited) 🖼️🦆

Next time you're running late for work, just text your boss a picture of your car's flat tire. Don't have one? No problem – have your phone edit one into your image for you.

In the last newsletter, we spoke about how Generative AI models are proliferating: coming soon to software near you. Your phone is no exception.

Phones have long been the vanguard of innovation in experiential AI. For years, manufacturers have tried to break past the limitations of visual interfaces by investing in chatbots and voice assistants. With Generative AI, the goal has become multimodality: interweaving text, image, and speech into rich conversations.

This week, we're shining a light on one specific modality: text-based image processing. Give the model an image and some instructions and it will output a realistic, edited version.

Before Midjourney announced plans to release a web-based photo editor, others had proven the concept. For instance, Samsung's Galaxy AI, Google's Reimagine, and Apple's Image Playground bring to phones what Adobe's AI Photo Editor brought to its software.

But there's still no consensus on how to mark something as AI-edited. Current strategies, which revolve around adding metadata or watermarks, are easily bypassed. With deepfakes driving misinformation and security exploits, and with growing concerns around intellectual property, this question is crucial.

Of course, none of this is unique to images.

Higher education is rife with solutions for detecting Generative AI-powered plagiarism and, inevitably, they sometimes get it wrong. No model will be 100% accurate. This underscores the need for guardrails and defense mechanisms that prevent the most costly class of error: falsely accusing an honest student usually does more harm than missing a case.

After all, not all squares in the confusion matrix are weighted equally.
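
To make that concrete, here is a minimal Python sketch of picking a detector threshold by weighted cost instead of raw error count. Everything in it is an illustrative assumption: the scores, the labels, the candidate thresholds, and the cost weights are made up, and the function names are hypothetical rather than taken from any real detection product.

```python
from typing import List, Sequence, Tuple

# Made-up detector scores (higher = "more likely AI-written") and true labels
# (1 = AI-written, 0 = written by the student). Purely illustrative data.
scores = [0.10, 0.20, 0.40, 0.55, 0.60, 0.65, 0.80, 0.90]
labels = [0, 0, 0, 0, 1, 1, 1, 1]


def confusion_counts(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) when every score >= threshold is flagged."""
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if flagged and label == 1:
            tp += 1
        elif flagged and label == 0:
            fp += 1  # honest student wrongly accused
        elif not flagged and label == 1:
            fn += 1  # AI-written work that slipped through
        else:
            tn += 1
    return tp, fp, fn, tn


def weighted_cost(counts: Tuple[int, int, int, int], fp_cost: float, fn_cost: float) -> float:
    """Total cost of the errors, with each error type priced separately."""
    _, fp, fn, _ = counts
    return fp * fp_cost + fn * fn_cost


def best_threshold(
    scores: Sequence[float],
    labels: Sequence[int],
    thresholds: List[float],
    fp_cost: float,
    fn_cost: float,
) -> float:
    """Pick the candidate threshold with the lowest weighted cost."""
    return min(
        thresholds,
        key=lambda t: weighted_cost(confusion_counts(scores, labels, t), fp_cost, fn_cost),
    )


# Equal weights favor the lower threshold; pricing a false accusation at
# five times a missed case pushes the choice to the stricter one.
equal_choice = best_threshold(scores, labels, [0.5, 0.7], fp_cost=1, fn_cost=1)
cautious_choice = best_threshold(scores, labels, [0.5, 0.7], fp_cost=5, fn_cost=1)
print(equal_choice, cautious_choice)
```

The cost weights are a policy decision, not a modeling one: the same detector ends up behaving differently depending on which mistake the institution decides is worse.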

🐝 Swarming the InterWebs🏄💻

Back in March, one X user went viral for tweeting –
> "I want AI to do my laundry and dishes so that I can do art and writing, not for AI to do my art and writing so that I can do my laundry and dishes."

If only she had waited until November; her AI could have used her computer and tweeted that for her!

Google made headlines this month for teasing Project Jarvis, its Gemini-powered assistant capable of navigating Google Chrome for you.

In the near term, consider the value for use cases such as application testing, automating manual workflows, and web scraping. Consider as well the merits of bridging gaps in web accessibility. Consider also the headaches of more bots than ever before crawling your websites, clogging up your APIs, and skewing your user-session analytics.

Google isn't the only one playing this game. Microsoft, Apple, Anthropic, and OpenAI are all making moves that extend an agent's scope of control from the browser to individual applications to the entire operating system.

Like any technological trend, this one isn't inherently good or bad. By making both the use and the abuse of applications equally accessible, it raises the stakes for effective design and security.

A new user persona has just arrived on the scene; how will your designs reflect that?

How will you use Generative AI to best support Agentic Automation?

How long until Search-Engine Optimization finds its twin in Agent-Navigation Optimization?

📰 Your AI News Brief

Stay updated with the latest developments in artificial intelligence.

Meta's latest AI advancements: SAM 2.1 sharpens image segmentation for precise visual analytics, and Spirit LM merges speech with text for enhanced voice-enabled services.
Shortcut Models enable one-step diffusion by conditioning the network on the desired step size, so a single model can generate samples in one step or several. That efficiency matters for delivering fast, reliable client solutions.
VLM-Grounder demonstrates how we can achieve 3D object detection using just 2D images, unlocking advanced visual analytics for clients without requiring 3D data.
Microsoft's hesitation to invest further in OpenAI signals that over-reliance on a single partner can constrain AI growth; adaptability is crucial here.
H2O.ai's new H2OVL Mississippi models outperform larger rivals in document analysis, showing that efficient, smaller models can deliver real value. It's all about balancing performance with scalability.
Apple leverages homomorphic encryption and machine learning to enhance photo search while preserving user privacy. This advancement points toward secure, scalable AI solutions that prioritize data protection.
OpenAI introduced ChatGPT search, which combines conversational AI with real-time web data to give users comprehensive answers with source references, improving productivity and information reliability.

🏝️ Generative AI Is Not an Island: ML’s Core Principles

In the episode, Simba Khadder, Co-Founder & CEO of Featureform, and I delved into the interconnectedness of generative AI and traditional machine learning.

We illustrated how generative models extend established ML practices like feature engineering and data retrieval.

This seamless integration ensures that AI advancements are grounded in proven methodologies, enabling us to build scalable and effective solutions that tackle real-world business challenges and foster innovation.

💻 How to Use Small Language Models for Niche Needs

Our Senior ML/AI Engineer at HatchWorks AI, David Berrio, outlines a strategic method for leveraging small language models (SLMs) to address specialized business needs.

By selecting, customizing, and deploying SLMs through knowledge distillation and fine-tuning, companies can implement cost-effective and precise solutions tailored to specific tasks.

David emphasizes how easy SLMs are to structure and how important it is to maintain and update them with new data, so that businesses can meet niche demands efficiently without relying on larger models.
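
For readers who want to see what knowledge distillation looks like in practice, below is a minimal Python sketch of the widely used soft-target loss: the small "student" model is trained to match the temperature-softened output distribution of a larger "teacher". This is an assumption about the general technique, not a description of David's specific pipeline; the logits, the temperature, and the function names are all made up for illustration.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Convert logits to probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    student_logits: List[float],
    teacher_logits: List[float],
    temperature: float = 2.0,
) -> float:
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    kl = sum(
        p * math.log(p / q)
        for p, q in zip(teacher_probs, student_probs)
        if p > 0
    )
    return kl * temperature ** 2


# Hypothetical logits over three classes for a single input.
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]  # roughly agrees with the teacher
far_student = [0.5, 4.0, 1.0]    # confidently disagrees

print(distillation_loss(close_student, teacher))
print(distillation_loss(far_student, teacher))
```

In a real training loop this term is typically combined with the ordinary loss on labeled data, and the student's weights are updated to push it toward zero.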

Thank you for tuning into this edition of the Talking AI Newsletter.

We hope these insights help you navigate the complex AI landscape and apply these advancements to drive your business forward.

Thoughts or feedback?

Reach out or connect on LinkedIn.

P.S. Today's newsletter features insights from David Berrio, our Senior ML/AI Engineer, who has hosted several HatchWorks AI Labs on LLMs, SLMs, and AI model training.

David is an expert in the field. Give him a follow on LinkedIn and check out his latest projects to stay updated on his innovative AI/ML techniques and practical applications.

We're your AI development partner.

We build AI-native solutions and use AI to build software better, faster, smarter.
If you're interested in working with us, contact us here.