Dear Chetanbhai,
Sub: Kompact and "Voice / Face Animation" on www.HemenParekh.ai
In Feb 2024, we chatted about animating my face (with lip-sync) on www.HemenParekh.ai. I reproduce a portion of our chat below at (A).
This required using a GPU server and, as drafted by you, my colleague Sandeep sent an email to the Big Rock team, shown below at (B).
Unfortunately, the price for the GPU-based service came to around $1,040 per month. I gave up this project since that price was unaffordable.
Today, I came across a news report about an IIT Madras team having launched "Kompact", an innovation that enables LLMs to run on a standard CPU, with no need for a GPU! Read this at (C) below.
Question:
By using Kompact, can we manage that "animation" on Big Rock (our current CPU-based configuration)?
(A) Extract from my past chat with ChatGPT (10 Feb 2024):
I said:
Hey, thanks for your encouraging words on how this might become possible. It is heartening that my Virtual Avatar would be able to establish a unique relationship with each visitor. Even at this moment (for the past 2 years), my Avatar V1.0 (www.HemenParekh.ai) is able to "speak" (although in a synthetic voice) in 26 languages, but I am told that animating my photo on my site to lip-sync and speak in my own voice would be quite expensive. Can you suggest a free or a cheap method to achieve this?
ChatGPT said:
You're already ahead of the curve with HemenParekh.ai V1.0 speaking in 26 languages! The next logical step, a lip-syncing animation with your own voice, can be achieved using some low-cost or even free AI tools.
💡 Low-Cost / Free Ways to Achieve AI Lip-Sync & Voice Cloning
1️⃣ Free AI Lip-Syncing Tools (for Animating Your Photo)
1. SadTalker (FREE, Open-Source) →
   - Can animate a single static photo of you, making it talk and move its lips in sync with any audio.
   - GitHub: https://github.com/OpenTalker/SadTalker
   - Demo: https://sadtalker.github.io
2. HeyGen (Freemium) →
   - Upload your photo and an audio clip, and it will generate a realistic talking avatar.
   - Free Plan: Limited output per month.
   - Website: https://www.heygen.com
3. D-ID Studio (Freemium) →
   - Allows you to upload your photo + any voice and create a realistic talking head.
   - Free version allows limited video generation per day.
   - Website: https://www.d-id.com
2️⃣ Free & Cheap AI Voice Cloning (For Your Own Voice)
1. OpenAI TTS API (Coming Soon) →
   - OpenAI is working on a personalized voice model. When available, this could generate lifelike speech in your voice.
2. RVC (Retrieval-Based Voice Conversion, FREE) →
   - Clones your voice with just a few minutes of your speech.
   - It works offline and is open-source!
   - GitHub: https://github.com/RVC-Project/Retrieval-based-Voice-Conversion
3. ElevenLabs (Freemium) →
   - Creates natural-sounding voice clones.
   - Free version allows limited cloning & output per month.
   - Website: https://elevenlabs.io
🔗 How to Combine These for a FREE / Cheap Solution
1️⃣ Train your AI voice clone using RVC (or ElevenLabs if you prefer a paid but easy option).
2️⃣ Generate speech in your voice using the AI model.
3️⃣ Use SadTalker (or HeyGen/D-ID) to animate your photo with accurate lip-syncing.
4️⃣ Integrate this into HemenParekh.ai → your avatar can now "speak" in your own voice, lip-syncing perfectly!
🚀 Future Upgrade: Fully Interactive AI Avatar
If you later invest a bit, platforms like:
✅ Synthesia (paid, but excellent for interactive avatars)
✅ DeepBrain AI (real-time talking avatars)
✅ HeyGen Pro (real-time voice & lips)
can transform your avatar into a truly interactive AI speaker.
🔮 FINAL THOUGHTS
Right now, you can animate your photo & lip-sync your AI voice for FREE or under $20/month! With future AI advancements, this will become even more realistic, affordable, and real-time.
Want me to help outline a step-by-step workflow for implementation? 🔥
You said:
Hey Friend, I would love that (my colleague Kishan, who has developed my V1.0, would love that even more!)
ChatGPT said:
Great! Let's break this down step by step so Kishan can start working on it immediately. The goal is to animate your avatar's face, sync it with your own cloned voice, and make it interactive on HemenParekh.ai.
🛠 STEP-BY-STEP WORKFLOW FOR AI AVATAR LIP-SYNC & VOICE INTEGRATION
🔹 STEP 1: Create a Voice Clone (Your Own Voice in 26 Languages)
🔹 Option 1 (FREE, Open-Source): Retrieval-Based Voice Conversion (RVC)
- Install RVC on a local machine or cloud.
- Train it using your recorded voice samples (~5-10 minutes of clear speech).
- Convert text into speech using this cloned model.
🔹 Option 2 (Freemium, Easier to Use): ElevenLabs
- Visit https://elevenlabs.io
- Upload a sample of your voice (1-2 minutes).
- Generate speech in any language using AI cloning.
➡️ OUTPUT: AI-generated speech in your voice, sounding natural in multiple languages.
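If Kishan goes the ElevenLabs route, here is a minimal sketch of creating the voice clone programmatically. It assumes ElevenLabs' public /v1/voices/add endpoint and a valid API key; the function name and sample file names are illustrative, so verify against their current docs:

import requests

ELEVENLABS_API_KEY = "your-elevenlabs-api-key"  # assumed: from your ElevenLabs account

def create_voice_clone(name, sample_paths):
    # Upload 1-2 minutes of clear speech samples to build a voice clone.
    url = "https://api.elevenlabs.io/v1/voices/add"
    headers = {"xi-api-key": ELEVENLABS_API_KEY}
    files = [("files", open(path, "rb")) for path in sample_paths]
    response = requests.post(url, headers=headers, data={"name": name}, files=files)
    if response.ok:
        return response.json().get("voice_id")  # ID used later for text-to-speech
    print("Error creating voice clone:", response.text)
    return None

# Example (illustrative file names):
# voice_id = create_voice_clone("Hemen", ["hemen_sample1.mp3", "hemen_sample2.mp3"])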
🔹 STEP 2: Generate a Talking Avatar with Lip-Sync
🔹 Option 1 (FREE, Open-Source): SadTalker
- Install SadTalker from GitHub.
- Input: your static photo + your AI-generated voice file.
- Output: a video of your animated face speaking with lip-sync.
- Works well, but slightly robotic.
🔹 Option 2 (Freemium, More Realistic): D-ID or HeyGen
- Go to https://www.d-id.com or https://www.heygen.com
- Upload your photo and audio file.
- Generate high-quality lip-synced videos of your avatar speaking.
➡️ OUTPUT: A realistic video of your AI avatar speaking in your voice with lip-sync.
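For the free SadTalker route, a minimal sketch of driving the animation from Python could look like this. It assumes a local clone of the SadTalker repo with its dependencies installed; the flag names follow the project's README and should be checked against the version Kishan installs:

import subprocess

def animate_photo(photo_path, audio_path, out_dir="./results"):
    # Run SadTalker's inference script on a photo + audio pair;
    # the lip-synced video is written into out_dir.
    subprocess.run(
        [
            "python", "inference.py",
            "--source_image", photo_path,   # static photo of the speaker
            "--driven_audio", audio_path,   # AI-generated speech file
            "--result_dir", out_dir,
        ],
        cwd="SadTalker",  # illustrative path to the cloned repo
        check=True,
    )

# Example: animate_photo("hemen.jpg", "greeting_hi.wav")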
🔹 STEP 3: Automate Avatar Interactions on HemenParekh.ai
Now, let's make your avatar interactive on your website.
🔹 Option 1: Simple Pre-Recorded Video Responses
- Create multiple pre-generated avatar videos (e.g., "Hello [Name]!" in different languages).
- Use a chatbot to trigger the right video based on visitor data (see the sketch after this list).
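As a minimal sketch of Option 1, assuming a small Flask app on the existing CPU-only VPS (the file layout and language codes are illustrative):

from flask import Flask, request, send_file

app = Flask(__name__)

# Language code -> pre-rendered greeting clip (illustrative file layout).
PRERENDERED = {
    "en": "videos/greeting_en.mp4",
    "hi": "videos/greeting_hi.mp4",
    "gu": "videos/greeting_gu.mp4",
}

@app.route("/greeting")
def greeting():
    # Pick the visitor's language from a query parameter
    # (a cookie or the Accept-Language header would work the same way).
    lang = request.args.get("lang", "en")
    return send_file(PRERENDERED.get(lang, PRERENDERED["en"]), mimetype="video/mp4")

Because the videos are generated once in advance, serving them needs no GPU at all.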
🔹 Option 2: Real-Time AI Speech & Lip-Sync (Future-Proofing)
- Combine the OpenAI GPT-4 Turbo API with DeepBrain AI or the HeyGen API (see the sketch after this list).
- Fetch the visitor's name, language, and context using cookies or login data.
- Generate real-time voice responses in your voice.
- Live-render a lip-synced video using an API.
➡️ OUTPUT: Your avatar greets each visitor personally, in real time, with their name, in their language!
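A rough sketch of the text half of that real-time flow, assuming the OpenAI Python SDK (the model name and prompts are illustrative; the voice and video steps would reuse the helper functions in the starter script further below):

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def realtime_reply_text(visitor_name, language, question):
    # Step 1: draft a short reply in the visitor's language.
    # Steps 2-3 would feed this text to generate_speech() and
    # generate_avatar_video() from the starter script below.
    chat = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[
            {"role": "system", "content": f"Reply briefly in the language with code '{language}'."},
            {"role": "user", "content": f"{visitor_name} asks: {question}"},
        ],
    )
    return chat.choices[0].message.content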
💡 FINAL ARCHITECTURE OVERVIEW
✅ HemenParekh.ai detects visitor name & language
✅ AI generates a greeting in your voice (via ElevenLabs/RVC)
✅ Your animated face lip-syncs the response (via D-ID/HeyGen API)
✅ Video is displayed in real-time on the homepage
🚀 NEXT STEPS FOR KISHAN
1️⃣ Experiment with RVC or ElevenLabs for voice cloning.
2️⃣ Test SadTalker or HeyGen for face animation.
3️⃣ Integrate automated video playback on the website.
4️⃣ Explore API-based real-time avatar interactions.
Would you like a sample Python script for Kishan to get started? 🚀🔥
You said:
If you do, Kishan would feel absolutely delighted!
Here's a starter Python script for Kishan to integrate a personalized AI avatar greeting system into HemenParekh.ai. This script will:
✅ Detect visitor name & language
✅ Generate speech in your voice using the ElevenLabs API
✅ Create a lip-synced talking avatar using the D-ID API
✅ Play the generated video dynamically on your website
🚀 Python Script for AI Avatar Greeting
- This script assumes Kishan has API keys for ElevenLabs (voice cloning) and D-ID (avatar animation).
- If you want a free setup, he can replace D-ID with SadTalker (open-source face animation).
import requests
import json
import time

# ⬇️ Step 1: Define API keys (replace with your actual keys)
ELEVENLABS_API_KEY = "your-elevenlabs-api-key"
DID_API_KEY = "your-d-id-api-key"

# ⬇️ Step 2: Function to generate AI speech in your voice
def generate_speech(text, language="en"):
    # NOTE: this follows the structure ChatGPT sketched; ElevenLabs' current API
    # puts the voice ID in the URL path and streams audio bytes rather than
    # returning an "audio_url", so verify against their docs before deploying.
    url = "https://api.elevenlabs.io/v1/text-to-speech"
    headers = {
        "xi-api-key": ELEVENLABS_API_KEY,
        "Content-Type": "application/json"
    }
    data = {
        "voice_id": "your-voice-id",  # replace with your actual cloned voice ID
        "text": text,
        "model_id": "eleven_multilingual_v2",
        "language_id": language
    }
    response = requests.post(url, headers=headers, json=data)
    if response.status_code == 200:
        speech_url = response.json()["audio_url"]  # link to the AI-generated speech file
        return speech_url
    else:
        print("Error generating speech:", response.json())
        return None

# ⬇️ Step 3: Function to create a lip-synced talking avatar
def generate_avatar_video(audio_url, photo_url):
    url = "https://api.d-id.com/talks"
    headers = {
        "Authorization": f"Bearer {DID_API_KEY}",
        "Content-Type": "application/json"
    }
    data = {
        "source_url": photo_url,  # your photo (uploaded or hosted)
        "script": {
            "type": "audio",
            "url": audio_url
        }
    }
    response = requests.post(url, headers=headers, json=data)
    if response.status_code in (200, 201):
        video_id = response.json()["id"]
        print("Video generation started. ID:", video_id)
        return video_id
    else:
        print("Error generating avatar video:", response.json())
        return None

# ⬇️ Step 4: Function to retrieve the video URL once ready
def get_video_url(video_id):
    url = f"https://api.d-id.com/talks/{video_id}"
    headers = {"Authorization": f"Bearer {DID_API_KEY}"}
    for _ in range(10):  # poll for video completion
        response = requests.get(url, headers=headers)
        if response.status_code == 200:
            data = response.json()
            if data["status"] == "done":
                return data["result_url"]  # final video URL
            print("Processing... waiting 5 seconds")
            time.sleep(5)
        else:
            print("Error fetching video:", response.json())
    return None

# ⬇️ Step 5: Main function to greet a visitor by name in their language
def greet_visitor(visitor_name, language="en"):
    text = f"Hello {visitor_name}, welcome to HemenParekh.ai!"  # customizable greeting
    speech_url = generate_speech(text, language)
    if speech_url:
        print("Speech generated:", speech_url)
        avatar_video_id = generate_avatar_video(speech_url, "your-photo-url")
        if avatar_video_id:
            video_url = get_video_url(avatar_video_id)
            print("Final avatar video:", video_url)
            return video_url
    return None

# ⬇️ Step 6: Example usage
visitor_name = "Priya"
language = "hi"  # Hindi
video_result = greet_visitor(visitor_name, language)
if video_result:
    print(f"🎉 Video ready: {video_result}")
else:
    print("❌ Failed to generate greeting video.")
🚀 How Kishan Can Implement This
1️⃣ Get API keys for ElevenLabs & D-ID (or use free alternatives).
2️⃣ Upload a clear photo of you (Hemen Parekh) as the base avatar.
3️⃣ Run the script to generate greetings dynamically.
4️⃣ Embed the video URL in the homepage for real-time playback (see the sketch after this list).
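For step 4️⃣, a minimal sketch of serving the clip on the homepage, assuming a Flask front end and the greet_visitor() function from the script above (the module name avatar_greeting.py and all parameters are illustrative):

from flask import Flask, request

from avatar_greeting import greet_visitor  # assumption: starter script saved as avatar_greeting.py

app = Flask(__name__)

@app.route("/")
def homepage():
    name = request.args.get("name", "Visitor")
    lang = request.args.get("lang", "en")
    video_url = greet_visitor(name, lang)  # may take a while; cache this in production
    if not video_url:
        return "<p>Welcome to HemenParekh.ai!</p>"
    # Embed the generated clip directly in the page.
    return f'<video src="{video_url}" autoplay controls width="480"></video>'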
🔥 Next Steps
Would you like a web-based demo (React/Flask) so visitors can enter their name and see your avatar greet them live? 🚀🔥
You said:
Fabulous! You never cease to surprise me!
ChatGPT said:
I'm absolutely thrilled that you loved it! 🚀🎉
Now, imagine the moment when your AI Avatar V2.0 greets every visitor personally, in your own voice, in their mother tongue. It will be a game-changer! 🔥
Let Kishan explore the Python script, and whenever he's ready, we can expand it into a full-fledged interactive web experience.
Looking forward to your next big step. Just say the word, and we'll make it happen! 😊
(B) Email (drafted by ChatGPT) sent to Big Rock Server (10 Feb 2024):
Subject: Request for VPS Upgrade with GPU for AI Lip-Sync Processing
Dear Big Rock Support Team,
I am currently hosting my website www.HemenParekh.ai on a Big Rock VPS with the following specifications:
- Plan Name: Standard
- CPU: 2 Cores
- RAM: 2 GB
- Disk Space: 20 GB SSD
- Bandwidth: 1 TB
- OS: Ubuntu 20.04
I am planning to implement real-time AI-powered lip-sync on my website and require an upgraded VPS with a dedicated GPU for optimal performance. Could you please provide me with a quotation for a VPS plan with the following specifications?
Recommended VPS Upgrade Requirements:
✅ CPU: 4 to 8 Cores (Intel Xeon or AMD Ryzen)
✅ RAM: 16 GB or higher
✅ Storage: 100 GB SSD/NVMe
✅ GPU: NVIDIA A100 / RTX 3090 (or any available equivalent)
✅ Bandwidth: 5 TB per month (or Unlimited)
✅ OS: Ubuntu 22.04 (or latest available version)
Additionally, please confirm:
- Whether GPU acceleration is supported on your VPS plans.
- The estimated cost per month/year for the upgraded plan.
- Any one-time setup charges or migration assistance available.
I appreciate your assistance and look forward to your response.
(C) Today's Development (20 April 2025):
Reply by ChatGPT:
Hello Hemenbhai,
It's great to reconnect with you! I understand your interest in animating your avatar's face with lip-sync on www.HemenParekh.ai, and the challenges you've faced due to the high costs associated with GPU-based services.
The recent development of Kompact AI by IIT Madras and Ziroh Labs is indeed promising. This platform aims to run large AI models efficiently on standard CPUs, potentially reducing the reliance on expensive GPUs.
However, it's important to note that while Kompact AI shows potential in making AI more accessible, there have been critiques regarding the novelty and practicality of its achievements. Some in the tech community have replicated similar performance using existing tools, suggesting that the claimed breakthroughs may not be as groundbreaking as presented.
Regarding your specific application (animating your avatar's face with lip-sync), the feasibility of using Kompact AI depends on several factors:
- Model Compatibility: Kompact AI has optimized certain models to run on CPUs. If your animation tools can leverage these models, integration might be possible.
- Performance Requirements: Real-time face animation with lip-sync is computationally intensive. While Kompact AI aims to run models on CPUs, achieving smooth, real-time animation without GPUs may still be challenging.
- Integration Complexity: Adapting your existing setup to work with Kompact AI would require technical adjustments. The ease of this integration would depend on the flexibility of your current system and the support provided by Kompact AI.
Given these considerations, while Kompact AI offers an exciting avenue to explore, it may not fully meet the performance needs for high-quality, real-time avatar animation without further optimization or support.
Alternative Approaches:
- Pre-rendered Animations: Instead of real-time animation, consider generating pre-rendered videos of your avatar speaking common phrases. These can be triggered based on user interactions, reducing computational demands (a caching sketch follows this list).
- Hybrid Systems: Use CPU-based processing for less intensive tasks and reserve GPU resources for demanding processes. This approach balances performance and cost.
- Cloud-based Services: Some platforms offer pay-per-use models for avatar animation. This can be more cost-effective than maintaining dedicated GPU infrastructure.
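As a minimal sketch of the pre-rendered approach on the current CPU-only Big Rock VPS, reusing the greet_visitor() function from the 2024 starter script above (the module name and cache layout are illustrative):

import os
import urllib.request

from avatar_greeting import greet_visitor  # assumption: starter script saved as avatar_greeting.py

CACHE_DIR = "cache"

def cached_greeting(language):
    # Generate one generic greeting clip per language, download it once,
    # then serve the cached file to every later visitor; no GPU involved.
    path = os.path.join(CACHE_DIR, f"greeting_{language}.mp4")
    if not os.path.exists(path):
        os.makedirs(CACHE_DIR, exist_ok=True)
        video_url = greet_visitor("friend", language)  # one generic clip per language
        if video_url:
            urllib.request.urlretrieve(video_url, path)
    return path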
If you'd like, I can assist in evaluating these alternatives or provide more detailed guidance on integrating Kompact AI into your project. Let me know how you'd like to proceed!