GPT4Free

G4F - Media Documentation

This document outlines how to use the G4F (GPT4Free) library to generate and process various media types, including audio, images, and videos.


1. Audio Generation and Transcription

G4F supports audio generation through providers like PollinationsAI and audio transcription using providers like Microsoft_Phi_4.

Generate Audio with PollinationsAI:

```python
import asyncio
from g4f.client import AsyncClient
import g4f.Provider

async def main():
    client = AsyncClient(provider=g4f.Provider.PollinationsAI)

    response = await client.chat.completions.create(
        model="openai-audio",
        messages=[{"role": "user", "content": "Say good day to the world"}],
        audio={"voice": "alloy", "format": "mp3"},
    )
    response.choices[0].message.save("alloy.mp3")

asyncio.run(main())
```
Examples with other providers:

```python
from g4f.client import Client
from g4f.Provider import gTTS, EdgeTTS, Gemini, PollinationsAI

client = Client(provider=PollinationsAI)
response = client.media.generate("Hello", audio={"voice": "alloy", "format": "mp3"})
response.data[0].save("openai.mp3")

client = Client(provider=PollinationsAI)
response = client.media.generate("Hello", model="hypnosis-tracy")
response.data[0].save("hypnosis.mp3")

client = Client(provider=Gemini)
response = client.media.generate("Hello", model="gemini-audio")
response.data[0].save("gemini.ogx")

client = Client(provider=EdgeTTS)
response = client.media.generate("Hello", audio={"language": "en"})
response.data[0].save("edge-tts.mp3")
# The EdgeTTS provider also supports the audio parameters `rate`, `volume`, and `pitch`

client = Client(provider=gTTS)
response = client.media.generate("Hello", audio={"language": "en-US"})
response.data[0].save("google-tts.mp3")
# The gTTS provider also supports the audio parameters `tld` and `slow`
```

Transcribe an Audio File:

Some providers in G4F support audio inputs in chat completions, allowing you to transcribe audio files by instructing the model accordingly. This example demonstrates how to use the AsyncClient to transcribe an audio file asynchronously:

```python
import asyncio
from g4f.client import AsyncClient
import g4f.Provider

async def main():
    client = AsyncClient(provider=g4f.Provider.Microsoft_Phi_4)

    with open("audio.wav", "rb") as audio_file:
        response = await client.chat.completions.create(
            messages=[{"role": "user", "content": "Transcribe this audio"}],
            media=[[audio_file, "audio.wav"]],
            modalities=["text"],
        )

    print(response.choices[0].message.content)

if __name__ == "__main__":
    asyncio.run(main())
```

Explanation

- The `AsyncClient` is created with the Microsoft_Phi_4 provider, which accepts audio input in chat completions.
- The audio file is opened in binary mode (`"rb"`) and passed through the `media` parameter along with its filename.
- `modalities=["text"]` requests a plain-text response containing the transcription.

Notes

- Only providers that support audio input can be used for transcription.
- The instruction to transcribe is given as the message content, just like any other prompt.

This example complements the guide by showcasing how to handle audio inputs asynchronously, expanding on the multimodal capabilities of the G4F AsyncClient API.

More examples for Transcription:

JavaScript:

```javascript
const endpoint = "/v1/audio/transcriptions";
const formData = new FormData();
formData.append("file", audioFile);
formData.append("model", "");

try {
    const response = await fetch(endpoint, {
        method: "POST",
        body: formData,
    });
    if (!response.ok) {
        throw new Error(`API request failed: ${response.status}`);
    }
    const result = await response.json();
    console.log("Transcribed text:", result.text);
} catch (error) {
    console.error("Transcription error:", error);
    return null;
}
```
Python and requests:

```python
import requests

with open('audio.wav', 'rb') as audio_file:
    response = requests.post('http://localhost:8080/api/markitdown', files={'file': audio_file})

if response.status_code == 200:
    data = response.json()
    print(data['text'])
else:
    print(f"Error: {response.status_code}, {response.text}")
```
Python and openai:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="secret")

with open("audio.wav", "rb") as file:
    transcript = client.audio.transcriptions.create(
        model="",
        extra_body={"provider": "MarkItDown"},
        file=file
    )

print(transcript.text)
```

2. Image Generation

G4F can generate images from text prompts and provides options to retrieve images as URLs or base64-encoded strings.

Generate an Image:

```python
import asyncio
from g4f.client import AsyncClient

async def main():
    client = AsyncClient()

    response = await client.images.generate(
        prompt="a white siamese cat",
        model="flux",
        response_format="url",
    )

    image_url = response.data[0].url
    print(f"Generated image URL: {image_url}")

asyncio.run(main())
```
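If you want the image on disk rather than just its URL, a small standard-library helper can fetch it. This is a sketch, not part of the G4F API; the helper name and the usage shown in the comment are illustrative.

```python
from urllib.request import urlopen

def download_image(url: str, path: str) -> str:
    """Fetch the resource at `url` and write its bytes to `path`; returns the path."""
    with urlopen(url) as response, open(path, "wb") as out:
        out.write(response.read())
    return path

# Hypothetical usage with the URL from the example above:
# download_image(image_url, "siamese_cat.png")
```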

Base64 Response Format:

```python
import asyncio
from g4f.client import AsyncClient

async def main():
    client = AsyncClient()

    response = await client.images.generate(
        prompt="a white siamese cat",
        model="flux",
        response_format="b64_json",
    )

    base64_text = response.data[0].b64_json
    print(base64_text)

asyncio.run(main())
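The base64 string can be decoded back into raw image bytes with the standard library. The payload below is a stand-in for the `b64_json` value returned by the API, used so the snippet runs on its own.

```python
import base64

# Stand-in for the `b64_json` string from `response.data[0].b64_json`.
base64_text = base64.b64encode(b"\x89PNG\r\n\x1a\n...").decode("ascii")

# Decode the string and write the raw bytes to a file.
image_bytes = base64.b64decode(base64_text)
with open("cat.png", "wb") as f:
    f.write(image_bytes)
```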

Image Parameters:

- `width`: width of the generated image in pixels.
- `height`: height of the generated image in pixels.
- `n`: number of images to generate.
- `response_format`: `"url"` for an image URL or `"b64_json"` for a base64-encoded string.

Example with Image Parameters:

```python
import asyncio
from g4f.client import AsyncClient

async def main():
    client = AsyncClient()

    response = await client.images.generate(
        prompt="a white siamese cat",
        model="flux",
        response_format="url",
        width=512,
        height=512,
        n=2,
    )

    for image in response.data:
        print(f"Generated image URL: {image.url}")

asyncio.run(main())
```
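When `n > 1`, each returned image needs its own local filename. A minimal sketch, assuming each entry in `response.data` exposes a `.url` attribute as in the example above; the helper name is hypothetical.

```python
import os
from urllib.parse import urlparse

def filename_for(url: str, index: int) -> str:
    """Build a local name like image_0.jpg, keeping the URL's file extension."""
    ext = os.path.splitext(urlparse(url).path)[1] or ".png"
    return f"image_{index}{ext}"

# Hypothetical usage:
# for i, image in enumerate(response.data):
#     save the bytes at image.url to filename_for(image.url, i)
```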

3. Creating Image Variations

You can generate variations of an existing image using G4F.

Create Image Variations:

```python
import asyncio
from g4f.client import AsyncClient
from g4f.Provider import OpenaiChat

async def main():
    client = AsyncClient(image_provider=OpenaiChat)

    response = await client.images.create_variation(
        prompt="a white siamese cat",
        image=open("docs/images/cat.jpg", "rb"),
        model="dall-e-3",
    )

    image_url = response.data[0].url
    print(f"Generated image URL: {image_url}")

asyncio.run(main())
```

4. Video Generation

G4F supports video generation through providers like HuggingFaceMedia.

Generate a Video:

```python
import os
import asyncio
from g4f.client import AsyncClient
from g4f.Provider import HuggingFaceMedia

async def main():
    client = AsyncClient(
        provider=HuggingFaceMedia,
        api_key=os.getenv("HF_TOKEN")  # Your API key here
    )

    video_models = client.models.get_video()
    print("Available Video Models:", video_models)

    result = await client.media.generate(
        model=video_models[0],
        prompt="G4F AI technology is the best in the world.",
        response_format="url",
    )

    print("Generated Video URL:", result.data[0].url)

asyncio.run(main())
```

Video Parameters:

- `resolution`: output resolution (e.g. `"720p"`).
- `aspect_ratio`: aspect ratio of the video (e.g. `"16:9"`).
- `n`: number of videos to generate.
- `response_format`: `"url"` to receive the result as a URL.

Example with Video Parameters:

```python
import os
import asyncio
from g4f.client import AsyncClient
from g4f.Provider import HuggingFaceMedia

async def main():
    client = AsyncClient(
        provider=HuggingFaceMedia,
        api_key=os.getenv("HF_TOKEN")  # Your API key here
    )

    video_models = client.models.get_video()
    print("Available Video Models:", video_models)

    result = await client.media.generate(
        model=video_models[0],
        prompt="G4F AI technology is the best in the world.",
        resolution="720p",
        aspect_ratio="16:9",
        n=1,
        response_format="url",
    )

    print("Generated Video URL:", result.data[0].url)

asyncio.run(main())
```

Key Points:

- Audio can be generated with providers such as PollinationsAI, EdgeTTS, and gTTS, and transcribed with providers such as Microsoft_Phi_4.
- Images can be generated from text prompts, returned as URLs or base64 strings, and varied from an existing image.
- Videos are generated through providers such as HuggingFaceMedia; some providers require an API key (e.g. `HF_TOKEN`).


Return to Documentation