Merge branch 'jobs/instagram-repost/ai-description'
Some checks failed
Push image to registry / build-image (push) Failing after 1m23s
@@ -81,3 +81,8 @@ VITE_REVERB_APP_KEY="${REVERB_APP_KEY}"
VITE_REVERB_HOST="${REVERB_HOST}"
VITE_REVERB_PORT="${REVERB_PORT}"
VITE_REVERB_SCHEME="${REVERB_SCHEME}"

# AI LLM
LLM_HOST_URL="https://openai.com/api"
LLM_CHAT_MODEL="gpt-4o"
LLM_VISION_MODEL="gpt-4o-vision-preview"
@@ -54,6 +54,8 @@ RUN apk update && apk add --no-cache \
openssl \
linux-headers \
supervisor \
tesseract-ocr \
ffmpeg \
&& rm -rf /tmp/* /var/cache/apk/*

RUN docker-php-ext-configure zip && docker-php-ext-install zip
LLMPrompts.md (new file, 52 lines)
@@ -0,0 +1,52 @@

# What is this file?

This file lists the user prompts used to obtain, directly from the LLM, the prompts that the application then uses to generate its answers.

For example, for Instagram reel caption generation, this file lists a prompt that asks the LLM to produce the prompt, system message and output format that will then be used for that caption generation.

This approach follows the idea that the best way to engineer a prompt is to ask the target model to generate it directly.
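For illustration, a minimal sketch of how a prompt and system message obtained this way end up being used. It only assumes the `OpenAPIPrompt::generate()` service added in this commit; the prompt and system message strings are hypothetical placeholders.

```php
<?php
// Hypothetical wiring: the system message and user prompt below stand in for the
// ones the LLM generated when answering the prompts listed in this file.
use App\Services\AIPrompt\OpenAPIPrompt;

$llm = app(OpenAPIPrompt::class);

$raw = $llm->generate(
    config('llm.models.chat.name'),
    "Original Caption: credit : t/someuser ...",   // placeholder user prompt
    [],
    outputFormat: '{"type": "object", "properties": {"answer": {"type": "string"}}, "required": ["answer"]}',
    systemMessage: "You are an AI assistant specialized in creating Instagram Reel captions...", // generated system message
);

$caption = json_decode($raw, true)['answer'] ?? null;
```
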
# Prompts

The starting sentence is usually:
```
I’m using some LLM and I would need a prompt and a system message for every use case I will give you.
```

## Instagram

### Instagram Reel caption generation

```
I’m using some LLM and I would need a prompt, a system message and an output format for every use case I will give you.
The first one is when I’m trying to generate a caption for an Instagram Reel. For the moment, I can give the LLM the original caption of the Instagram reel it was downloaded from, and an LLM description of the video, or the joke behind it.
The caption must be short and fit the reel well. For example, if the reel is funny, the caption must be short and funny, while still relating to the reel. The caption must not describe the video the way the LLM description does.
The LLM can add some hashtags if it wants to and they seem appropriate.
Sometimes, the original caption will credit the original author, most of the time in a Twitter-like form (“credit : t/twitteruser”). Those credits can appear in the generated caption too, but I don’t want any Instagram account mention (“@instagramUser”), because most of the time it is there to push people to follow the account the reel was downloaded from. The use of emojis is encouraged, but not too much, and it must not look silly or overdone.
```

## Video Descriptor

I’m using some LLM and I would need a prompt and a system message for every use case I will give you.

The LLM here will be used to describe an Instagram Reel (video). Each screenshot of that video will be described using an LLM, prompt, system message and output format. The descriptions of all the screenshots will be given to this LLM, which will try to recreate the video from the screenshot descriptions and describe it.
The required prompt here is for the LLM that will compile the descriptions into one, try to understand the video and describe it. I’m particularly interested in the joke behind the reel if there is one.

This is an example of a screenshot description by an LLM: “The image shows a close-up of a person's hands holding what appears to be a brown object with a plastic covering, possibly food wrapped in paper or foil. There is also a small portion visible at the top right corner, which seems to be a red and white label. The focus of the image is on the hands holding the object.”

Most of the descriptions won’t make sense, so some details should be omitted. For example, one screenshot description could say the main subject is a car, and another one 3 seconds later in the video could say the main subject is a cat. You could say the car transformed into a cat, but it would be safer to assume that one of the descriptions is wrong and the main character was a cat all along, because another description from the same video also says the main subject is a cat.
It is safe to say that most analysed videos will be of bad quality, which means the screenshot descriptions can vary a lot.

### Screenshot descriptor

```
I’m using some LLM and I would need a prompt, a system message and an output format for every use case I will give you.
The first one must describe a screenshot from a video. Each screenshot of that video will be described using the same LLM, prompt, system message and output format. The descriptions of all the screenshots will be given to another LLM that will try to recreate the video from the screenshot descriptions and describe it.
The required prompt here is the one that describes a screenshot. The LLM will only be given the screenshot as input. I need the LLM to describe the given screenshot. No need to specify that it is a screenshot. The description must specify the scene, the character or main subject, and any text present on the screenshot; most of the time this text will be a caption added during video editing and may use emojis.

The LLM used here is llava:7b-v1.6-mistral-q4_1; it is not the best for text generation, but it is very powerful when using its vision capability.
```

The last part is personal; I included it because I gave the prompt to another LLM than the one actually used, since llava wouldn’t give me a good prompt.
@@ -33,7 +33,7 @@ abstract class BrowserJob implements ShouldQueue

public int $jobId;

public $timeout = 500;
public $timeout = 300; // 5 minutes

public function __construct(int $jobId)
{
@@ -53,6 +53,7 @@ abstract class BrowserJob implements ShouldQueue

$this->browse(function (Browser $browser) use ($callback, &$log) {
try {
$browser->driver->manage()->timeouts()->implicitlyWait(20);
$log = $callback($browser);
// } catch (Exception $e) {
//     $browser->screenshot("failure-{$this->jobId}");
@@ -160,7 +161,7 @@ abstract class BrowserJob implements ShouldQueue
'--disable-setuid-sandbox',
'--whitelisted-ips=""',
'--disable-dev-shm-usage',
'--user-data-dir=/home/seluser/profile/',
'--user-data-dir=/home/seluser/profile/nigga/', // seems that selenium doesn't like docker having a volume on the exact same folder ("session not created: probably user data directory is already in use")
])->all());

return RemoteWebDriver::create(
@@ -169,6 +170,13 @@ abstract class BrowserJob implements ShouldQueue
ChromeOptions::CAPABILITY,
$options
)
->setCapability('timeouts', [
'implicit' => 20000, // 20 seconds
'pageLoad' => 300000, // 5 minutes
'script' => 30000, // 30 seconds
]),
4000,
$this->timeout * 1000
);
}

@@ -17,11 +17,14 @@ use Illuminate\Contracts\Queue\ShouldBeUniqueUntilProcessing;
use Illuminate\Support\Collection;
use Illuminate\Support\Facades\Log;
use Laravel\Dusk\Browser;
use App\Services\AIPrompt\OpenAPIPrompt;

class InstagramRepostJob extends BrowserJob implements ShouldBeUniqueUntilProcessing
{
// === CONFIGURATION ===

public $timeout = 1800; // 30 minutes

private const APPROXIMATIVE_RUNNING_MINUTES = 2;

private Collection $jobInfos;
@@ -29,6 +32,10 @@ class InstagramRepostJob extends BrowserJob implements ShouldBeUniqueUntilProcessing

protected IInstagramVideoDownloader $videoDownloader;

protected ReelDescriptor $ReelDescriptor;

protected OpenAPIPrompt $openAPIPrompt;

protected string $downloadFolder = "app/Browser/downloads/InstagramRepost/";

/**
@@ -40,12 +47,14 @@ class InstagramRepostJob extends BrowserJob implements ShouldBeUniqueUntilProcessing
*/
protected InstagramDescriptionPipeline $descriptionPipeline;

public function __construct($jobId = 4)
public function __construct($jobId = 4, ReelDescriptor $ReelDescriptor = null, OpenAPIPrompt $openAPIPrompt = null)
{
parent::__construct($jobId);

$this->downloadFolder = base_path($this->downloadFolder);
$this->videoDownloader = new YTDLPDownloader();
$this->ReelDescriptor = $ReelDescriptor ?? app(ReelDescriptor::class);
$this->openAPIPrompt = $openAPIPrompt ?? app(OpenAPIPrompt::class);
$this->descriptionPipeline = new InstagramDescriptionPipeline([
// Add steps to the pipeline here
new DescriptionPipeline\RemoveAccountsReferenceStep(),
@@ -152,13 +161,17 @@ class InstagramRepostJob extends BrowserJob implements ShouldBeUniqueUntilProcessing
*/
$downloadedReels = [];
foreach ($toDownloadReels as $repost) {
$downloadInfos = $this->downloadReel(
$browser,
$repost
);

$downloadedReels[] = [
$repost,
$this->downloadReel(
$browser,
$repost
)
$downloadInfos
];

$this->describeReel($repost, $downloadInfos);
}

$this->jobRun->addArtifact(new JobArtifact([
@@ -282,6 +295,15 @@ class InstagramRepostJob extends BrowserJob implements ShouldBeUniqueUntilProcessing
return $videoInfo;
}

protected function describeReel(InstagramRepost $reel, IInstagramVideo $videoInfo): void
{
// Set the video description to the reel description
$reel->video_description = $this->ReelDescriptor->getDescription($videoInfo->getFilename());
$reel->save();

Log::info("Reel description set: {$reel->reel_id} - {$reel->video_description}");
}

protected function repostReel(Browser $browser, InstagramRepost $reel, IInstagramVideo $videoInfo): bool
{
try {
@@ -321,16 +343,17 @@ class InstagramRepostJob extends BrowserJob implements ShouldBeUniqueUntilProcessing
$this->clickNext($browser); // Skip cover photo and trim

// Add a caption
$captionText = $this->descriptionPipeline->process($videoInfo->getDescription());
$captionText = $this->descriptionPipeline->process($this->getReelCaption($reel, $videoInfo));
$this->pasteText($browser, $captionText, 'div[contenteditable]');

sleep(2); // Wait for the caption to be added

if (config("app.environment") !== "local") { // Don't share the post in local environment
if (config("app.env") !== "local") { // Don't share the post in local environment
$this->clickNext($browser); // Share the post
}

sleep(5); // Wait for the post to be completed
sleep(7); // Wait for the post to be completed
$this->removePopups($browser);

// Check if the post was successful
try {
@@ -364,6 +387,56 @@ class InstagramRepostJob extends BrowserJob implements ShouldBeUniqueUntilProcessing
}
}

private function getReelCaption(InstagramRepost $reel, IInstagramVideo $videoInfo): string
{
if (isset($reel->instagram_caption)) {
return $reel->instagram_caption;
}

// Get the reel description from the database or the video info
$reelDescription = $reel->video_description;
$originalDescription = $videoInfo->getDescription();
$llmAnswer = $this->openAPIPrompt->generate(
config('llm.models.chat.name'),
"Original Caption: {$originalDescription}
Video Description/Directive: {$reelDescription}",
[],
outputFormat: '{"type": "object", "properties": {"answer": {"type": "string"}}, "required": ["answer"]}',
systemMessage: "You are an AI assistant specialized in creating engaging and concise Instagram Reel captions. Your primary task is to transform the provided original caption (often from Twitter) and description/directions into a fresh, unique, but still relevant caption for Instagram Reels format.

Key instructions:
1. **Analyze Input:** You will receive two things: an *original reel caption* (usually starting with \"credit:\" or mentioning a Twitter handle like `t/TwitterUser`), and either a *video description* or explicit directions about the joke/idea behind the video.
2. **Transform, Don't Reproduce:** Your output must be significantly different from the original provided caption. It should capture the essence of the content described but phrase it anew – often with humor if appropriate.
3. **Keep it Short & Punchy:** Instagram Reels thrive on quick engagement. Prioritize brevity (ideally under two lines, or three lines max) and impact. Make sure your caption is concise enough for fast-scroll viewing.
4. **Maintain the Core Idea:** The new caption must directly relate to the video's content/direction/joke without simply restating it like a description would. Focus on what makes the reel *interesting* or *funny* in its own right.
5. **Preserve Original Credit (Optional):** If an explicit \"credit\" line is provided, you may incorporate this into your new caption naturally, perhaps using `(via...)` or similar phrasing if it fits well and doesn't sound awkward. **Do not** include any original Instagram account mentions (@handles). They are often intended for promotion which isn't our goal.
6. **Use Emoji Judiciously:** Incorporate relevant emojis to enhance the tone (funny, relatable, etc.) or add visual interest. Use them purposefully and in moderation – they should complement the caption, not overwhelm it.
7. **Add Hashtags (Optional but Recommended):** Generate a few relevant Instagram hashtags automatically at the end of your output to increase visibility. Keep these organic to the content and avoid forcing irrelevant tags.

Your response structure is as follows:
- The generated caption (your core answer).
- Then, if you generate any hashtags, list them on the next line(s) prefixed with `#`.

Example Input Structure:
Original Caption: credit: t/otherhandle This banana is looking fly today!
Video Description/Directive: A man walks into a store holding a banana and wearing sunglasses. He looks around confidently before leaving.

Your answer should only contain the generated caption, and optionally hashtags if relevant.

Remember to be creative and ensure the generated caption feels like something you would see naturally on an Instagram Reel. Aim for personality and relevance.
",
keepAlive: true,
shouldThink: config('llm.models.chat.shouldThink')
);
$llmAnswer = json_decode($llmAnswer, true)['answer'] ?? null;
if ($llmAnswer !== null) {
$reel->instagram_caption = $llmAnswer;
$reel->save();
Log::info("Reel caption generated: {$reel->reel_id} - {$llmAnswer}");
}
return $llmAnswer;
}

private function clickNext(Browser $browser) {
$nextButton = $browser->driver->findElement(WebDriverBy::xpath('//div[contains(text(), "Next") or contains(text(), "Share")]'));
$nextButton->click();
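Because of the `outputFormat` JSON schema used in `getReelCaption()` above, the raw string returned by `generate()` is a small JSON object whose `answer` field is then decoded. A purely illustrative (made-up) example of such a response:

```
{"answer": "POV: the banana understood the assignment 🍌😎\n#reels #funny #meme"}
```
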
app/Browser/Jobs/InstagramRepost/ReelDescriptor.php (new file, 11 lines)
@@ -0,0 +1,11 @@
<?php

namespace App\Browser\Jobs\InstagramRepost;

use App\Services\AIPrompt\OpenAPIPrompt;
use App\Services\FileTools\OCR\IImageOCR;

class ReelDescriptor extends \App\Services\FileTools\VideoDescriptor\OCRLLMVideoDescriptor
{
public const DESCRIPTION_PROMPT = "Analyze this Instagram Reel sequence. You are given information for each individual screenshot/analysis from the video:";
}
app/Providers/AIPromptServiceProvider.php (new file, 27 lines)
@@ -0,0 +1,27 @@
<?php

namespace App\Providers;

use App\Services\AIPrompt\OpenAPIPrompt;
use Illuminate\Support\ServiceProvider;

class AIPromptServiceProvider extends ServiceProvider
{
/**
* Register services.
*/
public function register(): void
{
$this->app->singleton(OpenAPIPrompt::class, function ($app) {
return new OpenAPIPrompt();
});
}

/**
* Bootstrap services.
*/
public function boot(): void
{
//
}
}
app/Providers/ImageOCRServiceProvider.php (new file, 27 lines)
@@ -0,0 +1,27 @@
<?php

namespace App\Providers;

use App\Services\FileTools\OCR\IImageOCR;
use Illuminate\Support\ServiceProvider;

class ImageOCRServiceProvider extends ServiceProvider
{
/**
* Register services.
*/
public function register(): void
{
$this->app->singleton(IImageOCR::class, function ($app) {
return new \App\Services\FileTools\OCR\TesseractImageOCR();
});
}

/**
* Bootstrap services.
*/
public function boot(): void
{
//
}
}
app/Providers/VideoDescriptorServiceProvider.php (new file, 36 lines)
@@ -0,0 +1,36 @@
<?php

namespace App\Providers;

use App\Services\AIPrompt\OpenAPIPrompt;
use App\Services\FileTools\OCR\IImageOCR;
use App\Services\FileTools\VideoDescriptor\IVideoDescriptor;
use Illuminate\Support\ServiceProvider;

class VideoDescriptorServiceProvider extends ServiceProvider
{
/**
* Register services.
*/
public function register(): void
{
// Register the VideoDescriptor service
$this->app->singleton(IVideoDescriptor::class, function ($app) {
return new \App\Services\FileTools\VideoDescriptor\LLMFullVideoDescriptor(
$app->make(IImageOCR::class),
$app->make(OpenAPIPrompt::class)
);
});

// Register the ReelDescriptor service
$this->app->singleton(\App\Browser\Jobs\InstagramRepost\ReelDescriptor::class);
}

/**
* Bootstrap services.
*/
public function boot(): void
{
//
}
}
app/Services/AIPrompt/IAIPrompt.php (new file, 10 lines)
@@ -0,0 +1,10 @@
<?php

namespace App\Services\AIPrompt;

interface IAIPrompt
{
public function generate(string $model, string $prompt, array $images = [], string $outputFormat = "json", string $systemMessage = null, bool $keepAlive = true, bool $shouldThink = false): string;

//public function chat(string $model, string $prompt, array $images = []): string;
}
app/Services/AIPrompt/OpenAPIPrompt.php (new file, 137 lines)
@@ -0,0 +1,137 @@
<?php

namespace App\Services\AIPrompt;

use Uri;

/**
* Use OpenAI API to get answers from a model.
*/
class OpenAPIPrompt implements IAIPrompt
{
private string $host;
private ?string $token = null;

public function __construct(string $host = null) {
$this->host = $host ?? config('llm.api.host');
if (config('llm.api.token')) {
$this->token = config('llm.api.token');
}
}

private function getHeaders(): array
{
return [
'Authorization: ' . ($this->token ? 'Bearer ' . $this->token : ''),
'Content-Type: application/json',
];
}

/**
* Call the OpenAI API with the given endpoint and body.
* @param string $endpoint
* @param string $body
* @throws \Exception
* @return string
*/
private function callAPI(string $endpoint, string $body): string
{
$url = $this->host . $endpoint;

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, $this->getHeaders());
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $body);
$response = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);

if ($httpCode !== 200) {
throw new \Exception("Error calling OpenAI API: HTTP $httpCode - $response");
}
return $response;
}

/**
* Call the OpenAI API generate endpoint to generate a response to a prompt.
* @param string $model
* @param string $prompt
* @param array $images
* @return string
*/
public function generate(string $model, string $prompt, array $images = [], string $outputFormat = null, string $systemMessage = null, bool $keepAlive = true, bool $shouldThink = false): string
{
/*
Generate a completion

POST /api/generate

Generate a response for a given prompt with a provided model. This is a streaming endpoint, so there will be a series of responses. The final response object will include statistics and additional data from the request.
Parameters

model: (required) the model name
prompt: the prompt to generate a response for
suffix: the text after the model response
images: (optional) a list of base64-encoded images (for multimodal models such as llava)
think: (for thinking models) should the model think before responding?

Advanced parameters (optional):

format: the format to return a response in. Format can be json or a JSON schema
options: additional model parameters listed in the documentation for the Modelfile such as temperature
system: system message (overrides what is defined in the Modelfile)
template: the prompt template to use (overrides what is defined in the Modelfile)
stream: if false the response will be returned as a single response object, rather than a stream of objects
raw: if true no formatting will be applied to the prompt. You may choose to use the raw parameter if you are specifying a full templated prompt in your request to the API
keep_alive: controls how long the model will stay loaded into memory following the request (default: 5m)
context (deprecated): the context parameter returned from a previous request to /generate, this can be used to keep a short conversational memory

Structured outputs

Structured outputs are supported by providing a JSON schema in the format parameter. The model will generate a response that matches the schema. See the structured outputs example below.
JSON mode

Enable JSON mode by setting the format parameter to json. This will structure the response as a valid JSON object. See the JSON mode example below.

Important

**It's important to instruct the model to use JSON in the prompt. Otherwise, the model may generate large amounts of whitespace.**
*/

// Transform the images to base64
foreach ($images as &$image) {
if (file_exists($image)) {
$image = base64_encode(file_get_contents($image));
}
}

$body = [
'model' => $model,
'prompt' => $prompt,
'images' => $images,
'think' => $shouldThink,
'stream' => false,
];

if ($systemMessage !== null) {
$body['system'] = $systemMessage;
}
if ($outputFormat !== null) {
$body['format'] = json_decode($outputFormat);
}
if (!$keepAlive) {
$body['keep_alive'] = "0m";
}

$body = json_encode($body);

dump($body);
$response = $this->callAPI('/api/generate', $body);
$decodedResponse = json_decode($response, true);
if (json_last_error() !== JSON_ERROR_NONE) {
throw new \Exception("Error decoding JSON response: " . json_last_error_msg());
}
return $decodedResponse['response'] ?? '';
}
}
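For reference, a minimal usage sketch of the `generate()` method defined above. The host is a hypothetical placeholder; the model name is the llava model mentioned in LLMPrompts.md.

```php
<?php
// Minimal sketch, assuming an Ollama-compatible host running locally (placeholder URL).
use App\Services\AIPrompt\OpenAPIPrompt;

$prompt = new OpenAPIPrompt('http://localhost:11434');

$raw = $prompt->generate(
    'llava:7b-v1.6-mistral-q4_1',                 // model
    'Describe this image in one sentence.',       // prompt
    ['/tmp/screenshot_1.png'],                    // image paths, base64-encoded inside generate()
    outputFormat: '{"type": "object", "properties": {"answer": {"type": "string"}}, "required": ["answer"]}',
    systemMessage: 'Answer directly.',
    keepAlive: false,
    shouldThink: false
);

$answer = json_decode($raw, true)['answer'] ?? null;
```
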
app/Services/FileTools/OCR/IImageOCR.php (new file, 14 lines)
@@ -0,0 +1,14 @@
<?php

namespace App\Services\FileTools\OCR;

interface IImageOCR
{
/**
* Perform OCR on the given file.
*
* @param string $filePath The path to the file to be processed.
* @return string The extracted text from the file.
*/
public function performOCR(string $filePath): string;
}
app/Services/FileTools/OCR/TesseractImageOCR.php (new file, 15 lines)
@@ -0,0 +1,15 @@
<?php

namespace App\Services\FileTools\OCR;
use thiagoalessio\TesseractOCR\TesseractOCR;

class TesseractImageOCR implements IImageOCR
{
/**
* @inheritDoc
*/
public function performOCR(string $filePath): string {
$tesseract = new TesseractOCR($filePath);
return $tesseract->run();
}
}
@@ -0,0 +1,59 @@
<?php

namespace App\Services\FileTools\VideoDescriptor;

use App\Services\FileTools\VideoDescriptor\IVideoDescriptor;
use Log;

abstract class AbstractLLMVideoDescriptor implements IVideoDescriptor
{
public const MAX_FRAMES = 5;

abstract public function getDescription(string $filePath): ?string;

/**
* Cut the video into screenshots.
* Using ffmpeg to cut the video into screenshots at regular intervals.
* The screenshots will be saved in a temporary directory.
* @param string $filePath
* @return array List of entries, each with a 'screenshot' (file path) and a 'timestamp' (seconds) key.
*/
protected function cutVideoIntoScreenshots(string $filePath): array
{
$tempDir = sys_get_temp_dir() . '/video_screenshots';
if (!is_dir($tempDir)) {
mkdir($tempDir, 0777, true);
}
else {
// Clear the directory if it already exists
array_map('unlink', glob($tempDir . '/*'));
}

Log::info("Cutting video into screenshots: $filePath");

$videoDuration = shell_exec("ffprobe -v error -show_entries format=duration -of csv=p=0 " . escapeshellarg($filePath));
if ($videoDuration === null) {
Log::error("Failed to get video duration for file: $filePath");
return [];
}
$videoDuration = floatval($videoDuration);

$framesInterval = ceil($videoDuration / self::MAX_FRAMES);
$fps = 1/$framesInterval; // Frames per second for the screenshots

$outputPattern = $tempDir . '/screenshot_%d.png';
$command = "ffmpeg -i " . escapeshellarg($filePath) . " -vf fps={$fps} " . escapeshellarg($outputPattern);
exec($command);

// Collect all screenshots
$screenshots = glob($tempDir . '/screenshot_*.png');
$array = [];
foreach ($screenshots as $screenshot) {
$array[] = [
"screenshot" => $screenshot,
"timestamp" => floor(sizeof($array) * $framesInterval),
];
}
return $array;
}
}
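As a concrete illustration of the return shape (hypothetical paths, values derived from the code above): for a roughly 10-second video with `MAX_FRAMES = 5`, `$framesInterval` is 2 and `$fps` is 0.5, so `cutVideoIntoScreenshots()` returns something like:

```php
[
    ['screenshot' => '/tmp/video_screenshots/screenshot_1.png', 'timestamp' => 0],
    ['screenshot' => '/tmp/video_screenshots/screenshot_2.png', 'timestamp' => 2],
    ['screenshot' => '/tmp/video_screenshots/screenshot_3.png', 'timestamp' => 4],
    ['screenshot' => '/tmp/video_screenshots/screenshot_4.png', 'timestamp' => 6],
    ['screenshot' => '/tmp/video_screenshots/screenshot_5.png', 'timestamp' => 8],
]
```
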
app/Services/FileTools/VideoDescriptor/IVideoDescriptor.php (new file, 14 lines)
@@ -0,0 +1,14 @@
<?php

namespace App\Services\FileTools\VideoDescriptor;

interface IVideoDescriptor
{
/**
* Get the video description.
*
* @param string $filePath The path to the video file.
* @return string|null The description of the video, or null if none could be generated.
*/
public function getDescription(string $filePath): ?string;
}
@@ -0,0 +1,64 @@
<?php

namespace App\Services\FileTools\VideoDescriptor;

use App\Services\AIPrompt\OpenAPIPrompt;
use App\Services\FileTools\OCR\IImageOCR;

class LLMFullVideoDescriptor extends AbstractLLMVideoDescriptor implements IVideoDescriptor
{
public const DESCRIPTION_PROMPT = "Describe the video based on the screenshots. Each screenshot has a timestamp of when in the video the screenshot was taken. Do not specify that it is a video, just describe the video. Do not describe the screenshots one by one, try to make sense out of all the screenshots, what could be the video about ? What caption is attached to the video ? is it a meme ? If yes, what is the joke ? Be the most descriptive without exceeding 5000 words.\n";

public function __construct(public IImageOCR $ocr, public OpenAPIPrompt $llm) {
}

public function getDescription(string $filePath): ?string
{
/*
1. Cut videos in screenshots
2. Ask an LLM to describe the video with all the screenshots
*/

// Step 1: Cut video into screenshots
$screenshots = $this->cutVideoIntoScreenshots($filePath);

if (empty($screenshots)) {
throw new \Exception("No screenshots were generated from the video {$filePath}.");
}

// Combine the OCR text of all screenshots into a single prompt
$combinedDescription = '';
$screenshotCount = 0;
foreach ($screenshots as $values) {
$screenshot = $values['screenshot'];
$timestamp = $values['timestamp'];

$screenshotCount++;
$combinedDescription .= "Screenshot: {$screenshotCount}\n";
$combinedDescription .= "Timestamp: {$timestamp}s\n"; // TODO Cut the video in smaller parts when the video is short
$ocrDescription = $this->ocr->performOCR($screenshot);
$ocrDescription = empty($ocrDescription) ? 'No text found' : $ocrDescription;
$combinedDescription .= "OCR: {$ocrDescription}\n"; // Perform OCR on the screenshot
$combinedDescription .= "\n";
}
$combinedDescription = trim($combinedDescription);

// Step 2: Ask an LLM to describe the video based on the screenshots and the combined OCR text
$llmDescription = $this->llm->generate(
config('llm.models.vision.name'),
static::DESCRIPTION_PROMPT . $combinedDescription,
images: array_map(function ($screenshot) {return $screenshot["screenshot"];}, $screenshots), // Pass the screenshots to the LLM
outputFormat: '{"type": "object", "properties": {"answer": {"type": "string"}}, "required": ["answer"]}',
systemMessage: "The user will ask something. Give your direct answer to that.",
keepAlive: true,
shouldThink: config('llm.models.vision.shouldThink')
);

$llmDescription = json_decode($llmDescription, true)['answer'] ?? null;
if (empty($llmDescription)) {
$llmDescription = null;
}

return $llmDescription;
}
}
app/Services/FileTools/VideoDescriptor/OCRLLMVideoDescriptor.php (new file, 144 lines)
@@ -0,0 +1,144 @@
<?php

namespace App\Services\FileTools\VideoDescriptor;

use App\Services\AIPrompt\OpenAPIPrompt;
use App\Services\FileTools\OCR\IImageOCR;

class OCRLLMVideoDescriptor extends AbstractLLMVideoDescriptor implements IVideoDescriptor
{
public const DESCRIPTION_PROMPT = "Analyze this Video sequence. You are given information for each individual screenshot/analysis from the video:";

public function __construct(public IImageOCR $ocr, public OpenAPIPrompt $llm) {
}

public function getDescription(string $filePath): ?string
{
/*
1. Cut videos in screenshots
2. Use OCR to extract text from screenshots
3. Use LLM to generate a description of the screenshot
4. Combine the descriptions of all screenshots into a single description
5. Ask an LLM to describe the video
*/

// Step 1: Cut video into screenshots
$screenshots = $this->cutVideoIntoScreenshots($filePath);

if (empty($screenshots)) {
throw new \Exception("No screenshots were generated from the video {$filePath}.");
}

// Step 2 & 3: Use OCR to extract text and LLM to get description from screenshots
$descriptions = [];
foreach ($screenshots as $values) {
$screenshot = $values['screenshot'];
$timestamp = $values['timestamp'];

$descriptions[$screenshot] = [];

$ocrDescription = $this->ocr->performOCR($screenshot);
$ocrDescription = empty($ocrDescription) ? 'No text found' : $ocrDescription;
$descriptions[$screenshot]['ocr'] = $ocrDescription;
dump($ocrDescription); // DEBUG

$llmDescription = $this->llm->generate(
config('llm.models.vision.name'),
"Describe this image in detail, breaking it down into distinct parts as follows:

1. **Scene Description:** Describe the overall setting and environment of the image (e.g., forest clearing, futuristic city street, medieval castle interior).
2. **Main Subject/Character(s):** Detail what is happening with the primary character or subject present in the frame.
3. **Text Description (if any):** If there are visible text elements (like words, letters, captions), describe them exactly as they appear and note their location relative to other elements. This includes any emojis used in captions, describing their visual appearance and likely meaning.
4. **Summary:** Briefly summarize the key content of the image for clarity.
5. **Joke:** If the image is part of a meme or humorous content, describe the joke or humorous element present in the image. Do not include this part if you are not sure to understand the joke/meme.

Format your response strictly using numbered lines corresponding to these five points (1., 2., 3., 4., 5.). Do not use markdown formatting or extra text outside these lines; simply list them sequentially as plain text output.",
images: [$screenshot],
outputFormat: '{"type": "object", "properties": {"answer": {"type": "string"}}, "required": ["answer"]}',
systemMessage: "You are an image understanding AI specialized in describing visual scenes accurately and concisely. Your task is solely to describe the content of the provided image based on what you can visually perceive.

Please analyze the image carefully and provide a description focusing purely on the visible information without generating any text about concepts, interpretations, or future actions beyond the immediate scene. Describe everything that is clearly depicted.",
keepAlive: $screenshot != end($screenshots), // Keep alive for all but the last screenshot
shouldThink: config('llm.models.vision.shouldThink')
);
dump($llmDescription); // DEBUG
$descriptions[$screenshot]['text'] = json_decode($llmDescription, true)['answer'] ?? 'No description generated';
}

// HERE COULD BE SOME INTERMEDIATE PROCESSING OF DESCRIPTIONS

// Step 4: Combine the descriptions of all screenshots into a single description
$combinedDescription = '';
$screenshotCount = 0;
foreach ($screenshots as $values) {
$screenshot = $values['screenshot'];
$timestamp = $values['timestamp'];

$screenshotCount++;
$description = $descriptions[$screenshot] ?? [];

$combinedDescription .= "Screenshot: {$screenshotCount}\n";
$combinedDescription .= "Timestamp: {$timestamp}s\n"; // TODO Cut the video in smaller parts when the video is short
$combinedDescription .= "OCR: {$description['ocr']}\n";
$combinedDescription .= "LLM Description: {$description['text']}\n";
$combinedDescription .= "\n";
}
$combinedDescription = trim($combinedDescription);

// Step 5: Ask an LLM to describe the video based on the combined descriptions
$llmDescription = $this->llm->generate(
config('llm.models.chat.name'),
static::DESCRIPTION_PROMPT . $combinedDescription . "\n\nBased only on these frame analyses, please provide:

A single, concise description that captures the main action or theme occurring in the reel across all frames.
Identify and describe any joke or humorous element present in the video if you can discern one.

Important Considerations

Remember that most videos are of poor quality; frame descriptions might be inaccurate, vague, or contradictory due to blurriness or fast cuts.
Your task is synthesis: focus on the overall impression and sequence, not perfecting each individual piece of information. Some details mentioned in one analysis may simply be incorrect or misidentified from another perspective.

Analyze all provided frames (separated by --- for clarity) to understand what's happening. Then, synthesize this understanding into point 1 above and identify the joke if present as per point 2.",
outputFormat: '{"type": "object", "properties": {"answer": {"type": "string"}}, "required": ["answer"]}',
systemMessage: "You are an expert social media content analyst specializing in interpreting Instagram Reels. Your primary function is to generate a comprehensive description and identify any underlying humor or joke in a given video sequence. You will be provided with individual frame analyses, each containing:

Screenshot Number: The sequential number of the frame.
Timestamp: When that specific frame occurs within the reel.
OCR Text Result: Raw text extracted from the image content using OCR (Optical Character Recognition), which may contain errors or misinterpretations (\"may appear\" descriptions).
LLM Description of Screenshot: A textual interpretation of what's visible in the frame, based on previous LLM processing.

Please note:

The individual frame analyses can be inconsistent due to low video quality (e.g., blurriness) or rapid scene changes where details are hard to distinguish.
Your task is not to perfect each frame description but to understand the overall sequence and likely narrative, focusing on identifying any joke, irony, absurdity, or humorous transformation occurring across these frames.

Your response should be structured as follows:

Overall Video Description: Provide a concise summary of what happens in the reel based on the combined information from all the provided screenshots.
Humor/Joke Identification (If Applicable): If you can discern any joke or humorous element, explicitly state it and explain how the sequence of frames contributes to this.

Instructions for Synthesis:

Focus on identifying recurring elements, main subject(s), consistent actions/actions that seem unlikely (potential contradiction).
Look for patterns where details change rapidly or absurdly.
Prioritize information from descriptions over relying solely on OCR text if the description seems more plausible. Ignore minor inconsistencies between frames unless they clearly contradict a central theme or joke premise.
Be ready to point out where the humor lies, which might involve unexpected changes, wordplay captured by OCR errors in the context of the visual action described, absurdity, or irony.",
keepAlive: true,
shouldThink: config('llm.models.chat.shouldThink')
);

$llmDescription = json_decode($llmDescription, true)['answer'] ?? null;
if (empty($llmDescription)) {
$llmDescription = null;
}

dump($llmDescription); // DEBUG

return $llmDescription;
}
}
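For clarity, the combined description that step 5 prepends with `DESCRIPTION_PROMPT` and sends to the chat model looks roughly like the following (the OCR text and per-frame descriptions are illustrative, not real output):

```
Screenshot: 1
Timestamp: 0s
OCR: No text found
LLM Description: 1. A kitchen counter in daylight. 2. A person reaches for a wrapped object. ...

Screenshot: 2
Timestamp: 2s
OCR: WAIT FOR IT 😂
LLM Description: 1. Same kitchen. 2. The person unwraps the object. 3. Caption "WAIT FOR IT 😂" at the top. ...
```
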
@@ -1,7 +1,10 @@
<?php

return [
App\Providers\AIPromptServiceProvider::class,
App\Providers\AppServiceProvider::class,
App\Providers\BrowserJobsServiceProvider::class,
App\Providers\ImageOCRServiceProvider::class,
App\Providers\TelescopeServiceProvider::class,
App\Providers\VideoDescriptorServiceProvider::class,
];
@@ -19,6 +19,7 @@
"laravel/telescope": "^5.5",
"laravel/tinker": "^2.9",
"norkunas/youtube-dl-php": "dev-master",
"thiagoalessio/tesseract_ocr": "^2.13",
"tightenco/ziggy": "^2.0"
},
"require-dev": {
composer.lock (generated, 53 lines)
@@ -4,7 +4,7 @@
"Read more about it at https://getcomposer.org/doc/01-basic-usage.md#installing-dependencies",
"This file is @generated automatically"
],
"content-hash": "9a964008040d9ce219547515fe65dd86",
"content-hash": "20c0488746a861aecc1187374ca0aa7f",
"packages": [
{
"name": "brick/math",
@@ -7038,6 +7038,55 @@
],
"time": "2025-01-17T11:39:41+00:00"
},
{
"name": "thiagoalessio/tesseract_ocr",
"version": "2.13.0",
"source": {
"type": "git",
"url": "https://github.com/thiagoalessio/tesseract-ocr-for-php.git",
"reference": "232a8cb9d571992f9bd1e263f2f6909cf6c173a1"
},
"dist": {
"type": "zip",
"url": "https://api.github.com/repos/thiagoalessio/tesseract-ocr-for-php/zipball/232a8cb9d571992f9bd1e263f2f6909cf6c173a1",
"reference": "232a8cb9d571992f9bd1e263f2f6909cf6c173a1",
"shasum": ""
},
"require": {
"php": "^5.3 || ^7.0 || ^8.0"
},
"require-dev": {
"phpunit/php-code-coverage": "^2.2.4 || ^9.0.0"
},
"type": "library",
"autoload": {
"psr-4": {
"thiagoalessio\\TesseractOCR\\": "src/"
}
},
"notification-url": "https://packagist.org/downloads/",
"license": [
"MIT"
],
"authors": [
{
"name": "thiagoalessio",
"email": "thiagoalessio@me.com"
}
],
"description": "A wrapper to work with Tesseract OCR inside PHP.",
"keywords": [
"OCR",
"Tesseract",
"text recognition"
],
"support": {
"irc": "irc://irc.freenode.net/tesseract-ocr-for-php",
"issues": "https://github.com/thiagoalessio/tesseract-ocr-for-php/issues",
"source": "https://github.com/thiagoalessio/tesseract-ocr-for-php"
},
"time": "2023-10-05T21:14:48+00:00"
},
{
"name": "tightenco/ziggy",
"version": "v2.5.2",
@@ -9726,7 +9775,7 @@
"prefer-stable": true,
"prefer-lowest": false,
"platform": {
"php": "^8.3"
"php": "8.3.*"
},
"platform-dev": [],
"plugin-api-version": "2.6.0"
config/llm.php (new file, 43 lines)
@@ -0,0 +1,43 @@
<?php

return [
/**
* API configuration
*/
'api' => [
/**
* Host for the OpenAI API.
* This should be the base URL of the OpenAI API you are using.
*/
'host' => env('LLM_API_HOST_URL', null),

/**
* Token for authenticating with the OpenAI API.
* Null if not used
*/
'token' => env('LLM_API_TOKEN', null),
],

/**
* Models configuration.
*/
'models' => [
/**
* Great for chatting, can have reasoning capabilities.
* This model is typically used for conversational or thinking AI tasks.
*/
'chat' => [
'name' => env('LLM_CHAT_MODEL', null),
'shouldThink' => env('LLM_CHAT_MODEL_THINK', false),
],

/**
* Great for analyzing images, can have reasoning capabilities.
* This model is typically used for tasks that require understanding and interpreting images.
*/
'vision' => [
'name' => env('LLM_VISION_MODEL', null),
'shouldThink' => env('LLM_VISION_MODEL_THINK', false),
],
]
];
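For reference, a hypothetical .env fragment using exactly the keys read by this config file (all values are placeholders; the llava model name is the one mentioned in LLMPrompts.md):

```
LLM_API_HOST_URL="http://localhost:11434"
LLM_API_TOKEN=
LLM_CHAT_MODEL="llama3"
LLM_CHAT_MODEL_THINK=false
LLM_VISION_MODEL="llava:7b-v1.6-mistral-q4_1"
LLM_VISION_MODEL_THINK=false
```
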
@@ -0,0 +1,32 @@
<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

return new class extends Migration
{
/**
* Run the migrations.
*/
public function up(): void
{
Schema::table('instagram_reposts', function (Blueprint $table) {
$table->text('video_description')->nullable()->after('reel_id')
->comment('Description of the video being reposted on Instagram');
$table->text('instagram_caption')->nullable()->after('video_description')
->comment('Caption generated for the Instagram video repost');
});
}

/**
* Reverse the migrations.
*/
public function down(): void
{
Schema::table('instagram_reposts', function (Blueprint $table) {
$table->dropColumn('video_description');
$table->dropColumn('instagram_caption');
});
}
};
Binary file not shown.
@@ -1,20 +1,32 @@
#!/bin/bash

# version variable
# Can be found here : https://hub.docker.com/r/selenium/standalone-chrome/tags
# Will need to change it in seleniumChromedriverDockerfile and probably download
# it and change it in patchChromedriver.py
VERSION="latest"

# From undetected chromedriver docker
#sudo docker run --rm -it -p 3389:3389 -v ./undetectedChromedriver:/root/.local/share/undetected_chromedriver/ ultrafunk/undetected-chromedriver:latest

sudo docker pull selenium/standalone-chrome:$VERSION

# With undetected chromedriver patcher
# Run the selenium/standalone-chrome:latest with a specific container name in the background
sudo docker run -d --name standalone-chrome selenium/standalone-chrome:latest
#sudo docker run -d --name standalone-chrome -v /home/ninluc/Documents/codage/DatBrowser/undetectedChromedriver/chrome/:/opt/google/chrome/ selenium/standalone-chrome:$VERSION
sudo docker run -d --name standalone-chrome selenium/standalone-chrome:$VERSION

sleep 5
sleep 7

# Copy the chromedriver binary from the container to the host
sudo docker cp -L standalone-chrome:/bin/chromedriver ./chromedriver
# Stop the container
sudo docker stop standalone-chrome

sudo chmod 777 ./chromedriver

# Patch the chromedriver binary
source venv/bin/activate
python3 ./patchChromedriver.py

# Stop the container
sudo docker stop standalone-chrome
sudo docker rm standalone-chrome
@@ -4,5 +4,10 @@ import undetected_chromedriver as uc

options = uc.ChromeOptions()
# Chromedriver is in current directory
driver = uc.Chrome(options = options, browser_executable_path="/usr/bin/google-chrome", driver_executable_path="/home/ninluc/Documents/codage/DatBrowser/undetectedChromedriver/chromedriver")
# ERROR : This version of ChromeDriver only supports Chrome version xxx
# npx @puppeteer/browsers install chrome@xxx
# Change the path to the Chrome binary if needed
# "/home/ninluc/Documents/codage/DatBrowser/undetectedChromedriver/chrome/google-chrome"
# "/home/ninluc/chrome/linux-124.0.6367.207/chrome-linux64/chrome"
driver = uc.Chrome(options = options, browser_executable_path="/bin/google-chrome", driver_executable_path="/home/ninluc/Documents/codage/DatBrowser/undetectedChromedriver/chromedriver")
driver.get('https://nowsecure.nl')
@@ -1,4 +1,4 @@
#!/bin/bash

sudo docker build -f undetectedChromedriver/seleniumChromedriverDockerfile -t git.matthiasg.dev/ninluc/selenium/standalone-uc:latest .
sudo docker build -f seleniumChromedriverDockerfile -t git.matthiasg.dev/ninluc/selenium/standalone-uc:latest .
sudo docker push git.matthiasg.dev/ninluc/selenium/standalone-uc:latest
@@ -1,10 +1,13 @@
# FROM selenium/standalone-chrome:108.0 AS final
# FROM selenium/standalone-chrome:133.0-20250606 AS final
FROM selenium/standalone-chrome:latest AS final

COPY undetectedChromedriver/chromedriver /bin/chromedriver
RUN mkdir -p /home/seluser/profile/
COPY ./chromedriver /bin/chromedriver
#RUN mkdir -p /home/seluser/profile/

ENV TZ=Europe/Brussels
# 15 minutes session timeout
ENV SE_OPTS="--session-timeout 900"

HEALTHCHECK --interval=30s --timeout=10s --retries=3 CMD curl -s http://localhost:4444/wd/hub/status | jq -e '.value.ready == true' || exit 1
@@ -1,8 +1,8 @@
services:
undetected-chromedriver:
build:
context: ../
dockerfile: undetectedChromedriver/seleniumChromedriverDockerfile
context: ./
dockerfile: seleniumChromedriverDockerfile
volumes:
- /tmp:/tmp
- chromeProfile:/home/seluser/profile/