Changing Whisper to Whisper CPP on mac M1 to get GPU usage #2819

scotnery · 2025-02-15T21:44:13Z

I wrote an Automator action to transcribe my videos. It works great, but it's slow: it uses a lot of CPU and zero GPU. I'm coding with ChatGPT, so things are a little challenging. Is there a way to modify the code below so that it uses the GPU? whisper.cpp was suggested, but ChatGPT can't give me good setup instructions and doesn't understand its current API. I figure there's probably a way to do this directly with Whisper in 2025?

Here's the code I'm running in Automator on macOS:


#!/bin/bash

###############################################################################
# Prepend Homebrew's bin directory so ffmpeg, ffprobe, and terminal-notifier
# are available
###############################################################################
export PATH="/opt/homebrew/bin:$PATH"

# Full path to the Whisper executable.
WHISPER_PATH="/Users/scotnery/Library/Python/3.9/bin/whisper"

###############################################################################
# Function: Open media in QuickTime, loop until ffprobe can read it,
# then quit QuickTime automatically.
###############################################################################
force_download_via_quicktime() {
    local FILE="$1"

    # Launch QuickTime with the file (asynchronously)
    open -a "QuickTime Player" "$FILE"

    local ATTEMPTS=20  # How many times to try ffprobe
    local DELAY=5      # Seconds between attempts

    for (( i=1; i<=ATTEMPTS; i++ )); do
        echo "Checking if file is fully available (attempt $i/$ATTEMPTS)..." >&2

        # If ffprobe can read container/duration, the file should be fully downloaded
        if ffprobe -v error -show_entries format=duration -of csv=p=0 "$FILE" >/dev/null 2>&1; then
            echo "File is fully readable. Quitting QuickTime..." >&2
            osascript -e 'tell application "QuickTime Player" to quit'
            return 0
        else
            echo "File not ready. Waiting $DELAY seconds..." >&2
            sleep $DELAY
        fi
    done

    # Timed out: couldn't confirm local availability
    echo "Timeout: could not confirm download after $ATTEMPTS attempts." >&2
    osascript -e 'tell application "QuickTime Player" to quit'
    return 1
}

###############################################################################
# Main loop: process each file passed to this script
###############################################################################
for FILE in "$@"; do

    # Extract file extension, parent directory, and parent folder name.
    EXT="${FILE##*.}"
    PARENT_DIR="$(dirname "$FILE")"
    PARENT_NAME="$(basename "$PARENT_DIR")"

    # If the folder name looks like "YYYY-MM-DD HH.MM.SS Name", drop the year and
    # the time for a cleaner name (e.g. "2025-02-15 21.44.13 Demo" -> "02-15 Demo").
    if [[ "$PARENT_NAME" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2} ]]; then
        CLEAN_FOLDER="$(echo "$PARENT_NAME" | sed -E 's/^[0-9]{4}-([0-9]{2}-[0-9]{2}) [0-9]{2}\.[0-9]{2}\.[0-9]{2} (.*)$/\1 \2/')"
    else
        CLEAN_FOLDER="$PARENT_NAME"
    fi

    # Define the expected transcript file (Whisper saves .txt with the same basename).
    OUTPUT_FILE="$PARENT_DIR/$(basename "$FILE" ."$EXT").txt"
    # Temporary log file for capturing Whisper output/errors.
    LOG_FILE="/tmp/whisper_log_$(basename "$FILE").txt"

    # Only process certain media extensions
    if [[ "$EXT" =~ ^(mp4|mov|mkv|mp3|wav|flac)$ ]]; then

        # 1. Show ephemeral "Opening QuickTime..." notification
        osascript -e "display notification \"$(basename "$FILE")/${CLEAN_FOLDER}\" \
            with title \"Whisper AI Transcription 📥 Opening QuickTime...\""

        ########################################################################
        # 2. Attempt to force local download via QuickTime
        ########################################################################
        if ! force_download_via_quicktime "$FILE"; then
            osascript -e "display notification \"Skipping $(basename "$FILE")\" with title \"Not downloaded\""
            echo "Skipping $FILE because it never became fully available." >&2
            continue
        fi

        ########################################################################
        # 3. Now that the file is presumably local, gather info & show "Starting..."
        ########################################################################
        # Human-readable file size
        FILE_SIZE="$(du -h "$FILE" | cut -f1)"
        # File size in bytes
        SIZE_BYTES="$(stat -f%z "$FILE")"

        # (Optional) Estimated processing time
        EST_SEC="$(awk -v size="$SIZE_BYTES" 'BEGIN { printf "%.0f", 0.01657*sqrt(size) }')"
        CURRENT_TIME="$(date +%s)"
        ESTIMATED_END_TIME=$((CURRENT_TIME + EST_SEC))
        ESTIMATED_TIME="$(date -r "$ESTIMATED_END_TIME" +"%I:%M%p")"

        # Media duration via ffprobe
        FILE_DURATION_RAW="$(ffprobe -v error -show_entries format=duration \
            -of default=noprint_wrappers=1:nokey=1 "$FILE")"
        FILE_DURATION_SEC="$(printf "%.0f" "$FILE_DURATION_RAW")"
        DURATION_HOURS=$(( FILE_DURATION_SEC / 3600 ))
        DURATION_MINUTES=$(( (FILE_DURATION_SEC % 3600) / 60 ))
        DURATION_SECONDS=$(( FILE_DURATION_SEC % 60 ))
        FILE_DURATION_DISPLAY="$(printf "%02d:%02d:%02d" $DURATION_HOURS $DURATION_MINUTES $DURATION_SECONDS)"

        # Show ephemeral "Transcription started..."
        osascript -e "display notification \"🎬$FILE_DURATION_DISPLAY 🧠$FILE_SIZE\n$(basename "$FILE")/${CLEAN_FOLDER}\" \
            with title \"Whisper AI Transcription 🐝 Starting...\""

        echo "Processing: $FILE..." >&2
        START_TIME="$(date +%s)"

        # 4. Run Whisper
        "$WHISPER_PATH" "$FILE" \
            --model small \
            --language en \
            --output_format txt \
            --output_dir "$PARENT_DIR" \
            > "$LOG_FILE" 2>&1
        EXIT_STATUS=$?

        END_TIME="$(date +%s)"
        ACTUAL_DURATION=$(( END_TIME - START_TIME ))
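        # Avoid the name SECONDS: it is a special bash variable that keeps counting.
        ELAPSED_HOURS=$(( ACTUAL_DURATION / 3600 ))
        ELAPSED_MINUTES=$(( (ACTUAL_DURATION % 3600) / 60 ))
        ELAPSED_SECONDS=$(( ACTUAL_DURATION % 60 ))
        FORMATTED_DURATION="$(printf "%02d:%02d:%02d" "$ELAPSED_HOURS" "$ELAPSED_MINUTES" "$ELAPSED_SECONDS")"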

        ########################################################################
        # 5. If the transcript file exists and is non-empty, show success;
        #    otherwise, show the error dialog.
        ########################################################################
        if [[ -s "$OUTPUT_FILE" ]]; then
            # Success block
            URI="$(python3 -c "import urllib.parse; import sys; print('file://' + urllib.parse.quote(sys.argv[1]))" "$OUTPUT_FILE")"

            terminal-notifier -title "Whisper AI Transcription ✅Done" \
              -message "🎬$FILE_DURATION_DISPLAY 🧠$FILE_SIZE ⏱️$FORMATTED_DURATION
$CLEAN_FOLDER/$(basename "$FILE")" \
              -open "$URI" \
              -timeout 0
        else
            # Error block (if file is missing or empty).
            # The log can be long, so show only its last lines, and strip quotes
            # that would break the AppleScript string.
            ERROR_MSG="$(tail -n 20 "$LOG_FILE" | tr -d "\"'")"
            osascript -e "display dialog \"❌ Error during transcription for: $(basename "$FILE")\nFolder: $CLEAN_FOLDER\nSize: $FILE_SIZE\nElapsed: $FORMATTED_DURATION\nExit status: $EXIT_STATUS\n\nError:\n$ERROR_MSG\" with title \"Whisper AI Error\" buttons {\"OK\"}"
        fi

    else
        echo "Skipping non-media file: $FILE" >&2
    fi
done
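
For the whisper.cpp route, a one-time setup might look like the sketch below. It assumes the Xcode command line tools, git, and cmake are already installed. `WHISPER_CPP_DIR`, `WHISPER_MODEL_NAME`, `model_path`, and `setup_whisper_cpp` are hypothetical names chosen for illustration. The build and model-download commands follow the whisper.cpp README as I understand it, and builds on Apple Silicon should use the GPU through Metal by default, but the README is the place to check the current steps.

```shell
#!/bin/bash
# Hypothetical install location and model name; change to taste.
WHISPER_CPP_DIR="${WHISPER_CPP_DIR:-$HOME/whisper.cpp}"
WHISPER_MODEL_NAME="${WHISPER_MODEL_NAME:-small.en}"

# Print where the download script is expected to place the ggml model file.
model_path() {
    printf '%s/models/ggml-%s.bin\n' "$WHISPER_CPP_DIR" "$WHISPER_MODEL_NAME"
}

# Clone, build, and fetch a model. Runs in a subshell so the caller's
# working directory is left alone. Nothing runs until this is called.
setup_whisper_cpp() {
    (
        git clone https://github.com/ggml-org/whisper.cpp.git "$WHISPER_CPP_DIR" &&
        cd "$WHISPER_CPP_DIR" &&
        cmake -B build &&
        cmake --build build -j --config Release &&
        sh ./models/download-ggml-model.sh "$WHISPER_MODEL_NAME"
    )
}
```

Calling `setup_whisper_cpp` once from Terminal should leave the binary at `build/bin/whisper-cli` inside `WHISPER_CPP_DIR` and the model at the path printed by `model_path`.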

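Step 4 of the script above could be swapped for a whisper.cpp call along these lines. This is a sketch under assumptions, not a tested drop-in: `WHISPER_CPP_BIN`, `WHISPER_CPP_MODEL`, `transcript_prefix`, and `transcribe_with_whisper_cpp` are hypothetical names and paths. It relies on two things from the whisper.cpp README as I understand it: `whisper-cli` expects 16 kHz 16-bit WAV input, so the media is converted with ffmpeg first, and `-of` takes an output path without extension, so the transcript lands where `OUTPUT_FILE` already points.

```shell
#!/bin/bash
# Hypothetical locations; point these at your own build and model file.
WHISPER_CPP_BIN="${WHISPER_CPP_BIN:-whisper-cli}"
WHISPER_CPP_MODEL="${WHISPER_CPP_MODEL:-$HOME/whisper.cpp/models/ggml-small.en.bin}"

# Print the transcript path without extension, next to the source file,
# e.g. "/a/b/clip.mp4" becomes "/a/b/clip".
transcript_prefix() {
    local file="$1" base
    base="$(basename "$file")"
    printf '%s/%s\n' "$(dirname "$file")" "${base%.*}"
}

# Convert to 16 kHz mono WAV, then transcribe to "<prefix>.txt".
transcribe_with_whisper_cpp() {
    local file="$1" tmpdir wav rc
    tmpdir="$(mktemp -d)" || return 1
    wav="$tmpdir/audio.wav"

    ffmpeg -nostdin -loglevel error -y -i "$file" \
        -ar 16000 -ac 1 -c:a pcm_s16le "$wav" &&
    "$WHISPER_CPP_BIN" \
        -m "$WHISPER_CPP_MODEL" \
        -f "$wav" \
        -l en \
        -otxt \
        -of "$(transcript_prefix "$file")"
    rc=$?

    rm -rf "$tmpdir"   # always remove the temporary WAV
    return "$rc"
}
```

In the main loop, the `"$WHISPER_PATH" "$FILE" ...` call would become `transcribe_with_whisper_cpp "$FILE" > "$LOG_FILE" 2>&1`, with `EXIT_STATUS=$?` kept on the next line. The rest of the script (notifications, timing, the `OUTPUT_FILE` check) should not need to change.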