Utils API

`llm_feature_gen.utils`

`utils`

Reusable utility helpers used by discovery and generation pipelines.

Modules:

text –
video –

Functions:

downsample_batch – Takes a large list of base64 images (e.g. from multiple videos) and selects the most diverse set using K-Means clustering.
extract_audio_track – Extracts the audio track from a video file and saves it as a temporary WAV file.
extract_key_frames – Selects diverse keyframes from a video using K-Means clustering.
extract_text_from_file – Extracts text from a file and returns a list of text chunks (strings).

`downsample_batch(b64_list: List[str], target_count: int = 15) -> List[str]`

Takes a large list of base64 images (e.g. from multiple videos) and selects the most diverse set using K-Means clustering.
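The selection idea can be sketched in plain Python. This is an illustration of K-Means-based diversity selection, not the library's implementation: `select_diverse` is a hypothetical helper, it operates on numeric feature vectors (how `downsample_batch` turns base64 images into features is not documented here), and it uses a deterministic farthest-first initialisation for reproducibility.

```python
from typing import List, Sequence


def _sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_diverse(features: List[Sequence[float]], target_count: int = 15,
                   iterations: int = 20) -> List[int]:
    """Return indices of at most target_count items, one per K-Means cluster."""
    n = len(features)
    if target_count <= 0:
        return []
    if n <= target_count:
        return list(range(n))  # nothing to drop

    # Farthest-first initialisation (an assumption made for determinism).
    centroids = [list(features[0])]
    while len(centroids) < target_count:
        far = max(range(n),
                  key=lambda i: min(_sq_dist(features[i], c) for c in centroids))
        centroids.append(list(features[far]))

    assignment = [0] * n
    for _ in range(iterations):
        # Assignment step: attach every item to its nearest centroid.
        assignment = [
            min(range(target_count), key=lambda k: _sq_dist(f, centroids[k]))
            for f in features
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(target_count):
            members = [features[i] for i in range(n) if assignment[i] == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]

    # Keep the member closest to each centroid as that cluster's representative.
    picked = []
    for k in range(target_count):
        ids = [i for i in range(n) if assignment[i] == k]
        if ids:
            picked.append(min(ids, key=lambda i: _sq_dist(features[i], centroids[k])))
    return sorted(picked)


# Three visually distinct "scenes", three near-duplicate frames each.
features = [[0, 0], [0.1, 0], [0, 0.1],
            [10, 10], [10.1, 10], [10, 10.1],
            [-10, 5], [-10.1, 5], [-10, 5.1]]
print(select_diverse(features, target_count=3))
```

Picking one representative per cluster is what removes near-duplicates: frames from the same scene fall into the same cluster and only one of them survives.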

`extract_audio_track(file_path: str) -> Optional[str]`

Extracts the audio track from a video file and saves it as a temporary WAV file. Uses FFmpeg to convert the stream to mono 16 kHz PCM, the input format commonly expected by Whisper and other speech-to-text (STT) models.

Returns:

Optional[str] – The path to the generated temporary WAV file, or None if extraction fails.
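A conversion of this kind might look like the sketch below. The helper names (`build_ffmpeg_command`, `extract_audio_sketch`) are hypothetical, and the 16-bit `pcm_s16le` codec is an assumption, since the description only says "PCM". The FFmpeg flags themselves (`-vn`, `-ac`, `-ar`, `-acodec`, `-y`) are standard.

```python
import os
import subprocess
import tempfile
from typing import List, Optional


def build_ffmpeg_command(src: str, dst: str) -> List[str]:
    """Build an FFmpeg call that writes mono, 16 kHz PCM audio to dst."""
    return [
        "ffmpeg", "-y",          # overwrite the (empty) temp file
        "-i", src,
        "-vn",                   # drop the video stream
        "-ac", "1",              # mono
        "-ar", "16000",          # 16 kHz sample rate
        "-acodec", "pcm_s16le",  # assumption: 16-bit little-endian PCM
        dst,
    ]


def extract_audio_sketch(file_path: str) -> Optional[str]:
    """Return the path of a temporary WAV file, or None if extraction fails."""
    fd, wav_path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)
    try:
        subprocess.run(build_ffmpeg_command(file_path, wav_path),
                       check=True, capture_output=True)
    except (OSError, subprocess.CalledProcessError):
        # FFmpeg is missing, or it exited with an error (e.g. no audio stream).
        os.remove(wav_path)
        return None
    return wav_path
```

Because the file is temporary, a caller would normally delete the returned path once transcription is done.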

`extract_key_frames(video_path: str, frame_limit: int = 10, sharpness_threshold: float = 40.0, max_resolution: int = 1024) -> List[str]`

Selects diverse keyframes from a video using K-Means clustering. Instead of simple uniform sampling, it groups visually similar scenes and picks the sharpest image from each group to maximize information density.

Parameters:

video_path (str) – Path to the video file.
frame_limit (int, default: 10) – Maximum number of frames to extract (target K for clustering).
sharpness_threshold (float, default: 40.0) – Variance of Laplacian threshold to ignore blurry frames.
max_resolution (int, default: 1024) – Max dimension (width/height) for resizing to control payload size.

Returns:

List[str] – List of base64-encoded image strings.
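To make the `sharpness_threshold` parameter concrete, here is a dependency-free sketch of the variance-of-Laplacian measure. `laplacian_variance` and `is_sharp` are hypothetical names, the 4-neighbour kernel is one common choice, and treating a frame as sharp when the score is at or above the threshold is an assumption. With OpenCV the same score is usually obtained from `cv2.Laplacian(gray, cv2.CV_64F).var()`.

```python
from typing import List


def laplacian_variance(gray: List[List[float]]) -> float:
    """Variance of the 4-neighbour Laplacian over the interior pixels."""
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    if not responses:
        return 0.0  # image too small to have interior pixels
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def is_sharp(gray: List[List[float]], sharpness_threshold: float = 40.0) -> bool:
    """Assumption: frames scoring below the threshold count as blurry."""
    return laplacian_variance(gray) >= sharpness_threshold


# A flat grey frame has no edges; a checkerboard is all edges.
flat = [[128] * 8 for _ in range(8)]
checker = [[255 * ((x + y) % 2) for x in range(8)] for y in range(8)]
print(is_sharp(flat), is_sharp(checker))
```

Blurry frames have weak edges, so their Laplacian response is nearly constant and its variance is low. That is why a simple threshold can filter them out before clustering.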

`extract_text_from_file(path: Path) -> List[str]`

Extracts text from a file and returns a list of text chunks (strings).
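The chunking strategy is not documented here. Purely as an illustration of what "a list of text chunks" could mean, the sketch below packs paragraphs into chunks under a character budget. `chunk_text` and `max_chars` are hypothetical and are not part of `llm_feature_gen`.

```python
from typing import List


def chunk_text(text: str, max_chars: int = 1000) -> List[str]:
    """Greedily pack blank-line-separated paragraphs into chunks."""
    chunks: List[str] = []
    current = ""
    for paragraph in (p.strip() for p in text.split("\n\n")):
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A paragraph longer than max_chars is kept whole in this sketch.
            current = paragraph
    if current:
        chunks.append(current)
    return chunks


print(chunk_text("First paragraph.\n\nSecond paragraph.", max_chars=20))
```

Splitting on paragraph boundaries keeps each chunk readable on its own, which tends to matter when chunks are sent to a model one at a time.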

`llm_feature_gen.utils.image`

`image`

`llm_feature_gen.utils.text`

`text`

Functions:

extract_text_from_file – Extracts text from a file and returns a list of text chunks (strings).

`extract_text_from_file(path: Path) -> List[str]`

Extracts text from a file and returns a list of text chunks (strings).

`llm_feature_gen.utils.video`

`video`

Functions:

downsample_batch – Takes a large list of base64 images (e.g. from multiple videos) and selects the most diverse set using K-Means clustering.
extract_audio_track – Extracts the audio track from a video file and saves it as a temporary WAV file.
extract_key_frames – Selects diverse keyframes from a video using K-Means clustering.

`downsample_batch(b64_list: List[str], target_count: int = 15) -> List[str]`

Takes a large list of base64 images (e.g. from multiple videos) and selects the most diverse set using K-Means clustering.

`extract_audio_track(file_path: str) -> Optional[str]`

Extracts the audio track from a video file and saves it as a temporary WAV file. Uses FFmpeg to convert the stream to mono 16 kHz PCM, the input format commonly expected by Whisper and other speech-to-text (STT) models.

Returns:

Optional[str] – The path to the generated temporary WAV file, or None if extraction fails.

`extract_key_frames(video_path: str, frame_limit: int = 10, sharpness_threshold: float = 40.0, max_resolution: int = 1024) -> List[str]`

Selects diverse keyframes from a video using K-Means clustering. Instead of simple uniform sampling, it groups visually similar scenes and picks the sharpest image from each group to maximize information density.

Parameters:

video_path (str) – Path to the video file.
frame_limit (int, default: 10) – Maximum number of frames to extract (target K for clustering).
sharpness_threshold (float, default: 40.0) – Variance of Laplacian threshold to ignore blurry frames.
max_resolution (int, default: 1024) – Max dimension (width/height) for resizing to control payload size.

Returns:

List[str] – List of base64-encoded image strings.
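The effect of `max_resolution` on frame size can be sketched as follows. `fit_within` is a hypothetical helper, and it assumes the longer side is scaled down to `max_resolution` with the aspect ratio preserved and that smaller frames are left untouched. The actual resizing rule is not spelled out above.

```python
from typing import Tuple


def fit_within(width: int, height: int, max_resolution: int = 1024) -> Tuple[int, int]:
    """Scale (width, height) so the longer side is at most max_resolution."""
    longest = max(width, height)
    if longest <= max_resolution:
        return width, height  # already small enough, no upscaling
    scale = max_resolution / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(1920, 1080))  # a Full HD frame
print(fit_within(640, 480))    # already below the cap
```

Capping the resolution bounds the size of each base64 string, which in turn bounds the request payload when many frames are sent together.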
