LLM Feature Gen

llm-feature-gen is a Python library for discovering interpretable feature schemas from multimodal data and then generating feature values that can be exported as CSV.

Unlike generic extraction tools that stop at free-form labels or raw embeddings, llm-feature-gen is built around stable, interpretable feature schemas that can be reused for downstream tabular modeling and analysis.

The package is organized around a two-step workflow:

  1. Run a discovery helper from llm_feature_gen.discover to produce a JSON feature schema in outputs/.
  2. Run a generation helper from llm_feature_gen.generate to score a class-organized dataset and write one CSV per class, plus an optional merged CSV.
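The output layout described in the two steps above can be sketched without the library itself. The snippet below is a minimal illustration, not the package's API: the schema layout (`{"features": [{"name": ...}]}`), the `write_class_csvs` helper, the `class` column, and the `merged.csv` filename are all assumptions made for this example. The real schema format and file names are documented in the output formats guide.

```python
import csv
import json
import tempfile
from pathlib import Path


def write_class_csvs(schema, rows_by_class, out_dir, merge=True):
    """Write one CSV per class and, optionally, a merged CSV.

    Hypothetical helper for illustration only; it is not part of
    llm-feature-gen. `schema` uses an assumed layout:
    {"features": [{"name": "..."}, ...]}.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    feature_names = [feature["name"] for feature in schema["features"]]

    written = []
    merged_rows = []
    for class_name, rows in rows_by_class.items():
        path = out_dir / f"{class_name}.csv"
        with path.open("w", newline="", encoding="utf-8") as fh:
            writer = csv.DictWriter(fh, fieldnames=feature_names)
            writer.writeheader()
            writer.writerows(rows)
        written.append(path)
        # Tag each row with its class so the merged file stays self-describing.
        merged_rows.extend({"class": class_name, **row} for row in rows)

    if merge:
        merged_path = out_dir / "merged.csv"  # assumed filename
        with merged_path.open("w", newline="", encoding="utf-8") as fh:
            writer = csv.DictWriter(fh, fieldnames=["class", *feature_names])
            writer.writeheader()
            writer.writerows(merged_rows)
        written.append(merged_path)

    return written


if __name__ == "__main__":
    # Stand-in for step 1: a discovered schema saved as JSON.
    schema = {"features": [{"name": "tone"}, {"name": "length"}]}
    # Stand-in for step 2: feature values grouped by class.
    rows_by_class = {
        "positive": [{"tone": "warm", "length": 3}],
        "negative": [{"tone": "cold", "length": 5}],
    }
    with tempfile.TemporaryDirectory() as tmp:
        schema_path = Path(tmp) / "schema.json"
        schema_path.write_text(json.dumps(schema, indent=2), encoding="utf-8")
        loaded = json.loads(schema_path.read_text(encoding="utf-8"))
        for path in write_class_csvs(loaded, rows_by_class, tmp):
            print(path.name)
```

Because every class file shares the same schema-derived columns, the merged file can be formed by concatenating rows and adding a single class column, which is what makes a fixed schema convenient for downstream tabular modeling.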

The docs site is split into two layers:

  • Narrative guides for installation, quickstart usage, provider setup, and output formats.
  • Autogenerated API reference pages for the public modules and helper utilities.

For the shortest path into the package, start with the install guide and quickstart. For one canonical end-to-end example, see the text-to-tabular pipeline example and its walkthrough. If you need concrete compatibility claims for review or deployment planning, see the platform support matrix.