ForgeClient API

The ForgeClient class is the main interface for interacting with the Glyph Forge API.

class glyph_forge.core.client.forge_client.ForgeClient(api_key=None, base_url=None, *, timeout=30.0)[source]

Bases: object

Synchronous HTTP client for the Glyph Forge API.

Parameters:
  • api_key (Optional[str]) – API key for authentication. Format: “gf_live_…” or “gf_test_…”. If omitted, read from the GLYPH_API_KEY environment variable; a key is required from one of these two sources.

  • base_url (Optional[str]) – Base URL for the API. If not provided, falls back to: 1) the GLYPH_API_BASE environment variable, 2) the default “https://dev.glyphapi.ai”.

  • timeout (float) – Request timeout in seconds (default: 30.0)

Example

>>> # Uses GLYPH_API_KEY env var and default base URL
>>> client = ForgeClient()
>>>
>>> # Or specify explicitly
>>> client = ForgeClient(api_key="gf_live_abc123...", base_url="https://api.glyphapi.ai")
>>> ws = create_workspace()  # from glyph_forge
>>> schema = client.build_schema_from_docx(ws, docx_path="sample.docx")
DEFAULT_BASE_URL = 'https://dev.glyphapi.ai'
__init__(api_key=None, base_url=None, *, timeout=30.0)[source]

Initialize ForgeClient.

Parameters:
  • api_key (Optional[str]) – API key for authentication. Falls back to GLYPH_API_KEY env var if not provided.

  • base_url (Optional[str]) – Base URL for API (no trailing slash). Falls back to GLYPH_API_BASE env var or default URL if not provided.

  • timeout (float) – Default timeout for all requests in seconds

Raises:

ForgeClientError – If no API key is provided or found in environment
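
For example, constructing a client with no key argument and no GLYPH_API_KEY in the environment fails fast. A minimal sketch; the import path for ForgeClientError is an assumption, so adjust it to your install:

import os
from glyph_forge import ForgeClient
# Assumed import path for the exception type.
from glyph_forge.core.client.forge_client import ForgeClientError

os.environ.pop("GLYPH_API_KEY", None)  # make sure no key can be found

try:
    client = ForgeClient()
except ForgeClientError as exc:
    print(f"Cannot create client: {exc}")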

__enter__()[source]

Enter the runtime context and return this client, enabling use as a context manager.

__exit__(exc_type, exc_val, exc_tb)[source]

Exit the runtime context and close the underlying HTTP client.
close()[source]

Close the underlying HTTP client.
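
If you are not using the context-manager form shown under Usage Examples, pair every client with an explicit close(); a minimal sketch:

from glyph_forge import ForgeClient

client = ForgeClient()  # key read from GLYPH_API_KEY
try:
    ...  # make API calls here
finally:
    client.close()  # always release the underlying HTTP client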


Core Methods

Schema Building

ForgeClient.build_schema_from_docx(ws, *, docx_path, save_as=None, include_artifacts=False)[source]

Build a schema from a DOCX file via the API.

Endpoint: POST /schema/build

Parameters:
  • ws (Any) – Workspace instance for saving artifacts

  • docx_path (str) – Path to DOCX file (absolute or CWD-relative)

  • save_as (Optional[str]) – Optional name to save schema JSON (without .json extension)

  • include_artifacts (bool) – If True, retrieve and save the tagged DOCX plus unzipped files. Adds ~300–800 ms of overhead, depending on document complexity.

Return type:

Dict[str, Any]

Returns:

Schema dict from API response

Example

>>> # Fast mode (schema only)
>>> schema = client.build_schema_from_docx(
...     ws,
...     docx_path="sample.docx",
...     save_as="my_schema"
... )
>>>
>>> # Full mode (with artifacts for debugging/post-processing)
>>> schema = client.build_schema_from_docx(
...     ws,
...     docx_path="sample.docx",
...     save_as="my_schema",
...     include_artifacts=True
... )
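
Once saved with save_as, the schema JSON can be reloaded later without another build call. A sketch; where the file lands depends on your workspace layout, so the path below is hypothetical:

import json

# Hypothetical path for save_as="my_schema"; adjust to your workspace layout.
with open("workspace/schemas/my_schema.json") as f:
    schema = json.load(f)

# The reloaded dict can be passed straight to run_schema.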

Schema Running

ForgeClient.run_schema(ws, *, schema, plaintext, dest_name='assembled_output.docx')[source]

Run a schema with plaintext to generate a DOCX.

Endpoint: POST /schema/run

Parameters:
  • ws (Any) – Workspace instance

  • schema (Dict[str, Any]) – Schema dict (from build_schema_from_docx or loaded JSON)

  • plaintext (str) – Input text content

  • dest_name (str) – Name for the output DOCX file (saved in the workspace output_docx directory)

Return type:

str

Returns:

Local path to saved DOCX file

Example

>>> docx_path = client.run_schema(
...     ws,
...     schema=schema,
...     plaintext="Sample text...",
...     dest_name="output.docx"
... )
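
Since plaintext is a plain str, it can come from anywhere; a sketch that feeds a text file through the schema (the file name is illustrative):

from pathlib import Path

plaintext = Path("letter.txt").read_text(encoding="utf-8")
docx_path = client.run_schema(
    ws,
    schema=schema,
    plaintext=plaintext,
    dest_name="letter.docx",
)
print(f"Saved to {docx_path}")  # run_schema returns the local path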

Bulk Processing

ForgeClient.run_schema_bulk(ws, *, schema, plaintexts, max_concurrent=5, dest_name_pattern='output_{index}.docx')[source]

Run a schema with multiple plaintexts in parallel to generate multiple DOCX files.

Endpoint: POST /schema/run/bulk

Parameters:
  • ws (Any) – Workspace instance

  • schema (Dict[str, Any]) – Schema dict (from build_schema_from_docx or loaded JSON)

  • plaintexts (list[str]) – List of plaintext strings to process (max 100)

  • max_concurrent (int) – Number of concurrent processes (default: 5, max: 20)

  • dest_name_pattern (str) – Pattern for output filenames; use the {index} placeholder (default: “output_{index}.docx”)

Returns:

Dict with the following keys:

  • results: List of dicts with index, status, and docx_path (or error)

  • total: Total number of plaintexts

  • successful: Number of successful runs

  • failed: Number of failed runs

  • processing_time_seconds: Total processing time

  • metered_count: Number of API calls metered

Return type:

Dict[str, Any]

Note

Each plaintext counts as 1 API call for billing purposes. Failed items return error messages but don’t block successful ones. All DOCX files are saved to the workspace output_docx directory.

Example

>>> result = client.run_schema_bulk(
...     ws,
...     schema=schema,
...     plaintexts=["Text 1...", "Text 2...", "Text 3..."],
...     max_concurrent=5,
...     dest_name_pattern="invoice_{index}.docx"
... )
>>> print(f"Processed {result['successful']} of {result['total']}")

Schema Compression

ForgeClient.compress_schema(ws, *, schema, save_as=None)[source]

Compress a schema by deduplicating redundant pattern descriptors.

Many schemas contain redundant pattern descriptor types (e.g., “H-SHORT” appearing multiple times with the same styles and properties). This endpoint compresses the schema by grouping pattern descriptors by type and keeping the highest-scoring descriptor of each type.

Endpoint: POST /schema/compress

Parameters:
  • ws (Any) – Workspace instance

  • schema (Dict[str, Any]) – Schema dict to compress (from build_schema_from_docx or loaded JSON)

  • save_as (Optional[str]) – Optional name to save compressed schema JSON (without .json extension)

Returns:

Dict with the following keys:

  • compressed_schema: The compressed schema with deduplicated pattern descriptors

  • stats: Compression statistics (original_count, compressed_count, reduction, etc.)

Return type:

Dict[str, Any]

Example

>>> result = client.compress_schema(
...     ws,
...     schema=schema,
...     save_as="compressed_schema"
... )
>>> print(f"Reduced from {result['stats']['original_count']} to {result['stats']['compressed_count']}")
>>> compressed_schema = result['compressed_schema']
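
The compressed schema should be usable wherever a schema dict is accepted; a sketch chaining compression into a run (names are illustrative):

compact = client.compress_schema(ws, schema=schema)["compressed_schema"]
docx_path = client.run_schema(
    ws,
    schema=compact,  # the compressed schema is used like any other schema dict
    plaintext="Sample text...",
    dest_name="compact_output.docx",
)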

Plaintext Intake

ForgeClient.intake_plaintext_text(ws, *, text, save_as=None, **opts)[source]

Intake plaintext via JSON body.

Endpoint: POST /plaintext/intake

Parameters:
  • ws (Any) – Workspace instance

  • text (str) – Plaintext content to intake

  • save_as (Optional[str]) – Optional name to save intake result JSON

  • **opts (Any) – Additional options matching PlaintextIntakeRequest fields:
      - unicode_form: str (default: “NFC”)
      - strip_zero_width: bool (default: True)
      - expand_tabs: bool (default: True)
      - ensure_final_newline: bool (default: True)
      - max_bytes: int (default: 10 MB)
      - filename: str (optional)

Return type:

Dict[str, Any]

Returns:

Intake result dict from API

Example

>>> result = client.intake_plaintext_text(
...     ws,
...     text="Sample text...",
...     save_as="intake_result",
...     strip_zero_width=False
... )
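
The options in **opts map one-to-one onto PlaintextIntakeRequest fields; a sketch combining several of them (values are illustrative):

raw_text = "Column one\tcolumn two\u200b"  # a tab plus a zero-width space
result = client.intake_plaintext_text(
    ws,
    text=raw_text,
    unicode_form="NFKC",
    strip_zero_width=True,
    expand_tabs=True,
    max_bytes=1_000_000,  # cap input size at ~1 MB
    filename="notes.txt",
    save_as="intake_nfkc",
)
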
ForgeClient.intake_plaintext_file(ws, *, file_path, save_as=None, **opts)[source]

Intake plaintext via file upload.

Endpoint: POST /plaintext/intake_file

Parameters:
  • ws (Any) – Workspace instance

  • file_path (str) – Path to plaintext file

  • save_as (Optional[str]) – Optional name to save intake result JSON

  • **opts (Any) – Query parameters for normalization options:
      - unicode_form: str
      - strip_zero_width: bool
      - expand_tabs: bool
      - ensure_final_newline: bool

Return type:

Dict[str, Any]

Returns:

Intake result dict from API

Example

>>> result = client.intake_plaintext_file(
...     ws,
...     file_path="sample.txt",
...     save_as="intake_result",
...     unicode_form="NFKC"
... )

Client Management

ForgeClient.close()[source]

Close the underlying HTTP client.

Usage Examples

Basic Schema Build and Run

from glyph_forge import ForgeClient, create_workspace

# Initialize
client = ForgeClient(api_key="gf_live_...")
ws = create_workspace()

# Build schema
schema = client.build_schema_from_docx(
    ws,
    docx_path="template.docx",
    save_as="my_schema"
)

# Run schema
output = client.run_schema(
    ws,
    schema=schema,
    plaintext="Content here...",
    dest_name="output.docx"
)

With Context Manager

from glyph_forge import ForgeClient, create_workspace

ws = create_workspace()

with ForgeClient(api_key="gf_live_...") as client:
    schema = client.build_schema_from_docx(
        ws,
        docx_path="template.docx"
    )

Bulk Processing

# Process multiple documents at once
plaintexts = ["Text 1...", "Text 2...", "Text 3..."]

result = client.run_schema_bulk(
    ws,
    schema=schema,
    plaintexts=plaintexts,
    max_concurrent=5,
    dest_name_pattern="output_{index}.docx"
)

print(f"Processed {result['successful']} of {result['total']}")

Schema Compression

# Compress schema to optimize size
result = client.compress_schema(
    ws,
    schema=schema,
    save_as="compressed_schema"
)

print(f"Reduced from {result['stats']['original_count']} "
      f"to {result['stats']['compressed_count']} pattern descriptors")