ForgeClient API

The ForgeClient class is the main interface for interacting with the Glyph Forge API.

class glyph_forge.core.client.forge_client.ForgeClient(api_key=None, base_url=None, *, timeout=30.0)[source]

Bases: object

Local SDK-based client for Glyph Forge.

Uses the Glyph SDK directly to build and run schemas locally. No API key is required; all processing happens on your machine.

Parameters:
  • api_key (Optional[str]) – Deprecated. No longer used (kept for backwards compatibility).

  • base_url (Optional[str]) – Deprecated. No longer used (kept for backwards compatibility).

  • timeout (float) – Deprecated. No longer used (kept for backwards compatibility).

Example

>>> from glyph_forge import ForgeClient, create_workspace
>>> ws = create_workspace()
>>> client = ForgeClient()
>>> schema = client.build_schema_from_docx(ws, docx_path="sample.docx")
__init__(api_key=None, base_url=None, *, timeout=30.0)[source]

Initialize ForgeClient.

All parameters are deprecated no-ops; see the class documentation above.

__enter__()[source]

Enter the context manager, returning the client.

__exit__(exc_type, exc_val, exc_tb)[source]

Exit the context manager and close the client.

close()[source]

Close the client and clean up resources.

build_schema_from_docx(ws, *, docx_path, save_as=None, include_artifacts=False)[source]

Build a schema from a DOCX file using the local SDK.

Parameters:
  • ws (Any) – Workspace instance for saving artifacts

  • docx_path (str) – Path to DOCX file (absolute or CWD-relative)

  • save_as (Optional[str]) – Optional name to save schema JSON (without .json extension)

  • include_artifacts (bool) – If True, save tagged DOCX + unzipped files (default: False)

Return type:

Dict[str, Any]

Returns:

Schema dict

Raises:

ForgeClientError – File not found or processing error

Example

>>> schema = client.build_schema_from_docx(
...     ws,
...     docx_path="sample.docx",
...     save_as="my_schema"
... )
build_glyph_from_docx(ws, *, docx_path, save_as=None, include_artifacts=False)[source]

Build both schema and markup from a DOCX file in one call.

Uses the local SDK to intake the DOCX, build the schema via GlyphSchemaBuilder, then produce a .glyph.txt markup file via GlyphMarkupBuilder (coordinated mode, no re-parsing).

Parameters:
  • ws (Any) – Workspace instance for saving artifacts

  • docx_path (str) – Path to DOCX file (absolute or CWD-relative)

  • save_as (Optional[str]) – Optional base name to save schema JSON and markup (e.g. “my_doc” -> my_doc.json + my_doc.glyph.txt)

  • include_artifacts (bool) – If True, save tagged DOCX + unzipped files (default: False)

Returns:

Dict with keys:

  • schema: The generated schema dict

  • markup: The generated markup string

  • schema_path: Path to saved schema JSON (if save_as provided)

  • markup_path: Path to saved markup file (if save_as provided)

Return type:

Dict[str, Any]

Raises:

ForgeClientError – File not found or processing error

Example

>>> result = client.build_glyph_from_docx(
...     ws,
...     docx_path="sample.docx",
...     save_as="my_schema"
... )
>>> print(len(result["markup"]), "chars of markup")
>>> print(len(result["schema"].get("pattern_descriptors", [])), "descriptors")
run_schema(ws, *, schema, plaintext, dest_name='assembled_output.docx')[source]

Run a schema with plaintext to generate a DOCX using the local SDK.

Parameters:
  • ws (Any) – Workspace instance

  • schema (Dict[str, Any]) – Schema dict (from build_schema_from_docx or loaded JSON)

  • plaintext (str) – Input text content

  • dest_name (str) – Name for output DOCX file (saved in output_docx directory)

Return type:

str

Returns:

Local path to saved DOCX file

Raises:

ForgeClientError – Failed to run schema or save DOCX

Example

>>> docx_path = client.run_schema(
...     ws,
...     schema=schema,
...     plaintext="Sample text...",
...     dest_name="output.docx"
... )
run_schema_bulk(ws, *, schema, plaintexts, max_concurrent=5, dest_name_pattern='output_{index}.docx')[source]

Run a schema with multiple plaintexts to generate multiple DOCX files.

Parameters:
  • ws (Any) – Workspace instance

  • schema (Dict[str, Any]) – Schema dict (from build_schema_from_docx or loaded JSON)

  • plaintexts (list[str]) – List of plaintext strings to process

  • max_concurrent (int) – Ignored in local SDK mode (processed sequentially)

  • dest_name_pattern (str) – Pattern for output filenames. Use {index} placeholder

Return type:

Dict[str, Any]

Returns:

Dict containing results with status, paths, and timing info

Example

>>> result = client.run_schema_bulk(
...     ws,
...     schema=schema,
...     plaintexts=["Text 1...", "Text 2...", "Text 3..."],
...     dest_name_pattern="invoice_{index}.docx"
... )
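The {index} placeholder in dest_name_pattern behaves like a str.format replacement field. A minimal sketch of how output names are presumably derived (the zero-based indexing and the helper name are assumptions, not SDK internals):

```python
def expand_pattern(pattern: str, count: int) -> list:
    # Substitute the {index} placeholder once per input, zero-based.
    return [pattern.format(index=i) for i in range(count)]

names = expand_pattern("invoice_{index}.docx", 3)
# One filename per plaintext, in input order.
```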
compress_schema(ws, *, schema, save_as=None)[source]

Compress a schema by deduplicating redundant pattern descriptors.

Parameters:
  • ws (Any) – Workspace instance

  • schema (Dict[str, Any]) – Schema dict to compress

  • save_as (Optional[str]) – Optional name to save compressed schema JSON

Return type:

Dict[str, Any]

Returns:

Dict containing compressed_schema and stats

Example

>>> result = client.compress_schema(
...     ws,
...     schema=schema,
...     save_as="compressed_schema"
... )
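Deduplication can be pictured as collapsing descriptors that serialize to the same canonical form. An illustrative sketch only; the actual compression logic and the descriptor fields shown here are assumptions:

```python
import json

def dedupe_descriptors(descriptors):
    # Keep the first descriptor for each canonical JSON representation;
    # later exact duplicates are dropped.
    seen, kept = set(), []
    for d in descriptors:
        key = json.dumps(d, sort_keys=True)
        if key not in seen:
            seen.add(key)
            kept.append(d)
    return kept

before = [{"style": "H1"}, {"style": "body"}, {"style": "H1"}]
after = dedupe_descriptors(before)
```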
intake_plaintext_text(ws, *, text, classify=False, save_as=None, **opts)[source]

Intake plaintext via text string (local processing).

Parameters:
  • ws (Any) – Workspace instance

  • text (str) – Plaintext content to intake

  • classify (bool) – If True, run heuristic classification on each line (adds classifications key to the result)

  • save_as (Optional[str]) – Optional name to save intake result JSON

  • **opts (Any) – Additional options (unicode_form, strip_zero_width, etc.)

Return type:

Dict[str, Any]

Returns:

Intake result dict

Example

>>> result = client.intake_plaintext_text(
...     ws,
...     text="Sample text...",
...     classify=True,
...     save_as="intake_result"
... )
intake_plaintext_file(ws, *, file_path, save_as=None, **opts)[source]

Intake plaintext from file (local processing).

Parameters:
  • ws (Any) – Workspace instance

  • file_path (str) – Path to plaintext file

  • save_as (Optional[str]) – Optional name to save intake result JSON

  • **opts (Any) – Additional options

Return type:

Dict[str, Any]

Returns:

Intake result dict

Example

>>> result = client.intake_plaintext_file(
...     ws,
...     file_path="sample.txt",
...     save_as="intake_result"
... )
detect_forms(ws, *, text, forms=None, threshold=0.55, use_context=True, save_as=None)[source]

Detect heuristic forms (headings, lists, paragraphs, etc.) in plaintext.

Runs the SDK’s line classifier against each line and returns classifications filtered by form type and confidence threshold.

Parameters:
  • ws (Any) – Workspace instance

  • text (str) – Plaintext content to classify

  • forms (Optional[List[str]]) – Optional list of form codes to keep (e.g. ["H-SHORT", "L-BULLET"]). None returns all.

  • threshold (float) – Minimum confidence score (0.0–1.0, default 0.55)

  • use_context (bool) – Use surrounding-line context for better accuracy

  • save_as (Optional[str]) – Optional name to save result JSON in workspace

Return type:

Dict[str, Any]

Returns:

Dict with keys classifications, total_lines, matched_lines, forms_filter, threshold.

Example

>>> result = client.detect_forms(
...     ws,
...     text=open("doc.txt").read(),
...     forms=["H-SHORT", "L-BULLET"],
... )
>>> for c in result["classifications"]:
...     print(c["pattern_type"], c["text"][:60])
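The form filter and confidence threshold compose as a simple two-condition filter over the classifier output. A sketch under assumed key names (the confidence key is not shown in the example above and is an assumption):

```python
def filter_classifications(classifications, forms=None, threshold=0.55):
    # Keep lines whose form is in the filter (or all forms, if None)
    # and whose confidence meets the threshold.
    return [
        c for c in classifications
        if (forms is None or c["pattern_type"] in forms)
        and c["confidence"] >= threshold
    ]

lines = [
    {"pattern_type": "H-SHORT", "confidence": 0.9, "text": "Intro"},
    {"pattern_type": "P-BODY", "confidence": 0.8, "text": "Some prose."},
    {"pattern_type": "L-BULLET", "confidence": 0.4, "text": "- item"},
]
hits = filter_classifications(lines, forms=["H-SHORT", "L-BULLET"])
# The bullet line is dropped: right form, but below threshold.
```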
detect_forms_file(ws, *, file_path, forms=None, threshold=0.55, use_context=True, save_as=None)[source]

Detect heuristic forms in a plaintext file.

Reads the file and delegates to detect_forms().

Parameters:
  • ws (Any) – Workspace instance

  • file_path (str) – Path to plaintext file

  • forms (Optional[List[str]]) – Optional form-code filter list

  • threshold (float) – Minimum confidence score

  • use_context (bool) – Use surrounding-line context

  • save_as (Optional[str]) – Optional name to save result JSON

Return type:

Dict[str, Any]

Returns:

Same dict as detect_forms()

chunk_plaintext_text(ws, *, text, threshold=0.55, heading_forms=None, save_as=None)[source]

Split plaintext into heading-bounded chunks.

Runs heading detection on each line and splits the text at heading boundaries so each chunk can be processed independently (e.g. fed to an LLM one section at a time).

Parameters:
  • ws (Any) – Workspace instance

  • text (str) – Plaintext content to chunk

  • threshold (float) – Heading-detection confidence threshold (default 0.55)

  • heading_forms (Optional[List[str]]) – Optional list of heading forms to split on (e.g. ["H-SHORT", "H-SECTION-N"]). None uses all.

  • save_as (Optional[str]) – Optional name to save result JSON

Return type:

Dict[str, Any]

Returns:

Dict with keys chunks (list), total_chunks, total_lines, headings_detected.

Example

>>> result = client.chunk_plaintext_text(
...     ws, text=open("doc.txt").read()
... )
>>> for chunk in result["chunks"]:
...     print(chunk["heading_text"], "->", len(chunk["plaintext"]), "chars")
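Heading-bounded chunking can be sketched as a single pass that starts a new chunk at each detected heading. This is an illustration of the splitting idea, not the SDK's classifier; the trivial is_heading heuristic here stands in for the real heading detection:

```python
def chunk_by_headings(lines, is_heading):
    # Start a new chunk at every detected heading; a chunk carries its
    # heading text plus the body lines up to the next heading.
    chunks, current = [], {"heading_text": None, "lines": []}
    for line in lines:
        if is_heading(line):
            if current["lines"] or current["heading_text"] is not None:
                chunks.append(current)
            current = {"heading_text": line, "lines": []}
        else:
            current["lines"].append(line)
    chunks.append(current)
    return chunks

text = ["Overview", "intro line", "Details", "more text", "even more"]
# Toy heuristic: any line containing an uppercase letter is a heading.
chunks = chunk_by_headings(text, is_heading=lambda l: not l.islower())
```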
chunk_plaintext_file(ws, *, file_path, threshold=0.55, heading_forms=None, save_as=None)[source]

Split a plaintext file into heading-bounded chunks.

Reads the file and delegates to chunk_plaintext_text().

Parameters:
  • ws (Any) – Workspace instance

  • file_path (str) – Path to plaintext file

  • threshold (float) – Heading-detection confidence threshold

  • heading_forms (Optional[List[str]]) – Optional heading-form filter list

  • save_as (Optional[str]) – Optional name to save result JSON

Return type:

Dict[str, Any]

Returns:

Same dict as chunk_plaintext_text()

target_chunks(ws, *, prompt, text=None, chunks=None, threshold=0.3, save_as=None)[source]

Run prompt-based targeting on chunks to select only relevant sections.

Given a user modify request (prompt) and either raw text or pre-built chunks, classifies the prompt intent and scores each chunk for relevance. Returns the selected subset plus scoring metadata.

Parameters:
  • ws (Any) – Workspace instance

  • prompt (str) – The user’s modify request to classify

  • text (Optional[str]) – Raw plaintext to auto-chunk first (mutually exclusive with chunks)

  • chunks (Optional[List[Dict[str, Any]]]) – Pre-built chunk dicts with plaintext or content key (mutually exclusive with text)

  • threshold (float) – Minimum relevance score for selection (0.0–1.0, default 0.3)

  • save_as (Optional[str]) – Optional name to save result JSON in workspace

Return type:

Dict[str, Any]

Returns:

Dict with keys analysis, selected_chunks, all_scores, strategy, chunks_total, chunks_selected, token_savings.

Raises:

ForgeClientError – If neither text nor chunks is provided, or processing fails

Example

>>> result = client.target_chunks(
...     ws,
...     prompt="Format the abstract as a block quote",
...     text=open("doc.txt").read(),
... )
>>> print(f"Selected {result['chunks_selected']}/{result['chunks_total']} chunks")
target_chunks_file(ws, *, prompt, file_path, threshold=0.3, save_as=None)[source]

Run prompt-based targeting on a plaintext file.

Reads the file and delegates to target_chunks().

Parameters:
  • ws (Any) – Workspace instance

  • prompt (str) – The user’s modify request to classify

  • file_path (str) – Path to plaintext file

  • threshold (float) – Minimum relevance score for selection

  • save_as (Optional[str]) – Optional name to save result JSON

Return type:

Dict[str, Any]

Returns:

Same dict as target_chunks()

chunk_docx(ws, *, docx_path, threshold=0.55, save_as=None)[source]

Chunk a DOCX document into heading-bounded sections.

Extracts paragraph text from the DOCX, detects headings via heuristics, and splits into chunks. Each chunk includes the heading metadata and the plaintext content of that section.

Parameters:
  • ws (Any) – Workspace instance

  • docx_path (str) – Path to DOCX file

  • threshold (float) – Heading-detection confidence threshold (default 0.55)

  • save_as (Optional[str]) – Optional name to save result JSON

Return type:

Dict[str, Any]

Returns:

Dict with keys chunks, total_chunks, total_paragraphs, headings_detected.

Example

>>> result = client.chunk_docx(ws, docx_path="report.docx")
>>> for chunk in result["chunks"]:
...     print(chunk["heading_text"], "->", len(chunk["plaintext"]), "chars")
index_document(ws, *, text, section_forms=None, annotate_forms=None, threshold=0.55, use_context=True, save_as=None)[source]

Build a structured document index with heading-bounded sections and optional form-annotated segments.

Combines heading detection (for section boundaries) with line classification (for segment annotation) to produce an index that lets you request specific form types AND get the content between reference points.

Parameters:
  • ws (Any) – Workspace instance

  • text (str) – Plaintext content to index

  • section_forms (Optional[List[str]]) – Heading form codes that define section boundaries (e.g. ["H-SHORT", "H-SECTION-N"]). None uses all heading forms.

  • annotate_forms (Optional[List[str]]) – Form codes to annotate as segments within sections (e.g. ["L-BULLET", "T-ROW"]). None skips classification entirely (faster). [] runs classification but matches nothing.

  • threshold (float) – Minimum confidence score (0.0–1.0, default 0.55)

  • use_context (bool) – Use surrounding-line context for classification

  • save_as (Optional[str]) – Optional name to save result JSON in workspace

Return type:

Dict[str, Any]

Returns:

Dict with keys sections, preamble, total_sections, total_lines, headings_detected, section_forms, annotate_forms.

Example

>>> result = client.index_document(
...     ws,
...     text=open("doc.txt").read(),
...     annotate_forms=["L-BULLET", "T-ROW"],
... )
>>> for sec in result["sections"]:
...     print(sec["heading"]["text"], len(sec["segments"]), "segments")
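The shape of the returned index can be navigated with plain dict access. The nesting below is inferred from the example above (heading.text, segments[].form); field names beyond those shown are assumptions:

```python
# A hand-built result mirroring the documented keys, to show navigation.
index = {
    "preamble": {"lines": ["Title page text"]},
    "total_sections": 2,
    "sections": [
        {"heading": {"text": "Overview"},
         "segments": [{"form": "L-BULLET", "count": 3}]},
        {"heading": {"text": "Results"},
         "segments": []},
    ],
}

# Find sections that contain at least one bullet-list segment.
bulleted = [
    s["heading"]["text"]
    for s in index["sections"]
    if any(seg["form"] == "L-BULLET" for seg in s["segments"])
]
```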
index_document_file(ws, *, file_path, section_forms=None, annotate_forms=None, threshold=0.55, use_context=True, save_as=None)[source]

Build a structured document index from a plaintext file.

Reads the file and delegates to index_document().

Parameters:
  • ws (Any) – Workspace instance

  • file_path (str) – Path to plaintext file

  • section_forms (Optional[List[str]]) – Heading form codes for section boundaries

  • annotate_forms (Optional[List[str]]) – Form codes to annotate as segments

  • threshold (float) – Minimum confidence score

  • use_context (bool) – Use surrounding-line context

  • save_as (Optional[str]) – Optional name to save result JSON

Return type:

Dict[str, Any]

Returns:

Same dict as index_document()

index_docx(ws, *, docx_path, section_forms=None, annotate_forms=None, threshold=0.55, use_context=True, save_as=None)[source]

Build a structured document index from a DOCX file.

Extracts paragraph text from the DOCX, detects headings for section boundaries, and optionally annotates segments within each section.

Parameters:
  • ws (Any) – Workspace instance

  • docx_path (str) – Path to DOCX file

  • section_forms (Optional[List[str]]) – Heading form codes for section boundaries

  • annotate_forms (Optional[List[str]]) – Form codes to annotate as segments

  • threshold (float) – Minimum confidence score (default 0.55)

  • use_context (bool) – Use surrounding-line context

  • save_as (Optional[str]) – Optional name to save result JSON

Return type:

Dict[str, Any]

Returns:

Dict with keys sections, preamble, total_sections, total_paragraphs, headings_detected, section_forms, annotate_forms.

Example

>>> result = client.index_docx(
...     ws,
...     docx_path="report.docx",
...     annotate_forms=["L-BULLET", "T-ROW"],
... )
ask(*, message, tenant_id=None, user_id=None, conversation_id=None, conversation_history=None, current_schema=None, current_plaintext=None, current_document=None, real_time=False, strict_validation=False, enable_targeting=False)[source]

Send a message to the Glyph Agent multi-agent system via API.

This endpoint orchestrates:

  1. Intent classification

  2. Agent routing (schema, plaintext, validation, conversation)

  3. Multi-step workflows

  4. Markup application

  5. Conversation state management

Parameters:
  • message (str) – The message to send to the agent (required)

  • tenant_id (Optional[str]) – Tenant identifier for rate limiting

  • user_id (Optional[str]) – User identifier for rate limiting

  • conversation_id (Optional[str]) – Conversation ID for context tracking

  • conversation_history (Optional[List[Dict[str, str]]]) – Previous conversation messages for context; a list of dicts with ‘role’ and ‘content’ keys

  • current_schema (Optional[Dict[str, Any]]) – Current schema state (for incremental modifications)

  • current_plaintext (Optional[str]) – Current plaintext content (for incremental modifications)

  • current_document (Optional[Dict[str, Any]]) – Legacy combined document state

  • real_time (bool) – Enable real-time sandbox updates

  • strict_validation (bool) – Enable strict validation mode

  • enable_targeting (bool) – If True and current_plaintext is provided, run local chunk targeting to reduce payload size before sending to the API (default: False)

Returns:

Dict with keys:

  • response: The agent’s response message

  • document: Generated or modified document (if applicable)

  • schema/document_schema: Document schema (if schema request)

  • plaintext: Generated plaintext content

  • validation_result: Validation results (if validation request)

  • metadata: Additional metadata (intent, routing, etc.)

  • usage: Token usage information

  • conversation_id: Conversation ID for tracking

Return type:

Dict[str, Any]

Example

>>> client = ForgeClient()
>>> response = client.ask(
...     message="Create a schema for a quarterly report",
...     user_id="user123"
... )
>>> print(response['response'])
>>> if 'schema' in response:
...     print(f"Schema generated: {len(response['schema']['pattern_descriptors'])} descriptors")
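Across turns, conversation_history is carried forward as a growing list of role/content dicts. A minimal sketch of maintaining it between ask() calls (the helper name is illustrative, not part of the SDK):

```python
def append_turn(history, role, content):
    # conversation_history is a list of {'role', 'content'} dicts;
    # pass the accumulated list back in on each ask() call.
    return history + [{"role": role, "content": content}]

history = []
history = append_turn(history, "user", "Create a schema for a quarterly report")
history = append_turn(history, "assistant", "Schema generated with 12 descriptors.")
# history is now ready to pass as conversation_history on the next turn.
```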

Core Methods

Schema Building

ForgeClient.build_schema_from_docx(ws, *, docx_path, save_as=None, include_artifacts=False)[source]

Build a schema from a DOCX file using the local SDK.

Parameters:
  • ws (Any) – Workspace instance for saving artifacts

  • docx_path (str) – Path to DOCX file (absolute or CWD-relative)

  • save_as (Optional[str]) – Optional name to save schema JSON (without .json extension)

  • include_artifacts (bool) – If True, save tagged DOCX + unzipped files (default: False)

Return type:

Dict[str, Any]

Returns:

Schema dict

Raises:

ForgeClientError – File not found or processing error

Example

>>> schema = client.build_schema_from_docx(
...     ws,
...     docx_path="sample.docx",
...     save_as="my_schema"
... )

Schema Running

ForgeClient.run_schema(ws, *, schema, plaintext, dest_name='assembled_output.docx')[source]

Run a schema with plaintext to generate a DOCX using the local SDK.

Parameters:
  • ws (Any) – Workspace instance

  • schema (Dict[str, Any]) – Schema dict (from build_schema_from_docx or loaded JSON)

  • plaintext (str) – Input text content

  • dest_name (str) – Name for output DOCX file (saved in output_docx directory)

Return type:

str

Returns:

Local path to saved DOCX file

Raises:

ForgeClientError – Failed to run schema or save DOCX

Example

>>> docx_path = client.run_schema(
...     ws,
...     schema=schema,
...     plaintext="Sample text...",
...     dest_name="output.docx"
... )

Bulk Processing

ForgeClient.run_schema_bulk(ws, *, schema, plaintexts, max_concurrent=5, dest_name_pattern='output_{index}.docx')[source]

Run a schema with multiple plaintexts to generate multiple DOCX files.

Parameters:
  • ws (Any) – Workspace instance

  • schema (Dict[str, Any]) – Schema dict (from build_schema_from_docx or loaded JSON)

  • plaintexts (list[str]) – List of plaintext strings to process

  • max_concurrent (int) – Ignored in local SDK mode (processed sequentially)

  • dest_name_pattern (str) – Pattern for output filenames. Use {index} placeholder

Return type:

Dict[str, Any]

Returns:

Dict containing results with status, paths, and timing info

Example

>>> result = client.run_schema_bulk(
...     ws,
...     schema=schema,
...     plaintexts=["Text 1...", "Text 2...", "Text 3..."],
...     dest_name_pattern="invoice_{index}.docx"
... )

Schema Compression

ForgeClient.compress_schema(ws, *, schema, save_as=None)[source]

Compress a schema by deduplicating redundant pattern descriptors.

Parameters:
  • ws (Any) – Workspace instance

  • schema (Dict[str, Any]) – Schema dict to compress

  • save_as (Optional[str]) – Optional name to save compressed schema JSON

Return type:

Dict[str, Any]

Returns:

Dict containing compressed_schema and stats

Example

>>> result = client.compress_schema(
...     ws,
...     schema=schema,
...     save_as="compressed_schema"
... )

Plaintext Intake

ForgeClient.intake_plaintext_text(ws, *, text, classify=False, save_as=None, **opts)[source]

Intake plaintext via text string (local processing).

Parameters:
  • ws (Any) – Workspace instance

  • text (str) – Plaintext content to intake

  • classify (bool) – If True, run heuristic classification on each line (adds classifications key to the result)

  • save_as (Optional[str]) – Optional name to save intake result JSON

  • **opts (Any) – Additional options (unicode_form, strip_zero_width, etc.)

Return type:

Dict[str, Any]

Returns:

Intake result dict

Example

>>> result = client.intake_plaintext_text(
...     ws,
...     text="Sample text...",
...     classify=True,
...     save_as="intake_result"
... )
ForgeClient.intake_plaintext_file(ws, *, file_path, save_as=None, **opts)[source]

Intake plaintext from file (local processing).

Parameters:
  • ws (Any) – Workspace instance

  • file_path (str) – Path to plaintext file

  • save_as (Optional[str]) – Optional name to save intake result JSON

  • **opts (Any) – Additional options

Return type:

Dict[str, Any]

Returns:

Intake result dict

Example

>>> result = client.intake_plaintext_file(
...     ws,
...     file_path="sample.txt",
...     save_as="intake_result"
... )

Form Detection

ForgeClient.detect_forms(ws, *, text, forms=None, threshold=0.55, use_context=True, save_as=None)[source]

Detect heuristic forms (headings, lists, paragraphs, etc.) in plaintext.

Runs the SDK’s line classifier against each line and returns classifications filtered by form type and confidence threshold.

Parameters:
  • ws (Any) – Workspace instance

  • text (str) – Plaintext content to classify

  • forms (Optional[List[str]]) – Optional list of form codes to keep (e.g. ["H-SHORT", "L-BULLET"]). None returns all.

  • threshold (float) – Minimum confidence score (0.0–1.0, default 0.55)

  • use_context (bool) – Use surrounding-line context for better accuracy

  • save_as (Optional[str]) – Optional name to save result JSON in workspace

Return type:

Dict[str, Any]

Returns:

Dict with keys classifications, total_lines, matched_lines, forms_filter, threshold.

Example

>>> result = client.detect_forms(
...     ws,
...     text=open("doc.txt").read(),
...     forms=["H-SHORT", "L-BULLET"],
... )
>>> for c in result["classifications"]:
...     print(c["pattern_type"], c["text"][:60])
ForgeClient.detect_forms_file(ws, *, file_path, forms=None, threshold=0.55, use_context=True, save_as=None)[source]

Detect heuristic forms in a plaintext file.

Reads the file and delegates to detect_forms().

Parameters:
  • ws (Any) – Workspace instance

  • file_path (str) – Path to plaintext file

  • forms (Optional[List[str]]) – Optional form-code filter list

  • threshold (float) – Minimum confidence score

  • use_context (bool) – Use surrounding-line context

  • save_as (Optional[str]) – Optional name to save result JSON

Return type:

Dict[str, Any]

Returns:

Same dict as detect_forms()

Chunking

ForgeClient.chunk_plaintext_text(ws, *, text, threshold=0.55, heading_forms=None, save_as=None)[source]

Split plaintext into heading-bounded chunks.

Runs heading detection on each line and splits the text at heading boundaries so each chunk can be processed independently (e.g. fed to an LLM one section at a time).

Parameters:
  • ws (Any) – Workspace instance

  • text (str) – Plaintext content to chunk

  • threshold (float) – Heading-detection confidence threshold (default 0.55)

  • heading_forms (Optional[List[str]]) – Optional list of heading forms to split on (e.g. ["H-SHORT", "H-SECTION-N"]). None uses all.

  • save_as (Optional[str]) – Optional name to save result JSON

Return type:

Dict[str, Any]

Returns:

Dict with keys chunks (list), total_chunks, total_lines, headings_detected.

Example

>>> result = client.chunk_plaintext_text(
...     ws, text=open("doc.txt").read()
... )
>>> for chunk in result["chunks"]:
...     print(chunk["heading_text"], "->", len(chunk["plaintext"]), "chars")
ForgeClient.chunk_plaintext_file(ws, *, file_path, threshold=0.55, heading_forms=None, save_as=None)[source]

Split a plaintext file into heading-bounded chunks.

Reads the file and delegates to chunk_plaintext_text().

Parameters:
  • ws (Any) – Workspace instance

  • file_path (str) – Path to plaintext file

  • threshold (float) – Heading-detection confidence threshold

  • heading_forms (Optional[List[str]]) – Optional heading-form filter list

  • save_as (Optional[str]) – Optional name to save result JSON

Return type:

Dict[str, Any]

Returns:

Same dict as chunk_plaintext_text()

ForgeClient.chunk_docx(ws, *, docx_path, threshold=0.55, save_as=None)[source]

Chunk a DOCX document into heading-bounded sections.

Extracts paragraph text from the DOCX, detects headings via heuristics, and splits into chunks. Each chunk includes the heading metadata and the plaintext content of that section.

Parameters:
  • ws (Any) – Workspace instance

  • docx_path (str) – Path to DOCX file

  • threshold (float) – Heading-detection confidence threshold (default 0.55)

  • save_as (Optional[str]) – Optional name to save result JSON

Return type:

Dict[str, Any]

Returns:

Dict with keys chunks, total_chunks, total_paragraphs, headings_detected.

Example

>>> result = client.chunk_docx(ws, docx_path="report.docx")
>>> for chunk in result["chunks"]:
...     print(chunk["heading_text"], "->", len(chunk["plaintext"]), "chars")

Document Indexing

ForgeClient.index_document(ws, *, text, section_forms=None, annotate_forms=None, threshold=0.55, use_context=True, save_as=None)[source]

Build a structured document index with heading-bounded sections and optional form-annotated segments.

Combines heading detection (for section boundaries) with line classification (for segment annotation) to produce an index that lets you request specific form types AND get the content between reference points.

Parameters:
  • ws (Any) – Workspace instance

  • text (str) – Plaintext content to index

  • section_forms (Optional[List[str]]) – Heading form codes that define section boundaries (e.g. ["H-SHORT", "H-SECTION-N"]). None uses all heading forms.

  • annotate_forms (Optional[List[str]]) – Form codes to annotate as segments within sections (e.g. ["L-BULLET", "T-ROW"]). None skips classification entirely (faster). [] runs classification but matches nothing.

  • threshold (float) – Minimum confidence score (0.0–1.0, default 0.55)

  • use_context (bool) – Use surrounding-line context for classification

  • save_as (Optional[str]) – Optional name to save result JSON in workspace

Return type:

Dict[str, Any]

Returns:

Dict with keys sections, preamble, total_sections, total_lines, headings_detected, section_forms, annotate_forms.

Example

>>> result = client.index_document(
...     ws,
...     text=open("doc.txt").read(),
...     annotate_forms=["L-BULLET", "T-ROW"],
... )
>>> for sec in result["sections"]:
...     print(sec["heading"]["text"], len(sec["segments"]), "segments")
ForgeClient.index_document_file(ws, *, file_path, section_forms=None, annotate_forms=None, threshold=0.55, use_context=True, save_as=None)[source]

Build a structured document index from a plaintext file.

Reads the file and delegates to index_document().

Parameters:
  • ws (Any) – Workspace instance

  • file_path (str) – Path to plaintext file

  • section_forms (Optional[List[str]]) – Heading form codes for section boundaries

  • annotate_forms (Optional[List[str]]) – Form codes to annotate as segments

  • threshold (float) – Minimum confidence score

  • use_context (bool) – Use surrounding-line context

  • save_as (Optional[str]) – Optional name to save result JSON

Return type:

Dict[str, Any]

Returns:

Same dict as index_document()

ForgeClient.index_docx(ws, *, docx_path, section_forms=None, annotate_forms=None, threshold=0.55, use_context=True, save_as=None)[source]

Build a structured document index from a DOCX file.

Extracts paragraph text from the DOCX, detects headings for section boundaries, and optionally annotates segments within each section.

Parameters:
  • ws (Any) – Workspace instance

  • docx_path (str) – Path to DOCX file

  • section_forms (Optional[List[str]]) – Heading form codes for section boundaries

  • annotate_forms (Optional[List[str]]) – Form codes to annotate as segments

  • threshold (float) – Minimum confidence score (default 0.55)

  • use_context (bool) – Use surrounding-line context

  • save_as (Optional[str]) – Optional name to save result JSON

Return type:

Dict[str, Any]

Returns:

Dict with keys sections, preamble, total_sections, total_paragraphs, headings_detected, section_forms, annotate_forms.

Example

>>> result = client.index_docx(
...     ws,
...     docx_path="report.docx",
...     annotate_forms=["L-BULLET", "T-ROW"],
... )

Agent API

ForgeClient.ask(*, message, tenant_id=None, user_id=None, conversation_id=None, conversation_history=None, current_schema=None, current_plaintext=None, current_document=None, real_time=False, strict_validation=False, enable_targeting=False)[source]

Send a message to the Glyph Agent multi-agent system via API.

This endpoint orchestrates:

  1. Intent classification

  2. Agent routing (schema, plaintext, validation, conversation)

  3. Multi-step workflows

  4. Markup application

  5. Conversation state management

Parameters:
  • message (str) – The message to send to the agent (required)

  • tenant_id (Optional[str]) – Tenant identifier for rate limiting

  • user_id (Optional[str]) – User identifier for rate limiting

  • conversation_id (Optional[str]) – Conversation ID for context tracking

  • conversation_history (Optional[List[Dict[str, str]]]) – Previous conversation messages for context; a list of dicts with ‘role’ and ‘content’ keys

  • current_schema (Optional[Dict[str, Any]]) – Current schema state (for incremental modifications)

  • current_plaintext (Optional[str]) – Current plaintext content (for incremental modifications)

  • current_document (Optional[Dict[str, Any]]) – Legacy combined document state

  • real_time (bool) – Enable real-time sandbox updates

  • strict_validation (bool) – Enable strict validation mode

  • enable_targeting (bool) – If True and current_plaintext is provided, run local chunk targeting to reduce payload size before sending to the API (default: False)

Returns:

Dict with keys:

  • response: The agent’s response message

  • document: Generated or modified document (if applicable)

  • schema/document_schema: Document schema (if schema request)

  • plaintext: Generated plaintext content

  • validation_result: Validation results (if validation request)

  • metadata: Additional metadata (intent, routing, etc.)

  • usage: Token usage information

  • conversation_id: Conversation ID for tracking

Return type:

Dict[str, Any]

Example

>>> client = ForgeClient()
>>> response = client.ask(
...     message="Create a schema for a quarterly report",
...     user_id="user123"
... )
>>> print(response['response'])
>>> if 'schema' in response:
...     print(f"Schema generated: {len(response['schema']['pattern_descriptors'])} descriptors")

Client Management

ForgeClient.close()[source]

Close the client and clean up resources.

Usage Examples

Basic Schema Build and Run

from glyph_forge import ForgeClient, create_workspace

# Initialize (no API key needed; all processing is local)
client = ForgeClient()
ws = create_workspace()

# Build schema
schema = client.build_schema_from_docx(
    ws,
    docx_path="template.docx",
    save_as="my_schema"
)

# Run schema
output = client.run_schema(
    ws,
    schema=schema,
    plaintext="Content here...",
    dest_name="output.docx"
)

With Context Manager

from glyph_forge import ForgeClient, create_workspace

ws = create_workspace()

with ForgeClient() as client:
    schema = client.build_schema_from_docx(
        ws,
        docx_path="template.docx"
    )

Bulk Processing

# Process multiple documents at once
plaintexts = ["Text 1...", "Text 2...", "Text 3..."]

result = client.run_schema_bulk(
    ws,
    schema=schema,
    plaintexts=plaintexts,
    max_concurrent=5,
    dest_name_pattern="output_{index}.docx"
)

print(f"Processed {result['successful']} of {result['total']}")

Schema Compression

# Compress schema to optimize size
result = client.compress_schema(
    ws,
    schema=schema,
    save_as="compressed_schema"
)

print(f"Reduced from {result['stats']['original_count']} "
      f"to {result['stats']['compressed_count']} pattern descriptors")

Document Indexing

# Index a document with section structure only
result = client.index_document(ws, text=open("report.txt").read())

for sec in result["sections"]:
    print(f"{sec['heading']['text']} (lines {sec['span']['start']}-{sec['span']['end']})")

# Index with segment annotations for specific form types
result = client.index_document(
    ws,
    text=open("report.txt").read(),
    annotate_forms=["L-BULLET", "T-ROW"],
)

for sec in result["sections"]:
    for seg in sec["segments"]:
        print(f"  {seg['form']}: {seg['count']} lines at {seg['span']}")

# Index a DOCX file
result = client.index_docx(
    ws,
    docx_path="report.docx",
    annotate_forms=["L-BULLET"],
    save_as="report_index",
)