LLM Span Qualifier

The eds.llm_span_qualifier component qualifies spans using a Large Language Model (LLM) that returns structured JSON attributes.

This component takes existing spans, wraps them with <ent> markers inside a context window, and prompts an LLM to answer with a JSON object that matches the configured schema. The response is validated and written back to the span's extension attributes.

In practice, the component sends the LLM a system prompt that constrains the allowed attributes, optional few-shot examples provided as previous user/assistant messages, and snippets such as:

Biopsies du <date>12/02/2025</date> : adénocarcinome.

and expects a minimal JSON answer, for example:

{"biopsy_procedure": "yes"}
which is then parsed and assigned to the span attributes.

Experimental

This component is experimental. The API and behavior may change in future versions. Make sure to pin your edsnlp version if you use it in a project.

Dependencies

This component requires several dependencies. Run the following command to install them:

pip install openai bm25s PyStemmer
We also recommend adding them to your pyproject.toml or requirements.txt.

Examples

If your data is sensitive, we recommend using a self-hosted model behind an OpenAI-compatible API, for example served with vLLM.

You can store your OpenAI API key in the OPENAI_API_KEY environment variable.

import os
os.environ["OPENAI_API_KEY"] = "your_api_key_here"

Start a server with the model of your choice:

python -m vllm.entrypoints.openai.api_server \
   --model mistral-small-24b-instruct-2501 \
   --port 8080 \
   --enable-prefix-caching

You can then use the llm_span_qualifier component as follows:

from typing import Annotated, TypedDict
from pydantic import BaseModel, BeforeValidator, Field, PlainSerializer
import edsnlp, edsnlp.pipes as eds

# Pydantic schema used to validate and parse the LLM response
# The output will be a boolean field.
# Example:
# ent._.biopsy_procedure → False
class BiopsySchema1(BaseModel):
    biopsy_procedure: bool = Field(
        ..., description="Is the span a biopsy procedure or not"
    )

# Alternative schema using a TypedDict
# The output will be a dict with a boolean value instead of a boolean field.
# Example:
# ent._.biopsy_procedure → {'biopsy_procedure': False}
class BiopsySchema2(TypedDict):
    biopsy_procedure: bool

# Alternative annotated schema with custom (de)serializers.
# This schema transforms the LLM’s output into a boolean before validation.
# Any case-insensitive variant of "yes", "y", or "true" is interpreted as True;
# all other values are treated as False.
#
# When serializing to JSON, the boolean is converted back into the strings
# "yes" (for True) or "no" (for False).
# The output will be a boolean field.
# Example:
# ent._.biopsy_procedure → False
BiopsySchema3 = Annotated[
    bool,
    BeforeValidator(lambda v: str(v).lower() in {"yes", "y", "true"}),
    PlainSerializer(lambda v: "yes" if v else "no", when_used="json"),
]
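
# Hedged aside, not part of the component's API: with pydantic v2 you can
# check how the Annotated schema behaves through a TypeAdapter.
from pydantic import TypeAdapter

_biopsy_adapter = TypeAdapter(BiopsySchema3)
assert _biopsy_adapter.validate_python("Yes") is True  # "Yes" → True
assert _biopsy_adapter.dump_json(True) == b'"yes"'  # True → "yes" in JSON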


PROMPT = """
You are a span classifier. The user sends text where the target is
marked with <ent>...</ent>. Answer ONLY with a JSON value: "yes" or
"no" indicating whether the span is a biopsy date.
""".strip()

nlp = edsnlp.blank("eds")
nlp.add_pipe(eds.sentences())
nlp.add_pipe(eds.dates(span_setter="ents"))

# EDS-NLP util to create documents from Markdown or XML markup.
# This has nothing to do with the LLM component itself. The following
# will create docs with entities labelled "date", store them in doc.ents,
# and set their span._.biopsy_procedure attribute.
examples = list(edsnlp.data.from_iterable(
    [
        "IRM du 10/02/2025. Biopsies du <date biopsy_procedure=true>12/02/2025</date> : adénocarcinome.",
        "Chirurgie le 24/12/2021. Colectomie. Consultation du <date biopsy_procedure=false>26/12/2021</date>.",
    ],
    converter="markup",
    preset="xml",
).map(nlp.pipes.sentences))

doc_to_xml = edsnlp.data.converters.DocToMarkupConverter(preset="xml")
nlp.add_pipe(
    eds.llm_span_qualifier(
        api_url="http://localhost:8080/v1",
        model="mistral-small-24b-instruct-2501",
        prompt=PROMPT,
        span_getter="ents",
        context_getter="sent",
        context_formatter=doc_to_xml,
        attributes=["biopsy_procedure"],
        output_schema=BiopsySchema1,  # or BiopsySchema2 or BiopsySchema3
        examples=examples,
        max_few_shot_examples=2,
        max_concurrent_requests=4,
        seed=0,
    )
)

text = """
RCP Prostate – 20/02/2025
Biopsies du 12/02/2025 : adénocarcinome Gleason 4+4=8.
Simulation scanner le 25/02/2025.
"""
doc = nlp(text)
for d in doc.ents:
    print(d.text, "→ biopsy_procedure:", d._.biopsy_procedure)
# Out: 20/02/2025 → biopsy_procedure: False
# Out: 12/02/2025 → biopsy_procedure: True
# Out: 25/02/2025 → biopsy_procedure: False
A second example qualifies COVID mentions with two attributes, a negation flag and an optional date:

from typing import Optional
import datetime
from pydantic import BaseModel, Field
import edsnlp, edsnlp.pipes as eds

# Pydantic schema used to validate the LLM response, serialize the
# few-shot example answers, and constrain the model output.
class CovidMentionSchema(BaseModel):
    negation: bool = Field(..., description="Is the span negated or not")
    date: Optional[datetime.date] = Field(
        None, description="Date associated with the span, if any"
    )

PROMPT = """
You are a span classifier. For every piece of markup-annotated text the
user provides, you predict the attributes of the annotated spans.
You must follow these rules strictly:
- Be consistent, similar queries must lead to similar answers.
- Do not add any comment or explanation, just provide the answer.
Example with a negation and a date:
User: "Le 1er mai 2024, le patient a été testé <ent>covid</ent> négatif"
Assistant: "{"negation": true, "date": "2024-05-01"}"
For each span, provide a JSON with a "negation" boolean attribute, set to
true if the span is negated, false otherwise. If a date is associated with
the span, provide it as a "date" attribute in ISO format (YYYY-MM-DD).
""".strip()

nlp = edsnlp.blank("eds")
nlp.add_pipe(eds.sentences())
nlp.add_pipe(eds.covid())

# EDS-NLP util to create documents from Markdown or XML markup.
# This has nothing to do with the LLM component itself.
examples = list(edsnlp.data.from_iterable(
    [
        "<ent negation=false date=2024-05-01>Covid</ent> positif le 1er mai 2024.",
        "Pas de <ent negation=true>covid</ent>",
        # ... add more examples if you can
    ],
    converter="markup", preset="xml",
).map(nlp.pipes.sentences))

doc_to_xml = edsnlp.data.converters.DocToMarkupConverter(preset="xml")
nlp.add_pipe(
    eds.llm_span_qualifier(
        api_url="https://api.openai.com/v1",
        model="gpt-5-mini",
        prompt=PROMPT,
        span_getter="ents",
        context_getter="words[-10:10]",
        context_formatter=doc_to_xml,
        output_schema=CovidMentionSchema,
        examples=examples,
        max_few_shot_examples=1,
        max_concurrent_requests=4,
        seed=0,
    )
)
doc = nlp("Pas d'indication de covid le 3 mai 2024.")
(ent,) = doc.ents
print(ent.text, "→ negation:", ent._.negation, "date:", ent._.date)
# Out: covid → negation: True date: 2024-05-03

Advanced usage

You can also control the prompt more finely by providing a callable instead of a string. For example, to put few-shot examples in the system message and keep the span context as the user payload:

# Use this for the `prompt` argument instead of PROMPT above
def prompt(context_text, examples):
    messages = []
    system_content = (
        "You are a span classifier.\n"
        "Answer with JSON using the keys: biopsy_procedure.\n"
        "Here are some examples:\n"
    )
    for ex_context, ex_json in examples:
        system_content += f"- Context: {ex_context}\n"
        system_content += f"  JSON: {ex_json}\n"
    messages.append({"role": "system", "content": system_content})
    messages.append({"role": "user", "content": context_text})
    return messages
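
To check what messages such a callable produces, you can call it directly. The snippet below is a hedged sketch; the context string and the few-shot answer are made up for illustration:

msgs = prompt(
    "Biopsies du <ent>12/02/2025</ent> : adénocarcinome.",
    [("Consultation du <ent>26/12/2021</ent>.", '{"biopsy_procedure": "no"}')],
)
assert msgs[0]["role"] == "system"  # instructions plus the few-shot examples
assert msgs[1]["role"] == "user"  # the marked-up context to classify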

You can also control the context formatting by providing a custom callable to the context_formatter parameter. For example, to wrap the context with a custom prefix and suffix as follows:

from spacy.tokens import Doc

class ContextFormatter:
    def __init__(self, prefix: str, suffix: str):
        self.prefix = prefix
        self.suffix = suffix

    def __call__(self, context: Doc) -> str:
        span = context.ents[0].text if context.ents else ""
        prefix = self.prefix.format(span=span)
        suffix = self.suffix.format(span=span)
        return f"{prefix}{context.text}{suffix}"

context_formatter = ContextFormatter(
    prefix="\n## Context\n\n<<<\n",
    suffix="\n>>>\n\n## Instruction\nDoes '{span}' correspond to a biopsy date?",
)
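
As a quick check, you can apply the formatter to a small document built by hand. This is a hedged sketch: the document, the character offsets, and alignment_mode="expand" are illustrative and not part of the component's API:

import edsnlp

nlp_demo = edsnlp.blank("eds")
demo_doc = nlp_demo("Biopsies du 12/02/2025 : adénocarcinome.")
demo_doc.ents = [demo_doc.char_span(12, 22, label="date", alignment_mode="expand")]
print(context_formatter(demo_doc))
# Prints the text between the <<< and >>> markers, followed by the question
# "Does '12/02/2025' correspond to a biopsy date?"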

max_concurrent_requests parameter

We recommend setting max_concurrent_requests to a value greater than its default of 1 to improve throughput when processing batches of documents.
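
For example, with the biopsy pipeline built earlier and max_concurrent_requests=4, up to four LLM requests are in flight at once for each batch of documents (a hedged sketch):

texts = [
    "Biopsies du 12/02/2025 : adénocarcinome.",
    "IRM du 10/02/2025.",
]
for doc in nlp.pipe(texts):
    for ent in doc.ents:
        print(ent.text, "→ biopsy_procedure:", ent._.biopsy_procedure)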

Parameters

PARAMETER DESCRIPTION
nlp

Pipeline object.

TYPE: PipelineProtocol

name

Component name.

TYPE: str DEFAULT: 'llm_span_qualifier'

api_url

Base URL of the OpenAI-compatible API.

TYPE: str

model

Model identifier exposed by the API.

TYPE: str

prompt

The prompt is the main way to control the model's behavior. It can be either:

  • A string, which will be used as a system prompt. Few-shot examples (if any) will be provided as user/assistant messages before the actual user query.
  • A callable that takes two arguments and returns a list of messages in the format expected by the OpenAI chat completions API:

    • context: the context text with the target span marked up
    • examples: a list of few-shot examples, each being a tuple of (context, answer)

TYPE: Union[str, Callable[[Union[str, Doc], List[Tuple[Union[str, Doc], str]]], List[Dict[str, str]]]]

span_getter

Spans to classify. Defaults to {"ents": True}.

TYPE: Optional[SpanGetterArg] DEFAULT: None

context_getter

Optional context window specification (e.g. "sent", "words[-10:10]"). If None, the whole document text is used.

TYPE: Optional[ContextWindow] DEFAULT: None

context_formatter

Callable used to render the context passed to the LLM. Defaults to lambda context_getter_output: context_getter_output.text.

TYPE: Optional[Callable[[Doc], str]] DEFAULT: None

attributes

Attributes to predict. If omitted, the keys are inferred from the provided schema.

TYPE: Optional[AttributesArg] DEFAULT: None

output_schema

Pydantic model or annotated type used to validate responses and serialize few-shot examples. If the schema is a mapping/object, it is also used to force the model to output a valid JSON object.

TYPE: Optional[Union[Type[BaseModel], Type[Any], Annotated[Any, Any]]] DEFAULT: None

examples

Few-shot examples used in prompts.

TYPE: Optional[Iterable[Doc]] DEFAULT: None

max_few_shot_examples

Maximum number of few-shot examples per request (-1 means all).

TYPE: int DEFAULT: -1

use_retriever

Whether to select few-shot examples with BM25 (defaults to an automatic choice). If few-shot examples are provided and max_few_shot_examples > 0, this is enabled by default.

TYPE: Optional[bool] DEFAULT: None

seed

Optional seed forwarded to the API.

TYPE: Optional[int] DEFAULT: None

max_concurrent_requests

Maximum number of concurrent span requests per batch of documents.

TYPE: int DEFAULT: 1

api_kwargs

Extra keyword arguments forwarded to chat.completions.create.

TYPE: Dict[str, Any] DEFAULT: None

on_error

Error handling strategy. If "raise", exceptions are raised. If "warn", exceptions are logged as warnings and processing continues.

TYPE: Literal['raise', 'warn'] DEFAULT: 'raise'

timeout

Optional timeout (in seconds) for each LLM request.

TYPE: Optional[float] DEFAULT: None

default_headers

Optional default headers for the API client.

TYPE: Optional[Dict[str, str]] DEFAULT: {'Connection': 'close'}