LLM Span Qualifier

The eds.llm_span_qualifier component qualifies spans using a Large Language Model (LLM) that returns structured JSON attributes.

This component takes existing spans, wraps them with <ent> markers inside a context window, and prompts an LLM to answer with a JSON object that matches the configured schema. The response is validated and written back to the span's extension attributes.

In practice, the component sends the LLM a system prompt that constrains the allowed attributes, optional few-shot examples provided as previous user/assistant messages, and snippets such as:

Biopsies du <date>12/02/2025</date> : adénocarcinome.

and expects a minimal JSON answer, for example:

{"biopsy_procedure": "yes"}
which is then parsed and assigned to the span attributes.

Experimental

This component is experimental. The API and behavior may change in future versions. Make sure to pin your edsnlp version if you use it in a project.

Dependencies

This component requires several dependencies. Run the following command to install them:

pip install openai bm25s PyStemmer
We also recommend adding them to your pyproject.toml or requirements.txt.

Examples

If your data is sensitive, we recommend using a self-hosted model behind an OpenAI-compatible API, for example served with vLLM.

You can store your OpenAI API key in the OPENAI_API_KEY environment variable.

import os
os.environ["OPENAI_API_KEY"] = "your_api_key_here"

Start a server with the model of your choice:

python -m vllm.entrypoints.openai.api_server \
   --model mistral-small-24b-instruct-2501 \
   --port 8080 \
   --enable-prefix-caching

You can then use the llm_span_qualifier component as follows:

from typing import Annotated, TypedDict
from pydantic import BaseModel, BeforeValidator, Field, PlainSerializer
import edsnlp, edsnlp.pipes as eds

# Pydantic schema used to validate and parse the LLM response
# The output will be a boolean field.
# Example:
# ent._.biopsy_procedure → False
class BiopsySchema1(BaseModel):
    biopsy_procedure: bool = Field(
        ..., description="Is the span a biopsy procedure or not"
    )

# Alternative schema using a TypedDict
# The output will be a dict with a boolean value instead of a boolean field.
# Example:
# ent._.biopsy_procedure → {'biopsy_procedure': False}
class BiopsySchema2(TypedDict):
    biopsy_procedure: bool

# Alternative annotated schema with custom (de)serializers.
# This schema transforms the LLM’s output into a boolean before validation.
# Any case-insensitive variant of "yes", "y", or "true" is interpreted as True;
# all other values are treated as False.
#
# When serializing to JSON, the boolean is converted back into the strings
# "yes" (for True) or "no" (for False).
# The output will be a boolean field.
# Example:
# ent._.biopsy_procedure → False
BiopsySchema3 = Annotated[
    bool,
    BeforeValidator(lambda v: str(v).lower() in {"yes", "y", "true"}),
    PlainSerializer(lambda v: "yes" if v else "no", when_used="json"),
]
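
# Hedged aside, not part of the component's API: with pydantic v2 you can
# check how the Annotated schema behaves through a TypeAdapter.
from pydantic import TypeAdapter

_biopsy_adapter = TypeAdapter(BiopsySchema3)
assert _biopsy_adapter.validate_python("Yes") is True  # "Yes" → True
assert _biopsy_adapter.dump_json(True) == b'"yes"'  # True → "yes" in JSON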


PROMPT = """
You are a span classifier. The user sends text where the target is
marked with <ent>...</ent>. Answer ONLY with a JSON value: "yes" or
"no" indicating whether the span is a biopsy date.
""".strip()

nlp = edsnlp.blank("eds")
nlp.add_pipe(eds.sentences())
nlp.add_pipe(eds.dates(span_setter="ents"))

# EDS-NLP util to create documents from Markdown or XML markup.
# This has nothing to do with the LLM component itself. The following
# will create docs with entities labelled "date", store them in doc.ents,
# and set their span._.biopsy_procedure attribute.
examples = list(edsnlp.data.from_iterable(
    [
        "IRM du 10/02/2025. Biopsies du <date biopsy_procedure=true>12/02/2025</date> : adénocarcinome.",
        "Chirurgie le 24/12/2021. Colectomie. Consultation du <date biopsy_procedure=false>26/12/2021</date>.",
    ],
    converter="markup",
    preset="xml",
).map(nlp.pipes.sentences))

doc_to_xml = edsnlp.data.converters.DocToMarkupConverter(preset="xml")
nlp.add_pipe(
    eds.llm_span_qualifier(
        api_url="http://localhost:8080/v1",
        model="mistral-small-24b-instruct-2501",
        prompt=PROMPT,
        span_getter="ents",
        context_getter="sent",
        context_formatter=doc_to_xml,
        attributes=["biopsy_procedure"],
        output_schema=BiopsySchema1,  # or BiopsySchema2 or BiopsySchema3
        examples=examples,
        max_few_shot_examples=2,
        max_concurrent_requests=4,
        seed=0,
    )
)

text = """
RCP Prostate – 20/02/2025
Biopsies du 12/02/2025 : adénocarcinome Gleason 4+4=8.
Simulation scanner le 25/02/2025.
"""
doc = nlp(text)
for d in doc.ents:
    print(d.text, "→ biopsy_procedure:", d._.biopsy_procedure)
# Out: 20/02/2025 → biopsy_procedure: False
# Out: 12/02/2025 → biopsy_procedure: True
# Out: 25/02/2025 → biopsy_procedure: False
A second example qualifies COVID mentions with two attributes, a negation flag and an optional date:

from typing import Optional
import datetime
from pydantic import BaseModel, Field
import edsnlp, edsnlp.pipes as eds

# Pydantic schema used to validate the LLM response, serialize the
# few-shot example answers, and constrain the model output.
class CovidMentionSchema(BaseModel):
    negation: bool = Field(..., description="Is the span negated or not")
    date: Optional[datetime.date] = Field(
        None, description="Date associated with the span, if any"
    )

PROMPT = """
You are a span classifier. For every piece of markup-annotated text the
user provides, you predict the attributes of the annotated spans.
You must follow these rules strictly:
- Be consistent, similar queries must lead to similar answers.
- Do not add any comment or explanation, just provide the answer.
Example with a negation and a date:
User: "Le 1er mai 2024, le patient a été testé <ent>covid</ent> négatif"
Assistant: "{"negation": true, "date": "2024-05-01"}"
For each span, provide a JSON with a "negation" boolean attribute, set to
true if the span is negated, false otherwise. If a date is associated with
the span, provide it as a "date" attribute in ISO format (YYYY-MM-DD).
""".strip()

nlp = edsnlp.blank("eds")
nlp.add_pipe(eds.sentences())
nlp.add_pipe(eds.covid())

# EDS-NLP util to create documents from Markdown or XML markup.
# This has nothing to do with the LLM component itself.
examples = list(edsnlp.data.from_iterable(
    [
        "<ent negation=false date=2024-05-01>Covid</ent> positif le 1er mai 2024.",
        "Pas de <ent negation=true>covid</ent>",
        # ... add more examples if you can
    ],
    converter="markup", preset="xml",
).map(nlp.pipes.sentences))

doc_to_xml = edsnlp.data.converters.DocToMarkupConverter(preset="xml")
nlp.add_pipe(
    eds.llm_span_qualifier(
        api_url="https://api.openai.com/v1",
        model="gpt-5-mini",
        prompt=PROMPT,
        span_getter="ents",
        context_getter="words[-10:10]",
        context_formatter=doc_to_xml,
        output_schema=CovidMentionSchema,
        examples=examples,
        max_few_shot_examples=1,
        max_concurrent_requests=4,
        seed=0,
    )
)
doc = nlp("Pas d'indication de covid le 3 mai 2024.")
(ent,) = doc.ents
print(ent.text, "→ negation:", ent._.negation, "date:", ent._.date)
# Out: covid → negation: True date: 2024-05-03

Advanced usage

You can also control the prompt more finely by providing a callable instead of a string. For example, to put few-shot examples in the system message and keep the span context as the user payload:

# Use this for the `prompt` argument instead of PROMPT above
def prompt(context_text, examples):
    messages = []
    system_content = (
        "You are a span classifier.\n"
        "Answer with JSON using the keys: biopsy_procedure.\n"
        "Here are some examples:\n"
    )
    for ex_context, ex_json in examples:
        system_content += f"- Context: {ex_context}\n"
        system_content += f"  JSON: {ex_json}\n"
    messages.append({"role": "system", "content": system_content})
    messages.append({"role": "user", "content": context_text})
    return messages
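
To check what messages such a callable produces, you can call it directly. The snippet below is a hedged sketch; the context string and the few-shot answer are made up for illustration:

msgs = prompt(
    "Biopsies du <ent>12/02/2025</ent> : adénocarcinome.",
    [("Consultation du <ent>26/12/2021</ent>.", '{"biopsy_procedure": "no"}')],
)
assert msgs[0]["role"] == "system"  # instructions plus the few-shot examples
assert msgs[1]["role"] == "user"  # the marked-up context to classify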

You can also control the context formatting by providing a custom callable to the context_formatter parameter. For example, to wrap the context with a custom prefix and suffix as follows:

from spacy.tokens import Doc

class ContextFormatter:
    def __init__(self, prefix: str, suffix: str):
        self.prefix = prefix
        self.suffix = suffix

    def __call__(self, context: Doc) -> str:
        span = context.ents[0].text if context.ents else ""
        prefix = self.prefix.format(span=span)
        suffix = self.suffix.format(span=span)
        return f"{prefix}{context.text}{suffix}"

context_formatter = ContextFormatter(
    prefix="\n## Context\n\n<<<\n",
    suffix="\n>>>\n\n## Instruction\nDoes '{span}' correspond to a biopsy date?",
)
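
As a quick check, you can apply the formatter to a small document built by hand. This is a hedged sketch: the document, the character offsets, and alignment_mode="expand" are illustrative and not part of the component's API:

import edsnlp

nlp_demo = edsnlp.blank("eds")
demo_doc = nlp_demo("Biopsies du 12/02/2025 : adénocarcinome.")
demo_doc.ents = [demo_doc.char_span(12, 22, label="date", alignment_mode="expand")]
print(context_formatter(demo_doc))
# Prints the text between the <<< and >>> markers, followed by the question
# "Does '12/02/2025' correspond to a biopsy date?"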

max_concurrent_requests parameter

We recommend setting max_concurrent_requests to a value greater than its default of 1 to improve throughput when processing batches of documents.
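
For example, with the biopsy pipeline built earlier and max_concurrent_requests=4, up to four LLM requests are in flight at once for each batch of documents (a hedged sketch):

texts = [
    "Biopsies du 12/02/2025 : adénocarcinome.",
    "IRM du 10/02/2025.",
]
for doc in nlp.pipe(texts):
    for ent in doc.ents:
        print(ent.text, "→ biopsy_procedure:", ent._.biopsy_procedure)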

Parameters

PARAMETER DESCRIPTION
nlp

Pipeline object.

TYPE: PipelineProtocol

name

Component name.

TYPE: str DEFAULT: 'llm_span_qualifier'

api_url

Base URL of the OpenAI-compatible API.

TYPE: str

model

Model identifier exposed by the API.

TYPE: str

prompt

The prompt is the main way to control the model's behavior. It can be either:

  • A string, which will be used as a system prompt. Few-shot examples (if any) will be provided as user/assistant messages before the actual user query.
  • A callable that takes two arguments and returns a list of messages in the format expected by the OpenAI chat completions API:

    • context: the context text with the target span marked up
    • examples: a list of few-shot examples, each being a tuple of (context, answer)

TYPE: Union[str, Callable[[Union[str, Doc], List[Tuple[Union[str, Doc], str]]], List[Dict[str, str]]]]

span_getter

Spans to classify. Defaults to {"ents": True}.

TYPE: Optional[SpanGetterArg] DEFAULT: None

context_getter

Optional context window specification (e.g. "sent", "words[-10:10]"). If None, the whole document text is used.

TYPE: Optional[ContextWindow] DEFAULT: None

context_formatter

Callable used to render the context passed to the LLM. Defaults to lambda context_getter_output: context_getter_output.text.

TYPE: Optional[Callable[[Doc], str]] DEFAULT: None

attributes

Attributes to predict. If omitted, the keys are inferred from the provided schema.

TYPE: Optional[AttributesArg] DEFAULT: None

output_schema

Pydantic model or annotated type used to validate responses and serialize few-shot examples. If the schema is a mapping/object, it is also used to force the model to output a valid JSON object.

TYPE: Optional[Union[Type[BaseModel], Type[Any], Annotated[Any, Any]]] DEFAULT: None

examples

Few-shot examples used in prompts.

TYPE: Optional[Iterable[Doc]] DEFAULT: None

max_few_shot_examples

Maximum number of few-shot examples per request (-1 means all).

TYPE: int DEFAULT: -1

use_retriever

Whether to select few-shot examples with BM25 (defaults to an automatic choice). If few-shot examples are provided and max_few_shot_examples > 0, this is enabled by default.

TYPE: Optional[bool] DEFAULT: None

seed

Optional seed forwarded to the API.

TYPE: Optional[int] DEFAULT: None

max_concurrent_requests

Maximum number of concurrent span requests per batch of documents.

TYPE: int DEFAULT: 1

api_kwargs

Extra keyword arguments forwarded to chat.completions.create.

TYPE: Dict[str, Any] DEFAULT: None

on_error

Error handling strategy. If "raise", exceptions are raised. If "warn", exceptions are logged as warnings and processing continues.

TYPE: Literal['raise', 'warn'] DEFAULT: 'raise'

timeout

Optional timeout (in seconds) for each LLM request.

TYPE: Optional[float] DEFAULT: None

default_headers

Optional default headers for the API client.

TYPE: Optional[Dict[str, str]] DEFAULT: {'Connection': 'close'}