A container for results of an LLM call. Both chat models and LLMs generate an LLMResult object. This object contains the generated outputs and any additional information that the model provider wants to return.
| 9 | |
| 10 | |
| 11 | class LLMResult(BaseModel): |
| 12 | """A container for results of an LLM call. |
| 13 | |
| 14 | Both chat models and LLMs generate an LLMResult object. This object contains |
| 15 | the generated outputs and any additional information that the model provider |
| 16 | wants to return. |
| 17 | """ |
| 18 | |
| 19 | generations: List[List[Generation]] |
| 20 | """Generated outputs. |
| 21 | |
| 22 | The first dimension of the list represents completions for different input |
| 23 | prompts. |
| 24 | |
| 25 | The second dimension of the list represents different candidate generations |
| 26 | for a given prompt. |
| 27 | |
| 28 | When returned from an LLM the type is List[List[Generation]]. |
| 29 | When returned from a chat model the type is List[List[ChatGeneration]]. |
| 30 | |
| 31 | ChatGeneration is a subclass of Generation that has a field for a structured |
| 32 | chat message. |
| 33 | """ |
| 34 | llm_output: Optional[dict] = None |
| 35 | """For arbitrary LLM provider specific output. |
| 36 | |
| 37 | This dictionary is a free-form dictionary that can contain any information that the |
| 38 | provider wants to return. It is not standardized and is provider-specific. |
| 39 | |
| 40 | Users should generally avoid relying on this field and instead rely on |
| 41 | accessing relevant information from standardized fields present in |
| 42 | AIMessage. |
| 43 | """ |
| 44 | run: Optional[List[RunInfo]] = None |
| 45 | """List of metadata info for model call for each input.""" |
| 46 | |
| 47 | def flatten(self) -> List[LLMResult]: |
| 48 | """Flatten generations into a single list. |
| 49 | |
| 50 | Unpack List[List[Generation]] -> List[LLMResult] where each returned LLMResult |
| 51 | contains only a single Generation. If token usage information is available, |
| 52 | it is kept only for the LLMResult corresponding to the top-choice |
| 53 | Generation, to avoid over-counting of token usage downstream. |
| 54 | |
| 55 | Returns: |
| 56 | List of LLMResults where each returned LLMResult contains a single |
| 57 | Generation. |
| 58 | """ |
| 59 | llm_results = [] |
| 60 | for i, gen_list in enumerate(self.generations): |
| 61 | # Avoid double counting tokens in OpenAICallback |
| 62 | if i == 0: |
| 63 | llm_results.append( |
| 64 | LLMResult( |
| 65 | generations=[gen_list], |
| 66 | llm_output=self.llm_output, |
| 67 | ) |
| 68 | ) |
no outgoing calls