# Generation Parameters in Skyone Studio

### Introduction

Large Language Models (LLMs) are artificial intelligence systems capable of understanding and generating text in a way that resembles human communication. They are trained on billions of words and language examples to predict the next word in a sentence.

In Skyone Studio, the behavior of an LLM can be tuned through configuration parameters. These parameters act like control levers: they let the user decide whether answers should be shorter or longer, more creative or more precise, more varied or more objective.

This document explains the main text generation parameters in detail, helping both technical professionals and business users understand and use the tool more effectively.

***

### Key Terms (Glossary)

* **LLM (Large Language Model)**: A large-scale language model trained to understand and generate text.
* **Token**: The minimum unit of text used by the model (it can be a whole word, part of a word, or even a symbol).
* **Prompt**: The text or instruction provided by the user for the model to generate a response.
* **Max\_tokens**: The maximum number of tokens the model can generate in an output.
* **Temperature**: A parameter that controls the level of creativity/randomness in the text.
* **Top\_p (Nucleus Sampling)**: Restricts the choice to the smallest set of most probable tokens whose cumulative probability reaches the given threshold.
* **Top\_k**: Limits the choice to the k most probable tokens at each generation step.
* **Presence\_penalty**: Penalizes tokens that have already appeared, which discourages repetition and encourages variety in the text.
* **Stop**: Defines words or symbols that interrupt text generation.
***

### Generation Parameters

#### Max\_tokens

**Description**: Sets the maximum number of tokens the model can generate. When the limit is reached, the output ends at that point, even if the sentence is not finished.

Practical Example:

* max\_tokens = 15 → short answer.
* max\_tokens = 100 → longer, more detailed answer.

Analogy: It’s like choosing the size of the sheet of paper the model can write on.
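The effect of the limit can be pictured with a small Python sketch. It is only an illustration: the `generate` and `replay` functions are hypothetical, each word is treated as one token for simplicity, and nothing here reflects how Skyone Studio is implemented internally.

```python
def generate(next_token, max_tokens):
    """Collect tokens until the source returns None or the cap is reached."""
    output = []
    while len(output) < max_tokens:
        token = next_token(output)
        if token is None:  # the "model" has nothing more to say
            break
        output.append(token)
    return output

# Hypothetical stand-in for a model: replays a fixed sentence word by word.
SENTENCE = "the quick brown fox jumps over the lazy dog".split()

def replay(so_far):
    return SENTENCE[len(so_far)] if len(so_far) < len(SENTENCE) else None

short = generate(replay, max_tokens=4)    # cut off after four tokens
full = generate(replay, max_tokens=100)   # ends naturally before the cap

print(" ".join(short))
print(" ".join(full))
```

With a small limit the sentence is cut off partway through; with a generous limit the text ends on its own and the cap is never reached.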

***

#### Temperature

**Description**: Controls the creativity and randomness of the response.

* Low temperature → Objective and predictable answers.
* High temperature → Creative and varied answers.

Analogy: It’s like the “temperature” of a conversation: cold (direct) or warm (diverse and full of ideas).
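A common way to implement temperature is to divide the model's raw scores (logits) by the temperature before converting them to probabilities. The Python sketch below assumes that formulation; the function name and the example scores are made up for illustration.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after scaling by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens

cold = softmax_with_temperature(logits, 0.2)  # low temperature
warm = softmax_with_temperature(logits, 1.5)  # high temperature

print([round(p, 3) for p in cold])
print([round(p, 3) for p in warm])
```

At low temperature almost all probability goes to the top-scoring token, so answers become predictable. At high temperature the probabilities are closer together, so less likely tokens are picked more often.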

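***

#### Top\_p

**Description**: Restricts the choice to the most probable tokens whose cumulative probability reaches the given threshold. The threshold applies to probability mass, not to the number of tokens.

Example:

* top\_p = 0.1 → only the most likely tokens that together account for 10% of the probability (often a single token).
* top\_p = 0.9 → includes less common words.

Analogy: It’s like using a sieve: the finer it is, the fewer options get through.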
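The following Python sketch shows one way nucleus filtering is commonly done: sort the tokens by probability, keep adding them until the running total reaches top\_p, then renormalize. The function name and the probabilities are hypothetical.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = {}, 0.0
    for token, p in ranked:
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}  # renormalize to 1

# Hypothetical next-token probabilities.
probs = {"cat": 0.5, "dog": 0.3, "bird": 0.15, "fish": 0.05}

narrow = top_p_filter(probs, 0.1)
wide = top_p_filter(probs, 0.9)

print(sorted(narrow))
print(sorted(wide))
```

With these numbers, top\_p = 0.1 leaves only "cat", while top\_p = 0.9 keeps "cat", "dog", and "bird" and drops only the rarest option.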

***

#### Top\_k

**Description**: Restricts generation to the top k most likely tokens.

Example:

* top\_k = 2 → very restricted choices.
* top\_k = 40 → broader choices.

Analogy: It’s like a menu: it can be small (few options) or large (more variety).
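As a rough illustration, top\_k filtering can be sketched in Python as keeping the k highest-probability tokens and renormalizing. The function name and probabilities below are hypothetical.

```python
def top_k_filter(probs, top_k):
    """Keep only the top_k most probable tokens and renormalize."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = dict(ranked[:top_k])
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}

# Hypothetical next-token probabilities.
probs = {"cat": 0.5, "dog": 0.3, "bird": 0.15, "fish": 0.05}

two = top_k_filter(probs, 2)

print(two)
```

Here only "cat" and "dog" remain, and their probabilities are rescaled so they still add up to 1.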

***

#### Presence\_penalty

**Description**: Penalizes repetitions and encourages the model to explore new words and ideas.

Example:

* Without penalty: “He likes to run, run, and run...”
* With penalty: “He likes to run, play sports, and stay active.”

Analogy: It’s like asking someone not to repeat the same story over and over in a conversation.
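One common formulation subtracts a fixed amount from the score of every token that has already appeared at least once. The Python sketch below assumes that formulation; the exact formula used by a given model may differ, and the names and scores are hypothetical.

```python
def apply_presence_penalty(logits, generated_tokens, presence_penalty):
    """Lower the score of every token that already appears in the output."""
    seen = set(generated_tokens)
    return {
        token: (score - presence_penalty) if token in seen else score
        for token, score in logits.items()
    }

# Hypothetical scores for the next token after "He likes to run, run, ..."
logits = {"run": 3.0, "swim": 2.5, "read": 2.0}

adjusted = apply_presence_penalty(logits, ["run", "run"], presence_penalty=1.0)
best = max(adjusted, key=adjusted.get)

print(adjusted)
print(best)
```

Without the penalty "run" would win again. With it, "run" drops below "swim", so the model moves on to a new word.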

***

#### Stop

**Description**: A list of words or symbols that determine where the model should stop.

Example:

* stop = \["end"] → generation stops as soon as the model reaches this word.

Analogy: It’s like pressing the “pause” button at the right moment.
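The idea can be sketched in Python as cutting the text at the first stop sequence found. This sketch assumes the stop sequence itself is left out of the output, which is a common convention but is not confirmed here for Skyone Studio; the function name is hypothetical.

```python
def truncate_at_stop(text, stop):
    """Cut text at the earliest occurrence of any stop sequence."""
    positions = [text.find(s) for s in stop if s and s in text]
    return text[:min(positions)] if positions else text

raw = "Step one. Step two. end Step three."
result = truncate_at_stop(raw, ["end"])

print(repr(result))
```

Everything from "end" onward is discarded. If no stop sequence appears, the text is returned unchanged.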

***

### Best Practices

* Adjust max\_tokens according to the expected response length.
* Use low temperature for technical answers and high for creative tasks.
* Combine top\_p and top\_k to balance diversity and predictability.
* Apply presence\_penalty to avoid redundancy.
* Use stop to ensure the output ends at the desired point.
* Always log the parameters used to reproduce results in the future.
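For the last practice, a minimal logging sketch in Python might look like the following. The function name, record layout, and parameter values are assumptions chosen for illustration, not a Skyone Studio feature. An in-memory stream stands in for a log file so the example has no side effects.

```python
import io
import json
from datetime import datetime, timezone

def log_generation_params(stream, prompt, params):
    """Write one JSON line describing a generation call to a text stream."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prompt": prompt,
        "params": params,
    }
    stream.write(json.dumps(record, sort_keys=True) + "\n")
    return record

# Example values only; use whatever was actually configured.
params = {
    "max_tokens": 100,
    "temperature": 0.2,
    "top_p": 0.9,
    "top_k": 40,
    "presence_penalty": 0.5,
    "stop": ["end"],
}

log = io.StringIO()  # stand-in for a file opened in append mode
record = log_generation_params(log, "Summarize the report.", params)

print(log.getvalue())
```

Writing one JSON object per line keeps the log easy to search later and makes it straightforward to rerun a prompt with the same settings.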

***

### FAQ

What is a token?

A token is a piece of text, which can be a whole word, part of a word, or even a symbol.

What’s the difference between Top_p and Top_k?

* Top\_k sets a fixed number of possible words.
* Top\_p uses a cumulative probability percentage.
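The practical consequence can be seen by counting how many candidates each method keeps. The Python sketch below uses made-up probabilities and hypothetical helper names.

```python
def count_top_k(probs, top_k):
    """Number of candidates kept by top_k: fixed, whatever the distribution."""
    return min(top_k, len(probs))

def count_top_p(probs, top_p):
    """Number of candidates needed for the cumulative probability to reach top_p."""
    cumulative = 0.0
    for count, p in enumerate(sorted(probs, reverse=True), start=1):
        cumulative += p
        if cumulative >= top_p:
            return count
    return len(probs)

confident = [0.90, 0.05, 0.03, 0.02]  # the model is nearly sure
uncertain = [0.25, 0.25, 0.25, 0.25]  # the model is undecided

print(count_top_k(confident, 2), count_top_k(uncertain, 2))
print(count_top_p(confident, 0.8), count_top_p(uncertain, 0.8))
```

Top\_k keeps two candidates in both cases. Top\_p keeps one candidate when the model is confident and all four when it is undecided, so it adapts to how certain the model is.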

When should I use a high temperature?

For creative tasks such as brainstorming, story generation, or free drafting.

Can the presence_penalty cause problems?

Yes. If set too high, it can hurt coherence by forcing excessive variety.

Do I always need to define all parameters?

No. Many have default values that work well in most cases, but manually tuning them helps achieve more precise results.
