Models
Introduction – Language Models in the Context of AI Agents
Language models form the foundation of an AI agent’s intelligence. They act as the “brain” that processes text, interprets questions, understands context, and generates coherent responses. These models are trained on large amounts of data to learn patterns of human language and apply them to problem-solving, content generation, and automated decision-making.
In Skyone Studio, selecting the right language model is a strategic step. It determines the quality of responses, processing speed, depth of analysis, and even the operational cost. Different models have different capabilities — some are faster and more cost-efficient, while others are more advanced and contextually accurate.
Every agent, Skill, or workflow configuration depends directly on the choice of an appropriate model.
Model Types
Language Model Categories in Skyone Studio
Currently, Skyone Studio operates with two categories of language models (LLMs): Embedded (native) and Integrated. Each has distinct characteristics and modes of operation.
1. Embedded (Native)
Embedded models — such as GPT-OSS, Llama 3.2, Llama 3.2 Vision, Gemma, and Granite — are provided and maintained by Skyone.
They are downloaded and executed directly within Skyone Studio’s internal environments, using local processing resources and operating within a private data network.
These models are already available in the system, cannot be edited or removed, and are used directly by clients when running agents.
This model type is ideal for environments that require greater control, security, and confidentiality, since all processing occurs within Skyone’s internal infrastructure, without relying on external connections.
2. Integrated
Integrated LLMs allow clients to connect solutions from external providers (such as OpenAI, Anthropic, and others) to Skyone Studio.
To do this, the client must:
Create an account on the provider’s platform;
Generate an AppKey (access key);
Register the key in Skyone Studio.
These models can be configured in two ways:
OpenAI: by providing only the AppKey;
Custom: by providing the AppKey or the external model’s URL.
Skyone Studio offers an integration interface that enables the use of public LLMs, either with the client’s own token or with Skyone’s token.
This category is ideal for those who want to access more advanced, up-to-date, or specialized models without maintaining the entire infrastructure locally.
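Before registering an AppKey, it can be useful to confirm that the key actually works. Below is a minimal sketch using OpenAI's official Python SDK; the model name, system message, and prompt are illustrative only, and this is not Skyone Studio's internal mechanism:

```python
# Minimal sanity check for an OpenAI AppKey before registering it
# in Skyone Studio. Requires: pip install openai
from openai import OpenAI

client = OpenAI(api_key="sk-...")  # the AppKey generated on the provider's platform

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[
        # A system message plays the same role as the "System Prompt"
        # field described later on this page.
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Reply with the single word: ok"},
    ],
    max_tokens=5,
)
print(response.choices[0].message.content)  # expect something like "ok"
```

If this call succeeds, the key is valid and can be registered in Skyone Studio.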

Key Differences Between Language Models
Context Window Capacity:
Determines how much information the model can “remember” and consider during an interaction, measured in tokens (a token-counting sketch follows this list).
Models with larger context windows can follow longer and more complex conversations.
Accuracy and Sophistication:
More advanced models typically generate responses that are more natural and contextually accurate.
Smaller models are faster and more cost-efficient but may have limitations in handling complex interactions.
Usage Cost:
More powerful models tend to have higher processing costs.
Choosing the right model helps balance quality and investment.
Response Speed:
Smaller models respond faster, making them ideal for simple interactions.
Larger models may respond more slowly but provide deeper, more detailed outputs.
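Because the context window is measured in tokens, it helps to estimate how many tokens a prompt consumes before sending long inputs. A minimal sketch using OpenAI's tiktoken tokenizer (the encoding name is illustrative; each model family uses its own tokenizer, so counts are approximate):

```python
# Estimate how many tokens a prompt consumes. Requires: pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # illustrative encoding

prompt = "Summarize the attached contract and list the payment terms."
tokens = enc.encode(prompt)
print(len(tokens))  # tokens this prompt occupies in the context window
```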
Embedded Models Available in Skyone Studio
Currently, Skyone Studio provides the following embedded models:
Gemma – A Google DeepMind model known for efficiency, security, and scalability.
GPT-OSS – A robust model with high processing capacity and strong textual accuracy.
Granite – An IBM model optimized for enterprise tasks.
Llama 3.2 – An open-source model from Meta offering high performance.
Llama 3.2 Vision – A variant of Llama with enhanced support for image interpretation.
How to Create an Integrated Model
Access the side menu and click “Models.”
Click “Add Model.”
Choose the model type:
OpenAI: enter your AppKey generated from your OpenAI account.
Custom: enter the AppKey or URL from your external provider.
Fill in the following information:
Model Name
Description (for internal identification)
System Prompt: defines the model’s main behavior.
Configure advanced parameters (the sampling sketch after these steps shows how they interact):
Temperature: controls how creative or deterministic responses are.
top_k / top_p: restrict sampling to the most likely tokens, controlling output variety.
token_limit: caps the number of tokens in a response.
context_limit: defines how much conversation history the model keeps in context.
repeat_penalty: discourages repetitive outputs.
Click “Save model” to complete the setup.
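For context on what these advanced parameters do under the hood, here is a minimal, generic sketch of one sampling step in Python. It is illustrative only, not Skyone Studio's implementation, and repeat_penalty is shown in a simplified log-space form:

```python
import numpy as np

def sample_next_token(logits, temperature=0.8, top_k=40, top_p=0.9,
                      repeat_penalty=1.1, previous_tokens=()):
    """Pick the next token id from raw model logits.

    Generic sketch of how temperature, top_k, top_p, and
    repeat_penalty interact; all default values are illustrative.
    """
    logits = np.asarray(logits, dtype=np.float64).copy()

    # repeat_penalty: damp tokens that already appeared (simplified
    # log-space form of the penalty used by common runtimes).
    for t in set(previous_tokens):
        logits[t] -= np.log(repeat_penalty)

    # temperature: values < 1 sharpen the distribution (more
    # deterministic), values > 1 flatten it (more creative).
    logits = logits / max(temperature, 1e-8)

    # top_k: keep only the k highest-scoring tokens.
    if 0 < top_k < len(logits):
        cutoff = np.sort(logits)[-top_k]
        logits[logits < cutoff] = -np.inf

    # Softmax into probabilities.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # top_p (nucleus): keep the smallest set of tokens whose
    # cumulative probability reaches top_p, then renormalize.
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    keep = order[: np.searchsorted(cumulative, top_p) + 1]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    filtered /= filtered.sum()

    return int(np.random.choice(len(filtered), p=filtered))
```

In practice you never implement this yourself; the values you set in Skyone Studio are passed to the model runtime. Seeing the pipeline clarifies, for example, why a low temperature combined with a small top_k produces near-deterministic output.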
Glossary – Terms Used in the Context of Language Models
Language Model (LLM – Large Language Model): An algorithm trained to understand and generate text in natural language.
Context Window: The maximum number of tokens (words + fragments) a model can consider within an interaction.
Token: The basic unit of text a model processes; a token may be a whole word or a fragment of one.
Fine-Tuning: The process of retraining a model with specific data to specialize it for a particular domain.
Prompt: An instruction or text input provided to the model to generate a response.
Inference: The process of running the model to generate a response.
Embedding: A numerical vector representation of text used for semantic comparison or search (a minimal similarity sketch follows this glossary).
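To make the Embedding entry concrete, here is a minimal sketch of semantic comparison using cosine similarity. The vectors are toy values; real embedding models produce hundreds or thousands of dimensions:

```python
import numpy as np

def cosine_similarity(a, b):
    """Semantic closeness of two embedding vectors:
    near 1.0 = very similar meaning, near 0.0 = unrelated."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-dimensional embeddings; the first two sentences are about
# billing, so their vectors point in a similar direction.
invoice_q = [0.9, 0.1, 0.0, 0.2]
billing_q = [0.8, 0.2, 0.1, 0.3]
weather_q = [0.0, 0.9, 0.8, 0.1]

print(cosine_similarity(invoice_q, billing_q))  # high, ~0.98
print(cosine_similarity(invoice_q, weather_q))  # low, ~0.10
```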