AI Model

An AI model is a structured representation of algorithms and data designed to perform tasks that typically require human intelligence. These tasks span a wide spectrum, including pattern recognition, decision-making, language understanding, and predictive analytics. AI models are foundational elements in many machine learning and deep learning systems.

Core Components

Foundation Models

Foundation models are trained with self-supervision on broad data and serve as a base for the fine-tuned models we actually use, for instance the Instruction Tuned Models described below.

A foundation model is any model that is trained on broad data (generally using self-supervision at scale) that can be adapted (e.g., fine-tuned) to a wide range of downstream tasks; current examples include BERT [Devlin et al. 2019], GPT-3 [Brown et al. 2020], and CLIP [Radford et al. 2021]. From a technological point of view, foundation models are not new — they are based on deep neural networks and self-supervised learning, both of which have existed for decades. However, the sheer scale and scope of foundation models from the last few years have stretched our imagination of what is possible; for example, GPT-3 has 175 billion parameters and can be adapted via natural language prompts to do a passable job on a wide range of tasks despite not being trained explicitly to do many of those tasks [Brown et al. 2020]. At the same time, existing foundation models have the potential to accentuate harms, and their characteristics are in general poorly understood. Given their impending widespread deployment, they have become a topic of intense scrutiny [Bender et al. 2021]. (Quoted from On the Opportunities and Risks of Foundation Models.)

Foundation Models Hierarchy

Instruction Tuned Models

They have become very common with OpenAI’s Create your own GPT feature. An Instruction Tuned Model is a Foundation Model that has been fine-tuned to follow detailed instructions provided by users. These models are designed to interpret, understand, and respond to a variety of explicit directives.

One of the most interesting

Tokenization

Tokenization translates a series of words into a numeric form called tokens so that the model can predict the next token during inference; by repeating this process the output is generated and then translated back into text for the user.
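As a rough illustration of that round trip, the snippet below encodes a sentence into token ids and decodes them back to text. It uses the open-source tiktoken library (the tokenizer behind several OpenAI models); the library choice and the encoding name are assumptions for this sketch, not something the article prescribes.

```python
# pip install tiktoken
import tiktoken

# cl100k_base is the encoding used by several recent OpenAI models (assumed here)
enc = tiktoken.get_encoding("cl100k_base")

text = "Chatbots are helpful"
token_ids = enc.encode(text)   # text -> list of integer token ids
print(token_ids)               # the exact ids depend on the encoding
print(enc.decode(token_ids))   # token ids -> "Chatbots are helpful"
```

During inference the model repeatedly predicts the next token id, and decoding those ids produces the text the user finally sees.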

Tokenization strategies

  1. Space-Based Tokenization:
    • In many languages, words are separated by spaces. For example, in English, the sentence “Chatbots are helpful” can be tokenized into individual words: [“Chatbots”, “are”, “helpful”].
    • However, languages like Chinese or Japanese don’t use spaces to separate words. In such cases, tokenization uses techniques like character-level segmentation or statistical models to find the most probable word boundaries.
  2. Character-Level Tokenization:
    • Instead of breaking text into words, character-level tokenization dissects it into individual characters. For example, the English sentence “Chatbots are helpful” would be tokenized as: [“C”, “h”, “a”, “t”, “b”, “o”, “t”, “s”, “ ”, “a”, “r”, “e”, “ ”, “h”, “e”, “l”, “p”, “f”, “u”, “l”].
    • This approach is especially useful for languages without explicit word separators or for specific NLP tasks.
  3. Language-Agnostic Techniques:
    • Some tokenization methods are designed to be language-agnostic. They iteratively merge frequent sequences of characters or subwords in a given corpus, regardless of the language.
    • These techniques allow models to process and understand text by converting it into a sequence of meaningful tokens, capturing nuances like grammar, syntax, and semantics. A small sketch of all three strategies follows this list.
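To make the differences concrete, here is a toy sketch of the first two strategies plus a naive frequent-pair merge in the spirit of the language-agnostic techniques; the sentence and the merge step are my own illustration, not a production tokenizer.

```python
from collections import Counter

sentence = "Chatbots are helpful"

# 1. Space-based tokenization: split on whitespace
space_tokens = sentence.split()        # ['Chatbots', 'are', 'helpful']

# 2. Character-level tokenization: every character becomes a token
char_tokens = list(sentence)           # ['C', 'h', 'a', 't', 'b', 'o', 't', 's', ' ', ...]

# 3. Language-agnostic flavour (toy): find the most frequent adjacent pair of
#    characters, the core idea behind byte-pair-encoding-style subword methods
pair_counts = Counter(zip(char_tokens, char_tokens[1:]))
most_frequent_pair = pair_counts.most_common(1)[0][0]

print(space_tokens)
print(char_tokens)
print("most frequent pair:", "".join(most_frequent_pair))
```

Real subword tokenizers repeat that merge step many times over a large corpus, so frequent fragments such as “Chat” or “bots” end up as single tokens.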

Given these techniques and the particularities of each language, there is a substantial difference when tokenizing the same sentence in different languages. For instance, the following comparison tokenizes the same text in Spanish and in English; to do so I’m using the OpenAI Tokenizer.

Tokenization in Spanish and tokenization in English (OpenAI Tokenizer)
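The same comparison can be reproduced in code; the sketch below counts tokens for an English sentence and a Spanish translation using tiktoken, which approximates what the OpenAI Tokenizer page shows. The example sentences are my own, not the text from the original comparison.

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

samples = {
    "English": "Chatbots are helpful and answer questions quickly.",
    "Spanish": "Los chatbots son útiles y responden preguntas rápidamente.",
}

for language, text in samples.items():
    token_ids = enc.encode(text)
    print(f"{language}: {len(token_ids)} tokens for {len(text)} characters")
```

Non-English text usually maps to more tokens per word, which matters for both cost and how much fits in the context window.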

Model Classification

There are a lot of models on the market, and they can be classified along several dimensions, such as their license.

License

Deploying and Improving LLM results

Azure AI Studio includes a rich catalog of models we can use to build our platform. With such variety, selecting the best model is a mix of deploying, experimenting, and measuring, which makes Azure AI Studio a great tool to deploy and iterate quickly.

How to deploy on Azure AI Studio
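Once a model from the catalog is deployed, the experiment-and-measure loop can be driven from code. The sketch below assumes a chat model already deployed through Azure AI Studio / Azure OpenAI and uses the openai Python SDK’s AzureOpenAI client; the endpoint and key environment variables, API version, and deployment name are placeholders, not values from the article.

```python
import os
from openai import AzureOpenAI  # pip install openai

# Placeholder configuration: point these at your own Azure AI Studio deployment
client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="my-gpt-deployment",  # the deployment name chosen in Azure AI Studio
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize what a foundation model is."},
    ],
)

print(response.choices[0].message.content)
```

Swapping the deployment name lets you run the same prompt against different models from the catalog and compare the results.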

Deployment Options: LLMs deployment

Image taken from the course; original source: Four Ways that Enterprises Deploy LLMs