Introduction
In Token 1.2, we explored when to choose traditional machine learning models and when to apply Foundation Models (FMs)/Large Language Models (LLMs). You've determined that for your specific task, an FM/LLM is the right fit. It may seem like the choice is made, but there's more to it – now you need to decide: will you opt for an open-source model or the API of a closed model?
This pivotal decision is not just about selecting a tool; it's about choosing a path that aligns with your strategic objectives and operational philosophy. In this Token, we analyze this critical choice and its implications for your AI strategy.
In today’s Token:
Understanding Open- and Closed-source Models
Factors to Consider
Key Differences of Open- and Closed-source Models, including:
Customization and control
Cost implications:
typical costs in closed-source models
hidden costs in open-source models
Scalability and performance
Support and community
Data privacy and security
Long-term viability
Conclusion
Understanding Open- and Closed-source Models (Let’s Define Them!)
While at first glance, open- and closed-source models might seem to differ only in their source code accessibility, a deeper dive reveals a more complex landscape.
Closed-source models are straightforward in one aspect: their source code, which details the model's construction and training, remains private. To use these models, you need to pay the owning company, referred to as the provider. However, even after payment, the model's source code will not be available to you. Instead, the provider lets you use the model's capabilities within boundaries it sets. We cover the specifics in the following sections.
Contrary to initial impressions, open-source models are not that simple. The term 'open-source' indeed implies publicly accessible code, but this is just the beginning. The intricacies lie in several key areas:
Diverse open-source licenses: Open-source models come with a variety of licenses, each with its own set of regulations. Some may restrict commercial use, confining the model's application to research purposes. For example, there is still debate over whether Llama 2 is open source: Meta AI says it is, but the Open Source Initiative (OSI) says it isn't.
The question of model weights: Open-source models universally share their source code, but there's variability in whether they include model weights*. The absence of these weights, critical for the model's operation, means users would need to undertake the resource-intensive task of training the model themselves – a feat often only feasible for major tech companies. The ML community debates: is a model truly open-source if it doesn't include its weights?
Open-source datasets: This aspect extends the philosophy of open-source models and may be perceived as a bonus addition to the open-source model. Unlike their closed-source counterparts, which typically don't reveal their training data, open-source models might provide an added advantage by sharing the datasets used for training. These datasets are crucial not only for training and fine-tuning foundation models but also for fostering open research and innovation.
*Model weights are numerical values assigned to neuron connections in a neural network, indicating the strength and direction of influence between neurons. Adjusted during training, they minimize the output error by influencing the signal as it passes through the network.
Let’s review the main factors you need to take into account while choosing the model and choosing between open- and closed-source alternatives.
Factors to Consider
Choosing the right foundation model, and deciding between open- and closed-source options, involves a multifaceted analysis. What to consider:
Project requirements and objectives: What are the specific tasks and goals of your project? What do you need the model to achieve?
Cost implications: What are the visible and hidden costs of each option, including initial expenses, maintenance, and possible future costs?
Data privacy and security: How does each model handle sensitive data? Is it secure for projects involving confidential or personal information?
Customization and control: What level of adaptability do you need? Do you require extensive customization capabilities, like fine-tuning and modifying model parameters?
Support and community: The level of available support and the robustness of the community can be vital. How do they align with your team's expertise and resources?
Scalability and performance: What is the model's ability to handle growing volumes of data and increasing task complexity, both currently and in the future?
Legal and ethical considerations: What ethical implications, such as potential biases in the model, and legal aspects, including data usage rights and commercial application restrictions, should be considered?
Availability of skills and resources: Does your team have the necessary skills to implement and maintain an open-source model, or are a closed-source model's ready-made solutions more suitable?
Long-term viability: How sustainable is the model in terms of ongoing support and development? The answer determines the model's long-term usefulness.
Integration with existing systems: How well does the model integrate with your current infrastructure and workflows, especially in complex or established operational environments?
Balancing these factors against your project’s unique requirements will guide you in making an informed decision between open- and closed-source foundation models.
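One simple way to balance these factors is a weighted scoring matrix. The sketch below is only an illustration: the factor names are a subset of the list above, and every weight and score is an arbitrary placeholder that you would replace with your own team's assessment.

```python
# Weights express how much each factor matters to this project (they sum to 1).
# All numbers here are arbitrary placeholders, not recommendations.
weights = {"cost": 0.3, "privacy": 0.3, "customization": 0.2, "team_skills": 0.2}

# Scores from 1 (poor fit) to 5 (strong fit) for each option.
scores = {
    "open_source":   {"cost": 3, "privacy": 5, "customization": 5, "team_skills": 2},
    "closed_source": {"cost": 4, "privacy": 3, "customization": 2, "team_skills": 5},
}


def weighted_score(option_scores, factor_weights):
    """Sum of each factor's score multiplied by that factor's weight."""
    return sum(factor_weights[f] * option_scores[f] for f in factor_weights)


totals = {name: weighted_score(s, weights) for name, s in scores.items()}
for name, total in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {total:.2f}")
```

The output is only as good as the inputs, so the value of the exercise lies mostly in forcing the team to agree on the weights.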
Key Differences of Open- and Closed-source Models
When considering your project's main use cases and the tasks your model needs to address, the degree of customization available becomes a critical factor.
And FMs can handle a diverse range of tasks! These include language processing (text generation, code generation, language translation, summarization), vision-related tasks (image and video generation), robotics applications, reasoning (math problem-solving, theorem proving), and complex user interactions. Language and vision capabilities form the cornerstone for more intricate applications in robotics, reasoning, and interaction.
Customization and control
In closed-source models, customization options are predefined and controlled by the provider. These options are often meticulously developed, reflecting their value proposition. Of course, closed-source providers offer highly customizable plans, where their teams work directly on your specific project. However, this level of personalization often comes with a substantial price tag. Moreover, any enhancement or customization request can influence the development of new features, albeit within the provider's strategic direction.
Open-source models typically allow for more extensive customization. Developers can directly modify the source code to meet specific requirements, offering a higher degree of flexibility and control. This freedom, however, necessitates a substantial level of expertise and resourcefulness. Customization in open-source environments is primarily guided by the collective knowledge and experience of the community surrounding the model. If your use case isn't mentioned in existing documentation or community discussions, finding a way to achieve your project's goals might mean developing new solutions. That can be fun, but it could also result in higher project costs.
Cost implications
When making the choice, assess both the apparent and hidden costs of each option, including initial expenses, maintenance, and potential future costs.
Closed-source models are often priced per token, a concept we defined in Token 1.6 when discussing generative AI model architecture. (In case you were wondering, the name of this series' episodes, Tokens, is also inspired by genAI model architecture.)
Typical costs in closed-source models include:
Training of custom models, including computational expenses.
Inference (running) costs.
Model storage fees.
Access to ticket-based support.
Potentially, deployment options.
The costs of closed-source models often encompass computational expenses, since these models run on the provider's servers. Closed-source models can be more approachable, particularly for teams with limited machine learning expertise, because the heavy lifting and optimization are usually managed by the provider. (More about this in the next section.)
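To make per-token pricing concrete, here is a minimal sketch of how a monthly inference bill might be estimated. The function name, the workload, and the prices are all hypothetical placeholders, not any provider's actual rates; the sketch also assumes input and output tokens are billed at separate rates, which you should check against your provider's pricing page.

```python
def monthly_api_cost(requests_per_day, input_tokens, output_tokens,
                     price_in_per_1k, price_out_per_1k, days=30):
    """Estimate a monthly bill for an API priced per 1,000 tokens.

    input_tokens and output_tokens are averages per request.
    """
    cost_per_request = ((input_tokens / 1000) * price_in_per_1k
                        + (output_tokens / 1000) * price_out_per_1k)
    return requests_per_day * days * cost_per_request


# Hypothetical workload and prices, for illustration only.
estimate = monthly_api_cost(
    requests_per_day=10_000,
    input_tokens=500,
    output_tokens=200,
    price_in_per_1k=0.001,
    price_out_per_1k=0.002,
)
print(f"Estimated monthly inference cost: ${estimate:,.2f}")
```

Because the bill scales linearly with usage, doubling the request volume or the average prompt length doubles the corresponding part of the cost.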
Conversely, open-source models are initially free to use. But the "free" aspect doesn't cover the hidden costs of the computational resources needed for fine-tuning and inference, or of building a deployment infrastructure. Hidden costs in open-source models include:
Storage costs for the model.
Computational costs for fine-tuning and inference.
Expenses associated with building deployment infrastructure.
Higher costs due to the need for more technical expertise.
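The hidden costs listed above can be put into a back-of-the-envelope estimate. This is a rough sketch under stated assumptions: the GPU hourly rate, GPU count, storage fee, and engineering overhead are hypothetical figures chosen for illustration, and the function name is our own.

```python
def monthly_self_hosting_cost(gpu_hourly_rate, gpu_count, hours_per_day,
                              storage_monthly, engineering_monthly, days=30):
    """Rough monthly cost of serving an open-source model yourself."""
    compute = gpu_hourly_rate * gpu_count * hours_per_day * days
    return compute + storage_monthly + engineering_monthly


# Hypothetical figures, for illustration only.
self_hosted = monthly_self_hosting_cost(
    gpu_hourly_rate=2.0,         # assumed cloud price per GPU-hour
    gpu_count=2,
    hours_per_day=24,            # always-on inference endpoint
    storage_monthly=50.0,        # model checkpoints and logs
    engineering_monthly=4000.0,  # share of an engineer's time
)
print(f"Estimated self-hosting cost: ${self_hosted:,.2f} per month")
```

Most of this cost is fixed: it is incurred whether the model serves ten requests or ten million, which is why self-hosting tends to make more sense at sustained high volume.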
Scalability and performance
Closed-source models are typically developed by companies with substantial resources. This means they are often optimized for performance and designed to handle large-scale operations efficiently. As we already mentioned, these models usually run on the provider's high-end infrastructure, ensuring robust performance and quick scalability to meet growing demands. However, scaling closed-source models can be expensive, as providers often charge for increased usage, and customization options for scaling might be limited, tying users to the provider's roadmap and pricing structure.
Open-source models offer a different flavor of scalability and performance. The open nature of these models allows for extensive customization, which can be a double-edged sword. The performance of open-source models depends largely on the user's ability to fine-tune and manage the system, which can be a challenging task without the requisite technical know-how. Additionally, while open-source models can be scaled to handle large datasets or high-demand scenarios, the responsibility for handling and funding the required infrastructure falls on the user. This can be a barrier, especially for smaller organizations or projects with limited budgets.
Tip: An important aspect of costs: model size*
As highlighted in "A Survey of Large Language Models," the size of an LLM can vary immensely, ranging from several billion to, in some cases, several trillion parameters. While the initial trend favored building larger models, research and industry leaders are now pivoting towards smaller, more efficient models. The same can be said about other types of foundation models. This shift has spurred new research in model compression for LLMs and Vision Transformers (ViTs).
*Model size refers to the total number of parameters in a model. In the context of neural networks, parameters include both weights and biases*. This number is often used as a measure of the model's complexity and capacity for learning.
*Biases in neural networks are parameters added to the sum of weighted inputs in each neuron, allowing it to adjust its output independently of its inputs, thereby aiding in accurately modeling complex data patterns.
The size of a model directly affects its processing power requirements, inference speed, and memory usage. This is especially crucial for open-source models, where computational resources are user-supplied, in contrast to closed-source models where computation is generally handled by the provider.
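To see why model size matters for cost, here is a rough sketch of the memory needed just to store a model's weights at different numeric precisions. The 7-billion-parameter figure is an illustrative example and the function name is our own. Real deployments also need memory for activations and caches, so treat the result as a lower bound.

```python
# Bytes used to store one parameter at each numeric precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}


def weights_memory_gb(num_params, precision="fp16"):
    """Memory needed to hold the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


# Illustrative example: a model with 7 billion parameters.
for precision in BYTES_PER_PARAM:
    print(f"{precision}: {weights_memory_gb(7e9, precision):.1f} GB")
```

Halving the precision halves the weight memory, which is one reason quantized and smaller models are attractive when you supply the hardware yourself.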
Tip: Fine-tuning
Our discussions with practitioners about implementing LLMs highlighted the need to decide between using existing models or fine-tuning them. While using an existing model can save time and resources, fine-tuning open-source models demands significant computational resources and expertise. The same is true for vision models and other types of FMs.
Some tips for fine-tuning:
Employ parameter-efficient techniques like Low-Rank Adaptation of Large Language Models (LoRA) for LLMs and LCM-LoRA for Stable Diffusion models.
Experiment with LLM configurations (e.g., maximum sequence length, temperature) and understand their impacts.
Consider retrieval-augmented generation (RAG) as an alternative to fine-tuning.
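To illustrate why parameter-efficient techniques such as LoRA reduce fine-tuning cost, the sketch below counts trainable parameters for a single weight matrix. LoRA freezes the original d x k matrix and learns a low-rank update made of a d x r matrix and an r x k matrix. The layer shape and rank used here are hypothetical values chosen for illustration, and the function name is our own.

```python
def lora_trainable_params(d, k, r):
    """Trainable parameters LoRA adds to one d x k weight matrix.

    The frozen matrix W (d x k) is adapted by a low-rank product B @ A,
    where B is d x r and A is r x k, so only r * (d + k) values are trained.
    """
    return r * (d + k)


d, k, r = 4096, 4096, 8  # hypothetical layer shape and LoRA rank
full = d * k             # parameters updated by full fine-tuning of this matrix
lora = lora_trainable_params(d, k, r)

print(f"Full fine-tuning: {full:,} trainable parameters")
print(f"LoRA (r={r}): {lora:,} trainable parameters ({lora / full:.2%} of full)")
```

Fewer trainable parameters means less memory for gradients and optimizer state, which is where much of the fine-tuning cost comes from.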










