You have likely already used large language models (LLMs) many times. You may have accessed popular LLMs like Copilot, ChatGPT, Claude, or Gemini, interacting with the technology directly. And whether or not you are aware of it, you have probably also used one or more popular AI-driven services that leverage models like these via an API behind the scenes.
All of these popular models are trained on an enormous amount of publicly available information and have the preset disposition of a “helpful assistant.” They have been further fine-tuned using RLHF (reinforcement learning from human feedback), a technique in which human beings label model responses in a way that helps the model learn how humans like to see their information.
These models are predisposed to anticipate your needs, always trying their best to give you the content you seek. LLMs are designed to make our lives better by automating tasks to save us time. Fundamentally, the basic usage pattern for any LLM is:
User inputs a query → Model returns a response
And it’s what happens in between those two steps (i.e., the “→”) where the magic happens.
The user query is often in text form, but it can also be audio or video, depending on the model. Likewise, the response is often text, but it can also be images, video, and more.
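To make that pattern concrete, here is a minimal sketch of the query → response loop using the official `openai` Python package. The model name and prompt are illustrative placeholders, not recommendations; the same pattern applies to any provider’s API.

```python
from openai import OpenAI  # assumes: pip install openai, with OPENAI_API_KEY set in your environment

client = OpenAI()

# User inputs a query -> model returns a response
response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name; use any chat model you have access to
    messages=[{"role": "user", "content": "Summarize our Q3 sales meeting notes in three bullets."}],
)
print(response.choices[0].message.content)
```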
Some LLM applications popular among executives and their teams include:
- Chatbots
- Virtual assistants
- Content creation
- Classifying and organizing text
- Fraud detection
- Writing computer code
Limitations of Out-of-the-Box Large Language Models
The top LLMs like OpenAI’s ChatGPT, Microsoft Copilot, and Anthropic’s Claude cost millions of dollars to train and are maintained by teams of very skilled humans. They are useful right out of the box, require very little training for the average user, and can be more than sufficient for many use cases. To get started with any one of them, you just visit the company website, set up an account, and you are off to the races. Most services offer a limited-use “free tier,” as well as paid plans that unlock more resources.
As with anything else in life and in business, there is a significant trade-off when using LLMs as-is, without any customization.
But what if your vision for your company is to remain truly cutting-edge in your industry? If you aim for “excellent” over “average,” then using LLM tools as-is is probably limiting your growth.
The Benefits of Customizing Your LLMs
What are some benefits of customizing LLMs? These include:
- Striking the right balance between speed and depth of knowledge
- Striking the right balance between cost and response quality
- Striking the right balance between cost and response length
- Striking the right balance between cost and how much recent query/prompt history the response takes into account
- Getting responses that more closely fit the subtleties of the information you seek
- Adjusting the degree to which the model takes creative risks versus staying more factual
- Adjusting how deeply the model dips into its pool of possible response options
- Generating responses that are written in a tone or style of your choosing
- Generating responses that are aware of and incorporate your own data
- Generating responses that reflect sophisticated logic and reasoning capabilities
Going Deeper into LLMs
Business leaders who seek to remain competitive in their use of these amazing productivity tools are always looking for ways to “go deeper” into LLMs. Of course, there are several layers of depth when it comes to how leaders can guide their organizations toward more effective LLM use. The deeper you go, the more customized the result.
There are many methods for squeezing more customized performance out of LLMs, including:
- Choosing the most appropriate model version
- Creating more sophisticated prompts
- Adjusting model parameters
- Adding your own documents to the information chain
- Fine-tuning the model with your own data
- Chaining multiple models together
Here’s a brief overview of each of these methods, with a short illustrative code sketch for each after the list:
1. Choosing the most appropriate model version: Most major LLM creators offer several models. New models come in two main flavors: some are better and faster than previous versions and simply replace them, while others are designed for different purposes. Some are built to be faster, while others may be built to perform more complex reasoning, create new images, output audio, etc. Choosing the right model for your use case can be as simple as reading the model descriptions on the company website. In other cases, you may need to play with a couple of them to compare performance-vs.-cost tradeoffs (see the first sketch after this list).
2. Creating more sophisticated prompts: Going a bit deeper, you can engage in prompt engineering by writing more thoughtful, detailed prompts. This includes creating assistants by describing how the assistant should act or what type of knowledge it should have. It can also entail including specific reference materials in your prompt (such as an article or content from your website), or simply spending more time carefully writing up the instructions in your prompt (sketched below).
3. Adjusting model parameters: Some models allow you to adjust the model parameters. Popular parameters include limiting the maximum number of tokens (i.e., portions of words) the model processes for a given query→response combination. Parameter tuning can also include adjusting the model’s temperature, which is the degree of latitude you give the model to get creative in its responses (sketched below). For poetry, think high temperature. For an encyclopedia entry, think lower temperature.
4. Adding your own documents to the information chain: Going deeper still, you can direct a data engineer to set up a multi-step process. It may start by preprocessing the user’s prompt: the system searches your own database of unstructured data (PDF files, HTML files, and more) using something called vector embeddings, which convert your content into long numerical strings, then uses math to determine which documents most resemble the user’s query. The system then bundles the relevant data from your own documents with the original user query before sending it to the LLM itself to elicit a response. This is called Retrieval-Augmented Generation, or RAG (sketched below). For example, a user may ask about your company’s holiday schedule. The system retrieves the relevant holiday-schedule data and feeds it into the LLM along with the original query. The LLM then generates a user-friendly response that includes the schedule info.
5. Fine-tuning the model with your own data: You can take things a step further by creating your own version of an existing, popular LLM through a process called model fine-tuning (sketched below). This involves preparing dozens or hundreds of rows of structured data arranged in corresponding “question → answer” pairs, then using a fine-tuning API to upload that data and train against an existing LLM. The result is your own, new model. The new model is the best of both worlds: it is trained on your own data, but it still gives the user access to the much larger corpus of knowledge the model was originally trained on.
6. Chaining multiple models together: Finally, you can chain several models together. This is often done with a set of highly specialized models called agents. You could, for example, build an agent that is an expert at writing Python code and another that is an expert at critiquing Python code. You could string them together, passing the code back and forth between the two agents three or four times, improving it with each iteration, before returning it to the human user. This is called a multi-agent system (sketched below), and research shows that these systems can return higher-quality content than relying on a single agent or model.
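For method 1, a quick way to compare performance-vs.-cost tradeoffs is to run the same prompt through two candidate models and look at latency and token usage. A minimal sketch, assuming the `openai` Python package; the model names and prompt are illustrative, and the pricing math is left to you:

```python
import time
from openai import OpenAI

client = OpenAI()
prompt = "Draft a two-sentence summary of our latest earnings call."  # illustrative prompt

for model in ["gpt-4o-mini", "gpt-4o"]:  # illustrative pair; substitute models you have access to
    start = time.perf_counter()
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    elapsed = time.perf_counter() - start
    # tokens used drive cost; latency drives user experience
    print(f"{model}: {elapsed:.1f}s, {response.usage.total_tokens} tokens")
```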
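For method 2, here is a sketch of a more sophisticated prompt: a system message describes how the assistant should act and what it should know, and the user message bundles reference material with the actual question. The file name, persona, and question are all hypothetical:

```python
from openai import OpenAI

client = OpenAI()
reference_text = open("our_pricing_page.txt").read()  # hypothetical reference material from your website

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[
        # the system message turns a generic model into a purpose-built assistant
        {"role": "system", "content": (
            "You are a sales-enablement assistant for a B2B software firm. "
            "Answer in plain language an executive can skim, citing the reference material where possible."
        )},
        {"role": "user", "content": (
            f"Reference material:\n{reference_text}\n\n"
            "Question: How should a rep position our enterprise tier against a cheaper rival?"
        )},
    ],
)
print(response.choices[0].message.content)
```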
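For method 3, the same kind of call with two popular parameters dialed in: `temperature` for creative latitude and `max_tokens` to cap response length (and therefore cost). The values shown are illustrative:

```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{"role": "user", "content": "Write a four-line poem about quarterly planning."}],
    temperature=1.2,  # high temperature: more creative latitude (poetry); ~0.2 suits encyclopedia-style answers
    max_tokens=120,   # cap on the response length, which also caps the cost of the call
)
print(response.choices[0].message.content)
```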
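For method 4, a stripped-down RAG sketch: embed your documents and the user’s query as numerical vectors, use cosine similarity to find the closest match, and bundle it into the prompt. Production systems add document chunking and a vector database; the documents, model names, and query below are all hypothetical:

```python
import numpy as np
from openai import OpenAI

client = OpenAI()

# hypothetical in-memory "database" of unstructured company content
documents = [
    "Holiday schedule: the office closes Dec 24 through Jan 1.",
    "Expense policy: receipts are required for purchases over $50.",
]

def embed(texts):
    # vector embeddings: convert text into long numerical strings (vectors)
    response = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([item.embedding for item in response.data])

doc_vectors = embed(documents)
query = "When is the office closed for the holidays?"
query_vector = embed([query])[0]

# cosine similarity: the math that finds which document most resembles the query
scores = doc_vectors @ query_vector / (
    np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(query_vector)
)
best_doc = documents[int(np.argmax(scores))]

# bundle the retrieved data with the original query before calling the LLM
answer = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": f"Context:\n{best_doc}\n\nQuestion: {query}"}],
)
print(answer.choices[0].message.content)
```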
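For method 5, a sketch of the fine-tuning workflow using OpenAI’s fine-tuning API as one example: your question → answer pairs go into a JSONL file, which you upload and train against a base model. The file name and base-model name are assumptions; check your provider’s docs for currently supported models and data formats:

```python
from openai import OpenAI

client = OpenAI()

# pairs.jsonl (hypothetical file): one question -> answer pair per line, in chat format, e.g.
# {"messages": [{"role": "user", "content": "What does JLytics do?"},
#               {"role": "assistant", "content": "JLytics helps teams ..."}]}
training_file = client.files.create(file=open("pairs.jsonl", "rb"), purpose="fine-tune")

job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",  # assumption: a base model that currently supports fine-tuning
)
print(job.id)  # when the job completes, you call your new model by its fine-tuned name
```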
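Finally, for method 6, a bare-bones multi-agent loop: a coder agent drafts Python, a reviewer agent critiques it, and the draft passes between them for a few iterations before returning to the user. For simplicity, both agents share one underlying model here; the model name, personas, and task are illustrative:

```python
from openai import OpenAI

client = OpenAI()

def ask(system_role, user_content):
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[
            {"role": "system", "content": system_role},
            {"role": "user", "content": user_content},
        ],
    )
    return response.choices[0].message.content

task = "Write a Python function that deduplicates a list while preserving order."
code = ask("You are an expert Python developer. Return only code.", task)

for _ in range(3):  # pass the draft between the two agents, improving it each iteration
    critique = ask("You are a meticulous Python code reviewer.", f"Critique this code:\n{code}")
    code = ask(
        "You are an expert Python developer. Return only code.",
        f"Task: {task}\n\nCurrent code:\n{code}\n\nReviewer feedback:\n{critique}\n\nRevise the code.",
    )

print(code)  # the final, iterated draft goes back to the human user
```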
Of course, going deeper into your LLMs through customization will involve an investment in time, money, and talent. But for a high-performance team that wants to stay set apart from the competition, model customization can be an extremely sound investment with a very attractive ROI.
Bookmark the JLytics blog and stay tuned for more deep-dive content on each of these customization methods.