What Are ChatGPT, Bing, & Bard?
While OpenAI’s ChatGPT, Microsoft’s Bing, and Google’s Bard have received a lot of public attention in the past months, it is important to remember that they are specific products built on top of a class of technologies called Large Language Models (LLMs).
The most accessible way of using these products is to type or paste text into their respective input boxes and read or copy their responses. However, only some of these products offer programmatic interfaces; without one, a product cannot be "plugged into" other enterprise data systems or processes. This enterprise-level use of the technology is the focus of this document.
What Is a Large Language Model (LLM)?
An LLM is a neural network model architecture based on a specific component called a transformer. Transformer technologies were originally developed by Google in 2017 and have been the subject of intense research and development since then. LLMs work by reviewing enormous volumes of text, identifying the ways that words relate to one another, and
building a model that allows them to reproduce similar text (Stephen Wolfram has written a detailed explanation of LLMs).
Importantly, when asked a question, they are not "looking up" a response. Rather, they produce a string of words by predicting which word would best follow the previous one, taking into account the broader context of the words before it. In essence, they provide a "common sense" response to a question. The word-by-word display you see in ChatGPT is not a cosmetic effect added to make responses appear more natural; it reflects how the algorithm actually works, "guessing" the next word in the sequence one at a time.
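To make this concrete, here is a toy sketch of that generation loop in Python. The predict_next_token function below is a hypothetical stand-in for a real model, which would instead assign probabilities to every possible next token:

```python
# Conceptual sketch of autoregressive generation: the model repeatedly
# predicts the next token given everything generated so far. The toy
# predict_next_token below is a hypothetical stand-in for a real model.
def predict_next_token(tokens):
    # Toy rule: walk through a canned continuation, then emit an end token.
    canned = ["models", "predict", "one", "token", "at", "a", "time", "<end>"]
    return canned[min(len(tokens) - 1, len(canned) - 1)]

def generate(prompt_tokens, max_tokens=20):
    tokens = list(prompt_tokens)
    for _ in range(max_tokens):
        next_token = predict_next_token(tokens)
        if next_token == "<end>":
            break
        tokens.append(next_token)
    return " ".join(tokens)

print(generate(["Language"]))  # "Language models predict one token at a time"
```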
While the most powerful LLMs have shown their ability to produce largely accurate responses on an astonishingly wide range of tasks, the factual accuracy of those responses cannot be guaranteed.
What Makes a Large Language Model... Large?
A neural network is made up of a large number of “neurons,” which are simple mathematical formulas that pass the results of their calculations to one or more neurons in the system. The connections between these neurons are given “weights” that define the strength of the signal between the neurons. These weights are also sometimes called parameters.
The GPT-3 model from which ChatGPT was developed has 175 billion parameters; OpenAI has not disclosed the parameter counts of gpt-3.5-turbo or GPT-4. You may remember the function y = ax² + bx + c from your math classes in school. Now imagine a function with billions of parameters instead of three (a, b, c); loosely speaking, that is what a neural network computes.
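As a simple illustration, a single "neuron" can be sketched as a weighted sum of inputs passed through an activation function; the weights and bias below are examples of the parameters being counted in the billions:

```python
import math

# A single "neuron": a weighted sum of inputs passed through a sigmoid
# activation. The weights and bias are the learnable parameters; an LLM
# chains billions of these together.
def neuron(inputs, weights, bias):
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-total))

print(neuron([0.5, -1.2, 3.0], [0.8, 0.1, -0.4], bias=0.2))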
The size of these models has important consequences for their performance, but also for the cost and complexity of their use. On the one hand, larger models tend to produce more human-like text and can handle topics that they may not have been specifically prepared for. On the other hand, both building and using the model are extremely computationally intensive. Training GPT-3, the model behind the original ChatGPT, is estimated to have cost several million dollars in cloud computing resources before its release.
It is no accident that the largest and most highly performing models have come from giant technology companies or startups funded by such companies: The development of these models likely costs billions of dollars in cloud computing.
Tradeoffs Between LLMs & Other Methods
LLMs can be very computationally intensive. For instance, a model with 175 billion weights must perform on the order of 175 billion calculations for each "token" it outputs, which means a lot of processing power is required.
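As a rough back-of-envelope illustration (using the common approximation of about two floating-point operations per weight per generated token, which is a rule of thumb rather than a published figure for any particular model):

```python
# Back-of-envelope estimate of the compute needed to generate a response.
params = 175e9                 # model weights
flops_per_token = 2 * params   # ~2 FLOPs per weight (multiply + add), an approximation
response_tokens = 200          # a short answer
total_flops = flops_per_token * response_tokens
print(f"{total_flops:.1e} FLOPs for a {response_tokens}-token response")  # ~7.0e13
```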
However, the reason for creating such large models is that they offer great versatility. Similar to a smartphone, which provides many functions in a single device, large language models can be used for a wide range of tasks such as translation, summarization, or text generation based on just a few inputs. In other words, they are a one-stop solution for multiple tasks.
But, this flexibility comes at a cost. If you only need a simple stopwatch, you can choose a much cheaper alternative than a smartphone. Similarly, if you have a specific task at hand, you may be better off selecting a smaller, task-specific language model instead of investing in a larger one.
Beyond the largest LLMs, there are many other language models to choose from, including BERT, ELMo, Transformer-XL, and RoBERTa. Each model has its own strengths and weaknesses, and the choice of which one to use depends on the task and the resources available.
Identifying a Use Case for LLMs
If you’re interested in testing the usefulness of an LLM inside your organization, seek an application that balances the following:
Risk Tolerance: If this is the first time you’re using this technology, choose a domain where there is a certain tolerance for risk. The application should not be one that is critical to the organization’s operations and should, instead, seek to provide a convenience or efficiency gain to its teams.
Human Review: A law firm has been quoted as saying that it is comfortable using this technology to create a first draft of a contract, in the same way it would be comfortable delegating such a task to a junior associate. This is because any such document will go through many rounds of review thereafter, minimizing the risk that any error could slip through.
Text (or Code) Intensive: It’s important to lean on the strengths of these models and set them to work on text-intensive or code-intensive tasks — in particular those that are “unbounded,” such as generating sentences or paragraphs of text. This is in contrast to “bounded” tasks, such as sentiment analysis, where existing, purpose-built tools will provide excellent results at lower cost and complexity.
Business Value: As always, and perhaps especially when there is a lot of excitement around a new technology, it is important to come back to basics and ask whether the application is actually valuable to the business. LLMs can do many things; whether those things are valuable is a separate question.
How to Use LLMs in Your Enterprise
Using LLMs in the enterprise, beyond the simple web interface provided by products like ChatGPT, can be done in one of two ways:
Option-1.
Model-as-a-Service via API
Making an API call to a model provided as a service, such as the models provided by OpenAI, including the gpt-3.5-turbo and gpt-4 models that power ChatGPT. Generally, these services are provided by specialized companies like OpenAI or by large cloud computing companies like Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure.
Option-2.
Self-Managing an Open-Source Model
Downloading and running an open-source model in an environment that you manage. Platforms like Hugging Face aggregate a wide range of such models.
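As a brief illustration, here is a minimal sketch of this approach using the Hugging Face transformers library; the small gpt2 model is used purely as a stand-in for whichever open-source model suits your actual use case:

```python
# A minimal sketch of downloading and running an open-source model locally
# with the Hugging Face transformers library. The "gpt2" model here is an
# illustrative stand-in for a model chosen for your actual use case.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
output = generator("Self-hosting a language model means", max_new_tokens=40)
print(output[0]["generated_text"])
```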
Each approach has advantages and drawbacks, which we’ll explore in the following sections. And remember, both options allow you to choose from smaller and larger models, with tradeoffs in terms of the breadth of their potential applications, the sophistication of the language generated, and the cost and complexity of using the model.
Advantages of Using Model-as-a-Service via API (Option-1)
Some companies (such as OpenAI, AWS, and GCP) provide public APIs that can be called programmatically. This means setting up a small software program, or a script, that will connect to the API and send a properly formatted request.
The API submits the request to the model, which provides the response back to the API, which in turn sends it back to the original requester.
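As an illustration, a minimal sketch of such a script is shown below, using OpenAI's Python client in its pre-1.0 style; it assumes the openai package is installed and that a valid API key is set in the OPENAI_API_KEY environment variable:

```python
# A minimal sketch of calling a model-as-a-service API. Assumes the
# OPENAI_API_KEY environment variable holds a valid key.
import openai  # pre-1.0 client style

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the benefits of API-based models in two sentences."},
    ],
)
print(response["choices"][0]["message"]["content"])
```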
There are several advantages to this approach:
1. Low barrier to entry: Calling an API is a simple task that can be done by a regular developer in a matter of hours.
2. More sophisticated models: The models behind the API are often the largest and most sophisticated versions available. This means that they can provide more nuanced and accurate responses on a wider range of topics than smaller, simpler models.
3. Fast responses: Generally, these models can provide relatively quick responses (on the order of seconds) allowing for real-time use.
Limitations to Using a Model-as-a-Service via API (Option-1)
The use of public models via API, while convenient and powerful, has limitations that may make it inappropriate for certain enterprise applications:
1. Data residency and privacy: By nature, public APIs require that the content of the query be sent to the servers of the API service. In some cases, the content of the query may be retained and used for further development of the model. Enterprises should be careful to check if this architecture respects their data residency and privacy obligations for a given use case.
2. Potentially higher cost: Most of the public APIs are paid services, whereby the user is charged based on the number of queries and the quantity of text submitted (you are charged for the sum of the "prompt" and the "completion"; in other words, for what you send to the model and what you receive from it). The companies providing these services usually provide tools to estimate the cost of their use. They also often provide smaller and cheaper models which may be appropriate for a narrower task; a back-of-envelope cost sketch follows this list.
3. Dependency: The provider of an API can choose to stop the service at any time, though such a decision is typically rare for a popular (and money-making) service. Smaller companies providing such services may be at risk of insolvency. Enterprises should weigh the risk of building a dependency on such a service and ensure that they are comfortable with it.
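To illustrate point 2 above, here is a back-of-envelope sketch of how per-request charges accumulate; the per-token prices are placeholder values for illustration, not any provider's published rates:

```python
# Illustrative cost estimate for a usage-based API: you pay for both the
# tokens you send (prompt) and the tokens you receive (completion).
# Prices below are placeholders, not actual published rates.
prompt_tokens = 1_500
completion_tokens = 500
price_per_1k_prompt = 0.0015      # assumed USD per 1,000 prompt tokens
price_per_1k_completion = 0.0020  # assumed USD per 1,000 completion tokens

cost = (prompt_tokens / 1000) * price_per_1k_prompt \
     + (completion_tokens / 1000) * price_per_1k_completion
print(f"${cost:.4f} per request")  # ~$0.0033 at these assumed rates
```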
Advantages of Self Managing an Open-Source Model (Option-2)
Given the drawbacks of using a public model via API, it may be appropriate for a company to set up and run an open-source model themselves. Such a model can be run on on-premises servers that an enterprise owns or in a cloud environment that the enterprise manages (e.g., Azure, GCP, AWS).
Advantages of this approach include:
1. Data privacy and security: By self-hosting the LLM, enterprises have full control over their data, minimizing the risk of data breaches or unauthorized access. This can be particularly important for organizations handling sensitive information or operating in industries with strict regulatory requirements.
2. Wide range of choice: There are many open-source models available, each of which presents its own strengths and weaknesses. Companies can choose the model that best suits their needs. That said, doing so requires some familiarity with the technology and how to interpret those tradeoffs.
3. Potentially lower cost: In some cases, running a smaller model that is more limited in its application may allow for the desired performance for a specific use case at much lower cost than using a very large model provided as a service.
4. Independence: By running and maintaining open-source models themselves, organizations are not dependent on a third-party API service.
What Are the Tradeoffs to Self-Managing an Open-Source Model?
While there are many advantages to using an open-source model, it may not be the appropriate choice for every organization or every use case for the following reasons:
Complexity: Setting up and maintaining an LLM requires a certain degree of data science and engineering expertise. Organizations should evaluate whether they have sufficient expertise, and whether those experts have the necessary time to set up and maintain the model in the long run.
Narrower performance: The very large models provided via public APIs are astonishing in the breadth of topics that they can cover. Models provided by the open-source community are generally smaller and more focused in their application, though this may change as ever-larger models are built by the open source community.
Security: When self-managing an open-source model, it is important to ensure that it is secure and protected from external threats. This requires dedicated resources and expertise in cybersecurity to ensure that the model is safe from data breaches or malicious attacks.
Customization: While open-source models offer a high degree of flexibility and customization, it also means that organizations need to devote resources to tailor the model to their specific needs. This requires a deep understanding of the model's architecture and algorithms, as well as data science expertise to fine-tune the model for optimal performance.
Upgrades and maintenance: Open-source models require regular upgrades and maintenance to keep up with the latest advancements and security patches. Organizations need to allocate resources to keep the model up-to-date and running smoothly, which can be a significant investment.
Support: When using an open-source model, organizations need to rely on community support for troubleshooting and technical assistance. This may be a challenge for organizations that require timely support or need to resolve critical issues quickly.
Choosing an Approach
Given the tradeoffs between the different approaches, how can an organization choose the one that is right for them?
In fact, there is no single approach that will be appropriate enterprise-wide. The best strategy for most organizations will be to equip themselves with the means to choose the best model and architecture for a given application or use case.
In certain cases, the balance may tip toward using model-as-a-service APIs; in others, the use of an open-source model may be more appropriate. In some cases, a very large model may be required — in others, a smaller model may suffice.
The companies that are most successful in using LLMs will be those that equip themselves with the ability to choose and apply the right approach and the right model for a given application, especially given the rapid pace of innovation in this space.
Use Case Example 1: Summarizing Physicians' Notes
We have developed a demonstration project that aggregates and summarizes physicians' notes within an electronic medical record system.
Using an open-source BART model that has been trained on medical literature, the system automates the drudgery of summarizing copious physicians' notes into short patient histories. Importantly, the detailed notes remain in the system, allowing a consulting physician to get the full detail when needed.
Before use in any production context, a selection of the summaries will be reviewed by a human expert to ensure that they are functioning as expected and capturing the most relevant information.
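For illustration, here is a minimal sketch of the summarization step using a general-purpose BART checkpoint from Hugging Face; facebook/bart-large-cnn stands in for the medically trained model described above, and the note text is invented:

```python
# A minimal sketch of note summarization with a BART-family model.
# "facebook/bart-large-cnn" is a general-purpose stand-in, not the
# medically trained model described above; the note text is invented.
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
note = ("Patient presented with intermittent chest pain over three weeks, "
        "worse on exertion, relieved by rest. ECG unremarkable. Advised "
        "stress test and follow-up in two weeks.")
summary = summarizer(note, max_length=40, min_length=10, do_sample=False)
print(summary[0]["summary_text"])
```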
Use Case Example 2: Documentation Search
As an internal test of the technology, we have used OpenAI's GPT-3 model to provide full-text responses over the full contents of the extensive public documentation, knowledge base, and community posts of Dataiku (a company that provides an enterprise AI/ML platform).
This was an appropriate use case for leveraging a third-party model as all of the data is public already, and the large third-party model will be well- suited to responding to the many different ways that users may phrase their questions.
The results have been impressive, providing easy-to-understand summaries and often offering helpful context to the response. Users have reported that it is more effective than simple links back to the highly technical reference documentation.
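A common pattern behind this kind of documentation search is to retrieve the most relevant passages first, then ask the model to answer from them. The sketch below illustrates the shape of that pattern with a toy string-similarity retriever standing in for a real embedding store, and it prints the assembled prompt rather than calling a model:

```python
import difflib

# Toy documentation corpus; in practice this would be indexed with embeddings.
DOCS = [
    "To create a project, click New Project on the home page.",
    "Datasets can be imported from SQL databases or cloud storage.",
    "Scenarios automate pipelines on a schedule or trigger.",
]

def most_similar(question, docs, top_k=2):
    # Toy retriever: rank passages by raw string similarity to the question.
    key = lambda d: difflib.SequenceMatcher(None, question, d).ratio()
    return sorted(docs, key=key, reverse=True)[:top_k]

question = "How do I import data from a database?"
context = "\n".join(most_similar(question, DOCS))
prompt = f"Answer using only this documentation:\n{context}\n\nQuestion: {question}"
print(prompt)  # in practice, this prompt would be sent to the LLM
```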
Use Case Example 3: HR Policy and Procedure Query Resolution
We've designed a demonstration project that applies the GPT-4 model for assisting employees in understanding complex HR policies and procedures. The system is designed to interpret and respond to a wide range of policy-related queries, providing instant, easy-to-understand responses.
This not only ensures that all employees have access to accurate HR information at their fingertips but also reduces the burden on the HR department, freeing them to focus on more strategic initiatives. Before deploying in a production environment, a selection of the responses will be thoroughly reviewed by HR professionals to ensure accuracy and compliance with company policies.
Use Case Example 4: Contract Review and Summary
Leveraging the power of the GPT-4 model, we have developed a prototype that can read, understand, and summarize lengthy legal contracts. It's designed to identify key clauses, obligations, and rights within a contract, providing a succinct summary that can be reviewed in a fraction of the time.
This system not only accelerates the contract review process but also mitigates the risk of human error. Despite this, the detailed contract remains accessible for a full legal review when necessary. Before the system is used in a production setting, a selection of the contract summaries will be scrutinized by legal experts to ensure they capture all crucial information.
Use Case Example 5: Real-Time Market Intelligence
We have utilized the GPT-4 model to develop an advanced Market Intelligence tool. It's designed to sift through vast amounts of global market data, news articles, social media chatter, and financial reports to produce digestible, real-time insights for businesses.
The tool can alert decision-makers to emerging trends, shifts in consumer sentiment, competitive moves, and potential risks, offering them a competitive edge. In advance of production use, a sample of generated insights will be assessed by market analysts to validate their relevance and accuracy.
Use Case Example 6: Learning and Development Personalization
Taking advantage of GPT-4, we've built a prototype for a personalized learning and development platform. This platform uses the model to understand the unique learning needs and styles of each employee, and curates personalized learning resources and programs accordingly.
It can also generate quizzes and interactive content to reinforce learning and track progress. Before deploying this in a production environment, a selection of the personalized learning paths and content will be thoroughly reviewed by learning and development professionals to ensure they meet the organization's standards and align with the learning objectives.
These examples illustrate how large language models like GPT-4 can be leveraged to provide personalized, real-time insights and learning experiences in an enterprise setting.
How to Use LLMs Responsibly
Using LLMs responsibly requires similar steps and considerations as the use of other machine learning and AI technologies:
Understand how the model is built. All machine learning and neural networks, including those used in LLMs, are influenced by the data that they are trained on. This can introduce biases that need to be corrected for.
Understand how the model will impact the end user. LLMs, in particular, present the risk that an end user may believe that they are interacting with a human when, in fact, they are not. We recommend that organizations disclose to end users where and how these technologies are being used. Organizations should also provide guidance to end users on how to interact with any information derived from the model and provide caveats for the overall quality and factual accuracy of the information. With this information, the end user is well-equipped to decide for themselves how best to interpret the information.
Establish a process for approving which applications are appropriate uses for each technology. The decision of whether or not to use a particular model for a given application should not be made by individual data scientists. Instead, organizations should set up principles for which technologies should — and should not — be used in what contexts, with a consistent review process by leaders who are accountable for the outcomes.
Keep track of which technology is being used for which application. AI governance principles are meant both to prevent problems from arising and to ensure that there is an auditable history of which technology has been used in the event that a problem occurs and needs to be reviewed. Tools, including AI platforms like Dataiku, C3 AI, Databricks, Microsoft Azure AI, and Google AutoML, can help serve as that central control tower.
How to Kick Off Your Own Project
Interested in leveraging the power of Large Language Models for your next project? Sign up now for a free consultation to discuss your requirements and explore the possibilities.