Python Cloud Advocate at Microsoft
Formerly: UC Berkeley, Khan Academy, Woebot, Coursera, Google
Find me online at:
Mastodon | @pamelafox@fosstodon.org |
---|---|
Twitter | @pamelafox |
GitHub | www.github.com/pamelafox |
Website | pamelafox.org |
An LLM is a model that is so large that it achieves general-purpose language understanding and generation.
Product | LLM |
---|---|
ChatGPT | GPT-3.5/4 |
Web search | |
Google: Bard | PaLM 2 |
Microsoft: Bing Chat | GPT-3.5/4 |
Code | |
GitHub Copilot | GPT-3.5/4 |
Amazon CodeWhisperer | ? |
Productivity | |
Microsoft Word | GPT-3.5/4 |
Neptyne | GPT-3.5/4 |
GPT models are LLMs based on the Transformer architecture, introduced in:
📖 "Attention Is All You Need" paper by Google Brain
Learn more:
Company | Model | Parameters |
---|---|---|
OpenAI | GPT-3.5 | 175B |
OpenAI | GPT-4 | Undisclosed |
Google | PaLM | 540B |
Meta | LLaMA | 70B |
Anthropic | Claude 2 | 130B |
Request access from openai.com or Azure OpenAI.
Once you have access, you can use the API from Python or any other language.
Install the OpenAI Python library:

```shell
pip install openai
```
For openai.com, set your API key:

```python
import os

import openai

openai.api_type = "openai"
openai.api_key = os.getenv("OPENAI_KEY")
```
For Azure OpenAI, use Azure default credentials:

```python
import azure.identity
import openai

openai.api_type = "azure_ad"
openai.api_base = "https://cog-7bsyuhdvztows.openai.azure.com/"
default_credential = azure.identity.DefaultAzureCredential()
openai.api_key = default_credential.get_token(
    "https://cognitiveservices.azure.com/.default").token
```
Using ChatCompletion.create():

```python
response = openai.ChatCompletion.create(
    messages=[
        {"role": "system",
         "content": "You are a helpful assistant."},
        {"role": "user",
         "content": "What can I do on my trip to Tokyo?"},
    ],
    max_tokens=400,
    temperature=1,
    top_p=0.95)
print(response["choices"][0]["message"]["content"])
```
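The return value is a nested structure; a minimal sketch of pulling the reply text out of a response-shaped dict (mocked here so it runs without an API call):

```python
# Mocked response with the same shape as ChatCompletion.create() output
response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "Visit Senso-ji temple and the Meiji Shrine.",
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 25, "completion_tokens": 12, "total_tokens": 37},
}

# The reply text lives at choices[0].message.content
reply = response["choices"][0]["message"]["content"]
print(reply)

# usage reports token counts, useful for staying within max_tokens budgets
print(response["usage"]["total_tokens"])
```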
Stream partial responses with stream=True:

```python
response = openai.ChatCompletion.create(
    stream=True,
    messages=[
        {"role": "system",
         "content": "You are a helpful assistant."},
        {"role": "user",
         "content": "What can I do on my trip to Tokyo?"},
    ])
for event in response:
    # the delta may omit "content" on the first and last chunks
    print(event["choices"][0]["delta"].get("content", ""), end="")
```
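Each streamed event carries only a small delta, so a common pattern is to accumulate the pieces into the full reply. A sketch of that accumulation logic using faked events (no network call) shaped like real stream chunks:

```python
# Faked stream events; a real stream would come from
# openai.ChatCompletion.create(stream=True, ...)
events = [
    {"choices": [{"delta": {"role": "assistant"}}]},  # first chunk: role only
    {"choices": [{"delta": {"content": "Visit "}}]},
    {"choices": [{"delta": {"content": "Senso-ji."}}]},
    {"choices": [{"delta": {}}]},  # final chunk: no content
]

chunks = []
for event in events:
    delta = event["choices"][0]["delta"]
    content = delta.get("content")  # may be absent on first/last chunks
    if content:
        chunks.append(content)
full_reply = "".join(chunks)
print(full_reply)
```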
Using Python async/await constructs:

```python
response = await openai.ChatCompletion.acreate(
    messages=[
        {"role": "system",
         "content": "You are a helpful assistant."},
        {"role": "user",
         "content": "What can I do on my trip to Tokyo?"},
    ])
```
Learn more: 📖 Best practices for OpenAI Chat apps: Concurrency
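Because acreate() returns an awaitable, several requests can run concurrently with asyncio.gather(). A sketch of that pattern, with a stub coroutine standing in for the real API call (fake_acreate is illustrative, not part of the library):

```python
import asyncio

# Stub standing in for openai.ChatCompletion.acreate(); in a real app,
# each call would be an in-flight API request instead of a sleep.
async def fake_acreate(question):
    await asyncio.sleep(0.01)
    return {"choices": [{"message": {"content": f"Answer to: {question}"}}]}

async def ask_all(questions):
    # gather() runs the coroutines concurrently and preserves input order
    responses = await asyncio.gather(*(fake_acreate(q) for q in questions))
    return [r["choices"][0]["message"]["content"] for r in responses]

answers = asyncio.run(ask_all(["Tokyo tips?", "Kyoto tips?"]))
print(answers)
```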
Pros:
Cons:
Use a retrieval system to find the best context for the generation model.
Retrieval system (Search) ➡ Generative model (LLM)
github.com/Azure-Samples/azure-search-openai-demo
Query Azure Cognitive Search using both text and vectors:

```python
r = await self.search_client.search(
    query_text,
    query_type=QueryType.SEMANTIC,
    top=top,
    vector=query_vector,
    vector_fields="embedding",
)
results = [doc["sourcepage"] + ": " + doc["content"]
           async for doc in r]
content = "\n".join(results)
```
Use the search results to create a prompt for the LLM:

```python
messages = [system_prompt]
messages.extend(few_shots)
user_content = f"{q}\nSources:\n {content}"
messages.append({"role": "user", "content": user_content})
chat_completion = await openai.ChatCompletion.acreate(
    deployment_id=self.chatgpt_deployment,
    model=self.chatgpt_model,
    messages=messages,
    temperature=0.3,
    max_tokens=1024,
    n=1,
)
```
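To make the resulting prompt shape concrete, a self-contained sketch of the same assembly with sample data (the system prompt, few-shot pair, and source text here are illustrative, not the demo's actual values):

```python
# Sample values standing in for the demo's real prompt and search results
system_prompt = {"role": "system",
                 "content": "Answer ONLY from the sources below. Cite the source page."}
few_shots = [
    {"role": "user", "content": "What does the plan cover?\nSources:\n plan.pdf#1: Vision and dental."},
    {"role": "assistant", "content": "Vision and dental [plan.pdf#1]."},
]
q = "What can I do on my trip to Tokyo?"
content = "guide.pdf#2: Senso-ji temple is open daily."

messages = [system_prompt]
messages.extend(few_shots)
user_content = f"{q}\nSources:\n {content}"
messages.append({"role": "user", "content": user_content})

# Roles alternate: system, then the few-shot pair, then the real user turn
print([m["role"] for m in messages])
```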
Sign up in minutes at startups.microsoft.com