Python Cloud Advocate at Microsoft
Formerly: UC Berkeley, Khan Academy, Woebot, Coursera, Google
Find me online at:
Mastodon | @pamelafox@fosstodon.org |
---|---|
Twitter | @pamelafox |
GitHub | www.github.com/pamelafox |
Website | pamelafox.org |
An LLM is a model that is so large that it achieves general-purpose language understanding and generation.
Product | LLM |
---|---|
ChatGPT | GPT-3.5/4 |
Web search | |
Google: Bard | PaLM 2 |
Microsoft: Bing Chat | GPT-3.5/4 |
Code | |
GitHub Copilot | GPT-3.5/4 |
Amazon CodeWhisperer | ? |
Productivity | |
Microsoft Word | GPT-3.5/4 |
Neptyne | GPT-3.5/4 |
GPT models are LLMs based on the Transformer architecture, introduced in:
📖 "Attention Is All You Need" paper by Google Brain
Learn more:
Company | Model | Parameters |
---|---|---|
OpenAI | GPT-3.5 | 175B |
OpenAI | GPT-4 | Undisclosed |
Google | PaLM | 540B |
Meta | LLaMA | 70B |
Anthropic | Claude 2 | 130B |
Request access from openai.com or Azure OpenAI.
Once you have access, you can use the API from Python or any other language.
Install the OpenAI Python library:

```shell
pip install openai
```
For openai.com, set your API key:

```python
import os

import openai

openai.api_type = "openai"
openai.api_key = os.getenv("OPENAI_KEY")
```
For Azure OpenAI, use Azure default credentials:

```python
import azure.identity
import openai

openai.api_type = "azure_ad"
openai.api_base = "https://cog-7bsyuhdvztows.openai.azure.com/"
default_credential = azure.identity.DefaultAzureCredential()
openai.api_key = default_credential.get_token(
    "https://cognitiveservices.azure.com/.default").token
```
Using ChatCompletion.create():

```python
response = openai.ChatCompletion.create(
    messages=[
        {"role": "system",
         "content": "You are a helpful assistant."},
        {"role": "user",
         "content": "What can I do on my trip to Tokyo?"},
    ],
    max_tokens=400,
    temperature=1,
    top_p=0.95)
print(response["choices"][0]["message"]["content"])
```
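The return value is a nested structure; a minimal sketch of pulling the reply text out of a response-shaped dict (mocked here so it runs without an API call):

```python
# Mocked response with the same shape as ChatCompletion.create() output
response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "Visit Senso-ji temple and the Meiji Shrine.",
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 25, "completion_tokens": 12, "total_tokens": 37},
}

# The reply text lives at choices[0].message.content
reply = response["choices"][0]["message"]["content"]
print(reply)

# usage reports token counts, useful for staying within max_tokens budgets
print(response["usage"]["total_tokens"])
```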
Stream partial responses with stream=True:

```python
response = openai.ChatCompletion.create(
    stream=True,
    messages=[
        {"role": "system",
         "content": "You are a helpful assistant."},
        {"role": "user",
         "content": "What can I do on my trip to Tokyo?"},
    ])
for event in response:
    # the delta may omit "content" on the first and last chunks
    print(event["choices"][0]["delta"].get("content", ""), end="")
```
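Each streamed event carries only a small delta, so a common pattern is to accumulate the pieces into the full reply. A sketch of that accumulation logic using faked events (no network call) shaped like real stream chunks:

```python
# Faked stream events; a real stream would come from
# openai.ChatCompletion.create(stream=True, ...)
events = [
    {"choices": [{"delta": {"role": "assistant"}}]},  # first chunk: role only
    {"choices": [{"delta": {"content": "Visit "}}]},
    {"choices": [{"delta": {"content": "Senso-ji."}}]},
    {"choices": [{"delta": {}}]},  # final chunk: no content
]

chunks = []
for event in events:
    delta = event["choices"][0]["delta"]
    content = delta.get("content")  # may be absent on first/last chunks
    if content:
        chunks.append(content)
full_reply = "".join(chunks)
print(full_reply)
```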
Using Python async/await constructs:

```python
response = await openai.ChatCompletion.acreate(
    messages=[
        {"role": "system",
         "content": "You are a helpful assistant."},
        {"role": "user",
         "content": "What can I do on my trip to Tokyo?"},
    ])
```
Learn more: 📖 Best practices for OpenAI Chat apps: Concurrency
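Because acreate() returns an awaitable, several requests can run concurrently with asyncio.gather(). A sketch of that pattern, with a stub coroutine standing in for the real API call (fake_acreate is illustrative, not part of the library):

```python
import asyncio

# Stub standing in for openai.ChatCompletion.acreate(); in a real app,
# each call would be an in-flight API request instead of a sleep.
async def fake_acreate(question):
    await asyncio.sleep(0.01)
    return {"choices": [{"message": {"content": f"Answer to: {question}"}}]}

async def ask_all(questions):
    # gather() runs the coroutines concurrently and preserves input order
    responses = await asyncio.gather(*(fake_acreate(q) for q in questions))
    return [r["choices"][0]["message"]["content"] for r in responses]

answers = asyncio.run(ask_all(["Tokyo tips?", "Kyoto tips?"]))
print(answers)
```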
Pros:
Cons:
Use a retrieval system to find the best context for the generation model.
Retrieval system (Search) ➡ Generative model (LLM)
github.com/Azure-Samples/azure-search-openai-demo
Query Azure Cognitive Search using both text and vectors:

```python
r = await self.search_client.search(
    query_text,
    query_type=QueryType.SEMANTIC,
    top=top,
    vector=query_vector,
    vector_fields="embedding",
)
results = [doc["sourcepage"] + ": " + doc["content"]
           async for doc in r]
content = "\n".join(results)
```
Use the search results to create a prompt for the LLM:

```python
messages = [system_prompt]
messages.extend(few_shots)
user_content = f"{q}\nSources:\n {content}"
messages.append({"role": "user", "content": user_content})
chat_completion = await openai.ChatCompletion.acreate(
    deployment_id=self.chatgpt_deployment,
    model=self.chatgpt_model,
    messages=messages,
    temperature=0.3,
    max_tokens=1024,
    n=1,
)
```
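To make the resulting prompt shape concrete, a self-contained sketch of the same assembly with sample data (the system prompt, few-shot pair, and source text here are illustrative, not the demo's actual values):

```python
# Sample values standing in for the demo's real prompt and search results
system_prompt = {"role": "system",
                 "content": "Answer ONLY from the sources below. Cite the source page."}
few_shots = [
    {"role": "user", "content": "What does the plan cover?\nSources:\n plan.pdf#1: Vision and dental."},
    {"role": "assistant", "content": "Vision and dental [plan.pdf#1]."},
]
q = "What can I do on my trip to Tokyo?"
content = "guide.pdf#2: Senso-ji temple is open daily."

messages = [system_prompt]
messages.extend(few_shots)
user_content = f"{q}\nSources:\n {content}"
messages.append({"role": "user", "content": user_content})

# Roles alternate: system, then the few-shot pair, then the real user turn
print([m["role"] for m in messages])
```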
Sign up in minutes at startups.microsoft.com