Using GPT models
with Python

Tips for navigating the slides:
  • Press O or Escape for overview mode.
  • Visit this link for a nice printable version
  • Press the copy icon on the upper right of code blocks to copy the code

About me

Photo of Pamela smiling with an Olaf statue

Python Cloud Advocate at Microsoft

Formerly: UC Berkeley, Khan Academy, Woebot, Coursera, Google


Find me online at:

Mastodon @pamelafox@fosstodon.org
Twitter @pamelafox
GitHub www.github.com/pamelafox
Website pamelafox.org

LLMs & GPTs

A raccoon studying robotics

The history of AI

AI box with ML box inside with Deep Learning box inside with Generative AI inside
  • 1956: Artificial Intelligence​:
    The field of computer science that seeks to create intelligent machines that can replicate or exceed human intelligence
  • 1997: Machine Learning:​
    Subset of AI that enables machines to learn from existing data and improve upon that data to make decisions or predictions​
  • 2017: Deep Learning​:
    A machine learning technique in which layers of neural networks are used to process data and make decisions​
  • 2021: Generative AI:
    Create new written, visual, and auditory content given prompts, often using Large Language Models or Diffusion models

Large Language Models (LLMs)

An LLM is a model that is so large that it achieves general-purpose language understanding and generation.

Diagram of sentiment classification task using input prompting
Graphs comparing model scale to accuracy on tasks

From 📖 Characterizing Emergent Phenomena in LLMs

LLMs in the wild

Category      Product               LLM
Chat          ChatGPT               GPT-3.5/4
Web search    Google Bard           PaLM 2
Web search    Microsoft Bing Chat   GPT-3.5/4
Code          GitHub Copilot        GPT-3.5/4
Code          Amazon CodeWhisperer  ?
Productivity  Microsoft Word        GPT-3.5/4
Productivity  Neptyne               GPT-3.5/4

Generative Pretrained Transformer (GPT)

Diagram of multiple attention heads on tokens in a sentence

GPT models are LLMs based on Transformer architecture from:
📖 "Attention is all you need" paper
by Google Brain


Learn more:

GPT and GPT-like models

Company    Model     Parameters
OpenAI     GPT-3.5   175B
OpenAI     GPT-4     Undisclosed
Google     PaLM      540B
Meta       LLaMA     70B
Anthropic  Claude 2  130B

🔗 OpenAI models overview

Demo: Azure OpenAI Playground

Screenshot of Azure OpenAI Playground

Calling GPT models
from Python

A raccoon conjuring Python from their laptop (like a Snake charmer)

OpenAI API

Request access from openai.com or Azure OpenAI.

Once you have access, you can use the API from Python or any other language.

Install the OpenAI Python library:


                pip install openai
                

OpenAI API authentication

For openai.com accounts, set your API key:


                import os
                import openai

                openai.api_type = "openai"
                openai.api_key = os.getenv("OPENAI_KEY")
                

For Azure OpenAI, use Azure default credentials:


                import azure.identity
                import openai

                openai.api_type = "azure_ad"
                openai.api_base = "https://cog-7bsyuhdvztows.openai.azure.com/"
                default_credential = azure.identity.DefaultAzureCredential()
                openai.api_key = default_credential.get_token(
                    "https://cognitiveservices.azure.com/.default").token
                

Call the Chat Completion API

Using ChatCompletion.create():


                response = openai.ChatCompletion.create(
                    model="gpt-3.5-turbo",  # for Azure, pass deployment_id instead
                    messages=[
                        {"role": "system",
                         "content": "You are a helpful assistant."},
                        {"role": "user",
                         "content": "What can I do on my trip to Tokyo?"},
                    ],
                    max_tokens=400,
                    temperature=1,
                    top_p=0.95)

                print(response["choices"][0]["message"]["content"])
                

Stream the response


                response = openai.ChatCompletion.create(
                    model="gpt-3.5-turbo",  # for Azure, pass deployment_id instead
                    stream=True,
                    messages=[
                        {"role": "system",
                         "content": "You are a helpful assistant."},
                        {"role": "user",
                         "content": "What can I do on my trip to Tokyo?"},
                    ])

                for event in response:
                    # first/last chunks may carry no "content" in the delta
                    print(event["choices"][0]["delta"].get("content", ""), end="")
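
Since the reply arrives in small deltas, apps usually accumulate the chunks as well as printing them. A sketch, with plain dicts standing in for the API's chunk objects:

```python
def accumulate_stream(events):
    """Join streamed chat-completion deltas into the full reply.
    The first and last chunks may carry no "content" key."""
    parts = []
    for event in events:
        delta = event["choices"][0]["delta"]
        parts.append(delta.get("content", ""))
    return "".join(parts)


# Simulated chunks, shaped like the streaming API's events:
chunks = [
    {"choices": [{"delta": {"role": "assistant"}}]},
    {"choices": [{"delta": {"content": "Visit "}}]},
    {"choices": [{"delta": {"content": "Senso-ji!"}}]},
]
full = accumulate_stream(chunks)  # "Visit Senso-ji!"
```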
                

Use asynchronous calls

Using Python async/await constructs:


                response = await openai.ChatCompletion.acreate(
                    model="gpt-3.5-turbo",  # for Azure, pass deployment_id instead
                    messages=[
                        {"role": "system",
                         "content": "You are a helpful assistant."},
                        {"role": "user",
                         "content": "What can I do on my trip to Tokyo?"},
                    ])
                

Learn more: 📖 Best practices for OpenAI Chat apps: Concurrency
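
With `acreate`, several requests can run concurrently via `asyncio.gather`. A sketch, with `fake_completion` standing in for the real API call:

```python
import asyncio


async def fake_completion(question):
    # Stand-in for: await openai.ChatCompletion.acreate(...)
    await asyncio.sleep(0.01)
    return f"Answer to: {question}"


async def ask_all(questions):
    # gather() starts all coroutines at once and waits for every reply,
    # so total latency is roughly one call, not the sum of all calls
    return await asyncio.gather(*(fake_completion(q) for q in questions))


answers = asyncio.run(ask_all(["Tokyo?", "Kyoto?"]))
```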

ChatGPT: Pros and Cons

Pros:

  • Creative 😊
  • Great with patterns
  • Good at syntax (natural and programming)

Cons:

  • Creative 😖
  • Makes stuff up (unknowingly)
  • Limited context window (4K-32K)
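
One way to live with the limited context window is to trim the oldest conversation turns before each call. A rough sketch using the common ~4-characters-per-token estimate (a real app would count actual tokens, e.g. with the tiktoken library; `trim_history` is an illustrative helper, not part of the OpenAI library):

```python
def trim_history(messages, max_tokens=4000):
    """Drop the oldest non-system messages until a rough token
    estimate (len(text) // 4 characters per token) fits the window."""
    def estimate(msgs):
        return sum(len(m["content"]) // 4 for m in msgs)

    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    while rest and estimate(system + rest) > max_tokens:
        rest.pop(0)  # oldest turn goes first
    return system + rest
```

The system message is always kept, since it carries the assistant's instructions.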

Retrieval Augmented Generation

A raccoon that looks like Neo from Matrix movie

Retrieval Augmented Generation (RAG)

Use a retrieval system to find the best context for the generation model.

RAG diagram

Retrieval + Generation

Retrieval system (Search) ➡ Generative model (LLM)

Retrieval system (Search):
  • Organize knowledge to fit needs of models
  • Retrieve relevant information
  • Ensure data freshness
  • Enforce access control

Generative model (LLM):
  • Summarize information
  • Answer questions
  • Suggest follow-up questions
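
Put together, the retrieve-then-generate flow is just two calls; `search` and `generate` below are stubs for the search index and the LLM:

```python
def search(question):
    # Stub for the retrieval step: a real app would query a
    # search index (e.g. Azure Cognitive Search) here
    return ["doc1.pdf: Tokyo has many temples."]


def generate(question, sources):
    # Stub for the generation step: a real app would call the
    # chat completion API with the sources in the prompt
    return f"Based on {len(sources)} source(s): ..."


def rag_answer(question):
    sources = search(question)           # 1. retrieve relevant context
    return generate(question, sources)   # 2. generate, grounded in that context
```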

Demo: OpenAI + Cognitive Search

github.com/Azure-Samples/azure-search-openai-demo

RAG demo

RAG flow

RAG flow: User question, document search, LLM, response

RAG: Search step

Query Azure Cognitive Search using both text and vectors:

Search flow: vectors and keywords, combined with RRF algorithm, then semantic re-ranker step

                r = await self.search_client.search(
                    query_text,
                    query_type=QueryType.SEMANTIC,
                    top=top,
                    vector=query_vector,
                    vector_fields="embedding",
                )

                results = [doc["sourcepage"] +
                            ": " + doc["content"]
                           async for doc in r]
                
                content = "\n".join(results)
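
The RRF (Reciprocal Rank Fusion) step mentioned above merges the keyword and vector result lists; a minimal sketch (the constant `k=60` is the value commonly used for RRF):

```python
def rrf(rankings, k=60):
    """Fuse several ranked lists of document ids: each document
    scores sum(1 / (k + rank)) over the lists it appears in."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # highest fused score first
    return sorted(scores, key=scores.get, reverse=True)


keyword_hits = ["a", "b", "c"]   # ids ranked by keyword search
vector_hits = ["a", "c", "d"]    # ids ranked by vector similarity
fused = rrf([keyword_hits, vector_hits])
```

Documents that rank well in both lists (like "a" here) rise to the top of the fused ranking.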
                

RAG: Search results

Use the search results to create a prompt for the LLM:


                messages = [system_prompt]
                messages.extend(few_shots)
                user_content = f"{q}\nSources:\n {content}"
                messages.append({"role": "user", "content": user_content})

                chat_completion = await openai.ChatCompletion.acreate(
                    deployment_id=self.chatgpt_deployment,
                    model=self.chatgpt_model,
                    messages=messages,
                    temperature=0.3,
                    max_tokens=1024,
                    n=1,
                )
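
The prompt assembly in the snippet above can be factored into a pure function, which makes it easy to unit-test (the name `build_messages` is illustrative):

```python
def build_messages(system_prompt, few_shots, question, content):
    """Assemble the chat prompt: system message, few-shot examples,
    then the user question with the retrieved sources appended."""
    messages = [system_prompt]
    messages.extend(few_shots)
    user_content = f"{question}\nSources:\n {content}"
    messages.append({"role": "user", "content": user_content})
    return messages
```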
                

More resources

Raccoons with laptops

Microsoft for Startups Founders Hub

Sign up in minutes at startups.microsoft.com

  • Get $150k of Azure credits to access OpenAI GPT-3.5 Turbo and GPT-4 through Azure OpenAI Service
  • Experiment with LLMs for free with $2,500 in OpenAI credits
  • Receive 1:1 advice from Microsoft AI experts
  • Free access to development and productivity tools like GitHub, Microsoft 365, LinkedIn Premium, and more

Any questions?

A bunch of raccoon students with computers