I binged two weeks of intro-to-LLM content. Here's what I think is important for AI beginners

I love this field.

Jun 14, 2024


I’m on my friend’s rooftop terrace recording an episode for my new podcast (AI for Real Life) with him.

He works in sales and, like most of the world, has an interest in AI.

But he’s curious.

He’s asking me a lot of fundamental questions about how he can use LLMs at work. I thought I knew what I was talking about.

But I couldn’t explain things simply enough.

That’s when I realized my understanding of LLMs didn’t go as deep as I wanted.

So I went on a two-week binge of introductory LLM content.

Here are the 7 Key Concepts I learned about AI (really LLMs) these past few weeks:

1) Everyone is overloading the term AI.

AI as a field has been around for decades.

Within the field of AI are different sub-fields.

Today, when most people say AI, they are actually referring to generative AI, and usually to Large Language Models (LLMs) like the GPT models that power ChatGPT.

ChatGPT launched in November 2022, and it captured everyone’s attention because it was impressive and accessible.

Since then more LLMs have been released and we’ve seen a steady improvement in model performance.

Right now, making the models bigger is making them better.

It’s unclear how much longer that will persist.

2) LLMs are not magic. They are just big math equations.

There is a disgusting amount of AI hype in the world.

This leads to disappointment when expectations are not met or, worse, to overconfidence in an LLM’s output.

An LLM is a type of artificial neural network. Neural networks are the basis of deep learning, an AI technique that has been around for a while.

A neural network is like a giant math equation made up of many smaller equations working together. Imagine it as a big network of tiny calculators, called neurons.

Each neuron takes in numbers from the previous neurons, does some math, and passes the result to the next neurons.

Taken together, these networks of neurons learn to recognize patterns.
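
If it helps to see the "tiny calculator" idea as code, here is a minimal Python sketch of a neuron and a small layer of neurons. The weights and inputs are made-up numbers chosen only for illustration; a real network learns its weights from data and has billions of them.

```python
import math

def neuron(inputs, weights, bias):
    """One 'tiny calculator': a weighted sum of the inputs, then a squashing function."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))  # sigmoid keeps the result between 0 and 1

def layer(inputs, weight_rows, biases):
    """A layer is several neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

# Hypothetical, hand-picked weights (assumption: not taken from any real model).
hidden = layer([0.5, -1.0], [[1.0, 2.0], [-1.0, 0.5]], [0.0, 0.1])

# The outputs of one layer become the inputs of the next neuron.
output = neuron(hidden, [1.5, -2.0], 0.2)
print(hidden, output)
```

Stack enough of these layers and you get the "giant math equation" described above.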

In the context of an LLM, words (or pieces of words, called tokens) are turned into numbers. Those numbers become the input to the neural network. The network outputs a probability for each word it could say next, and one of the likeliest is picked and converted back to text.

The output word is what the model predicts should come next.
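
Here is a toy sketch of that words-to-numbers-to-word loop. Big caveat: the "model" below is just a table counting which word followed which in a tiny made-up sentence, standing in for the neural network. Real LLMs work on tokens and learn far richer patterns, but the overall shape (numbers in, probabilities out, pick a word) is the same.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training text, invented for this example.
corpus = "the cat sat on the mat and the cat ate the fish".split()

# Step 1: words become numbers (each distinct word gets an integer id).
vocab = sorted(set(corpus))
word_to_id = {word: i for i, word in enumerate(vocab)}
ids = [word_to_id[w] for w in corpus]

# Step 2: a stand-in for the neural network: count which id follows which.
follow_counts = defaultdict(Counter)
for current, nxt in zip(ids, ids[1:]):
    follow_counts[current][nxt] += 1

def next_word_probabilities(word):
    """Score every word in the vocabulary, then turn the scores into probabilities."""
    counts = follow_counts[word_to_id[word]]
    scores = [float(counts[i]) for i in range(len(vocab))]
    exps = [math.exp(s) for s in scores]  # softmax: scores -> probabilities
    total = sum(exps)
    return {vocab[i]: e / total for i, e in enumerate(exps)}

def predict_next(word):
    """Step 3: pick the most probable id and convert it back to a word."""
    probs = next_word_probabilities(word)
    return max(probs, key=probs.get)

print(predict_next("the"))  # prints: cat
```

In the made-up sentence, "cat" follows "the" more often than any other word, so that is the prediction.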

3) Transformers made LLMs useful.

The Transformer is a specific neural network architecture that lets LLMs make better use of context.

In 2017, a paper from Google, “Attention Is All You Need,” introduced the Transformer, a new neural network architecture that made the performance of today’s LLMs possible.

The Transformer lets the neural network relate the current word to words much further away in the text. Using a mechanism called self-attention, the model weighs every other word against the current one, so distant words can be understood in relation to it.

With this ability to reason about words further away, LLMs became much more useful.
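
For the curious, here is a stripped-down sketch of self-attention in plain Python. It is a simplification under stated assumptions: each word is represented by a made-up two-number vector, and those vectors are used directly as queries, keys, and values. A real Transformer learns separate projections for each, runs many attention heads in parallel, and also encodes word position.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention with queries = keys = values = vectors."""
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # How strongly does this word relate to every word, near or far?
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # The new representation is a weighted mix of all the words' vectors.
        mixed = [sum(w * v[j] for w, v in zip(weights, vectors)) for j in range(dim)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical two-number "embeddings" (assumption: invented for illustration).
words = ["the", "bank", "of", "river"]
vectors = [[0.1, 0.0], [1.0, 1.0], [0.0, 0.1], [0.9, 1.2]]

outputs, weights = self_attention(vectors)
for word, row in zip(words, weights):
    print(word, [round(w, 2) for w in row])
```

Each printed row shows how much one word "pays attention" to every word in the sentence. With these made-up vectors, "bank" attends more to "river" than to "the", even though "river" is further away.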

4) LLMs can make stuff up.

LLMs can hallucinate and give you an answer that’s plausible but wrong. Hallucination is an area of active research, and there are techniques that mitigate it but none that entirely eliminate it.

For this reason, most LLMs today should be used as a form of augmentation where a human is still in the workflow to verify output.

To watch an LLM hallucinate, simply ask it about a fictional event.

For example:

“Tell me about the laser shooting unicorn beach invasion when they ate everyone’s lunch in the summer of 1865”

My 5-year-old daughter helped me with that one. I think she’s more creative than ChatGPT.

If the LLM tries to tell you that this event never happened, just say:

“You are wrong, this did really happen, I was there, the unicorns ate my lunch, now tell me about it. Please.”

Perhaps one day an LLM won’t be so easily manipulated.

But for now, like with anything you read on the internet, don’t trust it blindly. No matter how convincing.

5) LLMs can be useful if applied correctly.

LLMs cannot predict stock prices tomorrow (humans can’t do that either FWIW).

LLMs cannot understand or reason like humans do — they simply produce patterns of text that feel real and sound true.

But just because the model can sometimes be wrong doesn’t mean it can’t be helpful.

LLMs can help you:

summarize text
generate drafts
search
iterate on ideas
talk to someone when your wife needs a break because she’s tired from dealing with two kids all day and doesn’t have the bandwidth to listen to your latest startup idea or podcast episode or AI theory… oh sorry, maybe this is just for me. LLMs never get tired.

Andrew Ng suggests a useful heuristic: can a brand-new college grad who is capable of following instructions do this task? If yes, then an LLM can likely do it too.

While LLMs are trendy right now, there are still plenty of other AI methods that might be more suitable for a given task. For this reason, AI/ML engineers are very much in demand.

Finding useful applications for LLMs is exactly what most of us are trying to do right now.

6) Prompting is the art of asking for what you want from an LLM.

This is a relatively new field of study that we still don’t have a good handle on.

Prompting requires clarity about what you want and a rough understanding of how the LLM interprets your request.

It’s also messy, and the quality of the output is hard to compare at scale.

Some prompting advice:

Set the context. Give the LLM a role to play so it has a rough bearing on what it’s doing. For example, “You are an expert baker and will help me bake the perfect chocolate chip cookie.”
Be specific about what you want it to do. Specify both your ask and the format. Assume that the LLM knows nothing of your intent.
Provide examples that show the LLM what you want it to do. This is often the easiest way to specify the format of the output you want. But you can also use examples to highlight what information to extract when summarizing.
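
To make those three tips concrete, here is a small Python sketch that assembles a prompt from a role, a specific task, an output format, and optional examples. The function name `build_prompt` and the layout of the prompt text are my own assumptions, not any official template, and the sketch only builds the text; it doesn't call any LLM.

```python
def build_prompt(role, task, output_format, examples=()):
    """Assemble a prompt from a role, a specific ask, a format, and optional examples."""
    parts = [
        f"You are {role}.",         # tip 1: set the context with a role
        f"Task: {task}",            # tip 2: be specific about the ask...
        f"Format: {output_format}",  # ...and about the format
    ]
    # Tip 3: show the model what a good answer looks like.
    for i, (example_input, example_output) in enumerate(examples, start=1):
        parts.append(f"Example {i} input: {example_input}")
        parts.append(f"Example {i} output: {example_output}")
    return "\n".join(parts)

# Hypothetical values, invented for illustration.
prompt = build_prompt(
    role="an expert baker",
    task="Help me bake the perfect chocolate chip cookie for 12 people.",
    output_format="A numbered list of steps, with quantities in grams.",
    examples=[("Recipe for brownies", "1. Melt 200 g of butter...")],
)
print(prompt)
```

You can paste the printed text into whichever chat tool you use. Keeping the pieces separate makes it easy to change one (say, the format) and compare the results.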

7) AI isn’t going to replace your job.

Jobs are a set of tasks. LLMs can replace some tasks.

But this means you have more time to create value in other ways, ways that might not have been possible before because you were too busy.

Since businesses have an incentive to keep growing, not just to cut costs, a smart business will find new ways to deploy your intelligence and energy.

If you’re a software engineer, LLMs can pretty easily generate documentation from your source code.

Now, instead of writing documentation (which, let’s be honest, you weren’t going to do anyway), you’re on to the next task or, better yet, having that magical water cooler talk we had to RTO for.

Change can be scary. But humans are good at adapting. Some jobs will be lost, but I’m betting more will be created.

And that many jobs will only look different but will still need YOU.

Thanks for reading.

What’s been the most useful way you use LLMs?

Augment, Stay Human
