In the dynamic field of AI, large language models (LLMs) have become crucial for a variety of applications, including content creation and problem-solving. We’ve seen widespread adoption of language models as “assistants” in various domains, such as healthcare, education, and content creation. However, in educational contexts, these models often fall short of providing an optimal learning experience. They tend to generate complete solutions upfront, robbing students of the opportunity to engage in the step-by-step reasoning process that is crucial for deep understanding.
This raises an important question: can we get these systems to evaluate in a 'Socratic' way?
In other words, can we develop systems that encourage students to think step-by-step, rather than generating complete solutions upfront?
While there has been significant work on making LLMs reason step-by-step (e.g., chain-of-thought prompting), to our knowledge, there isn’t an existing framework to build systems that allow users to think through problems step-by-step while having the LLM assist in a pedagogical way. This is where the concept of “lazy evaluation” in language models comes into play, offering a more Socratic approach to AI-assisted tasks.
This work demonstrates an approach to building a framework for forcing models to evaluate in a lazy way, drawing inspiration from functional programming concepts.
Traditional tutoring methods often involve guiding students through problems step-by-step, allowing them to think critically and make connections on their own. AI tutors should aim to replicate this approach rather than simply providing answers.
Generating complete solutions upfront is computationally expensive, especially for complex problems. A lazy evaluation approach can significantly reduce resource usage by generating only the necessary information on demand.
Students have varying levels of understanding and may require different amounts of guidance. A lazy evaluation system can adapt to each student's needs, providing more or less detail as required.
By revealing information gradually, we can maintain student engagement and encourage active participation in the problem-solving process.
In many real-world scenarios, solutions are not immediately apparent and must be approached incrementally. Training students to think in this way prepares them for challenges beyond the classroom.
pip install lazy_lm
from dotenv import load_dotenv
import os
from anthropic import AnthropicVertex
from lazy_lm.core import lazy
load_dotenv()
project_id = os.getenv("PROJECT_ID")
location = os.getenv("PROJECT_LOCATION")
# Initialize the Anthropic client
client = AnthropicVertex(project_id=project_id, region=location)
lazy_lm = client.lazy("What is the derivative of `2x^3 + x^2 + 2x + 1`? Give me the solution step-by-step")
# Get the current step
print(lazy_lm.get_current_step())
"""
What is the derivative of `2x^3 + x^2 + 2x + 1`? Give me the solution step-by-step
"""
# Get the next step
print(lazy_lm.get_next_step())
"""
To find the derivative of the given function, we’ll use the power rule and the constant rule of differentiation. Let’s start with the first term:
Step 1: Find the derivative of 2x^3 The power rule states that for a term ax^n, the derivative is nax^(n-1).
For 2x^3, we have:
a = 2, n = 3
So, the derivative of 2x^
"""
# Query the current step
print(lazy_lm.ask_question("I don't understand this step"))
"""
I apologize for any confusion. I’d be happy to explain this step in more detail without advancing to the next step.
In this step, we’re focusing on finding the derivative of the first term in the given expression, which is 2x^3.
To do this, we're using the power rule of differentiation. The power rule states that for a term in the form ax^n (where 'a' is a constant and 'n' is the power)
"""
# Get the next step
print(lazy_lm.get_next_step())
"""
Step 2: Complete the derivative of 2x^3
Continuing from the previous step, we apply the power rule to 2x^3:
The derivative of 2x^3 is:
3 · 2x^(3-1) = 3 · 2x^2 = 6x^2
"""
The concept of lazy evaluation is well established in functional programming languages: an expression is only evaluated when its value is actually needed. This is also known as call-by-need. The contrasting strategy is "eager" (or strict) evaluation, which evaluates all of an expression's subexpressions regardless of whether their values are ever used.
For example, the expression:
func :: Int -> Int -> Int
func a b = a
will always just pass back its first argument:
>> func (2+2) 100
4
In an eager (or strict) language like Python, the evaluation of that function call would look something like this:
func((2+2), (100**1000000))
func(4, (100**1000000))
func(4, 10000000000000000000000000000.....)
4
That is a lot of wasted computation just for a function that returns the first argument!
Programming languages that support lazy evaluation have an evaluation strategy that looks more like this:
func((2+2), (100**1000000))
(2+2)
4
Our language "knows" that we don't need the second argument, so why bother evaluating it?
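Python itself is eager, but we can emulate call-by-need by passing arguments as thunks: zero-argument functions that are only forced when their value is used. A minimal sketch, with the expensive second argument chosen purely for illustration:
# Emulating call-by-need in Python: arguments are wrapped in zero-argument
# lambdas (thunks) and only forced when their value is actually needed.
def func(a, b):
    # b is never called, so its expensive expression is never evaluated
    return a()

result = func(lambda: 2 + 2, lambda: 100 ** 1_000_000)
print(result)  # 4, and the second argument was never computed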
Language models are inherently eager: given a prompt, they will continue to generate tokens until reaching an end-of-sequence token. However, this behavior is not always desirable, especially in educational contexts.
For example, given a prompt such as:
"What is the derivative of 2x^3 + x^2 + 2x + 1? Give me the solution step-by-step"
A language model will generate the entire sequence of steps in one go, something like this:
Given function: f(x) = 2x^3 + x^2 + 2x + 1
Step 1: Differentiate each term separately using the power rule and constant rule.
The power rule states that the derivative of x^n is nx^(n-1).
The constant rule states that the derivative of a constant is 0.
a) Differentiate 2x^3:
d/dx(2x^3) = 2 * 3x^(3-1) = 6x^2
b) Differentiate x^2:
d/dx(x^2) = 2x^(2-1) = 2x
c) Differentiate 2x:
d/dx(2x) = 2
d) Differentiate 1:
d/dx(1) = 0
Step 2: Combine the results from each term.
f'(x) = 6x^2 + 2x + 2 + 0
Step 3: Simplify the expression.
f'(x) = 6x^2 + 2x + 2
Therefore, the derivative of 2x^3 + x^2 + 2x + 1 is 6x^2 + 2x + 2
In the context of an application meant to help students learn this content, producing the full solution trace at once is likely suboptimal for creating a productive learning environment.
More abstractly, we can frame a problem as a problem initialization, a sequence of steps, and the final solution:
sp_1 -> sp_2 -> sp_3 -> ... -> sp_n, where sp_i is the sub problem at step i for problem p
With some light definitions in place, we can frame the desired evaluation strategy like this:
past(sp_1) -> curr(sp_2) -> future(sp_3 -> ... -> sp_n)
where past are the steps that have already been evaluated, curr is the step currently being evaluated, and future are all of the steps yet to be evaluated.
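As an illustration, this past/curr/future split can be represented with a small state object that records which sub-problems have already been evaluated. The class below is a hypothetical sketch, not the internal state used by lazy_lm:
from dataclasses import dataclass, field

# Hypothetical sketch of the past / curr / future framing. Steps are appended
# to `past` as they are evaluated; `future` is implicit, namely whatever the
# model has not yet been asked to produce.
@dataclass
class LazyProblemState:
    problem: str                                     # the problem initialization
    past: list[str] = field(default_factory=list)    # sub-problems already evaluated

    @property
    def curr(self) -> str:
        # the most recently evaluated sub-problem (or the problem statement itself)
        return self.past[-1] if self.past else self.problem

    def advance(self, next_step: str) -> None:
        # evaluating one more sub-problem moves it out of the future and into the past
        self.past.append(next_step)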
To a language model, this is all just token sequences.
TokenSequence_1 -> TokenSequence_2 -> TokenSequence_3 -> ... -> TokenSequence_n
|     sp_1     |   |     sp_2     |   |     sp_3     |   ...   |     sp_n     |
By leveraging a model's KV cache to memoize the compute already done on TokenSequences that have been generated, we can frame our evaluation strategy to look much more "lazy":
memoized (TokenSequence_1 -> TokenSequence_2) -> TokenSequence_3 -> future (TokenSequence_4 -> ... -> TokenSequence_n)
where we get roughly the desired evaluation strategy: the LLM computes only the token sequence for the next sub-problem it samples, and nothing more.
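To make this concrete, here is a rough sketch of step-at-a-time decoding with KV-cache reuse using an open model through Hugging Face transformers. The model choice, greedy decoding, and the "blank line ends a sub-problem" heuristic are all assumptions for illustration; lazy_lm's actual implementation against the Anthropic API may differ:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Sketch of step-at-a-time decoding that reuses the KV cache (the memoized
# compute on token sequences already generated). The model, greedy decoding,
# and the blank-line step boundary are assumptions for illustration.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

@torch.no_grad()
def next_step(input_ids, past_key_values=None, max_new_tokens=200):
    # Generate tokens until the next sub-problem boundary, returning the decoded
    # step and the updated KV cache so later steps never recompute earlier ones.
    generated = []
    text = ""
    for _ in range(max_new_tokens):
        out = model(input_ids=input_ids, past_key_values=past_key_values, use_cache=True)
        past_key_values = out.past_key_values              # memoize compute on all prior tokens
        next_token = out.logits[:, -1, :].argmax(dim=-1, keepdim=True)
        generated.append(next_token)
        input_ids = next_token                             # feed only the new token next pass
        text = tokenizer.decode(torch.cat(generated, dim=-1)[0])
        if "\n\n" in text or next_token.item() == tokenizer.eos_token_id:
            break                                          # stop at the sub-problem boundary
    return text, past_key_values

prompt = "What is the derivative of 2x^3 + x^2 + 2x + 1? Give me the solution step-by-step.\n"
prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
step_1, cache = next_step(prompt_ids)                      # evaluates only the next sub-problem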