Math

Friday, March 12, 2010

The virial theorem in the workplace

There's a theorem from mathematical physics that may have an application in the workplace. This is the virial theorem, and its workplace analogy may explain why every job has its annoying parts.

One version of the virial theorem roughly says that for a finite collection of point particles interacting gravitationally, the time average of the kinetic energy is half the time average of the potential energy, or

<K> = - <U> / 2

The virial theorem is useful to astronomers because you can use it get a good idea of masses of distant objects, which you can't really observe, from their kinetic energy, which you can observe. The reason that we think that dark matter exists is basically from observations like those plus the virial theorem.

Driving in to work today, I had the thought that an appropriate analogy for the virial theorem the workplace might be that the bad parts of a job are always proportional to the good parts of a job. In my experience, this seems to have always been true.You probably have some relationship like this, for example:

<Bad> = - <Good> / 2

When I was an officer in the US Army, there were lots of good aspects of the job. There's nothing in the world as rewarding as working with soldiers, for example, and getting paid to work with explosives and fire guns is also lots of fun. To make up for this, however, there's also the fact that the military is really part of the government, so you're really part of a large, mind-numbingly bureaucratic organization.

Or when I used to do what's probably best called applied physics research, it was great fun working with things like lasers and electron microscopes. To make up for this, however, there's the never-ending battle that you have to fight to get funding for those expensive gadgets.

Or when I did mergers and acquisitions consulting, it was great fun getting a look inside lots of different companies in lots of different industries and seeing how they worked. The pay wasn't bad, either. To make up for this, however, there were the 20-hour days and the backstabbing from other consultants (particularly the lawyers) involved in the M&A projects that you had to keep a constant eye out for.

So although I'm not sure that you can write down a set of assumptions that lets you rigorously prove an analogy for virial theorem for the workplace, it certainly seems to be true. If there's a job out there for which it doesn't hold, I'd definitely like to hear about it.

Wednesday, March 10, 2010

The converse of the Nagell-Lutz theorem

The Nagell-Lutz tells us that rational points of finite order have integer coordinates, but it doesn't tell us that points with integer coordinates have finite order. As a reminder, here's the statement of the Nagell-Lutz theorem.

Let y2 = x3 + ax + b be an elliptic curve with integer coefficients and let D = 4 a3 + 27 b2. Then if P = (xP,yP) is a rational point of finite order then P has integer coordinates and either yP = 0 or yP2|D.

Here are some examples of points with integer coordinates that don't have finite order.

The point P = (1,2) is on the elliptic curve

y2 = x3 + 3

but (1,2) isn't a point of finite order. We have that

2P = (-23/16,-11/64)

for example. Since 2P doesn't have integer coordinates, it's not a point of finite order, so P isn't either.

For another example, consider the elliptic curve

y2 = x3 + 17

There are 16 points with integer coordinates on this curve. These are the following

(-2,±3)

(-1,±4)

(2,±5)

(4,±9)

(8,±23)

(43,±282)

(52,±375)

(5234,±378661)

Although we can find a few cases where adding these points gives another point with integer coordinates, like

(-2,3) + (-1,4) = (4,9)

most cases don't. We have that

(-1,4) + (-1,4) = (137/64, -2651/512)

for example.

Even worse, we have that

(5234,378661)  + (5234,378661)  = (187618163896928/143384152921, -1/4)

None of these points actually have finite order although they have integer coordinates. So points of finite order have to have integer coordinates, but not all points with integer coordinates have finite order.

Wednesday, February 24, 2010

An example of bad reduction mod p

Elliptic curves are a natural construction over the complex numbers, but curves over the complex numbers aren’t very useful in computing. For that, we need elliptic curves that are defined over a finite field. It turns out that an elliptic curve over the integers can be reduced to one over a finite field in most cases. In particular, if we have an elliptic curve defined by

y2 = x3 + ax + b

which has discriminant

D = -4a3 - 27 b2

then we want to look at what happens if reduce everything modulo a prime p. As long as p isn’t a factor of D, everything works fine. If p is a factor of D, however, then we get a singular curve when we reduce mod p. For example, the curve

y2 = (x - 3)(x - 8)(x + 11)

= x3 – 97x + 264

has no repeated roots over the complex numbers, which is reflected in its non-zero discriminant

D = 1768900

= 22 52 72 192

But because 19 is a factor of D, this curve becomes singular when we reduce everything modulo 19. In particular, we find that this curve becomes

y2 = x3 – 97x + 264

x3 + 17x + 17 (mod 19)

 ≡ (x + 11)2 (x + 16) (mod 19)

The cubic part of the curve has multiple roots so it's singular over GF(19). Modulo 19 this curve also has discriminant

D = -4(17)2 – 27(17)3

= -27455

≡ 0 (mod 19)

which also tells us that this curve is singular over GF(19).

Friday, February 19, 2010

Convergence of power series

Some thoughts on the convergence of power series - with pictures.

Consider the three functions

f1(x) = 1 / (1 - x2)

f2(x) = 1 / (1 + x2)

f3(x) = √(1 + x2)

If we expand each of these in a power series around x = 0 we find that

f1(x) = Σ x2n

f2(x) = Σ (-1)nx2n

f3(x) = Σ (-1)nx2n Binomial(1/2,n)

and that each converges for -1 < x < 1.

In the case of f1, it's easy to see why that's the case. Here's the graph of f1, and it behaves badly at -1 and 1.

Image001

In the case of f2, it's not immediately obvious why reaching -1 and 1 causes problems. Here's the graph of f2, and it certainly doesn't behave badly at all at these points.

Image002

On the other hand, the graph of what we get by summing the first few terms in the power series for f2 looks like this, which does behave badly at -1 and 1.

Image003

This is easy to explain. If we look at |f2| as a function of a complex variable, we see that f2 has poles at i and –i, and the location of those poles limits the radius of convergence of the power series.

Image004

In the case of f3, it's slightly more complicated. Here's what f3 looks like. It also doesn't behave badly at -1 or 1.

Image005

Here's what |f3| looks like as a function of a complex variable, and there aren't any poles there to cause problems. What's going on here?

Image006

What's causing problems in this case are the branch cuts that you need to define for f3. Here's a graph of the imaginary part of f3. Note that it's the locations of the branch cuts that limit the radius of convergence of the power series.

Image007
I don't recall being taught about branch cuts limiting the radius of convergence of a power series when I was in school. Like many other things, I'm not sure why this was overlooked.

Wednesday, February 17, 2010

Points of order three on elliptic curves

In an earlier post, we saw how it’s easy to tell which points on an elliptic curve y2 = x3 + ax + b have order 2. What about order 3? That’s not much harder. If we have 3P = O then 2P =  –P, and we can use what we know about points of order 2 to find out what happens for points of order 3.

Let’s write

P1= (x1,y1)

and

P2 = 2P1 = (x2,y2)

so that

 –P1 = (x1, –y1)

From the earlier post on point doubling we have that

x2 = (x14  – 2ax12  – 8bx1 + a2) / [4 (x13 + ax1 + b)]

If 2P1 =  –P1 then we have that

x2 = x1

or

(x14 – 2ax12 – 8bx1 + a2) / [4 (x13 + ax1 + b)] = x1

or

(x14 – 2ax12 – 8bx1 + a2) / [4 (x13 + ax1 + b)] - x1 = 0

so that

x14 – 2ax12 – 8bx1 + a2 – 4x144ax12 – 4bx1 = 0

or that

3x14 + 6ax12 + 12bx1a2 = 0

Example

For the elliptic curve y2 = x3 + 1 we have that the x-coordinates of the points of order 3 need to have the property that

3x4 + 6ax2 + 12bxa2 = 0

with a = 0 and b = 1 this means that we have

3 x4 + 12x  = 3x (x3 + 4) = 0

the only rational solution of which is x = 0. Thich corresponds to the points (0,1) and (0,-1) on the elliptic curve. Here’s what this looks like:

Image001

Another point of view

From the formula for doubling a point we get that

x2 = m2 – 2x1

where

m = y′ = (3x12 + a) / (2y1)

And because x2 = x1 we can write

x1 = m2 – 2x1

or that

m2 = 3x1

Now if we have that

y2 = x3 + ax + b

then we have that

2 y y′ = 3x2 + a

and that

2 y y′′  + 2 (y′)2 = 6x

so that

y′′ = (6x – 2 (y′)2) / (2y)

This means that we have y′′ = 0 when

6x – 2 (y′)2 = 0

or

(y′)2 = 3x

or

m2 = 3x

As we saw above, this happens at a point of order 3, so at a point of order 3 we have that y′′ = 0.

Tuesday, February 16, 2010

Using Pari for elliptic curves

I recently tried Pari, a piece of software that’s designed for doing number-theoretical calculations – like those you need to do in cryptography. After using it for a few hours, I have to say that I’m very impressed by it. It has lots of built-in functions for doing calculations involving elliptic curves, and because lots of the technology that we use at Voltage involves elliptic curves, I found that very useful.

On the other hand, Pari seems to assume that you already know a lot about things before you start using it. Here’s an example of defining the elliptic curve y2 = x3 + 1 and finding all of the points of finite order on the curve:

Pari 

The ellinit() function initializes a data structure for an elliptic curve, but what it tells you when it does this probably isn’t useful to most people. Here's how the Pari User's Guide explains the output of ellinit():

a1-a6,b2-b8,c4-c6: coefficients of the elliptic curve

area: volume of the complex lattice defining E

disc: discriminant of the curve

j: j-invariant of the curve

omega: [ω12], periods forming the basis of the complex lattice defining E1 is the real period and ω2 belongs to Poincare's half-plane).

eta: quasi-periods [η12] such that η1ω2 - η2ω1 = 2πi.

roots: roots of the associated Weierstrass equation

tate: [u2,u,v] in the notation of Tate

w: Mestre's w (this is technical)

The elltors() function finds all of the points of finite order. Its output is slightly more user-friendly, but it’s probably not obvious to most people what it’s telling you.

So overall, I’d have to say that I’m very impressed with Pari, and I’ll probably be using it a lot in the future. On the other hand, I can’t really recommend it for most people. If you feel comfortable reading Silverman’s The Arithmetic of Elliptic Curves, you’ll probably find it very useful. You'll also understand how to interpret the output of ellinit(). Otherwise, you might find its output a bit cryptic and tricky to interpret.

Wednesday, February 10, 2010

Visualizing complex multiplication

Here's an attempt to create a way to visualize elliptic curves with complex multiplication. With any luck, it will also give some insight into why the term "complex multiplication" is used.

An endomorphism of an elliptic curve is a homomorphism that maps points on the curve to other points on the curve. The multiplication-by-n maps

φn: PnP

all do this. For most elliptic curves, those are all the endomorphisms that exist, but some elliptic curves have other endomorphisms. Curves like those are said to have complex multiplication, and if we look at a few examples, it's not hard to see why it's called this.

Example 1

For the elliptic curve

y2 = x3 + x

the mapping given by

φ(x,y) = (-x,iy)

is an endomorphism that’s not a multiplication-by-n map. It also has the property that

φ2(P) = φ∘φ(P) = (x,-y) = -P

Because φ2 acts roughly like multiplication by -1, we can think of φ as acting roughly like multiplication by the complex number  i.

If we think of this elliptic curve as being parametrized by the Weierstrass ℘-function with periods ω1 and ω2, we find that

ω1 = 1.85407 – 1.85407 i

and

ω2 = 1.85407 + 1.85407 i

Note that ω2 = iω2.

Here’s what the lattice L of points nω1 + mω2 looks like:

Image001 

Here's what multiplication by i does to a typical point in this lattice:

CM
 

Note that this lattice has the property that iL = L, so the same multiplication by i that maps points on the elliptic curve to other points on the curve also maps points in the lattice to other points in the lattice.

Example 2

For the elliptic curve

y2 = x3 + 1

we have that the mapping given by

φ(x,y) = (ξx,y)

where

ξ = ei/3, a complex cube root of 1 that's also a root of x2 + x + 1 = 0

is an endomorphism that’s not a multiplication-by-n map. It also has the property that

φ3(x,y) = φ∘φ∘φ(x,y) = (x,y)

Because φ3 acts roughly like multiplication by 1, we can think of φ as acting roughly like multiplication by the complex number ξ.

If we think of this elliptic curve as being parametrized by the Weierstrass ℘-function with periods ω1 and ω2, we find that

ω1 = 2.10327 -1.21433 i

and

ω2 = 2.42865 i

Note that we have that ω2 = ξω1.

Here’s what the lattice L of points nω1 + mω2 looks like:

Image002
 

Here's what multiplication by ξ does to a typical point in this lattice:

CM

Note that this lattice has the property that ξL = L, so the same multiplication by ξ that maps points on the elliptic curve to other points on the curve also maps points in the lattice to other points in the lattice.

Example 3

Consider the elliptic curve

y2 = x3 - 4320 x + 96768

If we think of this elliptic curve as being parametrized by the Weierstrass ℘-function with periods ω1 and ω2, we find that

ω1 = -0.296858 i

and

ω2 = 0.419821

Note that ω2 = √2iω1.

Here’s what the lattice L of points nω1 + mω2 looks like:

Image001
 

In this lattice, if we multiply a point z by (√2i)2 we get the point -2z. This looks something like this: 

CM

This will also be reflected in the points on the elliptic curve, so this curve has complex multiplication by √2i. In this case, however, it's not at all obvious what the endomorphism is that's not a multiplication-by-n map.

Thursday, February 04, 2010

Making change

My wife went out for pizza last week and came back with an incredible story. She ordered two pizzas on-line and went to the store to pick them up. When she arrived at the pizza store their computer was down. When she paid with cash, she was surprised to see the cashier pull out a calculator that she used to calculate the change due from my wife's purchase. The cashier explained that they had to use a calculator to figure out how much change to give a customer and that they would get fired if they were caught making change without the use of the calculator.

I'd guess that the pizza store didn't just make the use of calculators mandatory for no reason. They almost certainly did this because their employees couldn't do it accurately without a calculator.

It's somewhat popular these days to criticize the exit exam that California high school students have to pass to graduate. Opponents like to say that it forces teachers to teach just the material that's covered on the test, and that this reduces the overall quality of education. Supporters of the exam maintain that it's just testing basic skills and that its requirements are quite reasonable. If pizza stores have to require their employees to use calculators to make change accurately, I'd guess that the exit exam isn't testing enough.

The fact that people can pass the exit exam and still can't make change accurately may account for the following joke:

Question: What is 2+2?

Answer: A question on the California high-school exit exam

Wednesday, February 03, 2010

Another example of rational points on an elliptic curve

Here's another example of finding rational points of finite order on an elliptic curve.

Suppose that we have the elliptic curve y2 = x3 - 2x + 1. In this case we have that D = 5, so that the possibilities for the y-coordinate of a rational point of finite order are limited to 0, ±1 by the Nagell-Lutz theorem. A quick check of these possibilities shows that we have the following points:

P1 = (1,0)

P2 = (0,1)

P3 = (0,-1)


Here's what this looks like:

Image001

These points form this subgroup of the points on the curve:

Rational points on y2 = x3 - 2x + 1

+

O

P1

P2

P3

O

O

P1

P2

P3

P1

P1

O

P3

P2

P2

P2

P3

P1

O

P3

P3

P2

O

P1

We can also think of the elliptic curve y2 = x3 - 2x + 1 as being parameterized by the Weierstrass ℘-function with periods ω1 and ω2 where we have approximately

ω1 = -2.01891 i

and

ω2 = 2.96882

All of the points on y2 = x3 - 2x + 1 that have the property 4P = O come from the complex numbers shown here: 

Lattice4 
Of these 16 points, these are the ones that we get the subgroup of rational points of finite order from (z1 corresponds to P1, etc., and z0 corresponds to O):

Lattice3

Wednesday, January 27, 2010

Why use projective coordinates?

Projective coordinates are often used to represent points on an elliptic curve. This is useful for two reasons. It provides an easy way to represent the point O on an elliptic curve using the same coordinate system that all of the other points use. It also may be more efficient to do operations on projective coordinates than on the usual, or affine, coordinates.

Projective coordinates represent ratios of numbers instead of just numbers. So the number 4 could be represented as either 4 = 4/1 or as 4 = 8/2, which we would write as (4,1) and (8,2) in projective coordinates. Clearly, there's not a unique way to represent affine points in projective coordinates.

To encode the point at infinity, we just set the last coordinate to 0, using division of anything by 0 to indicate infinity.

If we have to numbers x1 and x2 we can write them in projective coordinates as

x1 = X1/Z1

or

(X1,Z1)

and

x2 = X2/Z2

or

(X2,Z2)

If we want to calculate x1/x2 we can use the projective coordinates to find that

x1/x2 = (X1/Z1) / (X2/Z2)

= X1Z2 / X2Z1

which we can write as

(X1Z2,X2Z1)

in projective coordinates.

Note that this means that we've actually calculated x1/x2 without using any division operations at all. The tradeoff is that it takes more operations to implement projective operations. In this example, we've calculated x1/x2 without using any division operations, but at the cost of using two multiplications instead of one division. But if division operations are significantly more expensive that other operations, this can be a useful way to avoid using the more expensive operation.

There are actually more than one way to define projective coordinates in a useful way for points on elliptic curves. The most common way writes an affine point

P = (x,y)

as the projective point

P' = (X,Y,Z)

where

x = X/Z

and

y = Y/Z

This form of projective coordinates also has the advantage that division operations on coordinates can be implemented without using any division operations. Whether or not it's actually preferable to using affine coordinates depends on exactly how fast additions, subtractions, multiplications and divisions are on a particular platform.

Wednesday, January 20, 2010

The discriminant of an elliptic curve

Disc
 

If you're a fan of using high-powered techniques to solve easy problems, you can replace the first step with this.

Tuesday, January 19, 2010

Pokemon combinatorics

Collectable card games are incredibly popular. The cards for these games are typically sold in packages of several cards. The manufacturers print several sheets of cards, and include cards from the different sheets in these packages with different frequencies. They might include four cards from sheet A, two cards from sheet B and a single card from sheet C, calling the cards from sheet A “common,” the ones from sheet B “uncommon” and the cards from sheet C “rare.” Your ability to complete a collection of the entire set of cards is clearly limited by your ability to collect all of the rare cards.

How many packages of cards can you expect to have to buy before you complete a set?

A solution to this problem was described by Paul Erdős and Alfréd Rényi in 1961. Here’s roughly how their solution goes.

Suppose that we already have k different rare cards out of a total set of n possible cards. The next time we buy a pack of cards the probability to not get a new rare card is k/n so the probability that we need to buy exactly p packs of cards to get another new rare card is

(k/n)p-1 (1 - k/n)

so the expected number of packs of cards that we’ll need to buy to a new rare card is

Σp≥1 (k/n)p-1 (1 - k/n) p = 1 / (1 - k/n)

= n / ( n - k)

So the expected number of packs of cards that we’ll need to buy to get all n rare cards is

Σk=0:n-1 n / ( n - k) = n/n + n/(n - 1) + … + n/2 + n/1

= n (1/n + 1/(n - 1) + … + 1/2 + 1/1)

= n Hn

So you should expect to buy packs equal to about Hn times the number of rare cards to get a complete set. If there are 50 rare cards in a set, for example, you should expect to buy about H50 (approximately 4.5) times 50, or about 225 packs to get a complete set.

Here's how Hn grows as a function of n:

Image003

So for the number of rare cards in a typical set of collectable cards, a good rule of thumb is that you can expect to have to buy a number of packs of cards equal to between four and five times the number of rare cards to complete your set.

Friday, January 15, 2010

Marginally interesting


 Diophantus-II-8

Here's a picture of a notable page from a relatively unknown book. Apparently the right margin was just big enough to hold this:

Cubum autem in duos cubos, aut quadrato-quadratum in duos quadrato-quadratos, et generaliter nullam in infinitum ultra quadratum potestatem in duos eiusdem nominis fas est dividere cuius rei demonstrationem mirabilem sane detexi. Hanc marginis exiguitas non caperet.

Wednesday, January 13, 2010

Elliptic curves - connected or not

The graphs of some elliptic curves have one component like this one, the graph of y2 = x3 + 1.

Image001 

Others have two components like this one, the graph of y2 = x3 – 2x + 1.

Image002

It turns out to be easy to tell these two cases apart. If the elliptic curve is defined by the equation y2 = x3 + ax + b and we write D = -4 a3 - 27 b2, then if D > 0 then the graph of the elliptic curve has two components and if D < 0 then the graph has one component. Here’s why.

D is the discriminant of the cubic f(x) = x3 + ax + b. So if we write

x3 + ax + b = (xx1)(xx2)(xx3)

then we have that

D = (x1x2)2(x1x3)2(x2x3)2

Whether the graph of y2 = f(x) has one or two components is determined by how many real roots f(x) has. If there is a single real root then the graph crosses the x-axis once and the graph has a single component, like we get with y2 = x3 + 1. If there are three real roots then the graph then the graph crosses the x-axis three times and the graph has two disconnected components, like we get with y2 = x3 – 2x + 1.

If f(x) has three real roots then D is clearly positive. What about the case of a single real root and a complex conjugate pair of roots?

Let’s write the roots of f(x) as

x1 = a + bi

x2 = a - bi

x3 = c

where a, b and c are real. We can then write

D = (x1x2)2(x1x3)2(x2x3)2

= ((a + bi)– (a bi))2(a + bic)2(a - bi – c)2

= (2bi)2((a - c) + bi)2((ac) – bi)2

= -4b2((a - c)2 + b2)

which is always negative, just like we wanted.

Wednesday, January 06, 2010

Rational points on elliptic curves

If you look at the rules for adding points on an elliptic curve, you'll see that the coordinates of P3 = P1 + P2 are rational functions of the coordinates of P1and P2. This means that if P1and P2 have rational coordinates then P3 also does because its coordinates are rational functions of these rational coordinates.

It turns out that rational points on an ellipic curve also have other interesting properties. In particular, rational points of finite order have to have integer coordinates and the y-coordinate of these points has to either be 0 or divide the discriminant of the cubic that defines the elliptic curve. This is the Nagell-Lutz theorem, which is the following:

Let y2 = x3 + ax + b be an elliptic curve with integer coefficients and let D = 4 a3 + 27 b2. Then if P = (xP,yP) is a rational point of finite order then P has integer coordinates and either yP = 0 or yP2|D.

Example

Suppose that we have the elliptic curve y2 = x3 + 1. In this case we have that D = -27, so that the possibilities for the y-coordinate of a rational point of finite order are limited to 0, ±1, ±3. A quick check of these possibilities shows that we have the following points, and the Nagell-Lutz theorem tells us that that's all of them:

P1 = (-1,0)

P2 = (0,1)

P3 = (0,-1)

P4 = (2,3)

P5 = (2,-3)

Here's what this looks like:

Image001
 

It's also easy to check that these points plus O form a subgroup of the points on the curve. If you add any two of these points together you always get another of of these points. Here's this subgroup:

Rational points on y2 = x3 + 1

+

O

P1

P2

P3

P4

P5

O

O

P1

P2

P3

P4

P5

P1

P1

O

P5

P4

P3

P2

P2

P2

P5

P3

O

P1

P4

P3

P3

P4

O

P2

P5

P1

P4

P4

P3

P1

P5

P2

O

P5

P5

P2

P4

P1

O

P3

We can also think of the elliptic curve y2 = x3 + 1 as being parameterized by the Weierstrass ℘-function with periods ω1 and ω2 where we have approximately

ω1 = 2.10327 + 1.21433 i

and

ω2 = 2.42865 i

All of the points on y2 = x3 + 1 that have the property 6P = O come from the complex numbers shown here: 

Lattice2   

Of these 36 points, these are the ones that we get the subgroup of rational points of finite order from (z1 corresponds to P1, etc., and z0 corresponds to O):

Lattice1   

Tuesday, December 29, 2009

Degenerate elliptic curves

Suppose that we have an elliptic curve of the form

y2 = 4 x3 - g2 x - g3

Many mathematicians require that the right hand side of that equation have no repeated roots and insist that you really don't have an elliptic curve if this happen. The benefit of doing this is that it saves a few words. If you don't do that then you need to be careful to say something like a "non-singular elliptic curve" instead of just an "elliptic curve" in most contexts. Because this post is about the cases where you have repeated roots, I won't follow this convention here.

What happens if we do have repeated roots?

Two repeated roots

For an example of this, consider the elliptic curve

y2 = 4 (x - 1)2 (x + 2)

= 4 x3 - 12 x + 8

or where

g2 = 12

and

g3 = -8

This corresponds to the Weierstrass ℘-function with periods

ω1 = -π i / √3

and

ω2 = ∞

or really only being having one period instead of two. If we plot the magnitude of this ℘-function, this is fairly obvious. Here's what that looks like.

Image001
Three repeated roots

There's essentially only one way to get three repeated roots, and that's the case when we have an elliptic curve that looks like

y2 = 4 x3

or that

g2 = g3 = 0

If that happens, the ℘-function shows no periodic nature at all. Here's what a plot of its magnitude looks like in this case:

Image001 

From the post a few days ago, we can see that in this case we have to have

℘(z) = z-2

which explains the shape of this graph. We also have to have

℘′(z) = -2 z-3

When we think of (x, y) as coming from (℘(z), ℘′(z)), we see that we have

y2 = 4 x3

when this happens.

Monday, December 28, 2009

More interesting elliptic curve stuff

It turns out that we can easily tell some things about an elliptic curve of the form

y2 = 4 x3 - g2 x - g3

Suppose that we get this elliptic curve from the Weierstrass ℘-function with periods ω1 and ω2. Then it turns out that we we can also write this elliptic curve as

y2 = 4 (x - x1)(x - x2)(x - x3)

where

x1 = ℘(ω1 / 2)

x2 = ℘(ω2 / 2)

and

x3 = ℘((ω1 + ω2) / 2)

For an example of this, consider the elliptic curve

y2 = 4 (x - 1)(x - 2)(x + 3)

= 4 x3 - 28 x + 24

which we get from the Weierstrass ℘-function with periods approximately

ω1 = -1.48441 i

and

ω2 = 2.01891

In this case, we find that 

℘(ω1 / 2) = -3

℘(ω2 / 2) = 2

and

℘((ω1 + ω2) / 2) = 1

Wednesday, December 23, 2009

Adding points on an elliptic curve

An interesting identity that can be proven for the Weierstrass ℘-function is that
det
 
℘(z1) ℘′(z1) 1
℘(z2) ℘′(z2) 1
℘(z1+z2) -℘′(z1+z2) 1
 
= 0

If we think about this for a while, we see that this is explains why addition of points on elliptic curves works the way it does. Here's why.

A fact that's usually covered in high-school algebra classes is that

det
 
x1 y1 1
x2 y2 1
x3 y3 1
 
= 0

if and only if the three points (x1,y1), (x2,y2) and (x3,y3) are colinear.

Now from the point of view of the correspondence that we saw in the previous post that identified the complex number z with the point (℘(z),℘′(z)) on an elliptic curve, let's write

P1 = (℘(z1), ℘′(z1))

P2 = (℘(z2), ℘′(z2))

and

P3 = (℘(z1+z2), ℘′(z1+z2))

So the property of the ℘-function that's shown above tells us that although the points P1, P2 and P3 aren't colinear, if we reflect P3 across the x-axis to get -P3, then we get three points that are.

So to find P3 from P1 and P2, we need to first find the third point where the line that connects P1 and P2 intersects the elliptic curve. This point will be -P3. Once we find that point, we reflect it across the x-axis to get P3.

If we do this, we add the point P1 that we get from z1 to the point P2 that we get from z2 and get the point P1 + P2 = P3 that we get from z1 + z2. In other words, adding points on an elliptic curve is really just adding complex numbers, but hidden by the mapping that takes z to the point (℘(z),℘′(z)) on the elliptic curve.

An example

For an example of how this works, let's look at the elliptic curve

y2 = 4 x3 - 4x + 4

which we get from the Weierstrass ℘-function with periods approximately

ω1 = -2.19658 i

and

ω2 = 2.35354 + 1.09829 i

We have these two points on the curve

P1 = (-1,2) = (℘(z1),℘′(z1)) for z1 = 0.291202 + 1.09829 i

P2 = (0,3) = (℘(z2),℘′(z2)) for z2 = 0.73997 + 1.09829 i

for which we find that

P3 = P1 + P2 = (1,-2) = (℘(z3),℘′(z3)) for z3 = 1.03117

We also find that we have that

z3z1 + z2 (mod ω1, ω2)

just like we would expect to see.

Tuesday, December 22, 2009

Parameterizing elliptic curves

Here's an example of parameterizing an elliptic curve of the form

y2 = 4 x3 - g2 x - g3

using the Weierstrass ℘-function.

Consider the elliptic curve

y2 = 4 (x - 1)(x - 2)(x + 3)

= 4 x3 - 28 x + 24

This is a handy elliptic curve to use because we can easily see where its graph crosses the x-axis, etc. Identifying this with

℘′(z)2 = 4 ℘(z)3 - g2 ℘(z) - g3

we have that

g2 = 28

and

g3 = -24

To plot the graph of this elliptic curve when it's parameterized as (℘(z), ℘′(z)) we first need to find the periods ω1 and ω2, which we can find from g2 and g3. Those turn out to be approximately

ω1 = 1.48441 i

and

ω2 = 2.01891

Here's what the magnitude of the corresponding ℘-function looks like over a few of its periods (graphed using Mathematica, as usual).

Image001

If we then plot the graph of (℘(z), ℘′(z)) as z goes along the real line from 0.7 to 1.4,  we get the following graph, which is the graph of y2 = 4 x3 - 28 x + 24:

Image001 

Another example

Doing the same thing for

g2 = 4

and

g3 = -4

and plotting the graph of (℘(z), ℘′(z)) as z goes along the real line from 0.7 to 4.0 we get this, which is the graph of y2 = 4 x3 - 4 x + 4:

Image001
That's definitely another elliptic curve.

Monday, December 21, 2009

Why elliptic curves make sense

The way in which we can add points on an elliptic curve is often presented in a way that doesn't explain at all why adding points in that way makes sense. Adding points can roughly be summarized like this:

  1. Draw a line through the two points that you want to add together
  2. Find the third place where this line intersects the elliptic curve
  3. Reflect that point across the x-axis and call the resulting point the sum of the first two points

Doing this actually makes perfect sense, but seeing why requires looking at a particular elliptic function called the Weierstrass ℘-function.

An elliptic function is a function of a complex variable that's doubly periodic: it has two periods instead of one. This means that we have both

f(z ) = f(z + ω1)

and

f(z) = f(z + ω2)

for complex numbers ω1 and ω2

It turns out that the function defined by

℘(z ) = z-2 + Σ [ (z - ω)-2 - ω-2]

where the sum is over all ω = n1ω1 + n2ω2 not equal to zero, is an elliptic function with periods ω1 and ω2, and this particular elliptic function is called the Weierstrass ℘-function.

If we write

Gk = Σ ω-2k

where the sum is over all ω = n1ω1 + n2ω2 not equal to zero, we find that we can write

℘(z) = z-2 + 3 G2 z2 + 5 G3 z4 + ...

and

℘′(z) = -2 z-3 + 6 G2 z + 20 G3 z3 + ...

and that we can use these to find that we have that

℘′(z)2 = ℘(z)3 - 60 G2 ℘(z) - 140 G3

For historical reasons it's traditional to write

g2 = 60 G2

and

g3 = 140 G3

so that we have that

℘′(z)2 = ℘(z)3 - g2 ℘(z) - g3

If we write

 y = ℘′(z)

and

x = ℘(z)

we then have

y2 = 4 x3 - g2 x - g3

which is just a quick change of variable away from the form that we usually see for an elliptic curve:

y2 = x3 + a x + b

So it looks like we can parametrize an elliptic curve using the ℘-function just like we can parametrize a circle

x2 + y2 = 1

as

cos2t + sin2t = 1

using the usual trigonometric functions.

And just like understanding the properties of the trig functions tells us how to understand what's happening on a circle, understanding the ℘-function tells us how to understand what's happening on an elliptic curve. Like the addition of points.

More on that tomorrow.

Tuesday, December 15, 2009

Linear recursions and geometric series

As I mentioned in a previous post, one way to look at a linear recursion is as a matrix. When you do this, it's natural to think about the eigenvalues and eigenvectors of that matrix, and the eigenvectors turn out to be the initial conditions for the recursion that get it to greate a geometric series. We can use this fact to create an nth order recursion that creates n different geometric series of our choice, with the particular series being created determined by the initial conditions of the recursion. So if we pick any two numbers, it's possible to find a single second-order linear recursion that we can make create a geometric series with our chosen numbers as the ration of consecutive terms. Which ratio we get is determined by the initial conditions for the recursion.

Suppose that we write the recursion

xn = ak-1 xn-1 + … + a1 xn-k+1 + a0 xn-k

as

 
xn-k+1
...
...
xn
 
=
 
0 1
...
0 1
a0 a1 ... ak-1
 
 
xn-k
...
...
xn-1
 

If λ is a solution to

xkak-1 x k-1 - … - a1 xa0 = 0

then

 
1
λ
...
λk-1
 

is an eigenvector that corresponds to the eigenvalue λ, and this eigenvector is also the initial state that turns the recursion into a geometric series: if we have

x0 = 1, x1 = λ, x2 = λ2, …, xk-1 = λk-1

then we’ll always have xn = λn.

This means that for an nth order recursion, we can pick any n values λi (which are exactly these eigenvalues), and find initial conditions for the recursion such that xn = λin. The eigenvectors are the initial conditions that do this, and they get the recursion to create a geometric series with the ratio between successive terms being the eigenvalue which corresponds to the eigenvector. It's also easy to write down the recursion that does this for us. In particular, if the eigenvalues are the roots of

xkak-1 x k-1 - … - a1 xa0 = 0

then the recursion

xn = ak-1 xn-1 + … + a1 xn-k+1 + a0 xn-k

does this for us.

An example

Suppose that we want to find a second-order recursion that will create either the series

xn = (-2)n

or

xn = 3n

This corresponds to having

λ1 = -2

and

λ2 = 3

for the zeroes of

(x - λ1) (x - λ2) = (x + 2) (x - 3)

= x2 - x - 6

so that the recursion

xn+2 = xn+1 + 6 xn

will do this for us.

The initial conditions

x0 = 1

and

x1 = -2

give

xn = (-2)n

and the initial conditions

x0 = 1

and

x1 = 3

give

xn = 3n

Friday, December 11, 2009

The Fibonacci sequence, the golden ratio and Egyptian fractions

It turns out that there's a way to use the Fibonacci sequence to write the golden ratio in terms of Egyptian fractions. Here's how.

If we write the Fibonacci sequence as

f(n + 2) = f(n + 1) + f(n)

then we can write

f(n) = (anbn) / (ab)

where

a = (1 + √5) / 2

and

b = (1 – √5) / 2

Using this form of f(n) it's easy to show that

f(n)2fn+1 fn-1 = (-1)n-1

which we can rewrite as

fn+1 / fn = fn / fn-1 + (-1)n-1 / (fn fn-1)

But we also have that

fn / fn-1 = fn-1 / fn-2 + (-1)n-2 / (fn-1 fn-2)

so that

fn+1 / fn = fn-1 / fn-2 + (-1)n-2 / (fn-1 fn-2) + (-1)n-1 / (fn fn-1)

If we keep making similar substitutions, we find that we get that

fn+1 / fn = f2 / f1 + 1 / f1 f2 - 1 / f2 f3 + … + (-1)n-1 / fn-1 fn-2

The golden ratio a is the limit of fn+1 / fn as n gets large, so we have that

a = f2 / f1 + 1 / f1 f2 - 1 / f2 f3 + 1 / f3 f4 + …

= 1 + 1 - 12 + 16 - 115 + ...

Each of the terms in this sum happen to be fractions of the form 1n for some integer n, which are exactly the so-called Egyptian fractions. Apparently, the classical Egyptians used only fractions of that form, plus the single additional fraction 23, to represent all rational numbers.

If you try to actually do a few computations with Egyptian fractions, you'll probably come to the same conclusion that I did: that representing rational numbers this way is probably one of the greatest technical blunders of all time.

In any event, the above representation for the golden ratio that we got from the Fibonacci sequence is actually a way to represent it as a sum of Egyptian fractions. Yet another aspect of the Fibonacci sequence that I didn't expect.

Friday, December 04, 2009

The limits of provable security

I've never quite understood the objections that people have to cryptographic schemes that are provably secure. If you have a proof of security then one of two things must be true: either the scheme is secure or there's a flaw in the proof. There's no other possible case.

All of the technologies that Voltage's products use have such proofs. The Boneh-Franklin IBE, the Boneh-Boyen IBE and our format-preserving encryption technology all have such proofs that have been published in peer-reviewed research journals. Because of this, I often don't understand the questions that I sometimes get about the security of our technologies.

Here's a situation that I've seen more than once. Someone asks about the security of our FPE technology, for example. We'll point them to the paper by cryptographers John Black and Phil Rogaway that has a proof for the security of the scheme. The next question is essentially, "But why should I believe that it's secure?" At that point, I'm never quite sure what to say next. If you have a proof in front of you, then either the proof is correct or there's an error in the proof. If this proof is of the security of a cryptographic scheme, then either the scheme is secure or there's an error in the proof. As mentioned above, there's no other possible alternative.

My recent experiences have led me to believe that there's a fundamental problem with this approach, and that's because many people really aren't comfortable with the idea of a proof. I've seen many cases recently where people essentially accept P and NOT P at the same time and don't seem bothered by doing this.

Every time I see this, I start thinking about the logical implications of it. After all, if you accept both P and NOT P then you can prove that absolutely anything is true. "If 2 = 3 then all cats are dogs" is absolutely true, for example. But because many people either don't understand this or don't believe this, I've come to the conclusion that proofs of security really aren't that useful. They may help specialists make sure that their work is correct, but to non-specialists they really don't say anything useful.

Friday, November 27, 2009

More patterns in the Fibonacci and Lucas sequences

More observations about the Fibonacci and Lucas sequences.

As before, suppose that we have a second-order linear recursion given by

xn+2 = A xn+1 + B xn

and we write

x2A xB = (xa) (xb)

where

a = (1 + √(A2 + 4 B)) / 2

and

b = (1 – √(A2 + 4 B)) / 2

For a Fibonacci-like sequence we have that

f(n) = (anbn) / (a b)

and for a Lucas-like sequence we have that

g(n) = an + bn

Now if we look at

(a + b)n = an + … + bn

we see that all but the first and last terms are divisible by ab so that

(a + b)nan + bn (mod ab)

We can write

x2A xB = (xa) (xb)

= x2 – (a + b) x + ab

so that

A = - (a + b)

and

B = ab

This means that we can write

g(n) = an + bn

≡ (a + b)n (mod ab)

Or that

g(n) ≡ An (mod B)

which looks like an interesting relationship.

Similarly, we can write

f(n) = (anbn) / (a b)

= an-1 + … + bn-1

an-1 + bn-1 (mod ab)

g(n - 1) (mod ab)

An-1 (mod B)

or that

f(n) ≡ An-1(mod B)

which also seems to be an interesting relationship.

Wednesday, November 25, 2009

Another form of the Lucas sequence

As before, let’s write the nth tem of the Lucas seqence as

g(n) = an + bn

Just like in the previous post, let's write

z = (n / 2) log (a / b)

so that

2 cos iz = 2 cos [ (i n / 2) log (a / b) ]

e (n/2) log (a/b) + e - (n/2) log (a/b)

= (a / b) n/2 + (a / b) -n/2

= (an + bn) / (ab) n/2

Solving for

g(n) = an + bn

we find that

g(n) = 2 (ab) n/2 cos [ (in / 2) log (a / b) ]

which is very similar to the form of the Fibonacci sequence that we found earlier that involved the sine function:

f(n) = - [ 2 i (ab) n/2 / (ab) ] sin [ (in / 2) log (a / b) ]

If we have the recursion

g(n + 2) = g(n + 1) - g(n)

for example, with

g(0) = 2

and

g(1) = 1

we find that we get

g(2) = -1

g(3) = -2

g(4) = -1

g(5) = 1

g(6) = 2

where the sequence 2, 1, -1, -2, -1, 1 starts over again.

Because we can write this sequence as

g(n) = an + bn

where

a = (1 + i √3) / 2

and

b = (1 - i √3) / 2

we can use the fact that

g(n) = 2 (ab) n/2 cos [ (in / 2) log (a / b) ]

to find that

g(n) = 2 cos( n π / 3)

much like we did in one of the previous posts.

Because we have

f(n) = - [ 2 i (ab) n/2 / (a - b) ] sin [ (in / 2) log (a / b) ]

and

g(n) = 2 (ab) n/2 cos [ (in / 2) log (a / b) ]

it seems reasonable to look for a relationship that's much like the complex exponential, where

eiz = cos z + i sin z

perhaps something that contains lots of HTML entities for Greek letters like

ει n = g(n) + ι f(n)

It doesn't quite look like anything nice and clean is possible, although if we write

ι = a - b

then it looks like we can write

g(n) + ι f(n) = 2 an

and

g(n) - ι f(n) = 2 bn

That may be as close as we can get. 

Tuesday, November 24, 2009

Fibonacci, Lucas and Pell sequences

Even more stuff that I stumbled across while thinking about the Fibonacci sequence while trying to get into the right frame of mind to think about harder things.

Suppose that we have a recursion defined by

f(n + 2) = A f(n + 1) + B f(n)

We can write this as

f(n) = α an + β bn

where 

a = (A + √(A2 + 4 B) / 2

and

b = (A - √(A2 + 4 B) / 2

If we have that

f(0) = 0

and

f(1) = 1

like with we do with the Fibonacci sequence, then we find that

α = 1 / (ab)

and

β = -1 / (ab)

so that

f(n) = (anbn) / (ab)

Suppose that we want to get other interesting forms of f(n) like where

α = 1

and

β = 1

so that

f(n) = an + bn

It’s easy to show that we can get this when

f(0) = 2

and

f(1) = A

which gives what’s called the Lucas sequence when A = 1.

It’s also easy to show that if we want to make sure that we have

f(n) = (an + bn) / (a + b)

then we need to have

f(0) = 2 / A

and

f(1) = 1

If we want to have integer values, making A = 2 is a reasonable way to try to do this. This gives us the Pell sequence, which is defined by the recursion

f(n + 2) = 2 f(n + 1) + f(n)

Finally, that if we want to make sure that we have

f(n) = an - bn

or

α = 1

and

β = -1

then we find that we get this when

f(0) = 0

and

f(1) = ab = √(A2 + 4B)

We can get this to be an integer, for example, if we have that

A = 3

and

B = 4

or from the recursion

f(n + 2) = 3 f(n + 1) + 4 f(n)

with

f(1) = 5

That seems to cover the interesting cases, or at least the ones that I had time to think about today.

In any event, this might give us some idea of how Lucas and Pell came to think about the sequences that now carry their names.

Monday, November 23, 2009

The polar decomposition of the Fibonacci sequence

If we think about the matrix form of the Fibonacci recursion:

 
f(n + 1)
f(n + 2)
 
=
 
0 1
1 1
 
 
f(n)
f(n + 1)
 
a natural question to ask is what the polar decomposition of the matrix

A

=
 
0 1
1 1
 
tells us about the recursion and the sequence that it generates.

To get the polar decomposition of the matrix A we write it as

A = PU

where P is positive definite and U is unitary

Writing a matrix in its polar decomposition is much like writing a complex number as

z = r eiθ

where the positive definite part of the polar decomposition corresponds to the magnitude of the complex number and the unitary part of the polar decomposition corresponds to the rotational part of the complex number.

In the case of the Fibonacci recursion we find that

P = (1/√5)
 
2 1
1 3
 

and

U = (1/√5)
 
-1 2
2 1
 

so that

P2 =
 
1 1
1 2
 

P3 = (1/√5)
 
3 4
4 7
 
P4 =
 
2 3
3 5
 
etc., and that U2 = I.
 
In general, I'd guess that for a nth-order linear recursion, we'll always have that Un = I. That shouldn't be too hard to show.

As we saw before, A has eigenvalues

a = (1 + √5) / 2

and

b = (1 - √5) / 2

while in this case we find that P has eigenvalues a and -b while U has eigenvalues 1 and -1.

I'm not sure how useful this polar decomposition will be, but I feel fairly sure that I'll find a use for it some day.

Friday, November 20, 2009

Not quite the Fibonacci sequence

If instead of the recursion that we use to get the Fibonacci sequence:

f(n + 2) = f(n + 1) + f(n)

what happens if we change the sign of the coefficient of f(n) to get the recursion

f(n + 2) = f(n + 1) - f(n)

instead?

With the initial conditions

f(0) = 0

and

f(1) = 1

we get

f(0) = 0

f(1) = 1

f(2) = 1

f(3) = 0

f(4) = -1

f(5) = -1

f(6) = 0

f(7) = 1

at which point it's obvious that we'll just keep cycling through the pattern 0, 1, 1, 0, -1, -1 again. This probably means that the corresponding function of a real variable f(x) probably has an interesting graph. What does this look like?

We have that

f(x) = (ax - bx) / (a - b)

where

a = (1 + i √3) / 2

and

b = (1 - i √3) / 2

which means that we also have that

ab = 1

a - bi √3

and

(a / b) = (-1 + i √3) / 2 = exp ( 2 π i / 3 )

so that

|a / b| = 1

and

arg (a / b) = 2 π / 3

and that 

log (a / b) = log( |a / b| ) + i arg (a / b) = 2 π i / 3

We can substitute these into the expression from yesterday's post:

f(n) = - [ 2 i (ab) n/2 / (ab) ] sin [ (i n / 2) log (a / b) ]

to find that

f(n) = (2 / √3) sin ( n π / 3)

so that the graph of f(x) just looks like this:

Image001

That's not quite as interesting as the graphs that we got from the Fibonacci sequence, is it?

Thursday, November 19, 2009

Another form of the Fibonacci sequence

Another possibly-interesting observation about the Fibonacci sequence that I came across recently. 

As before, let’s write the nth tem of the Fibonacci seqence as

f(n) = (anbn) / (a - b)

where

a = (1 + √5) / 2

and

b = (1 – √5) / 2

Now we have that

e iz = cos z + i sin z

so that

e z = cos iz + i sin iz

and

e -z = cos iz i sin iz

Combining these, we get that

 - 2 i sin iz = e z - e -z

Now for the step that seems like magic (although it’s really the result of working backwards to get things to work out the way we want them to), write

z = (n / 2) log (a / b)

Using this definition of z we find that

- 2 i sin iz = - 2 i sin [ (i n / 2) log (a / b) ]

e (n/2) log (a/b)e - (n/2) log (a/b)

= (a / b) n/2 – (a / b) -n/2

= (anbn) / (ab) n/2

Solving for

f(n) = (anbn) / (a - b)

we find that

f(n) = - [ 2 i (ab) n/2 / (ab) ] sin [ (in / 2) log (a / b) ]

I'd never seen that form for the Fibonacci sequence before. I'm not sure how useful this is, but I found it a bit surprising.

Wednesday, November 18, 2009

More uses for eigenvalues and eigenvectors

It turns out that looking at eigenvalues and eigenvectors may be useful for more than thinking about linear recursions like the Fibonnaci sequence. They're also be useful when thinking about special relativity. In this case we have that the coordinates (x',t') in a frame of reference moving at velocity v relative to a stationary frame of reference are related to the coordinates (x,t) in the stationary frame of reference by

 
x'
t'
 
= γ
 
1 -v
-v 1
 
 
x
t
 

where γ = 1 / √(1 - v2) and speed of light is normalized to c = 1.

Now the matrix

γ
 
1 -v
-v 1
 

has eigenvectors

 
 
-1
1
 
and
 
1
1
 

which correspond to the eigenvalues γ(1 + v) and γ(1 - v) respectively. One interpretation for this is that the natural coordinates system to use has  

 
-1
1
 
and
 
1
1
 
for its basis instead of

 
1
0
 
and
 
0
1
 

If we had use those coordinates, lots of physics would be much simpler. On the other hand, that would also make it more difficult to actually make any measurements. Maybe our current system isn't really so bad after all.

Tuesday, November 17, 2009

More on the Fibonacci sequence

Another observation based on thinking about the Fibonacci sequence. This one relates to what you get when you write the action of a linear recursion as a matrix and what the eigenvalues and eigenvectors of this matrix tell us.

Another way to write a linear recursion

xn = ak-1 xn-1 + … + a1 xn-k+1 + a0 xn-k

is as

 
xn-k+1
...
...
xn
 
=
 
0 1
...
0 1
a0 a1 ... ak-1
 
 
xn-k
...
...
xn-1
 

Because we have a matrix, it’s natural to think about the eigenvalues and eigenvectors of the matrix and what they mean. In this case, finding these is actually fairly easy.

If λ is a solution to

xkak-1 x k-1 - … - a1 xa0 = 0

then we have that

 
0 1
...
0 1
a0 a1 ... ak-1
 
 
1
λ
...
λk-1
 
=
 
λ
λ2
...
λk
 
or that


 
1
λ
...
λk-1
 

is an eigenvector that corresponds to the eigenvalue λ. This eigenvector is also the initial state that turns the recursion into a geometric series: if we have

x0 = 1, x1 = λ, x2 = λ2, …, xk-1 = λk-1

then we’ll always have xn = λn.

In the case of the Fibonacci recursion, for example, we have that 

a = (1 + √5) / 2  

and

b  = (1 - √5) / 2

are the eigenvalues of the matrix

 
0 1
1 1
 

and correspond to the eigenvectors

 
1
a
 
and
 
1
b
 

respectively. These are the initial conditions that get the Fibonacci recursion to produce a geometric series. If we have x0 = 1 and x1 = a, for example, then we’ll always have xn = an.

Monday, November 16, 2009

Visualizing the Fibonacci sequence

Here are few interesting things I came across recently when I was thinking about the Fibonacci sequence. In particular, it involves visualizing the Fibonacci sequence.

The Fibonacci sequence is defined by the recurrence

f(n) = f(n - 1) + f(n - 2)

where

f(0) = 0

and

f(1) = 1

It's easy to show that we can also write

f(n) = (anbn) / (a - b

where

a = (1 + √5) / 2

and

b = (1 - √5) / 2

It's also easy to see the behavior of f(n) for negative values of n: we have that

f(-n) = (-1)n+1 f(n)

What happens if we think of f as a function of a real variable instead of an integer variable?

In this case we have the function

f(x) = (axbx) / (a - b)

This turns out to be a complex-valued function that happens to have a real part of zero at integer values. If we graph f(x) for 0 ≤ x ≤ 5, it looks like this (the complex value z = x + i y is plotted as (x, y)): 


Image001

If we graph f(x) for -5 ≤ x ≤ 0, it looks like this:


Image002
 

In that graph we can easily see how the fact f(-n) = (-1)n+1 f(n) is reflected.

To see what the graph for both positive and negative values of x look like together, here's the graph of f(x) for -4 ≤ x ≤ 5:

 

Image003 

What about derivatives of f(x)? 

It's fairly easy to see that 

f(n)(x) = (αn ax - βn bx) / (a - b

where

αn = (log a)n

and

βn = (log b)n

so that the graph of f(n) looks somewhat similar to the graph of f.

Here's how the graphs of f, f' and f'' look, for example:

Image002 
If we look at how the differences between Fibonacci numbers behave, like

Δf(n) = f(n + 1) - f(n) = f(n - 1)

we might expect to see this reflected in f', but there doesn't appear to be a nice, clean relationship like

f'(x) = f(x - 1)

I'm not even sure how close you can get in this particular case.

Monday, November 09, 2009

Ideal lattices

Ideal lattices have been mentioned a lot recently, particularly in association with Gentry's homomorphic encryption scheme. Exactly what is an ideal lattice?

Let's suppose that we know what a vector space is. With a vector space over the real numbers, for example, we have a set of vectors called a basis, and all of the sums of real multiples of the basis elements form the vector space. If we look at sums of integer multiples of the basis elements, that's a lattice. We can think of the basis as generating a lattice, and to call it a set of generators in that context.

An ideal is a generalization of the idea of "even numbers" or "multiples of 3." A bit more precisely, a set is an ideal if it fulfills two conditions. First, if we add elements in the ideal, we get something else that's also in the ideal. If you add two even numbers you get another even number, for example. We also have the property that if we multiply something in the ideal by something outside the ideal, we get something back in the ideal. That's just like what we get when we multiply any integer by an even integer and get another even integer.

If we have an ideal, we can use its generators as the basis for a lattice, so for every ideal there’s a lattice that we can associate with it in a fairly natural way. Suppose that we look at polynomials with integer coefficients, for example. In this case, if we have the ideal generated by 1 and x, the lattice with basis 1 and x corresponds to this ideal in a natural way.

Not every lattice corresponds to an ideal in the same way, however. If we consider the lattice which has 1 and 2x as its basis, we find that there's no ideal that corresponds to this lattice. The element x, which we can get by multiplying x (from outside the lattice) by 1 (from inside the lattice), isn’t an element of the lattice. This means that this example doesn’t meet the conditions necessary for an ideal, so there’s no ideal that corresponds to the lattice.

Since not every lattice corresponds to an ideal in this way, those that do are interesting, and these are what are called ideal lattices. They’re the framework used to create Gentry’s homomorphic encryption scheme, which is probably the most interesting use of ideal lattices discovered so far.

Friday, October 09, 2009

The Tate pairing

Suppose that P and Q are points on an elliptic curve and that the order of P is n. Let fP be a rational function equivalent to n(P) – n(O) and DQ be the divisor (Q) – (O). Then the Tate pairing of P and Q is e(P, Q) = fP(DQ).

We know how to get fP and how to evaluate it at a divisor, but the fact that evaluating fP at the divisor (Q) – (O) would require evaluating fP at the point O causes trouble. But if R is another point not equal to O, it turns out that we get the same answer when we evaluate fP at (Q + R) – (R) as we do when we evaluate fP at (Q) – (O). This means that we can find e(P, Q) by doing the following:

  1. Find fP as we described previously.
  2. Pick a random R.
  3. Calculate e(P, Q) = fP(Q + R) / fP(R).

That’s all there is to it. When I first heard about the use of the Tate pairing in Boneh-Franklin identity-based encryption, for example, I used this very approach to implement the Tate pairing in Mathematica, and it was good enough for that purpose. There are lots of ways to make this calculation more efficient, but for small values of n, what we’ve described is fairly simple to implement.

Thursday, October 08, 2009

Evaluating a rational function at a divisor

To evaluate the pairings that we need in pairing-based cryptography, we need to evaluate the rational function that we get from a divisor at another divisor. To understand how to do this, we need to understand what it means to evaluate a rational function at a divisor. This turns out to actually be fairly straightforward.

Suppose that we have a divisor

D = S ni(Pi)

and want to evaluate a rational function f at D to get f(D). We do this by calculating 

f(D) = P f(Pi)^(ni)

In the case of the divisors that we’re interested in, we’ll get a rational function of two variables x and y, and we’ll need to think of a point on an elliptic curve as being a point P = (x, y) to evaluate the rational function at the point.

For example, suppose that

P1 = (-1, 0)

P2 = (1,1)

D = 2(P1) – 2(P2)

and

f(x, y) = (x + y – 1) / (y – 2).

Then we can find

f(D) = f(P1)2 f(P2)-2

= f(P1)2 / f(P2)2

Using P1 = (-1, 0) we find that

f(P1) = (-1 + 0 -1) / (0 – 2)

= 1

Using P2 = (1, 1) we find that

f(P2) = (1 + 1 – 1) / (1 – 2)

= -1

Combining these, we find that

f(D) = f(P1)2 / f(P2)2

= 12 / (-1)2

= 1

Wednesday, October 07, 2009

Getting a rational function from n(P) - n(O)

In a previous post we saw how to add divisors like (P1) – (O) and (P2) – (O) and reduce the sum to the form (Q) – (O). Suppose that P is a point of order n and see what happens when we use this approach to evaluate

n(P) – n(O) = (P) – (O) + …+ (P) – (O)

We can do this by first finding

(P) – (O) + (P) – (O) = (2P) – (O) + div(f1)

where we’ve written f1 = u1 / v1 where u1 is the line through P and P, and v1 is the vertical line through 2P and -2P.

Next, we can find

3((P) – (O)) = (P) – (O) + (2P) – (O) + div(f1)

= (3P) – (O) + div(f1f2)

where we’ve written f2 = u2 / v2 where u2 is the line through P and 2P and v2 is the vertical line through 3P and -3P.

We can continue this process until we get to

n((P) – (O)) = (nP) – (O) + div(f1fn-1)

= (O) – (O) + div(f1fn-1)

= div(f1fn-1)

Note that by doing this, we’ve found a way to write n(P) – n(O) as the divisor of the rational function fP = f1fn-1. To calculate the pairings that we need for pairing-based cryptographic algorithms, we want to evaluate fP, and we can get this fP through the process that’s described above. That’s almost enough to define how to evaluate the Tate pairing. The rest of what’s needed will be the topic of tomorrow’s post.

Tuesday, October 06, 2009

Divisors on elliptic curves

Divisors on elliptic curves are important in pairing-based cryptography. They’re useful because the structure that an elliptic curve provides lets you reduce sums of divisors down to simpler sums.

Suppose that we have several divisors

D1 = (P1) – (O)

D2 = (P2) – (O)

Dn = (Pn) – (O)

We can always add the divisors D1 through Dn to get

D = D1 + …+ Dn

= (P1) + … + (Pn) + n(O)

but this gets unwieldy very quickly.

In particular, for cryptographic applications, we want to use a value of n that has at least 160 bits, and we certainly don’t want to think about manipulating a divisor that has 2160 terms. Fortunately, elliptic curves provide the structure needed to make a calculation like this feasible because they provide a way to reduce a sum of the form (P1) – (O) + (P2) – (O) to another divisor of the form (Q) – (O), which keeps the number of terms manageable. Here’s how we can do this.

Suppose that P1 and P2 are two points on an elliptic curve and that P3 = P1 + P2. Let u be the line through P1, P2 and –P3, and let v be the line through P3 and –P3.The following picture shows how this looks.

Figure 4-1

Now let's think of points on the elliptic curve as poles and zeroes that define a divisor, so that u has zeroes at P1, P2 and –P3 plus a pole of order 3 at O and v has zeroes at P3 and –P3 plus a pole of order 2 at O. This means that we can write

div(u) = (P1) + (P2) + (–P3) – 3(O)

and

div(v) = (P3) + (–P3) – 2(O)

Now note that we look at div(u) – div(v) we find that

div(u) – div(v) = (P1) + (P2) + (–P3) – 3(O) – ((P3) + (–P3) – 2(O))

= (P1) + (P2) – (P3) – (O)

= [(P1) – (O)] + [(P2) – (O)] – [(P3) - (O)]

or that

[(P1) – (O)] + [(P2) – (O)] = (P3) - (O) + div(u/v)

So if D1 = (P1) – (O) and D2 = (P2) – (O), this gives us a way to find D1 + D2 in a way that keeps the sum in the form (Q) – (O). If we apply this trick again and again, this gives a way to calculate a sum like D = D1 + …+ Dn, and to do it in a way that keeps the number of terms in the sum small.

As we do this, we’ll accumulate a product of rational functions from the div(u / v) contributions, and in a future post we’ll see how to use that to get a way to evaluate the divisors that we need in pairing-based cryptography.

Exactly how do you find the rational functions u and v?

Suppose that you have u: y = mx + b, or ymxb = 0. If that’s the case, you have u(x, y) = ymx b.

Suppose that you have v: x = c, or xc = 0. If that’s the case, you have v(x, y) = xc.

Monday, October 05, 2009

Divisors

Certain functions called pairings are an important part of the mathematical machinery that make pairing-based cryptographic algorithms possible. Pairings, in turn, are based on the idea of a divisor. These divisors aren’t those you see in elementary number theory, where 4 is a divisor of 12, for example. Instead, they’re a way of keeping track of the zeroes and poles of a rational function.

The zeroes and poles of a rational function define the rational function up to multiplication by a constant. So if we know that we have a rational function with a zero of order two at x = 1 and a pole of order one at x = 2, then the function has to look like

f(x) = c (x – 1)2 / (x – 2)

A divisor is a way to keep track of the zeroes and poles of a rational function by using a notation that looks much like addition of integers. We write the divisor of f as

div(f) = 2(1) – (2) – (¥)

What’s in the parentheses tells us where we have a zero or pole and the coefficient in front of the parentheses tells us the order of the zero or pole. Positive numbers indicate a zero; negative numbers indicate a pole. Note that this particular rational function also has a pole at x = ¥, and the divisor that we write for it reflects this.

So if we write

div(f1) = (3) - (¥)

that tells us that f1is a rational function with a zero at x = 3 and a pole at x = ¥, so that, up to a constant factor, f1 looks like

f1(x) = x – 3

We can add divisors by adding the coefficients of like terms, so that if

div(g) = (3) - 2(1) + (¥)

which corresponds to

g(x) = (x – 3) / (x – 1)2

then we can add div(f) and div(g) to get

div(f) + div(g) = 2(1) – (2) - (¥) + (3) - 2(1) + (¥)

= (3) – (2)

which corresponds to the rational function

h(x) = (x – 3) / (x – 2)

Clearly, adding divisors corresponds to multiplying rational functions, and if we define subtraction in the natural way, it corresponds to dividing rational functions. So we can write

div(f) + div(g) = div(fg)

and

div(f) – div(g) = div(f / g).

Note that the divisor div(1) acts much like 0 does in the integers. In the integers, adding or subtracting 0 doesn't change the value of an integer. The divisor div(1) behaves the same way:

div(f) + div(1) = div(f)

and

div(f) - div(1) = div(f)

The notation for divisors isn’t quite the same as addition, because we can’t simplify divisors by reducing (1) – (1) to div(1), for example.

Tomorrow: divisors on elliptic curves.

Friday, September 25, 2009

Picturing the central limit theorem

The central limit theorem tells us that if we add data that contains random errors then we tend to get a sum that's normally distributed, independent of the distribution of the original data. I was wondering how many errors you need to add together to get something that looked close to a normal distribution, so I created some random data from a uniform distribution in Mathematica that represents such errors. I then and plotted one of these errors, the distribution of the sum of two of these errors and the distribution of the sum of three of these errors. Here's what I got.

Image001

One random error.

Image002

The distribution of the sum of two random errors.

Image003

The distribution of the sum of three random errors.

It certainly looks like you're fairly close to a normal distribution after adding together only three errors. I was surprised to see how quickly this happened.

Friday, September 18, 2009

The Chinese remainder theorem

Supose that n1, n2, …, nk are positive integers that are have no common factors. Then the Chinese remainder theorem (CRT) tells us that for any integers a1, a2, …, ak there is an integer x that satisfies the following system of congruences:

xa1 (mod n1)

xa2 (mod n2)

xak (mod nk)

and that all solutions are congruent modulo the product N = n1n2nk. The CRT is particularly useful to know if you’re doing some of the number-theoretical calculations that you do in cryptography.

It turns out that the algorithm that’s used to get a solution the CRT is really just Lagrange interpolation in disguise. With Lagrange interpolation we find a solution to be

f(x) = d1(x) y1 + d2(x) y2 + … + dk(x) yk.

With the CRT we find a solution to be

x = d1(n1) a1 + d2(n2) a2 + … + dk(nk) ak (mod N).

In both cases the ds have the property that di(nj) = 1 if i = j and 0 if ij. They’re calculated differently, but their role in the calculation is really the same: to force a value of 1 in one particular case and a value of 0 in all other cases.

In the case of the CRT, here's how we do this. We first notice that N and N/ni are relatively prime. There's essentially only one trick that you can use when things are relative prime, and that's to use the fact that we can find ui and vi such that ui ni + vi N/ni = 1. If we write di(nj) = vi N/nj  modulo nj, then this acts just we want it to.

We have ui ni + vi N/ni = 1, so that we have vi N/ni ≡ 1 (mod ni), or that di(ni) = 1.

And because nj | (N/ni) when ij, we also have that di(nj) = vi N/nj ≡ 0 (mod ni ) when  ij.

As I mentioned in the previous post about Lagrange interpolation, when I was in school, every teacher that I had presented Lagrange interpolation as a nasty formula involving lots of complicated expressions. The same thing was true for me with the CRT. Every time I saw this in a class, it was always presented as a nasty formula involving lots of complicated expressions. Nobody ever pointed out that it’s really the same thing as Lagrange interpolation.

I have to wonder why this was taught this way. If you understand what either Lagrange interpolation or the CRT is trying to do, it’s easy to reconstruct the complicated expressions that you need to calculate. If you don’t understand what they’re trying to do, you end up trying to memorize some complicated expressions and hoping that you get it right. I hope that things are taught differently these days. Its easy for me learn patterns and apply them in other places. It’s hard for me to memorize lots of stuff, particularly when I don’t understand exactly what I’m memorizing. I'm probably not alone in this, and I hope that the way things are taught today reflects this.

Thursday, September 17, 2009

Lagrange interpolation

Lagrange interpolation is another thing that I wish that I’d had explained better to me when I was in school. There’s actually a connection between this and something that’s related to cryptography, but I won’t have a chance to mention that until tomorrow’s post. I only have enough patience to type and correct a little bit of math each day, and getting to the crypto connection would definitely take more than my entire daily supply of patience.

Lagrange interpolation is a way to create a polynomial that fits a sequence of data points (x1,y1),…(xn,yn). The way to do this is to create polynomials δi(x) that have the property that δi(xj) = 1 if i = j and 0 if i j. If we have these, then we know that the polynomial

f(x) = δ1(x) y1 + δ2(x) y2 + … + δn(x) yn

has to fit the data points that we’re given. At x = x1, all of the terms of f(x) are equal to 0 except δ1(x1) y1, which happens to reduce to y1, at x = x2, all of the terms of f(x) are equal to 0 except δ2(x2) y2, which happens to reduce to y2, etc.

Now let

p(x) = (xx1) (x - x2) … (x - xn),

and pi(x) be p(x) with the factor (x – xi) left out. This pi(xj) is almost what we want. We have that pi(xj) = 0 if i j but we don’t have that pi(xj) = 1 when i = j.

That’s easy to fix, though. If we divide pi(xj) by pi(xi) then we’re done. That forces a value of 1 when i = j and a value of 0 when i j. So if we define δi(x) = pi(xj) / pi(xi) then this gives us what we want, and the polynomial

f(x) = δ1(x) y1 + δ2(x) y2 + … + δn(x) yn 

has to fit the data points that we’re given.

Lagrange interpolation seems fairly simple once you see what you’re trying to do: you’re just creating polynomials that are forced to fit your data because of the way they’re constructed. When I was in school, however, it was never explained to me that way. Instead, it was just written down as a big, complicated expression involving lots of sums and products and it was left to the students to figure out what was really going on. And because it was presented that way, the connection to the topic of tomorrow’s post wasn’t obvious at all.

Wednesday, September 16, 2009

Things that I didn't need to memorize

There are several things that I wish that I'd learned in school that nobody got around to teaching me. I had to figure these out on my own, and I still don't quite see why these things were overlooked by the teachers or professors that I had. Here's an example of this, and it has to do with trigonometry. It turns out that there's a quick and easy way to get the double-angle formulas for sines and cosines, or how to find sin 2x or cos 2x in terms of sin x and cos x. The easy way to do this is by looking at how the complex exponential works.

We know that

eix = cos x + i sin x

so we can write

e2ix = cos 2x + i sin 2x.

We can also write

e2ix = eix eix = (cos x + i sin x) (cos x + i sin x)

= cos2x - sin2x + 2i sin x cos x.

When we compare the real and imaginary parts of these two different ways of writing e2ix we find that

cos 2x cos2x - sin2x

and

sin 2x =  2 sin x cos x

which are the double-angle formulas from trigonometry. You can use the same trick, of course, to get similar expressions for cos 3x, sin 3x, etc.

For me, it's much more useful to know a way to get these formulas if I need them instead of just memorizing them. But since nobody ever showed me how to do this when I was in school and I wasn't clever enough to figure it out on my own, I was stuck memorizing stuff that I really didn't need to memorize. Looking back, I wonder why I was never taught the easier way.

Wednesday, September 09, 2009

The two-envelope problem in risk management

Does it make sense to never change your information security strategy? That's a possible consequence of the so-called two-envelope paradox. This is a problem in probability theory that has confused students of probability theory for over 50 years. It goes like this.

Suppose that you're given two envelopes and you're told that one envelope contains twice as much money as the other. You then open one of the envelopes and see how much money it contains. Based on this information, you decide to either keep the contents of the first envelope or to switch its contents for the contents of the second, unopened envelope.

It might seem that it always pays to switch.

Suppose you find $2 in the first envelope. You know that the other envelope either contains $1, which happens with probability 0.5, or it contains $4, which also happens with probability 0.5. So you can calculate the expected value of the second envelope as $1 x 0.5 + $4 x 0.5 = $2.5. Because this is greater than $2, it always pays to switch.

There's a problem with this argument, of course, but it's fairly subtle. Even specialists in probability theory don't agree what the problem actually is, although they all agree that there's a problem with the argument.

Now let's suppose that we can't find a flaw in the above argument and we apply it to our information security strategy. Let's suppose that we have some initial set of technology, policies and procedures that end up giving us some exposure to risk that we'll denote R, and if we change to a different set of technology, policies and procedures, we might either increase the risk to 2R or decrease it to R/2. If we apply the same reasoning that we applied above, we find that it never pays to change, because the alternative always has a greater than the risk than what we have now. This clearly doesn't make sense, but it's what you might get if you do a risk analysis that isn't as careful as it could be.

The bottom line is probably that probability is a complicated and subtle concept, which means that risk management, which relies on it, also is.

Wednesday, August 19, 2009

Bayesian Scouting

I recently heard some interesting statistics about Scouting. Apparently, even though only one in four boys join Scouting, three out of four of adults holding leadership positions in the business world (representing five percent of jobs) were in Scouting. When I heard this, my first thought was to apply Bayes' theorem. After all, we're given P(S|L), so I was curious to apply Bayes' theorem to find P(L|S), or to use the available information to find what the chances of a Scout ending up a future leader are.

We have that P(S) = 1/4, P(L) = 1/20, P(S|L) = 3/4, and calculate P(L|S) = P(S|L) P(L) / P(S) = 3/20. A similar calculation tells us that P(L| not S) = 1/60, so that it seems that boys who join Scouting are nine times more likely to end up as leaders than boys who don't.

I haven't seen that number before, so I'd guess that the Scouting organization isn't as big a fan of Bayes' theorem as I am.

Friday, August 14, 2009

Twin primes and RSA

I just noticed an interesting quirk in how the RSA algorithm works if you use twin primes for the p and q that are used to calculate the modulus N = pq. Twin primes are those that differ by 2, so instead of writing p and q, let’s write p and p + 2 so that N = p (p + 2) and we have that f(N) = (p -1) (p +1) = p2 – 1.

Now let’s pick our public key to be e = p. Then we need that the private key d has the property that ed = 1 (mod f(N)). Note that d = p satisfies this because p2 = 1 (mod p2 – 1). So if we have that the modulus is the product of twin primes, then we can find a case where the public key and the private key are the same.

I doubt that this is the first time that someone’s though of this, but it was new to me.

Thursday, July 30, 2009

Uses for the Axiom of Choice

The Axiom of Choice is one of those things that causes a lot of trouble if you think about it for too long. It says that if you have a collection of non-empty sets then you can choose a member of each set in the collection. This sounds perfectly plausible, but it turns out that you can either assume that the Axiom of Choice holds or that it doesn't hold and without causing any logical inconsistencies. I've never really cared much about the Axiom of Choice myself. The only thing that it's done for me is create a really bizarre memory from graduate school.

When I was in graduate school, there was a fellow graduate student named Mike who was fascinated with the Axiom of Choice. It was definitely his favorite topic of discussion, although I never quite understood why. Mike told me once how he once was on a date that he could tell just wasn't going well. Apparently he and the woman that he had asked out ended up having absolutely nothing in common. After a while, he decided that he was going to stop trying to find common interests and instead decided to talk about things that interested him. In this case, it was the Axiom of Choice. According to Mike, this caused a sudden and dramatic change in his date, who couldn't keep her hands off him for the rest of the night.

I don't know if this story is true or not, but it certainly is one of the more unusual applications for the Axiom of Choice that I've heard of. Another interesting use may be in economics.

According to a paper by Christopher Ayres, all of the foundations of economic theory rely of the Axiom of Choice. Because the Axiom of Choice seems to be something that you can either accept or not accept without really affecting anything, I'm not sure how much Ayres' results will withstand a careful look, but I'm sure that I can think of at least one person who'd be interested in reading his paper. Of at least he would have been several years ago.

Voltage Data Breach Index

  • Grab the Voltage Data Breach Index

March 2010

Sun Mon Tue Wed Thu Fri Sat
  1 2 3 4 5 6
7 8 9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31