Friday, 27 January 2012

Too Random

I was playing cards (a variant of Gin Rummy) with some people recently. When it was my turn to shuffle, I would try to do a very good job of it. I wanted the cards to be well-randomized.

We all noticed that when I shuffled, it took longer for someone to win the round. We assumed that the more random the cards, the harder it would be to get triplets or straights. You see, after a round, when we collected the cards, they were bunched together (people had laid down triplets and straights, and the cards in their hands were collected in partial groups). Before the shuffle, they were not random. So a quick shuffle meant, obviously, that there was less randomizing; the bunches tended to stay together just a bit more for the next round. A card in play was more likely to have a "partner" card in play nearby, and that partner was more likely to be only 2 or 3 cards away, which is what was needed for the same person to get both cards on the deal.

OK, there's nothing radical about this analysis. But what was interesting for me was that one of the players started asking me to shuffle less. She won more with fewer shuffles. My guess is that there are two overall strategies: play as if the deck is random, or play as if it is not. She had learned (maybe subconsciously) how to play with a strategy for non-random cards, and this was successful whenever she played. However, that strategy did not work with more random cards.

Thursday, 26 January 2012

Disappearing domain names

According to the information at dailychanges.com, the number of domain names actually decreased recently. The most recent data shows that although there were over 77,000 new domain names registered on January 23, 2012, there were also over 85,000 domain names deleted. That's something that I hadn't seen before. And I wouldn't be too surprised if this actually changes from day to day and isn't part of a bigger trend.

Wednesday, 25 January 2012

Symbian still number one

According to the data from statcounter.com, the Symbian operating system is still more popular than the ones that we hear about in the news so much: iOS and Android. That might change in a year or so, however.

StatCounter-mobile_os-ww-monthly-201012-201112-bar

That probably explains why most attacks on mobile devices target Symbian, doesn't it?

Tuesday, 24 January 2012

Is there really no innovation in information security?

According to a recent article on the CSO magazine web site, there's not enough innovation in the information security industry to let businesses keep up with the ever-changing threats that they face.

Is this really true?

Voltage has created an innovation or two, and because that's the sort of stuff that I see on a day-to-day basis, my first thought was that this can't possibly be true. After all, if we're doing it, others must be doing it too.

But then I remembered going to last year's RSA Conference and how unimpressed I was by what vendors were offering. I didn't really see much that I thought was innovative. (No, a CEO claiming that the next 12 months were going to be "the year of PKI" doesn't count as the sort of innovation that we're interested in here.)

This year's conference isn't too far off. It starts at the end of next month, and this year I'll be using what I see there to check whether the claim that there's not enough innovation in the industry is true. I hope that I'll see some good counterexamples, but I'm really not expecting to.

Monday, 23 January 2012

An interesting comment in US v. Jones

The recent Supreme Court opinion (PDF) in US v. Jones had an interesting comment that's particularly relevant to how the relationship between privacy and the Internet will develop in the future.

In case you've forgotten, this particular case related to the government's right to place a GPS tracking device on someone's car. The court ruled that this particular use of the technology was indeed an infringement of the person's Fourth Amendment rights, which was an interesting ruling. But here's what Justice Sotomayor said that I found particularly notable:

More fundamentally, it may be necessary to reconsider the premise that an individual has no reasonable expectation of privacy in information voluntarily disclosed to third parties. E.g., Smith, 442 U. S., at 742; United States v. Miller, 425 U. S. 435, 443 (1976). This approach is ill suited to the digital age, in which people reveal a great deal of information about themselves to third parties in the course of carrying out mundane tasks. People disclose the phone numbers that they dial or text to their cellular providers; the URLs that they visit and the e-mail addresses with which they correspond to their Internet service providers; and the books, groceries, and medications they purchase to online retailers.

I'd say that that's a good idea, and one that definitely needs to be considered more carefully by the courts.

Intersections of lines from adding points on an elliptic curve

I was playing with a graphing program this morning and tried graphing an elliptic curve and all of the lines that you'd use to add the points on the curve with integer coordinates using the chord-and-tangent method of addition. Here's what I got for the elliptic curve y^2 = x^3 + 1.

There must be something interesting that you can state and prove about the intersections of those lines.

Graph2 
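For anyone who wants to reproduce the lines in that picture, here is a minimal Python sketch. It assumes a search window of |x| ≤ 10 for the integer points, and the function names `integer_points` and `line_through` are made up for this illustration; it is not the graphing program that produced the plot.

```python
from fractions import Fraction
from math import isqrt


def integer_points(bound):
    """Integer points (x, y) on y^2 = x^3 + 1 with |x| <= bound (assumed window)."""
    points = []
    for x in range(-bound, bound + 1):
        rhs = x**3 + 1
        if rhs < 0:
            continue  # no real y for this x
        y = isqrt(rhs)
        if y * y == rhs:
            points.append((x, y))
            if y != 0:
                points.append((x, -y))
    return points


def line_through(p, q):
    """Chord through p and q, or the tangent at p when p == q.

    Returns (slope, intercept) as exact fractions, or None for a vertical line.
    """
    (x1, y1), (x2, y2) = p, q
    if p == q:
        if y1 == 0:
            return None  # vertical tangent
        slope = Fraction(3 * x1 * x1, 2 * y1)  # dy/dx from 2y dy = 3x^2 dx
    else:
        if x1 == x2:
            return None  # vertical chord
        slope = Fraction(y2 - y1, x2 - x1)
    return slope, y1 - slope * x1


points = integer_points(10)
for i, p in enumerate(points):
    for q in points[i:]:
        print(p, q, line_through(p, q))
```

Within that window the curve has only five integer points, so there are just fifteen chords and tangents to draw, and their pairwise intersections can be computed from the slopes and intercepts.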

Friday, 20 January 2012

Weird risk stories from 2011

There's an interesting article at the allbusiness.com web site that talks about some unusual risks that appeared in 2011. Here's one of the incidents that this article describes:

Newspaper Burned by Exploding Donuts

Apparently crazy court decisions are not solely an American invention. A Chilean newspaper, La Tercera, was recently ordered to pay $163,000 US to 13 people who suffered burns after the churros they were cooking exploded. The court agreed that the temperature listed in the paper’s recipe was too hot, which caused the dough to explode.

The plaintiffs won’t be rolling in dough, but this is a very unique legal theory. I wonder if U.S. newspapers will discontinue printing recipes to mitigate their risk.

Thursday, 19 January 2012

President's Challenge hacked

It looks like the President's Challenge web site has been hacked and users' data stolen. Here's what the email to users of the site said:

We are writing to inform you about a security issue involving the President’s Challenge website [www.presidentschallenge.org]. 

Hackers recently accessed our database, which included personal information such as your username, password, security question and answer, email address, date of birth, city and state, and, if you provided it, your name. The hackers were also able to access data such as your logged activities, your nutrition goals, what groups you are in, and messages you had sent and received within the online tracker. 

After we learned about the attack, we quickly took down the President’s Challenge website on January 11 and began the process of determining what information the hackers accessed and how it may affect you. We also contacted law enforcement to alert them to the hackers’ illegal activity.

Please note that we do not keep credit card numbers or Social Security numbers for users of our online tracker and shop. Regardless, we are alerting you so you can change your login information on any website where you might have used the same or similar username and/or password, and so you can generally monitor your personal and financial information.

We are in the process of securing the President’s Challenge website, and we expect to bring it back online within the next few days. Before you log in, you will be prompted to reset your password. You will then be able to log your activities and, for PALA+ users, your nutrition goals for the past three weeks. All of your previously logged activities and nutrition goals are still stored in the database.

We are sincerely sorry for this situation and any inconvenience or concern it causes you. We take your privacy very seriously. Before the attack, our website was routinely reviewed for security flaws. We are currently reviewing our security practices to make them even stronger and to reduce the probability of a future breach.

I haven't heard how many users were affected by this breach. The President's Challenge is somewhat popular with Boy Scouts, who can get some sort of recognition for completing it, so there may actually be lots of people affected by this breach, including lots of children.

XTS in Cryptologia

It looks like the article on the XTS mode of AES finally made its way into Cryptologia. If you don't subscribe to Cryptologia, you can get a copy of the article here, although you'll have to pay either $58 for the entire issue that it's in or $43 for the single article. Either price seems a bit high to me.

Wednesday, 18 January 2012

The Programmer's Analog to Chewing Tobacco

If you've ever watched baseball, you know that many baseball players use chewing tobacco. It gives them no competitive advantage, as steroids or other drugs do, so why do it? Because they're addicted. So why did they start in the first place? Because when they were kids they saw professional baseball players chewing tobacco.

Kids are stupid, sure, but I think there's something else to it. Kids like to get the "Big League" feeling. They see what goes on in the big leagues, then copy it so that while playing in the junior leagues, for a moment they're living a fantasy of being in the majors.

This happens to adults, as well, I think. I used to play in an intramural basketball league. I chose the least competitive league there was. Yet every so often someone would do the little cheating things you see in the NBA. For example, someone might surreptitiously grab the shirt of an opponent during a free throw or foul hard on a breakaway layup to prevent the two points.

People like this know they aren't in the big leagues, but I think this is an opportunity to pretend a little. To live out the fantasy, if only on a very small scale.

I think programmers are susceptible to this as well. It's not quite the same, but it's there. I think one image that young, inexperienced programmers have is that of the master who cranks out a program in just a few seconds or minutes. I think that many programmers see themselves as masters and enjoy writing something in a very short time. I think this image comes from the programming culture and Hollywood.

So some programmers get into the habit of getting a program written and running in just a short amount of time. It's like being in the big leagues. A problem is stated and then after just a few minutes, the solution is done. "Man I'm good!"

Sometimes the quick program is fine, but I think this sort of thinking bleeds into the regular programming tasks. I think this desire to quickly do some programming tasks leads to bad code in general.

Sure, many programming jobs take hours, days, or weeks, but like chewing tobacco, the habit of cranking out something very quickly becomes part of the day-to-day mindset.  

Even though a plan might call for a feature to be added in the next release 4 months away, some of the coding will be done quickly. Not because there is a time limit of minutes or hours, but because the habit is there. No documentation, no comments, single-character variable names, no attention to detail, no attention to aesthetics, no attention to efficiency, no thought of generalizations, no thought to expansion or portability or maintainability. This is what happens when you have no time to get the job done, when the deadline is minutes away.

So if you have time to do it right, why is there code with no documentation, no comments, single-character variable names, and so on? Because programmers get into bad habits from trying to emulate the image of the master who can crank out the code in record time.

Tuesday, 17 January 2012

What is i^i?

What is i^i? That's easy enough to figure out. What's slightly more difficult to understand is why I get asked questions like this. Something about giving answers, I suppose.

In any event, for any complex numbers a and b (with a ≠ 0) we have that

a^b = e^(b log a)

And for any complex z we have that

log z = ln |z| + i arg z

So that

log i = ln 1 + i (π/2 + 2nπ)

= i (π/2 + 2nπ)

So that we have that

i^i = e^(i · i (π/2 + 2nπ))

= e^(-π/2) e^(-2nπ)

For the principal branch of the logarithm, this just reduces to

i^i = e^(-π/2) ≈ 0.20788

but even in the cases where n ≠ 0 we still always have that i^i is a real number. In fact, if we plot the values of i^i for -2 ≤ n ≤ 2, here's what we get.

Iexpi

So in addition to having the principal value of e^(-π/2), we can also make i^i either as big as we want to (by taking n << 0) or as close to 0 as we want (by taking n >> 0), but in any case, it's still always a real number.
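If you want to check these values numerically, here is a small Python sketch using the standard cmath module. The function name `i_to_the_i` is just a label for this illustration; it evaluates e^(i log i) on the branch where log i = i (π/2 + 2nπ).

```python
import cmath
import math


def i_to_the_i(n):
    """Value of i^i using the branch log i = i (pi/2 + 2 n pi)."""
    log_i = 1j * (math.pi / 2 + 2 * n * math.pi)
    return cmath.exp(1j * log_i)


for n in range(-2, 3):
    value = i_to_the_i(n)
    print(n, value.real, value.imag)
```

The imaginary part comes out as zero for every n, and the n = 0 case agrees with what Python itself returns for `1j ** 1j`, which uses the principal branch.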

Friday, 13 January 2012

More Blog Comments

The other day I posted about blog comments. I suggested that many blog comments are generated by an artificial intelligence program. These comments are posted only to get a hyperlink into the internet.

Within a few minutes of the blog entry being published, this comment was posted.

Hello, When you call Workers' Compensation, LLC, we will be sure to explain the process of obtaining workers' compensation and the factors that will determine whether or not you will be awarded specific benefits.

I deleted it, but it didn't completely fit in with my theory. I was thinking that the AI comments were posted to blog entries a month or so old, so as to generate less scrutiny. I also figured the comments would nominally refer to the entry itself.

So why did this comment get posted? And why on that blog entry?

Thursday, 12 January 2012

What too much information causes

Attention is essentially a cognitive faculty with a very well-marked ethical component, because to be ethical to you, I need to be attentive to your needs and desires. I need to be aware. We cannot be kind and considerate without paying attention to others. If I am distracted, you are an abstraction, you are not a real person. Attention is necessary for civility.

P. M. Forni, The Thinking Life: How to Thrive in the Age of Distraction

As a follow-up to yesterday's post, this might explain some of the behavior that we see on the Internet today.

Wednesday, 11 January 2012

What too much information does

What information consumes is rather obvious: it consumes the attention of its recipients. Hence a wealth of information creates a poverty of attention, and a need to allocate that attention efficiently among the overabundance of information sources that might consume it.

Herbert A. Simon, "Designing Organizations for an Information-Rich World," Computers, Communication and the Public Interest (1971)

Note the date of this. It's well before the rise of the Internet. But it was apparently just as true back then as it is today.

Tuesday, 10 January 2012

Artificial Intelligence in Blog Comments?

The field of AI still hasn't produced the general thinking machine. But as I understand it, one of the successes of AI has been to create programs that can act intelligently in a specialized area. For example, an AI program can help diagnose car problems or another one can parse customer questions and be a first line of support.

Apparently someone out there has written an AI program that reads blog posts and leaves comments. The purpose is to write a comment that gets by the spam filter so the hyperlink on the commenter's name gets through. (A hyperlink in the comment itself triggers red flags in spam filters, but the hyperlink in the commenter's name is part of the protocol.)

Suppose you're a spammer and you think that leaving comments in blogs will help you. Maybe you think that other people reading your comment will click through to the URL in your name. Or more likely you think that putting a link to your web site in a comment will make it look to search engines as if there are lots of links to your site. Your web site will move up the search rankings.

Remember, the best search engines don't just find all occurrences of a particular word or phrase; they return results based on popularity, which is determined by how many other sites link to your site.

Here's what I think is happening. The program reads blogs. Then it finds older posts (from maybe a month or two ago). It reads the posts and constructs a comment based on the post. It then posts the comment with the hyperlink in the name of the commenter. The comment has to be close enough to the topic to get by the spam filter. Commenting on older posts increases the chance that no one looks too closely at it, so there's less chance it will be deleted. Our blog here requires registering in order to post comments, so the program will do that as well.

For example, here are two comments on my posts about financial web sites and passwords.

"However, many financial companies are some of the worst offenders in placing restrictions on passwords." by "lacoste outlet".

"It is good to hear that financial websites do not allow weak passwords, thus it is very necessary for us users to make a very strong password to be able to secure our finances online." by "private label seo".

Sometimes the comments are more generic.

"The blog is full of useful information for me. I gone through the whole blog and found it very interesting and beneficial. Thanks a lot for bringing the knowledge here. I enjoyed reading the blog and is very much impressed. Hope to see you soon with lots of more interesting blogs." by "diploma management of melbourne".

"Don't know what is wrong what is rite but i know that every one has there own point of view and same goes to this one" by "Hermes Bikini".

Here's one that was obviously spam.

"Super cute! My little man would look so stylin' in those!" by "moncler jacket".

So is there an AI program out there that reads a blog post and constructs a comment that sounds like it is real? Or are there people writing these?

At first glance, it doesn't seem likely that there are people actually surfing the web, finding blogs and then manually commenting on entries. I would imagine that to make a profit on this activity (creating links to web sites in order to boost search engine rankings), the cost would have to be low. Hence a program and not real people.

On the other hand, maybe some marketing firm that promises to get your web site higher in the rankings will hire some people from poor countries to do this. The English on some is broken enough that it sounds feasible.

Incidentally, one of my posts got a lot of these comments and others get none. Maybe that one blog post was linked to another blog or something else, and the fact that it was linked to other places made it a prime candidate for using it to create artificial links.

So when I get a comment like this on my workplace training post ...

"Training is the best way to improve the skill and we should take it seriously to get the best output. Hope everyone will appreciate it." by "website translation"

... should I let it remain or should I delete it?

Monday, 09 January 2012

Approximating a circle with a polygon

A circle is the limiting case of a polygon with lots of sides, so a reasonable question to ask (as I was recently asked) is exactly how many sides a polygon has to have for its area to be a good approximation to the area of a circle. Here's my answer to this question.

Suppose that we have a regular polygon with n sides and that the distance from the center of the polygon to any of its vertices is r. If we look at the wedge formed by drawing lines from the endpoints of a single side of the polygon to the center, we get something that looks like this, where the angle in the center of the wedge is 2π/n.

Graph1 

If we divide this wedge into two right triangles, we can then use some trigonometry to find the lengths of the sides of each of the triangles in terms of the length r and the angle 2π/n. This gives us something like this:

Untitledgraph2 

This means that the area of each of the wedges is

(r cos(π/n)) (r sin(π/n))

= r^2 cos(π/n) sin(π/n)

= (r^2/2) sin(2π/n)

and the area of the entire polygon is

A = n (r^2/2) sin(2π/n) = πr^2 (n/(2π)) sin(2π/n)

Now what happens when we increase the number of sides of the polygon?

As n gets big, 2π/n gets close to 0 so that

(n/(2π)) sin(2π/n) = sin(2π/n) / (2π/n)

gets close to 1, so we have that A gets close to πr^2, just like we expected.

Now the area of the circle with radius r is πr^2, so the difference between the area of the circle and the area of the polygon is

πr^2 - πr^2 (n/(2π)) sin(2π/n)

= πr^2 (1 - (n/(2π)) sin(2π/n))

If we plot

f(n) = 1 - (n/(2π)) sin(2π/n)

we find that it looks like this:

Graph3 

so it’s clearly possible to get a good approximation with not too many sides.

We actually have that f(8) ≈ 0.099684, so that using just 8 sides gives us less than 10 percent error. To get below 5 percent error it turns out that we need to use 12 sides, and to get below 1 percent error we need to use 26 sides. This means that it's probably reasonable to say that a 26-sided polygon (icosikaihexagon?) is a good approximation to a circle. Here's a 26-gon that I drew using Google Sketchup that seems to show that a 26-gon is fairly circle-like:

Polygon

Polygons with fewer sides might also be OK, depending on exactly how good you want your approximation of a circle to be.
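The side counts quoted above are easy to check with a few lines of Python. This is only a sketch: the names `relative_error` and `min_sides` are invented for this example, and `relative_error(n)` is the quantity 1 - (n/(2π)) sin(2π/n) from the derivation.

```python
import math


def relative_error(n):
    """Fraction of the circle's area missed by an inscribed regular n-gon."""
    return 1 - (n / (2 * math.pi)) * math.sin(2 * math.pi / n)


def min_sides(tolerance):
    """Smallest n >= 3 whose relative area error is below the tolerance."""
    n = 3
    while relative_error(n) >= tolerance:
        n += 1
    return n


for tolerance in (0.10, 0.05, 0.01):
    n = min_sides(tolerance)
    print(tolerance, n, relative_error(n))
```

Because the error shrinks steadily as n grows, a simple linear search is enough here; other tolerances can be tried by changing the values in the loop.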

Friday, 06 January 2012

Not those sort of logs

As I've mentioned before, the number of records exposed in data breaches seems to follow a lognormal distribution, so that the size of breaches doesn't follow a normal distribution, but the logarithm of their sizes does. This has led to more than one conversation that went roughly like this.

"The number of records exposed in data breaches follows a lognormal distribution. That means that the number of records exposed in breaches doesn't follow a normal distribution, or 'bell curve,' but the log of the number of records exposed does."

"Does that mean that we can use event logs to predict data breaches?"

"No."

Wednesday, 04 January 2012

The Navigator from Computer Parables

I was just looking through my old copy of Computer Parables. Even though this book was published in 1989, well before the dot-com era, it seemed to understand what the Internet would one day become:

"A programmer once built a vast database containing all the literature, facts, figures, and data in the world. Then he built an advanced querying system that linked that knowledge together, allowing him to wander through the database at will. Satisfied and pleased, he sat down before his computer to enjoy the fruits of his labor.

After three minutes, the programmer had a headache. After three hours, the programmer felt ill. After three days, the programmer destroyed his database. When asked why, he replied: “That system put the world at my fingertips. I could go anywhere, see anything. Because I was no longer limited by external conditions, I had no excuse for not knowing everything there is to know. I could neither sleep nor eat. All I could do was wander through the database. Now I can rest.”

Looking back at the dot-com era, it might be no coincidence that this parable was called "The Navigator."

Tuesday, 03 January 2012

Smooth curves over finite fields

Once you’ve generalized the idea of the tangent space of a curve in a way that's useful for curves over finite fields, it’s easy to use that idea to generalize the idea of a smooth curve. In particular, we say that a curve is smooth at a point if the dimension of the tangent space (as described in the previous post) at the point is the same as the dimension of the curve itself. And because this definition also works for curves defined over a finite field, we can talk about such curves as being "smooth" in a meaningful way.

Here are some examples of why this definition makes sense. Note that even though the pictures show graphs of curves over the real numbers, this also makes sense for curves defined over finite fields.

Example 1

Suppose we have the curve defined by

y^2 = x^3 + 1

At the point (0,1), the tangent space is just the line

y = 1

Here's what this looks like:

Graph1

Since the dimension of the tangent space (1) is equal to the dimension of the curve (1) at the point (0,1), the curve is smooth at that point.

Example 2

Suppose that we have the curve defined by

y^2 = x^3

At the point (0,0) we have that the line

y = ax

intersects this curve. On this line we have that

y^2 = a^2 x^2

Setting the two expressions for y^2 equal to each other to get the intersection of the line and the curve we get that

x^3 = a^2 x^2

or

x^3 - a^2 x^2 = 0

or

x^2 (x - a^2) = 0

This means that the intersection of this line and the curve at x = 0 has multiplicity greater than 1, so that any line of the form

y = ax

is tangent to the curve, as shown in this picture:

Graph3

This means that the tangent space at the point (0,0) has dimension 2 while the curve only has dimension 1, so the curve isn’t smooth at (0,0).

Example 3

Suppose that we have the curve

y^2 = x^3 + x^2

Just like in the previous example, at the point (0,0) the line

y = ax

intersects the curve and where this line intersects the curve we have that

x^3 + x^2 = a^2 x^2

or

x^3 + (1 - a^2) x^2 = 0

or

x^2 (x + (1 - a^2)) = 0

This means that the intersection of this line and the curve at x = 0 has multiplicity greater than 1, so that any line of the form

y = ax

is tangent to the curve, as shown in this picture:

Graph4

This means that the tangent space at the point (0,0) has dimension 2 while the curve only has dimension 1, so the curve isn’t smooth at (0,0).
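The substitution used in Examples 2 and 3 can also be done mechanically, including over a finite field. The Python sketch below is only an illustration: the function name `multiplicity_at_origin` and the coefficient-list representation are choices made for this example, and the curve y^2 = x^3 + x is an extra curve, not one of the examples above, added to show what a smooth point at (0,0) looks like. For a curve y^2 = g(x) and a line y = ax it finds the multiplicity of x = 0 as a root of a^2 x^2 - g(x), optionally reducing the coefficients modulo a prime p.

```python
def multiplicity_at_origin(g, a, p=None):
    """Multiplicity of x = 0 as a root of a^2 x^2 - g(x).

    The curve is y^2 = g(x) and the line is y = a x; g[k] is the coefficient
    of x^k. If p is given, coefficients are reduced modulo the prime p.
    Returns None if the polynomial is identically zero.
    """
    h = [0] * max(len(g), 3)
    for k, coeff in enumerate(g):
        h[k] -= coeff
    h[2] += a * a
    if p is not None:
        h = [c % p for c in h]
    for k, c in enumerate(h):
        if c != 0:
            return k
    return None


CUSP = [0, 0, 0, 1]    # y^2 = x^3        (Example 2)
NODE = [0, 0, 1, 1]    # y^2 = x^3 + x^2  (Example 3)
SMOOTH = [0, 1, 0, 1]  # y^2 = x^3 + x    (hypothetical contrast case)

for name, g in (("cusp", CUSP), ("node", NODE), ("smooth", SMOOTH)):
    print(name, [multiplicity_at_origin(g, a, p=7) for a in range(7)])
```

For the first two curves every slope a gives multiplicity at least 2, matching the examples, while for the contrast curve every line y = ax meets the curve at the origin with multiplicity exactly 1, so none of those lines is tangent there.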

Friday, 30 December 2011

The most useless rankings possible

I was recently talking to someone who had a complaint about my use of the sucks/rocks meter to measure the relative popularity of various things related to information security. He essentially said, "Can't you find less relevant things to look at?"

This seemed like an odd request. But the person who made it works for a company that's a big customer of Voltage's, so here's my attempt at this.

What could be less relevant than the relative popularity of Myers-Briggs personality types?

Note that the type that people in the information security industry tend to have (INTJ) is the most popular of the introverted types and the third most popular type overall.

Blog - sucks-rocks 1 
 

What about the relative popularity of letters of the alphabet?

Why is the letter "I" so unpopular? Could it be that some people just don't feel comfortable with square roots of negative numbers?

Blog - sucks-rocks 2 

And what about the relative popularity of the magic words from the classic interactive fiction game Adventure?

Why do people dislike the word "plugh" so much? If you're trying to get from inside the small brick building to Y2, that's the easiest way to do it. 

Blog - adventure 
 
 

Thursday, 29 December 2011

Another use for a math joke

In a previous post I described a possible way to use a math joke in interviews to help select high-quality employees. An alert reader suggested that this particular joke could also be used to select high-quality employers. He suggested telling it in an interview and not working for anyone who didn't understand it.

This could lead to an interesting conversation.

"Do you have any questions that you want me to answer?"

"Actually, yes. I know a particular joke. It's proven to be very useful to help screen potential employers."

"But what does that have to do with this particular job?"

"Well, it's just as relevant as some of the things that you've asked me in the past hour or so - like explaining why manhole covers are round. If these questions were designed to give you an idea of how I think and how I solve problems, why shouldn't I evaluate you guys in the same way?"
