Posts Tagged ‘Computer Science’

Major SSL Vulnerability

Thursday, July 30th, 2009

I’m kind of shocked there isn’t more news on this Major SSL vulnerability:

Certificates for authenticating SSL communications are obtained through Certificate Authorities (CAs) such as VeriSign and Thawte and are used to initiate a secure channel of communication between the user’s browser and a website. When an attacker who owns his own domain — badguy.com — requests a certificate from the CA, the CA, using contact information from Whois records, sends him an email asking to confirm his ownership of the site. But an attacker can also request a certificate for a subdomain of his site, such as Paypal.com\0.badguy.com, using the null character \0 in the URL.

The CA will issue the certificate for a domain like PayPal.com\0.badguy.com because the hacker legitimately owns the root domain badguy.com.

Then, due to a flaw found in the way SSL is implemented in many browsers, Firefox and others theoretically can be fooled into reading his certificate as if it were one that came from the authentic PayPal site. Basically when these vulnerable browsers check the domain name contained in the attacker’s certificate, they stop reading any characters that follow the “\0” in the name.
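To make the truncation concrete, here is a minimal Python sketch. It is not any browser's actual code, and the helper names (c_string_view, naive_match, strict_match) are made up for illustration. The first helper mimics a C-style string read, which stops at the first \0; the second shows the kind of check that gets fooled; the third simply refuses any name with an embedded null.

```python
def c_string_view(name: str) -> str:
    """Return what a C-style string read would see: the text before the first NUL."""
    return name.split("\0", 1)[0]


def naive_match(cert_name: str, host: str) -> bool:
    """Vulnerable check: compares only the part of the name before any NUL."""
    return c_string_view(cert_name).lower() == host.lower()


def strict_match(cert_name: str, host: str) -> bool:
    """Safer check: a name containing an embedded NUL is rejected outright."""
    if "\0" in cert_name:
        return False
    return cert_name.lower() == host.lower()


# The certificate name the attacker legitimately obtained for badguy.com.
forged = "paypal.com\0.badguy.com"

print(naive_match(forged, "paypal.com"))   # True: the forged name passes
print(strict_match(forged, "paypal.com"))  # False: the embedded NUL is caught
```

The CA sees the whole name and validates badguy.com; the naive check sees only paypal.com. The mismatch between those two readings is the whole attack.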

This is rather scary and has big ramifications for the security of most websites. There is now no easy way for an average user to feel confident they are actually securely communicating with the service they intend to.

SSL is important for two primary reasons. First and foremost, it provides a secure channel for communication. But secondly, it makes a pretty reasonable guarantee that you’re securely communicating with the server that is listed in your browser’s address bar. With this vulnerability, it’s possible [although difficult, still], for somebody to masquerade as the server in your address bar and allow you to securely communicate with them. Yikes.

Revisionism at its Finest

Friday, May 22nd, 2009

James Iry posted a really amusing “history” of programming languages over on his blog, One Div Zero. I laughed till I cried. He bills it as “mostly wrong,” but he’s not far from the truth.

Just for a taste, here’s one of my favourite quotes from the History:

Haskell gets some resistance due to the complexity of using monads to control side effects. Wadler tries to appease critics by explaining that “a monad is a monoid in the category of endofunctors, what’s the problem”

Good times

on C++

Wednesday, December 17th, 2008

Linus Torvalds in an old thread on C++:

I’ve come to the conclusion that any programmer that would prefer the project to be in C++ over C is likely a programmer that I really *would* prefer to piss off, so that he doesn’t come and screw up any project I’m involved with.

While he does go on a bit of a tear, it’s tough to disagree with much of what he says. If you have doubts or emotional hangups regarding C++, make sure to head over to the C++ Frequently Questioned Answers page and try to make some intellectual headway on why you still think C++ is a worthy language to spend time with.

All new projects over here are written in Python; it’s insane to use anything less productive for application-level code.

The Calculus of Caffeine Consumption

Wednesday, November 19th, 2008

Jon found a great post over at randomwalker’s LiveJournal that takes an interesting analytical look at caffeine intake strategies to help maximize your mental productivity.

He starts by looking at the mechanism by which caffeine makes you sharper:

Caffeine has a number of effects on the body, but the one that is relevant here is that it blocks adenosine receptors in the brain (by tricking your brain into thinking it is adenosine.) A decrease in the activity of adenosine (which is a sleep chemical) increases neuron firing rate and increases focus and concentration.

He then proceeds with a rudimentary, yet interesting, analysis to produce this lesson:

Over the long term, consistent caffeine consumption is as good as nonconsumption, because of (you guessed it) tolerance. Is there a better strategy? Of course there is. Periodic abstinence lets adenosine levels return to normal. With complete abstinence, it takes 5 days to reach adenosine normality; conservatively, and with imperfect abstinence, a week or 10 days may be required. (Quitting is hard!) For most people, work involves a natural cyclic pattern of crunches and lean periods, and moderated coffee consumption to reflect this pattern will let you enjoy its cognition-enhancing effects more-or-less permanently.

Dog Food at Every Meal

Wednesday, May 21st, 2008

It’s no secret that some of the best software comes from companies that eat their own dog food. But some programs are easier to dog food than others. If you’re working on a project management app or bug tracker, then it’s simple. Your own software is part of your workflow. You find a bug, you fix it in your own software.

On the other hand, if you write software for keeping track of dry cleaning, or managing taxi queues, then you probably don’t use your own software all that much. I mean, once every few days you’ll pretend that you’re using your software. You’ll go through the motions and make sure everything works, but that’s not the same as using it. That’s probably why dry cleaning and taxi software blows.

We’ve been eating dog food here at Magnetk since our beta version. Sometimes, when I’m working on the web site or server-fu, I’ll spend the whole day working through ExpanDrive. But other times I’ll be working on local files in Xcode or emacs, and I end up hardly using ExpanDrive at all.

That’s about to change. Today we each made an entry called localhost in our Drive Managers that connects right back to our own computer. Our goal is to do 100% of our development over ExpanDrive. We’re eating dog food at every meal.

Jeff on MacFUSE at CocoaHeads Boston

Wednesday, May 7th, 2008

I’m going to be giving an informal talk about MacFUSE tomorrow, May 8th, at the CocoaHeads Boston meeting. Along with an overview of MacFUSE, I’ll try to conjure up some interesting tidbits about ExpanDrive development and why we think developing filesystems is more interesting than making web applications. Stop by if you’re around: MIT building E51, room 149, 7pm.

Finessing international characters out of Python

Tuesday, May 6th, 2008

Whilst we whittled our filesystem problems down to a remaining few and sent our first Release Candidate out into the wild, we discovered we had another specter on the horizon to deal with: International Filename Support. Python generally handles this pretty well when everything is UTF-8, the web standard: if you receive a UTF-8 string, python will print the correct representation upon your call to “print” (given a UTF-8 terminal). No other work is necessary. This does not go so smoothly if the string you get is not encoded in UTF-8 (or ASCII, since it is a true subset of UTF-8). We learned this limitation, and how to overcome it, over the course of two frustrating days.

In our testing, we used another commercial SFTP Client to put some files with international characters in their names onto our test server (to wit: the files were called Québécois and Dvořàk). Unbeknownst to us, the client we used defaulted to Latin-1, aka ISO-8859-1 encoding. However, at this point, we also did not know about encoding in python, so we just output the strings as we received them. What we saw was Qu?b?cois and Dvo??k from the Terminal, and even worse in Finder, Qu? and Dvo? (more on why this was so later).

Python does not auto-detect encodings. You can get some third-party modules to get Python to try and do this.
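Short of real detection, one common workaround is to try a list of likely codecs in order. The following is only a sketch, written for Python 3 (where bytes and text are separate types) rather than the Python 2 used in this post; the function name decode_with_fallbacks and the default candidate list are our own assumptions, not part of any library.

```python
def decode_with_fallbacks(raw: bytes, candidates=("utf-8", "latin-1")):
    """Try each candidate codec in order and return (text, codec_used).

    This is not detection: it just reports the first codec that does not
    raise. Latin-1 accepts every possible byte, so it must come last.
    """
    for codec in candidates:
        try:
            return raw.decode(codec), codec
        except UnicodeDecodeError:
            continue
    # Nothing fit: carry on with placeholder characters rather than raising.
    return raw.decode("ascii", "replace"), "ascii"


# A filename as a Latin-1 client would send it, and the same name in UTF-8.
print(decode_with_fallbacks("Québécois".encode("latin-1")))
print(decode_with_fallbacks("Québécois".encode("utf-8")))
```

The order matters: strict codecs such as UTF-8 go first because they reject most data that is not theirs, while permissive ones would happily return the wrong text.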

We knew we had international characters, and we also knew that Mac OS X likes its characters to be encoded as UTF-8 (sort of).

So we tried this:

output_string = input_string.encode('utf-8')

Exception!

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)

It looks like python is guessing the string is ASCII. We think it’s UTF-8, so let’s try it again:

output_string = input_string.decode('utf-8').encode('utf-8')

Exception!

UnicodeDecodeError: 'utf8' codec can't decode bytes in position 2-4: invalid data

Oh dear. At this point, I insisted the client we were using was definitely not encoding filenames as UTF-8 data, but Jeff insisted that it had to be (it’s the standard, after all). Then we had an argument about the semantics of decoding vs. encoding. On a whim, I tried decoding the string using ‘latin-1’ as an argument. Ta da! No more Unicode exception! We came to the following conclusion about python encoding/decoding: python converts between encodings by way of an internal, canonical representation (the unicode object). If you don’t say how a byte string is encoded, it is implicitly decoded from ASCII to this form.

In short, unless told otherwise, python effectively does this with an incoming string:

canonical_string = input_string.decode('ascii')
output_string = canonical_string.encode('ascii')

If the incoming strings are not ASCII-encoded, you must explicitly call decode() on them with the appropriate codec as an argument. Our codec in this case is Latin-1 (aka ISO-8859-1); so far so good.
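Here is a small sketch of the three attempts side by side, written for Python 3 (where the byte string and the decoded text are separate types) rather than the Python 2 of this post. The helper name try_decode is made up for illustration, and the sample bytes are what we assume a Latin-1 client would send for Québécois.

```python
def try_decode(raw: bytes, codec: str):
    """Return the decoded text, or None if the bytes are invalid for this codec."""
    try:
        return raw.decode(codec)
    except UnicodeDecodeError:
        return None


# The filename as a Latin-1 client would put it on the wire: é is the single byte 0xE9.
raw = "Québécois".encode("latin-1")

results = {codec: try_decode(raw, codec) for codec in ("ascii", "utf-8", "latin-1")}
for codec, text in results.items():
    print(codec, "->", text if text is not None else "decode failed")
```

Only the codec that matches how the bytes were produced gives back the original name; the other two fail on the first é.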

Now that we have our string object, we must call encode() on it with ‘utf-8’ as an argument, since UTF-8 is almost what Mac OS X expects. I say “almost” because there are two common ways to represent the same text before it is encoded: “Composed Form” (NFC) and “Decomposed Form” (NFD). The difference is in how characters with diacritics, like à or é, are transmitted. Mac OS X uses decomposed form, which simply means that à is transmitted as two characters, a and a combining grave accent, which are then combined for display. Decoding Latin-1 gives us composed characters, so before we re-encode the strings as UTF-8, we’ve got to make this switch.

import unicodedata
decomposed_string = unicodedata.normalize(
    'NFD', input_string.decode('latin-1'))

Now we can finish up the task.

output_string = decomposed_string.encode('utf-8')
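To see what the normalization step actually changes, here is a short Python 3 sketch (the post itself uses Python 2). It uses only the standard unicodedata module; the variable names are ours.

```python
import unicodedata

composed = "\u00e0"                                  # à as a single code point
decomposed = unicodedata.normalize("NFD", composed)  # 'a' followed by U+0300

# Same visible character, different code points and different UTF-8 bytes.
print(len(composed), composed.encode("utf-8"))       # 1 b'\xc3\xa0'
print(len(decomposed), decomposed.encode("utf-8"))   # 2 b'a\xcc\x80'

# Normalizing back to composed form recovers the original string.
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
```

Both byte sequences are valid UTF-8 for the same text, which is why a filename can look right and still fail to match if the two sides disagree on the form.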

Hooray! We’re done.

But wait… what happens if some other client uses a different encoding? Well, of course the characters will display incorrectly. We need some sort of default encoding that will work. We saw above that using UTF-8 as a default will not work, since there are encodings of characters in latin-1 (and probably other codecs) that are invalid in utf-8. We settled on defaulting to ASCII with the 'replace' error handler. This is acceptable in all cases because of a basic truth about text encoding: every single character is transmitted as at least one byte of data, and with 'replace' every byte decodes to something. Bytes up to 0x7F map to their ASCII characters, and anything above 0x7F, where ASCII is not defined, becomes a placeholder. So while the character à does not have an encoding in ASCII, its UTF-8 byte sequence, \xc3\xa0, still gets through; it will usually just print as ?? since both those bytes are greater than 0x7F.
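A quick Python 3 sketch of that fallback (again, not the Python 2 of this post) shows what the 'replace' error handler does to bytes ASCII cannot decode. The variable names are ours.

```python
raw = "à".encode("utf-8")                  # the two bytes b'\xc3\xa0'

# Strict ASCII decoding would raise; 'replace' swaps each bad byte for U+FFFD.
text = raw.decode("ascii", "replace")
print(len(text), text == "\ufffd\ufffd")   # 2 True

# Going back out to ASCII, each placeholder becomes a literal question mark.
print(text.encode("ascii", "replace"))     # b'??'
```

The original bytes are lost at this point, so this is a last resort for display, not something to write back to the server.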

Putting it all together, this is basically the function we use to handle these strings.

import unicodedata

def re_encode(input_string, decoder='utf-8', encoder='utf-8'):
    try:
        # Decode with the codec we expect the client to have used.
        decoded = input_string.decode(decoder)
    except UnicodeError:
        # Fall back to ASCII; bytes it can't decode become placeholders.
        decoded = input_string.decode('ascii', 'replace')
    # Switch to decomposed form for Mac OS X, then re-encode.
    return unicodedata.normalize('NFD', decoded).encode(encoder)

And that’s really all there is to it. Python wins the game. By defaulting to ASCII encoding, you won’t get any unhandled exceptions, and you’ll also know pretty quickly that something is wrong (just look for the ???????s). For a much lengthier discussion of what Unicode is and does, see Joel Spolsky’s verbose take on the matter.

Great Bugs are like magic tricks

Tuesday, September 18th, 2007

From Steven Frank, via Daring Fireball:

A good bug, I mean a really good, pound-your-head-on-the-desk-for-a-week bug, is exactly like a magic trick in that something impossible appears to be happening.

Isn’t that the truth? We have seen more than our fair share of magic tricks. Stuff that you just can’t believe is happening. Impossible stuff.

Our mantra while debugging: “The best assumptions are wrong assumptions.”

Life beyond the office

Monday, July 30th, 2007

Just to give a taste of the sorts of things Jeff and I have worked on outside of Magnetk, my SIGGRAPH paper about an interactive preview rendering system I’ve been building with Industrial Light & Magic and Tippett Studio just made the top story on MIT Technology Review:

front page of Technology Review

Joyent Slingshot

Thursday, March 22nd, 2007

For a while now, Magnetk has been engaged in developing ‘Slingshot’ with Joyent. Slingshot is a platform which allows Rails developers to create hybrid web/desktop apps with ease and flexibility. There are some great office 2.0 applications out there, but you must admit there is only so much you can do with just AJAX in terms of flexibility and performance. I love Google spreadsheets, but they are slow and kind of clunky. It’s cool, but it’s very 1991.

We’re starting to run up against some painful limitations inherent in today’s web 2.0 experience. Most notable is the lack of any functionality and data accessibility while no internet connection is present. Also, there is no integration with other applications running on the end user’s desktop. Others out there are also in the process of trying to solve some of these problems – but we think we have a particularly powerful take on it. Briefly, I’ll answer the most obvious question people have:

“How is Slingshot different than Apollo or Firefox 3?”

Apollo is a great framework and certainly powerful. It will meet with great success. The Wikipedia entry describes Apollo as:

“A cross-OS runtime that allows developers to employ their existing web development skills (Flash, Flex, HTML, Ajax) to build and deploy desktop Rich Internet Applications.”

Slingshot has the same goals – with the key difference being we allow developers to employ existing applications with no re-write necessary. Additionally, as a bit of personal criticism that you might disagree with: Flash and related technologies don’t come easily to most programmers. ActionScript is super cool, but I’ve always found the Flash platform non-intuitive and confusing. Perhaps I’m dumb; but I know I’m not alone.

Firefox 3 adds the option of local datastores for applications to access in an offline mode, and it is certainly a step in the right direction. The major downside is that it requires developers to specifically tailor their application to this framework and design for it. With Slingshot, we wanted to make it really simple for existing applications to be dropped into our framework basically unmodified and have them “just work.”

Slingshot is a bit the same as, but a lot different than, Firefox 3 and Apollo. Here are our major design goals:

  • Let developers write hybrid desktop/web applications with Rails. Rails is elegant, well designed and allows for rapid development and deployment. It’s also much easier for a novice to learn than Cocoa or C# and it enforces some good decomp and design.
  • Allow Rails developers to create more robust applications that have a comparable user experience to traditional desktop applications. Drag in and drag out of data/files/etc, for starters. In the future, filesystem access to remote data [like SftpDrive...]
  • Allow Rails apps to run offline with simple and transparent data synchronization
  • Lightweight and customizable – we want you to make the decisions about exactly how your app runs, not us.

How it’s done:

We started by developing application shells for both Windows and OS X that provide a consistent and stable binary environment in which to run Rails apps. One nice thing about the Rails community is that most developers are already developing their application in “offline” development mode, usually on a Mac. Similar to the Locomotive framework on OS X or the Instant Rails application on Windows, we make it easy to bring your custom environment into a stable well defined shell that you can customize in any way necessary. Gems, binaries, auxiliary worker processes – whatever. You have full control.

On top of this, we have a customized browser that runs without any of the traditional dressing [address bars, buttons, etc] of a web browser. This allows much more intimate access to the application and to the host operating system. By controlling the browser and extending it, we can build a bridge into the OS. A developer can easily tie together existing data import<->export controllers within their existing application directly to normal OS data transport mechanisms like the drag and drop interface, the clipboard, and eventually the filesystem. This is all done without modifying any of the compiled code, and is OS independent. Also, your app is still available from any browser in the world, just like it was before.

A good example of why this is important can be seen by looking at Joyent’s Strongspace. Right now if you want to upload multiple files you browse for each individual file, one by one, hit upload and wait on a page until it finishes [unless of course, you're using SftpDrive]. With Slingshot, you grab a bunch of files, and drop them onto Strongspace, and they are uploaded in the background. That was easy. Drag out – it’s the same thing, but in reverse. Drag a file from Strongspace directly into Photoshop. Awesome.

Offline mode is cool, so is integration with traditional desktop apps, but it is all somewhat worthless without an easy way to synchronize data with your live server. Slingshot data sync is designed to be extremely powerful while still being lightweight and flexible. We provide controllers and code to handle data serialization & transport in both directions. As the developer, you merely need to aggregate all the ActiveRecord objects that particular user needs to have access to offline. Slingshot does the rest. Same goes for upsync, and we have similar methods built in for files and other data types. The only extra work for the developer is deciding who gets what.

We’re quite proud of Slingshot and are very happy to be working with the fine guys at Joyent. The goals of Slingshot are quite similar to the goals of SftpDrive. Both applications aim to facilitate a highly connected user experience with transparent and ubiquitous data access. SftpDrive is designed to provide a more networked experience for all traditional applications by speaking the one API they all speak: the file system API. Slingshot takes centrally hosted applications that are accessible from anywhere in the world and makes them available with or without an internet connection, along with integrating them with powerful traditional applications.
